Now that the dust has settled a bit on Ignite, it’s time to go back and look at the announcements made around the Calls & Meeting API. Hopefully, I can answer some common questions I’ve been asked over the past few days. If there’s a question that I haven’t answered though, let me know and I’ll update this post.
What just got announced?
A new API, for Teams. An API is an Application Programming Interface – it’s a way for developers to interact with a system using code. In this instance, it allows developers to control some aspects of Teams, specifically around calls and meetings. The API is in Public Preview right now – meaning you can use it but if it breaks, oh well.
What can it do?
APIs consist of a set of functionalities – things that can be done via the API. For this API, that set is split into 3 sections – Basic Calling, Meetings and IVR:
- Receiving a call – i.e. being told when someone is ringing your code
- Answering a call (the next step from receiving a call)
- Making a call
- Transferring a call
- Hanging up a call
- Joining a meeting
- Viewing participants in a meeting
- Adding participants to a meeting
- Muting or Unmuting a participant
- Creating meetings
- Playing an audio file (for instance a welcome prompt)
- Recording audio (for instance someone saying ‘Sales’ or ‘Make a complaint’)
- Recognising DTMF tones
- Altering audio routes (within a meeting, adjust the standard audio routes to support things like Whisper)
So, is this the “UCMA for Teams”?
I get this question a lot. UCMA is the API for Skype for Business On-Premise which allows developers to write bots, IVRs and other applications which interact with Skype for Business.
It’s not really appropriate to think about this new API as an equivalent to UCMA. UCMA provided deep and wide-ranging control over how Skype for Business operated. Using UCMA you could impersonate users, even sending IMs and calls from them, and have complete control over many parts of the system. No API for Teams is ever going to enable that level of control because Teams is a multi-tenanted, hosted solution managed and run by Microsoft – which is very different from a single Skype for Business environment run and managed by a single organisation.
However, UCMA was used by developers to enable a number of different types of applications – such as IVRs, contact centre solutions, meeting assistants etc. It’s these scenarios which the Calls & Meeting API enables in Teams, via the functionality above.
How different is this new API to use from UCMA?
Pretty different. UCMA is a .NET-based stateful wrapper. Sure, under the hood, UCMA sent and received SIP messages, but developers worked at a higher level than that. UCMA relied a lot on asynchronous calls and events to abstract the messaging that was happening.
The Calls & Meeting API is much more like UCWA, which came along a lot later. It’s a stateless, HTTP-based, REST-like API. You make HTTP calls to get stuff done. When the API wants to tell you something (for instance, about a new call), it will call you, on a pre-defined endpoint you’ve set up. It’s a bit like both sides in the system (Microsoft and you) set up a number of endpoints for the other side to ‘tag’ whenever something happens or needs to get done.
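To make the callback model a little more concrete, here is a minimal sketch of what the receiving side might look like. The notification shape, the state names and the handler functions are all simplified assumptions for illustration, not the real payload, and a real bot would sit behind a web framework rather than a plain function.

```python
import json

# Hypothetical, heavily simplified notification body. The real payload
# carries many more fields; only the idea of "a list of changes" matters here.
SAMPLE_NOTIFICATION = json.dumps({
    "value": [
        {"changeType": "created", "resource": "/calls/abc123",
         "resourceData": {"state": "incoming"}},
        {"changeType": "updated", "resource": "/calls/abc123",
         "resourceData": {"state": "established"}},
    ]
})


def on_incoming(call_id: str) -> str:
    # A real bot would now make an outbound request to answer the call.
    return f"answer {call_id}"


def on_established(call_id: str) -> str:
    # A real bot might now ask the service to play a welcome prompt.
    return f"play prompt on {call_id}"


# Map each (assumed) call state to the code that reacts to it.
HANDLERS = {"incoming": on_incoming, "established": on_established}


def handle_callback(raw_body: str) -> list:
    """Turn one callback body into the list of actions the bot wants to take.

    Everything needed is in the request itself, so no state is kept
    between callbacks.
    """
    actions = []
    for item in json.loads(raw_body).get("value", []):
        call_id = item["resource"].rsplit("/", 1)[-1]
        state = item.get("resourceData", {}).get("state")
        handler = HANDLERS.get(state)
        if handler is not None:
            actions.append(handler(call_id))
    return actions
```

Calling `handle_callback(SAMPLE_NOTIFICATION)` walks both entries in the sample and returns one action for each.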
The nice thing about this is that this is a really common model and is well understood by developers everywhere. This is a Good Thing because it reduces the barrier to entry for people wanting to build solutions with #MicrosoftTeams.
There is a downside though. Doing something complicated like making a call or transferring someone into a meeting involves a lot of different calls, with a lot of different IDs and codes each time. Because it’s a stateless API, all the identifying information has to be passed each time which can quickly get really confusing. This was also the case with UCWA.
The solution to this (as was also the case with UCWA) is a stateful wrapper, which abstracts all that complication and provides a base that’s easier to work with. The Calls & Meeting API ships with exactly such a wrapper SDK (in .NET only today) which makes working with the API a lot easier. Unless you need to go completely stateless (for instance you’re building a micro-services architecture) then there’s no reason not to use the wrapper SDK to make development easier and faster.
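The difference between the two styles can be sketched in a few lines. This is not the wrapper SDK (which is .NET); it is a toy illustration in Python, and the paths, the `tenantId` field and every class and function name below are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingTransport:
    """Stand-in for an HTTP client: records requests instead of sending them."""
    sent: list = field(default_factory=list)

    def post(self, path: str, body: dict) -> None:
        self.sent.append((path, body))


# Stateless style: every identifier has to be supplied on every call.
def post_mute(transport, tenant_id, call_id, participant_id):
    transport.post(f"/calls/{call_id}/participants/{participant_id}/mute",
                   {"tenantId": tenant_id})


def post_hang_up(transport, tenant_id, call_id):
    transport.post(f"/calls/{call_id}/hangup", {"tenantId": tenant_id})


@dataclass
class Call:
    """Stateful wrapper: remembers the identifiers so callers don't have to."""
    transport: RecordingTransport
    tenant_id: str
    call_id: str

    def mute(self, participant_id: str) -> None:
        post_mute(self.transport, self.tenant_id, self.call_id, participant_id)

    def hang_up(self) -> None:
        post_hang_up(self.transport, self.tenant_id, self.call_id)
```

With the wrapper, application code reads as `call.mute("p-2")` followed by `call.hang_up()`, while the same requests still go out underneath.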
I keep hearing about Graph. What is it, and why do I care?
Several years ago, as Office 365 was maturing, Microsoft started to make a number of APIs available to developers, enabling them to perform tasks in Office against Office 365. These were things like reading Exchange items, adding calendar items, looking up people in AD etc. Each Office application had its own API, its own (slightly different) method of authentication to the service, and its own (slightly different) way of getting things done.
Apart from the messiness, the problem with this is that lots of solutions need to make use of more than one service at a time. If you want to create a Word document, upload it to OneDrive, find a user’s direct reports and then email them a link to the document – that’s 4 different systems. Your application (and therefore your user) has to go through 4 different authentication methods. There had to be a better way.
Graph is that better way. At its simplest, Graph is just a container for all the APIs. There’s a single authentication method and a common structure to how the APIs are organised and called. It’s made working with Office 365 APIs much simpler.
The new Calls & Meeting API comes to Public Preview in Graph. This is great news because it means that the authentication process is common, the structure of API calls is common, and Graph tools (such as Graph Explorer) are available to use. With the benefit of hindsight, it was never going to be any other way.
What is Service-Hosted Media and Application-Hosted Media all about?
Sometimes called Remote Media and Local Media, there are actually two different ways of dealing with media.
Service-Hosted Media (Remote Media) means that the underlying media flows (the buffers, the codecs, the mess) are handled for you by Teams. You can call API endpoints in order to play an audio file into a meeting, or record what someone is saying. Those things happen, but you don’t get too involved in them. It’s an arm’s-length, hassle-free way of working: “hey, API, play welcome.wav and tell me when you’re done”. For many applications, this is all that’s needed and is a much easier way of working.
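As a rough sketch of the “play welcome.wav and tell me when you’re done” idea, the code below builds a play-prompt request and checks a later notification against it. The path and field names are modelled loosely on Graph’s play prompt action and should be treated as assumptions to verify against the current documentation. The completion notification shape and both function names are simplified inventions for this example.

```python
import uuid


def build_play_prompt(call_id: str, audio_url: str) -> tuple:
    """Return (path, body) for a request asking the service to play a file.

    Nothing is sent here; the caller would POST the body to the path.
    """
    body = {
        # Echoed back later, so the bot can match "done" to this request.
        "clientContext": str(uuid.uuid4()),
        "prompts": [{
            "@odata.type": "#microsoft.graph.mediaPrompt",
            "mediaInfo": {
                "@odata.type": "#microsoft.graph.mediaInfo",
                "uri": audio_url,
                "resourceId": str(uuid.uuid4()),
            },
        }],
    }
    # Assumed path; check the documentation for the real one.
    return f"/communications/calls/{call_id}/playPrompt", body


def is_prompt_finished(notification: dict, client_context: str) -> bool:
    """True if a (simplified, hypothetical) notification reports our prompt done."""
    return (notification.get("clientContext") == client_context
            and notification.get("status") == "completed")
```

The bot never touches the audio itself: it names a file, then waits for a callback carrying the same `clientContext`.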
Sometimes though, that’s not enough control. If you need access to the actual audio and video stream itself (the bits and bytes) because you want to do some real-time media processing, then the approach above won’t work because you’re not close enough to it. That’s when you need Application-Hosted Media (Local Media). In this model, you have full access to the raw media that make up a call leg, and the opportunity to process, manipulate or inject your own media. It’s way more complicated and uses local resources on a server that you’ll have to pay for, but for niche scenarios, it provides developers with the power they need.
My advice? Stick with Service-Hosted Media until you can’t for whatever reason – as there’s no point going more complex than you need to and you’ll have to pay for computing power to do so.
[Interesting aside: the Teams VTC Interop announced at Ignite 2018 is built on this same model.]
What does the Auth model look like?
Remembering that Teams is multi-tenanted, the authentication model is more involved than you might be used to. The good news though is that the Calls & Meeting API is using a lot of already-established identification and authentication methods.
Before you can do anything, you need to register your application. This is done as an Azure AD Application (which is also how you register other Graph-based applications). I’ll be doing a full walkthrough of exactly how this works in the coming weeks, but the net result is that you end up with an Application ID and Application Secret. This is your application’s “ID badge” when talking to Graph.
Secondly, you (or your tenant admin) give your application a number of permissions. These permissions define what your application is allowed to do. These don’t have to be Teams-specific (it’s all Graph, remember) so you could have an application which has permission to make and receive calls but also read all user calendars (if you want to build some sort of automated assistant bot).
Thirdly, you need to create a bot instance in Azure and enable it for calling. You provide the Application ID from the first step and fine-tune things like the Display Name, icon, etc. This is also where you specify where your code is hosted. This is important: as I said earlier, the API will call back to notify you of events, and this is where you tell it how to do that.
Finally, you have your code, which includes your Application ID and Secret and is hosted at the URL you specified in the third step. This magic combination of things means that when you try and do something like make a call, Graph recognises that your code has been registered, has the permissions to do what you’re asking, and then uses the callback API to notify you about the state of the call.
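For a feel of how the Application ID and Secret get used, here is a minimal sketch of the standard Azure AD client-credentials token request that app-only Graph applications typically make. The post itself doesn’t spell out this exchange, so treat it as an assumption about the flow. The function name and the sample tenant, ID and secret are placeholders, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, app_id: str, app_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": app_id,          # the Application ID from registration
        "client_secret": app_secret,  # the Application Secret from registration
        # ".default" asks for whatever permissions the app has been granted.
        "scope": "https://graph.microsoft.com/.default",
    }).encode("ascii")
    return Request(
        url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Sending this with `urllib.request.urlopen` would, for a correctly registered application, return JSON containing an access token, which then goes in the `Authorization: Bearer …` header of each Graph call.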
It’s a rational set of authentication and authorisation steps which probably sounds more complicated in text than it is. If you’re totally confused, don’t worry. I’ll be doing a full run-through in a future blog post soon!
Where can I find out more?
If this has whetted your appetite and made you keen to start building Teams applications – good! There are a number of places you can go to help support you in building on Teams:
- Teams Dev Centre – this is where you can find Microsoft-produced training and tutorials, including samples.
- Developer Support – this is where you can get help, as well as announcements on changes and updates.
- Success with Teams Developer Guidance – documentation about all aspects of Teams development, including transition guidance
- This blog! I’m going to be doing a series of blog posts, samples and videos about this exciting new API. The landing page for everything Teams Development is thoughtstuff.co.uk/teams.
About the Author:
Tom is a Microsoft Teams and Skype for Business developer and Microsoft MVP with over 10 years’ experience in the software development industry. For the last 7 years he has worked at Modality Systems, a specialist provider of Universal Communications services, as Product Innovation Architect.
Tom is passionate about creating great software that people will find useful. He enjoys blogging and speaking about Skype for Business and Microsoft Teams development, Office 365, Bot Framework, Cognitive Services and AI, and the future of the communications industry. He blogs at thoughtstuff.co.uk and tweets at @tomorgan.
Morgan, T (2018). Calls & Meetings API in Graph – What does it ACTUALLY mean? (and some other common questions). Available at: https://blog.thoughtstuff.co.uk/2018/10/calls-meetings-api-in-graph-what-does-it-actually-mean-and-some-other-common-questions/ [Accessed 5 November 2018]