Zoom API Design Evaluation and Latency Budget

Analyze the non-functional requirements and estimate the response time of the Zoom meeting API.

Introduction

Modeling a complex service is a time-consuming process that may require many rounds of fine-tuning. In this lesson, we’ll discuss how our design meets the non-functional requirements, especially real-time communication, and estimate the response time of our proposed Zoom meeting API.

Non-functional requirements

Let's discuss the non-functional requirements for our Zoom API one by one:

Availability and reliability

We ensure the availability of our services by dividing servers according to their roles. For example, the meeting service handles requests to create and update meetings, add participants, and so on, while the media controller handles client requests for managing meeting sessions. This role-based separation keeps the workflows independent: if one service goes down, the other can still run normally, which protects the system from complete outages. Additionally, services and data are replicated across different geographical regions to avoid single points of failure (SPOFs). We also use API monitoring and circuit breakers to detect and handle failures as quickly as possible. For efficient resource management, we limit concurrent meeting requests based on the account type. For free users, we also limit the maximum duration of a meeting to avoid overloading the servers.

Improving availability and reliability with separation of workflows
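
The per-account limits described above can be sketched as a small policy check. This is only an illustration: the plan names, the limit values, and the helper functions `can_start_meeting` and `must_end_meeting` are assumptions made for this example and are not part of the proposed API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    max_concurrent_meetings: int
    max_minutes: Optional[int]  # None means no duration limit


# Hypothetical limits, chosen only for illustration.
PLANS = {
    "free": Plan(max_concurrent_meetings=1, max_minutes=40),
    "pro": Plan(max_concurrent_meetings=5, max_minutes=None),
}


def can_start_meeting(account_type: str, active_meetings: int) -> bool:
    """Allow a new meeting only while the account is below its concurrency limit."""
    return active_meetings < PLANS[account_type].max_concurrent_meetings


def must_end_meeting(account_type: str, elapsed_minutes: int) -> bool:
    """Report whether a running meeting has reached its plan's duration limit."""
    limit = PLANS[account_type].max_minutes
    return limit is not None and elapsed_minutes >= limit
```

In a design like this, the meeting service could run the first check before creating a meeting, and the media controller could run the second check periodically for active sessions.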

Security

We use TLS 1.3 for normal communication and to exchange the AES keys used for multimedia transmission. After the key is shared successfully, the connection is upgraded to WebSockets for AES-encrypted data transfer. We implement authentication and authorization using a login mechanism, and we use OAuth and OpenID Connect with PKCE flows for third-party interactions (see: the authorization framework). Connecting to the media router requires an access token. Guest (unregistered) participants can also join using an access token, which is issued only after the host accepts their join request.
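
To make the PKCE part of the flow concrete, here is a minimal sketch of how a third-party client might generate the code verifier and the matching S256 code challenge defined in RFC 7636. The function names `make_code_verifier` and `make_code_challenge` are hypothetical; the lesson does not prescribe a client implementation.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    # 64 random bytes encode to 86 URL-safe characters,
    # inside the 43-128 character range that RFC 7636 allows.
    return secrets.token_urlsafe(64)


def make_code_challenge(verifier: str) -> str:
    # S256 method: BASE64URL(SHA256(verifier)) with the "=" padding removed.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorization request and reveals the verifier only when exchanging the authorization code for a token, so an intercepted code is of no use without the verifier.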

Scalability

Locally distributed media routers make scaling services easier. We have also decoupled media routers from media controllers, which allows us to deploy multiple media routers in an area under the control of a single controller, making this a cost-effective solution. Stateless communication between the conferencing service and the media controller allows efficient resource management during workload peaks.
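
One way to picture a single controller managing several routers in an area is a simple selection routine. The sketch below is an assumption-laden illustration: the `MediaRouter` fields, the `pick_router` function, and the least-loaded selection rule are all hypothetical and are not taken from the design in this lesson.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MediaRouter:
    router_id: str
    region: str
    active_sessions: int
    capacity: int


def pick_router(routers: List[MediaRouter], region: str) -> Optional[MediaRouter]:
    """Return the least-loaded router in the region that still has spare capacity."""
    candidates = [
        r for r in routers
        if r.region == region and r.active_sessions < r.capacity
    ]
    if not candidates:
        return None  # no spare capacity: a signal to add another router in this area
    return min(candidates, key=lambda r: r.active_sessions / r.capacity)
```

Because the controller only needs the current router list to make this decision, adding capacity in an area amounts to registering one more router with it.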

Point to Ponder

1. What determines the maximum number of users a service like Zoom can handle in a single meeting?


Optimization and tradeoffs

  • WebSockets are stateful, which can limit the scalability of our service. This is unavoidable given the two-way, real-time nature of the service. We can still scale by increasing the number of regional media servers, but this is expensive, so we trade higher cost for scalability.

  • Additionally, because we’ve learned from a previous lesson ...