YouTube API Design Evaluation and Latency Budget

Learn how our API meets its non-functional requirements and what the estimated response time for streaming a video is.

Introduction

In the previous lesson, we learned how a streaming service's functional requirements are met. In this lesson, we’ll focus on some interesting aspects of the non-functional requirements of the design. We’ll try to answer some of the common questions that might have come to your mind regarding API performance.

Non-functional requirements

The subsequent sections discuss how the non-functional requirements are met.

Scalability

YouTube-like systems need to scale both horizontally and vertically. We provide loosely coupled services by executing independent tasks statelessly and in parallel. Since it is not feasible to serve numerous users requesting large videos simultaneously from the origin servers alone, YouTube uses Internet exchange points (IXPs) to populate CDNs and the Google Global Cache (GGC), which lets it serve end users effectively as the service scales.

Internet exchange points: common grounds of IP networking that allow participant Internet service providers (ISPs) to exchange data destined for their respective networks. (Wikipedia. 2001. "Wikipedia." Wikipedia.org. January 15, 2001. https://www.wikipedia.org/.)

Google Global Cache (GGC): Google keeps certain viral static content at the ISP level to serve its users in densely populated areas. YouTube videos are held on remotely managed servers placed at the ISP level. These servers act as a cache for serving clients.

Hierarchy for serving static viral content to end users
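
The serving hierarchy described above can be sketched as a tiered lookup: check the cache nearest to the user first, then the next tier, and fall back to the origin only when every tier misses. This is a minimal illustration, not a description of how YouTube implements it. The names `CacheTier`, `fetch_video`, and the tier labels are hypothetical, and filling the tiers that missed on the way back is an assumed policy.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CacheTier:
    """One cache level (hypothetical), e.g. an ISP-level cache or a CDN node."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._store: Dict[str, bytes] = {}

    def get(self, video_id: str) -> Optional[bytes]:
        return self._store.get(video_id)

    def put(self, video_id: str, data: bytes) -> None:
        self._store[video_id] = data


def fetch_video(
    video_id: str,
    tiers: List[CacheTier],
    origin: Callable[[str], bytes],
) -> Tuple[bytes, str]:
    """Return (data, source), checking tiers from nearest to farthest.

    If every tier misses, the data is read from the origin. In both cases
    the data is copied into each tier that missed (assumed fill policy),
    so the next request for the same video is served closer to the user.
    """
    missed: List[CacheTier] = []
    for tier in tiers:
        data = tier.get(video_id)
        if data is not None:
            source = tier.name
            break
        missed.append(tier)
    else:
        # No tier had the video: go to the origin servers.
        data = origin(video_id)
        source = "origin"
    for tier in missed:
        tier.put(video_id, data)
    return data, source
```

With two tiers ordered as `[isp_cache, cdn]`, the first request for a video reaches the origin, and a repeat request for the same video is answered by the ISP-level tier without touching the CDN or the origin.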

Availability

During load spikes, such as epidemic events, unexpectedly viral videos, or DDoS attacks, we place client requests in a queuing system so that servers respond when they have free capacity instead of processing every request the moment it arrives. This may add some response delay, but the system remains available under these circumstances. We also offload some processing, such as managing playtime and other events, to the client machine, which sends only the most necessary events to the server.
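
The queuing idea above can be sketched with Python's standard `queue` and `threading` modules: requests are accepted into a buffer immediately, and a fixed pool of workers processes them as capacity frees up. This is a simplified, single-process illustration only. The class `RequestBuffer`, its parameters, and the choice to reject requests once the buffer is full are assumptions and do not come from the lesson.

```python
import queue
import threading
from typing import Any, Callable


class RequestBuffer:
    """Buffers incoming requests and processes them with a fixed worker pool."""

    def __init__(
        self,
        handler: Callable[[Any], None],
        workers: int = 2,
        max_pending: int = 100,
    ) -> None:
        # Bounded queue: the bound is an assumed safeguard against
        # unbounded memory growth during a spike.
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=max_pending)
        self._handler = handler
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, request: Any) -> bool:
        """Enqueue without blocking; return False if the buffer is full."""
        try:
            self._queue.put_nowait(request)
            return True
        except queue.Full:
            return False

    def _work(self) -> None:
        while True:
            request = self._queue.get()  # blocks until a request is available
            try:
                self._handler(request)
            except Exception:
                # A real system would log and possibly retry; here we only
                # keep the worker alive.
                pass
            finally:
                self._queue.task_done()

    def drain(self) -> None:
        """Block until every accepted request has been processed."""
        self._queue.join()
```

A caller would create one buffer per service, for example `RequestBuffer(handle_request, workers=8, max_pending=1000)` with a hypothetical `handle_request` function, and call `submit` for each incoming request. Accepted requests wait in the queue during a spike, which is the added response delay mentioned above, while the workers never take on more than they can process at once.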

Additionally, routing popular content through Content Delivery Networks (CDNs) allows us to reduce latency, avoid single points of failure, and increase fault tolerance. Below, we provide the CDN workflow for ...