YouTube API Design Evaluation and Latency Budget

Learn how our API meets the non-functional requirements and what the estimated response time is for streaming a video.

We'll cover the following...

Introduction
Non-functional requirements
Latency budget
- The manifest file
  - Response time
- Audio and video segments
  - Request and response size
  - Response time
Optimization and tradeoffs

Introduction

In the previous lesson, we learned how a streaming service's functional requirements are met. In this lesson, we’ll focus on some interesting aspects of the non-functional requirements of the design. We’ll try to answer some of the common questions that might have come to your mind regarding API performance.

Non-functional requirements

The subsequent sections discuss how the non-functional requirements are met.

Scalability

YouTube-like systems need to scale both horizontally and vertically. We provide loosely coupled services by executing independent tasks statelessly and in parallel. Since it is not possible to serve numerous users requesting large videos simultaneously from origin servers alone, YouTube uses Internet exchange points (IXPs) to populate CDNs and the Google Global Cache (GGC), which in turn serve end users effectively and let the service scale.

Internet exchange points are common grounds of IP networking, allowing participant Internet service providers to exchange data destined for their respective networks (Wikipedia. 2001. “Wikipedia.” Wikipedia.org. January 15, 2001. https://www.wikipedia.org/). With GGC, Google keeps certain viral static content at the ISP level to serve its users in densely populated areas. YouTube videos are remotely managed in the servers placed at the ISP level. These servers act as a cache for serving clients.

Availability

During load spikes, such as epidemic events, unexpected viral videos, or DDoS attacks, we place client requests in a queuing system so that servers respond when they have free capacity, rather than processing every request immediately. This may add some response delay, but the system remains available under these circumstances. We also offload some processing to the client machine, such as managing playtime and other events, so that only the most necessary events are sent to the server.
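The queuing idea can be illustrated with a minimal sketch. It is not how YouTube implements this; the function name `run_load_leveling`, the `workers` and `capacity` parameters, and the `served:` result strings are all hypothetical, chosen only for illustration. A bounded in-memory queue stands in for the queuing system, and a small pool of threads stands in for the servers.

```python
import queue
import threading


def run_load_leveling(requests, workers=2, capacity=100):
    """Buffer requests in a bounded queue; workers drain it as they free up.

    All names here are illustrative assumptions, not part of any real API.
    """
    pending = queue.Queue(maxsize=capacity)  # bounded buffer absorbs the spike
    served = []
    lock = threading.Lock()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more work for this worker
                pending.task_done()
                break
            with lock:
                served.append(f"served:{item}")  # stand-in for real processing
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    for request in requests:
        # put() blocks while the queue is full: the caller waits longer,
        # but the request is not dropped.
        pending.put(request)

    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return served


if __name__ == "__main__":
    spike = [f"video-{i}" for i in range(50)]
    results = run_load_leveling(spike, workers=3, capacity=10)
    print(len(results))
```

In this sketch, a burst of 50 requests passes through a queue that holds at most 10 at a time. Producers wait when the queue is full, which corresponds to the added response delay mentioned above, while every request is eventually served.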

Additionally, routing popular content through Content Delivery Networks (CDNs) allows us to reduce latency, avoid single points of failure, and increase fault tolerance. Below, we provide the CDN workflow for a client requesting a video ...
