The purpose of a System Design interview is to evaluate a candidate’s ability to design scalable, efficient, and maintainable systems. It tests their problem-solving skills, understanding of architecture, and ability to communicate complex ideas.
Over my 10+ years as a systems engineer and hiring manager at Microsoft and Facebook, I led hundreds of software engineer candidates through System Design interviews.
Surprisingly, I found that even the best developers often struggled with System Design problems. Why? I think it's because System Design questions are open-ended, and therefore require creativity and problem-solving skills that other coding interview challenges don't exercise.
While SDI questions tend to evolve over time, many have remained popular across the industry. These questions are well-suited to evaluate candidates on two important levels:
Test the candidate's understanding of System Design fundamentals
Evaluate the candidate's ability to apply those fundamentals in real-world applications
Today, we’ll break down the top 25 System Design Interview questions for 2024. These are essential questions asked at top companies like Google, Amazon, Meta, and more. Mastering these problems, and their solutions, will give you a huge leg up in your System Design interview prep.
Finally, I will leave you with a few battle-tested strategies that you can use to confidently take on any System Design question you encounter.
I have divided these 25 essential System Design problems into three difficulty levels:
Design an API rate limiter for sites like Firebase or GitHub
Design a pub/sub system like Kafka
Design a URL-shortening service like TinyURL or bit.ly
Design a scalable content delivery network (CDN)
Design a web crawler
Design a distributed cache
Design a chat service like Facebook Messenger or WhatsApp
Design a mass social media service like Facebook or Instagram
Design a proximity service like Yelp or nearby places/friends
Design a search engine-related service like Typeahead
Design a video streaming service like YouTube or Netflix
Design a ride-sharing service like Uber or Lyft
Design a recommendation service
Design a file-sharing service like Google Drive
Design a social network and message board like Reddit or Quora
Design a social media newsfeed service
Design a collaborative editing service like Google Docs
Design Google Maps
Design a payment gateway like Stripe
Design a food-delivery service like Uber Eats or DoorDash
Design a distributed locking service like Google's Chubby
Design a coordination system like ZooKeeper
Design a scalable distributed storage system like Bigtable
Design an online multiplayer game system
Design a video conferencing service
Before we start breaking down specific questions, I want to give you some high-level System Design tips that will enable you to confidently approach any problem.
Start each problem by stating what you know: List all required features of the system, common problems you expect to encounter with this sort of system, and the traffic you expect the system to handle. The listing process lets the interviewer see your planning skills and correct misunderstandings before you begin the solution.
Narrate any trade-offs: Every System Design choice matters. At each decision point, list at least one positive and negative effect of that choice.
Ask your interviewer to clarify: Most System Design questions are purposefully vague. Ask clarifying questions to show the interviewer how you view the question and your knowledge of the system’s needs. Also be sure to state your assumptions before diving into the components.
Know your architectures: Most modern services are built upon a flexible microservice architecture. Unlike the monolithic architectures of tech companies in the past, microservices allow smaller, agile teams to build independently from the larger system. Some older companies will have legacy systems, but microservices can run in parallel with legacy code and help refresh the company architecture.
Discuss emerging technologies: Conclude each question with an overview of how and where the system could benefit from machine learning. This will demonstrate that you’re prepared for not only current solutions but also future solutions.
System Design interviews are now part of every Engineering and Product Management Interview. Interviewers want candidates to exhibit their technical knowledge of core building blocks and the rationale of their design approach. This course presents carefully selected system design problems with detailed solutions that will enable you to handle complex scalability scenarios during an interview or designing new products. You will start with learning a bottom-up approach to designing scalable systems. First, you’ll learn about the building blocks of modern systems, with each component being a completely scalable application in itself. You'll then explore the RESHADED framework for architecting web-scale applications by determining requirements, constraints, and assumptions before diving into a step-by-step design process. Finally, you'll design several popular services by using these modular building blocks in unique combinations, and learn how to evaluate your design.
Note: For info on how ML can boost your SDI performance, check out my blog How Machine Learning gives you an edge in System Design.
Here is the system design interview cheat sheet
Now let's dive into the specifics of the top System Design Interview questions, starting with the easy-level problems.
Problem statement: Design an API rate limiter that caps the number of API calls the service can receive in a given period to avoid an overload.
Follow these requirements for a rate limiter system:
Functional requirements
Nonfunctional requirements
In the high-level rate limiter design, the client’s requests are passed through an ID builder, which assigns a unique ID to each incoming request. The ID could be a remote IP address, login ID, or other attribute. The decision maker fetches the throttling rules from the database and applies them to each request. It either forwards the request to the application servers via the requests processor or discards it and returns an error message to the client (429 Too Many Requests). If some requests are throttled due to a system overload, the system keeps those requests in a queue to be processed later.
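To make the decision maker concrete, here is a minimal sketch of one possible throttling rule: a sliding-window log kept per client ID. The class name, the limit of 15 requests, and passing the clock in as `now` are my own assumptions for illustration, and the sketch leaves out the rules database and the queue for deferred requests.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per client within any `window_seconds` span."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(deque)  # client ID -> recent request times

    def allow(self, client_id, now):
        """Return an HTTP-style status: 200 if accepted, 429 if throttled."""
        recent = self._timestamps[client_id]
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.limit:
            return 429  # Too Many Requests
        recent.append(now)
        return 200


# Hypothetical rule: 15 requests per 60 seconds for each client ID.
limiter = SlidingWindowRateLimiter(limit=15, window_seconds=60)
# Ten requests at 00:01:20 (t = 80 s) and ten more at 00:02:10 (t = 130 s).
first_burst = [limiter.allow("user-1", now=80) for _ in range(10)]
second_burst = [limiter.allow("user-1", now=130) for _ in range(10)]
```

Because the window slides with each request instead of resetting on the minute boundary, the two bursts above count against the same 60-second span, so only part of the second burst is accepted.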
Knowledge test!
How does your system measure requests per minute? If a user makes 10 requests at 00:01:20 and then another 10 at 00:02:10, they’ve made 20 in the same one-minute window despite the minute change.
In the event of a failure, a rate limiter would be unable to perform the task of throttling. Should the request be accepted or rejected in such a scenario?
What changes would you make to the design while considering the rate limiter design for a distributed system rather than a local one?
Note: Look at the detailed design of the rate limiter to find the answers to the questions above.
Problem statement: Design a scalable and distributed pub/sub system like Kafka that can handle massive message throughput. It should also ensure reliable message delivery and support various messaging semantics (at most once, at least once, exactly once).
Follow these requirements for the pub/sub design:
Functional requirements
Nonfunctional requirements
The brokers are responsible for storing the messages sent from the producers and allowing the consumers to read them. The cluster manager monitors the brokers’ health and spins up another broker in case one goes down. The consumer details include subscription information, retention period, etc. The consumer manager controls consumers’ access to messages in the existing topics.
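As a rough illustration of the broker's role, here is a minimal in-memory sketch that stores each topic as an append-only log and tracks a read offset per consumer. The `Broker` class and its `publish`, `poll`, and `ack` methods are hypothetical names, not Kafka's API, and the sketch ignores partitions, replication, and retention.

```python
from collections import defaultdict


class Broker:
    """Toy single-node broker: one append-only log per topic, one offset per consumer."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic -> ordered list of messages
        self._offsets = defaultdict(int)   # (topic, consumer) -> next offset to read

    def publish(self, topic, message):
        self._topics[topic].append(message)

    def poll(self, topic, consumer):
        """Return the next unacknowledged message, or None if the consumer is caught up."""
        offset = self._offsets[(topic, consumer)]
        log = self._topics[topic]
        return log[offset] if offset < len(log) else None

    def ack(self, topic, consumer):
        """Advance the consumer's offset once it has finished processing a message."""
        if self._offsets[(topic, consumer)] < len(self._topics[topic]):
            self._offsets[(topic, consumer)] += 1


broker = Broker()
broker.publish("orders", "order-1")
broker.publish("orders", "order-2")
first = broker.poll("orders", "billing")    # "order-1"
retried = broker.poll("orders", "billing")  # still "order-1": nothing was acknowledged
```

Acknowledging only after processing gives at-least-once behavior, since a consumer that crashes before `ack` sees the same message again. Acknowledging before processing would give at-most-once behavior instead.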
Knowledge test!
How can message delivery be ensured, and how can at-least-once or at-most-once semantics be guaranteed in the pub/sub design?
How can you guarantee message order for specific consumers?
Note: To answer the above technical questions, you can examine the detailed design of pub/sub.
Problem statement: Design a scalable and distributed system that shortens long URLs like TinyURL or bit.ly. The system takes a long URL and generates a new, unique short URL. It should also take a shortened URL and return the original full-length URL.
Follow these requirements for the URL-shortening system:
Functional requirements
Nonfunctional requirements
A load balancer is the first intermediary between the clients and the server, ensuring even distribution of incoming requests to maintain availability and reliability. When a new URL-shortening request comes in, the load balancer forwards it to a server where the rate limiter checks if the client is within the allowed request rate.
The server leverages a sequencer to generate a unique numeric ID for the URL requests. This ID is passed to an encoder, which converts it into a more readable alphanumeric string. The original URL and its corresponding shortened version are stored in a database. To enhance performance, recently accessed URLs are kept in a cache, allowing quick retrieval without repeatedly querying the database.
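The sequencer-plus-encoder step can be sketched in a few lines. This is a minimal version that assumes a simple in-process counter as the sequencer, base-62 as the encoding, and a dictionary standing in for the database. The names `UrlShortener`, `encode_base62`, and `decode_base62` are mine, not part of any real service.

```python
import itertools
import string

# 62 symbols: digits, then lowercase, then uppercase letters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(number):
    """Convert a non-negative integer ID into a short alphanumeric string."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def decode_base62(text):
    """Invert encode_base62."""
    number = 0
    for ch in text:
        number = number * 62 + ALPHABET.index(ch)
    return number


class UrlShortener:
    def __init__(self, start_id=1):
        self._sequencer = itertools.count(start_id)  # stand-in for a distributed sequencer
        self._short_to_long = {}                     # stand-in for the database

    def shorten(self, long_url):
        short = encode_base62(next(self._sequencer))
        self._short_to_long[short] = long_url
        return short

    def resolve(self, short):
        return self._short_to_long.get(short)


shortener = UrlShortener(start_id=1000)
short = shortener.shorten("https://example.com/some/very/long/path")
original = shortener.resolve(short)
```

Sequential IDs guarantee uniqueness without collision checks, at the cost of making short URLs guessable. That is one of the trade-offs worth narrating in the interview.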
Knowledge test!
What if two users input the same custom URL?
What if there are more users than expected?
How does the database regulate storage space?
Note: To explore these questions in depth and find the answers, check out the detailed chapters on the TinyURL System Design.
Problem statement: Design a scalable content delivery network (CDN) system to efficiently distribute and cache content across globally distributed servers, minimizing latency and ensuring reliable end user content delivery.
Follow these requirements for a CDN system:
Functional requirements
Nonfunctional requirements
When a client requests content, a request routing system kicks in to find the address of the nearest or fastest server, ensuring minimal wait time. A load balancer then routes the request to this optimal server. If the requested content is cached on that server, it is immediately delivered to the client. If not, the server fetches the content from the origin server, caches it locally for subsequent requests, and then serves it to the user.
The CDN system ensures that frequently accessed content remains readily available while less popular content is periodically purged. The system also includes monitoring and analytics to track performance, optimize routing, and ensure high availability and reliability.
Knowledge test!
How would you determine which content should be cached on edge servers?
How would you distribute traffic evenly across multiple edge servers?
How would you ensure the CDN infrastructure’s scalability, availability, and fault tolerance?
How would you optimize the delivery and reduce the latency while streaming?
Note: Check out the chapter on the design of a content delivery network to help you understand and get answers to the above questions.
Problem statement: Design a web crawler that systematically browses the internet to discover and index web pages. The crawler should efficiently navigate websites, retrieve content, and follow links to discover new pages.
Follow these requirements for the web crawler system:
Functional requirements
Nonfunctional requirements
A web crawler begins by assigning a worker to a URL. Once the DNS is resolved, the worker sends the URL and IP address to an HTML fetcher to establish the connection. The URL and HTML content are extracted from the page and stored in the cache for processing. This content is then checked by a duplicate eliminator service to ensure no duplicate content is transferred to blob storage. Once this cycle is complete for a single URL, the worker moves on to the next address in the queue.
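The crawl loop above can be sketched with a single worker and a URL queue. To keep the example runnable offline, I'm assuming a hypothetical in-memory `FAKE_WEB` dictionary in place of DNS resolution and the HTML fetcher, and using a SHA-256 content hash as the duplicate eliminator. A real crawler would also need politeness rules and robots.txt handling.

```python
import hashlib
from collections import deque

# Hypothetical stand-in for the web: URL -> (HTML content, outgoing links).
FAKE_WEB = {
    "https://example.com/": ("<p>home</p>", ["https://example.com/about",
                                             "https://example.com/mirror"]),
    "https://example.com/about": ("<p>about</p>", ["https://example.com/"]),
    "https://example.com/mirror": ("<p>home</p>", []),  # same content as the home page
}


def crawl(seed_url, fetch):
    """Breadth-first crawl; returns the simulated blob store (URL -> content)."""
    queue = deque([seed_url])
    seen_urls = {seed_url}   # URLs already queued, so each is fetched once
    seen_hashes = set()      # content digests, used by the duplicate eliminator
    blob_store = {}
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:     # unreachable or unknown URL
            continue
        content, links = page
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            blob_store[url] = content
        for link in links:
            if link not in seen_urls:
                seen_urls.add(link)
                queue.append(link)
    return blob_store


stored = crawl("https://example.com/", FAKE_WEB.get)
```

The mirror page is fetched but never stored, because its content hash matches the home page.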
Knowledge test!
What functionalities must be added to extract all formats (images and video)?
Real web crawlers have multiple workers handling separate URLs simultaneously. How does this change the queuing process?
How can you account for crawler traps?
Note: To get the answers to the above questions, check out the detailed chapters on the web crawler System Design.
Problem statement: Design a distributed caching system that provides fast, scalable, and reliable data retrieval across multiple servers. The system should efficiently manage cache consistency, handle high volumes of read and write requests, ensure data availability, and provide mechanisms for cache eviction and expiration.
Follow these requirements for the distributed cache system:
Functional requirements
Nonfunctional requirements
A distributed caching system begins by partitioning the data across multiple cache nodes to balance the load and improve access speed. When a client requests data, an application server determines the appropriate cache node based on a consistent hashing algorithm, ensuring an even distribution of requests and quick lookups.
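Here is a minimal sketch of how an application server might pick a cache node with consistent hashing. The node names, the use of MD5 (chosen only as a well-distributed hash, not for security), and the 100 virtual nodes per server are illustrative assumptions.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes=(), virtual_nodes=100):
        self.virtual_nodes = virtual_nodes
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node owns many positions to even out the load.
        for i in range(self.virtual_nodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_node(self, key):
        """Return the node owning the first ring position clockwise from the key."""
        if not self._points:
            raise LookupError("ring has no nodes")
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
node_for_user = ring.get_node("user:42")
```

The property that matters in an interview is that removing a node only remaps the keys that node owned. Keys on the surviving nodes stay where they are, so most of the cache remains warm.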
If the data is found in the cache (a cache hit), it is returned to the client immediately, significantly reducing latency. If the data is not found (a cache miss), the system retrieves it from the primary data store, caches it, and then serves it to the client. Cache eviction policies, such as least recently used (LRU) or time-to-live (TTL), manage the removal of stale data to free up space.
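The hit/miss flow and both eviction policies fit in a short sketch. This one assumes a single cache node, an explicit `now` argument instead of a real clock, and a plain dictionary as the primary data store. The names `LruTtlCache` and `read_through` are hypothetical.

```python
from collections import OrderedDict


class LruTtlCache:
    """Single-node cache combining LRU eviction with a per-entry time-to-live."""

    def __init__(self, capacity, ttl_seconds):
        self.capacity = capacity
        self.ttl_seconds = ttl_seconds
        self._items = OrderedDict()  # key -> (value, expiry time); least recently used first

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            return None                  # cache miss
        value, expires_at = entry
        if now >= expires_at:
            del self._items[key]         # expired entries count as misses
            return None
        self._items.move_to_end(key)     # mark as most recently used
        return value

    def put(self, key, value, now):
        self._items[key] = (value, now + self.ttl_seconds)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


def read_through(cache, data_store, key, now):
    """Serve from the cache on a hit; on a miss, load from the store and cache the result.

    Assumes stored values are never None, since None signals a miss here.
    """
    value = cache.get(key, now)
    if value is None:
        value = data_store[key]
        cache.put(key, value, now)
    return value


primary_store = {"profile:1": "Ada"}
cache = LruTtlCache(capacity=2, ttl_seconds=10)
first_read = read_through(cache, primary_store, "profile:1", now=0)   # miss, then cached
second_read = read_through(cache, primary_store, "profile:1", now=1)  # hit
```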
Knowledge test!
How do you ensure data consistency across multiple cache nodes, especially during updates and deletions?
What strategies can be implemented to handle cache misses efficiently without overloading the primary data store?
What methods can maintain low latency and high throughput under heavy load conditions?
How do you secure the cache data against unauthorized access and ensure privacy?
Note: To answer such conceptual questions, check out the detailed design of the distributed cache.
Problem statement: Design a scalable, reliable, and secure real-time chat service like Facebook Messenger or WhatsApp to support instant messaging, group chats, notifications, and multimedia sharing.
Follow these requirements for the WhatsApp System Design:
Functional requirements
Nonfunctional requirements
In a real-time communication system, senders and receivers are connected to chat servers. Chat servers deliver messages from sender to receiver via a messaging queue. Various protocols, such as WebSocket, XMPP, MQTT, and Real-time Transport Protocol (RTP), can be utilized for real-time communication. For this purpose, a manager establishes real-time connections between clients and chat servers; for instance, a WebSocket manager establishes WebSocket connections between users and different chat servers. The messages are also persistently stored in the database.
Knowledge test!
What happens if a message is sent when the user isn’t connected to the internet? Is it sent when the connection is restored?
How will you encrypt and decrypt the message without increasing latency?
How do users receive notifications?
Are messages pulled from the device (the server periodically asks devices whether they have a message waiting to be sent) or pushed to the server (the device notifies the server that it has a message to send)?
Note: Look at the detailed design of the real-time chat service to get answers to such questions.
Problem statement: Design a social media service like Instagram that is used by several million users. Users should be able to view a newsfeed with posts from the users they follow, along with suggested new content they may like.
Follow these requirements for the Instagram system:
Functional requirements
Nonfunctional requirements
Based on the above requirements, let’s create a high-level design of a feed-based social system like Instagram.
The high-level design of a feed-based social network includes a post service, a timeline generation service, a feed publishing service, and a feed ranking and recommendation engine. The post service handles the clients’ posts, and each post is published on the client’s wall (page). The timeline generation service generates feeds for friends and followers. It utilizes the feed ranking and recommendation engine, which ranks and recommends the top N posts to followers based on their interests, searches, and watch history. The generated feed is stored in the database, and the feed publishing service is responsible for publishing and showing the generated feeds to followers. Since the feed could contain videos, the CDN is responsible for delivering the videos to followers with low latency.
Knowledge test!
Influencers or celebrities will have millions of followers; how are they handled vs. standard users?
How does the system weight posts by age? Old posts are less likely to be viewed than new posts.
What’s the ratio of read- and write-focused nodes? Are there likely to be more read requests (users viewing posts) or write requests (users creating posts)?
How can you increase availability? How does the system update? What happens if a node fails?
How do you efficiently store posts and images?
Note: Look at the detailed design of Instagram for better understanding.
Problem statement: Design a proximity server that stores and reports the distance to places like restaurants. Users can search nearby places by distance or popularity. The database must store data for hundreds of millions of businesses across the globe.
Follow these requirements for a System Design like Yelp:
Functional requirements
Nonfunctional requirements
The system handles search requests by using load balancers to distribute read requests to the read service, which then queries the quadtree service to identify relevant places within a specified radius. The quadtree service also refines the results before they are sent to the clients. For adding places or feedback, write requests are similarly routed through load balancers to the write service, which updates a relational database and stores images in blob storage. The system also segments the world map into smaller parts, stores places in a key-value store, and periodically updates these segments to include new places. This update runs only monthly because new places are added infrequently.
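For intuition about what the quadtree service does, here is a minimal point quadtree with a radius search. It assumes flat x/y coordinates and straight-line distance rather than latitude/longitude, and the class name, leaf capacity, and depth limit are illustrative choices of mine.

```python
class QuadTree:
    """Point quadtree: a leaf splits into four quadrants once it exceeds `capacity`."""

    def __init__(self, x_min, y_min, x_max, y_max, capacity=4, depth=0, max_depth=16):
        self.bounds = (x_min, y_min, x_max, y_max)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.places = []      # (x, y, name) tuples held by this leaf
        self.children = None  # four sub-quadrants after a split

    def insert(self, x, y, name):
        x_min, y_min, x_max, y_max = self.bounds
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            return False
        if self.children is None:
            # Store here while there is room, or when the depth limit forbids splitting.
            if len(self.places) < self.capacity or self.depth >= self.max_depth:
                self.places.append((x, y, name))
                return True
            self._split()
        return any(child.insert(x, y, name) for child in self.children)

    def _split(self):
        x_min, y_min, x_max, y_max = self.bounds
        x_mid, y_mid = (x_min + x_max) / 2, (y_min + y_max) / 2
        quadrants = [(x_min, y_min, x_mid, y_mid), (x_mid, y_min, x_max, y_mid),
                     (x_min, y_mid, x_mid, y_max), (x_mid, y_mid, x_max, y_max)]
        self.children = [QuadTree(*q, self.capacity, self.depth + 1, self.max_depth)
                         for q in quadrants]
        for place in self.places:
            any(child.insert(*place) for child in self.children)
        self.places = []

    def search(self, x, y, radius):
        """Return names of all places within `radius` of (x, y)."""
        x_min, y_min, x_max, y_max = self.bounds
        # Prune quadrants whose bounding box cannot intersect the search circle.
        nearest_x = min(max(x, x_min), x_max)
        nearest_y = min(max(y, y_min), y_max)
        if (nearest_x - x) ** 2 + (nearest_y - y) ** 2 > radius ** 2:
            return []
        found = [name for px, py, name in self.places
                 if (px - x) ** 2 + (py - y) ** 2 <= radius ** 2]
        for child in self.children or []:
            found.extend(child.search(x, y, radius))
        return found


city = QuadTree(0, 0, 10, 10, capacity=2)
for name, x, y in [("cafe", 1, 1), ("bakery", 2, 2), ("diner", 8, 8),
                   ("bar", 9, 1), ("deli", 1.5, 1.2), ("pizza", 5, 5)]:
    city.insert(x, y, name)
nearby = sorted(city.search(1, 1, radius=2))
```

Dense areas end up with deeper subtrees while sparse areas stay shallow, which is how this structure adapts to different population densities.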
Knowledge test!
How do you store lots of data and retrieve search results quickly?
How should the system handle different population densities?
Can we optimize commonly searched locations?
Note: Look at the detailed design of Yelp to get answers to the above questions.
Problem statement: Design a typeahead suggestion system that provides real-time, relevant autocomplete and autocorrect suggestions as users type, ensuring low latency and scalability to efficiently handle a large volume of queries.
Follow these requirements for the system:
Functional requirements
Nonfunctional requirements
When a user starts typing a query, each individual character is sent to an application server. A suggestion service gathers the top N suggestions from a distributed cache, such as Redis, and returns the list to the user. A separate service called the data collector and aggregator takes the query, analytically ranks it, and stores it in a NoSQL database. The trie builder is a service that takes the aggregated data from the NoSQL database, builds tries, and stores them in the trie database.
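To show what the trie builder produces and how the suggestion service reads it, here is a minimal in-memory trie that returns the top N completions for a prefix by frequency. The class names and the sample query counts are invented for illustration. A production system would typically precompute the top N at each node rather than walking the subtree on every keystroke.

```python
import heapq


class TrieNode:
    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.frequency = 0   # > 0 only where a complete query ends


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, query, count=1):
        """Record `count` occurrences of a complete query."""
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.frequency += count

    def suggest(self, prefix, n=3):
        """Return up to `n` completions of `prefix`, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        matches = []
        stack = [(node, prefix)]
        while stack:  # depth-first walk of the subtree under the prefix
            current, text = stack.pop()
            if current.frequency:
                matches.append((current.frequency, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Highest frequency first; ties broken alphabetically.
        best = heapq.nsmallest(n, matches, key=lambda m: (-m[0], m[1]))
        return [text for _, text in best]


trie = Trie()
for query, count in [("system design", 50), ("systemd", 30),
                     ("system of equations", 20), ("syntax", 10)]:
    trie.add(query, count)
top_two = trie.suggest("sys", n=2)
```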
Knowledge test!
How strongly do you weigh spelling mistake corrections?
How do you update selections without causing latency?
How do you determine the most likely completed query? Does it adapt to the user’s searches?
What happens if the user types very quickly? Do suggestions only appear after they’re done?
Note: Look at the detailed design of the Typeahead system for a better understanding of the system.
Problem statement: Design a video streaming service like YouTube or Netflix that allows users to upload and stream videos. The service should efficiently store many videos and their metadata and return accurate and quick results for user search queries.
Follow these requirements for a streaming service System Design:
Functional requirements
Nonfunctional requirements
A load balancer first handles video upload requests by sending them to the application servers. The application server interacts with the video service, which triggers transcoders to convert the video into different formats and resolutions. These typically range from 144p to 1440p but can reach 4K. The transcoded video is then saved to the blob store, and its metadata is stored in the metadata database. The video service sends the transcoded video to a content delivery network (CDN), which holds popular and recent uploads and reduces latency when delivering video to end users. In conjunction with colocation sites, the CDN stores and delivers requested data to users.
Knowledge test!
How will your service ensure smooth video streaming on various internet qualities?
How are the videos stored?
How will the system provide a personalized experience to each user with recommendations?
How does the system react to a sudden drop in network quality, for example by shifting to a lower resolution or buffering content?
Note: Check out the detailed chapter on YouTube System Design that answers the above concerns during the design.
Problem statement: Design a system for a ride-sharing service similar to Uber, where users can request rides and drivers can accept these requests. The system should efficiently match drivers to riders based on location and availability, handle real-time updates on ride statuses, manage payments securely, and ensure a smooth user experience from booking to completion of the ride.
Follow these requirements for the System Design:
Functional requirements
Nonfunctional requirements
A user’s request is sent to the application server via a load balancer and API gateway. The system accepts the rider’s request and the trip service or manager provides an estimated time of arrival (ETA) based on different vehicle types. The drivers and location manager use a matching algorithm to find the nearest available drivers and send the request to those drivers by notifying them via a notification service. When a driver matches with a rider, the application should return the trip and rider information. The driver’s location is regularly recorded and communicated to relevant users through a pub-sub service.
Once the ride is complete, the trip manager ensures payment is securely processed through a payment gateway. We leverage a database that stores user and driver profiles, ride history, and payment information. We also use caching mechanisms to speed up access to frequently requested data, and constant monitoring ensures the service runs smoothly.
Knowledge test!
How can you keep latency low during busy periods?
How is the driver paired with the user? Iterating all drivers to find Euclidean distance would be inefficient.
What happens if the driver or user loses connection?
How would you update the ETA during a ride in peak hours?
Note: Check out our guide to designing Uber’s backend for more information on the Uber interview process.
Problem statement: Design a recommendation engine that suggests personalized content or products to users based on their preferences and behavior. The system should efficiently analyze user data, such as past interactions and ratings, to provide accurate and relevant recommendations.
Follow these requirements for a recommendation service:
Functional requirements
Nonfunctional requirements
The recommendation engine’s System Design comprises three stages: data collection, data processing, and recommendation. When users interact with the application, the data collector service collects data from application servers, such as search, viewing history, ratings, watch times, etc. This data is logged into Kafka for immediate processing.
We use real-time processors to process data and recommend content accordingly. We also use batch processors for periodic offline processing to perform detailed analyses and improve accuracy. Once the data is processed, the ML/AI engine uses different algorithms, such as collaborative filtering, content-based filtering, hybrid approaches, and some advanced techniques to recommend personalized suggestions.
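Of the algorithms named above, collaborative filtering is the easiest to sketch. Below is a minimal user-based version that scores unseen items by the similarity-weighted ratings of other users. The toy ratings, the function names, and the choice of cosine similarity are my assumptions, and ratings are assumed to be positive numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users' rating dictionaries (item -> rating)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=2):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        if similarity <= 0:
            continue  # ignore users with no overlap
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
                weights[item] = weights.get(item, 0.0) + similarity
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=lambda item: (-predicted[item], item))[:top_n]


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"inception": 5, "matrix": 4},
    "ben": {"inception": 5, "matrix": 5, "interstellar": 5},
    "cy": {"titanic": 3, "notebook": 2, "matrix": 1},
}
suggestions = recommend(ratings, "ana")
```

A brand-new user has no ratings, so every similarity is zero and the function returns an empty list. That is the cold start problem in miniature.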
Knowledge test!
How will you handle the cold start problem for new users and content?
How would you update recommendations in real time?
How would you ensure the recommendation system scales to ever-increasing users?
What strategies would you employ to adjust recommendations dynamically based on real-time changes in user behavior or preferences?
How can you optimize recommendation accuracy without compromising on scalability and performance?
Functional requirements
Nonfunctional requirements
In a high-level design of a file-sharing service like Google Drive, the user’s request to upload or download a file passes through a load balancer to the application servers. The application server sends the upload request to a chunk service, which splits large files into smaller, more manageable chunks. These chunks are then sent to a processing queue, which handles requests to store metadata and ensures that files are synchronized between users and accounts. Files are stored in a cloud-based object storage platform, like Amazon S3 (or on-premises blob storage). When a user wants to upload or download files, they contact this storage service through a web server.
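Chunking is also what makes partial updates cheap. The sketch below assumes fixed-size chunks identified by SHA-256 digests, and the function names and the tiny 8-byte chunk size are purely illustrative. A real service would use much larger chunks, and fixed-size chunking only helps when an edit doesn't shift the bytes that follow it.

```python
import hashlib


def split_into_chunks(data, chunk_size):
    """Split a bytes object into consecutive fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_manifest(data, chunk_size):
    """Metadata for a file: the ordered list of its chunk digests."""
    return [hashlib.sha256(chunk).hexdigest()
            for chunk in split_into_chunks(data, chunk_size)]


def changed_chunks(old_manifest, new_manifest):
    """Indexes of chunks that are new or differ, and so must be uploaded again."""
    return [i for i, digest in enumerate(new_manifest)
            if i >= len(old_manifest) or old_manifest[i] != digest]


original = b"a" * 8 + b"b" * 8 + b"c" * 8
edited = b"a" * 8 + b"X" * 8 + b"c" * 8
to_upload = changed_chunks(chunk_manifest(original, 8), chunk_manifest(edited, 8))
```

Only the middle chunk changes between the two versions, so only that chunk needs to be re-uploaded.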
Knowledge test!
Where are the files stored?
How do you handle updates? Do you re-upload the entire file again?
Do small updates require a full file update?
How does the system handle two users updating a document simultaneously?
Note: To further your learning, explore the detailed design of distributed file systems of tech giants like Google and Facebook (Meta).
Problem statement: These social network sites operate on a forum-based system that allows users to post questions and links. For simplicity’s sake, focus more on designing Quora. It is unlikely that you’ll need to walk through the design of something like Reddit’s subreddit or karma system in an interview.
Follow these requirements for a System Design like Quora:
Functional requirements
Nonfunctional requirements
In Quora’s high-level design, users interact through a web server, which communicates with an application server to handle actions such as posting questions, answers, and comments. Content like images and videos is stored in blob storage, and question-and-answer data, along with user profiles and interactions, are stored in a MySQL database.
A machine learning engine analyzes user interactions and content to rank answers based on relevance and quality. This engine continuously learns from user feedback to improve its ranking algorithms. For personalized user experiences, a recommendation system utilizes machine learning models to tailor content based on individual interests and behaviors.
Knowledge test!
How can you ensure the system’s scalability to handle millions of simultaneous users posting questions and answers?
What strategies can efficiently store and retrieve large multimedia content in blob storage?
How would you design the database schema to manage the relationships between users, questions, answers, and comments in a scalable way?
What techniques can be used to rank answers effectively, ensuring that high-quality content is prioritized for users?
How can you optimize the performance of the machine learning engine to rank answers quickly and accurately?
Note: Check out the detailed chapter on Quora System Design to help you understand the system.
Problem statement: Design a scalable and efficient social media newsfeed system that delivers personalized, real-time content updates to users, ensuring low latency, high availability, and scalability.
Follow these requirements for the design:
Functional requirements
Nonfunctional requirements
In the high-level design of a newsfeed system, clients post or request their newsfeed through the app, and the load balancer redirects these requests to a web server for authentication and routing. Whenever one of a user’s friends (or followers) creates a post via the post service, the notification service informs the newsfeed generation service, which generates newsfeeds from the posts of the user’s friends (followers) and keeps them in the newsfeed cache. The newsfeed publishing service then publishes the generated feeds from the newsfeed cache to the user’s timeline. If required, it also appends multimedia content from the blob storage to the newsfeed.
Knowledge test!
The creation and storage of newsfeeds for each user in the cache requires an enormous amount of memory. Is there any way to reduce this memory consumption?
What mechanisms would you implement to prioritize and filter content in the newsfeed to prevent information overload for users?
How can the system ensure consistency and order of posts in the newsfeed, especially in a distributed environment with multiple data centers?
Note: If you need answers to such questions, look at the detailed design of a newsfeed service.
Problem statement: Design a collaborative editing service that lets users remotely and simultaneously make changes to text documents. The changes should be displayed in real time. Much like other cloud-based services, documents should be consistently available to any logged-in user on any machine. Your solution must be scalable to support thousands of concurrent users.
Follow these requirements for the Google Docs system:
Functional requirements
Nonfunctional requirements
Clients’ requests are forwarded to the operations queue, where conflicts are resolved between different collaborators, and the data is stored in the time series database and blob storage (responsible for storing media files). Autocomplete suggestions are made via the typeahead service. This service resides on the Redis cache to enable low latency suggestions and enhance the speed of the regular updates process. The application servers perform a number of important tasks, including importing and exporting documents. Application servers also convert documents from one format to another. For example, a .doc or .docx document can be converted into .pdf or vice versa.
Knowledge test!
How do you minimize latency when multiple users are distant from the server?
What techniques for conflict resolution are best for ensuring consistency?
Note: If you need answers to such questions, look at the detailed design of Google Docs.
Problem statement: Design a service that can map the route between two locations. The system should map several optimal paths to a destination based on the mode of travel. Each route should display the total mileage and an estimated time of arrival.
Follow these requirements for the Google Maps system:
Functional requirements
Nonfunctional requirements
In the Google Maps system, clients request location-based services, such as finding a route or searching for nearby points of interest. The load balancer directs requests to various services based on the nature of the query.
For routing requests, the route finder service calculates optimal paths between two or more points using real-time and historical data. It relies on the graph processing service to perform complex calculations on the road network graph stored in the graph database. The location finder service provides the user’s current location or identifies the location of a specified point of interest. The area search system enables users to find nearby places, such as restaurants or gas stations, by querying both the graph database and third-party road data sources.
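As a rough illustration of what the route finder computes, the sketch below runs Dijkstra's shortest-path algorithm over a tiny road graph. The graph, node names, and edge weights are made up for the example, and a production system would likely use more specialized algorithms and live traffic data, so read this as a sketch under those assumptions.

```python
import heapq


def shortest_route(graph, source, target):
    """Return (total_cost, path) for the cheapest route, or (inf, []) if unreachable."""
    best = {source: 0.0}          # cheapest known cost to each node
    previous = {}                 # back-pointers used to rebuild the path
    frontier = [(0.0, source)]    # min-heap of (cost so far, node)
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(frontier, (new_cost, neighbor))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical road network: edge weights could be distance or travel time.
roads = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"D": 5.0},
    "C": {"B": 1.0, "D": 8.0},
    "D": {},
}
cost, path = shortest_route(roads, "A", "D")
print(cost, path)
```

Swapping the edge weights from distance to estimated travel time is enough to make the same search optimize for ETA instead of mileage.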
Knowledge test!
How do you collect the world map data? What third-party source will you use?
How do you segment the map to avoid long loading times?
How do you ensure the accuracy of ETA calculations for high-traffic times of day?
Note: Look at the detailed design of Google Maps to get answers to the questions above.
Problem statement: Design a payment gateway like Stripe capable of securely performing online or card transactions and handling millions of users simultaneously.
Follow these requirements for the system:
Functional requirements
Nonfunctional requirements
Initially, a customer selects a product or service in the merchant’s online store and proceeds to the checkout page to provide their payment details, including card number, cardholder name, CVV or CVC, and expiration date. Clicking the pay button generates an event that hits the payment service, which stores the event, performs initial security checks, and forwards the payment details to the payment service provider for further operations. The payment gateway performs extensive security checks, moves money from the customer’s account to the merchant’s, and provides secondary services like handling refunds and generating invoices. The card information is verified via APIs provided by the card network.
Once the payment is processed, the wallet and ledger service updates the merchant’s wallet in the database to track total revenue and processes each order separately in case of multiple sellers. The reconciliation system matches and verifies financial records to ensure accurate transaction accounting, identifying and resolving any discrepancies.
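The reconciliation step can be pictured as a comparison between the internal ledger and the records reported by the payment provider. The sketch below is a simplified illustration: the transaction ids, the dictionary layout, and the three discrepancy buckets are assumptions made for the example, not part of any real provider's API.

```python
from decimal import Decimal


def reconcile(ledger, settlements):
    """Compare internal ledger amounts with the provider's settlement amounts."""
    report = {
        "missing_in_settlement": [],  # we recorded it, the provider did not
        "missing_in_ledger": [],      # the provider recorded it, we did not
        "amount_mismatch": [],        # both recorded it, amounts differ
    }
    for txn_id, amount in ledger.items():
        if txn_id not in settlements:
            report["missing_in_settlement"].append(txn_id)
        elif settlements[txn_id] != amount:
            report["amount_mismatch"].append(txn_id)
    for txn_id in settlements:
        if txn_id not in ledger:
            report["missing_in_ledger"].append(txn_id)
    return report


# Decimal avoids the rounding surprises of binary floats when handling money.
ledger = {"t1": Decimal("19.99"), "t2": Decimal("5.00"), "t3": Decimal("42.10")}
settlements = {"t1": Decimal("19.99"), "t2": Decimal("5.50"), "t4": Decimal("7.25")}
report = reconcile(ledger, settlements)
print(report)
```

Every id that lands in one of the three buckets is a discrepancy that someone, or some automated process, has to investigate and resolve.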
Knowledge test!
Where are the customer’s payment details encrypted during a purchase?
How does the card network authorize a debit/credit card?
Problem statement: Design a food-delivery service like Uber Eats or DoorDash that efficiently connects hungry customers with diverse restaurants, ensuring timely and accurate order fulfillment while optimizing delivery routes and driver earnings.
Follow these requirements for the DoorDash system:
Functional requirements
Nonfunctional requirements
The high-level design of DoorDash consists of several services for different purposes. Let’s describe the workflow and the interaction of the different services involved in the design.
Customers’ requests are routed through the API gateway and directed to different services via the load balancer. The search service searches for menu items, cuisines, restaurants, etc. It is one of the busiest services most customers use when searching the website or application. The ordering service handles menu selection, managing the shopping cart, and placing food orders. Additionally, it facilitates payment processing through an external payment gateway and stores the outcomes in the relevant database. The order fulfillment service is used to manage the orders that the restaurants have accepted. It also keeps track of orders being prepared.
Customers and restaurant staff use the user management service to create and manage their profiles. The dispatch service displays the orders that are ready to be picked up. It is also used to view delivery information and facilitate communication between customers and restaurant staff.
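One way to picture how the ordering, order fulfillment, and dispatch services hand an order to each other is as a small state machine. The statuses and allowed transitions below are assumptions chosen for illustration; the design above does not prescribe them.

```python
from enum import Enum


class OrderStatus(Enum):
    PLACED = "placed"
    ACCEPTED = "accepted"
    PREPARING = "preparing"
    READY_FOR_PICKUP = "ready_for_pickup"
    PICKED_UP = "picked_up"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"


# Hypothetical transition table: each status maps to the statuses it may move to.
ALLOWED_TRANSITIONS = {
    OrderStatus.PLACED: {OrderStatus.ACCEPTED, OrderStatus.CANCELLED},
    OrderStatus.ACCEPTED: {OrderStatus.PREPARING, OrderStatus.CANCELLED},
    OrderStatus.PREPARING: {OrderStatus.READY_FOR_PICKUP},
    OrderStatus.READY_FOR_PICKUP: {OrderStatus.PICKED_UP},
    OrderStatus.PICKED_UP: {OrderStatus.DELIVERED},
}


def advance(current, new):
    """Return the new status, or raise if the transition is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {new.value}")
    return new


def ready_for_dispatch(orders):
    """Return ids of orders a dispatch view would list as ready for pickup."""
    return [oid for oid, status in orders.items()
            if status is OrderStatus.READY_FOR_PICKUP]


status = OrderStatus.PLACED
for step in (OrderStatus.ACCEPTED, OrderStatus.PREPARING, OrderStatus.READY_FOR_PICKUP):
    status = advance(status, step)

orders = {"order-1": status, "order-2": OrderStatus.PREPARING}
print(ready_for_dispatch(orders))
```

Keeping the transitions in one table makes it harder for two services to disagree about what state an order can be in.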
Knowledge test!
How would you handle a sudden surge in orders during peak hours, like on Super Bowl Sunday?
How would you leverage customer and delivery data to personalize recommendations, improve order accuracy, and optimize pricing?
How would you protect sensitive customer and payment information from breaches?
Problem statement: Design a highly available, fault-tolerant distributed locking service like Google Chubby to coordinate access to shared resources in a large-scale distributed system.
Follow these requirements for the Google Chubby locking system:
Functional requirements
Nonfunctional requirements
The Chubby cell is composed of multiple servers (usually five), all replicas of each other. One of these servers is a leader with whom the clients must communicate. Each server has a namespace that is composed of directories and files that contain data that is relevant to different applications. In addition to this namespace, the server contains an
Knowledge test!
How does Chubby recover from server failures and network partitions while maintaining data consistency?
How does Chubby handle client failures and session timeouts?
Note: Look at the detailed design of Google Chubby locking to get answers to the above questions.
Problem statement: Design a highly available, fault-tolerant, and scalable coordination system like ZooKeeper to manage configuration, naming, synchronization, and group services in a distributed system.
Follow these requirements for the ZooKeeper system:
Functional requirements
Nonfunctional requirements
The clients are the applications that use ZooKeeper as a coordination service for their application processes. The ZooKeeper client library (API) provides functions such as create(), delete(), exists(), and many more to manage and use the coordination data. Through this API, the client request is forwarded to the ZooKeeper server. The ZooKeeper server represents a process that provides the ZooKeeper coordination service. It stores all the coordination data from different applications and their processes in memory. The namespace for applications/clients and their coordination data is organized in a hierarchy (in the form of a tree). The client application processes store their coordination data on znodes. These processes can perform all the operations provided in the ZooKeeper client API. Each znode can be accessed through its path in the standard UNIX notation (like having / for the root directory). A set of ZooKeeper servers is called a ZooKeeper ensemble. All servers are replicas: one is elected as the leader, while the others become followers.
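The hierarchical namespace is easier to see in a toy model. The class below keeps znodes in memory, keyed by their UNIX-style path, and offers `create`, `delete`, and `exists` methods named after the operations mentioned above. It is not the real ZooKeeper client library: the `ZnodeTree` class, its signatures, and its error types are simplified assumptions.

```python
class ZnodeTree:
    """Toy in-memory namespace; not the real ZooKeeper client or server."""

    def __init__(self):
        self._data = {"/": b""}  # path -> data stored at that znode

    @staticmethod
    def _parent(path):
        parent = path.rsplit("/", 1)[0]
        return parent or "/"

    def exists(self, path):
        return path in self._data

    def create(self, path, data=b""):
        if self.exists(path):
            raise KeyError(f"znode already exists: {path}")
        if not self.exists(self._parent(path)):
            raise KeyError(f"parent znode missing: {path}")
        self._data[path] = data

    def delete(self, path):
        if path == "/":
            raise ValueError("cannot delete the root znode")
        if not self.exists(path):
            raise KeyError(f"no such znode: {path}")
        if any(p.startswith(path + "/") for p in self._data):
            raise ValueError(f"znode has children: {path}")
        del self._data[path]


tree = ZnodeTree()
tree.create("/app")
tree.create("/app/config", b"feature_flag=on")
print(tree.exists("/app/config"))   # the child is reachable by its full path
tree.delete("/app/config")
tree.delete("/app")
print(tree.exists("/app"))
```

In this sketch a znode can only be created under an existing parent and only deleted once it has no children, which keeps the tree consistent.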
Knowledge test!
We have a collection of servers in the ZooKeeper ensemble. What should be the minimum number of servers, and why?
Note: If you need answers to such questions, look at the detailed design of ZooKeeper.
Problem statement: Design a massively scalable distributed storage system like Bigtable capable of handling petabytes of structured and unstructured data with low latency reads and writes, supporting flexible schema, efficient query patterns, and high availability while ensuring data consistency and durability.
Follow these requirements for the Bigtable system:
Functional requirements
Nonfunctional requirements
The Bigtable implementation consists of three main parts: a library linked to each client, one Bigtable manager server, and several tablet servers. The library is a component that all clients share, and it enables clients to communicate with Bigtable. The manager server allocates tablets to tablet servers, detects the addition and expiration of tablet servers, balances tablet-server load, and garbage-collects files in GFS (a distributed file system). It also supports schema changes like table and column family creation. Each tablet server is in charge of a group of tablets, generally around 10 to 1,000, and serves reads and writes for the tablets assigned to it. Servers can be added or removed in a Bigtable cluster as needed. New tablets can be created and assigned, old ones can be merged, and tablets can be reassigned from one server to another to accommodate changes in demand.
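To illustrate how tablets map to tablet servers, the sketch below treats each tablet as a contiguous range of row keys and uses binary search to find the server responsible for a given row. The `TabletMap` class, the start-key representation, and the server names are assumptions for the example, not Bigtable's actual metadata format.

```python
from bisect import bisect_right


class TabletMap:
    """Toy routing table from row-key ranges (tablets) to tablet servers."""

    def __init__(self, assignments):
        # assignments: list of (start_row_key, server); the first start key
        # must be "" so that every possible row key falls into some tablet.
        ordered = sorted(assignments)
        self._starts = [start for start, _ in ordered]
        self._servers = [server for _, server in ordered]

    def server_for(self, row_key):
        """Find the tablet whose range contains row_key and return its server."""
        index = bisect_right(self._starts, row_key) - 1
        return self._servers[index]

    def reassign(self, start_key, server):
        """Move the tablet that starts at start_key to another server."""
        self._servers[self._starts.index(start_key)] = server


tablets = TabletMap([("", "ts-1"), ("g", "ts-2"), ("p", "ts-3")])
print(tablets.server_for("apple"), tablets.server_for("grape"), tablets.server_for("zebra"))

# The manager server can rebalance by reassigning a tablet, for example
# when a new tablet server (hypothetically "ts-4") joins the cluster.
tablets.reassign("p", "ts-4")
print(tablets.server_for("zebra"))
```

Because rows are kept in sorted order, a lookup only needs the sorted list of range boundaries, and moving a tablet changes one entry rather than reshuffling the data.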
Knowledge test!
How does Bigtable efficiently support schema changes without impacting performance?
How does Bigtable ensure data distribution and replication across multiple servers?
Note: If you need answers to such questions, look at the detailed design of Bigtable.
Problem statement: Design an online multiplayer game system that allows players to connect and play in real time. The system should handle player matchmaking, maintain low-latency communication, ensure synchronization between players, and consistently manage game state.
Follow these requirements for such a system:
Functional requirements
Nonfunctional requirements
In an online multiplayer game system, players connect to the game server, which handles matchmaking by pairing players based on skill levels and preferences. Once matched, the server maintains low-latency communication between players, ensuring smooth and real-time interactions using a pub-sub service. The game state, including player positions and actions, is synchronized across all players’ devices through a central game state manager. The session service manages sessions and synchronizes the players. The play service will handle all the game-related tasks like updating stats, checking player’s availability, etc. The payment service facilitates in-app purchases of assets.
For a better user experience, we can separate real-time operations, such as gameplay, from non-real-time operations, such as invites and in-app purchases.
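Skill-based matchmaking can be sketched very simply: sort the waiting players by rating and pair neighbors whose ratings are close enough. The `max_gap` threshold, the tuple layout, and the player names are assumptions for illustration; real matchmakers usually also weigh wait time, region, and preferences.

```python
def match_players(waiting, max_gap=100):
    """Pair waiting players whose ratings differ by at most max_gap.

    waiting: list of (player_id, rating) tuples.
    Returns (matches, unmatched), where matches is a list of id pairs.
    """
    ordered = sorted(waiting, key=lambda player: player[1])
    matches, unmatched = [], []
    i = 0
    while i < len(ordered):
        has_next = i + 1 < len(ordered)
        if has_next and ordered[i + 1][1] - ordered[i][1] <= max_gap:
            matches.append((ordered[i][0], ordered[i + 1][0]))
            i += 2  # both players leave the queue
        else:
            unmatched.append(ordered[i][0])
            i += 1  # this player keeps waiting
    return matches, unmatched


queue = [("ana", 1500), ("bo", 1210), ("cy", 1540), ("dee", 1250), ("eli", 1900)]
matches, unmatched = match_players(queue)
print(matches, unmatched)
```

Players left in `unmatched` stay in the queue; a common refinement is to widen `max_gap` the longer someone has been waiting.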
Knowledge test!
How can you ensure system stability and prevent crashes when millions of users play simultaneously?
How would you implement lag compensation and data buffering to handle network delays and ensure smooth gameplay?
What are the benefits of using a virtual private cloud (VPC)?
How can you maintain low latency for real-time communication, especially during peak usage?
How would you limit the number of requests to the server without compromising the real-time gaming experience?
Note: To learn more about the details of gaming service design, explore the gaming API design chapter.
Problem statement: Design a video conference service that allows users to host and join high-quality video meetings with multiple participants. The system should support real-time audio and video communication, manage user interactions and meeting controls, and handle varying network conditions.
Follow these requirements to design the system:
Functional requirements
Nonfunctional requirements
All requests in a video conference service pass through a load balancer and API gateway to the backend services. Users can schedule meetings using the scheduling service, which organizes and manages meeting details. When a meeting starts, the media server handles real-time audio and video communication, encoding and decoding media to ensure high-quality streaming. The meeting management service oversees participant roles and permissions, managing who can join or leave the meeting.
For screen sharing, the media server uses a dedicated screen-sharing module to capture the participant’s screen and stream it to others. A chat service facilitates real-time text communication among participants. To ensure security, the system implements encryption and other measures to protect communication and user data.
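The role and permission checks handled by the meeting management service might look something like the sketch below. The two roles, the action names, and the `Meeting` class are hypothetical and chosen only to show the idea.

```python
from enum import Enum


class Role(Enum):
    HOST = "host"
    PARTICIPANT = "participant"


# Hypothetical permission table: which actions each role may perform.
PERMISSIONS = {
    Role.HOST: {"chat", "share_screen", "mute_others", "remove_participant"},
    Role.PARTICIPANT: {"chat", "share_screen"},
}


class Meeting:
    def __init__(self, host):
        self.roles = {host: Role.HOST}  # user id -> role in this meeting

    def join(self, user):
        # Users who are already present keep their existing role.
        self.roles.setdefault(user, Role.PARTICIPANT)

    def leave(self, user):
        self.roles.pop(user, None)

    def can(self, user, action):
        role = self.roles.get(user)
        return role is not None and action in PERMISSIONS[role]

    def remove(self, actor, target):
        if not self.can(actor, "remove_participant"):
            raise PermissionError(f"{actor} may not remove participants")
        self.leave(target)


meeting = Meeting(host="alice")
meeting.join("bob")
meeting.join("carol")
print(meeting.can("bob", "share_screen"), meeting.can("bob", "mute_others"))
meeting.remove("alice", "carol")
print(sorted(meeting.roles))
```

Centralizing the checks in one place means the media server and chat service can ask a single question, `can(user, action)`, instead of each encoding its own rules.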
Knowledge test!
How can you ensure high-quality audio and video streams for all participants, especially when network conditions are unpredictable?
What strategies can be used to efficiently handle and scale up to thousands of simultaneous video streams in a single conference?
How would you implement real-time user interaction features, such as chat and screen sharing, without impacting the performance of video streaming?
What methods can optimize video and audio quality while minimizing latency and buffering?
How can you ensure fault tolerance and high availability of the video conference service to avoid disruptions?
Note: Explore the Zoom API design chapter to learn more about designing a video conference service and determine the answers to the above questions.
Mastering these 25 questions is a fantastic first step towards comprehensive System Design interview preparation.
However, there are plenty more System Design concepts you'll need to know for a real-world System Design Interview. For more detailed questions and answers – as well as the opportunity to actually get hands-on practice – Educative has created an exhaustive course: Grokking Modern System Design Interview.
This interactive course covers all of the building blocks of modern System Design concepts, coupled with more than a dozen real-world questions currently used in the industry. By the end of the course, you will understand what clarifying questions to ask and tradeoffs to make for each question. Ultimately, you will learn exactly what it takes to stand out to interviewers in the current hiring market.
That's why if I had to pick just one System Design prep resource to give you, this would be it.
I wish you the best of luck with your interviews. I am confident that with a little hard work and strategic preparation, you will be successful.
Happy learning!