Imagine a folder on your computer where you can drop any file, and it automatically syncs with your teammates so everyone can access the latest version. This is the power of Dropbox, a cloud-based file storage and synchronization service. Dropbox lets you collaborate by storing and sharing entire files online, ensuring the most current version is accessible from any device in real-time.
Some other cloud storage, file synchronization, and collaboration tools similar to Dropbox are:
Google Drive
Microsoft OneDrive
Box
According to market-share data, despite fierce competition from Google Drive, which holds 31.19% of the market, Dropbox remains a leader in collaboration tools with 18.61%, a testament to its focus on simplicity and robust syncing capabilities.
To begin with the Dropbox System Design, you must scope the problem by defining its requirements. For Dropbox, let’s assume the following functional and nonfunctional requirements:
Functional requirements: uploading and downloading files, sharing files and folders with other users, synchronizing changes across all of a user's devices, and file versioning with conflict resolution.
Nonfunctional requirements: high availability, durability of stored data, scalability to millions of users, low-latency synchronization, and strong security and privacy.
In System Design interviews, asking targeted questions is crucial to understanding the expectations and narrowing the problem’s scope. For example, clarifying functional and nonfunctional requirements, like those outlined here for Dropbox, can set the foundation for a solid design. If you are preparing for a System Design interview, you might find it valuable to explore the System Design interview process at FAANG/MAANG.
After defining the requirements, you should estimate the resources required to effectively handle expected user demand. This includes estimating storage, bandwidth, and the number of servers.
To estimate different resources, you need to roughly estimate the number of users, their requests, and the files they share each second. Let’s assume that Dropbox has 50 million paying users.
Actually, according to Backlinko (https://backlinko.com/dropbox-users), Dropbox has 18.22 million paying users as of Q2 2024. However, we can design the system for the future, assuming its user base grows over time.
Suppose that the average storage assigned to each user is 2 TB. The total storage estimate would be:
Total storage = 50 million users × 2 TB per user = 100 million TB = 100,000 PB (petabytes) = 100 EB (exabytes)
So, handling 50 million users requires roughly 100,000 PB, or 100 EB, of storage.
To estimate the required bandwidth, let's assume that the average daily usage of each user is 1 GB. The total bandwidth would be:
Total bandwidth = (50 million users × 1 GB per day) / 86,400 seconds per day ≈ 578.7 GB/s (roughly 4.6 Tb/s)
First, we need to estimate the number of queries per second (QPS) before we can estimate the number of servers. Let's assume that each active user sends 5 requests per minute, which yields the following QPS:
QPS = (50 million users × 5 requests per minute) / 60 seconds per minute ≈ 4.17 million requests per second
Note: According to back-of-the-envelope estimations, a 64-core server executes approximately 64,000 requests per second.
So, the total number of required servers will be:
Number of servers = 4.17 million QPS / 64,000 requests per second per server ≈ 66 servers
So, to summarize, for 50 million users, the following are the estimates for the different resources:
Estimated storage = 100,000 PB (100 EB)
Estimated bandwidth ≈ 578.7 GB/s
QPS ≈ 4.17 million requests per second
The number of servers ≈ 66
You can estimate these resources for different numbers of users with the following calculator inputs and outputs (a small Python version of the calculation follows the table):

| Parameter | Value | Unit |
| --- | --- | --- |
| The number of users | 50 | Million |
| Average storage assigned to each user | 2 | TB |
| Total estimated storage required | 100,000 | PB |
| Average daily usage of each user | 1 | GB |
| Total bandwidth | ~578.7 | GB/s |
| Requests a user sends per minute | 5 | Requests |
| QPS | ~4.2 | Million |
| The total number of servers | ~66 | Servers |
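If you'd like to rerun these back-of-the-envelope numbers with your own assumptions, here is a minimal Python sketch of the same calculation. The input values mirror the assumptions above; the function name and structure are illustrative only.

```python
def estimate_resources(users, storage_per_user_tb, daily_usage_gb,
                       requests_per_minute, requests_per_server=64_000):
    """Back-of-the-envelope estimates for a Dropbox-like service."""
    total_storage_pb = users * storage_per_user_tb / 1_000       # 1 PB = 1,000 TB
    bandwidth_gb_per_s = users * daily_usage_gb / 86_400         # seconds in a day
    qps = users * requests_per_minute / 60                       # requests per second
    servers = -(-qps // requests_per_server)                     # ceiling division
    return total_storage_pb, bandwidth_gb_per_s, qps, int(servers)

storage, bandwidth, qps, servers = estimate_resources(
    users=50_000_000, storage_per_user_tb=2, daily_usage_gb=1, requests_per_minute=5)

print(f"Storage:   {storage:,.0f} PB")      # ~100,000 PB (100 EB)
print(f"Bandwidth: {bandwidth:,.1f} GB/s")  # ~578.7 GB/s
print(f"QPS:       {qps:,.0f}")             # ~4.17 million requests per second
print(f"Servers:   {servers}")              # ~66
```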
Note: If you’re new to System Design, you might be interested in exploring the top 25 System Design interview questions.
Dropbox’s high-level design consists of different microservices, including metadata service, chunk service (chunker), metadata databases, cloud storage, and synchronization service. The chunk service is responsible for uploading or downloading the files to the cloud server’s internal file system. The metadata service handles the metadata associated with the files and users. Similarly, the synchronization service is responsible for syncing files across multiple devices to provide up-to-date data to the collaborators. These services notify collaborators about the changes made to the files on the cloud storage or any updates to the metadata database.
Let's dive into Dropbox's detailed System Design, starting with each functional requirement. We'll discuss, step by step, how to achieve each functionality and the architectural changes necessary to support it.
The file upload process in the Dropbox architecture starts when you place a file in the folder provided by the client application. The file is then split into smaller chunks, usually 4 MB each. Each chunk is hashed using SHA-256, producing a hash value that uniquely identifies it, and is then encrypted with the AES-256 algorithm before being sent to cloud storage. To ensure high availability, the encrypted chunks are stored on multiple servers.
The client-side application works in tandem with the chunk server, synchronization server, and cloud storage to upload, download, or synchronize any changes in a file on the cloud storage. When a client requests to download a file, all the associated encrypted chunks are downloaded to the client’s local folder, where these chunks are decrypted and reassembled on the client side. Dropbox also uses caching and compression algorithms to improve the download speeds and minimize resource use on the server side.
The following Python code shows the chunking process and how each chunk's SHA-256 hash is computed:
```python
import hashlib

def create_file_chunks_with_hash(file_path, chunk_size=4 * 1024 * 1024):
    chunks = []   # raw bytes of each chunk
    hashes = []   # SHA-256 hex digest of each chunk
    with open(file_path, "rb") as file:
        while True:
            chunk = file.read(chunk_size)        # read the next chunk_size bytes
            if not chunk:                        # end of file reached
                break
            chunks.append(chunk)
            hashes.append(hashlib.sha256(chunk).hexdigest())
    return chunks, hashes
```
Explanation of the above code:
Line 1: Import Python's hashlib library, which provides the SHA-256 implementation.
Lines 3–13: Define a function that takes two parameters: the file path and the chunk size (defaulting to 4 MB, the chunk size assumed above). It returns a list of chunks and a list of their hashes.
Lines 6–12: Open the file at file_path in binary read mode and read it chunk by chunk in a loop. Each chunk is appended to the chunks list, and its SHA-256 hash is appended to the hashes list.
Line 13: Once the entire file has been read (the with block closes the file automatically), the lists of chunks and hashes are returned.
If chunking is performed on the client side, then what does the chunk server do? The chunk server checks each incoming chunk's hash against the hashes already present in cloud storage; if a chunk already exists, the chunk server discards it rather than storing it again. The chunk server is also responsible for distributing chunks across different data centers and servers to ensure redundancy and high availability. Similarly, during a download request, the chunk server retrieves the chunks from storage, assembles them in the correct order, and delivers them to the user.
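As a rough illustration of this deduplication check, the following sketch keeps an in-memory index of chunk hashes and stores a chunk only if its hash has not been seen before. The dictionaries stand in for the metadata lookup and cloud storage; they are assumptions for illustration, not Dropbox's actual interfaces.

```python
chunk_index = {}  # chunk hash -> storage key (stands in for the metadata lookup)
blob_store = {}   # storage key -> chunk bytes (stands in for cloud storage)

def store_chunk_if_new(chunk_hash: str, chunk_data: bytes) -> str:
    """Store a chunk only if its hash is unseen; otherwise reuse the existing copy."""
    if chunk_hash in chunk_index:
        return chunk_index[chunk_hash]        # duplicate chunk: discard the upload
    storage_key = f"chunk/{chunk_hash}"
    blob_store[storage_key] = chunk_data      # write the new chunk to "storage"
    chunk_index[chunk_hash] = storage_key
    return storage_key
```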
The file-sharing process begins when a user selects a file or folder to share and sets permission for “read-only” or “can edit.” The request is sent to the sharing service, which generates unique and shareable links or can send direct invitations to collaborators. This service is also responsible for defining permissions associated with the shared content, such as restricted access to some users or enabling public access via a link. Along with the sharing service, the metadata service stores important information about the shared file, including permissions, file ownership, access control list (ACL), upload date, last modified date, and so on.
In a system like Dropbox, user authentication and access control are crucial to ensuring that only authorized users can access files and perform specific actions. These tasks are managed through several components, including the authentication server, authorization server, and access control mechanisms.
Authentication: For user authentication, we can use a combination of methods, including username/password credentials, OAuth 2.0, and multi-factor authentication (MFA). When a user logs in, the authentication server verifies their identity by validating the credentials or the token received from an external identity provider (such as Google or Facebook, in the case of OAuth 2.0). MFA adds a layer of security by requiring the user to provide a second factor, such as a one-time password (OTP) sent to their mobile device, ensuring that even if credentials are compromised, unauthorized access is still prevented.
Authorization: Once authenticated, the authorization server enforces access control. This component assigns users appropriate permissions based on their identity and the specific files or folders they are trying to access. Our proposed system can use role-based access control (RBAC), where users have permissions based on their roles (e.g., owner, editor, or viewer) in relation to shared files or folders. For example, the owner of a folder can invite others with specific roles, allowing for collaborative editing while maintaining control over sensitive files.
Additionally, the access control list (ACL) keeps track of the permissions assigned to each user or group for specific files and folders, ensuring that only authorized users can view, edit, or share content.
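A minimal sketch of how an RBAC-style permission check against an ACL might look; the roles, actions, and data layout below are illustrative assumptions rather than Dropbox's actual model.

```python
# Role -> allowed actions (illustrative only)
ROLE_PERMISSIONS = {
    "owner":  {"read", "edit", "share", "delete"},
    "editor": {"read", "edit"},
    "viewer": {"read"},
}

# ACL: file id -> {user id -> role}
acl = {"design_doc.txt": {"alice": "owner", "bob": "editor", "carol": "viewer"}}

def is_allowed(user: str, file_id: str, action: str) -> bool:
    """Check whether a user's role on a file permits the requested action."""
    role = acl.get(file_id, {}).get(user)
    return role is not None and action in ROLE_PERMISSIONS.get(role, set())

print(is_allowed("bob", "design_doc.txt", "edit"))     # True
print(is_allowed("carol", "design_doc.txt", "share"))  # False
```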
File synchronization in the system involves the interaction of different components, including the Dropbox client, metadata service, chunk service, and synchronization service. The client's responsibility is to monitor local files for changes and sync them to the cloud by communicating with the aforementioned services. In the file syncing process, the main actor is the sync engine, a process that runs locally on the user's device. It breaks files into chunks, computes hashes, detects changes, and initiates the upload process.
It is important to keep the distinction between the sync engine and the synchronization service in mind. The synchronization service sits between the user's device and backend storage. It ensures that changes made on one device are reflected across all devices linked to the same account and are visible to the file's collaborators.
For file synchronization, a block-level sync algorithm can be used. This algorithm breaks large files into smaller chunks to ensure an efficient upload and download process. On the client side, when a file is modified, the client computes the chunk hashes and sends them to the chunk server for comparison with the existing hashes. Only the chunks that have changed are uploaded, making the sync faster and more efficient for large files. For downloading, the client requests only the changed chunks, which are retrieved from the storage server, and the file is reassembled locally. This approach minimizes data transfer, especially in cases where large files undergo minor changes (e.g., editing a document or adding a few lines of code). The block-level sync algorithm is especially useful for large files, such as videos or database backups, in which small changes would otherwise result in massive transfers if traditional whole-file syncing were used.
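Building on the chunking function shown earlier, a simplified sketch of block-level change detection could compare the previously synced hash list against the current one and mark only the differing positions for upload. Real block-level sync must also handle insertions and deletions (for example, via content-defined chunking), so treat this as an assumption-laden illustration.

```python
def changed_chunk_indexes(synced_hashes, current_hashes):
    """Return positions of chunks that are new or whose content has changed."""
    changed = []
    for i, current in enumerate(current_hashes):
        if i >= len(synced_hashes) or synced_hashes[i] != current:
            changed.append(i)
    return changed

# Only chunk 2 was modified, so only that chunk needs to be re-uploaded.
synced  = ["hash_a", "hash_b", "hash_c", "hash_d"]
current = ["hash_a", "hash_b", "hash_x", "hash_d"]
print(changed_chunk_indexes(synced, current))  # [2]
```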
With this approach, only the modified chunk is uploaded to cloud storage. The synchronization service then sends the notification and the modifications to the collaborators via the pub/sub system (short for publisher/subscriber system).
When it comes to handling large files and network interruptions, the block-level sync algorithm provides resilience and efficiency. For large files, the client can upload or download chunks independently, meaning that if there's a network interruption, the sync engine can resume from the last successfully uploaded or downloaded chunk rather than restarting the entire process. This chunked approach is critical for users on unreliable networks or with bandwidth constraints.
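A sketch of how chunk-by-chunk, resumable uploads might be structured: the client remembers the index of the last chunk the server acknowledged and, after an interruption, resumes from the next chunk instead of restarting the whole file. The upload_chunk callback and its error behavior are illustrative assumptions.

```python
def upload_file_resumably(chunks, upload_chunk, last_acked=-1):
    """Upload chunks in order, resuming after the last acknowledged chunk."""
    for index in range(last_acked + 1, len(chunks)):
        try:
            upload_chunk(index, chunks[index])  # may raise on a network error
            last_acked = index                  # a real client persists this locally
        except ConnectionError:
            break                               # resume later from last_acked + 1
    return last_acked
```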
Note: Dropbox uses an indexer to manage and search the metadata of stored files, enabling quick retrieval and synchronization across devices.
Point to ponder!
When designing Dropbox’s clients, which protocols or techniques should be used to efficiently monitor changes made by other clients?
Data storage often involves breaking files into smaller segments for efficient distribution across multiple storage servers, with metadata tracking the location of each segment. These segments are typically encrypted to ensure data security. The system replicates them across multiple data centers to enhance durability and availability, providing redundancy and disaster recovery. If one data center fails due to hardware issues or natural disasters, the system retrieves data from another, ensuring continuous access. While this increases fault tolerance and supports disaster recovery, it introduces challenges such as maintaining low-latency access, managing eventual consistency between replicas, and handling the operational complexity and costs of replication across geographically distributed servers.
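As a toy illustration of replica placement, the following sketch assigns each chunk to a few distinct data centers; the region names and replication factor are purely illustrative assumptions.

```python
import random

DATA_CENTERS = ["us-east", "us-west", "eu-central"]  # illustrative regions

def place_replicas(chunk_id: str, replication_factor: int = 3):
    """Choose distinct data centers to hold copies of a chunk."""
    count = min(replication_factor, len(DATA_CENTERS))
    return {chunk_id: random.sample(DATA_CENTERS, k=count)}

print(place_replicas("chunk-abc123"))  # e.g. {'chunk-abc123': ['eu-central', 'us-east', 'us-west']}
```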
It is very common for multiple users to edit a document simultaneously on Dropbox. For this purpose, version control and conflict resolution strategies are essential. When a file is uploaded or modified, the metadata server tracks its version and stores the previous versions to allow rollback if needed.
For conflict resolution, we can employ the last-write-wins strategy, in which the last version uploaded to the server overwrites any previous changes. This strategy is simple, but it can cause unintentional data loss if changes are made without proper coordination. To overcome this, we can employ a more advanced strategy for some file types: creating "conflicted copies." If two users edit the same file simultaneously, both versions are saved by appending a timestamp or the editing user's name to the conflicted file's name. This enables users to view and manually merge the changes if required, avoiding data loss. In some cases, especially with text-based files like documents or code, we can employ merge strategies that automatically merge non-conflicting changes from different file versions. However, the system might still create a conflicted copy requiring manual intervention for complex conflicts (such as simultaneous edits to the same section of a file).
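The exact naming convention is an assumption, but a conflicted-copy name could be generated along these lines, tagging the file with the editing user and the date so both versions survive:

```python
from datetime import datetime, timezone

def conflicted_copy_name(filename: str, user: str) -> str:
    """Build a conflicted-copy name, e.g. "report (bob's conflicted copy 2024-09-01).txt"."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:                      # no extension present
        stem, ext = filename, ""
    date = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    suffix = f" ({user}'s conflicted copy {date})"
    return f"{stem}{suffix}.{ext}" if dot else f"{stem}{suffix}"

print(conflicted_copy_name("report.txt", "bob"))
```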
From a user experience perspective, the proposed system provides clear notifications and indications via the pub/sub system when conflicts arise. Users are informed of the conflicted file and given the option to view both versions. An intuitive interface makes it easy to resolve conflicts by allowing users to compare versions and choose which one to keep or merge. This approach balances automation with user control, ensuring that conflicts are managed efficiently without overwhelming the user.
The proposed Dropbox system needs to implement robust security mechanisms, such as encryption for data at rest and in transit. For encrypting file data at rest, we can use the widely adopted 256-bit AES algorithm to ensure data protection in the event of a breach. Similarly, we need SSL/TLS encryption for data in transit, which prevents interception or eavesdropping.
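A minimal sketch of encrypting and decrypting a chunk with 256-bit AES, using the Python cryptography package in GCM mode (the source does not specify which AES mode or key-management scheme Dropbox uses, so treat these details as assumptions):

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)  # in production, held in a key-management service
aesgcm = AESGCM(key)

def encrypt_chunk(plaintext: bytes) -> bytes:
    nonce = os.urandom(12)                 # a unique nonce per chunk
    return nonce + aesgcm.encrypt(nonce, plaintext, None)

def decrypt_chunk(blob: bytes) -> bytes:
    nonce, ciphertext = blob[:12], blob[12:]
    return aesgcm.decrypt(nonce, ciphertext, None)

chunk = b"example chunk bytes"
assert decrypt_chunk(encrypt_chunk(chunk)) == chunk
```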
A system like Dropbox must also comply with the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA). For this purpose, strict privacy controls for data handling are required. For GDPR, our system should provide features like data access, correction, and deletion while facilitating secure cross-border transfers. For HIPAA compliance, we must encrypt protected health information (PHI) and allow healthcare providers to sign a Business Associate Agreement (BAA).
Let’s combine all the components and services to create a detailed design of the Dropbox system.
Here is the description of each component for easy understanding:
Clients: These are the user devices (computers, mobile phones) with the client application installed; they interact with the backend to upload, download, and sync files.
Load balancer: Distributes incoming traffic from clients evenly across multiple servers to ensure efficient resource utilization.
API gateway: Acts as the entry point for clients, handling requests and routing them to the appropriate backend services.
CDN: Caches and delivers static content closer to users to reduce latency.
Authn and authz service: Handles authentication and authorization, ensuring that only verified users can access their data.
File metadata server: Manages metadata for files (such as file names, paths, and versions) and coordinates access to the metadata database and cache.
Metadata database: Dropbox employs NoSQL databases to store information about files, including their location/path in the storage system and access control details.
Chunk server: Manages the storage of file chunks, handling upload/download operations and ensuring data is stored efficiently.
Cloud or block storage: Provides scalable storage for file data in the form of chunks/blocks, often distributed across multiple servers.
Synchronization server: Keeps files synchronized across multiple devices by tracking changes and ensuring all clients have the latest version.
Pub/sub system: When files are modified, updates are published to subscribed clients, ensuring real-time synchronization across devices. This system also acts as the main notification service (a minimal pub/sub sketch follows below).
Note: User data and other structured data is stored in a relational database like MySQL.
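To make the pub/sub idea concrete, here is a tiny in-process sketch of the publish/subscribe pattern; a production system would use a distributed message broker, and the topic and event shapes below are illustrative assumptions.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # topic (e.g., a file id) -> subscriber callbacks

def subscribe(topic, callback):
    subscribers[topic].append(callback)

def publish(topic, event):
    for callback in subscribers[topic]:
        callback(event)  # notify every subscribed client/device

# Two devices linked to the same account subscribe to changes on a shared file.
subscribe("file:design_doc", lambda event: print("laptop sees:", event))
subscribe("file:design_doc", lambda event: print("phone sees:", event))
publish("file:design_doc", {"action": "chunk_updated", "chunk_index": 2})
```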
For performance optimization, the proposed Dropbox system can use various techniques, such as redundant database servers, caching for frequently accessed files, compressing file chunks before transfer, and data deduplication. One major factor that could hinder performance is network conditions, as fluctuating bandwidth and latency can affect file uploads and downloads. To prevent this, we can use techniques that dynamically adjust file transfer rates based on the user's network conditions, including adaptive bandwidth control, multi-threaded transfers, and TCP optimization.
Also, storing copies of frequently accessed files locally on the user’s device reduces the time taken to retrieve data and minimizes latency. Similarly, compression reduces the data size during transfer to optimize file transfer, and deduplication ensures that only the unique data is uploaded.
Some well-known algorithms for compressing different types of files are given below, followed by a short compression sketch:
Gzip (DEFLATE) for documents, JPEG for images, MP3 for audio
LZ77 lossless compression
On-the-fly compression algorithms such as the Lempel-Ziv-Markov chain algorithm (LZMA), Snappy, and Brotli
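For instance, a short sketch using Python's built-in zlib module (DEFLATE, the algorithm behind gzip) shows how compressing a chunk before transfer can shrink it considerably, especially for repetitive data; the sample data is illustrative.

```python
import zlib

chunk = b"example chunk contents " * 1_000           # repetitive data compresses well
compressed = zlib.compress(chunk, level=6)           # DEFLATE, as used by gzip
print(len(chunk), "->", len(compressed), "bytes")
assert zlib.decompress(compressed) == chunk          # lossless round trip
```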
For deduplication, the system can use any of the following algorithms:
SHA-256 hashing algorithm
MD5
Rabin Fingerprinting
Content-defined chunking (CDC), and so on
The proposed design of Dropbox balances functional efficiency and robust scalability. The system’s ability to handle file synchronization, sharing, and storage seamlessly while being user-friendly showcases the power of System Design and distributed systems. The functional requirements, such as file synchronization, versioning, and access control, are met through a thoughtful combination of chunking, hashing, and metadata management, ensuring fast and reliable operations even as the system scales to millions of users. On the nonfunctional side, Dropbox emphasizes high availability, fault tolerance, and performance optimization. Features like real-time updates, efficient bandwidth usage, and de-duplication enable the system to provide an exceptional user experience across various platforms.
If you’re looking to dive deeper into System Design, Educative offers a fantastic course on designing scalable and efficient systems. This course includes real-world examples and interactive lessons to help you develop System Design concepts.
System Design interviews are now part of every Engineering and Product Management Interview. Interviewers want candidates to exhibit their technical knowledge of core building blocks and the rationale of their design approach. This course presents carefully selected system design problems with detailed solutions that will enable you to handle complex scalability scenarios during an interview or designing new products. You will start with learning a bottom-up approach to designing scalable systems. First, you’ll learn about the building blocks of modern systems, with each component being a completely scalable application in itself. You'll then explore the RESHADED framework for architecting web-scale applications by determining requirements, constraints, and assumptions before diving into a step-by-step design process. Finally, you'll design several popular services by using these modular building blocks in unique combinations, and learn how to evaluate your design.
Free Resources