Requirements of a Blob Store's Design

Identify the requirements of and make estimations for the blob store.

Requirements

Let’s understand the functional and non-functional requirements below:

Functional requirements

Here are the functional requirements of the design of a blob store:

  • Create a container: The users should be able to create containers in order to group blobs. A container is like a folder in a file system; don’t confuse it with a Docker container. For example, if an application wants to store user-specific data, it should be able to store blobs for different user accounts in different containers. Additionally, a user may want to group video blobs and separate them from a group of image blobs. A single blob store user can create many containers, and each container can have many blobs, as shown in the following illustration. For the sake of simplicity, we assume that we can’t create a container inside a container.
Multiple containers associated with a single storage account, and multiple blobs inside a single container
  • Put data: The blob store should allow users to upload blobs to the created containers.
  • Get data: The system should generate a URL for the uploaded blob, so that the user can access that blob later through this URL.
  • Delete data: The users should be able to delete a blob. If the user wants to keep the data for a specified period of time (retention time), our system should support this functionality.
  • List blobs: The user should be able to get a list of blobs inside a specific container.
  • Delete a container: The users should be able to delete a container and all the blobs inside it.
  • List containers: The system should allow the users to list all the containers under a specific account.
Functional requirements of a blob store
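
To make the functional requirements above concrete, here is a minimal sketch of the operations as a toy, in-memory Python class. Everything in it is an assumption made for illustration: the class name `InMemoryBlobStore`, the method names, and the `blobstore.example.com` URL are hypothetical, and the retention-time requirement is left out for brevity. It is not how a real blob store is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBlobStore:
    """Toy, single-process stand-in for one blob store account (illustrative only)."""

    # Hypothetical base URL; a real service would define its own URL scheme.
    base_url: str = "https://blobstore.example.com/demo-account"
    # Maps container name -> {blob name -> blob bytes}.
    containers: dict[str, dict[str, bytes]] = field(default_factory=dict)

    def create_container(self, name: str) -> None:
        # Containers are flat: a container can't be created inside a container.
        if "/" in name:
            raise ValueError("nested containers are not supported")
        if name in self.containers:
            raise ValueError(f"container {name!r} already exists")
        self.containers[name] = {}

    def put_blob(self, container: str, blob_name: str, data: bytes) -> str:
        # Store the blob and return the URL the user can later read it from.
        self.containers[container][blob_name] = data
        return f"{self.base_url}/{container}/{blob_name}"

    def get_blob(self, url: str) -> bytes:
        # Resolve a URL produced by put_blob back to the stored bytes.
        path = url.removeprefix(self.base_url + "/")
        container, blob_name = path.split("/", 1)
        return self.containers[container][blob_name]

    def delete_blob(self, container: str, blob_name: str) -> None:
        del self.containers[container][blob_name]

    def list_blobs(self, container: str) -> list[str]:
        return sorted(self.containers[container])

    def delete_container(self, name: str) -> None:
        # Deleting a container also deletes every blob inside it.
        del self.containers[name]

    def list_containers(self) -> list[str]:
        return sorted(self.containers)


store = InMemoryBlobStore()
store.create_container("videos")
url = store.put_blob("videos", "intro.mp4", b"fake video bytes")
print(url, store.list_blobs("videos"), store.list_containers())
```

A dictionary of dictionaries mirrors the account, container, and blob hierarchy from the illustration; a real design would replace it with distributed metadata and data storage.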

Non-functional requirements

Here are the non-functional requirements of a blob store system:

  • Availability: Our system should be highly available.
  • Durability: The data, once uploaded, shouldn’t be lost unless users explicitly delete that data.
  • Scalability: The system should be capable of handling billions of blobs.
  • Throughput: For transferring gigabytes of data, we should ensure a high data throughput.
  • Reliability: Since failures are the norm in distributed systems, our design should detect and recover from failures promptly.
  • Consistency: The system should be strongly consistent. Different users should see the same view of a blob.
The non-functional requirements of a blob store

Exercise: Explain the concept of eventual consistency in the context of blob stores, and describe a scenario where it might be acceptable.


Resource estimation

Let’s estimate the total number of servers, storage, and bandwidth required by a blob storage system. Because blobs can have all sorts of data, mentioning all of those types of data in our estimation may not be practical. Therefore, we’ll use YouTube as an example, which stores videos and thumbnails on the blob store. Furthermore, we’ll make the following assumptions to complete our estimations.

Assumptions:

  • The number of daily active users who upload or watch videos is five million.
  • The number of requests per second that a single blob store server can handle is 500. (This number can be significantly higher, depending upon the blob store. For example, Microsoft Azure can handle a maximum of 20,000 IOPS.)
  • The average size of a video is 50 MB.
  • The average size of a thumbnail is 20 KB.
  • The number of videos uploaded per day is 250,000.
  • The number of read requests by a single user per day is 20.
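
As a quick sanity check on these assumptions, the sketch below derives two daily totals from them: the volume of newly uploaded data and the number of read requests. The function names are hypothetical. It also assumes decimal units (1 MB = 1,000 KB, 1 TB = 1,000,000 MB) and one thumbnail per uploaded video, neither of which is stated in the list above.

```python
# Values taken from the assumptions listed above.
DAILY_ACTIVE_USERS = 5_000_000
AVG_VIDEO_MB = 50
AVG_THUMBNAIL_KB = 20
VIDEOS_PER_DAY = 250_000
READS_PER_USER_PER_DAY = 20

# Assumption: decimal units.
KB_PER_MB = 1_000
MB_PER_TB = 1_000_000


def daily_upload_volume_tb(videos_per_day: int, video_mb: float, thumbnail_kb: float) -> float:
    """Data added per day in TB, assuming one thumbnail per uploaded video."""
    videos_mb = videos_per_day * video_mb
    thumbnails_mb = videos_per_day * thumbnail_kb / KB_PER_MB
    return (videos_mb + thumbnails_mb) / MB_PER_TB


def daily_read_requests(users: int, reads_per_user: int) -> int:
    """Total read requests per day across all daily active users."""
    return users * reads_per_user


upload_tb = daily_upload_volume_tb(VIDEOS_PER_DAY, AVG_VIDEO_MB, AVG_THUMBNAIL_KB)
reads = daily_read_requests(DAILY_ACTIVE_USERS, READS_PER_USER_PER_DAY)
print(f"Uploaded per day: {upload_tb:.3f} TB")  # 12.505 TB
print(f"Read requests per day: {reads:,}")      # 100,000,000
```

Videos dominate the upload volume (12.5 TB per day), while thumbnails add only about 5 GB per day.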

Number of servers estimation

We use the number of daily active users as a proxy for the number of requests per second at peak load, which gives 5 million requests per second. With each blob store server handling 500 requests per second (RPS), the number of servers that we require is the peak request rate divided by the per-server capacity: 5,000,000 / 500 = 10,000 servers.

Number of servers required by a blob store system dedicated to storing YouTube data
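
The server estimate above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `servers_needed` is hypothetical, and rounding up to a whole server is an assumption added here so that the function also works for inputs that don’t divide evenly.

```python
import math


def servers_needed(peak_rps: int, rps_per_server: int) -> int:
    """Servers required to absorb peak_rps, rounded up to a whole server."""
    if rps_per_server <= 0:
        raise ValueError("rps_per_server must be positive")
    return math.ceil(peak_rps / rps_per_server)


# Daily active users serve as the proxy for peak requests per second.
PEAK_RPS = 5_000_000
RPS_PER_SERVER = 500

print(servers_needed(PEAK_RPS, RPS_PER_SERVER))  # 10000
```

Because the per-server capacity appears in the denominator, the estimate is very sensitive to it: a server that handles ten times as many requests per second cuts the server count by a factor of ten.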

Note: Our revised estimate for the blob store server is 500 RPS, down from 64,000 ...
