Sharded counter copy

Let's understand the sharded counter and how it is used within the design of a system.

Problem statement

Real-time applications like Facebook, Twitter, and YouTube have high user traffic. Users perform multiple operations (view, like, comment, and so on) depending on the application’s structure. For instance, when an image is posted on a Facebook page with millions of followers, the likes on the post increase rapidly, millisecond by millisecond. Counting the likes for this single image might be easy, but what do we do when thousands of images or videos are uploaded simultaneously by many pages or celebrities, each with millions of followers? Let’s look at the figure below, which shows YouTube videos viewed by millions of users in 24 hours.

On average, about six thousand tweets are sent on Twitter every second, which equals roughly 360 thousand tweets per minute and about 500 million tweets per day. Handling hundreds of billions of likes on 500 million tweets per day is a challenging task. The following table shows the most liked tweets in one day.

How do we handle millions of write requests for likes on thousands of tweets per minute? A write generally takes more time than a read, and concurrent writes to the same counter contend with each other.

Solution

The first thing that comes to mind is an in-memory distributed key-value store such as Redis or Memcached. This works fine for small-scale applications, but it is not enough at large scale. The second, and most important, requirement is calculating the top N, where N is the number of most-viewed posts (heavy hitters). A key-value store alone is not sufficient for this.

The solution is a sharded counter, where each counter has a specified number of shards as needed. Each counter can be defined for a specific post. There can be millions of counters for millions of posts to maintain the counts of views, likes, comments, and so on. These counters are also known as distributed counters. Let’s look at the illustration below to understand distributed counters with a specified number of shards.
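To make the idea concrete, here is a minimal single-process sketch in Python. It is an illustration under stated assumptions, not the design described in this lesson: the class name `ShardedCounter`, the in-memory list standing in for shards, and the uniformly random choice of shard are all hypothetical. In a real system, each shard would typically be a separate row or key, possibly on a different server.

```python
import random


class ShardedCounter:
    """One logical counter split across several independent shards (hypothetical sketch)."""

    def __init__(self, num_shards: int) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        # Each list slot stands in for a shard that could live on its own server.
        self.shards = [0] * num_shards

    def increment(self, amount: int = 1) -> None:
        # Assumption: pick a shard uniformly at random so writes spread out.
        index = random.randrange(len(self.shards))
        self.shards[index] += amount

    def value(self) -> int:
        # A read sums every shard to recover the total count.
        return sum(self.shards)


likes = ShardedCounter(num_shards=4)
for _ in range(1000):
    likes.increment()

total_likes = likes.value()
print(total_likes)  # 1000, regardless of how the writes were spread
```

The trade-off in this sketch is that a write touches only one shard, while a read has to visit all of them and add up the partial counts.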

The distributed counters would likely be implemented on different machines or servers. Let’s discuss an example to understand how distributed counters handle millions of write requests for a single post.

Assume a famous channel (with millions of subscribers) uploads a new video. The server receives a burst of write requests for views from users worldwide. First, a new counter is initiated for the newly uploaded video. The server forwards each request to the corresponding counter, and the counter chooses the shard ...
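The flow above (create a counter for a new video, forward each view to that counter, let the counter pick a shard) can be sketched as a small simulation. Everything named here is hypothetical: `CounterService`, `ViewCounter`, the video ID `"video-1"`, and the shard count are illustration only. The text does not say how the counter picks a shard, so the sketch assumes a uniformly random choice, and it uses threads in one process to stand in for concurrent requests.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor


class ViewCounter:
    """View counter for one video, split into shards (hypothetical sketch)."""

    def __init__(self, num_shards: int) -> None:
        self.counts = [0] * num_shards
        # One lock per shard: writers only contend when they hit the same shard.
        self.locks = [threading.Lock() for _ in range(num_shards)]

    def add_view(self) -> None:
        # Assumption: the shard is chosen uniformly at random.
        index = random.randrange(len(self.counts))
        with self.locks[index]:
            self.counts[index] += 1

    def total(self) -> int:
        # Sum of all shards; exact here because it is read after the burst ends.
        return sum(self.counts)


class CounterService:
    """Keeps one counter per video and forwards view requests to it."""

    def __init__(self, shards_per_counter: int = 8) -> None:
        self.shards_per_counter = shards_per_counter
        self.counters = {}

    def register_video(self, video_id: str) -> None:
        # A new counter is created when a video is uploaded.
        self.counters[video_id] = ViewCounter(self.shards_per_counter)

    def record_view(self, video_id: str) -> None:
        # Forward the write request to the counter for this video.
        self.counters[video_id].add_view()

    def views(self, video_id: str) -> int:
        return self.counters[video_id].total()


service = CounterService()
service.register_video("video-1")

# Simulate a burst of 10,000 concurrent view requests.
with ThreadPoolExecutor(max_workers=16) as pool:
    for _ in range(10_000):
        pool.submit(service.record_view, "video-1")

burst_total = service.views("video-1")
print(burst_total)  # 10000
```

With a single shard, every request in this sketch would wait on the same lock. Spreading requests over several shards is what lets concurrent writes proceed without all of them contending for one value.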
