...

/

Detailed Design of Sharded Counters

Detailed Design of Sharded Counters

Let's deep dive into the design of sharded counters.

Detailed design

We will now discuss the three primary functionalities of the sharded counter (creation, write, and read) in detail. We will answer many important questions using Twitter as an example. These questions include: how many shards should be created against each new tweet?; how will the shard value be incremented for a specified tweet?; what will happen in the system when read requests come from the end-users?

Sharded counter creation

As we discussed earlier, when a user posts a tweet on Twitter, the create API is called. The system creates multiple counters for each newly created post by the user. The following is the list of main counters created against each new tweet:

  • Tweet like counter
  • Tweet reply counter
  • Tweet retweet counter
  • Tweet view counter in case tweet contains video

Now the question is, how does the system decide the number of shards in each counter? The decision on the number of shards is very critical for good performance. If the shard count is small for a specific write workload, we will face high write contention resulting in slow writes. On the other hand, if the shard count is too high for a particular write profile, we will encounter higher overhead on the read operation. The reason for slower reads is due to the collection of values from different shards (that might reside on different nodes inside geographically distributed data centers). The reading cost of a counter value will rise linearly with the number of shards because values of all shards of a respective counter will be added. The writes will scale linearly as we add new shards due to increasing requests. Therefore there is a tradeoff between making writes fast versus read performance. We will see later how we can improve read performance.

The decision about the number of shards depends on many factors, that collectively are trying to predict the write load on a specific counter in the short term. For tweets, these factors include followers count. The tweet of a user with millions of followers gets more shards than a user with few followers on Twitter because there is a possibility that their tweets will get many (probably millions) likes. Sometimes, a celebrity tweet has a hashtag(s). The system also creates the sharded counter for this hashtag counter because it has a high chance of getting into trends.

Many human-centered activities often have a long-tailed activity pattern, where many people are ...

Create a free account to access the full course.

By signing up, you agree to Educative's Terms of Service and Privacy Policy