How data is partitioned in Dynamo

Dynamo is a highly available key-value store that Amazon developed for internal use. Its flexible design lets applications choose their desired levels of availability and consistency.

Data partitioning

At a high level, Dynamo is a Distributed Hash Table (DHT) that is replicated across a cluster of servers for high availability and fault tolerance.

Dynamo uses consistent hashing to distribute its data among nodes. Consistent hashing also provides scalability: it is easy to add or remove nodes from a Dynamo cluster.

There are two important aspects that we need to address while distributing data:

  • How do we determine the node on which a piece of data will be stored?
  • What protocol governs how data moves when a new node joins or an existing node leaves the system, and how can that data movement be minimized?

Consistent hashing

The key idea behind consistent hashing is that every record and every server is mapped onto a ring of hash values. Each node is responsible for the range of the ring that starts at its own position and ends just before the next node's position, and a record is stored on the node whose range contains the record's hash. Dynamo uses this consistent hashing scheme to determine which node stores which key. Here is an example of a consistent hashing ring:

Consistent hash ring

In the diagram above, the hash space ranges from 1 to 100, so every data item stored in this system hashes to a value in that range.

The data is divided equally among the four nodes in the system above: all data whose hash falls in the range 1-25 is stored on server A, and so on.

In Dynamo’s terminology, the start of each range is called a token. Each node is assigned one token. Here is the range information and server tokens in the above system:

Server   Token   Range start   Range end
A        1       1             25
B        26      26            50
C        51      51            75
D        76      76            100
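
To make the lookup concrete, here is a minimal Python sketch of the table above. It is not Dynamo's implementation: the tokens, server names, and toy 1-100 hash space come from this example, and a key's owner is simply the server with the largest token at or below the key's hash.

    import bisect

    # Tokens (range starts) and their owners, taken from the table above.
    tokens = [1, 26, 51, 76]
    servers = ["A", "B", "C", "D"]

    def server_for_hash(h):
        """Return the server whose range contains hash value h (1-100)."""
        # The owner is the server with the largest token <= h.
        idx = bisect.bisect_right(tokens, h) - 1
        return servers[idx]

    print(server_for_hash(10))   # A  (range 1-25)
    print(server_for_hash(42))   # B  (range 26-50)
    print(server_for_hash(99))   # D  (range 76-100)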

Operations in Dynamo

Whenever Dynamo serves a put() or get() request, it applies the MD5 hashing algorithm to the key, producing a fixed-length digest. This digest determines the range within which the key falls and, therefore, the node responsible for the key.
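
As a rough sketch of that routing step, assuming the same toy 1-100 hash space rather than Dynamo's full 128-bit MD5 space, and an invented key name, the key is hashed with MD5 and the digest is folded into a ring position:

    import hashlib

    def key_to_ring_position(key, ring_size=100):
        # Hash the key with MD5 (a 128-bit digest) and fold it into the
        # toy 1..ring_size hash space used in the example above.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % ring_size + 1

    # The position then selects the responsible node exactly as in the
    # server_for_hash() sketch above.
    print(key_to_ring_position("user:42"))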

Removal and addition of nodes

The consistent hashing scheme described above handles the addition or removal of a node gracefully, because only the immediate neighboring node is affected. For example, when a node is removed, its neighbor on the ring becomes responsible for all of the keys stored on the outgoing node, while every other node keeps serving exactly the data it already had.
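
The following sketch, which reuses the toy ring and uses made-up key positions, removes server B and checks which keys changed owners; only the keys B held move, and they all land on one neighboring server, while every other key stays put:

    import bisect

    def owner(position, ring):
        # A position is owned by the server with the largest token <= position.
        tokens = sorted(ring)
        idx = bisect.bisect_right(tokens, position) - 1
        return ring[tokens[idx]]

    ring = {1: "A", 26: "B", 51: "C", 76: "D"}
    positions = [5, 18, 30, 44, 60, 72, 90]      # illustrative key positions

    before = {p: owner(p, ring) for p in positions}
    del ring[26]                                  # server B leaves the ring
    after = {p: owner(p, ring) for p in positions}

    moved = {p: (before[p], after[p]) for p in positions if before[p] != after[p]}
    print(moved)   # only the keys in B's old range (26-50) have a new owner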

Virtual nodes

In a basic consistent hashing scheme, however, load balancing is coarse: adding or removing a node forces a large, contiguous block of data to move, and all of it moves to or from a single server. A node can also become a hotspot, i.e., one server might be responsible for a huge partition of the data and turn into a bottleneck for the distributed system.

To handle this imbalance, Dynamo maps several virtual nodes onto every physical server in the system. The hash range is divided into many smaller subranges, and each physical node is assigned a number of these subranges rather than a single large one. The following diagram illustrates the concept:

As the diagram above shows, the virtual nodes are distributed randomly and non-contiguously, so that no two neighboring virtual nodes are assigned to the same server. Virtual nodes help spread the load more evenly across the cluster. They also speed up rebalancing when a node is added or removed, because the data to be moved comes from, or goes to, many servers instead of a single neighbor. Furthermore, the probability of hotspots is reduced, as each physical server owns multiple small ranges rather than one big range.
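
The sketch below illustrates the idea but is not Dynamo's actual token-assignment scheme: each physical server claims several tokens (virtual nodes) on a slightly larger toy ring of 1-1000, and a quick count over sample keys shows the load spreading across all four servers.

    import bisect
    import hashlib
    from collections import Counter

    RING_SIZE = 1000            # toy hash space; Dynamo's is 128-bit
    VNODES_PER_SERVER = 8       # illustrative; more vnodes -> smoother balance

    def position(value):
        # Illustrative hash into the toy ring, not Dynamo's real scheme.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % RING_SIZE + 1

    # Each physical server gets several tokens, so it owns non-contiguous
    # pieces of the ring instead of one large block.
    ring = {}
    for server in ["A", "B", "C", "D"]:
        for i in range(VNODES_PER_SERVER):
            ring[position(f"{server}-vnode-{i}")] = server

    tokens = sorted(ring)

    def owner(key):
        # Largest token <= the key's position owns the key; index -1 wraps
        # around to the last token on the ring.
        idx = bisect.bisect_right(tokens, position(key)) - 1
        return ring[tokens[idx]]

    # Count how many of 1,000 sample keys each physical server owns.
    print(Counter(owner(f"key-{i}") for i in range(1000)))

Raising VNODES_PER_SERVER brings the per-server counts closer to an even split, which is the load-balancing effect described above.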
