Leaderless Replication

Learn about the mechanics of leaderless replication.

Introduction

Single-leader and multileader replication strategies involve a replica called a leader that receives writes and sequences the write for the rest of the replicas. The followers then apply the write requests in the same order dictated by the leader. Leaderless replication is a special class of replication strategy that removes the concept of a leader and treats every replica in the database uniformly.

Amazon published a research paper on the characteristics of a distributed leaderless database in, “Dynamo: Amazon’s Highly Available Key-Value Store.” This paper presents the design and implementation of Dynamo, a highly available key-value storage system that runs some of Amazon’s core services. Some of the important aspects of the paper include:

  • Partition algorithm

  • Replication

  • Data versioning

  • Handling failures through hinted handoff

  • Membership and failure detection

Databases that adopt leaderless replication are also called Dynamo-style databases. Examples of databases that adopt leaderless replication include Cassandra and Voldemort.

Quorum

In the absence of a leader for write requests, leaderless replication introduces a concept called quorum to fulfill read and write requests. A quorum is the minimum number of votes a distributed transaction obtains to perform a given operation.

In a leaderless replication with N replicas:

  • Out of N replicas, W replicas must confirm the write request for the client to consider it successful, where W <= N.

  • Out of N replicas, R replicas must acknowledge the read request for the client to consider it successful, where R <= N.

  • As long as W + R > N, we always get up-to-date data on reads because at least one of the R nodes we are reading will be up to date.

Values R and W are the minimum number of votes required for the read and write requests to be successful. A common choice is to keep N an odd number and set W = R = (N+1)/2 rounded up. For example, if N = 5, W = R = 3, we can tolerate the loss of two nodes and still get updated results on reads.

There are two approaches the client can interact with the database:

  • The client can forward the read and write requests to quorum directly.

  • The client can forward the read and write requests to a random node called the coordinator, and the coordinator node orchestrates with quorum nodes.

Conflict detection

In a leaderless replication, multiple clients can concurrently write to the database. As a result, multiple requests for the same record can land on quorum out of order. As a result, each replica can view the latest snapshot for a particular record differently. The database uses version numbers to resolve the latest snapshot for a particular record.

Data reconciliation

In a leaderless replication, the replicas can go offline temporarily or become out of sync. Sometimes the replicas can be down for an extended period. Without a leader, we need a framework so that all the replicas converge over time and have the same data view.

There are two broad mechanisms used in leaderless replication to achieve data reconciliation:

  • Read repair

  • Anti-entropy process

Read repair

Read repair reconciliation works well for frequently read data. This is the sequence of actions that occur during a read repair flow.

  • The client issues the read request to a quorum of replica nodes through the coordinator node.

  • The coordinator node examines the staleness of the response through the version number.

  • If the replicas' response is stale, the coordinator node issues a sync request to update the record with the latest snapshot. The sync request can be either synchronous or asynchronous.

  • Once the coordinator issues the sync request, the coordinator returns the recent value of the record to the client.

Get hands-on with 1300+ tech skills courses.