Raft's Safety, Fault-Tolerance, and Availability Protocols
Let's learn how Raft ensures safety, handles leader and follower crashes, and maintains availability.
Safety
The previous lessons discussed how Raft selects leaders and replicates log entries. Those mechanisms alone do not guarantee that every state machine executes the same commands in the same order. To see why, consider a follower that is unavailable while the leader commits several log entries. If that follower is later elected leader, it can overwrite the committed entries with new ones, so different state machines end up executing different sequences of commands. The following slides show such a scenario:
To address this issue, the Raft algorithm restricts which servers can be elected as leaders to ensure that the leader for any given term contains all the entries committed in previous terms.
Election restriction
In leader-based consensus algorithms, the leader is responsible for eventually storing all committed log entries. However, some algorithms allow a leader to be elected without initially having all committed entries. These algorithms require additional mechanisms to identify and transmit missing entries to the new leader during or after the election process, leading to increased complexity.
The following table lists the two election restrictions set by Raft, along with their rationale.
Restriction | Rationale |
Raft ensures that all committed entries from previous terms are on each new leader from the moment of its election, so there is no need to transfer them afterward. | This ensures that log entries have a unidirectional flow from leaders to followers, and leaders never overwrite their logs' existing entries. |
Raft prevents a candidate from winning an election unless its log contains all committed entries. It enforces this restriction through the voting process: a voter denies its vote to a candidate whose log is less up-to-date than the voter's own log. | To become a leader, a candidate must contact a majority of the cluster, and any majority of the cluster includes at least one node that holds every committed entry. The `RequestVote` RPC enforces this restriction by including information about the candidate's log. |
Point to ponder
What could be Raft’s criteria for the log to be considered up-to-date?
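One possible answer, following the rule described in the Raft paper: compare the term and index of the last entry in each log. The log whose last entry has the later term is more up-to-date, and if the last terms are equal, the longer log is more up-to-date. The sketch below covers only this log comparison. The names `LogPosition` and `candidate_is_up_to_date` are hypothetical, and a real vote handler would also check the candidate's term and whether the voter has already voted in that term.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogPosition:
    """Term and index of the last entry in a server's log (hypothetical helper type)."""
    last_term: int
    last_index: int


def candidate_is_up_to_date(candidate: LogPosition, voter: LogPosition) -> bool:
    """Return True if the candidate's log is at least as up-to-date as the voter's."""
    if candidate.last_term != voter.last_term:
        # The log whose last entry has the later term is more up-to-date.
        return candidate.last_term > voter.last_term
    # Same last term: the longer (or equally long) log wins.
    return candidate.last_index >= voter.last_index
```

For example, a candidate whose last entry is from term 3 at index 5 passes this check against a voter whose last entry is from term 2 at index 9, because the later term takes priority over log length.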
Committing entries from previous terms
The leader of a Raft cluster knows that an entry from its current term is committed once that entry is stored on a majority of the servers. If a leader crashes before committing an entry, future leaders will attempt to finish replicating it. However, a leader cannot immediately conclude that an entry from a previous term is committed just because it is stored on a majority of the servers.
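The distinction above can be sketched as a small predicate. This is a minimal illustration, not a full commit procedure: the function names and parameters are hypothetical, and the sketch only answers whether a leader may conclude that an entry is committed purely by counting the servers that store it.

```python
def has_majority(replica_count: int, cluster_size: int) -> bool:
    """Return True if replica_count servers form a strict majority of the cluster."""
    return replica_count > cluster_size // 2


def leader_may_commit_by_counting(
    entry_term: int,
    current_term: int,
    replica_count: int,
    cluster_size: int,
) -> bool:
    """Counting replicas is conclusive only for entries from the leader's current term."""
    return entry_term == current_term and has_majority(replica_count, cluster_size)
```

In a five-server cluster, an entry from the leader's current term stored on three servers satisfies the predicate, whereas an entry from an earlier term stored on the same three servers does not.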
Before discussing how Raft commits log entries from previous terms, let’s look at how an old log entry stored on a majority of the servers can still be overwritten by a future leader. ...