Raft's Cluster Membership Changes

Learn how Raft moves a cluster's servers from an old configuration to a new one.

Cluster membership changes

So far, we have assumed that the set of servers participating in the consensus algorithm (the cluster configuration) is fixed. In practice, the configuration occasionally needs to change, for example, to replace failed servers or to change the degree of replication. One option is to take the entire cluster offline, update the configuration files, and restart it, but this leaves the cluster unavailable during the changeover, and any manual step risks operator error. To avoid these issues, the Raft consensus algorithm automates configuration changes and incorporates them into the algorithm itself.

Raft’s configuration change process must be safe: at no point during the transition may two servers be elected leader for the same term. Switching servers directly from the old configuration to the new one is unsafe because all servers cannot be switched atomically at once, so during the transition the cluster can split into two independent majorities.

The following illustration depicts the issue: If we switch directly from one configuration to another, different servers make the switch at different times. There can then be a moment when two disjoint majorities follow two different cluster configurations, and each may elect its own leader for the same term.

A moment in time when two separate leaders, one with a majority of the old configuration (C_old) and another with a majority of the new configuration (C_new), might be chosen to serve the same term
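
The problem in the illustration can be reproduced with a small sketch. It assumes a hypothetical three-server C_old that is growing into a five-server C_new, and a snapshot in which two servers still use C_old while the other three already use C_new. The server names and the `is_majority` helper are illustrative only and are not part of Raft itself.

```python
def is_majority(votes, config):
    """Return True if the voters in `votes` form a strict majority of `config`."""
    return len(votes & config) > len(config) // 2


# Hypothetical configurations: the cluster grows from three servers to five.
c_old = {"s1", "s2", "s3"}
c_new = {"s1", "s2", "s3", "s4", "s5"}

# Assumed mid-switch snapshot: s1 and s2 still follow C_old,
# while s3, s4, and s5 have already switched to C_new.
old_side = {"s1", "s2"}
new_side = {"s3", "s4", "s5"}

# Each side counts votes against the configuration it believes in.
old_leader_possible = is_majority(old_side, c_old)  # 2 of 3 servers
new_leader_possible = is_majority(new_side, c_new)  # 3 of 5 servers
sides_are_disjoint = old_side.isdisjoint(new_side)

print(old_leader_possible, new_leader_possible, sides_are_disjoint)
```

Because the two groups share no server, no single voter can refuse the second candidate, so both groups could elect a leader for the same term.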

Point to ponder

What mechanism should we use to disseminate the configuration information to all nodes?

Joint consensus configuration

Raft uses a two-phase approach to ensure safety during configuration changes:

  • Phase I: The cluster first switches to a transitional configuration called joint consensus, which combines the old and new configurations under the following rules:
    • Log entries are replicated to all servers in both configurations.
    • Any server from either configuration may serve as the leader.
    • Agreement (for elections and for committing entries) requires a majority of the old configuration and, separately, a majority of the new configuration.
  • Phase II: It then transitions into the new
...
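
The joint consensus agreement rule from Phase I can be sketched as follows. This is a minimal illustration, not an implementation of Raft: the server names, the example configurations, and the helper names `is_majority` and `joint_agreement` are all assumptions made for this sketch.

```python
def is_majority(votes, config):
    """Return True if the voters in `votes` form a strict majority of `config`."""
    return len(votes & config) > len(config) // 2


def joint_agreement(votes, c_old, c_new):
    """Joint consensus rule: a majority of C_old AND a majority of C_new."""
    return is_majority(votes, c_old) and is_majority(votes, c_new)


# Hypothetical configurations: the cluster grows from three servers to five.
c_old = {"s1", "s2", "s3"}
c_new = {"s1", "s2", "s3", "s4", "s5"}

# A majority of C_old alone (2 of 3) is only 2 of 5 in C_new.
only_old = joint_agreement({"s1", "s2"}, c_old, c_new)

# A majority of C_new alone (3 of 5) contains only 1 of 3 servers of C_old.
only_new = joint_agreement({"s3", "s4", "s5"}, c_old, c_new)

# These voters satisfy both majorities: 2 of 3 in C_old and 3 of 5 in C_new.
both = joint_agreement({"s1", "s2", "s4"}, c_old, c_new)

print(only_old, only_new, both)
```

In this sketch, neither of the two disjoint groups from the earlier failure scenario can reach agreement alone, which is why requiring both majorities prevents two leaders from being chosen in the same term.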