The topology of a distributed system can change over time. New follower nodes can join the system, old ones can fail and the leader can experience outages. Let’s see how these changes are handled.

New follower

The data on the system is continually changing as write requests keep flowing from the clients. If a new follower joins the network, it must be brought up to speed with the current state of the data the leader hosts and then be sent the new changes as they arrive at the leader.

In the event a new follower joins, it is not possible for the leader to pause, take a snapshot of the data the leader currently hosts and send it to the new follower. This would defeat the goal of being a highly available system. Clients are constantly writing to the database and the system will become unavailable for writes. The procedure to add a new follower without downtime is that the new follower receives a consistent snapshot of the data at some point in time from the leader and thereafter all the changes that have taken place from the time the snapshot was taken. The leader maintains a log called the replication log, which associates the snapshot with a location within the log. The follower can then be sent all the changes that have taken place following the snapshot location within the log. The position of the snapshot within the log has various names in different systems, for instance. PostgreSQL calls it the log sequence number, and MySQL calls it the binlog coordinates.

Get hands-on with 1300+ tech skills courses.