Detailed Design of Chubby: Part IV
Learn about fault-tolerance, scalability, and availability of Chubby's design.
Failovers
Nodes can fail, so we need a solid strategy to minimize downtime. One way to reduce downtime is fast failover. Let’s discuss the failover scenario and how our system handles such cases.
When a primary replica fails or loses leadership, it discards its in-memory state about sessions, handles, and locks. A new primary replica must then be elected, and there are two possible outcomes:
- If a new primary replica is elected rapidly, clients can connect to it before their own approximation of the primary replica’s lease (the local lease) runs out.
- If the election takes longer, clients empty their caches and wait out the grace period while trying to discover the new primary replica (see the sketch after this list). Thanks to the grace period, our system can keep sessions running through failovers that exceed the normal lease timeout.
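To make the second case concrete, here is a minimal sketch of how a client library might locate the new primary replica during the grace period. It assumes the client can poll the cell’s replicas (for example, those advertised in DNS) and that any replica can report the current primary once the election finishes; `askReplica` and `findNewPrimary` are hypothetical stand-ins, not Chubby’s actual client API.

```go
package chubbysketch

import (
	"errors"
	"time"
)

// findNewPrimary polls each replica in the cell, asking which node is the
// current primary, and retries until one answers or the deadline (the end of
// the client's grace period) passes.
func findNewPrimary(replicas []string,
	askReplica func(addr string) (primary string, err error),
	deadline time.Time) (string, error) {

	for time.Now().Before(deadline) {
		for _, addr := range replicas {
			// Once the election has finished, any replica can report the
			// identity of the new primary.
			if primary, err := askReplica(addr); err == nil && primary != "" {
				return primary, nil
			}
		}
		time.Sleep(time.Second) // back off before polling the cell again
	}
	return "", errors.New("grace period elapsed without locating a primary")
}
```

If the deadline passes without locating a primary, the client library reports the session as expired to the application.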
Once a client has made contact with the new primary replica, the client library and the primary replica work together to give the application the impression that nothing went wrong. To do so, the new primary replica must recreate the in-memory state of the primary replica it replaced. It accomplishes this in part by:
- Reading data that has been replicated via the normal database replication protocol and stored durably on disk
- Acquiring state from clients
- Making cautious assumptions
Note: All sessions, held locks, and ephemeral files are recorded in the database.
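As a rough illustration of the first and third points above, the sketch below shows how a newly elected primary might reload durable records and make cautious assumptions about them. The record types, field names, and `rebuildState` function are all hypothetical; the real recovery path also incorporates handle state presented later by clients.

```go
package chubbysketch

import "time"

// Hypothetical durable records. Per the note above, sessions, held locks, and
// ephemeral files are all recorded in the replicated database, so the new
// primary can reload them from disk after a failover.
type SessionRecord struct {
	ID         uint64
	ClientAddr string
}

type LockRecord struct {
	Path      string
	SessionID uint64
}

// rebuildState reconstructs the in-memory tables: reload sessions and locks
// from the database, and make the cautious assumption that every recovered
// session is still live, granting it a fresh lease rather than guessing what
// the old primary had promised.
func rebuildState(sessions []SessionRecord, locks []LockRecord,
	newLease time.Duration) (leases map[uint64]time.Time, owners map[string]uint64) {

	leases = make(map[uint64]time.Time)
	for _, s := range sessions {
		leases[s.ID] = time.Now().Add(newLease) // conservative fresh lease
	}

	owners = make(map[string]uint64)
	for _, l := range locks {
		owners[l.Path] = l.SessionID // lock ownership survives the failover
	}
	return leases, owners
}
```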
Newly elected primary replica proceedings
The proceedings of the newly elected primary replica are shown in the following slides.
Quiz
After analyzing the proceedings of the newly elected primary replica shown in the slides above, answer the following questions:
In slide 8 of the slide deck above, how does the new primary replica prevent the recreation of an already closed handle?
Why does the client library go to such lengths to keep the old session going? Why not simply create a new session once it can contact the new primary replica?
Example scenario
The following illustration depicts the progression of an extended primary replica failover event in which the client must use its grace period to keep its session alive. Time increases from top to bottom, but the axis is not drawn to scale.
Note: Arrows from left to right show KeepAlive requests, and the ones from right to left show their replies.
Note: During the grace period, the client cannot tell whether its lease at the primary replica has already expired. It does not tear down its session, but it blocks all API calls from applications so that they do not observe inconsistent data.
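The sketch below mirrors this timeline from the client library’s point of view: it keeps one KeepAlive outstanding, enters jeopardy and blocks application calls when the local lease expires without a reply, and gives up only after the grace period. `sendKeepAlive`, `blockAPIs`, and `unblockAPIs` are hypothetical stand-ins, and the 12-second initial lease is only an illustrative value.

```go
package chubbysketch

import (
	"fmt"
	"time"
)

// keepAliveLoop models the client side of the KeepAlive exchange shown in the
// illustration. sendKeepAlive stands in for the real RPC: the primary holds
// the request until the lease is close to expiring, then replies with a new
// lease duration.
func keepAliveLoop(sendKeepAlive func() (newLease time.Duration, err error),
	gracePeriod time.Duration, blockAPIs, unblockAPIs func()) {

	localLease := time.Now().Add(12 * time.Second) // illustrative initial local lease

	for {
		replyCh := make(chan time.Duration, 1)
		go func() {
			if lease, err := sendKeepAlive(); err == nil {
				replyCh <- lease
			}
		}()

		select {
		case lease := <-replyCh:
			// Reply arrived in time: extend the local lease and immediately
			// send the next KeepAlive.
			localLease = time.Now().Add(lease)

		case <-time.After(time.Until(localLease)):
			// Local lease expired with no reply: enter jeopardy. Block
			// application calls and wait out the grace period for a reply
			// from the (possibly new) primary.
			blockAPIs()
			select {
			case lease := <-replyCh:
				localLease = time.Now().Add(lease)
				unblockAPIs() // session saved; applications resume
			case <-time.After(gracePeriod):
				fmt.Println("session expired; report the failure to the application")
				return
			}
		}
	}
}
```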
Point to ponder
How did Google implement Chubby’s database?
Backup
Each Chubby cell’s primary replica writes a snapshot of its database every few hours to a GFS file server in a separate building. Using a different site ensures that the backup in GFS survives damage to any single site and that the backups do not introduce cyclic dependencies into the system; a GFS cell deployed in the same building might otherwise depend on the very Chubby cell it stores backups for to elect its primary replica.
Backups offer both disaster recovery and a way to set up the database of a replacement replica without putting additional stress on active replica servers.
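A minimal sketch of such a backup job appears below. `snapshotDB`, `writeToGFS`, and the destination path are hypothetical placeholders; the actual snapshot format and GFS interaction are internal to Chubby.

```go
package chubbysketch

import (
	"fmt"
	"time"
)

// backupLoop periodically writes a database snapshot to a GFS file server in
// a different building, as described above.
func backupLoop(snapshotDB func() ([]byte, error),
	writeToGFS func(path string, data []byte) error,
	interval time.Duration) {

	for range time.Tick(interval) { // e.g., every few hours
		snap, err := snapshotDB()
		if err != nil {
			continue // try again at the next interval
		}
		// The destination cell lives in another building so that a single-site
		// failure cannot take out both the cell and its backups.
		path := fmt.Sprintf("/gfs/other-building-cell/chubby-backup-%d", time.Now().Unix())
		if err := writeToGFS(path, snap); err != nil {
			fmt.Println("backup failed, will retry at the next interval:", err)
		}
	}
}
```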
Mirroring
Chubby enables