Replication and Coordination of State Machines
Learn to replicate state machines with coordination to maintain a fault-tolerant service.
A single state machine is only as fault-tolerant as the node it runs on. Replicating a state machine on multiple nodes can make it fault-tolerant, provided that the replicas on non-faulty nodes behave similarly.
Outputs from replicas
By behaving similarly, we mean producing the same output. All state machines in a group of replicas will produce the same output if the following conditions are satisfied for every replica that runs on a non-faulty node (or processor):
Every replica starts in the same initial state.
Every replica executes the same requests in the same order.
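The two conditions above can be illustrated with a minimal sketch. The `CounterMachine` class and the `run_replica` helper are hypothetical names introduced only for this illustration; the sketch assumes a deterministic state machine whose state is a single integer.

```python
class CounterMachine:
    """A tiny deterministic state machine: its state is one integer."""

    def __init__(self, initial=0):
        self.state = initial

    def apply(self, request):
        """Execute one request and return the output it produces."""
        op, value = request
        if op == "add":
            self.state += value
        elif op == "mul":
            self.state *= value
        else:
            raise ValueError(f"unknown operation: {op}")
        return self.state


def run_replica(requests, initial=0):
    """Run one replica over the requests and collect its outputs."""
    machine = CounterMachine(initial)
    return [machine.apply(request) for request in requests]


requests = [("add", 2), ("mul", 5), ("add", 1)]

# Same initial state, same requests, same order: identical outputs.
outputs_a = run_replica(requests)
outputs_b = run_replica(requests)

# Same requests in a different order: the outputs diverge.
outputs_c = run_replica(list(reversed(requests)))
```

Here `outputs_a` and `outputs_b` match, while `outputs_c` differs because addition and multiplication do not commute with each other. Starting a replica from a different initial state breaks the agreement in the same way.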
How many replicas?
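Since failures are independent, we will assume that a failure can affect at most one node and, as a result, one state machine replica. The combined output of an ensemble of replicas resulting from replicating a state machine is the output of its fault-tolerant version. We call that version t fault-tolerant if it keeps working correctly as long as no more than t nodes fail.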
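If nodes can experience Byzantine failures, then for the replica group to be t fault-tolerant, that is, to tolerate up to t faulty nodes:
The group must have a minimum of 2t + 1 state machine replicas.
The group's output must be the output produced by a majority of the replicas in the group.
As long as the failures are no more than t, at least t + 1 of the 2t + 1 replicas are non-faulty, so the majority output is the correct one.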
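The majority rule for the group's output can be sketched as a small voter. This is an illustration under assumptions, not a prescribed implementation: `group_output` is a hypothetical helper, and it assumes that every replica's output has already been collected and is hashable.

```python
from collections import Counter


def group_output(replica_outputs):
    """Return the output produced by a strict majority of replicas.

    Returns None when no single output reaches a strict majority.
    """
    if not replica_outputs:
        return None
    # most_common(1) yields the most frequent output and its count.
    value, count = Counter(replica_outputs).most_common(1)[0]
    if count > len(replica_outputs) // 2:
        return value
    return None


# Three replicas, one of them faulty: the majority still agrees.
result_one_faulty = group_output([11, 11, 99])

# Three replicas, two of them faulty: no majority can be formed.
result_two_faulty = group_output([11, 42, 99])
```

A faulty replica may return an arbitrary value, so the voter never trusts a single output; it only accepts a value that more than half of the replicas agree on.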
Points to ponder
In situations when Byzantine failures are possible, how can we determine if nodes have failed?
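If nodes can only experience fail-stop failures, then we need at least t + 1 replicas to tolerate t failures. A fail-stop node never produces an incorrect output, so the output of any one replica that has not stopped can be used as the group's output.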
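For the fail-stop case, no voting is needed, because a replica either produces a correct output or none at all. The sketch below assumes that a stopped replica is represented by `None`; both that convention and the `fail_stop_output` name are hypothetical choices made for this illustration.

```python
def fail_stop_output(replica_outputs):
    """Return the first output from a replica that has not stopped.

    A stopped replica is modeled as None (an assumption of this sketch).
    Returns None only when every replica has stopped.
    """
    for output in replica_outputs:
        if output is not None:
            return output
    return None


# Two replicas, one of them stopped: the survivor's output is used.
result_one_stopped = fail_stop_output([None, 11])

# Both replicas stopped: the group can no longer produce an output.
result_all_stopped = fail_stop_output([None, None])
```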