Consensus with Crash Failures in Synchronous Systems
Learn how to achieve consensus in systems with crash failures.
We'll cover the following
We’ll consider the weakest case, namely fail-stop failures. In this case, a faulty process can stop somewhere in the middle of its execution.
We recall the Definition of crash-resilience:
Crash-resilience: A protocol that can tolerate up to crashed processes is said to be -crash resilient.
Problem definition
We consider a system that contains a collection of processes, where each process can communicate reliably, whereby processes may fail only by crashing. Remember that in this case, any process might stop during a message sending step, resulting in only a subset of messages being sent, meaning a process may send a message to only a subset of all processes before crashing. In this case, we assume that the process remains halted and never reaches the decided state.
Conditions for the consensus problem in the fail-stop model
There are processes, of which at most might crash, i.e., at least processes are correct, and any process starts with an initial input value . The processes must decide for one of those values, satisfying the following properties (
- Termination: Every correct (not crashing) process is eventually decided in finite time.
- Agreement: All correct (not crashing) processes decide for the same value, i.e., no two processes decide differently.
- Validity: If all processes started by proposing the same initial value , then any (not crashing) process in the decision state has chosen that value .
Note: There are appropriate variations on the definition of validity in literature dependent on the application. An alternative and stronger validity condition for fail-stops are as follows.
A stronger validity condition
Validity: If a process decides a value , then was proposed by some process, meaning that the decision value must be the input value of a process.
Get hands-on with 1400+ tech skills courses.