The Paxos Algorithm

In this lesson, we will examine how the Paxos algorithm solves the consensus problem.

Some algorithms could arguably be applied as solutions to the consensus problem.

For instance, the 2-phase commit protocol could be used, where the coordinator would drive the voting process.

However, such a protocol would have very limited fault tolerance, since the failure of a single node (the coordinator) could bring the whole system to a halt.

The obvious next step is to allow multiple nodes to inherit the role of the coordinator in these failure cases. This would then mean that there might be multiple primaries that might produce conflicting results.

This phenomenon is demonstrated in the lesson multi-primary replication and the 3-phase commit lesson.

One of the first algorithms that could solve the consensus problem safely under these failures is the Paxos algorithm.

Story of the Paxos algorithm

This algorithm guarantees that the system will come to an agreement on a single value and tolerate the failure of any number of nodes (potentially all of them) as long as more than half the nodes work properly at any time, which is a significant improvement.

Funnily enough, this algorithm was invented by Leslie Lamport during his attempt to prove that this is actually impossible!

He decided to explain the algorithm in terms of a parliamentary procedure used in an ancient, fictional Greek island called Paxos.

Despite being elegant and highly entertaining, this first paperL. Lamport, “The Part-time Parliament,” ACM Transactions on Computer Systems (TOCS), 1998. was not well received by the academic community, who found it extremely complicated and could not discern its applicability in the field of distributed systems.

A few years later and after several successful attempts to use the algorithm in real-life systems, Lamport decided to publish a second paperL. Lamport, “Paxos Made Simple,” ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121, December 2001), 2001., explaining the algorithm in simpler terms and demonstrating how it can be used to build an actual, highly available distributed system.

A historical residue of all this is the fact that the Paxos algorithm is regarded as a rather complicated algorithm until today. Hopefully, this section will help dispel this myth.

Roles

The Paxos algorithm defines three different roles:

  • Proposers
  • Acceptors
  • Learners

Every node in the system can potentially play multiple roles.

Proposer

A proposer is responsible for proposing values (potentially received from clients’ requests) to the acceptors and trying to persuade them to accept their value to arrive at a common decision.

Acceptor

An acceptor is responsible for receiving these proposals and replying with their decision on whether this value can be ...