...

The Paxos Algorithm

In this lesson, we will examine how the Paxos algorithm solves the consensus problem.

Some algorithms could arguably be applied as solutions to the consensus problem.

For instance, the 2-phase commit protocol could be used, where the coordinator would drive the voting process.

However, such a protocol would have very limited fault tolerance, since the failure of a single node (the coordinator) could bring the whole system to a halt.
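The single point of failure described above can be illustrated with a minimal sketch. It is a simplified model and not a full 2-phase commit implementation; the names `Coordinator`, `Participant`, `CoordinatorDown`, and `try_vote` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


class CoordinatorDown(Exception):
    """Raised when the single (hypothetical) coordinator is unavailable."""


@dataclass
class Participant:
    name: str
    vote: bool                 # True means "yes, I can commit"
    decision: str = "unknown"  # stays "unknown" until the coordinator decides


@dataclass
class Coordinator:
    participants: list
    alive: bool = True

    def run_vote(self) -> str:
        if not self.alive:
            # No other node may drive the vote, so nothing gets decided.
            raise CoordinatorDown("coordinator failed; no decision reached")
        # Voting phase: a commit requires a "yes" from every participant.
        decision = "commit" if all(p.vote for p in self.participants) else "abort"
        # Completion phase: the coordinator announces the outcome.
        for p in self.participants:
            p.decision = decision
        return decision


def try_vote(coordinator: Coordinator) -> str:
    """Return the decision, or "blocked" if the coordinator is down."""
    try:
        return coordinator.run_vote()
    except CoordinatorDown:
        return "blocked"


healthy = Coordinator([Participant("a", True), Participant("b", True)])
failed = Coordinator([Participant("a", True), Participant("b", True)], alive=False)
print(try_vote(healthy))  # the vote completes
print(try_vote(failed))   # every participant is left undecided
```

In this sketch the participants are all healthy in both runs; the only difference is the state of the coordinator, which is enough to leave the second group without a decision.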

The obvious next step is to allow other nodes to take over the role of the coordinator when it fails. However, this means that several nodes might act as the primary at the same time and produce conflicting results.

This phenomenon is demonstrated in the lessons on multi-primary replication and the 3-phase commit protocol.
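A deliberately naive sketch of the conflict is shown here. It assumes two coordinators that each believe they are in charge and that can each reach only part of the system; the function `naive_decide` and the node names are hypothetical.

```python
def naive_decide(nodes: dict, reachable: list, value: str) -> None:
    """A hypothetical coordinator imposes `value` on every node it can reach.

    There is no check for a competing coordinator, which is the flaw
    this sketch is meant to expose.
    """
    for name in reachable:
        nodes[name] = value


# Four nodes, none of which has a decided value yet.
nodes = {"a": None, "b": None, "c": None, "d": None}

# Two coordinators act at the same time on different parts of the system.
naive_decide(nodes, ["a", "b"], "x")
naive_decide(nodes, ["c", "d"], "y")

decided_values = set(nodes.values())
print(decided_values)  # more than one value means the nodes disagree
```

Since neither coordinator needs the cooperation of a majority of the nodes, nothing prevents both of them from succeeding with different values.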

One of the first algorithms that could solve the consensus problem safely under these failures is the Paxos algorithm.

Story of the Paxos algorithm

This algorithm guarantees that the system will come to an agreement on a single value. It tolerates the failure of any number of nodes over time (potentially all of them at some point), as long as more than half of the nodes are working properly at any given time. This is a significant improvement over a protocol that halts when a single coordinator fails.
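The "more than half" condition can be made concrete with a small calculation. The sketch below is not part of the Paxos protocol itself; it only illustrates the usual reasoning behind majorities, namely that any two majorities of the same group share at least one node. The function names are hypothetical.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


def max_tolerated_failures(n: int) -> int:
    """How many nodes can be down while a majority is still working."""
    return n - majority(n)


def all_majorities_intersect(n: int) -> bool:
    """Brute-force check that every pair of majorities shares a node."""
    quorums = [set(c) for c in combinations(range(n), majority(n))]
    return all(a & b for a in quorums for b in quorums)


for n in (3, 4, 5):
    print(n, majority(n), max_tolerated_failures(n), all_majorities_intersect(n))
```

Under these definitions, a group of 5 nodes needs 3 working nodes and can lose 2, while a group of 4 also needs 3 and can lose only 1, which is one reason odd group sizes are a common choice.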

Funnily enough, this algorithm was invented by Leslie Lamport during his attempt to prove that such an algorithm is actually impossible!

He decided to explain the algorithm in terms of a parliamentary procedure used on a fictional ancient Greek island called Paxos.

Despite being elegant and highly entertaining, this first paper (L. Lamport, “The Part-time Parliament,” ACM Transactions on Computer Systems (TOCS), 1998) was not well received by the academic community, who found it extremely complicated and could not discern its applicability in the field of distributed systems.

A few years later, after several successful attempts to use the algorithm in real-life systems, Lamport decided to publish a second paper (L. Lamport, “Paxos Made Simple,” ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121), December 2001), explaining the algorithm in simpler terms and demonstrating how it can be used to build an actual, highly available distributed system.

A historical residue of all this is that the Paxos algorithm is still regarded today as a rather complicated algorithm. Hopefully, this section will help dispel this myth.
