Distributed Transaction

Learn about the mechanics of a two-phase commit for a distributed transaction.

Introduction

On a single node, the storage engine and the write-ahead log (WAL) handle the atomicity of a database transaction, as described earlier.

A distributed transaction is a transaction that spans multiple database nodes. It is inherently complex because all nodes must either commit together or abort together. If a subset of nodes commits and the rest abort, the data becomes inconsistent across nodes.
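To make the inconsistency concrete, here is a minimal sketch of a transfer between two nodes that commit independently, with no coordination. The `Node` class, the `naive_transfer` function, and the account names are hypothetical and exist only for illustration; a real database node would be far more involved.

```python
class Node:
    """A hypothetical database node holding account balances in memory."""

    def __init__(self, balances, fail_on_commit=False):
        self.balances = dict(balances)
        self.fail_on_commit = fail_on_commit  # simulates a local abort

    def apply(self, account, delta):
        """Commit a local change, or raise to simulate this node aborting."""
        if self.fail_on_commit:
            raise RuntimeError("node aborted")
        self.balances[account] += delta


def naive_transfer(src, dst, amount):
    """Commit on each node independently, with no coordination."""
    src.apply("alice", -amount)        # the debit commits on the first node
    try:
        dst.apply("bob", amount)       # the credit may abort on the second node
    except RuntimeError:
        pass                           # too late: the debit is already committed


node_a = Node({"alice": 100})
node_b = Node({"bob": 0}, fail_on_commit=True)
naive_transfer(node_a, node_b, 30)

total = node_a.balances["alice"] + node_b.balances["bob"]
print(total)  # 70: the debit committed but the credit aborted
```

The total started at 100 and ended at 70 because only one of the two nodes committed. Preventing this outcome is the purpose of an atomic commit protocol.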

In this section, we will discuss the two-phase commit, which is a popular distributed transaction technique.

Two-phase commit

A two-phase commit (2PC) is a protocol for achieving an atomic distributed commit across multiple nodes, such that all nodes commit together or abort together. The protocol ensures that at no time does a subset of nodes commit and the rest abort.

These are the components involved in a 2PC:

  • Coordinator: 2PC coordinates transactions across multiple nodes through a coordinator (also called a transaction manager). The coordinator is either a library or a process running on the client that orchestrates the distributed transaction.

  • Participant: The database nodes involved in the transaction are called participants.
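The two roles above can be sketched roughly as follows. This is a simplified in-memory illustration, not a production implementation: the class names, method names, and state strings are assumptions made for this example, and the sketch leaves out durable logging, timeouts, and recovery from coordinator or participant crashes.

```python
import uuid
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical in-memory stand-in for a database node."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates whether local checks pass
        self.state = "idle"

    def prepare(self, txn_id):
        # Phase 1: vote on whether this node is able to commit.
        if self.can_commit:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self, txn_id):
        # Phase 2 (success path): make the change permanent.
        self.state = "committed"

    def abort(self, txn_id):
        # Phase 2 (failure path): discard the change.
        self.state = "aborted"


class Coordinator:
    """Hypothetical coordinator that drives both phases."""

    def __init__(self, participants):
        self.participants = participants

    def run(self):
        txn_id = uuid.uuid4().hex  # globally unique transaction identifier
        # Phase 1: collect a vote from every participant.
        votes = [p.prepare(txn_id) for p in self.participants]
        # Phase 2: commit only if every vote is YES; otherwise abort everywhere.
        if all(v is Vote.YES for v in votes):
            for p in self.participants:
                p.commit(txn_id)
            return "committed"
        for p in self.participants:
            p.abort(txn_id)
        return "aborted"


healthy = [Participant("node-1"), Participant("node-2")]
outcome_ok = Coordinator(healthy).run()

mixed = [Participant("node-1"), Participant("node-2", can_commit=False)]
outcome_mixed = Coordinator(mixed).run()
```

In the second run, a single NO vote causes every participant to abort, including the one that voted YES. The set of participants never ends up split between committed and aborted.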

Workflow

A 2PC transaction begins with the coordinator generating a globally unique transaction identifier. Then, the coordinator ...