What is the two-phase commit protocol?

The two-phase commit protocol (2PC) breaks a database commit into two phases so that a transaction in a distributed database system is atomic: it is either committed on every database store or on none of them.

The protocol

Consider a transaction coordinator that manages the commits to database stores. As the name suggests, the entire process is divided into two phases:

Prepare phase

  • After each database store (slave) has locally completed its transaction, it sends a “DONE” message to the transaction coordinator. Once the coordinator receives this message from all the slaves, it sends them a “PREPARE” message.
  • Each slave responds to the “PREPARE” message by sending a “READY” message back to the coordinator.
  • If a slave responds with a “NOT READY” message or does not respond at all, then the coordinator sends a global “ABORT” message to all the other slaves. Only upon receiving an acknowledgment from all the slaves that the transaction has been aborted does the coordinator consider the entire transaction aborted.
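The prepare phase can be sketched in Python roughly as follows. This is a minimal in-memory illustration under stated assumptions, not a production implementation: the `Participant` class, its `can_commit` and `responds` flags, and the `prepare_phase` function are hypothetical names introduced here, and a missing reply is modeled as `None` instead of a real network timeout. For simplicity, the sketch sends “ABORT” to every participant, including the one that voted “NOT READY”.

```python
from enum import Enum


class Vote(Enum):
    READY = "READY"
    NOT_READY = "NOT READY"


class Participant:
    """Hypothetical in-memory stand-in for a database store."""

    def __init__(self, name, can_commit=True, responds=True):
        self.name = name
        self.can_commit = can_commit
        self.responds = responds
        self.aborted = False

    def prepare(self):
        # Returning None models a participant that never answers.
        if not self.responds:
            return None
        return Vote.READY if self.can_commit else Vote.NOT_READY

    def abort(self):
        self.aborted = True
        return True  # acknowledgment that the abort was applied


def prepare_phase(participants):
    """Return True if every participant voted READY; otherwise abort all."""
    votes = [p.prepare() for p in participants]
    if all(v is Vote.READY for v in votes):
        return True
    # A NOT READY vote or a missing reply triggers a global abort.
    acks = [p.abort() for p in participants]
    # The transaction counts as aborted only once everyone has acknowledged.
    if not all(acks):
        raise RuntimeError("abort not acknowledged by all participants")
    return False
```

A single unwilling or silent participant is enough to make `prepare_phase` return `False` and mark every participant as aborted.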

Commit phase

  • Once the transaction coordinator has received the “READY” message from all the slaves, it sends a “COMMIT” message to all of them, which contains the details of the transaction that needs to be stored in the databases.
  • Each slave applies the transaction and returns a “DONE” acknowledgment message back to the coordinator.
  • The coordinator considers the entire transaction to be completed once it receives a “DONE” message from all the slaves.
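Putting both phases together, one round of the protocol might look like the sketch below. The `Store` class, its `on_prepare`, `on_commit`, and `on_abort` methods, and the `two_phase_commit` function are all hypothetical names chosen for illustration. Messages are modeled as direct method calls, so the sketch does not cover network failures or crash recovery.

```python
class Store:
    """Hypothetical in-memory database store (a participant)."""

    def __init__(self, name, ready=True):
        self.name = name
        self.ready = ready
        self.data = {}
        self.state = "INIT"

    def on_prepare(self):
        # Vote on whether this store can commit the transaction.
        self.state = "READY" if self.ready else "ABORTED"
        return "READY" if self.ready else "NOT READY"

    def on_commit(self, updates):
        # Apply the transaction details carried by the COMMIT message.
        self.data.update(updates)
        self.state = "COMMITTED"
        return "DONE"

    def on_abort(self):
        self.state = "ABORTED"
        return "ABORTED"


def two_phase_commit(stores, updates):
    """Run one transaction and return "COMMITTED" or "ABORTED"."""
    # Phase 1 (prepare): collect a vote from every store.
    votes = [s.on_prepare() for s in stores]
    if any(v != "READY" for v in votes):
        for s in stores:
            s.on_abort()
        return "ABORTED"
    # Phase 2 (commit): every store voted READY, so all of them commit.
    acks = [s.on_commit(updates) for s in stores]
    # The transaction is complete once every store has answered DONE.
    assert all(a == "DONE" for a in acks)
    return "COMMITTED"
```

With two willing stores, `two_phase_commit([Store("s1"), Store("s2")], {"x": 1})` returns `"COMMITTED"` and both stores hold `{"x": 1}`. If any store is created with `ready=False`, the call returns `"ABORTED"` and no store is updated.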

The following diagram illustrates a successful transaction using the two-phase commit protocol:

Pros

The protocol keeps the data consistent: either all the databases apply an update or none do. This ensures that the databases stay synchronized after every transaction.

Cons

The two-phase commit is a blocking protocol: the failure of a single node blocks progress until that node recovers. Moreover, if the transaction coordinator fails after the slaves have sent “READY”, the slaves cannot tell whether to commit or abort, so the transaction stays unresolved until the coordinator recovers. Another drawback is that the protocol’s latency depends on the slowest node: since the coordinator waits for all the nodes to send acknowledgment messages, a single slow node slows down the entire transaction.
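The latency point can be made concrete with a small calculation. The sketch below assumes the coordinator sends “PREPARE” to all nodes in parallel and gives up after a fixed timeout. Both are assumptions made for illustration, and `prepare_round` is a hypothetical helper, not part of any library.

```python
def prepare_round(response_times, timeout):
    """Estimate the outcome and duration of one prepare round.

    response_times: seconds each node takes to answer PREPARE,
                    or None for a node that never answers.
    timeout:        seconds the coordinator is willing to wait.
    """
    if any(t is None or t > timeout for t in response_times):
        # The coordinator waits for the full timeout, then aborts.
        return ("ABORT", timeout)
    # All nodes answered in time: the round lasts as long as the slowest one.
    return ("READY", max(response_times))
```

For example, `prepare_round([0.01, 0.02, 0.5], timeout=1.0)` returns `("READY", 0.5)`: two fast nodes still wait on the one that takes half a second. A node that never answers makes the round take the full timeout and end in an abort.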

Copyright ©2024 Educative, Inc. All rights reserved