The **two-phase commit protocol** breaks a [database commit](https://www.educative.io/answers/what-are-database-transactions) into two phases to ensure correctness and fault tolerance in a [distributed](https://www.educative.io/answers/what-are-distributed-systems) database system.

# The protocol

Consider a transaction coordinator that manages the commits to database stores.
As the name suggests, the entire process is divided into two phases:

## Prepare phase
* After each database store (slave) has locally completed its transaction, it sends a "*DONE*" message to the transaction coordinator. Once the coordinator receives this message from *all* the slaves, it sends them a "*PREPARE*" message.
* Each slave responds to the "*PREPARE*" message by sending a "*READY*" message back to the coordinator. 
* If a slave responds with a "*NOT READY*" message or does not respond at all, then the coordinator sends a global "*ABORT*" message to all the other slaves. Only upon receiving an acknowledgment from *all* the slaves that the transaction has been aborted does the coordinator consider the entire transaction aborted.
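
The coordinator's decision at the end of this phase can be sketched as a small function. This is a minimal illustration, not a reference implementation: the names `Vote` and `prepare_decision` are hypothetical, and a missing reply is modeled as `None` on the assumption that the coordinator gives up waiting after some timeout.

```python
from enum import Enum
from typing import Iterable, Optional


class Vote(Enum):
    """Replies a slave can send in response to PREPARE."""
    READY = "READY"
    NOT_READY = "NOT READY"


def prepare_decision(votes: Iterable[Optional[Vote]]) -> str:
    """Return "COMMIT" only if every slave replied READY.

    A vote of None stands for a slave that did not respond at all
    (assumption: the coordinator stopped waiting after a timeout).
    """
    for vote in votes:
        if vote is not Vote.READY:
            # One NOT READY or missing reply is enough to abort globally.
            return "ABORT"
    return "COMMIT"
```

For example, `prepare_decision([Vote.READY, None])` yields `"ABORT"` because one slave never answered, while a list containing only `Vote.READY` yields `"COMMIT"`.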

## Commit phase
* Once the transaction coordinator has received the "*READY*" message from *all* the slaves, it sends a "*COMMIT*" message to all of them, which contains the details of the transaction that needs to be stored in the databases.
* Each slave applies the transaction and returns a "*DONE*" acknowledgment message back to the coordinator.
* The coordinator considers the entire transaction to be completed once it receives a "*DONE*" message from *all* the slaves.
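
Both phases can be put together in a short in-memory simulation. This is a sketch under simplifying assumptions: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names, messages are plain method calls instead of network traffic, and nothing is written to durable storage.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Participant:
    """Hypothetical in-memory stand-in for one database store (slave)."""
    name: str
    can_commit: bool = True  # assumption: False models a local failure
    state: str = "INIT"
    data: Dict[str, str] = field(default_factory=dict)

    def prepare(self) -> str:
        # Reply to PREPARE with READY or NOT READY.
        if self.can_commit:
            self.state = "READY"
            return "READY"
        self.state = "ABORTED"
        return "NOT READY"

    def commit(self, updates: Dict[str, str]) -> str:
        # Apply the transaction carried by COMMIT and acknowledge it.
        self.data.update(updates)
        self.state = "COMMITTED"
        return "DONE"

    def abort(self) -> str:
        # Discard the transaction and acknowledge the ABORT.
        self.state = "ABORTED"
        return "ACK"


def two_phase_commit(participants: List[Participant],
                     updates: Dict[str, str]) -> str:
    # Phase 1: send PREPARE to every slave and collect the replies.
    votes = [p.prepare() for p in participants]
    if all(vote == "READY" for vote in votes):
        # Phase 2: everyone is READY, so send COMMIT and await DONE.
        acks = [p.commit(updates) for p in participants]
        return "COMMITTED" if all(a == "DONE" for a in acks) else "UNKNOWN"
    # At least one NOT READY: send a global ABORT and await every ack.
    # Simplification: the ABORT also goes to the slave that refused.
    acks = [p.abort() for p in participants]
    return "ABORTED" if all(a == "ACK" for a in acks) else "UNKNOWN"
```

Calling `two_phase_commit` with two healthy participants applies the update to both, whereas setting `can_commit=False` on either one leaves every participant's `data` untouched.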

The following diagram illustrates a successful transaction using the two-phase commit protocol:


## Pros
The protocol makes commits *atomic* across the databases: either all of them apply an update or none do. This keeps the databases *consistent* and synchronized with one another.

## Cons
Two-phase commit is a blocking protocol: the failure of a single node blocks progress until that node recovers. Moreover, if the transaction coordinator fails after the slaves have replied "*READY*", the slaves cannot decide on their own whether to commit or abort, so the transaction stays unresolved until the coordinator recovers. Another drawback is that the protocol's latency depends on the slowest node: because the coordinator waits for acknowledgment messages from all the nodes, a single slow node slows down the entire transaction.
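
The latency point can be made concrete with a small calculation. This is an illustrative sketch only: `round_latency` is a hypothetical helper, response times are given in seconds, and the `timeout` parameter is an assumption about how long the coordinator waits before giving up on a silent node.

```python
from typing import Dict, Optional


def round_latency(response_times: Dict[str, Optional[float]],
                  timeout: float) -> float:
    """Time the coordinator spends waiting in one message round.

    A response time of None models a node that never answers; the
    coordinator then waits for the full timeout (an assumption).
    """
    waits = [
        timeout if t is None or t > timeout else t
        for t in response_times.values()
    ]
    # The round ends only when the slowest node has answered or timed out.
    return max(waits, default=0.0)
```

With response times of 0.01, 0.02, and 1.5 seconds, the round takes 1.5 seconds even though two of the three nodes answered almost immediately.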
