The Write Path
Explore Cassandra's write path, and the in-memory and on-disk storage structures and processes that enable incredibly fast reads and writes.
High-level write path
Being a peer-to-peer system, any node in a Cassandra cluster can handle a write request. The node receiving the write request is called the coordinator node. Based on the replication factor (RF) defined for the keyspace, the coordinator forwards the request to all replica nodes holding the partition's token range. The coordinator then waits for acknowledgments from the number of replicas defined by the consistency level (CL). Once the consistency level is met, the coordinator responds to the client.
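The coordinator's role can be sketched in a few lines. This is an illustrative simplification, not Cassandra's internal API: the `Replica` class and `handle_write` function are hypothetical names, and a real coordinator sends the writes concurrently and responds as soon as enough acknowledgments arrive.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    """A stand-in for a replica node that owns the partition's token range."""
    name: str
    is_up: bool = True

    def apply_write(self, key, value) -> bool:
        # A live replica persists the mutation and acknowledges it;
        # a down replica never responds.
        return self.is_up

def handle_write(replicas, key, value, consistency_level: int) -> bool:
    """Forward the write to every replica and report success once
    at least `consistency_level` replicas have acknowledged it."""
    acks = sum(1 for r in replicas if r.apply_write(key, value))
    return acks >= consistency_level

# RF = 3: three replicas own this partition's token range, one of them down.
replicas = [Replica("node1"), Replica("node2", is_up=False), Replica("node3")]

# CL = 2 (a quorum of 3): the write still succeeds with one replica down.
print(handle_write(replicas, "user:42", {"name": "Ada"}, consistency_level=2))  # True
```

With `consistency_level=3` (CL = ALL) the same write would fail, since only two replicas can acknowledge it.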
If a replica node is down, the coordinator stores a copy of the data, called a hint, and replays it when the replica returns to the ring.
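Hinted handoff can be sketched as follows. The hint storage and replay logic here are simplified assumptions for illustration (Cassandra actually persists hints on the coordinator's disk with a time limit), and the function names are hypothetical:

```python
# Coordinator-side hint store: replica name -> mutations waiting to be replayed.
hints: dict[str, list] = {}

def write_or_hint(replica_name: str, is_up: bool, mutation) -> bool:
    """Send the mutation to the replica if it is up; otherwise keep a hint."""
    if is_up:
        return True  # the replica stored the mutation and acknowledged it
    hints.setdefault(replica_name, []).append(mutation)  # save a hint instead
    return False

def on_replica_rejoin(replica_name: str, apply_fn) -> None:
    """Replay every stored hint once the replica returns to the ring."""
    for mutation in hints.pop(replica_name, []):
        apply_fn(mutation)

# node2 is down, so the coordinator hints the write...
write_or_hint("node2", False, ("user:42", {"name": "Ada"}))

# ...and replays it when node2 comes back.
on_replica_rejoin("node2", apply_fn=print)
```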
For multi-datacenter deployments, the coordinator contacts one node in each remote datacenter to act as a remote coordinator, which propagates the write to all replicas in its own datacenter.
The diagram above demonstrates a Cassandra cluster comprising two datacenters, datacenter1 and datacenter2, with a keyspace replication factor (RF) of 3 for both datacenters. With RF = 3, three nodes in each datacenter hold the partition's token range. The node in datacenter1 receiving the write request becomes the coordinator node. The coordinator forwards the write request to all three replicas responsible for storing the partition in datacenter1. The coordinator also forwards the write request to a node in datacenter2, which acts as a remote coordinator, propagating the write to all replicas in datacenter2. Thus, all active replicas in datacenter1 and datacenter2 receive and store a copy of the data.
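For this topology, the number of acknowledgments the coordinator waits for follows from the consistency level. A quorum is a majority of replicas, floor(RF / 2) + 1, where QUORUM counts the sum of RF across all datacenters and LOCAL_QUORUM counts only the local datacenter. A small illustrative helper (not part of Cassandra's API):

```python
def quorum(rf: int) -> int:
    """Majority of replicas for a given replication factor: floor(rf / 2) + 1."""
    return rf // 2 + 1

rf_per_dc = 3   # RF = 3 in each datacenter
num_dcs = 2     # datacenter1 and datacenter2

print(quorum(rf_per_dc))            # LOCAL_QUORUM: 2 of the 3 local replicas
print(quorum(rf_per_dc * num_dcs))  # QUORUM: 4 of the 6 replicas cluster-wide
```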
Please note that Node2 in datacenter1 is down. The coordinator saves a copy of the data to be written as a hint and replays the write once the replica returns to the ring.
Assuming the CL for the write operation was QUORUM, i.e.,