The Read Path

Examine the Apache Cassandra Read Path along with the storage structures that expedite the read process.

High-level read path

Being a peer-to-peer system, any node in the Cassandra cluster can handle a read request. The node receiving the request becomes the coordinator node. The coordinator node checks whether enough replicas are available to satisfy the consistency level (CL) specified for the operation. If not, an exception is thrown and the read operation fails.

If CL can be achieved, the coordinator employs a dynamic snitch to determine the fastest replica. It sends a “direct read request” to the fastest replica and “digest requests” to a number of replicas required to fulfill the consistency level. A direct read request results in the replica responding with the requested data. A digest request results in the replica sending a digest/checksum-hash of the requested data. Digest requests reduce the network data traffic.

Once the data from the direct (full) read request has been received, the coordinator calculates its digest/hash and compares it to the digests sent by other replicas. If all digests are identical, and the required consistency level has been achieved, the data from the direct read request is delivered to the client. 

The diagram below illustrates the high-level read path with an example. The cluster consists of one datacenter titled datacenter1 comprising five nodes. Assume the RF for the keyspace is 3 and the consistency level (CL) for the read operation is TWO.

Press + to interact
High level read path
High level read path

The high level read path comprises the following steps:

  1. Node5 receives a read request for partition stored on Node1, Node2, and Node3 referred to as Replica1, Replica2, and Replica3, respectively. Node5 becomes the coordinator.

  2. The coordinator sends a direct read request to the fastest replica, Replica2 in this example.

  3. The coordinator sends a digest read request to second fastest replica, Replica3 in this example.

  4. Replica2 returns full data to the coordinator.

  5. Replica3 returns digest "28" to the coordinator.

  6. The coordinator calculates the digest from the full data returned in step 4. Let’s assume the digest calculated is "28".

  7. The coordinator compares digests from steps 5 and 6.

  8. Since the digests match & desired CL has been achieved, the coordinator responds to the client with data it received in step 4.

In the example illustrated above, the digests "28" matched, indicating identical data stored on both Replica2 and Replica3. In case the data stored on replicas is not identical, a process called ...