Using Checksums

This lesson discusses how storage systems use checksums to detect data corruption on disk.

With a checksum layout decided upon, we can now turn to how the checksums are actually used. When reading a block $D$, the client (i.e., the file system or storage controller) also reads its checksum from disk, $C_s(D)$, which we call the stored checksum (hence the subscript in $C_s$). The client then computes the checksum over the retrieved block $D$, which we call the computed checksum $C_c(D)$. At this point, the client compares the stored and computed checksums; if they are equal (i.e., $C_s(D) == C_c(D)$), the data has likely not been corrupted and thus can be safely returned to the user. If they do not match (i.e., $C_s(D) != C_c(D)$), this implies the data has changed since the time it was stored (since the stored checksum reflects the value of the data at that time). In this case, we have a corruption, which our checksum has helped us to detect.
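
The read path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: CRC-32 (via the standard `zlib` module) stands in for whatever checksum function the storage system actually uses, and the names `compute_checksum`, `verified_read`, and `CorruptionError` are hypothetical, not part of any real file system API.

```python
import zlib


class CorruptionError(Exception):
    """Raised when the stored and computed checksums disagree (hypothetical name)."""


def compute_checksum(block: bytes) -> int:
    # Assumption: CRC-32 stands in for the system's real checksum function.
    return zlib.crc32(block)


def verified_read(block: bytes, stored_checksum: int) -> bytes:
    """Return the block only if its computed checksum matches the stored one."""
    computed_checksum = compute_checksum(block)  # C_c(D)
    if computed_checksum != stored_checksum:     # C_s(D) != C_c(D)
        raise CorruptionError("stored and computed checksums differ")
    return block                                 # likely not corrupted


# Usage: the checksum is computed at write time and kept alongside the block.
original = b"some block contents"
stored = compute_checksum(original)              # C_s(D)

verified_read(original, stored)                  # returns the block unchanged

# Flip one bit to simulate corruption after the block was written.
corrupted = bytes([original[0] ^ 0x01]) + original[1:]
try:
    verified_read(corrupted, stored)
except CorruptionError:
    print("corruption detected")
```

Note that a match only means the data has likely not been corrupted: two different blocks can share a checksum, so the comparison detects corruption with high probability rather than with certainty.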

Given a corruption, the natural question is: what should we do about it? If the storage system has a redundant copy, the answer is easy: try to use it instead. If the storage system has no such copy, the likely answer is to return an error. In either case, realize that corruption detection is not a magic bullet; if there is no other way to get the non-corrupted data, you are simply out of luck.
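
The two outcomes above (fall back to a redundant copy, or return an error) might look like the following sketch. It assumes each copy of a block is available as a `(data, stored_checksum)` pair and again uses CRC-32 from `zlib` as a stand-in checksum; `read_with_fallback` is a hypothetical helper, not an API from any real storage system.

```python
import zlib
from typing import Iterable, Tuple


def read_with_fallback(copies: Iterable[Tuple[bytes, int]]) -> bytes:
    """Return the first copy whose computed checksum matches its stored checksum.

    Assumption: `copies` yields (data, stored_checksum) pairs, one per replica,
    and CRC-32 stands in for the system's real checksum function.
    """
    for data, stored_checksum in copies:
        if zlib.crc32(data) == stored_checksum:
            return data  # this copy passes the check; use it
    # No copy passed the check: detection alone cannot recover the data.
    raise IOError("all copies failed checksum verification")


# Usage: the primary copy is corrupted, the replica is intact.
good = b"block contents"
checksum = zlib.crc32(good)
corrupted = b"block c0ntents"

data = read_with_fallback([(corrupted, checksum), (good, checksum)])
assert data == good
```

If every copy fails verification, the sketch raises an error, which mirrors the point that checksums detect corruption but do not repair it.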
