A New Problem: Misdirected Writes
In this lesson, we look at the problem of misdirected writes and discuss a solution for it.
The basic scheme described in the previous lesson works well in the general case of corrupted blocks. However, modern disks have a couple of unusual failure modes that require different solutions.
The first failure mode of interest is called a misdirected write. This arises in disk and RAID controllers that write the data to disk correctly, but in the wrong location. In a single-disk system, this means that the disk wrote block D_x not to address x (as desired) but rather to address y (thus “corrupting” D_y). In addition, within a multi-disk system, the controller may also write D_{i,x} not to address x of disk i but rather to some other disk j. Thus our question:
CRUX: HOW TO HANDLE MISDIRECTED WRITES
How should a storage system or disk controller detect misdirected writes? What additional features are required from the checksum?
Adding a physical identifier
The answer, not surprisingly, is simple: add a little more information to each checksum. In this case, adding a physical identifier (physical ID) is quite helpful. For example, if the stored information now contains the checksum and both the disk and sector numbers of the block, it is easy for the client to determine whether the correct information resides at a particular location. Specifically, if the client is reading block 4 on disk 10 (D_{10,4}), the stored information should include that disk number and sector offset, as shown below. If the information does not match, a misdirected write has taken place, and the corruption is detected. Here is an example of what this added information would look like on a two-disk system. Note that this figure, like the others before it, is not to scale, as the checksums are usually small (e.g., 8 bytes) whereas the blocks are much larger (e.g., 4 KB or bigger):
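To make the check concrete, here is a minimal sketch in Python of a two-disk system that stores a physical ID next to each checksum. It is an illustration under stated assumptions, not how any particular disk or RAID controller does it: the names StoredBlock, make_block, and verify_read are hypothetical, the disks are modeled as in-memory dictionaries, and CRC-32 stands in for whatever checksum the system really uses.

```python
import zlib
from dataclasses import dataclass


@dataclass
class StoredBlock:
    """What sits at one on-disk location (hypothetical layout)."""
    data: bytes
    checksum: int  # CRC-32 of data; a stand-in for the real checksum
    disk_id: int   # physical ID: the disk the writer intended
    sector: int    # physical ID: the sector the writer intended


def make_block(data: bytes, disk_id: int, sector: int) -> StoredBlock:
    """Build the stored form of a block destined for (disk_id, sector)."""
    return StoredBlock(data, zlib.crc32(data), disk_id, sector)


def verify_read(block: StoredBlock, disk_id: int, sector: int) -> str:
    """Check a block that was just read from (disk_id, sector)."""
    if zlib.crc32(block.data) != block.checksum:
        return "corrupt data"
    if (block.disk_id, block.sector) != (disk_id, sector):
        # The contents are intact, but they belong somewhere else.
        return "misdirected write"
    return "ok"


# A two-disk system with three sectors each: disks[disk_id][sector].
disks = {0: {}, 1: {}}
for d in disks:
    for s in range(3):
        disks[d][s] = make_block(f"D{d},{s}".encode(), d, s)

# Simulate a misdirected write: a block meant for disk 1, sector 2
# lands on disk 1, sector 0 instead.
disks[1][0] = make_block(b"new contents", 1, 2)

status_wrong = verify_read(disks[1][0], 1, 0)  # stored ID (1, 2) != (1, 0)
status_ok = verify_read(disks[0][1], 0, 1)     # untouched block still matches
print(status_wrong, "/", status_ok)
```

The checksum alone would pass for the misplaced block, because its data and checksum were written together and agree with each other. Only the comparison of the stored disk and sector numbers against the location being read reveals the problem.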