RAID Level 0: Striping
This lesson introduces RAID level 0 and analyzes its capacity, reliability, and performance.
The first RAID level is actually not a RAID level at all, in that there is no redundancy. However, RAID level 0, or striping as it is better known, serves as an excellent upper bound on performance and capacity and thus is worth understanding.
The simplest form of striping will stripe blocks across the disks of the system as follows (assume here a 4-disk array):
From the figure above, you get the basic idea: spread the blocks of the array across the disks in a round-robin fashion. This approach is designed to extract the most parallelism from the array when requests are made for contiguous chunks of the array (as in a large, sequential read, for example). We call the blocks in the same row a stripe; thus, blocks 0, 1, 2, and 3 are in the same stripe above.
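The round-robin placement described above can be sketched in a few lines of Python. This is only an illustration, not how any particular RAID implementation lays out data; the function names `stripe_layout` and `format_layout` and the choice of 16 blocks are assumptions made for the example.

```python
def stripe_layout(num_blocks, num_disks):
    """Return a list of stripes; each stripe lists the logical blocks
    placed on disk 0, disk 1, ... in that row (chunk size = 1 block)."""
    rows = []
    for start in range(0, num_blocks, num_disks):
        rows.append(list(range(start, min(start + num_disks, num_blocks))))
    return rows


def format_layout(rows, num_disks):
    """Render the stripes as a small text table, one column per disk."""
    lines = ["  ".join(f"Disk {d}" for d in range(num_disks))]
    for row in rows:
        lines.append("  ".join(f"{block:>6}" for block in row))
    return "\n".join(lines)


# Hypothetical array: 4 disks holding 16 logical blocks.
layout = stripe_layout(16, 4)
print(format_layout(layout, 4))
```

Each printed row is one stripe, so the first row holds blocks 0, 1, 2, and 3, matching the description in the text.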
In the example, we have made the simplifying assumption that only 1 block (each of size, say, 4KB) is placed on each disk before moving on to the next. However, this arrangement need not be the case. For example, we could arrange the blocks across disks as shown in the figure below:
In this example, we place two 4KB blocks on each disk before moving on to the next disk. Thus, the chunk size of this RAID array is 8KB, and a stripe consists of 4 chunks, or 32KB of data.
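To make the chunk and stripe arithmetic concrete, here is a minimal sketch that places two blocks per chunk on a 4-disk array and computes the resulting sizes. The helper names (`chunked_layout`, `stripe_size_kb`) and the 16-block array are assumptions for illustration; only the 4KB block size, the 2-block chunk, and the 4 disks come from the text.

```python
BLOCK_SIZE_KB = 4  # block size used in the text


def chunked_layout(num_blocks, num_disks, blocks_per_chunk):
    """Return one list per disk with the logical blocks it stores,
    in the order they appear on that disk."""
    disks = [[] for _ in range(num_disks)]
    for block in range(num_blocks):
        chunk = block // blocks_per_chunk      # which chunk the block is in
        disks[chunk % num_disks].append(block)  # chunks go round-robin
    return disks


def stripe_size_kb(num_disks, blocks_per_chunk, block_size_kb=BLOCK_SIZE_KB):
    """A stripe is one chunk from every disk."""
    return num_disks * blocks_per_chunk * block_size_kb


# Hypothetical array: 4 disks, 16 logical blocks, 2 blocks per chunk.
for disk, blocks in enumerate(chunked_layout(16, 4, 2)):
    print(f"Disk {disk}: {blocks}")
print("chunk size (KB):", 2 * BLOCK_SIZE_KB)
print("stripe size (KB):", stripe_size_kb(4, 2))
```

With these parameters, disk 0 holds blocks 0, 1, 8, and 9, and the stripe size works out to 4 chunks of 8KB each, or 32KB.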
ASIDE: THE RAID MAPPING PROBLEM
Before studying the capacity, reliability, and performance characteristics of the RAID, we first present an aside on what we call the mapping problem. This problem arises in all RAID arrays; simply put, given a logical block to read or write, how does the RAID know exactly which physical disk and offset to access?
For these simple RAID levels, we do not need much sophistication in order to correctly map logical blocks onto their physical locations. Take the first striping example above (chunk size = 1 block = 4KB). In this case, given a logical block address A, the RAID can easily compute the desired disk and offset with two simple equations:
Disk = A % number_of_disks
Offset = A / number_of_disks
Note that these are all integer operations (e.g., 4 / 3 = 1 not 1.33333…). Let’s see how these equations work for a simple example. Imagine in the first RAID above that a request arrives for block 14. Given that there are 4 disks, this would mean that the disk we are interested in is (14 % 4 = 2): disk 2. The exact block is calculated as (14 / 4 = 3): block 3. Thus, block 14 should be found on the fourth block (block 3, starting at 0) of the third disk (disk 2, starting at 0), which is exactly where it is.
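The two equations translate directly into code. The sketch below uses Python's `%` and `//` operators for the integer remainder and integer division; the function name `map_block` is an assumption chosen for this example.

```python
def map_block(logical_block, num_disks):
    """Map a logical block address to (disk, offset) for striping
    with a chunk size of one block."""
    disk = logical_block % num_disks      # Disk = A % number_of_disks
    offset = logical_block // num_disks   # Offset = A / number_of_disks (integer division)
    return disk, offset


# The worked example from the text: block 14 on a 4-disk array.
print(map_block(14, 4))  # (2, 3)
```

The result (2, 3) means disk 2, block 3 on that disk, both counted from 0.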
You can think about how these equations would be modified to support different chunk sizes. Try it! It’s not too hard.
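For readers who want to check their answer, here is one possible generalization, offered as a sketch rather than the definitive solution. It assumes the chunk size is given as a whole number of blocks (`blocks_per_chunk`) and that chunks, rather than individual blocks, are placed round-robin; the function name `map_block_chunked` is hypothetical.

```python
def map_block_chunked(logical_block, num_disks, blocks_per_chunk):
    """Map a logical block address to (disk, offset) when each disk
    receives blocks_per_chunk consecutive blocks per stripe."""
    chunk = logical_block // blocks_per_chunk    # index of the chunk holding the block
    within = logical_block % blocks_per_chunk    # position of the block inside its chunk
    disk = chunk % num_disks                     # chunks are placed round-robin
    stripe = chunk // num_disks                  # which stripe (row) the chunk is in
    offset = stripe * blocks_per_chunk + within  # block index on that disk
    return disk, offset


# 4 disks, 2 blocks (8KB) per chunk, as in the second striping example.
print(map_block_chunked(14, 4, 2))  # (3, 2)
# With 1 block per chunk this reduces to the original two equations.
print(map_block_chunked(14, 4, 1))  # (2, 3)
```

Setting `blocks_per_chunk` to 1 gives back the simple mapping, which is a quick sanity check on the generalized formulas.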
Chunk sizes
Chunk size mostly affects the performance of the array. For example, a small chunk size implies that many files will get striped across many disks, thus increasing the parallelism of reads and writes to a single file. However, the positioning time to access blocks across multiple disks increases, because the positioning time for the entire request is determined by the maximum of the positioning times of the requests across all drives.
A big chunk size, on the other hand, reduces such intra-file parallelism, and thus relies on multiple concurrent requests to achieve high throughput. However, large chunk sizes reduce positioning time. If, for example, a single file fits within a chunk and thus is placed on a single disk, the positioning time incurred while accessing it will just be the positioning time of a single disk.
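The trade-off can be illustrated with a toy calculation. The sketch below counts how many disks a request touches for a given chunk size and then takes the maximum positioning time across those disks, as described above. The per-disk positioning times are made-up numbers, and the helper names `disks_touched` and `request_positioning_ms` are assumptions for the example; real positioning times vary from request to request.

```python
def disks_touched(start_block, num_blocks, num_disks, blocks_per_chunk):
    """Return the sorted list of disks a contiguous request accesses."""
    blocks = range(start_block, start_block + num_blocks)
    return sorted({(b // blocks_per_chunk) % num_disks for b in blocks})


def request_positioning_ms(disks, positioning_ms):
    """The request waits for the slowest disk involved to position."""
    return max(positioning_ms[d] for d in disks)


# Hypothetical positioning times (ms) for disks 0..3 at this moment.
positioning_ms = [4.0, 9.0, 6.0, 7.5]

# A 4-block file on a 4-disk array, with small and large chunks.
small_chunk = disks_touched(0, 4, 4, 1)  # one block per chunk
large_chunk = disks_touched(0, 4, 4, 4)  # whole file fits in one chunk

print(small_chunk, request_positioning_ms(small_chunk, positioning_ms))
print(large_chunk, request_positioning_ms(large_chunk, positioning_ms))
```

With one block per chunk the file spans all four disks, so the request can be served in parallel but pays the largest of the four positioning times. With a 4-block chunk the file sits on a single disk and pays only that disk's positioning time, at the cost of no parallelism within the file.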
Back to RAID-0 analysis
Let us now evaluate the capacity, reliability, and performance of striping. From the perspective of capacity, it is perfect: given N disks each of size B blocks, striping delivers ...