Reed-Solomon codes are used to detect and correct errors in the data introduced during data transmission or on the data storage devices, such as a scratch on CDs corrupting the data. We will answer for error correction, where we know the position of the erred data. Such errors are known as erasures.
Reed-Solomon-based error correction works by adding some extra bits called parity at the end of the actual data and using those extra bits to recover the corrupted data. The code consists of two components –– an Encoder and a Decoder, as shown in the illustration below.
The encoder takes data as data symbols, where each data symbol refers to a fixed number of bits b of the data. The encoder appends parity symbols to data symbols. The data and parity symbols together make a codeword. Given a symbol size (in bits) b, we can have maximum
If we have d data symbols in a codeword with a b-bit symbol size, the parity symbols that can be added are n-d, and the number of symbol errors we can correct in a codeword depends on the number of parity symbols added to that codeword.
Let’s say we have the data symbol of size 4 bits. The maximum number of symbols that we can have in a codeword with a 4-bit symbol size is 15, as calculated below:
If there are 10 data symbols in the codeword with a 4-bit symbol size, we can add a maximum of 5 parity symbols, as calculated below:
With n-d parity bits, the decoder can correct up to
Now, let’s see the mathematical method used to correct erasures and how it works through an example.
With Lagrange interpolation, we find a polynomial of one degree less than the number of data symbols in the codeword. The encoder generates the polynomial using data symbols and finds parity symbols to be appended to the data using the generated polynomial. The decoder generates the polynomial using the symbols (data and parity) that are received at the receiver's side to recover data symbol erasures.
Let’s take an example of three data symbols, 4, 5, and 3, and create a polynomial of degree two for these points, add the parity symbol and see how the parity symbol is used by the decoder to find erased data symbol in the following slides.
If we have to deal with the worst case i.e. all data symbols lost, we need to add