This blog discusses an important compression technique for color images known as chroma subsampling. The technique is interesting because it is based on insights about the human visual system. The main idea is to achieve compression by discarding image data that we cannot see or to which we are less sensitive. Compression techniques in which some data or information is lost are known as lossy compression methods.
Digital images serve many practical purposes. They may be meant for human viewing or for processing by machines. Chroma subsampling is not always a wise choice if the image is intended for machines. However, it is a useful method of lossy compression if the image is intended for the human visual system.
To decide which part of the image data can be discarded during lossy compression, it is important to understand how our visual system works. When we look at an object, the lens of the eye forms an inverted image of it on the retina. Look at the following image showing an illustrated cross-section of the human eye:
At the retina, millions of tiny photoreceptors sense this image and send the information to the brain through the optic nerve. These photoreceptors are broadly classified into two types, rods and cones. The rods are very sensitive to light, while the cones are responsible for color vision. The retina of the human eye contains approximately 91 million rods but only about 4.5 million cones. This is one reason the human eye is more sensitive to luminance than to color in visual data. This insight is the foundation of the idea that we can reduce the resolution of the color component without noticeably compromising the perceived visual quality of the image.
Since we want to reduce the resolution of the color component in the image, we first have to separate the color part from the luminance part of the image. This can be achieved by transforming the image from
Three components,
These equations are also applied pixel-wise to transform the color space of the image. Both sets of equations discussed above are for the case when the
The term subsampling means keeping a reduced number of samples out of those available for the same image area. As discussed above, each pixel of an image in
The leftmost column of the illustration shows three layers of sixteen pixels in the
There are other subsampling methods like 4:2:2, in which along four samples of
Let's calculate the amount of space saved by using chroma subsampling. Take an image of full HD size, that is, 1920 × 1080 pixels.
The transformation from
The compression ratio is defined as the ratio of the uncompressed size to the compressed size. The amount of space saved, expressed as a fraction of the uncompressed size, can be computed as follows:
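Written out, the two definitions above can be sketched as follows. The symbols $S_u$ and $S_c$ are names assumed here for the uncompressed and compressed sizes; they are not taken from the original text.

```latex
\[
\text{compression ratio} = \frac{S_u}{S_c},
\qquad
\text{space saving} = \frac{S_u - S_c}{S_u} = 1 - \frac{S_c}{S_u}
\]
```

A compression ratio greater than 1 corresponds to a positive space saving, and the two quantities carry the same information in different forms.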
Let's compute these quantities for the example we are working with:
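The arithmetic can be checked with a short Python sketch. It assumes a full HD frame of 1920 × 1080 pixels and one byte (8 bits) per sample for each of the three components; the function names `image_bytes` and `compression_stats` are hypothetical helpers chosen for this illustration, not part of any library.

```python
# Chroma divisors (horizontal, vertical) for each subsampling scheme.
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),  # no subsampling
    "4:2:2": (2, 1),  # half the chroma samples horizontally
    "4:2:0": (2, 2),  # half horizontally and half vertically
}


def ceil_div(a, b):
    """Integer division rounded up, so partial blocks still get a sample."""
    return -(-a // b)


def image_bytes(width, height, scheme, bytes_per_sample=1):
    """Size of one luminance plane plus two chroma planes.

    Assumes every sample occupies bytes_per_sample bytes (1 byte = 8 bits).
    """
    h_div, v_div = CHROMA_DIVISORS[scheme]
    luma = width * height
    chroma = ceil_div(width, h_div) * ceil_div(height, v_div)
    return (luma + 2 * chroma) * bytes_per_sample


def compression_stats(width, height, scheme):
    """Return (uncompressed, compressed, compression ratio, space saving)."""
    uncompressed = image_bytes(width, height, "4:4:4")
    compressed = image_bytes(width, height, scheme)
    ratio = uncompressed / compressed
    saving = 1 - compressed / uncompressed
    return uncompressed, compressed, ratio, saving


if __name__ == "__main__":
    for scheme in ("4:2:0", "4:2:2"):
        u, c, ratio, saving = compression_stats(1920, 1080, scheme)
        print(f"{scheme}: {u} -> {c} bytes, ratio {ratio:.2f}, saving {saving:.1%}")
```

Under these assumptions, the uncompressed frame takes 6,220,800 bytes and the 4:2:0 version takes 3,110,400 bytes, giving a compression ratio of 2 and a space saving of one-half.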
This shows that by using 4:2:0 chroma subsampling, we can save half of the space required to store or transmit an image. In the case of 4:2:2, the compression ratio is 1.5, which saves one-third of the space.
The well-known image compression standard JPEG commonly uses 4:2:0 chroma subsampling when compressing color images in lossy mode. JPEG also supports other subsampling formats like 4:4:4, 4:2:2, and 4:1:1. In the 4:1:1 format, the chroma resolution is reduced to one-fourth horizontally, keeping one chroma sample for every four luminance samples in a row. In 4:4:4, there is no subsampling involved, and the chroma component is completely preserved. A JPEG encoder applies a sequence of processing steps one after the other to compress a given image, and chroma subsampling is one of those steps. The 4:2:0 chroma subsampling is also part of many other image and video compression standards, including MPEG and H.26x.
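To make the four formats concrete, here is a minimal Python sketch that downsamples a single chroma plane. It assumes the plane is a list of equal-length rows of integers and that each block of chroma samples is replaced by its rounded average. Averaging is only one possible choice made for this illustration; real encoders may use different filters or simply drop samples. The names `BLOCKS` and `subsample_plane` are hypothetical.

```python
# Block size (width, height) that collapses into one chroma sample.
BLOCKS = {
    "4:4:4": (1, 1),  # every sample kept
    "4:2:2": (2, 1),  # one sample per two in a row
    "4:2:0": (2, 2),  # one sample per 2x2 block
    "4:1:1": (4, 1),  # one sample per four in a row
}


def subsample_plane(plane, scheme):
    """Downsample one chroma plane by averaging each block (an assumed filter)."""
    block_w, block_h = BLOCKS[scheme]
    rows, cols = len(plane), len(plane[0])
    out = []
    for top in range(0, rows, block_h):
        out_row = []
        for left in range(0, cols, block_w):
            # Collect the block, clipping at the right and bottom edges.
            block = [
                plane[r][c]
                for r in range(top, min(top + block_h, rows))
                for c in range(left, min(left + block_w, cols))
            ]
            # Rounded integer mean of the block.
            out_row.append((sum(block) + len(block) // 2) // len(block))
        out.append(out_row)
    return out


if __name__ == "__main__":
    cb = [
        [10, 20, 30, 40],
        [50, 60, 70, 80],
        [90, 100, 110, 120],
        [130, 140, 150, 160],
    ]
    for scheme in BLOCKS:
        print(scheme, subsample_plane(cb, scheme))
```

For the 4 × 4 example plane, 4:2:0 leaves a 2 × 2 plane, 4:2:2 leaves four rows of two samples, and 4:1:1 leaves four rows of one sample, while 4:4:4 returns the plane unchanged. The luminance plane is never passed through this function, since it is kept at full resolution.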