Entropy Coding: Everything You Need to Know
What Is Entropy Coding?
In this blog, we’ll take a detailed look at entropy coding, an important technique to compress or efficiently represent data without any loss. To understand the term entropy, we first need to understand a few concepts in this context.
When we’re dealing with data, there is an associated source and a set of possible symbols that the source can generate. For example, think of a system that generates alphanumeric messages or strings, like email messages. Another example is a system that generates numbers in a limited range, like image or video data, where pixel values typically lie in the range $[0, 255]$.
Let’s represent the set of symbols by $S = \{s_1, s_2, \ldots, s_N\}$ and the probability of occurrence of symbol $s_i$ by $p(s_i)$. Intuitively, the rarer a symbol, the more information its occurrence conveys, so the information of a symbol grows with $1/p(s_i)$.

We normally use the logarithm to adjust the scale. The base of the logarithm is taken to be two so that the information is represented in terms of bits. Therefore, the information of a symbol $s_i$ is:

$$I(s_i) = \log_2 \frac{1}{p(s_i)} = -\log_2 p(s_i)$$

The entropy of a system refers to the expected unpredictability, or the expected amount of information, of a symbol of the system. It is normally measured in bits and represents the average number of bits required to store a symbol of the system. The entropy of the system $S$ is:

$$H(S) = \sum_{i=1}^{N} p(s_i)\, I(s_i) = -\sum_{i=1}^{N} p(s_i)\log_2 p(s_i)$$
If every symbol of the system is encoded using a code of the same fixed length, we call it a fixed-length coding scheme. In contrast, if different symbols get codes of different lengths, the scheme is called a variable-length coding scheme. Entropy coding is a generic term for any variable-length coding technique that uses shorter codes for more frequent symbols and longer codes for less frequent symbols. No matter which uniquely decodable coding scheme we use, the average number of bits per symbol is at least the system’s entropy; entropy, in that sense, is the lower bound on the average bits per symbol of any coding scheme. Let’s look at an example to understand these concepts further.
Example

Let’s consider a system $S$ with six symbols, $a$ through $f$, whose probabilities of occurrence are listed in the table below:
Probability Distribution

| Symbol | Probability |
|--------|-------------|
| a      | 0.130       |
| b      | 0.185       |
| c      | 0.165       |
| d      | 0.150       |
| e      | 0.250       |
| f      | 0.120       |
The entropy of this system is computed as follows:

$$H(S) = -(0.130\log_2 0.130 + 0.185\log_2 0.185 + 0.165\log_2 0.165 + 0.150\log_2 0.150 + 0.250\log_2 0.250 + 0.120\log_2 0.120) \approx 2.54 \text{ bits}$$

This means that, on average, at least $2.54$ bits are needed to represent each symbol of this system, no matter what coding scheme is used.
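To see this number drop out of the formula, here is a minimal Python sketch (our own illustration, not part of the original article) that computes the entropy of the distribution above:

```python
import math

def entropy(probabilities):
    """Expected information in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# The probability distribution of system S from the table above.
probs = {"a": 0.130, "b": 0.185, "c": 0.165, "d": 0.150, "e": 0.250, "f": 0.120}
print(f"H(S) = {entropy(probs.values()):.4f} bits")  # -> H(S) = 2.5395 bits
```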
The Huffman Coding Scheme

The Huffman coding scheme is a basic, widely used entropy coding scheme. Another extensively used entropy coding scheme is arithmetic coding, whose most famous variants are binary arithmetic coding (BAC) and context-adaptive binary arithmetic coding (CABAC).
Huffman coding works by applying the following two steps repeatedly to form a binary tree:
1. Sorting the symbols by their probabilities
2. Merging the two symbols with the smallest probabilities into one working symbol whose probability is the sum of the probabilities of the merged symbols

These two steps are repeated until the probabilities sum up to $1$, that is, until a single root symbol with probability $1$ remains. (A runnable sketch of this procedure follows the code table below.)
Using the same system $S$ as above, the figure below traces the construction of the Huffman tree.

In the figure above, the first column shows the symbols and their probabilities. In the second column, the symbols are sorted by their probabilities. In the third column, the two symbols with the smallest probabilities are merged, and the values are sorted again; the arrows relate each value’s new position to its old position. In the subsequent columns, the merge and sort steps are repeated until we get a single probability of $1$.
To get the prefix code for each symbol, we observe the unique path from the root to the symbol’s leaf node; no such path can be a prefix of another path. There are two outgoing arrows from every internal (green) node. As a matter of convention, we label the arrow going to the left child with a $1$ and the arrow going to the right child with a $0$. Reading off the labels along each path gives the codes in the table below:
Huffman Codes

| Symbol | Code (from the root, left = 1, right = 0) | Code length | Probability |
|--------|-------------------------------------------|-------------|-------------|
| a      | 011                                       | 3           | 0.130       |
| b      | 00                                        | 2           | 0.185       |
| c      | 111                                       | 3           | 0.165       |
| d      | 110                                       | 3           | 0.150       |
| e      | 10                                        | 2           | 0.250       |
| f      | 010                                       | 3           | 0.120       |
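To make the merge-and-sort procedure concrete, here is a short Python sketch using a min-heap (again our own illustration; the article builds the tree by hand). Tie-breaking choices differ between implementations, so the exact bit patterns may not match the table, but the code lengths, and therefore the average bits per symbol, do:

```python
import heapq

def huffman_codes(probs):
    """Repeatedly merge the two least probable entries, as in the
    figure above, and return a {symbol: code} mapping."""
    # Heap entries: (probability, tiebreaker, {symbol: partial code}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, first = heapq.heappop(heap)   # smallest probability
        p2, _, second = heapq.heappop(heap)  # second smallest
        # Merging a subtree prepends one more bit to each of its codes.
        # Which subtree gets 1 and which gets 0 is arbitrary; it changes
        # the bit patterns but never the code lengths.
        merged = {s: "1" + c for s, c in first.items()}
        merged.update({s: "0" + c for s, c in second.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"a": 0.130, "b": 0.185, "c": 0.165, "d": 0.150, "e": 0.250, "f": 0.120}
codes = huffman_codes(probs)
avg = sum(probs[s] * len(code) for s, code in codes.items())
print(codes)
print(f"average bits per symbol: {avg:.3f}")  # -> 2.565
```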
Performance and Efficiency Considerations

In an encoding mechanism, we assign a code to each symbol and write it to a file. Similarly, a decoding mechanism reads codes from the encoded file and determines the corresponding symbols. Let’s consider the example above. If we assign fixed-length codes, they will be three bits long ($\lceil\log_2 6\rceil = 3$), and the average bits per symbol will also be three. With the Huffman codes, however, the average bits per symbol is:

$$L_{\text{avg}} = \sum_{i} p(s_i)\,l_i = 0.130(3) + 0.185(2) + 0.165(3) + 0.150(3) + 0.250(2) + 0.120(3) = 2.565 \text{ bits}$$
We can measure the performance of the Huffman codes compared to fixed-length coding in two ways. Let’s look at both one by one:
The first is the compression ratio:

$$\text{Compression ratio} = \frac{\text{average bits per symbol (fixed-length)}}{\text{average bits per symbol (Huffman)}} = \frac{3}{2.565} \approx 1.17$$

This tells us that if fixed-length coding takes $117$ bits to store some data, Huffman coding stores the same data in about $100$ bits.
The second is the space saving:

$$\text{Space saving} = 1 - \frac{2.565}{3} \approx 0.145 = 14.5\%$$

This tells us that if Huffman coding takes a given amount of space, fixed-length coding would take about $17\%$ more; equivalently, Huffman coding saves about $14.5$ bits out of every $100$ bits of fixed-length output.
As the entropy of the system gives the lower bound for representing its information, we can compute the efficiency of the Huffman coding scheme by comparing its average code length with the entropy of the system. The efficiency of fixed-length coding can be computed similarly. Let’s calculate both using the same example:

$$\text{Efficiency}_{\text{Huffman}} = \frac{H(S)}{L_{\text{avg}}} = \frac{2.54}{2.565} \approx 99\%, \qquad \text{Efficiency}_{\text{fixed}} = \frac{2.54}{3} \approx 84.7\%$$
This tells us that for every $100$ bits the Huffman coder writes, about $99$ are strictly required by the entropy bound, whereas for fixed-length coding only about $85$ of every $100$ bits are.
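Putting the three measurements together in a few lines of Python (again a sketch of ours, using the numbers computed above):

```python
H = 2.5395       # entropy of the system, in bits
fixed = 3.0      # bits per symbol with fixed-length codes
huffman = 2.565  # average bits per symbol with the Huffman codes

print(f"compression ratio:         {fixed / huffman:.2f}")     # ~1.17
print(f"space saving:              {1 - huffman / fixed:.1%}")  # ~14.5%
print(f"efficiency (Huffman):      {H / huffman:.1%}")          # ~99.0%
print(f"efficiency (fixed-length): {H / fixed:.1%}")            # ~84.7%
```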
Impact of Data Characteristics on Entropy Coding

If there are $N$ symbols and all of them are equally likely, the entropy is $\log_2 N$ bits, and a fixed-length code of $\lceil\log_2 N\rceil$ bits is essentially optimal; entropy coding has little to offer. The more skewed the probability distribution, the further the entropy drops below $\log_2 N$, and the more entropy coding can save by giving frequent symbols short codes. In our example, a uniform distribution over the six symbols would have an entropy of $\log_2 6 \approx 2.585$ bits, whereas the skewed distribution brings it down to about $2.54$ bits.
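A quick numerical check of this effect, assuming the same six-symbol alphabet (the third distribution is hypothetical, chosen only to exaggerate the skew):

```python
import math

def entropy(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

uniform = [1 / 6] * 6                                # no skew: entropy equals log2(6)
skewed = [0.130, 0.185, 0.165, 0.150, 0.250, 0.120]  # the example system S
extreme = [0.90, 0.02, 0.02, 0.02, 0.02, 0.02]       # hypothetical, highly skewed

print(f"{entropy(uniform):.3f}")  # 2.585 -- fixed 3-bit codes are near optimal
print(f"{entropy(skewed):.3f}")   # 2.540 -- a modest gain is available
print(f"{entropy(extreme):.3f}")  # 0.702 -- entropy coding can save a lot
```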
Importance of Entropy Coding in Data Compression

In most data, symbols occur with different frequencies, which favors entropy coding over the fixed-length coding scheme. Entropy coding represents such data more compactly than a fixed-length scheme can, which makes it a useful lossless compression technique. All the famous image and video compression standards, like JPEG, MPEG, and H.26x, use entropy coding as a last step before generating the compressed output. Entropy coding is considered a fundamental data compression tool whenever the probability distribution of the source symbols is known.
The variable-length codes used by any entropy coding scheme are prefix codes. This means that no code is a prefix of another code, a property that allows decoding without any ambiguity.
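A tiny decoder sketch illustrates why the prefix property matters: reading bits left to right, the first code that matches is always the right one, so no separators are needed. (The codes are the ones from the table above; the `decode` helper is our own illustration.)

```python
codes = {"a": "011", "b": "00", "c": "111", "d": "110", "e": "10", "f": "010"}
decode_table = {code: symbol for symbol, code in codes.items()}

def decode(bits):
    """Greedily match prefix codes in a bitstring, left to right."""
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in decode_table:  # a full code -- never a prefix of another
            symbols.append(decode_table[current])
            current = ""
    return "".join(symbols)

print(decode("001010"))  # "bee": 00 | 10 | 10
```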
To understand the basic concepts of computer systems and data compression, you may want to experience the following interactive course:
Information Representation in Computer Systems
Computer systems do not understand human instructions, nor can they perceive real-life data in its raw form. Therefore, computer systems require a way to store and represent information in an accessible way. They use software and hardware components combined to help retrieve information, store it, manipulate it, and convert it back to a human-accessible format. If you want to learn how computer systems perform complex tasks such as storing and manipulating textual data, videos, music, images of your cat, and much more, this course is meant for you.
Frequently Asked Questions
What is an example of entropy coding?

Entropy coding functions on a conceptual level. This enables it to work with any group of symbols, provided the probability information for each symbol is available. Take, for example, an integer-valued scalar $x$ that can assume the values $-1$, $0$, and $+1$ with probabilities $0.25$, $0.5$, and $0.25$, respectively. The entropy of this source is $0.25(2) + 0.5(1) + 0.25(2) = 1.5$ bits, and a Huffman code such as $0 \mapsto \texttt{0}$, $-1 \mapsto \texttt{10}$, $+1 \mapsto \texttt{11}$ achieves exactly $1.5$ bits per symbol on average, compared to $2$ bits for a fixed-length code.