Jensen-Shannon Divergence and Cross-Entropy Loss

Learn about Jensen-Shannon Divergence and Cross-Entropy Loss.

The PyTorch Image Models (timm) library provides a special loss called JsdCrossEntropy. As the name suggests, it combines the Jensen-Shannon divergence and cross-entropy to compute the total loss.

Jensen-Shannon divergence

The Jensen-Shannon divergence is a well-known quantity in probability theory and statistics. It measures the similarity between two probability distributions, is symmetric, and always takes finite values, unlike the Kullback-Leibler divergence on which it is built.
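As a quick illustration (not part of the timm library), the sketch below computes the Jensen-Shannon divergence between two discrete distributions in PyTorch. It averages two Kullback-Leibler terms, each measured against the mixture M = (P + Q) / 2:

```python
import torch
import torch.nn.functional as F

def jensen_shannon_divergence(p, q, eps=1e-7):
    # p and q are probability vectors (non-negative, summing to 1).
    m = 0.5 * (p + q)  # mixture distribution M = (P + Q) / 2
    # F.kl_div expects log-probabilities as its first argument,
    # so F.kl_div(m.log(), p) computes KL(p || m).
    kl_pm = F.kl_div(m.clamp_min(eps).log(), p, reduction='sum')
    kl_qm = F.kl_div(m.clamp_min(eps).log(), q, reduction='sum')
    return 0.5 * (kl_pm + kl_qm)

p = torch.tensor([0.7, 0.2, 0.1])
q = torch.tensor([0.1, 0.3, 0.6])
print(jensen_shannon_divergence(p, q))  # 0 when p == q, larger as they differ
```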

Cross-entropy

Cross-entropy is another well-known concept, this time from information theory. For two probability distributions p and q, it is the average number of bits needed to encode events drawn from p when the coding scheme is optimized for q. The closer the predicted distribution q is to the true distribution p, the smaller the cross-entropy, which is why it is widely used as a classification loss.
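For example, the cross-entropy between a one-hot ground-truth distribution p and a predicted distribution q can be computed directly from its definition, H(p, q) = -sum_i p_i * log(q_i). The snippet below is a minimal illustration with made-up numbers:

```python
import torch

# One-hot ground truth: the true class is index 1.
p = torch.tensor([0.0, 1.0, 0.0])
# Model prediction (already normalized to probabilities).
q = torch.tensor([0.1, 0.7, 0.2])

# H(p, q) = -sum_i p_i * log(q_i)
cross_entropy = -(p * q.log()).sum()
print(cross_entropy)  # -log(0.7) ≈ 0.357
```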

The JsdCrossEntropy class

The JsdCrossEntropy class combines both ideas: a standard cross-entropy term and a Jensen-Shannon divergence term. The cross-entropy term compares the model's predictions against the labels, while the Jensen-Shannon term encourages the predictions for different augmented views of the same input to agree with one another, which helps the model learn representations that are stable under data augmentation.

The Jensen-Shannon part does not need labels on its own, because it only measures how different the predictions are from each other; labels are only required for the cross-entropy part.
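A minimal usage sketch is shown below. It assumes a recent timm version in which JsdCrossEntropy accepts num_splits and alpha arguments and expects the logits for the clean and augmented views to be concatenated along the batch dimension, with the clean split first; the shapes and values here are made up for illustration:

```python
import torch
from timm.loss import JsdCrossEntropy

# Assumed batch layout: the first chunk holds the clean images and the
# remaining chunks hold augmented versions of the same images, all
# concatenated along the batch dimension.
num_splits = 3        # clean + two augmented views
batch_per_split = 8
num_classes = 10

loss_fn = JsdCrossEntropy(num_splits=num_splits, alpha=12)

# Dummy logits for the whole concatenated batch and labels for every sample;
# only the clean split's labels are needed for the cross-entropy term.
logits = torch.randn(num_splits * batch_per_split, num_classes)
targets = torch.randint(0, num_classes, (num_splits * batch_per_split,))

loss = loss_fn(logits, targets)
print(loss.item())
```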

