Refining Neural Networks with Class Weights
Discover how adjusting class weights in neural network models can address the challenge of predicting rare events.
Loss function
A rare event problem has very few positively labeled samples. As a result, even when the classifier misclassifies the positive labels, their effect on the loss function is minuscule. Consider the binary cross-entropy loss:

$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\,\right]$$

where

- $y_i \in \{0, 1\}$ are the true labels.
- $\hat{y}_i$ is the predicted probability for sample $i$.
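To make this concrete, here is a small numeric sketch (the 1% positive rate and the predicted probabilities are hypothetical, not from the lesson). A classifier that effectively predicts "negative" for everything still achieves a small loss:

```python
import numpy as np

# Hypothetical batch: 1,000 samples, only 10 positively labeled (1%).
y_true = np.zeros(1000)
y_true[:10] = 1.0

# Suppose the model predicts ~0 everywhere, missing every positive.
y_pred = np.full(1000, 0.01)

eps = 1e-7  # guard against log(0)
loss = -np.mean(
    y_true * np.log(y_pred + eps) + (1 - y_true) * np.log(1 - y_pred + eps)
)

# The 10 misclassified positives contribute ~46 in total, the 990
# well-classified negatives ~10, so the mean loss is only ~0.056 --
# the loss looks "good" even though zero positives were caught.
print(f"Loss despite missing all positives: {loss:.4f}")
```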
The loss function in the equation above gives equal importance to the positive and negative samples. To overcome this, we can overweight the positive samples and underweight the negative ones. The weighted binary cross-entropy loss then becomes:

$$\mathcal{L}_w = -\frac{1}{N}\sum_{i=1}^{N}\left[\, w_1\, y_i \log(\hat{y}_i) + w_0\, (1 - y_i)\log(1 - \hat{y}_i)\,\right]$$

where

- $w_0$ and $w_1$ are the class weights for the negative and positive samples, respectively.
- $y_i \in \{0, 1\}$ are the true labels.
- $\hat{y}_i$ is the predicted probability for sample $i$.
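As a sketch, the weighted loss above can be written directly in NumPy; the default weight values shown here are illustrative, not prescribed by the lesson:

```python
import numpy as np

def weighted_bce(y_true, y_pred, w0=0.5, w1=50.0, eps=1e-7):
    """Binary cross-entropy with per-class weights.

    w1 multiplies the positive-label term and w0 the negative-label
    term. The default weights are illustrative placeholders.
    """
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(
        w1 * y_true * np.log(y_pred)
        + w0 * (1 - y_true) * np.log(1 - y_pred)
    )
```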
Class-weighting approach
The class-weighting approach works as follows:

- The model estimation objective is to minimize the loss. In a perfect case, if the model could predict all the labels perfectly, that is, $\hat{y}_i = 1$ when $y_i = 1$ and $\hat{y}_i = 0$ when $y_i = 0$, the loss would be zero. Therefore, the best model estimate is the one with a loss closest to zero.
- With the class weights $w_1 > w_0$, if the model misclassifies the positive samples, that is, $\hat{y}_i \to 0$ when $y_i = 1$, the loss moves farther away from zero than if the negative samples are misclassified. In other words, model training penalizes the misclassification of positives more than that of negatives (see the numeric sketch after this list).
- Therefore, the model estimation strives to classify the minority positive samples correctly.
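The following standalone sketch, using illustrative weights $w_1 = 50$ and $w_0 = 0.5$, shows the asymmetric penalty: a missed positive costs a hundred times more than a missed negative.

```python
import numpy as np

# Hypothetical weights with w1 > w0; the values are illustrative only.
w0, w1 = 0.5, 50.0
eps = 1e-7

def loss_one(y, p):
    """Weighted binary cross-entropy for a single sample."""
    p = min(max(p, eps), 1 - eps)
    return -(w1 * y * np.log(p) + w0 * (1 - y) * np.log(1 - p))

print(loss_one(1.0, 0.01))  # missed positive: 50 * 4.61 ~ 230.3
print(loss_one(0.0, 0.99))  # missed negative: 0.5 * 4.61 ~ 2.3
```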
In principle, any arbitrary weights such that $w_1 > w_0$ can be used. But a rule of thumb is:

- $w_1 = \dfrac{N_0}{N}$: the fraction of negatively labeled samples.
- $w_0 = \dfrac{N_1}{N}$: the fraction of positively labeled samples.

Here, $N_0$ and $N_1$ denote the number of negative and positive samples, respectively, and $N = N_0 + N_1$. Because positives are rare, $N_0 \approx N$, so the positive class receives a weight close to one while the negative class receives a weight close to zero, preserving $w_1 > w_0$.
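Assuming a Keras model, these weights can be passed through the `class_weight` argument of `model.fit`; the dataset and architecture below are hypothetical placeholders:

```python
import numpy as np
from tensorflow import keras

# Hypothetical data: 1,000 samples, 20 features, 1% positively labeled.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 20)).astype("float32")
y = np.zeros(1000, dtype="float32")
y[:10] = 1.0

# Rule-of-thumb weights: each class is weighted by the
# fraction of samples in the *other* class.
n = len(y)
n_pos = int(y.sum())
n_neg = n - n_pos
class_weight = {0: n_pos / n, 1: n_neg / n}  # w0 = 0.01, w1 = 0.99

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Keras multiplies each sample's loss by the weight of its class,
# which implements the weighted loss defined above.
model.fit(X, y, epochs=2, batch_size=64, class_weight=class_weight)
```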