Balanced accuracy is a statistical measure used to assess the performance of classification models on imbalanced datasets. It is the arithmetic mean of sensitivity (true positive rate) and specificity (true negative rate), so the minority and majority classes carry equal weight in the evaluation.
The balanced accuracy score ranges from 0 to 1. A score of 1 represents a perfect classifier, and 0 means every instance is misclassified. For a binary problem, 0.5 corresponds to random guessing or to always predicting the same class.
Sensitivity or true positive rate (TPR): It measures the proportion of actual positive instances correctly identified by the model.
Sensitivity = TP / (TP + FN)
Here, TP is the number of true positives, and FN is the number of false negatives.
Specificity or true negative rate (TNR): It represents the proportion of actual negative instances correctly identified by the model.
Specificity = TN / (TN + FP)
Here, TN is the number of true negatives, and FP is the number of false positives.
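The two definitions above, and their mean, can be sketched in a few lines of Python. The function names here are illustrative and do not come from any particular library; the sketch assumes a binary problem where the four confusion-matrix counts are already known.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no actual positives, sensitivity is undefined")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no actual negatives, specificity is undefined")
    return tn / (tn + fp)


def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Arithmetic mean of sensitivity and specificity."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2
```

Raising an error when a class has no instances is one possible design choice; a caller could equally decide to return a default value in that case.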
Suppose a medical researcher uses a machine learning model to predict whether or not 400 patients with certain symptoms have a specific rare disease.
The following confusion matrix summarizes the predictions made by the model:
First, calculate the sensitivity and specificity from the confusion matrix:
Then calculate the balanced accuracy as the mean of the two:
Balanced accuracy = (Sensitivity + Specificity) / 2
This value indicates that the model handles the imbalanced dataset reasonably well: the closer the balanced accuracy is to 1, the better the model classifies observations.
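To make the two calculation steps concrete, here is a small Python sketch. The counts in it are invented purely for illustration and are not the values from the confusion matrix in this example; they only assume 400 patients, of whom a small minority have the disease.

```python
# Hypothetical counts for 400 patients (invented for illustration only).
tp, fn = 15, 5      # 20 patients actually have the disease
tn, fp = 350, 30    # 380 patients do not have the disease

# Step 1: sensitivity and specificity.
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)

# Step 2: balanced accuracy is the mean of the two.
balanced_accuracy = (sensitivity + specificity) / 2

# Plain accuracy, for comparison.
accuracy = (tp + tn) / (tp + fn + tn + fp)

print(f"sensitivity       = {sensitivity:.2f}")
print(f"specificity       = {specificity:.2f}")
print(f"balanced accuracy = {balanced_accuracy:.2f}")
print(f"accuracy          = {accuracy:.2f}")
```

With these assumed counts, plain accuracy comes out higher than balanced accuracy because the large negative class dominates it, while balanced accuracy is pulled down by the weaker sensitivity.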
Robustness to imbalanced data: By accounting for both minority and majority classes, balanced accuracy provides a fair representation of a model's performance, making it more reliable when dealing with imbalanced datasets.
Better decision-making: In critical applications such as disease diagnosis, where false negatives can be disastrous, balanced accuracy exposes poor sensitivity that overall accuracy can hide, which supports better-informed decisions about model deployment.
Comparative analysis: Balanced accuracy enables fair comparisons between different models trained on imbalanced data, facilitating the selection of the most suitable model for a specific task.
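The robustness point can be illustrated with a short Python sketch. It uses made-up labels (5 positives, 95 negatives) and a trivial model that always predicts the negative class; the helper name `balanced_accuracy_from_labels` is hypothetical, not a library function.

```python
def balanced_accuracy_from_labels(y_true, y_pred, positive=1):
    """Balanced accuracy for a binary problem, computed from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


# Made-up imbalanced data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# A trivial model that always predicts the majority (negative) class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy_from_labels(y_true, y_pred)

print(accuracy)   # 0.95
print(balanced)   # 0.5
```

Plain accuracy rewards the model for ignoring the minority class, while balanced accuracy stays at the chance level of 0.5 because sensitivity is 0.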
In conclusion, balanced accuracy is a metric for evaluating classification models on imbalanced datasets, providing a fair assessment by considering both sensitivity and specificity.