Diagnosing Data Bias
Learn how to diagnose bias in a dataset and explore useful fairness metrics for measuring it.
Continuing our discussion of data biases, we arrive at proxy bias. Proxy bias occurs when a variable, or a combination of variables, correlates strongly with a protected attribute (e.g., race or sex), so that even if the protected attribute itself is removed, the other variables recreate it and the bias persists.
This is particularly relevant in sectors like finance: in the United States, financial institutions are legally required to avoid using protected attributes as direct inputs to a model, even though removing those attributes is not, by itself, enough to prevent bias. Let's consider a biased lending algorithm that predicts whether an applicant will be able to repay a loan based on sex and race.
Note that, for obvious reasons, this is a discriminatory algorithm. If the applicant is female, a completely different set of standards is required for approval. If the applicant is a Black female, they're outright denied. Imagine, however, that this algorithm actually existed in the real world. Because it's a financial algorithm, any institution that wants to deploy it must not use protected attributes as inputs.
Let's consider a situation in which the dataset this decision tree was constructed from also contains two additional variables: address and consistency of employment (i.e., a measure of gaps in an individual's work history). With these variables in place of race and sex, the decision tree might look something like the following.
The similarities between this tree and the previous one are striking. This is because the removed protected attributes have simply been replaced by other variables that are highly correlated with race and sex. Consistency of employment, all else being equal, tends to be lower for women between the ages of 18 and 45 who take time away from work to raise a family. Address, specifically which county an applicant lives in, can correlate strongly with race because some counties have much higher concentrations of particular subgroups. While using these variables theoretically introduces some uncertainty (e.g., not everyone with lower employment consistency is female), they also obscure the fact that the algorithm is biased and often allow the model to pass through validation steps unaltered. This decision tree and the previous one are essentially the same.
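One practical way to surface proxy bias is to test how well the non-protected features can predict the protected attribute itself. The sketch below illustrates this check under some assumptions: a hypothetical pandas DataFrame (e.g., `loans_df`) with columns such as `address_county` and `employment_consistency` alongside a protected column such as `race`. The `proxy_check` helper and the column names are illustrative, not part of the original example.

```python
# Minimal proxy-bias check: if the remaining features can predict the
# protected attribute well, they likely encode it as a proxy.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def proxy_check(df: pd.DataFrame, protected_col: str) -> float:
    """Return cross-validated accuracy of predicting the protected attribute
    from all other columns. Accuracy far above the majority-class baseline
    suggests proxy variables are present."""
    X = pd.get_dummies(df.drop(columns=[protected_col]))  # one-hot encode categoricals
    y = df[protected_col]
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

# Example usage (hypothetical data):
# score = proxy_check(loans_df, protected_col="race")
# baseline = loans_df["race"].value_counts(normalize=True).max()
# print(f"proxy score: {score:.2f} vs. majority baseline: {baseline:.2f}")
```

If the cross-validated accuracy sits well above the majority-class baseline, the remaining features are effectively reconstructing the protected attribute and should be scrutinized as proxies.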
Data fairness metrics
How do we diagnose whether our data has representation issues? There are two simple metrics that help answer this question: representation rate and statistical rate.
Representation rate
Representation rate is an outcome-agnostic measure, which means it considers only the base distribution of the data without looking at the outcome. For a protected attribute $Z$, the representation rate $\tau$ is the ratio of the probability of belonging to the least-represented protected group to the probability of belonging to the most-represented one:

$$\tau = \min_{z_i,\, z_j} \frac{P(Z = z_i)}{P(Z = z_j)}$$

A representation rate of 1 means every protected group appears equally often in the data; values close to 0 mean at least one group is severely underrepresented.
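As a quick illustration, the representation rate can be computed in a few lines. The snippet below is a minimal sketch assuming the group labels are stored in a pandas Series; the `representation_rate` helper is hypothetical.

```python
import pandas as pd

def representation_rate(protected: pd.Series) -> float:
    """Ratio of the least-represented group's frequency to the most-represented
    group's frequency; 1.0 means perfectly balanced."""
    probs = protected.value_counts(normalize=True)
    return probs.min() / probs.max()

# Example: 900 male and 100 female applicants -> 0.1 / 0.9 ≈ 0.11
sex = pd.Series(["male"] * 900 + ["female"] * 100)
print(representation_rate(sex))  # ~0.11
```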
Statistical rate
Contrary to representation rate, statistical rate is not outcome-agnostic. Rather, it looks at distributions with respect to each outcome. With a binary target $Y$, the statistical rate ($\tau$) is the outcome-dependent ratio of the probability of finding an outcome $y$ in one protected group to the probability of finding the same outcome in another:

$$\tau = \min_{y,\, z_i,\, z_j} \frac{P(Y = y \mid Z = z_i)}{P(Y = y \mid Z = z_j)}$$

If the outcome is distributed identically across protected groups, $\tau$ equals 1; the more the outcome depends on group membership, the closer $\tau$ is to 0.
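The statistical rate can be computed in a similar way by conditioning on each outcome value. The following sketch assumes binary labels and group labels stored in pandas Series; the `statistical_rate` helper is hypothetical.

```python
import pandas as pd

def statistical_rate(outcome: pd.Series, protected: pd.Series) -> float:
    """min over outcomes y and group pairs of P(Y = y | Z = z_i) / P(Y = y | Z = z_j)."""
    rates = []
    for y in outcome.unique():
        # P(Y = y | Z = z) for every protected group z
        p_y_given_z = outcome.eq(y).groupby(protected).mean()
        rates.append(p_y_given_z.min() / p_y_given_z.max())
    return min(rates)

# Example: group "a" is approved 80% of the time, group "b" only 40% of the time
approved = pd.Series([1] * 80 + [0] * 20 + [1] * 40 + [0] * 60)
group = pd.Series(["a"] * 100 + ["b"] * 100)
print(statistical_rate(approved, group))  # ~0.33 (denial rates 0.2 vs. 0.6 are the most imbalanced)
```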
The smaller these ratios are, the more imbalanced and less representative the data is. With low representation rates, models tend to work better for end users who belong to the majority group and worse for everyone else. With low statistical rates, models tend to use the protected class as the most important indicator of the target variable.
Spurious correlations
Spurious correlations constitute some of the most interesting examples of data bias because they often involve image data and are notoriously difficult to identify in black-box algorithms. Spurious correlations are similar to lurking variables in statistics: they occur when an unintended pattern in the data correlates strongly with the target variable, so the black-box model learns the unintended pattern rather than the relevant features. The most common example of this is the "tank example" (which is something of an urban legend because no one quite knows whether it actually happened).
Supposedly, in the 1980s, the Pentagon wanted to train a neural network classifier that took images of tanks as inputs and determined whether the tank in each image was American or Russian. To everyone's elation, the classifier was 100% accurate: a perfect model. It wasn't until much later that analysts realized why the model was so accurate; the images of the American tanks had been taken on a sunny day, while the images of the Russian tanks had been taken on a cloudy day. The neural network didn't care about the tanks at all; it was just detecting background brightness. All of the effort that went into the neural network had only created a sunny-day detector.
When the analysts tested the algorithm on a separate set of images, the accuracy was nowhere near 100%. Because neural networks are black-box models, there's no way to identify whether spurious correlations have been learned unless models are rigorously tested on an extremely large and diverse dataset (which often doesn't exist in practice because of low data availability!).