Explainable AI
Learn about Explainable AI, its mathematical framework, and its taxonomy.
Need for XAI
We need to be able to identify and resolve biases in our models and to deploy AI with confidence. Let's go over what this means and how XAI helps us do it.
Identifying and resolving bias
We often say that a model is only as good as its training data. But how do we verify this? XAI offers an effective framework for detecting and resolving the bias and drift present in a model. A bias is a phenomenon that skews the result of an algorithm in favor of or against a particular population. Model drift, also known as model decay, is the loss of an ML model’s predictive power as a result of changes in the environment.
With explanations, data scientists can generate insights into the behavior and internal workings of the model they have trained. This helps data scientists debug and mitigate issues with model performance in the development phase.
For example, as shown in the figure below, the predictions of a deep learning model that generates credit scores can be tested for hidden bias with respect to variables such as gender or age. By generating explanations for the model’s predictions, we can estimate the weight the model gives to gender when computing a person’s credit score. A high weight means that the model relies too heavily on the person’s gender. In that case, the data must be reweighed: the weights of samples from underrepresented genders are adjusted to balance the dataset before it is fed into the machine learning model.
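As an illustration of the reweighing idea, the sketch below computes per-sample weights that make group membership statistically independent of the label, following the standard P(group) x P(label) / P(group, label) scheme. It is a minimal sketch rather than a full pipeline; the column names (gender, approved) and the toy data are assumptions made for the example.

```python
import pandas as pd

def reweighing_weights(df, group_col="gender", label_col="approved"):
    """Compute per-sample weights that balance group/label combinations.

    Each sample is weighted by P(group) * P(label) / P(group, label), so that
    group membership becomes statistically independent of the label in the
    weighted dataset.
    """
    n = len(df)
    p_group = df[group_col].value_counts(normalize=True)
    p_label = df[label_col].value_counts(normalize=True)
    p_joint = df.groupby([group_col, label_col]).size() / n

    def weight(row):
        g, y = row[group_col], row[label_col]
        return (p_group[g] * p_label[y]) / p_joint[(g, y)]

    return df.apply(weight, axis=1)

# Illustrative usage with a toy credit-scoring dataset (the columns are assumptions).
df = pd.DataFrame({
    "gender":   ["F", "F", "F", "M", "M", "M", "M", "M"],
    "approved": [0,    0,   1,   1,   1,   1,   0,   1],
})
df["sample_weight"] = reweighing_weights(df)
print(df)
# The weights can then be passed to training,
# e.g. model.fit(X, y, sample_weight=df["sample_weight"]).
```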
Another example of XAI is in the healthcare sector. A deep learning model that takes an X-ray image as input and predicts the presence of malignant tissue can be tested for false positives. As shown in the figure below, by analyzing the weight the model gives to each pixel in the X-ray image, experts can manually verify whether the high-weight pixel regions correspond to the presence of malignant tissue.
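One way to make this verification quantitative is to compare the model’s high-weight pixels against an expert-annotated mask of the malignant region, for example by keeping the top-scoring pixels and measuring how many of them fall inside the annotation. The sketch below assumes the saliency map and the annotation mask are already available as NumPy arrays; the function name and the toy data are illustrative.

```python
import numpy as np

def saliency_overlap(saliency, annotation_mask, top_fraction=0.05):
    """Fraction of the top-scoring saliency pixels that fall inside
    the expert-annotated region."""
    k = max(1, int(top_fraction * saliency.size))
    threshold = np.partition(saliency.ravel(), -k)[-k]  # k-th largest value
    high_weight = saliency >= threshold
    return (high_weight & annotation_mask.astype(bool)).sum() / high_weight.sum()

# Toy example: a 4x4 saliency map and a mask marking the annotated region.
saliency = np.array([[0.1, 0.2, 0.9, 0.8],
                     [0.1, 0.1, 0.7, 0.6],
                     [0.0, 0.1, 0.2, 0.1],
                     [0.0, 0.0, 0.1, 0.0]])
mask = np.zeros((4, 4), dtype=bool)
mask[:2, 2:] = True  # annotated tissue in the top-right corner
print(saliency_overlap(saliency, mask, top_fraction=0.25))  # -> 1.0
```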
Deploying AI with confidence
Deploying a biased model can pose a substantial risk because it can lead to unpredictable performance. Sometimes, the hidden biases of these models may even result in ethical issues (as in the examples above). So, we can only be confident about a model’s behavior by verifying the quality of the explanations it generates. Further, XAI lets us monitor our model’s predictions after deployment. Such continuous evaluation provides ongoing feedback and helps us keep the model’s performance optimized.
The framework of XAI
Let’s look at the general setup of XAI. Given a deep neural network f that maps an input x (for example, an image) to a prediction f(x), an explanation method produces, alongside the prediction, an explanation that attributes the prediction to the input, typically by assigning each input feature a score that reflects how much it contributed to f(x).
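As a concrete instance of this setup, the sketch below computes a simple gradient-based saliency map for an image classifier: the absolute gradient of the predicted class score with respect to each input pixel serves as that pixel’s importance. This is a minimal sketch assuming a PyTorch model; the toy CNN used here is a placeholder, not a model from the lesson.

```python
import torch
import torch.nn as nn

def gradient_saliency(model: nn.Module, image: torch.Tensor) -> torch.Tensor:
    """Return a saliency map: |d(predicted class score) / d(pixel)|.

    `image` has shape (1, C, H, W); the returned map has shape (H, W),
    with larger values marking pixels that influence the prediction more.
    """
    model.eval()
    image = image.clone().requires_grad_(True)

    scores = model(image)                     # shape (1, num_classes)
    top_class = scores.argmax(dim=1).item()   # the model's predicted class
    scores[0, top_class].backward()           # gradients w.r.t. the input pixels

    # Absolute gradient, collapsed over the channel dimension.
    return image.grad.abs().max(dim=1).values.squeeze(0)

# Illustrative usage with a toy CNN (the architecture is an assumption).
toy_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)
x = torch.rand(1, 3, 32, 32)
saliency = gradient_saliency(toy_model, x)
print(saliency.shape)  # torch.Size([32, 32])
```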
Taxonomy of XAI
Based on the nature of explanations and how they are generated, XAI methods can be put into the following categories:
A saliency map is an image where the brightness of a pixel represents how salient (or important) that pixel is for the prediction made by the neural network.
A class activation map (CAM) uses gradient information from the penultimate convolutional layer of the neural network to assign a saliency/importance score to each neuron in that layer.
Counterfactuals are images that are similar to the original image but for which the model’s prediction changes. Analyzing these counterfactuals helps us understand which image pixels/regions are important for the prediction.
Perturbation-based methods produce explanations by assigning image regions or pixels an importance score based on how their removal (via perturbation or occlusion) affects the model’s prediction; a short sketch of this idea follows the list.
Concept explanations describe a prediction in terms of human-specified, high-level concepts rather than features and low-level attributions. For example, a low-level explanation for an image assigns a saliency value to every pixel, whereas a concept explanation might refer to higher-level notions such as stripes or wheels.
Prototypical explanations produce prototype images that represent a particular class and are similar to the original image. Analyzing these prototypes helps us understand the basis for the model’s categorization and decision-making.
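As a minimal illustration of the perturbation-based approach mentioned above, the sketch below slides a gray patch across the image and records how much the predicted class score drops at each position; larger drops mark more important regions. The model interface assumed here is the same kind of PyTorch classifier as in the earlier sketch.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def occlusion_map(model: nn.Module, image: torch.Tensor,
                  patch: int = 8, stride: int = 4) -> torch.Tensor:
    """Score each occluded position by the drop in the predicted class score."""
    model.eval()
    _, _, h, w = image.shape
    scores = model(image)
    top_class = scores.argmax(dim=1).item()
    base_score = scores[0, top_class].item()

    heat = torch.zeros((h - patch) // stride + 1, (w - patch) // stride + 1)
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.clone()
            occluded[:, :, y:y + patch, x:x + patch] = 0.5   # gray patch
            heat[i, j] = base_score - model(occluded)[0, top_class].item()
    return heat

# Illustrative usage with the toy model from the previous sketch:
# heat = occlusion_map(toy_model, torch.rand(1, 3, 32, 32))
# Higher values in `heat` mark regions whose occlusion hurts the prediction most.
```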
The figure below summarizes these methods.