Intro to Model Explainability
Learn about explainability methods for understanding model decisions.
As businesses across sectors implement ML and AI, the need for transparent decision-making becomes increasingly important. The problem with black-box models (neural networks, large language models, etc.) is that their decision process is largely opaque and difficult to audit. Model explainability has evolved as a subfield to combat this problem.
Explainability vs. interpretability
Simply put, explainability attempts to provide some clarity into how an ML algorithm makes its decisions, while interpretability is the ability to understand why an ML algorithm made a particular decision. The difference is subtle but has significant consequences, notably that explainability is just one piece of interpretability. This is best illustrated with an example.
Consider again a lending algorithm that attempts to classify applicants as either able or unable to repay a loan. Let’s assume we run two different models: a logistic regression and a random forest.
Logistic regressions
With a logistic regression model, it’s possible to retrieve the exact parameters that went into making the prediction. Because logistic regression is essentially a linear model with a set formula, anyone can inspect its properties, such as which features were the most relevant and, with somewhat more effort, the effect of changes to individual variables on the output.
For example, a logistic regression model can show that age and credit score are the most important features by assigning them the largest coefficients (assuming the features are on comparable scales). Because the output is essentially just an equation, we can even work out by hand the effect of a credit score of 600 vs. 650, holding age constant.
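As a rough sketch of what this looks like in practice, here is a scikit-learn example on entirely synthetic data; the feature names (age, credit_score, income) and the relationship baked into the labels are illustrative assumptions, not results from a real lending model:

```python
# Minimal sketch: fit a logistic regression, read its coefficients, and
# compare two applicants who differ only in credit score. Synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 1_000
age = rng.integers(21, 70, n)
credit_score = rng.integers(400, 850, n)
income = rng.normal(60_000, 15_000, n)
# Synthetic label: repayment is more likely with higher credit score and age.
logit = 0.02 * (credit_score - 650) + 0.03 * (age - 40) + rng.normal(0, 1, n)
repaid = (logit > 0).astype(int)

X = np.column_stack([age, credit_score, income])
feature_names = ["age", "credit_score", "income"]

# Standardize so coefficient magnitudes are comparable across features.
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X, repaid)

coefs = model.named_steps["logisticregression"].coef_[0]
for name, coef in sorted(zip(feature_names, coefs), key=lambda t: -abs(t[1])):
    print(f"{name:>12}: {coef:+.3f}")

# Effect of a credit score of 600 vs. 650, holding age (and income) constant.
p600 = model.predict_proba([[35, 600, 60_000]])[0, 1]
p650 = model.predict_proba([[35, 650, 60_000]])[0, 1]
print(f"P(repay | score=600) = {p600:.3f}, P(repay | score=650) = {p650:.3f}")
```

Because the fitted model is just a weighted sum passed through a sigmoid, the printed coefficients and the two probabilities above can be reproduced by hand from the model's parameters.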
Random forests
Random forests are much trickier because they are black-box ensemble models. Their constituents, decision trees, are as interpretable and explainable as logistic regressions: there is an explicit set of rules describing how each decision is made, and small changes to the input can be traced down the tree to their outputs. A random forest, however, takes a majority vote over hundreds of these trees.
With these models, it’s very difficult to get any sense of how small changes affect the output. Each decision tree within the random forest is constructed from a random subset of the features on a random subset of the training data, making it very difficult to reconstruct any kind of formula or rule. Furthermore, there’s no easily extractable equation because of the sheer complexity and non-linearity of the model.
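To make the contrast concrete, here is a brief sketch (scikit-learn on a synthetic dataset; the feature names are placeholders) showing that a single decision tree can be printed as explicit rules, while a forest is simply a large collection of such trees with no single rule set behind its vote:

```python
# Contrast a single decision tree with a random forest. Synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
feature_names = [f"feature_{i}" for i in range(4)]

# A single shallow tree can be dumped as an explicit set of if/else rules.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=feature_names))

# A random forest aggregates hundreds of such trees, each trained on a
# random subset of rows and features; there is no single rule set to print.
forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
print(f"The forest votes across {len(forest.estimators_)} separate trees.")
```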
There are, however, ways to understand pieces of the model. Feature importance, for example, is one way of identifying how the random forest is making its decisions.
One way to compute feature importance is to rerun the same model, but without one variable. Each variable is removed from the model in turn, and the resulting performances are plotted. The variable whose removal yields the lowest performance is considered the most important.
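Here is a rough sketch of that drop-one-variable procedure, again using scikit-learn on synthetic data with placeholder column names:

```python
# Drop-one-variable importance: refit the same model with each feature
# removed in turn and record how much cross-validated accuracy drops.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=5, n_informative=3,
                           random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

model = RandomForestClassifier(n_estimators=200, random_state=0)
baseline = cross_val_score(model, X, y, cv=5).mean()

drops = {}
for i, name in enumerate(feature_names):
    X_without = np.delete(X, i, axis=1)  # remove one column
    score = cross_val_score(model, X_without, y, cv=5).mean()
    drops[name] = baseline - score  # larger drop => more important feature

for name, drop in sorted(drops.items(), key=lambda t: -t[1]):
    print(f"{name}: accuracy drop {drop:+.3f}")
```

Note that scikit-learn's tree ensembles also expose a built-in, impurity-based `feature_importances_` attribute; the retraining approach above is slower but follows the description in this section directly.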