Exercise: SHAP Visualization and Analysis
Learn to plot SHAP interactions and feature importances, and to reconstruct predicted probabilities from SHAP values.
Exploring feature interactions with SHAP values
In this exercise, you'll become more familiar with using SHAP values to provide visibility into the workings of a model. First, we'll take an alternative look at the interaction between Features 3 and 5. Then we'll use SHAP values to calculate feature importances, similar to what we did with a random forest model in the chapter "Decision Trees and Random Forests." Finally, we'll see how model outputs can be recovered from SHAP values, taking advantage of their additive property.
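If you're working through this exercise on its own, a minimal, hypothetical setup along these lines would produce the shap_values Explanation object used below. The synthetic dataset, model type, and parameter choices here are illustrative stand-ins, not the exact ones from this section:

import pandas as pd
import shap
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data with generically named features
X, y = make_classification(n_samples=1000, n_features=6, n_informative=4,
                           random_state=42)
X = pd.DataFrame(X, columns=[f'Feature {i}' for i in range(6)])
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit a gradient boosted tree classifier
model = xgb.XGBClassifier(n_estimators=100, max_depth=3)
model.fit(X_train, y_train)

# Compute SHAP values for the validation set; for an XGBoost model,
# shap.Explainer dispatches to the tree explainer and returns SHAP
# values in log-odds units
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_valid)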
Given the preliminary steps already accomplished in this section, we can take another look at the interaction between Features 3 and 5, the two most important features of the synthetic dataset. Plot the SHAP values of Feature 5, colored by the values of Feature 3:
shap.plots.scatter(shap_values[:, 'Feature 5'], color=shap_values[:, 'Feature 3'])
The resulting plot should look like this:
Here we are seeing the SHAP values of Feature 5. From the scatter plot, we can see that, in general, the SHAP values tend to increase as the values of Feature 5 increase. However, there are certainly counterexamples to that trend, as well as an interesting interaction with Feature 3: for a given value of Feature 5, which can be thought of as a vertical slice of the plot, the dots become more red from bottom to top when Feature 5 is negative, and less red from bottom to top when it is positive. In other words, for a fixed value of Feature 5, its SHAP value depends on the value of Feature 3. This is a further illustration of the interaction between Features 3 and 5.
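Next, SHAP values can be used to compute global feature importances, analogous to the random forest feature importances from the chapter "Decision Trees and Random Forests": a natural measure is the mean absolute SHAP value of each feature across all samples. A sketch, continuing from the shap_values object above:

import numpy as np
import pandas as pd
import shap

# Bar chart of the mean absolute SHAP value for each feature
shap.plots.bar(shap_values)

# The same quantity computed by hand, as a sorted Series
mean_abs_shap = np.abs(shap_values.values).mean(axis=0)
feat_imp = pd.Series(mean_abs_shap, index=shap_values.feature_names)
print(feat_imp.sort_values(ascending=False))

Finally, the additive property means that, for each sample, the SHAP values plus the base value sum to the model's raw output. Assuming SHAP values in log-odds units (as the tree explainer returns for an XGBoost classifier), applying the logistic sigmoid to that sum should reproduce the predicted probabilities:

from scipy.special import expit  # logistic sigmoid

# Reconstruct the raw (log-odds) model output from the SHAP values
raw_output = shap_values.values.sum(axis=1) + shap_values.base_values

# Map log-odds to probabilities and compare with the model's own predictions
reconstructed_proba = expit(raw_output)
model_proba = model.predict_proba(X_valid)[:, 1]
print(np.allclose(reconstructed_proba, model_proba))  # expect True

If the comparison fails, check which output space your explainer is using; some explainers return SHAP values directly in probability units, in which case the sigmoid step should be skipped.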