Exercise: Equal-Interval Chart
Explore how to create equal-interval bins for predicted probabilities to analyze default rates in a test set. Understand how to calculate and visualize default rates with error bars, interpreting sample size effects across bins.
Using intervals of predicted probability
In this exercise, you'll make a plot similar to the "Default rate according to model prediction decile" plot from the last lesson. However, instead of splitting the test set into equal-population deciles of predicted probability, you'll use equal intervals of predicted probability. Specifying the intervals could be helpful if a business partner wants to think about potential model-based strategies using certain score ranges. You can use pandas cut to create equal-interval binnings, or custom binnings using an array of bin edges, similar to how you used qcut to create quantile labels.
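To make the difference between the two binning approaches concrete, here is a minimal sketch on synthetic data. The array `example_pred_proba`, the random seed, and the custom edges are assumptions made up for illustration; they are not the exercise's `test_set_pred_proba`, and the counts you see on the real test set will differ.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for predicted probabilities (hypothetical data, not the lesson's test set)
rng = np.random.default_rng(seed=0)
example_pred_proba = rng.beta(a=2, b=5, size=1000)

# Equal-population bins: qcut places about the same number of samples in each bin
quantile_labels, quantile_edges = pd.qcut(example_pred_proba, q=5, retbins=True)

# Equal-interval bins: cut splits the observed range into bins of the same width
# (pandas slightly extends the range so the extreme values are included)
interval_labels, interval_edges = pd.cut(example_pred_proba, bins=5, retbins=True)

# Custom bins: pass an array of bin edges instead of an integer
custom_edges = np.array([0.0, 0.25, 0.5, 0.75, 1.0])  # hypothetical score ranges
custom_labels = pd.cut(example_pred_proba, bins=custom_edges, include_lowest=True)

# Count the samples that fall in each bin, keeping the bins in order
quantile_counts = pd.Series(quantile_labels).value_counts(sort=False)
interval_counts = pd.Series(interval_labels).value_counts(sort=False)
custom_counts = pd.Series(custom_labels).value_counts(sort=False)

print("Samples per quantile bin:\n", quantile_counts)
print("Samples per equal-interval bin:\n", interval_counts)
print("Samples per custom bin:\n", custom_counts)
```

With `qcut` the bin counts come out nearly identical while the bin widths vary; with `cut` the widths are fixed and the counts vary, so some bins may hold few samples.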
Perform the following steps to complete the exercise:
- Create the series of equal-interval labels, for 5 bins, using the following code:

      equal_intervals, equal_interval_bin_edges = \
          pd.cut(x=test_set_pred_proba,\
                 bins=5,\
                 retbins=True)

  Notice that this is similar to the call to qcut, except here with cut we can say how many equal-interval bins we want by supplying an integer to the bins...