...

/

Exercise: Obtaining Probabilities from Logistic Regression Model

Exercise: Obtaining Probabilities from Logistic Regression Model

Learn to obtain the probabilities of the trained logistic regression model.

Discovering predicted probabilities

How does logistic regression make predictions? Now that we’re familiar with accuracy, true and false positives and negatives, and the confusion matrix, we can explore new ways of using logistic regression to learn about more advanced binary classification metrics. So far, we’ve only considered logistic regression as a “black box” that can learn from labeled training data and then make binary predictions on new features. While we will learn about the workings of logistic regression in detail later in the course, we can begin to peek inside the black box now.

One thing to understand about how logistic regression works is that the raw predictions—in other words, the direct outputs from the mathematical equation that defines logistic regression—are not binary labels. They are actually probabilities on a scale from 0 to 1 (although, technically, the equation never allows the probabilities to be exactly equal to 0 or 1, as we’ll see later). These probabilities are only transformed into binary predictions through the use of a threshold. The threshold is the probability above which a prediction is declared to be positive, and below which it is negative. The threshold in scikit-learn is 0.5. This means any sample with a predicted probability of at least 0.5 is identified as positive, and any with a predicted probability < 0.5 is decided to be negative. However, we are free to use any threshold we want. In fact, choosing the threshold is one of the key flexibilities of logistic regression, as well as other machine learning classification algorithms that estimate probabilities of class membership.

Predicted probabilities from logistic regression

In the following exercise, we will get familiar with the predicted probabilities of logistic regression and how to obtain them from a scikit-learn model.

We can begin to discover predicted probabilities by further examining the methods available to us on the logistic regression model object that we trained earlier in this section. Recall that before, once we trained the model, we could then make binary ...