Logistic Regression Predictions Using Sigmoid

Learn how logistic regression coefficients are used to make predictions.

From logistic regression coefficients to predictions using the sigmoid

Before the next exercise, let’s take a look at how the coefficients for logistic regression are used to calculate predicted probabilities, and ultimately make predictions for the class of the response variable.

Recall that logistic regression predicts the probability of class membership, according to the sigmoid equation. In the case of two features with an intercept, the equation is as follows:

p = \frac{1}{1+e^{-(\theta_0 + \theta_1 X_1 + \theta_2 X_2)}}

When you call the fit method of a logistic regression model object in scikit-learn using the training data, the θ_0, θ_1, and θ_2 parameters (intercept and coefficients) are estimated from this labeled training data. Effectively, scikit-learn figures out how to choose values for θ_0, θ_1, and θ_2 so that the model classifies as many training data points correctly as possible. We’ll gain some insight into how this process works in the next chapter.
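The fitting step above can be sketched as follows. The dataset here is synthetic and purely for illustration; after fit, the estimated parameters are available as the model's `intercept_` and `coef_` attributes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic two-feature dataset (made up for illustration):
# the label depends on whether X1 + X2 is positive.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # columns are X1 and X2
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # binary response

model = LogisticRegression()
model.fit(X, y)                          # estimates θ_0, θ_1, θ_2

theta_0 = model.intercept_[0]            # θ_0 (intercept)
theta_1, theta_2 = model.coef_[0]        # θ_1, θ_2 (one per feature)
print(theta_0, theta_1, theta_2)
```

Because the labels here increase with both features, we'd expect both fitted coefficients to come out positive.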

When you call predict, scikit-learn calculates predicted probabilities according to the fitted parameter values and the sigmoid equation. A given sample will then be classified as positive if p ≥ 0.5, and negative otherwise.
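We can verify this by computing the sigmoid by hand and comparing it with scikit-learn's output. The fitted model and the sample below are illustrative; `predict_proba` returns the probability of each class, with the positive class in the second column.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same illustrative synthetic data as before.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model = LogisticRegression().fit(X, y)

x_new = np.array([[0.5, -1.0]])                    # a hypothetical new sample

# Manually apply θ_0 + θ_1*X1 + θ_2*X2, then the sigmoid.
z = model.intercept_[0] + x_new @ model.coef_[0]
p_manual = 1.0 / (1.0 + np.exp(-z))

# scikit-learn's positive-class probability and class prediction.
p_sklearn = model.predict_proba(x_new)[:, 1]
label = model.predict(x_new)[0]

print(p_manual, p_sklearn, label)  # the two probabilities agree
```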

We know that the plot of the sigmoid equation is the familiar S-shaped curve, which we can connect to the equation above by making the substitution X = θ_0 + θ_1X_1 + θ_2X_2:
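Under that substitution, the predicted probability is just the one-dimensional sigmoid of X. A quick numerical check of the curve's key features:

```python
import numpy as np

def sigmoid(x):
    """The sigmoid (logistic) function, 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

xs = np.linspace(-6, 6, 13)
ps = sigmoid(xs)

print(sigmoid(0.0))   # 0.5 — the decision boundary
print(sigmoid(6.0))   # close to 1
print(sigmoid(-6.0))  # close to 0
```

The curve rises monotonically from near 0 to near 1, crossing 0.5 exactly at X = 0, which is why the 0.5 probability threshold corresponds to the linear decision boundary θ_0 + θ_1X_1 + θ_2X_2 = 0.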
