Why Is Logistic Regression Considered a Linear Model?

Learn why logistic regression is considered a linear model.

Logistic regression as a linear model

We previously mentioned that logistic regression is considered a linear model while we were exploring whether the relationship between the features and the response resembled a linear relationship. Recall that we plotted the groupby/mean of the EDUCATION feature in the “Data Exploration” chapter, as well as of the PAY_1 feature in this chapter, to see whether the default rates across values of these features exhibited a linear trend. While this is a good way to get a quick approximation of how linear the relationship between a feature and the response may be, here we formalize the notion of why logistic regression is a linear model.
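As a reminder of what that check looks like in code, here is a minimal sketch, assuming a pandas DataFrame with a feature column and a binary response column; the data and the column name `default` below are hypothetical stand-ins, not the case study's exact data:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the case study data: one feature and a binary response
df = pd.DataFrame({
    'PAY_1': [0, 0, 1, 1, 2, 2, 3, 3],
    'default': [0, 0, 0, 1, 0, 1, 1, 1],
})

# The mean of a binary response within each feature value is the default rate
default_rates = df.groupby('PAY_1')['default'].mean()

# Plot default rate against feature value to eyeball whether the trend is linear
default_rates.plot(marker='o')
plt.xlabel('PAY_1')
plt.ylabel('Default rate')
plt.show()
```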

What is a linear model?

A model is considered linear if the transformation of features that is used to calculate the prediction is a linear combination of the features. The possibilities for a linear combination are that each feature can be multiplied by a numerical constant, these terms can be added together, and an additional constant can be added. For example, in a simple model with two features, $X_1$ and $X_2$, a linear combination would take the following form:

$\text{Linear combination of } X_1 \text{ and } X_2 = \theta_0 + \theta_1 X_1 + \theta_2 X_2$

The constants $\theta_i$ can be any number, positive, negative, or zero, for $i$ = 0, 1, and 2 (although if a coefficient is 0, this removes a feature from the linear combination). A familiar example of a linear transformation of one variable is a straight line with the equation $y = mx + b$. In this case, $\theta_0 = b$ and $\theta_1 = m$. $\theta_0$ is called the intercept of a linear combination, which should be familiar from algebra.
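To make this concrete, here is the two-feature linear combination computed directly in Python, with made-up values for the coefficients and features:

```python
# Coefficients for a hypothetical two-feature model
theta_0, theta_1, theta_2 = 0.5, -1.2, 3.0

# Made-up feature values
X_1, X_2 = 2.0, 0.7

# The linear combination: each feature times a constant, summed, plus an intercept
linear_combination = theta_0 + theta_1 * X_1 + theta_2 * X_2
print(linear_combination)  # 0.5 - 2.4 + 2.1, approximately 0.2
```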

What kinds of things are “not allowed” in linear transformations? Any other mathematical expressions besides what was just described, such as the following:

  • Multiplying a feature by itself; for example, $X_1^2$ or $X_1^3$. These are called polynomial terms.

  • Multiplying features together; for example, $X_1 X_2$. These are called interactions.

  • Applying non-linear transformations to features; for example, log and square root.

  • Other complex mathematical functions.

  • “If then” types of statements. For example, “if $X_1 > a$, then $y = b$.”

However, while these transformations are not part of the basic formulation of a linear combination, they could be added to a linear model by engineering features, for example, defining a new feature, $X_3 = X_1^2$.
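As a sketch of this kind of feature engineering, scikit-learn's PolynomialFeatures (one possible tool for it, not the only one) generates polynomial and interaction terms from existing features; the model that consumes the engineered features is still linear in them:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Two features for three hypothetical samples
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# degree=2 adds the polynomial terms X_1^2 and X_2^2 and the interaction X_1*X_2
poly = PolynomialFeatures(degree=2, include_bias=False)
X_engineered = poly.fit_transform(X)

print(poly.get_feature_names_out())
# ['x0' 'x1' 'x0^2' 'x0 x1' 'x1^2']
```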

Sigmoid and logit functions

Earlier, we learned that the predictions of logistic regression, which take the form of probabilities, are made using the sigmoid function. Taking another look here, we see that this function is clearly non-linear:

$\text{sigmoid}(X) = \frac{1}{1+e^{-X}}$
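A short sketch that computes the sigmoid with NumPy and plots it, which makes the non-linear S-shape easy to see:

```python
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(X):
    """Map any real number to the interval (0, 1)."""
    return 1 / (1 + np.exp(-X))

# Evaluate over a range of inputs and plot the characteristic S-shaped curve
X = np.linspace(-7, 7, 200)
plt.plot(X, sigmoid(X))
plt.xlabel('X')
plt.ylabel('sigmoid(X)')
plt.show()
```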

Why, then, is logistic regression considered a linear model? It turns out that the answer to this question lies in a different formulation of the sigmoid equation, called the logit function. We can derive the logit function by solving the sigmoid function for $X$; in other words, finding the inverse of the sigmoid function. First, we set the sigmoid equal to $p$, which we interpret as the probability of observing the positive class, then solve for $X$ as shown in the following:

$p = \frac{1}{1+e^{-X}}$

$1+e^{-X} = \frac{1}{p}$

$e^{-X} = \frac{1}{p} - 1$

$e^{-X} = \frac{1-p}{p}$

$e^{X} = \frac{p}{1-p}$

$X = \log\left(\frac{p}{1-p}\right)$

Here, we’ve used some laws of exponents and logs to solve for $X$. You may also see logit expressed as follows:

$X = \log\left(\frac{p}{q}\right)$

In this expression, the probability of failure, $q$, is expressed in terms of the probability of success, $p$: $q = 1 - p$, because probabilities sum to 1. Even though in our case, credit default would probably be considered a failure in the sense of real-world outcomes, the positive outcome (response variable = 1 in a binary problem) is conventionally considered “success” in mathematical terminology. The logit function is also called the log odds, because it is the natural logarithm of the odds ratio, $p/q$. Odds ratios may be familiar from the world of gambling, via phrases such as “the odds are 2 to 1 that team $a$ will defeat team $b$.”
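As a quick numerical check of this derivation, SciPy provides the sigmoid as expit and its inverse as logit; applying one after the other should recover the original inputs:

```python
import numpy as np
from scipy.special import expit, logit  # expit is the sigmoid; logit, its inverse

# Map some values of X to probabilities with the sigmoid, then back with log odds
X = np.array([-3.0, -0.5, 0.0, 1.0, 4.0])
p = expit(X)             # p = 1 / (1 + e^(-X))
X_recovered = logit(p)   # log(p / (1 - p))

print(np.allclose(X, X_recovered))  # True, up to floating-point error
```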

Logistic regression is a linear model

In general, what we’ve called capital $X$ in these manipulations can stand for a linear combination of all the features. For example, this would be $X = \theta_0 + \theta_1 X_1 + \theta_2 X_2$ in our simple case of two features. Logistic regression is considered a linear model because the features included in $X$ are, in fact, only subject to a linear combination when the response variable is considered to be the log odds. This is an alternative way of formulating the problem, as compared to the sigmoid equation. Putting the pieces together, the features $X_1, X_2, \ldots, X_j$ look like this in the sigmoid equation version of logistic regression:

$p = \frac{1}{1 + e^{-(\theta_0 + \theta_1 X_1 + \theta_2 X_2 + \dots + \theta_j X_j)}}$

But they look like this in the log odds version, which is why logistic regression is called a linear model:

$\theta_0 + \theta_1 X_1 + \theta_2 X_2 + \dots + \theta_j X_j = \log\left(\frac{p}{q}\right)$
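To see that the two formulations agree in practice, here is a sketch on synthetic data (not the case study data): passing the fitted linear combination through the sigmoid reproduces scikit-learn's predicted probabilities:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Fit a logistic regression on synthetic two-feature data
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=1)
model = LogisticRegression().fit(X, y)

# The linear combination theta_0 + theta_1*X_1 + theta_2*X_2 is the log odds...
log_odds = model.intercept_ + X @ model.coef_.ravel()

# ...and the sigmoid of the log odds reproduces the predicted probabilities
p_manual = 1 / (1 + np.exp(-log_odds))
p_sklearn = model.predict_proba(X)[:, 1]
print(np.allclose(p_manual, p_sklearn))  # True
```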

Because of this way of looking at logistic regression, ideally, the features of a logistic regression model would be linear in the log odds of the response variable. We will see what is meant by this in the next lesson.

Logistic regression is part of a broader class of statistical models called Generalized Linear Models (GLMs). GLMs are connected to the fundamental concept of ordinary linear regression, which may have one feature (that is, the line of best fit, $y = mx + b$, for a single feature, $x$) or more than one in multiple linear regression. The mathematical connection between GLMs and linear regression is the link function. The link function of logistic regression is the logit function we just learned about.
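As an illustration of this connection (using the statsmodels library, which the lesson itself does not require), logistic regression can be fit explicitly as a GLM by choosing the Binomial family, whose default link is the logit:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic binary data whose true log odds are a linear combination of features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
log_odds = 0.5 + 1.0 * X[:, 0] - 2.0 * X[:, 1]
y = rng.binomial(1, 1 / (1 + np.exp(-log_odds)))

# Fit logistic regression as a GLM: Binomial family with the (default) logit link
glm = sm.GLM(y, sm.add_constant(X), family=sm.families.Binomial())
result = glm.fit()
print(result.params)  # estimates of theta_0, theta_1, theta_2
```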
