Lasso (L1) and Ridge (L2) Regularization
Learn about the two ways of regularizing logistic regression models.
Before applying regularization to a logistic regression model, let’s take a moment to understand what regularization is and how it works. The two ways of regularizing logistic regression models in scikit-learn are called lasso (also known as L1 regularization) and ridge (also known as L2 regularization). When instantiating the model object from the scikit-learn class, you can choose penalty='l1' or penalty='l2'. These are called “penalties” because the effect of regularization is to add a penalty, or a cost, for having larger values of the coefficients in a fitted logistic regression model.
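As a minimal sketch of how the penalty choice looks in practice, the snippet below fits the same model with each penalty at a strong and a weak setting, then compares the total size of the fitted coefficients. The synthetic dataset, the variable names, the liblinear solver, and the specific values of C (scikit-learn's inverse regularization strength) are all assumptions made for illustration, and the snippet assumes a scikit-learn version whose LogisticRegression accepts the penalty argument.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data, assumed purely for illustration.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=4,
    n_redundant=0, random_state=0,
)

coef_size = {}
for penalty in ("l1", "l2"):
    # Smaller C means a stronger penalty on the coefficients.
    for label, C in (("strong", 0.01), ("weak", 100.0)):
        # liblinear is one solver that supports both penalties.
        model = LogisticRegression(penalty=penalty, C=C, solver="liblinear")
        model.fit(X, y)
        # Sum of absolute coefficient values as a simple size measure.
        coef_size[(penalty, label)] = float(np.abs(model.coef_).sum())

for key, size in coef_size.items():
    print(key, round(size, 3))
```

With either penalty, the strongly penalized fit should end up with a smaller total coefficient size than the weakly penalized one, which is the "cost for larger coefficients" described above.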
As we’ve already learned, coefficients in a logistic regression model describe the relationship between the log odds of the response and each of the features. Therefore, if a coefficient value is particularly large, then a small change in that feature will have a large effect on the prediction.
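To make the effect of a large coefficient concrete, here is a short worked example. The symbols (p for the predicted probability, x_j for the features, beta_j for the coefficients) and the numbers are assumptions chosen for illustration, not values from a fitted model.

```latex
% Log odds as a linear function of the features
\log\frac{p}{1-p} = \beta_0 + \sum_{j} \beta_j x_j

% Changing feature x_j by \Delta, with the other features held fixed,
% shifts the log odds by \beta_j \Delta and multiplies the odds by e^{\beta_j \Delta}.

% Example: \beta_j = 5 and \Delta = 0.1
\beta_j \Delta = 5 \times 0.1 = 0.5,
\qquad e^{0.5} \approx 1.65

% The same change with \beta_j = 0.5
\beta_j \Delta = 0.5 \times 0.1 = 0.05,
\qquad e^{0.05} \approx 1.05
```

In this example, the larger coefficient turns the same small change in the feature into roughly a 65% increase in the odds, compared with about 5% for the smaller coefficient.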
When a model is being fit and is learning the relationship between features and the response variable, the model can start to learn the noise in the data. We saw this previously in the figure below: if there are many features available when fitting a model, and there are no guardrails on the values that their coefficients can take, then the model fitting process may try to discover relationships between the features and the response variable that won’t generalize to new data. In this way, the model becomes tuned to the unpredictable, random noise that accompanies real-world, imperfect data. Unfortunately, this only serves to increase the model’s skill at predicting the training data, which is not our ultimate goal. Therefore, we should seek to root out such spurious relationships from the model.
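The idea of a model tuning itself to noise can be sketched with a small experiment. The setup below is an assumption for illustration: 100 features of which only the first two influence the response, as many training samples as features, and two arbitrary values of C (scikit-learn's inverse regularization strength) standing in for "almost no guardrails" and "firm guardrails".

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_features = 200, 100
X = rng.normal(size=(n_samples, n_features))

# Only the first two features carry signal; the other 98 are pure noise.
signal = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (signal + rng.normal(size=n_samples) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Very large C: coefficients are almost unconstrained.
loose = LogisticRegression(C=1e4, solver="liblinear").fit(X_train, y_train)
# Small C: large coefficients are penalized heavily.
tight = LogisticRegression(C=0.05, solver="liblinear").fit(X_train, y_train)

loose_train = loose.score(X_train, y_train)
loose_test = loose.score(X_test, y_test)
tight_train = tight.score(X_train, y_train)
tight_test = tight.score(X_test, y_test)

print("loose: train", loose_train, "test", loose_test)
print("tight: train", tight_train, "test", tight_test)
```

The gap between training and test accuracy for the nearly unconstrained model is the symptom to look for: skill on the training data that does not carry over to new data.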