Ridge Regression

Understand the need for regularization in linear regression.

Chapter Goals:

  • Learn about regularization in linear regression
  • Learn about hyperparameter tuning using cross-validation
  • Implement a cross-validated ridge regression model in scikit-learn

While ordinary least squares regression is a good way to fit a linear model to a dataset, it works best when the dataset's features are independent, i.e. uncorrelated. When many of the features are linearly correlated, e.g. if a dataset has multiple features depicting the same price in different currencies, the least squares coefficients become highly sensitive to noise in the data.
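To see this sensitivity concretely, here is a minimal sketch (the data and feature names are made up for illustration) that fits ordinary least squares on two nearly identical "price" features. Across trials that differ only in tiny feature noise, the fitted coefficients swing wildly even though the underlying signal never changes:

```python
# Sketch: correlated features make least squares coefficients unstable.
# Assumes NumPy and scikit-learn are installed; data is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 100
price_usd = rng.uniform(10, 100, size=n)
price_eur = price_usd * 0.9  # same price in another currency: perfectly correlated
y = 2.0 * price_usd + rng.normal(scale=1.0, size=n)  # true signal plus noise

for trial in range(3):
    feature_noise = rng.normal(scale=0.01, size=n)
    X = np.column_stack([price_usd, price_eur + feature_noise])  # nearly collinear
    coefs = LinearRegression().fit(X, y).coef_
    print(coefs)  # coefficients vary drastically from trial to trial
```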

Because real-life data tends to be noisy and often contains linearly correlated features, we combat this sensitivity by performing regularization. Recall that for ordinary least squares regression, the goal is to find the weights (coefficients) for the linear model that minimize the sum of squared residuals:

$$\sum_{i=1}^{n} (\mathbf{x}_i \cdot w - y_i)^2$$
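Ridge regression regularizes this objective by adding an L2 penalty on the weights, scaled by a hyperparameter $\alpha \geq 0$:

$$\sum_{i=1}^{n} (\mathbf{x}_i \cdot w - y_i)^2 + \alpha \|w\|_2^2$$

Larger values of $\alpha$ shrink the weights more aggressively toward zero, which stabilizes the model when features are correlated. Since the best $\alpha$ depends on the dataset, it is typically chosen by cross-validation.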

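As a sketch of the chapter's third goal, the snippet below uses scikit-learn's RidgeCV on synthetic, purely illustrative data; the candidate alpha values are assumptions chosen for demonstration:

```python
# Sketch: ridge regression with alpha chosen by cross-validation.
# Assumes NumPy and scikit-learn are installed; data is synthetic.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# RidgeCV evaluates each candidate alpha with cross-validation
# (efficient leave-one-out by default) and keeps the best one.
model = RidgeCV(alphas=[0.1, 1.0, 10.0])
model.fit(X, y)
print(model.alpha_)  # the alpha selected by cross-validation
print(model.coef_)   # the fitted weights
```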