Regularized Regression
Understand how regularized regression reduces overfitting by penalizing large coefficients through ridge (L2) and lasso (L1) methods. Learn to balance model accuracy and coefficient magnitude using lambda parameters. Discover cross-validation to optimize lambda and feature selection to improve model efficiency by eliminating insignificant predictors.
Overfitting occurs when a model achieves low error on the training data but high error on the testing data. As you learned in the previous lesson, an overly complex model leads to overfitting. In regression, overfitting often shows up as large coefficient values: when a coefficient is very large, it dominates the predictions, and the model ends up fitting noise in the training data.
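To make this concrete, here is a minimal sketch (not part of the lesson, using scikit-learn with made-up synthetic data) that fits a simple and a very flexible regression model to the same noisy samples. The flexible model fits the training data almost perfectly but does worse on held-out data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Noisy samples from a smooth underlying function (illustrative data)
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(x).ravel() + rng.normal(scale=0.3, size=60)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.5, random_state=0)

errors = {}
for degree in (1, 15):
    poly = PolynomialFeatures(degree)
    model = LinearRegression().fit(poly.fit_transform(x_tr), y_tr)
    # Record (training MSE, testing MSE) for each model complexity
    errors[degree] = (
        mean_squared_error(y_tr, model.predict(poly.transform(x_tr))),
        mean_squared_error(y_te, model.predict(poly.transform(x_te))),
    )
    print(f"degree {degree}: train MSE {errors[degree][0]:.3f}, "
          f"test MSE {errors[degree][1]:.3f}")
```

The degree-15 model drives the training error down but its testing error is worse, which is the gap that regularization is designed to close.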
What is the impact of the amount of data on model overfitting?
So we know that overfitting in regression is associated with large coefficients. Our objective in regression is to minimize the cost. If we can somehow penalize large coefficient values, we can get a better model. To do this, we form a loss function that includes:
- How well our function is fitting the data
- The magnitude of the coefficients generated
Our cost is the sum of these two terms.
We need to strike a balance between the two terms, so we introduce a parameter, lambda, to control the trade-off.
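The two-term cost above can be sketched in code. In this illustration (an assumption of mine, not the lesson's own code), we use scikit-learn, where ridge adds lambda times the sum of squared coefficients (L2) and lasso adds lambda times the sum of absolute coefficients (L1); note that scikit-learn calls the lambda parameter `alpha`. The data is synthetic, with only the first three features actually informative:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data: 3 informative features, 7 pure-noise features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0, 0, 0, 0, 0, 0, 0])
y = X @ true_coef + rng.normal(scale=0.5, size=100)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)   # lambda is called alpha in scikit-learn
lasso = Lasso(alpha=0.1).fit(X, y)

# Ridge shrinks all coefficients toward zero; lasso drives some exactly to zero
print("OLS   |coef| sum:", np.abs(ols.coef_).sum())
print("Ridge |coef| sum:", np.abs(ridge.coef_).sum())
print("Lasso zero coefs :", int((lasso.coef_ == 0).sum()))
```

Increasing `alpha` strengthens the penalty: coefficients shrink further, and lasso eliminates more predictors, which is the feature-selection behavior mentioned in the overview.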
...