Feature Selection (Intrinsic Methods)

Feature Selection refers to the process of selecting the features that contribute most to building an effective model. This lesson covers intrinsic, or embedded, methods.

Intrinsic or Embedded Methods

Embedded methods identify the features that contribute the most to the model’s performance while the model is being trained. You have seen some of these Feature Selection methods in previous lessons, and we will discuss several more, such as Decision Tree-based methods, in future lessons. Examples include:

  • Ridge Regression (L2-Regularization)

  • Lasso Regression (L1-Regularization)

  • Elastic-Net Regression (uses both L1 and L2 Regularization)

  • Decision Tree-Based Methods (Decision Tree Classification, Random Forest Classification, XGBoost Classification, LightGBM), as sketched below.
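For instance, tree-based models expose importance scores that embedded selection can build on. Below is a minimal sketch using scikit-learn's RandomForestClassifier on a built-in dataset; the dataset choice and hyperparameters are illustrative assumptions, not part of the lesson's own code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Illustrative dataset choice: any labeled dataset works here.
X, y = load_breast_cancer(return_X_y=True)

# Fit a random forest; importance scores are computed during training.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Each entry reflects how much a feature reduced impurity across the trees.
print(forest.feature_importances_)
```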

We know regularization shrinks the parameters in the equation below; L1 regularization in particular can drive some of them to exactly zero, while L2 regularization only shrinks them toward zero. This property of regularization methods can be used as a Feature Selection method.

y = w_0 + x_1w_1 + x_2w_2 + x_3w_3 + \dots + x_nw_n
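To see this zeroing effect concretely, here is a minimal sketch fitting a Lasso model on synthetic data; the alpha value and the synthetic dataset are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 10 features, but only 3 carry signal (informative features).
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# L1-regularized linear regression; alpha controls the penalty strength.
lasso = Lasso(alpha=1.0)
lasso.fit(X, y)

# Coefficients of uninformative features typically shrink to exactly 0,
# effectively selecting the useful ones.
print(lasso.coef_)
```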

Scikit-learn implementation

We have already covered the implementation of regularization models like Ridge, Lasso, and Elastic-Net regression in previous lessons.

  • Scikit-learn provides a SelectFromModel class.

  • It is used with models that expose a coef_ or feature_importances_ attribute.

  • It takes in a threshold parameter.

  • Features whose corresponding coef_ or feature_importances_ values fall below the provided threshold are considered unimportant and removed, as in the sketch after this list.
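Putting these points together, here is a minimal sketch of SelectFromModel wrapped around a Lasso estimator; the threshold value and the synthetic data are illustrative assumptions.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# Features whose |coef_| falls below the threshold are dropped.
selector = SelectFromModel(Lasso(alpha=1.0), threshold=1e-5)
X_selected = selector.fit_transform(X, y)

print(X.shape, "->", X_selected.shape)  # e.g. (200, 10) -> (200, 3)
print(selector.get_support())           # boolean mask of the kept features
```

The same pattern works with the tree-based models above: pass a fitted or unfitted estimator that exposes feature_importances_, and SelectFromModel applies the threshold to those scores instead of the coefficients.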
