Exercise: F-test and Univariate Feature Selection

Learn how to perform univariate feature selection using the F-test.

Univariate feature selection using F-test

In this exercise, we’ll use the F-test to examine the relationship between each feature and the response variable. We will use this method to do what is called univariate feature selection: the practice of testing features one by one against the response variable to see which ones have predictive power. Perform the following steps to complete the exercise:

  1. Our first step in doing the ANOVA F-test is to separate out the features and response as NumPy arrays, taking advantage of the list we created, as well as integer indexing in pandas:

    X = df[features_response].iloc[:,:-1].values
    y = df[features_response].iloc[:,-1].values
    print(X.shape, y.shape)
    

    The output should show the shapes of the features and response:

    # (26664, 17) (26664,)
    

    There are 17 features, and both the features and response arrays have the same number of samples as expected.

  2. Import the f_classif function and feed in the features and response:

    from sklearn.feature_selection import f_classif
    [f_stat, f_p_value] = f_classif(X, y)
    

    There are two outputs from f_classif: the ...
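To make the idea of testing features one by one more concrete, here is a minimal pure-Python sketch of the one-way ANOVA F-statistic, which is the kind of score f_classif computes for each feature column. The helper anova_f_statistic and the toy data X_toy and y_toy are made up for illustration and are not part of the exercise; the sketch also leaves out the p-values.

```python
def anova_f_statistic(values, labels):
    """One-way ANOVA F-statistic for a single feature, grouped by class label."""
    groups = {}
    for v, label in zip(values, labels):
        groups.setdefault(label, []).append(v)

    n_samples = len(values)
    n_groups = len(groups)
    grand_mean = sum(values) / n_samples

    # Between-group sum of squares: how far each class mean is from the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_samples - n_groups)
    return ms_between / ms_within


# Toy data (hypothetical): rows are samples, columns are features.
X_toy = [
    [1.0, 5.0],
    [2.0, 3.0],
    [3.0, 4.0],
    [7.0, 4.0],
    [8.0, 5.0],
    [9.0, 3.0],
]
y_toy = [0, 0, 0, 1, 1, 1]

# Univariate selection: score each feature column on its own against the response.
f_scores = [
    anova_f_statistic([row[j] for row in X_toy], y_toy)
    for j in range(len(X_toy[0]))
]
print(f_scores)  # [54.0, 0.0]
```

In this toy data, the first feature separates the two classes cleanly and gets a large F-statistic, while the second has identical class means and scores zero. A higher score suggests the feature's mean differs more between classes relative to the variation within each class.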
