Examining Relationships Between Features and Response Variable

Learn to examine the relationship between features and response variables.

We'll cover the following...

- Using correlation for exploring feature relations
  - Filtering features using list comprehension
- Try it yourself

Accurate predictions of the response variable require good features: features that are clearly linked to the response variable in some way. Thus far, we've examined the relationship between a couple of features and the response variable, either by calculating the groupby/mean of a feature and the response variable, or by using individual features in a model and examining performance. However, we have not yet systematically explored how all the features relate to the response variable. We will do that now, and begin to capitalize on the hard work we put into exploring the features and making sure the data quality was good.

Using correlation for exploring feature relations

A popular way to get a quick look at how all the features relate to the response variable, as well as to each other, is a correlation plot. We will first create a correlation plot for the case study data, then discuss how to interpret it, along with some mathematical details.
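To make the idea concrete before plotting anything, here is a minimal sketch of the quantity a correlation plot typically displays. It assumes the linear (Pearson) correlation, which the text above does not name explicitly. The `pearson` helper, the column names, and the toy values are all hypothetical and are not the case study data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations from the means (unnormalized covariance)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (unnormalized variances)
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up toy data: two hypothetical features and a binary response
toy = {
    "feature_a": [1, 2, 3, 4, 5],
    "feature_b": [5, 3, 4, 1, 2],
    "response": [0, 0, 1, 1, 1],
}

# Correlation of each feature with the response variable
correlations = {
    name: pearson(values, toy["response"])
    for name, values in toy.items()
    if name != "response"
}

for name, value in correlations.items():
    print(f"{name}: {value:+.3f}")
```

In this toy data, `feature_a` rises with the response and gets a positive value, while `feature_b` falls and gets a negative one. With pandas, `DataFrame.corr()` computes the same pairwise quantity for every pair of columns at once (Pearson by default), which is the usual input to a correlation plot.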

To create a correlation plot, we need all the features we plan to explore, as well as the response variable. Because we are going to use most of the column names from the DataFrame for this, a ...
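As a rough illustration of the list-comprehension filtering named in the lesson outline, one way to keep most column names while dropping a few might look like the sketch below. The column names and the `columns_to_exclude` list are hypothetical placeholders, not the actual case study columns.

```python
# Hypothetical column names standing in for the DataFrame's columns
all_columns = ["id", "feature_1", "feature_2", "unused_column", "response"]

# Hypothetical names we do not want in the correlation plot
columns_to_exclude = ["id", "unused_column"]

# Keep every column name that is not in the exclusion list, preserving order
columns_for_plot = [
    name for name in all_columns if name not in columns_to_exclude
]

print(columns_for_plot)
```

Listing the few names to drop is shorter and less error-prone than typing out the many names to keep, which is why this pattern suits the case where most columns are used.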
