Relationships Between Dependent and Independent Variables
Learn how to understand and visualize interfeature relationships using a correlation matrix and heatmaps.
Relationships between features and the label
In the previous lesson, we started working on an online retail transaction dataset and applied feature engineering techniques to it. In this lesson, we'll build on that by analyzing the wrangled dataset and observing how each feature relates to the dependent variable (label). This will help us choose a subset of features for the regression model.
First, let's import the necessary libraries and the wrangled dataset.
import pandas as pd
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns

# load the wrangled dataset
df_retail = pd.read_csv('wrangled_transactions.csv', header=0, index_col='customer_id')
print(df_retail.head())
print(df_retail.shape)
Correlation matrix
To get a good overview of the interfeature relationships, we can use a ...
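As a rough illustration of the idea, the sketch below computes a correlation matrix with pandas and ranks features by the strength of their correlation with the label. It does not use the lesson's dataset: `df_demo` and its column names (`recency`, `frequency`, `noise`, `label`) are hypothetical, and the data is synthetic, generated so that `label` depends strongly on `frequency`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a wrangled dataset.
# All column names here are hypothetical, chosen for illustration only.
rng = np.random.default_rng(seed=0)
n = 200
frequency = rng.integers(1, 50, size=n).astype(float)

df_demo = pd.DataFrame({
    "recency": rng.integers(1, 365, size=n).astype(float),
    "frequency": frequency,
    "noise": rng.normal(0.0, 1.0, size=n),
    # Label constructed to depend strongly on frequency (an assumption of this demo).
    "label": 10.0 * frequency + rng.normal(0.0, 5.0, size=n),
})

# Pairwise Pearson correlation between all numeric columns.
corr = df_demo.corr()

# Absolute correlation of each feature with the label, strongest first.
label_corr = corr["label"].drop("label").abs().sort_values(ascending=False)

print(corr.round(2))
print(label_corr.round(2))
```

Passing `corr` to `sns.heatmap(corr, annot=True)` would draw the same matrix as an annotated heatmap. Note that Pearson correlation only captures linear relationships, so a low value does not by itself prove a feature is uninformative.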