Relationships Between Dependent and Independent Variables

Learn how to understand and visualize interfeature relationships using a correlation matrix and heatmaps.

Relationships between features and the label

In the previous lesson, we started working with an online retail transaction dataset and applied feature engineering techniques to it. In this lesson, we’ll build on that work by analyzing the wrangled dataset and examining the relationship between each feature and the dependent variable (the label). This will help us choose a subset of features that improves the regression model.

First, let's import the necessary libraries and the wrangled dataset.

import pandas as pd
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns
# load the wrangled dataset
df_retail = pd.read_csv('wrangled_transactions.csv', header=0, index_col='customer_id')
print(df_retail.head())
print(df_retail.shape)

Correlation matrix

To get a good overview of the interfeature relationships, we can use a ...
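
As a rough illustration of how a correlation matrix and a heatmap can summarize interfeature relationships, here is a minimal sketch. It does not use the wrangled retail dataset: the DataFrame `df_demo`, its column names (`frequency`, `recency`, `label`), and the helper `plot_heatmap` are hypothetical and chosen only for illustration. The label is deliberately generated from `frequency` so that one strong correlation shows up.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data. The column names are hypothetical and are
# NOT the columns of the wrangled retail dataset.
rng = np.random.default_rng(seed=0)
n = 200
frequency = rng.poisson(lam=5, size=n).astype(float)
recency = rng.uniform(1, 365, size=n)
# The label is built to depend on frequency, plus some noise.
label = 20.0 * frequency + rng.normal(0, 10, size=n)

df_demo = pd.DataFrame(
    {"frequency": frequency, "recency": recency, "label": label}
)

# Pairwise Pearson correlations between all numeric columns.
corr = df_demo.corr()

# Correlation of each feature with the label, largest magnitude first.
label_corr = (
    corr["label"]
    .drop("label")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)

print(corr.round(2))
print(label_corr.round(2))


def plot_heatmap(corr_matrix):
    """Draw an annotated heatmap of a correlation matrix (hypothetical helper)."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(5, 4))
    # Fix the color scale to [-1, 1] so colors are comparable across plots.
    sns.heatmap(corr_matrix, annot=True, fmt=".2f",
                cmap="coolwarm", vmin=-1, vmax=1, ax=ax)
    ax.set_title("Correlation matrix (synthetic data)")
    fig.tight_layout()
    return ax
```

Calling `plot_heatmap(corr)` followed by `plt.show()` would display the heatmap. The same pattern should carry over to `df_retail`, assuming its columns are numeric. Features with a correlation close to zero with the label are candidates for removal, although Pearson correlation only captures linear relationships.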