Pair Plots
Learn to design, customize, and interpret relationships among data features through pair plots.
Overview
We use scatter plots for bivariate analysis of data. However, to examine the relationships among more than two variables at once, there’s a more convenient tool: the pair plot. A pair plot displays the pairwise relationships between a dataset’s variables and, by default, plots all numeric variables.
Use, customize, and interpret pair plots
A pair plot is a great place to start when performing data analysis because it gives us a detailed initial view of our data.
In the code snippet below, we load the tips dataset from seaborn. Let’s get started!
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_theme() # set default theme
tips_df = sns.load_dataset('tips')
print(tips_df.head())
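Because a pair plot only picks up numeric variables by default, it can help to check which columns qualify before plotting. The sketch below is a rough approximation using pandas alone: `sample_df` is a hypothetical, hand-made stand-in shaped like the tips data (the values are illustrative, not real records), and seaborn's own numeric detection may differ in edge cases.

```python
import pandas as pd

# Hypothetical rows shaped like the tips data (values are illustrative only)
sample_df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01],
    "tip": [1.01, 1.66, 3.50],
    "sex": ["Female", "Male", "Male"],
    "size": [2, 3, 3],
})

# Keep only the columns with a numeric dtype; these are the candidates
# for a pair plot when no variables are selected explicitly
numeric_cols = sample_df.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```

The same `select_dtypes` call can be run on `tips_df` once it has been loaded.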
To construct a pair plot, we call the sns.pairplot() function and pass our tips_df DataFrame to the data argument. We also set the dropna keyword argument to True (it defaults to False), which means that any missing values in the dataset are left out of the visualization. The resulting pair plot is shown below:
sns.pairplot(data=tips_df, dropna=True)
plt.savefig('output/graph.png') # save figure
Isn’t it amazing how just one line of code generates such a powerful data visualization? The tips dataset has three numeric variables (total_bill, tip, and size), so the pair plot is a 3 × 3 grid of nine plots. Let’s break down the pair plot visualization to understand the complete plot. First, as shown in ...