Visualization with Heatmaps
Learn how to plot, interpret, and style heatmaps for data analysis.
Overview
A heatmap is a visual representation of a data table. It displays rectangular data as a color-encoded matrix, in which the color of each cell encodes the value in that cell. Color maps represent differences in value by varying hue, saturation, or brightness.
Correlation with heatmaps
We begin by importing the required libraries (seaborn, pandas, and matplotlib) and setting the default seaborn theme with the sns.set_theme() function. Next, we load the penguins dataset and view its first rows with the pandas head() function, as shown below:
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

sns.set_theme()
penguins_df = sns.load_dataset('penguins')
print(penguins_df.head())
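Heatmaps are popular for representing correlation among variables. We plot a correlation matrix of the penguins dataset on a heatmap by passing penguins_df.corr(numeric_only=True) to the sns.heatmap() function. The numeric_only=True argument restricts the calculation to the numeric columns. The dataset also contains text columns (species, island, and sex), and in pandas 2.0 and later, corr() raises an error if they're included. We’ve customized the font size for the plot by specifying sns.set(font_scale=0.7) so that the complete column names are visible in the plot. The heatmap is shown below:
# Shrink the fonts so the full column names fit on the axes
sns.set(font_scale=0.7)
# Correlate only the numeric columns and draw the matrix as a heatmap
sns.heatmap(penguins_df.corr(numeric_only=True))
plt.savefig('output/graph.png')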
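As a variation on the plot above, here is a minimal sketch of a correlation heatmap with annotated cells and a color scale pinned to the full correlation range. It is an illustration, not the lesson's own code. To run without downloading anything, it uses a small synthetic DataFrame whose values are made up (the column names are borrowed from the penguins dataset), and the choice of the coolwarm palette is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for numeric penguin measurements (values are made up).
rng = np.random.default_rng(0)
flipper = rng.normal(200, 14, size=100)
demo_df = pd.DataFrame({
    "flipper_length_mm": flipper,
    "body_mass_g": 50 * flipper - 5800 + rng.normal(0, 300, size=100),
    "bill_depth_mm": 17 - 0.05 * (flipper - 200) + rng.normal(0, 1.5, size=100),
})

corr = demo_df.corr()  # every column is numeric here

sns.set_theme(font_scale=0.7)
ax = sns.heatmap(
    corr,
    annot=True,        # write each coefficient inside its cell
    fmt=".2f",         # show two decimal places
    cmap="coolwarm",   # diverging palette: one hue per sign
    vmin=-1, vmax=1,   # pin the color bar to the full correlation range
    square=True,       # draw square cells
)
ax.set_title("Correlation matrix (synthetic data)")
plt.tight_layout()
```

Pinning vmin and vmax keeps the colors comparable across plots, since the same shade always stands for the same coefficient regardless of the data's actual range.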
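The corr() method computes Pearson’s correlation coefficient, its default, for each pair of numeric variables in the data. The values range from -1 to +1. Negative values represent a negative correlation (one variable tends to decrease as the other increases), positive values represent a positive correlation, and values near 0 indicate little or no linear relationship.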
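To make the coefficient concrete, the sketch below computes Pearson's r by hand and compares it with the pandas result. Pearson's r is the sum of the products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The five rows are a hypothetical sample invented for illustration, not values taken from the lesson.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical sample; column names mirror the penguins dataset.
df = pd.DataFrame({
    "flipper_length_mm": [181.0, 186.0, 195.0, 193.0, 210.0],
    "body_mass_g": [3750.0, 3800.0, 3250.0, 3450.0, 4400.0],
})

x = df["flipper_length_mm"].to_numpy()
y = df["body_mass_g"].to_numpy()

# Deviations of each observation from its column mean.
dx = x - x.mean()
dy = y - y.mean()

# Pearson's r: co-variation of x and y scaled by the spread of each.
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# Series.corr uses Pearson's method by default.
r_pandas = df["flipper_length_mm"].corr(df["body_mass_g"])

print(round(r_manual, 4), round(r_pandas, 4))
```

The two printed numbers should agree, which shows that each cell of the correlation matrix is this same calculation applied to one pair of columns.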
In a heatmap, as shown through the color bar, low correlation values are ...