Hist Plots
Learn how to plot, design, and interpret hist plots for data visualizations.
Overview
A hist plot is a histogram plot. It represents the distribution of continuous numerical data. A bar plot displays the frequency of each category of a categorical variable, whereas a histogram groups the values of a numeric variable into consecutive intervals called bins and shows how many observations fall into each bin.
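To make the idea of bins concrete, here is a minimal pure-Python sketch of the counting that a histogram performs. The sample values, the bin width of 10, and the helper name bin_counts are illustrative assumptions; they are not part of the lesson's datasets or of seaborn's API.

```python
# Hypothetical sample of bill amounts, for illustration only.
values = [3.5, 12.0, 14.2, 18.9, 21.4, 22.7, 25.0, 31.3, 38.6, 47.1]


def bin_counts(data, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width bins.

    Bin i covers [start + i*width, start + (i+1)*width); the last bin also
    includes its right edge. Values outside the binned range are skipped.
    """
    counts = [0] * n_bins
    upper = start + width * n_bins
    for v in data:
        if v < start or v > upper:
            continue  # value lies outside the binned range
        idx = min(int((v - start) // width), n_bins - 1)
        counts[idx] += 1
    return counts


counts = bin_counts(values, start=0, width=10, n_bins=5)

# Print a small text histogram: one row per bin, one '#' per observation.
for i, c in enumerate(counts):
    print(f"{i * 10:>2} to {(i + 1) * 10:>2}: {'#' * c}")
```

Each row of the printed output corresponds to one bar of a histogram: the bin is the interval on the x-axis and the number of '#' marks is the bar's height.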
Univariate hist plot
Let’s get started by importing the required libraries. We’ll also import the tips and diamonds datasets and use them for our visualizations.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

sns.set_theme()

# import datasets
tips_df = sns.load_dataset("tips")
diamond_df = sns.load_dataset("diamonds")

print(tips_df.head())
print("\n")
print("\n")
print(diamond_df.head())
We plot a histogram for the variable total_bill using the sns.histplot() function. The x-axis shows the range of total_bill values, which runs from about $3 to $51, and the y-axis shows how many observations fall into each bin. Most of the total_bill observations are between $10 and $30.
sns.histplot(x='total_bill', data=tips_df)  # histogram
plt.ylabel('frequency')
plt.savefig('output/graph.png')
By default, seaborn determines the number of bins from the data. However, we can set the number of bins ourselves using the bins parameter. Choose it with care: too few bins can hide patterns in the distribution, while too many bins leave only a handful of observations in each one, so the plot looks noisy.
sns.histplot(x='total_bill', bins=30, data=tips_df)  # customize the number of bins
plt.ylabel('frequency')
plt.savefig('output/graph.png')
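The effect of the bin count can also be checked numerically before plotting. The sketch below is only an illustration: it uses NumPy on synthetic, seeded data (an assumption, chosen so that it runs without downloading the tips dataset) and counts how many bins hold at most one observation for a few bin counts. seaborn's documentation says the bins argument is passed to numpy.histogram_bin_edges(), so the "auto" rule shown at the end should be read as an approximation of the default behavior, not an exact reproduction of it.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic right-skewed "bill" values; a stand-in, not the tips data.
values = rng.gamma(shape=4.0, scale=5.0, size=244)

results = {}
for bins in (3, 30, 150):
    counts, edges = np.histogram(values, bins=bins)
    results[bins] = counts
    sparse = int((counts <= 1).sum())  # bins holding zero or one value
    width = edges[1] - edges[0]
    print(f"bins={bins:>3}  bin width={width:.2f}  bins with <=1 value: {sparse}")

# Bin edges picked by NumPy's "auto" reference rule for the same data.
auto_edges = np.histogram_bin_edges(values, bins="auto")
print("bins chosen by the 'auto' rule:", len(auto_edges) - 1)
```

With very few bins, every bin is well populated but the shape of the distribution is coarse; with very many bins, a growing share of bins holds zero or one value, which is what shows up as noise in the plot.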
We can ...