Distribution Plots

Let’s learn about distribution plotting in seaborn.

Seaborn

On its official website, seaborn is described as a “Python visualization library based on matplotlib” which “provides a high-level interface for drawing attractive statistical graphics.”

Seaborn’s plots

Let’s start with seaborn’s distribution plots. When dealing with a set of data, the first thing we usually want to do is get a sense of how the variables are distributed. In this lesson, we’ll learn some useful plotting options that allow us to visualize the distribution of a dataset.

These distribution plots include distplot(), jointplot(), pairplot(), rugplot(), kdeplot(), and so on. Note that distplot() has been deprecated since seaborn 0.11 in favor of displot() and histplot().

Note: Seaborn also comes with built-in datasets, which help us learn and practice our skills on benchmark data. These datasets can be conveniently loaded using seaborn’s load_dataset('data_set_name') function. We’ll use various built-in datasets for learning purposes in the seaborn section.

import seaborn as sns
print(sns.__version__)

Seaborn provides a list of its built-in datasets, which we can obtain using sns.get_dataset_names().

Some examples of seaborn’s datasets include ['anscombe', 'attention', 'brain_networks', 'car_crashes', 'dots', 'exercise', 'flights', 'fmri', 'gammas', 'iris', 'planets', 'tips', 'titanic'], and so on.

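Any dataset from the list can be loaded from seaborn’s online repository using sns.load_dataset('name'). Because the data is downloaded, an internet connection is needed unless the dataset is already cached locally.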
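The two calls above can be combined into a short sketch like the following. The try/except guard is an addition for illustration, not part of seaborn’s recommended usage: both functions fetch from the online repository, so the sketch assumes a network failure surfaces as an OSError and falls back to an empty result instead of crashing.

```python
import seaborn as sns

try:
    # Both calls contact the online repository unless the data is cached.
    names = sns.get_dataset_names()
    tips = sns.load_dataset("tips")
except OSError as err:
    # Assumed failure mode: no network access (URL errors derive from OSError).
    names, tips = [], None
    print(f"Could not reach the dataset repository: {err}")
else:
    print(f"{len(names)} datasets available")
    print(tips.shape)   # (rows, columns) of the loaded DataFrame
    print(tips.head())  # first five rows
```

load_dataset() returns a pandas DataFrame, so the usual inspection methods such as head(), info(), and describe() are available on the result.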

Let’s start our learning journey with a simple, built-in dataset—tips. This dataset shows the amount of money that restaurant staff receives in tips from customers based on various factors. This is a great dataset to learn the range of plotting options in seaborn. ...