Distribution Plots
Let’s learn about distribution plotting in seaborn.
Seaborn
On its official website, seaborn is described as a “Python visualization library based on matplotlib” which “provides a high-level interface for drawing attractive statistical graphics.”
Seaborn’s plots
Let’s start with seaborn’s distribution plots. When dealing with a set of data, the first thing we usually want to do is get a sense of how the variables are distributed. In this lesson, we’ll learn some useful plotting options that allow us to visualize the distribution of a dataset.
These distribution plots include distplot(), jointplot(), pairplot(), rugplot(), kdeplot(), and so on.
Note: Seaborn also comes with built-in datasets, which help us learn and practice our skills on benchmark data. These datasets can be conveniently loaded using seaborn’s built-in function load_dataset('data_set_name'). We’ll use various built-in datasets for learning purposes in the seaborn section.
import seaborn as sns
print(sns.__version__)
Seaborn provides a list of built-in datasets. We can obtain their names using sns.get_dataset_names().
Some examples of seaborn’s datasets include ['anscombe', 'attention', 'brain_networks', 'car_crashes', 'dots', 'exercise', 'flights', 'fmri', 'gammas', 'iris', 'planets', 'tips', 'titanic'], and so on.
Any dataset from the list can be loaded from the online repository using sns.load_dataset('name').
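The two calls above can be combined as in the following sketch. Because both functions fetch from the online repository, the sketch guards against a missing internet connection. The helper name try_load is a hypothetical choice for illustration and is not part of seaborn.

```python
import seaborn as sns


def try_load(name):
    """Return the named seaborn example dataset, or None if it cannot be fetched.

    Hypothetical helper for illustration; not part of seaborn.
    """
    try:
        return sns.load_dataset(name)
    except Exception as exc:  # e.g. no internet connection and no cached copy
        print(f"Could not load {name!r}: {exc}")
        return None


# Fetch the list of available dataset names (requires internet access).
try:
    names = sns.get_dataset_names()
except Exception as exc:
    print(f"Could not fetch dataset names: {exc}")
    names = []

print(names[:5])  # show the first few names, if any were retrieved

# Load one dataset by name; 'tips' is one of the names listed above.
tips = try_load("tips")
if tips is not None:
    print(tips.shape)   # (number of rows, number of columns)
    print(tips.head())  # first five rows as a pandas DataFrame
```

load_dataset() returns a pandas DataFrame, so the usual DataFrame methods such as head() and describe() are available on the result.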
Let’s start our learning journey with a simple built-in dataset: tips. This dataset shows the amount of money that restaurant staff receive in tips from customers, based on various factors. It is a great dataset for learning the range of plotting options in seaborn. ...