Visualization with Distributions
This lesson introduces the types of distributions used to visualize data and shows how Python supports them through its libraries.
Introduction to distributions
A probability distribution is a mathematical function that provides the probabilities of the occurrence of different possible outcomes.
For example, you might have a program that returns 1 with a 50% probability and 0 with a 50% probability. Thus, 50% of your probability distribution would be assigned to 1 and 50% to 0.
If you were to plot this expected distribution, you would have two bars of equal height for 1 and 0.
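To make the coin-like example concrete, here is a minimal sketch using only the standard library. The function name `coin_program`, the seed, and the number of runs are assumptions made for illustration; they do not come from the lesson. The sketch writes down the expected distribution, simulates the program many times, and prints two text bars of equal height next to the observed frequencies.

```python
import random
from collections import Counter

# Expected (theoretical) distribution of the example program:
# it returns 1 with probability 0.5 and 0 with probability 0.5.
expected = {0: 0.5, 1: 0.5}


def coin_program(rng):
    """Hypothetical stand-in for the program described above."""
    return 1 if rng.random() < 0.5 else 0


rng = random.Random(0)  # seeded so repeated runs give the same sample
n_runs = 10_000         # arbitrary sample size chosen for this sketch

# Count how often each outcome occurs across all runs.
counts = Counter(coin_program(rng) for _ in range(n_runs))
observed = {outcome: counts[outcome] / n_runs for outcome in expected}

# Text version of the two equal-height bars of the expected distribution,
# with the simulated frequency shown alongside for comparison.
for outcome in sorted(expected):
    bar = "#" * round(expected[outcome] * 20)
    print(f"{outcome}: {bar}  expected={expected[outcome]:.2f}  "
          f"observed={observed[outcome]:.3f}")
```

With a large number of runs, the observed frequencies should land close to 0.5 each, which is why the two bars of the expected distribution have the same height.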
Often, you don’t know the mathematical function that generated your data, so you observe the empirical distribution instead. You might sample 10 colored balls from a bag and get 2 red, 3 yellow, and 5 green. That would be your empirical distribution, and you could represent it graphically with 3 bars: one of height 2 for red, one of height 3 for yellow, and one of height 5 for green.
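The colored-ball example can be sketched as a bar chart. This sketch assumes Matplotlib is installed; the `sample` list simply restates the 10 draws from the example, and the variable names are illustrative.

```python
from collections import Counter

import matplotlib.pyplot as plt

# The 10 sampled balls from the example: 2 red, 3 yellow, 5 green.
sample = ["red"] * 2 + ["yellow"] * 3 + ["green"] * 5

# The empirical distribution is the count of each observed color.
counts = Counter(sample)
colors = ["red", "yellow", "green"]
heights = [counts[color] for color in colors]

# One bar per color, with the bar height equal to the observed count.
fig, ax = plt.subplots()
bars = ax.bar(colors, heights, color=colors)
ax.set_xlabel("Ball color")
ax.set_ylabel("Count")
ax.set_title("Empirical distribution of 10 sampled balls")
```

In a notebook the figure is usually displayed automatically; in a script, you can call `plt.show()` or `fig.savefig(...)` afterwards. Dividing each count by `len(sample)` would turn the counts into proportions (0.2, 0.3, and 0.5) without changing the shape of the chart.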
Seaborn has a few ways to plot distributions:
- Histograms
- Box plots
- Violin plots
- Joint plots