Boxplot

Learn how to build a boxplot and modify its elements using ggplot2.

Overview of the boxplot

A box and whisker plot (commonly known as a boxplot) visualizes one or more distributions using summary statistics. For a given variable in a dataset, it shows five summary statistics:

  • The minimum value
  • The maximum value
  • The median
  • The first and third quartiles
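
As a quick illustration, the five statistics listed above can be computed directly in base R. This is a minimal sketch that assumes the built-in `mtcars` dataset and its `mpg` column as example data; neither comes from this lesson, and the variable names are chosen only for illustration.

```r
# Example data (an assumption for illustration): miles per gallon from mtcars
mpg <- mtcars$mpg

# quantile() at these probabilities gives min, Q1, median, Q3, and max
five_stats <- quantile(mpg, probs = c(0, 0.25, 0.5, 0.75, 1))
names(five_stats) <- c("min", "Q1", "median", "Q3", "max")

print(five_stats)
```

Note that R offers several quantile definitions (the `type` argument of `quantile()`), so the quartiles may differ slightly from those reported by other tools.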

In short, a boxplot is a popular way to display the distribution of data. It helps detect outliers and compare distributions across groups.
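
To make the comparison of distributions concrete, here is a minimal ggplot2 sketch. It assumes the ggplot2 package is installed and again uses the built-in `mtcars` dataset as stand-in data; the choice of `cyl` as the grouping variable and the axis labels are illustrative assumptions, not part of this lesson.

```r
library(ggplot2)

# One box per cylinder count; factor() turns cyl into a discrete grouping variable
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(x = "Number of cylinders", y = "Miles per gallon")

print(p)
```

Each box summarizes one group, so differences in the median and spread between groups can be read side by side. By default, `geom_boxplot()` draws points beyond 1.5 times the interquartile range from the box as individual outliers.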

Structure of a boxplot

Let’s study the structure of the boxplot:

  • The box, as seen in the figure, is the core of the boxplot.
  • The lower side of the box represents the first quartile (or Q1), i.e., the 25th percentile of the data. (A percentile is a number that compares one score to the scores of the rest of the group: it gives the percentage of scores that fall below a given score. For example, if a user's score is in the 90th percentile for a test, the user scored better than 90% of the people who took the test.) The upper end of the box represents the third quartile (or Q3), i.e., the 75th
...