Histograms

Explore histograms and how to plot them in R.

Let’s consider the temp variable in the weather data frame once again. This time, we’ll only look at how the values of temp are distributed.

  • What are the smallest and largest values?

  • What is the center or most typical value?

  • How do the values spread out?

  • What are frequent and infrequent values?
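The first three questions can also be answered numerically before any plotting. The following is a minimal sketch, assuming the weather data frame is the one shipped with the nycflights13 package (the lesson does not name the package here) and that temp is recorded in °F. The variable names are illustrative only.

```r
# Assumption: `weather` comes from the nycflights13 package.
library(nycflights13)

temp <- weather$temp

# Smallest and largest values; na.rm = TRUE skips missing readings
temp_range <- range(temp, na.rm = TRUE)

# Center: the median is one common choice of "typical" value
temp_median <- median(temp, na.rm = TRUE)

# Spread: the interquartile range covers the middle 50% of values
temp_iqr <- IQR(temp, na.rm = TRUE)

print(temp_range)
print(temp_median)
print(temp_iqr)
```

These summaries say little about which values are frequent or infrequent, which is where a plot of the distribution helps.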

One way to visualize the distribution of this single variable temp is to plot it on a horizontal line, as shown below:

Plot of hourly temperature recordings from NYC in 2013
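A plot like this one could be drawn roughly as follows. This is a sketch rather than the lesson's own code: it assumes the ggplot2 and nycflights13 packages are installed, and it places every point at the constant height y = 0 so that only the horizontal position carries information.

```r
# Assumptions: `weather` is from nycflights13; plotting uses ggplot2.
library(nycflights13)
library(ggplot2)

strip_plot <- ggplot(data = weather, mapping = aes(x = temp, y = 0)) +
  # One point per hourly reading; na.rm = TRUE drops missing temperatures quietly
  geom_point(na.rm = TRUE) +
  labs(x = "Temperature (degrees F)") +
  # Hide the y-axis, since the vertical position has no meaning here
  theme(
    axis.title.y = element_blank(),
    axis.text.y  = element_blank(),
    axis.ticks.y = element_blank()
  )

print(strip_plot)
```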

This gives us a general idea of how the values of temp are distributed. Observe that temperatures vary from around 11°F (-12°C) up to 100°F (38°C). Furthermore, there appear to be more recorded temperatures between 40°F and 60°F than outside this range. However, because so many points are plotted on top of one another, it’s hard to get a sense of exactly how many values are between, say, 50°F and 55°F.
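The count that the overplotted points hide can be computed directly. The sketch below again assumes weather comes from the nycflights13 package; including 50°F and excluding 55°F is an arbitrary choice made here for illustration.

```r
# Assumption: `weather` comes from the nycflights13 package.
library(nycflights13)

temp <- weather$temp

# Number of hourly readings with 50 <= temp < 55 (missing values are ignored)
n_50_to_55 <- sum(temp >= 50 & temp < 55, na.rm = TRUE)

# The same count as a share of all non-missing readings
share_50_to_55 <- n_50_to_55 / sum(!is.na(temp))

print(n_50_to_55)
print(share_50_to_55)
```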

A more common alternative to the diagram above is a histogram. A histogram is a plot that visualizes the distribution of a numerical variable as follows: ...