Tidy Data

Learn about the tidy data format.


Now, let’s learn about the tidy data format with an example from the fivethirtyeight package. The fivethirtyeight package (Kim et al., 2019) provides access to the datasets used in many articles published by the data journalism website FiveThirtyEight.com.

Let’s focus on the drinks data frame and look at its first few rows:

library(fivethirtyeight)
print(drinks)
?drinks

After reading the help file by running ?drinks, we see that drinks is a data frame containing results from a survey of the average number of servings of beer, spirits, and wine consumed in 193 countries. This data was originally reported on FiveThirtyEight.com in Mona Chalabi’s article, “Dear Mona Followup: Where Do People Drink The Most Beer, Wine, And Spirits?”

Let’s apply some data wrangling verbs to the drinks data frame:

  • We’ll filter() the drinks data frame to keep only four countries: the United States, China, Italy, and Saudi Arabia. ...
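
As a minimal sketch, the filter() step might look like the following. It assumes the dplyr package is installed alongside fivethirtyeight, and that the United States is stored as "USA" in the country column. The names four_countries and drinks_smaller are illustrative choices, not something this lesson has defined.

```r
library(dplyr)
library(fivethirtyeight)

# Countries to keep.
# Assumption: the United States appears as "USA" in the country column.
four_countries <- c("USA", "China", "Italy", "Saudi Arabia")

# filter() keeps only the rows whose country is one of the four above.
drinks_smaller <- drinks %>%
  filter(country %in% four_countries)

print(drinks_smaller)
```

Using %in% with a vector of names keeps the condition short and makes it easy to change the list of countries later.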