Tidy Data
Learn about the tidy data format.
Now, let’s learn about the concept of a tidy data format with an example from the fivethirtyeight package. The fivethirtyeight package (Kim et al., 2019) provides access to the datasets used in many articles published by the data journalism website FiveThirtyEight.com.
Let’s focus our attention on the drinks data frame and look at its first five rows of output:
library(fivethirtyeight)
print(drinks)
?drinks
After reading the help file by running ?drinks, we see that drinks is a data frame containing results from a survey of the average number of servings of beer, spirits, and wine consumed in 193 countries. This data was originally reported on FiveThirtyEight.com in Mona Chalabi’s article, “Dear Mona Followup: Where Do People Drink The Most Beer, Wine, And Spirits?”
Let’s apply some data wrangling verbs to the drinks data frame:
We’ll filter() the drinks data frame to consider only four countries: the United States, China, Italy, and Saudi Arabia. ...
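The filtering step can be sketched as follows. This is a minimal illustration, not necessarily the lesson's exact code: it assumes the dplyr package is installed and that the United States is labeled "USA" in the country column. The names four_countries and drinks_four are hypothetical, chosen only for this example.

```r
library(dplyr)
library(fivethirtyeight)

# Assumption: the United States appears as "USA" in the country column.
# Run unique(drinks$country) to confirm the exact labels before relying on this.
four_countries <- c("USA", "China", "Italy", "Saudi Arabia")

# Keep only the rows whose country is one of the four of interest.
drinks_four <- drinks %>%
  filter(country %in% four_countries)

print(drinks_four)
```

Using %in% with a vector of names keeps the condition short and makes it easy to add or remove a country later, compared with chaining several == comparisons with |.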