Stacking and Unstacking Data

Learn to pivot between long-form and tabular-form data in the tidyverse.

A common issue in data science is that we receive data in long form but need it in tabular form, or vice versa. Because we’re working in the tidyverse, we always want our data to be tidy, but that isn’t always how the data is stored. Humans read tabular data more easily, while long-format data is often stored more efficiently in databases. So, depending on where our data comes from and who uses it most often, there will be times when we need to make our data longer (fewer columns but more rows) or wider and more tabular (fewer rows but more columns).
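To make the two directions concrete, here is a minimal sketch that assumes tidyr's `pivot_longer()` and `pivot_wider()` are the functions used for reshaping. The student names, course IDs, and grades are invented for illustration and are not the lesson's dataset.

```r
library(tibble)
library(tidyr)

# Hypothetical tabular (wide) data: one row per student, one column per course.
grades_wide <- tibble(
  Student = c("Ana", "Ben"),
  MATH101 = c(90, 75),
  CHEM110 = c(85, 80)
)

# Wide -> long: the course columns collapse into two columns,
# CourseID (what the value represents) and Grade (the value itself).
grades_long <- pivot_longer(
  grades_wide,
  cols = c(MATH101, CHEM110),
  names_to = "CourseID",
  values_to = "Grade"
)

# Long -> wide: each distinct CourseID becomes its own column again.
grades_back <- pivot_wider(
  grades_long,
  names_from = CourseID,
  values_from = Grade
)

print(grades_long)
print(grades_back)
```

In this sketch, the long version has fewer columns but more rows than the tabular one, and pivoting it back to wide form recovers the original layout.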

Tabular and long-form data

Tabular and long-form data formats

Tabular-format data is often also referred to as wide form because it tends to have many columns. On the other hand, long-form data typically contains a single value column, or just a few, with a preceding column indicating what each value represents. For instance, our student grades dataset, attached to the example below, has a course identifier column and a grade column. The CourseID identifier column tells us what’s in the Grade ...