Special Data Type—tibbles

Learn to use tibbles in the R tidyverse, understand why they’re more efficient and user-friendly, and create basic summaries.

The first step in using the tidyverse is moving from data frames to tibbles, the tidyverse equivalent of data frames. Tibbles drop some legacy data frame behaviors that can be frustrating in modern data science work, such as partial matching of column names, and they print in a cleaner, easier-to-read form that shows only the first few rows along with each column’s type.
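To make the difference concrete, here is a minimal sketch comparing a data frame with its tibble equivalent. The toy data and the variable names (df, tib, and so on) are hypothetical and are not the survey file used later in this lesson.

```r
library(tibble)

# Hypothetical toy data, not the survey file used later in this lesson
df  <- data.frame(Age = c(31, 45, 27), Region = c("North", "South", "East"))
tib <- as_tibble(df)

# Partial matching: a data frame silently resolves $Ag to the Age column
df_partial <- df$Ag

# A tibble refuses to guess: it returns NULL and emits a warning,
# which is suppressed here only to keep the output quiet
tib_partial <- suppressWarnings(tib$Ag)

# Single-column subsetting with [ : a data frame drops to a plain vector,
# while a tibble stays a tibble
df_col  <- df[, "Age"]
tib_col <- tib[, "Age"]

class(df_col)
class(tib_col)
```

The point of the comparison is predictability: with a tibble, a misspelled column name surfaces as a warning instead of quietly returning another column, and subsetting with single brackets returns the same kind of object every time.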

It’s worth noting, however, that being a tibble doesn’t make a dataset tidy. Any data frame, tidy or messy, can be converted to the tibble format, so we may still need to do some cleaning to meet the criteria for tidy data. Doing this tidying ensures that tidyverse functions work as expected on our dataset.
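As an illustration, the sketch below builds a tibble that is not tidy and then reshapes it. The data and column names are made up for this example, and pivot_longer() from tidyr is just one of several possible tidying steps, not necessarily the one our survey data will need.

```r
library(tibble)
library(tidyr)

# Hypothetical wide-format scores: one column per year, so the variable
# "year" is stored in the column names instead of in its own column
messy <- tibble(
  Respondent = c("A", "B"),
  `2022` = c(3, 5),
  `2023` = c(4, 2)
)

is_tibble(messy)  # already a tibble, but not yet tidy

# Reshape so that each row holds one respondent-year observation
tidy <- pivot_longer(messy,
                     cols = c(`2022`, `2023`),
                     names_to = "Year",
                     values_to = "Score")
tidy
```

Both objects are tibbles, but only the second has one variable per column and one observation per row.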

Creating and accessing tibbles

To create a tibble, we first need to load the tidyverse packages. We can then either convert an existing data frame to a tibble or read a file directly into one. Both methods are shown in the example below.

# Load base tidyverse libraries
library(ggplot2)
library(purrr)
library(tibble)
suppressPackageStartupMessages(library(dplyr))
library(tidyr)
library(stringr)
library(readr)
library(forcats)
# Load data
# Method 1 – Data frame to tibble
# Read data from csv file into a data frame
VAR_SurveyDataDF <- read.csv("MySurveyData.csv",
                             header = TRUE,
                             stringsAsFactors = FALSE)
# Convert data frame to a tibble
VAR_SurveyDataTib <- as_tibble(VAR_SurveyDataDF)
# Method 2 – Tibble directly
VAR_SurveyDataTibDirect <- read_csv("MySurveyData.csv",
                                    col_names = TRUE,
                                    skip = 0,
                                    n_max = Inf,
                                    show_col_types = FALSE)
# Output results
paste0("A dataframe")
VAR_SurveyDataDF # data frame version
paste0("A tibble created from a dataframe")
VAR_SurveyDataTib # tibble from data frame version
paste0("A directly created tibble")
VAR_SurveyDataTibDirect # direct tibble version
# Output specific data points
paste0("The age column of the tibble")
VAR_SurveyDataTib$Age
paste0("The value in cell 1,1 of the tibble: ", VAR_SurveyDataTib[1,1])

In this code snippet, there are several new things happening:

  • Lines 2–9: Loads the core packages of the tidyverse.

  • Lines 12–17: Reads a csv file into a data frame, then converts that data frame to a tibble.

  • Lines 19–23: Reads a csv file directly into a tibble.

  • Lines 25–30: Outputs our data frame as well as the two tibbles that we’ve created.

  • Lines 33 and 34: Accesses specific elements of our tibble using two different methods.
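The steps above can also be tried without the lesson’s data file. The following is a minimal sketch that writes a small, made-up CSV to a temporary file, loads it with both methods, and then compares a few ways of accessing elements. The file contents and the names survey_df, survey_tib, and survey_direct are assumptions for illustration only.

```r
library(tibble)
library(readr)

# Write a small hypothetical survey file to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Age,Region", "31,North", "45,South", "27,East"), csv_path)

# Method 1: read into a base R data frame, then convert
survey_df  <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE)
survey_tib <- as_tibble(survey_df)

# Method 2: read straight into a tibble
survey_direct <- read_csv(csv_path, show_col_types = FALSE)

# Column access: $ and [[ both return a plain vector
ages_dollar  <- survey_tib$Age
ages_bracket <- survey_tib[["Age"]]

# Cell access: [1, 1] keeps a 1 x 1 tibble, [[1, 1]] extracts the bare value
cell_tibble <- survey_tib[1, 1]
cell_value  <- survey_tib[[1, 1]]

class(cell_tibble)
cell_value
```

When a single value is needed, for example inside paste0(), double brackets are usually the safer choice, because single brackets on a tibble always return another tibble.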

The as_tibble() function

In method one, we’ve loaded MySurveyData.csv using the read.csv() function from base R. This gives us a data frame, which we’ve named VAR_SurveyDataDF. Any data frame can be converted to a tibble using the as_tibble() function. The benefit here is that if we work outside of the tidyverse in a portion of our code before moving back into the tidyverse, ...