Special Data Type: Tibbles
Learn to use tibbles in the R tidyverse, understand why they’re more efficient and user-friendly, and create basic summaries.
The first step in using the tidyverse is to move from data frames to tibbles, the tidyverse equivalent of data frames. Tibbles clean up some of the legacy behaviors of data frames that can be frustrating in modern data science contexts, and they give cleaner, easier-to-read output.
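As a rough illustration of the legacy behaviors mentioned above, the sketch below compares two of them on a small, invented dataset. The names survey_df and survey_tib and the values in them are assumptions made for this example only.

```r
library(tibble)

# Hypothetical survey data, invented for illustration
survey_df <- data.frame(
  Age = c(34, 27, 45),
  Region = c("North", "South", "East"),
  stringsAsFactors = FALSE
)
survey_tib <- as_tibble(survey_df)

# Selecting a single column with [ ]:
# the data frame silently drops to a plain vector, the tibble stays a tibble
class(survey_df[, "Age"])
class(survey_tib[, "Age"])

# Partial name matching with $:
# the data frame quietly matches "Reg" to "Region",
# the tibble returns NULL and warns about the unknown column
survey_df$Reg
survey_tib$Reg
```

The tibble behavior is stricter, which tends to surface typos in column names early rather than letting them pass unnoticed.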
It’s worth noting, however, that being a tibble doesn’t make a dataset tidy. Any data frame, tidy or messy, can be converted to the tibble format, so we may still need to do some cleaning to meet the criteria for tidy data. This tidying ensures that tidyverse functions work appropriately on our dataset.
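To make the distinction concrete, here is a minimal sketch of a tibble that is valid but not tidy, followed by one possible way to tidy it. The data, the names messy_tib and tidy_tib, and the choice of pivot_longer() from tidyr are assumptions for illustration, not steps taken from this lesson.

```r
library(tibble)
library(tidyr)

# Hypothetical data: one row per respondent, one column per survey year.
# This is a perfectly valid tibble, but the year is stored in the column
# names rather than in a column of its own, so it is not tidy.
messy_tib <- tibble(
  Respondent = c("A", "B"),
  `2022` = c(3, 5),
  `2023` = c(4, 4)
)

# Reshape so that each row holds one observation: respondent, year, score
tidy_tib <- pivot_longer(
  messy_tib,
  cols = c(`2022`, `2023`),
  names_to = "Year",
  values_to = "Score"
)

tidy_tib
```

Both objects are tibbles; only the second has one variable per column and one observation per row.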
Creating and accessing tibbles
To create a tibble, we first need to load the tidyverse. We then have a few possible creation methods available to us. These are shown in the example below.
```r
#Load base tidyverse libraries
library(ggplot2)
library(purrr)
library(tibble)
suppressPackageStartupMessages(library(dplyr))
library(tidyr)
library(stringr)
library(readr)
library(forcats)

#Load data

#Method 1 – Dataframe to tibble
#Read data from csv file into a dataframe
VAR_SurveyDataDF <- read.csv("MySurveyData.csv",
                             header = TRUE,
                             stringsAsFactors = FALSE)

#Convert dataframe to a tibble
VAR_SurveyDataTib <- as_tibble(VAR_SurveyDataDF)

#Method 2 – Tibble directly
VAR_SurveyDataTibDirect <- read_csv("MySurveyData.csv",
                                    col_names = TRUE,
                                    skip = 0,
                                    n_max = Inf,
                                    show_col_types = FALSE)

#output results
paste0("A dataframe")
VAR_SurveyDataDF #dataframe version

paste0("A tibble created from a dataframe")
VAR_SurveyDataTib #tibble from dataframe version

paste0("A directly created tibble")
VAR_SurveyDataTibDirect #direct tibble version

#output specific data points
paste0("The age column of the tibble")
VAR_SurveyDataTib$Age

paste0("The value in cell 1,1 of the tibble: ", VAR_SurveyDataTib[1,1])
```
In this code snippet, there are several new things happening:
- Lines 2–9: Load the core packages of the tidyverse.
- Lines 15–20: Read a CSV file into a data frame, then convert that data frame to a tibble.
- Lines 23–27: Read a CSV file directly into a tibble.
- Lines 31–37: Output our data frame as well as the two tibbles that we’ve created.
- Lines 41 and 43: Access specific elements of our tibble using two different methods.
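The two access styles in the last item can be tried on a small stand-in tibble without needing the CSV file. This is a sketch only: survey_tib and its values are invented here, and the double-bracket forms are shown as an optional extra for when a bare value is wanted instead of a tibble.

```r
library(tibble)

# Hypothetical stand-in for the survey data
survey_tib <- tibble(
  Age = c(34, 27, 45),
  Region = c("North", "South", "East")
)

# Column access by name: returns the whole column as a vector
survey_tib$Age
survey_tib[["Age"]]

# Cell access by position with single brackets: returns a 1 x 1 tibble
survey_tib[1, 1]

# Cell access with double brackets: returns the bare value
survey_tib[[1, 1]]
```

Single brackets always give back a tibble, so the double-bracket form may be the more convenient choice when the value is passed on to a function that expects a plain number or string.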
The as_tibble() function
In method one, we loaded MySurveyData.csv using read.csv(), which is a base R function. This gives us a data frame that we’ve named VAR_SurveyDataDF. Any data frame can be converted to a tibble using the as_tibble() function. The benefit here is that if we work outside of the tidyverse in a portion of our code before moving back into the tidyverse, ...