Crosstabs and Pivot Tables in R
Learn about pivot tables and crosstabs and how to work with them.
Create crosstabs in R
Crosstabs, also known as contingency tables, display the frequency or count of observations for each combination of two or more variables. They are used to explore how categorical variables are related to each other.
Imagine that we have a data frame that records the name, age, profession, address, and location of each person in a population. Our goal is to count how often each profession occurs in each location, such as the number of data analysts in the U.S. or civil engineers in Japan.
There are several ways to create crosstabs in R. We’ll go through some of them in this lesson.
Using the table() and ftable() functions
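The table() and ftable() functions are called the same way: each takes two or more categorical vectors, such as data frame columns, as arguments. The difference is in the output. ftable() returns a flat table, which is easier to read when more than two variables are crossed.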
# Syntax structure
table(<variable1>, <variable2>)
ftable(<variable1>, <variable2>)
Let’s see how we can summarize a dataset as a crosstab using these functions.
# We use the 'jobs' dataset in this example.
print(head(jobs)) # Preview of the dataset
# Create a crosstab that shows the frequency relationship between the country and the profession.
print('--------------------------------------------------------')
print(table(jobs$professions, jobs$country))
print('--------------------------------------------------------')
# Do the same using the ftable() function
print(ftable(jobs$professions, jobs$country))
print('--------------------------------------------------------')
# Create a crosstab that shows the frequency relationship between the gender and the profession.
print(table(jobs$gender, jobs$professions))
print('--------------------------------------------------------')
# Do the same using the ftable() function
print(ftable(jobs$gender, jobs$professions))
- Line 5: We create a crosstab between the professions and country columns with the table() function, revealing the number of people for each combination of profession and country.
- Lines 7–8: We create the same table using the ftable() function.
- Lines 11–14: We do the same thing for the gender and professions columns.
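Because the jobs dataset is not reproduced here, the following is a minimal sketch on a made-up stand-in. The jobs_demo data frame and all of its values are assumptions for illustration only. It shows where table() and ftable() differ in practice: with three variables, table() prints one two-dimensional slice per level of the third variable, while ftable() lays out the same counts in a single flat table.

```r
# Hypothetical stand-in for the lesson's 'jobs' dataset (all values are made up).
jobs_demo <- data.frame(
  professions = c("Data Analyst", "Civil Engineer", "Data Analyst",
                  "Civil Engineer", "Data Analyst", "Data Analyst"),
  country     = c("U.S.", "Japan", "U.S.", "U.S.", "Japan", "U.S."),
  gender      = c("F", "M", "M", "F", "F", "M")
)

# Two-way crosstab: rows are professions, columns are countries.
two_way <- table(jobs_demo$professions, jobs_demo$country)
print(two_way)

# Three-way crosstab: table() prints one 2-D slice per gender.
three_way <- table(jobs_demo$professions, jobs_demo$country, jobs_demo$gender)
print(three_way)

# ftable() flattens the same counts into a single 2-D layout:
# profession and country combinations as rows, gender as columns.
flat <- ftable(jobs_demo$professions, jobs_demo$country, jobs_demo$gender)
print(flat)
```

Individual cells can be read by name, for example two_way["Data Analyst", "U.S."].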
Using the xtabs() function
Another way to create cross tables is using the xtabs() function.
This function uses a formula interface, so its syntax is a little different from the others: the variables are written after a tilde sign (~) and the data frame is passed separately through the data argument.
The plus (+) sign separates the variables, so more than two variables can be cross-classified in a single call.
xtabs(~<column1> + <column2>, data = <dataset>)
Here are some examples:
# We use the 'jobs' dataset in this example.
print(head(jobs)) # Preview of the dataset
# Create a crosstab that shows the frequency relationship between the country and the gender.
print('--------------------------------------------------------')
print(xtabs(~country + gender, data = jobs))
print('--------------------------------------------------------')
# Create a crosstab that shows the frequency relationship between the country and the professions.
print(xtabs(~country + professions, data = jobs))
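As a self-contained sketch of the formula interface, the example below uses a small made-up data frame. The name jobs_demo and its values are assumptions, not the lesson's actual jobs dataset. It shows a two-variable call, a three-variable call built with an extra +, and a check that the counts agree with table() on the same columns.

```r
# Hypothetical stand-in for the lesson's 'jobs' dataset (all values are made up).
jobs_demo <- data.frame(
  professions = c("Data Analyst", "Civil Engineer", "Data Analyst",
                  "Civil Engineer", "Data Analyst", "Data Analyst"),
  country     = c("U.S.", "Japan", "U.S.", "U.S.", "Japan", "U.S."),
  gender      = c("F", "M", "M", "F", "F", "M")
)

# Two-way crosstab; xtabs() labels each dimension with its column name.
by_country_gender <- xtabs(~ country + gender, data = jobs_demo)
print(by_country_gender)

# A third variable added with '+' produces a three-way table.
three_way <- xtabs(~ country + gender + professions, data = jobs_demo)
print(ftable(three_way))  # flattened for a compact display

# The counts are the same as those from table() on the same columns.
same_counts <- all(
  as.vector(by_country_gender) ==
    as.vector(table(jobs_demo$country, jobs_demo$gender))
)
print(same_counts)
```

One practical difference from table() is visible in the printed output: because xtabs() receives column names through the formula, the rows and columns are labeled country and gender without any extra arguments.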