Faceting and Saving Plots

Learn to create matrices of related plots using facets and export plots using ggsave.

As data scientists, we often need to look at many plots at once and share those plots with others. Combinations of related plots are essential for understanding and communicating insights from data. The ggplot2 package provides two convenient features to support this need. The first is faceting, which allows us to create a set of plots displayed as a matrix, each showing a subset of the data. The second is ggsave, which enables us to export our plots to several different file formats for use outside of R. In this lesson, we’ll explore facets, how to use them to create a matrix of plots, and how to save plots in different formats for various purposes.

Matrix of plots created with facets

Facets

Faceting is a valuable technique for creating a matrix of plots. It lets us explore relationships between variables across different subsets of the data. In data science, that usually means checking whether a relationship varies across categories in the data. For instance, the iris dataset has the columns Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species, which hold the sepal and petal dimensions of each flower and its species. A common question might be, “What’s the relationship between Sepal.Length and Petal.Length, and does that relationship change between species of flowers?” The functions facet_grid and facet_wrap allow us to quickly create a matrix of plots in which the categorical variable Species determines the subset of data shown in each plot.
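As a minimal sketch of the iris question above, the code below splits one scatter plot into one panel per species. It assumes ggplot2 is installed and uses R's built-in iris dataset. The object names p, p_wrap, and p_grid are illustrative choices, not names the lesson defines.

```r
library(ggplot2)

# Base scatter plot: sepal length against petal length for every flower
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length)) +
  geom_point()

# One panel per species; facet_wrap chooses the row/column layout
p_wrap <- p + facet_wrap(~ Species)

# The same split as a single-row grid, with Species defining the columns
p_grid <- p + facet_grid(cols = vars(Species))

# Printing a ggplot object draws it on the active graphics device
print(p_wrap)
```

Because iris has only one categorical variable, both calls produce three panels here. The two functions differ mainly in how they lay out the panels once more faceting variables or more categories are involved.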

  • facet_grid: This arranges related plots in a regular grid of rows and columns, with each plot cell corresponding to a unique combination of the faceting variables. For instance, we may create a set of two-dimensional scatter plots that all share the same x and y variables, and we may want to check whether the relationship varies across combinations of two other variables, which we’ll call A and B. If A defines the row faceting and B defines the column faceting, each plot shows the subset of the data for one combination of A and B, at the corresponding row and column of the matrix. The facet_grid function is most useful for datasets with two categorical variables, like A and B in the example above.
Facet grid components
  • facet_wrap: This is similar but more flexible about the layout of the plots in the matrix. Generally, facet_wrap is used to create a “wrapped” structure of plots. The plots are arranged in one or more rows, and ggplot2 can automatically determine how many columns
...