Dataset References
Learn more about some important datasets that are useful for building data visualizations in ggplot2.
Reference datasets for ggplot2 visualizations
The R datasets package is one of several packages maintained by the R Core Team and included with the base R installation. By calling the data() function without any arguments, we can list the datasets in all currently loaded packages, which includes the built-in ones.
data()
Let’s look at the datasets available in the ggplot2 package using the code below:
data(package="ggplot2")
Note: We can replace ggplot2 in the above command with any installed package (for example, MASS) to list the datasets available in that package.
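The listing that data() prints can also be captured as a value. As a minimal sketch, the helper below (list_datasets is a hypothetical name of our own, not part of any package) pulls the dataset names and titles out of the results matrix that data() returns, so they can be filtered or searched:

```r
# Hypothetical helper: return the datasets shipped with an installed package
# as a data frame instead of opening the interactive listing.
list_datasets <- function(pkg) {
  # data() returns an object whose `results` matrix has the columns
  # "Package", "LibPath", "Item", and "Title".
  res <- data(package = pkg)$results
  data.frame(
    name  = res[, "Item"],
    title = res[, "Title"],
    stringsAsFactors = FALSE
  )
}

# The datasets package ships with base R, so this call needs no extra installs.
base_sets <- list_datasets("datasets")
head(base_sets, n = 5)

# The same call works for any installed package, for example:
# list_datasets("ggplot2")
# list_datasets("MASS")
```

Returning a data frame makes it easy to search the titles, for instance with grepl(), when we are looking for a dataset on a particular topic.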
Together, the base R installation and ggplot2 offer several useful built-in datasets. Let’s familiarize ourselves with some of them. We’ll load each dataset and print the first ten rows to get an idea of the variables it contains.
The mpg dataset
This is one of the popular datasets used in the data science community. The mpg dataset is a built-in dataset from the ggplot2 package. It consists of a subset of the fuel economy data provided by the EPA.
This dataset contains data about the fuel economy of major car models between 1999 and 2008.
Note: We can browse and download this dataset from the official website of the US Department of Energy.
library(ggplot2)
head(mpg, n = 10)
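Since these datasets are meant for building visualizations, it may help to see mpg in a plot. The sketch below assumes ggplot2 is installed. The choice of displ (engine displacement), hwy (highway miles per gallon), and class as the mapped columns is ours for illustration, not something prescribed by the dataset:

```r
library(ggplot2)

# Inspect the structure: mpg has 234 rows and 11 variables.
str(mpg)

# Scatter plot of engine displacement against highway fuel economy,
# colored by vehicle class.
p <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    color = "Vehicle class"
  )

print(p)
```

Any other pair of numeric columns, such as cty and hwy, can be swapped into aes() in the same way.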
The mtcars dataset
The mtcars (Motor Trend Car Road Tests) dataset is another commonly used dataset for data science projects. This dataset provides the fuel consumption data collected for automobiles and ten attributes of automotive ...