Scatter Plot

Learn how to build a scatter plot and explore different aspects of data visualization in ggplot2.

Important concepts for data visualization

To create data visualizations in ggplot2, we need to be familiar with the following:

  • The essential concepts for data visualization
  • The theory of grammar of graphics
  • The elements of the grammar of graphics
  • Some key dplyr commands to work with datasets in R
  • The available ggplot2 support packages

Now, we’ll learn how to create scatter plots and customize them in ggplot2.

Scatter plots in ggplot2

Each point in a scatter plot represents one observation: its horizontal position gives the value of one numerical variable, and its vertical position gives the value of another. Scatter plots are frequently used to explore relationships between two variables and to identify patterns in the data.

Let’s consider the mpg dataset, which ships with the ggplot2 package, as an example.

We begin by loading the ggplot2 package and printing the first ten rows of the dataset with the following commands:

library(ggplot2)
head(mpg, n = 10)
  • Line 1: We load the ggplot2 package using the library() function in R.
  • Line 2: We use the head() function in R and specify the number of rows, n = 10, to print the first ten rows of the dataset.
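
Before plotting, it can help to check the size of the dataset and the types of the two columns used in the scatter plot. The following is a minimal sketch that assumes ggplot2 is installed; the variable names first_ten and plot_columns are chosen here for illustration only.

```r
library(ggplot2)

# Keep the first ten rows in a variable so they can be reused later
first_ten <- head(mpg, n = 10)
print(first_ten)

# Number of rows and columns in the full dataset
print(dim(mpg))

# Structure and summary of the two columns plotted in this lesson
plot_columns <- mpg[, c("displ", "hwy")]
str(plot_columns)
print(summary(plot_columns))
```

Both displ and hwy should be reported as numeric columns, which is what a scatter plot expects on each axis.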

Here is the basic ggplot2 code for a scatter plot:

ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy))
  • Line 1: We initialize a new ggplot object with the ggplot() function and pass it the mpg dataset. Using the + operator, we add a layer to the ggplot object.
  • Line 2: We use the geom_point() function to draw the points of the scatter plot. Inside it, we use the aes() function to map variables to aesthetics: x is mapped to displ and y is mapped to hwy.
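
As a variation on the mapping described above, the aesthetics can also be declared once in ggplot(), so that any layer added afterwards inherits them. The sketch below assumes ggplot2 is installed; the variable name p and the axis label text are illustrative choices, not part of the original example.

```r
library(ggplot2)

# Declare the mapping in ggplot() so geom_point() inherits it
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  # Replace the raw column names with readable axis labels
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon"
  )

# Printing the object draws the plot
print(p)
```

Storing the plot in a variable makes it easy to add further layers later without repeating the mapping.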

Note: The explanation for the above two lines of code will ...