Scatter Plot
Learn how to build a scatter plot and explore different aspects of data visualization in ggplot2.
Important concepts for data visualization
To create data visualizations in ggplot2, we need to be familiar with the following:
- The essential concepts for data visualization
- The theory of grammar of graphics
- The elements of the grammar of graphics
- Some key dplyr commands to work with datasets in R
- The available ggplot2 support packages
Now, we’ll learn how to create scatter plots and customize them in ggplot2.
Scatter plots in ggplot2
Each point in a scatter plot represents one observation described by two numerical variables: its position on the horizontal axis gives the value of one variable, and its position on the vertical axis gives the value of the other. Scatter plots are frequently used to explore relationships between variables and to identify patterns in the data.
Let’s consider the mpg dataset, which ships with the ggplot2 package, as an example.
We begin by loading the ggplot2 package and printing the first ten rows of the dataset with the following commands:
library(ggplot2)
head(mpg, n = 10)
- Line 1: We load the ggplot2 package using the library() command in R.
- Line 2: We use the head() command in R and specify the number of rows, n = 10, to print the first ten rows of the dataset.
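Before plotting, it can help to confirm the size of the dataset and which of its columns are numerical, since a scatter plot needs two numerical variables. The following is a minimal sketch rather than part of the lesson's own code; the name numeric_cols is a hypothetical variable introduced here for illustration.

```r
library(ggplot2)

# Number of rows and columns in the mpg dataset
dim(mpg)

# Names of the columns that hold numerical values
# (numeric_cols is an illustrative name, not from the lesson)
numeric_cols <- names(mpg)[sapply(mpg, is.numeric)]
print(numeric_cols)
```

Any two of the columns listed in numeric_cols can be mapped to the x and y axes of a scatter plot.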
Here is the basic ggplot2 code for a scatter plot:
ggplot(mpg) +
  geom_point(aes(x = displ, y = hwy))
- Line 1: We initialize a new ggplot object with the ggplot() function and pass the name of the mpg dataset. Using the + operator, we add a layer to the ggplot object.
- Line 2: We use the geom_point() function to create a scatter plot. Next, we use the aes() function for mapping the variables to aesthetics, where x and y are mapped to the displ and hwy variables, respectively.
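As an alternative sketch, the same aesthetic mapping can be declared once inside ggplot() instead of inside geom_point(), and the plot can be stored in a variable before it is drawn. This is one possible arrangement, not the form the lesson uses; the variable name p is a hypothetical choice made here for illustration.

```r
library(ggplot2)

# Same scatter plot, with the mapping declared in ggplot()
# so that any layer added later inherits it
# (p is an illustrative name, not from the lesson)
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Printing the stored object draws the plot
print(p)
```

Declaring the mapping in ggplot() can be convenient when several layers share the same x and y variables, whereas a mapping placed inside geom_point() applies to that layer only.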
Note: The explanation for the above two lines of code will ...