Line Chart
Learn how to create line charts with ggplot2.
Basic line chart in ggplot2
In this lesson, we’ll build line charts, which are well suited to showing trends over time and are commonly used for time series data.
Let’s begin by loading the gapminder package, which contains a subset of the gapminder dataset on life expectancy, GDP per capita, and population by country. We’ll also print the first few rows of the gapminder dataset.
library(gapminder)
head(gapminder)
- Line 1: We load the gapminder package with the library() command in R. This also makes the built-in gapminder dataset available.
- Line 2: We print the first few rows of the dataset with the head() function.
Note: The library and the dataset have the same name.
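Before plotting, it can help to check which columns the dataset actually provides. The following is a minimal sketch that uses base R's names() function to list the columns and to test whether a total GDP column is present; the column name "gdp" tested here is only an illustration.

```r
library(gapminder)

# List the column names of the built-in dataset
names(gapminder)

# Check whether a total GDP column exists
# (gdpPercap holds per-capita values only)
"gdp" %in% names(gapminder)
```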
Suppose we want to plot a basic line graph of GDP vs. year for a particular country, the United States. The dataset has no column with total GDP values; it only provides GDP per capita (gdpPercap) and population (pop). We’ll therefore need to calculate GDP from those two columns.
To get the values we need for the line plot, we’ll process the data with functions from the dplyr package (load it with library(dplyr) if it isn’t already attached), as shown below:
us_data <- gapminder %>%
  filter(continent == "Americas", country == "United States")
- Line 1: We create a new data frame, us_data, to store the output. We use the pipe operator, %>%, to pass the gapminder dataset to the filter() function.
- Line 2: The filter() function keeps only the rows where the continent is Americas and the country is United States.
Note: In the above code, we use the %>% pipe operator, which is written as a percent sign, a greater-than sign, and another percent sign. It passes what’s on its left as the first argument to the function on its right.
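To make the note above concrete, here is a minimal sketch comparing a piped call with the equivalent direct call. The variable names piped and direct are illustrative only, and the sketch assumes the dplyr package is installed.

```r
library(gapminder)
library(dplyr)

# Piped form: gapminder becomes the first argument of filter()
piped <- gapminder %>%
  filter(country == "United States")

# Equivalent direct call without the pipe
direct <- filter(gapminder, country == "United States")

# Both forms should return the same rows
identical(piped, direct)
```

The piped form tends to read more naturally once several steps are chained together, since each step appears in the order it runs.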
Another useful function we need here is mutate(), which modifies existing columns or creates new ones in a data frame.
Next, we’ll create a new column titled gdp (gross domestic product) and then filter the data, so that the us_data data frame is ready for a line chart of a single country, the United States.
us_data <- gapminder %>%
  mutate(gdp = gdpPercap * pop) %>%
  filter(continent == "Americas", country == "United States")
head(us_data)
- Line 1: We create a new data frame, us_data, to store the output and use the %>% operator to pass the gapminder dataset to the mutate() function.
- Line 2: The mutate() function creates a new column named gdp by multiplying the gdpPercap and pop columns.
- Line 3: We use the %>% operator again to pass the output of the mutate() function to the filter() function, which keeps only the rows for the Americas continent and the United States country.
- Line 4: We print the first few rows of the newly created us_data data frame.
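As a quick sanity check on the steps above, the following sketch rebuilds us_data and confirms that the new gdp column matches the product of its two source columns. It assumes the gapminder and dplyr packages are installed; the checks themselves are an illustration rather than part of the lesson's original code.

```r
library(gapminder)
library(dplyr)

us_data <- gapminder %>%
  mutate(gdp = gdpPercap * pop) %>%
  filter(continent == "Americas", country == "United States")

# The new column should equal the product of its two source columns
all.equal(us_data$gdp, us_data$gdpPercap * us_data$pop)

# Number of rows (one per recorded year) and the span of years covered
nrow(us_data)
range(us_data$year)
```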
Now, we’ll build our first line graph by applying the geom_line() function to the filtered data frame.
library(ggplot2)
ggplot(us_data) +
  geom_line(aes(x = year, y = gdp))
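The basic chart can be made easier to read with a title and axis labels. The sketch below is one possible variation, not the lesson's own code: it rescales GDP to trillions in a hypothetical gdp_trillions column and adds labels with labs(). The label text is an assumption chosen for illustration.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Compute GDP in trillions so the y-axis avoids very large numbers
us_data <- gapminder %>%
  mutate(gdp_trillions = gdpPercap * pop / 1e12) %>%
  filter(country == "United States")

# Same line chart as before, with a title and axis labels added
p <- ggplot(us_data, aes(x = year, y = gdp_trillions)) +
  geom_line() +
  labs(
    title = "GDP of the United States",
    x = "Year",
    y = "GDP (trillions)"
  )

p
```

Placing aes() inside ggplot() rather than inside geom_line() draws the same chart here; it simply makes the mapping available to any layers added later.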