Introduction to the ggplot2 Library

Learn about the ggplot2 library.

In R, we can create a wide variety of visualizations, from simple scatterplots and histograms to more complex interactive graphics and geospatial maps. Base R provides built-in functions for visualization, but their capabilities are limited. To choose from a wider range of chart options, we will use the ggplot2 package.

ggplot2 functions

We sometimes need to customize many features of a chart to get the result we want. Instead of relying on a single function that does everything, the ggplot2 library offers a layered, puzzle-like structure: distinct functions determine different chart features, and we combine these functions using the + sign.

For example, we can use the ggplot() function to specify the data frame and its columns, and use labs() to modify the chart's labels, such as the title and the axis labels.
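As a minimal sketch of how these pieces fit together with the + sign, the example below assumes the built-in mtcars data frame (not part of this lesson) and adds geom_point() so that something is drawn. The label texts are illustrative choices only.

```r
library(ggplot2)

# mtcars ships with R; it is used here only as an illustrative data frame
p <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +  # data frame and axis columns
  geom_point() +                                     # draw one point per row
  labs(title = "Fuel economy by weight",             # chart title
       x = "Weight (1000 lbs)",                      # x-axis label
       y = "Miles per gallon")                       # y-axis label

print(p)
```

Each function contributes one piece of the chart, so a piece can be swapped or removed without rewriting the rest.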

Let’s explain these functions in detail.

ggplot() function

ggplot() is the first and essential layer of the ggplot2 structure. It is designed to operate on data frames. Other data containers, such as plain vectors and lists, generally need to be converted to a data frame first (for example, with data.frame()).
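If the values live in separate vectors, one way to prepare them for ggplot() is to combine them into a data frame first. The sketch below uses hypothetical vectors named heights and weights, invented purely for illustration.

```r
library(ggplot2)

# Hypothetical example values stored as plain vectors
heights <- c(150, 160, 170)
weights <- c(50, 60, 70)

# Combine the vectors into a data frame so ggplot() can work with them
people <- data.frame(height = heights, weight = weights)

p_people <- ggplot(data = people, aes(x = height, y = weight))
```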

In this function, we pass the data frame through the data argument. Then, the aes() function, which is nested in ggplot(), determines the columns to be used on the x and y axes of the chart.

Here is the most basic structure of the ggplot() layer:

# Syntax structure
ggplot(data = <variable_name>, aes(x = <column1>, y = <column2>))
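To make the placeholders concrete, here is a small sketch that again assumes the built-in mtcars data frame, with wt and mpg standing in for the two columns. On its own, this layer sets up only the axes and the coordinate grid; no data is drawn until a geom layer is added.

```r
library(ggplot2)

# mtcars stands in for <variable_name>; wt and mpg stand in for the columns
base_plot <- ggplot(data = mtcars, aes(x = wt, y = mpg))

# Printing shows an empty panel with axes, because no geom layer exists yet
print(base_plot)
```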

Optionally, we can add a legend by mapping a column to an aesthetic such as color (or shape, size, and so on) inside aes(). ggplot2 then automatically creates a legend showing how the categories of the given column differ in that aesthetic.
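The following sketch illustrates the idea with the built-in mtcars data frame, which is an assumption made for this example. The cyl column is wrapped in factor() so that it is treated as categories rather than as a continuous number, and geom_point() is added so that the colored points and their legend are actually drawn.

```r
library(ggplot2)

# Map the cylinder count (as categories) to the color aesthetic inside aes()
colored <- ggplot(data = mtcars,
                  aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point()  # points are colored by category; a legend is added automatically

print(colored)
```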

geom_...() function

The geom functions are also essential layers in creating a ggplot2 ...
