...

More Charts with ggplot2

Learn to build a few advanced charts, such as the word cloud, parliament diagram, waffle chart, and hexbin chart, using ggplot2.

Introduction to word cloud

A word cloud is a graph representing text data by frequency, i.e., the most frequently occurring words in a text are displayed with larger font sizes. This chart type helps visualize the main topics in a text and can be used to quickly identify important words or phrases.
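
To make the idea of "frequency" concrete, here is a minimal base R sketch that counts how often each word occurs in a short text. The sample sentence and the names text, words, word_counts, and word_freq are made up for illustration and are not part of the lesson's dataset.

```r
# Hypothetical sample text, used only for illustration
text <- "the quick brown fox jumps over the lazy dog and the fox runs"

# Lowercase the text and split it into words on whitespace
words <- strsplit(tolower(text), "\\s+")[[1]]

# Count occurrences of each word and sort from most to least frequent
word_counts <- sort(table(words), decreasing = TRUE)

# Convert the counts to a data frame with one row per word
word_freq <- data.frame(
  word = names(word_counts),
  freq = as.integer(word_counts)
)

head(word_freq, 3)
```

A table like word_freq, with one column of words and one column of counts, is the kind of input a word cloud needs: the counts decide how large each word is drawn.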

Getting started with the ggwordcloud package

We’ll create a word cloud in ggplot using the ggwordcloud package, an R package designed specifically for creating word clouds.

First, let’s import the ggwordcloud package:

library(ggwordcloud)

The ggwordcloud package provides several example datasets that can be used to create word clouds.

For example, we’ll use the thankyou_words_small dataset from the ggwordcloud package, which contains the phrase “Thank you” in different languages, along with each language’s name, its number of native speakers, and its total number of speakers.

We print a few rows of the dataset with the following code:

head(thankyou_words_small)

Basic word cloud using ggplot2

The ggwordcloud package extends ggplot2 with a specialized geom, geom_text_wordcloud(), for creating word clouds.

We can generate the word cloud using the code below:

ggplot(thankyou_words_small) +
  geom_text_wordcloud(aes(label = name))
  • Line 1: We initialize a new ggplot object with the ggplot() function and pass the name of the dataset thankyou_words_small. Using the + operator, we add a layer to the ggplot object.
  • Line 2: We use the geom_text_wordcloud() function to create a word cloud. Next, we use the aes() function to specify that the text for the word cloud should come from the name variable.

Modifying the text size in a word cloud

In the previous word cloud, all the displayed words were the same size. However, it is possible to set the size of the words based on a numerical variable.

Here is an example:

ggplot(thankyou_words_small) +
  geom_text_wordcloud(aes(label = name,
                          size = speakers))
  • Line 3: We pass the speakers variable to the size argument of the aes() function so that each word is sized according to the number of speakers of that language.
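
The same size mapping should work for any data frame that has a text column and a numeric column. The sketch below assumes a small, made-up data frame (word_freq, with hypothetical columns word and freq) rather than the lesson's dataset. It also calls set.seed() first, on the assumption that word placement involves randomness and the layout would otherwise change between runs.

```r
library(ggplot2)
library(ggwordcloud)

# Hypothetical word counts, made up for this example
word_freq <- data.frame(
  word = c("data", "plot", "chart", "color", "scale", "layer"),
  freq = c(12, 9, 7, 4, 3, 2)
)

# Fix the random seed so the word layout is reproducible (assumed to be random)
set.seed(42)

# Map the word column to the label and the count column to the text size
p <- ggplot(word_freq) +
  geom_text_wordcloud(aes(label = word, size = freq))

p
```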

Word cloud text scaling and color customization

The default scaling of the text in the word cloud makes the words appear too small relative to the plot area. To gain better control over the font size, we’ll use the scale_size_area() function in ggplot2, which scales sizes so that a value of zero maps to a size of zero and whose max_size argument sets the size of the largest word.
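
As a rough illustration of this kind of area-based scaling, the base R sketch below approximates the mapping with a hypothetical helper, area_size(). It assumes that the size grows with the square root of the value, so that the drawn area is proportional to the value, and that the largest value receives max_size. It does not call ggplot2 internals, and the speaker counts are made up.

```r
# Hypothetical helper approximating an area-based size mapping:
# zero maps to zero, the largest value maps to max_size, and size
# grows with the square root of the value
area_size <- function(value, max_size = 20) {
  max_size * sqrt(value / max(value))
}

# Made-up speaker counts, for illustration only
speakers <- c(0, 25, 100, 400)

area_size(speakers)
```

Under this assumption, a language with a quarter of the speakers of the largest one is drawn at half its font size, not a quarter.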

In addition to adjusting the size of the words, we can also change the colors of the words based on a categorical variable. To do this, we’ll pass the variable name to the color argument of the aes() function.

Here is an example:

ggplot(thankyou_words_small) +
  geom_text_wordcloud(aes(label = name,
                          size = speakers,
                          color = name)) +
  scale_size_area(max_size = 20)
  • Line 4: We change the colors of the words by passing the name variable to the color argument of the aes() function. Because name is a categorical variable, ggplot2 applies its default discrete color scale, which assigns a distinct hue to each value of name.
  • Line 5: We set the max_size argument as 20 inside the scale_size_area() function for adjusting the text’s size based on the plot’s
...