More Charts with ggplot2
Learn to build a few advanced charts, such as the word cloud, parliament diagram, waffle chart, and hexbin chart, using ggplot2.
Introduction to word cloud
A word cloud is a graph representing text data by frequency, i.e., the most frequently occurring words in a text are displayed with larger font sizes. This chart type helps visualize the main topics in a text and can be used to quickly identify important words or phrases.
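To make the idea of frequency concrete, here is a minimal base R sketch that counts how often each word occurs in a short piece of text. The sample sentence and the names sample_text and word_counts are made up for illustration; they aren't part of the lesson's dataset.

```r
# Hypothetical sample text (not from the lesson's data)
sample_text <- "the quick brown fox jumps over the lazy dog the fox"

# Split into lowercase words on any run of non-letter characters
words <- strsplit(tolower(sample_text), "[^a-z]+")[[1]]
words <- words[nzchar(words)]  # drop empty strings, if any

# Count occurrences and sort from most to least frequent
freq <- sort(table(words), decreasing = TRUE)

# Store the counts in a data frame, the shape ggplot2 expects
word_counts <- data.frame(word = names(freq), n = as.integer(freq))
print(word_counts)
```

In a word cloud built from these counts, "the" (3 occurrences) would be drawn largest, followed by "fox" (2 occurrences).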
Getting started with the ggwordcloud package
We’ll create a word cloud in ggplot2 using the ggwordcloud package, an R package designed specifically for creating word clouds.
First, let’s load the ggwordcloud package, which also attaches ggplot2:
library(ggwordcloud)
The ggwordcloud package provides several example datasets that can be used to create word clouds.
Here, we’ll use the thankyou_words_small dataset from the ggwordcloud package, which contains the phrase “Thank you” in different languages, along with the number of native speakers and the total number of speakers of each language.
We print the first few rows of the dataset with the following code:
head(thankyou_words_small)
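Beyond head(), it can help to check the structure of the dataset before plotting. The sketch below assumes only the two columns used later in this lesson, name and speakers; any other columns that str() reports are not relied on here.

```r
library(ggwordcloud)

# Column names and types of the example dataset
str(thankyou_words_small)

# Confirm the two columns this lesson relies on are present
has_cols <- c("name", "speakers") %in% names(thankyou_words_small)
print(has_cols)

# Range of the speakers column, which later controls word size
summary(thankyou_words_small$speakers)
```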
Basic word cloud using ggplot2
The ggwordcloud package extends ggplot2 with a specialized geom, the geom_text_wordcloud() function, for creating word clouds.
We can generate the word cloud using the code below:
ggplot(thankyou_words_small) +
  geom_text_wordcloud(aes(label = name))
- Line 1: We initialize a new ggplot object with the ggplot() function and pass the name of the dataset, thankyou_words_small. Using the + operator, we add a layer to the ggplot object.
- Line 2: We use the geom_text_wordcloud() function to create a word cloud. Inside it, we use the aes() function to specify that the text for the word cloud should come from the name variable.
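The word placement may differ from one run to the next because the layout involves some randomness. As a hedged sketch, fixing the random seed before plotting should make the result repeatable; the seed value 42, the object name p, and the use of theme_minimal() are illustrative choices, not part of the lesson's code.

```r
library(ggplot2)
library(ggwordcloud)

# Fix the random seed so the word placement is the same on every run
set.seed(42)

p <- ggplot(thankyou_words_small) +
  geom_text_wordcloud(aes(label = name)) +
  theme_minimal()  # plain background, an optional styling choice

print(p)
```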
Modifying the text size in a word cloud
In the previous word cloud, all the displayed words were the same size. However, it is possible to set the size of the words based on a numerical variable.
Here is an example:
ggplot(thankyou_words_small) +
  geom_text_wordcloud(aes(label = name,
                          size = speakers))
- Line 3: We pass the speakers variable to the size argument of the aes() function so that the size of each word in the word cloud scales with the number of speakers of that language.
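The same size mapping works for any data frame with a label column and a numeric column. Here is a minimal sketch using hypothetical word counts; the data frame toy_counts and its columns word and n are invented for illustration.

```r
library(ggplot2)
library(ggwordcloud)

# Hypothetical word counts, invented for illustration
toy_counts <- data.frame(
  word = c("data", "plot", "chart", "layer", "scale"),
  n    = c(12, 8, 5, 3, 1)
)

set.seed(1)  # keep the layout repeatable

# Words with larger n are drawn with larger text
p_toy <- ggplot(toy_counts) +
  geom_text_wordcloud(aes(label = word, size = n))

print(p_toy)
```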
Word cloud text scaling and color customization
The default scaling of the text in the word cloud makes the words appear too small relative to the plot area. To gain better control over the font size, we’ll use the scale_size_area() function in ggplot2, which scales size so that a value of 0 maps to a size of 0 and lets us set the size of the largest word through its max_size argument.
In addition to adjusting the size of the words, we can also change the colors of the words based on a categorical variable. To do this, we’ll pass the variable name to the color argument of the aes() function.
Here is an example:
ggplot(thankyou_words_small) +
  geom_text_wordcloud(aes(label = name,
                          size = speakers,
                          color = name)) +
  scale_size_area(max_size = 20)
- Line 4: We change the colors of the words by passing the name variable to the color argument of the aes() function. The words are colored by that variable using ggplot2’s default discrete color palette (hues spanning red, green, and blue).
- Line 5: We set the max_size argument to 20 inside the scale_size_area() function for adjusting the text’s size based on the plot’s