Visualization with tm
Explore NLP visualization in R using the tm package and base R's plotting functions.
Visualizing NLP with R
The tm package provides tools for manipulating natural language datasets and drawing statistical conclusions from them. It provides surprisingly few tools for visualizing those conclusions, relying instead on the rich set of graphics available through base R and other packages.
Here’s an example of using tm to create a data object from a novel and then using the barplot() function available through base R:
library(tm, quietly = TRUE)

jpeg("output/nlpvisplot.jpeg", width = 1000)

DirSource(directory = "data", pattern = "mws.+txt") |>
  Corpus() |>
  DocumentTermMatrix(control = list(tolower = TRUE,
                                    stopwords = TRUE,
                                    removePunctuation = TRUE,
                                    removeNumbers = TRUE)) |>
  as.matrix() |>
  colSums() |>
  sort(decreasing = TRUE) |>
  (\(myobj) myobj[1:50])() |>
  barplot(main = "Mary Shelley - Most Frequent Terms", las = 2)

dev.off()
Let’s break this code down step by step:
...
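To experiment with the same pipeline without the novel's text files, it may help to run it on a few in-memory sentences. The following is a minimal sketch, not the lesson's own code: the three sentences in docs are made up for illustration, VectorSource() and VCorpus() stand in for DirSource() and Corpus(), and head() replaces the anonymous function that selects the top terms. The plot is drawn on the default graphics device rather than written to a JPEG file.

```r
library(tm, quietly = TRUE)

# Hypothetical toy documents standing in for the novel's text files
docs <- c(
  "The monster fled across the ice.",
  "The monster spoke, and the creator listened.",
  "Ice and snow covered the ship."
)

# Build a document-term matrix with the same cleaning options as above,
# then total each term's count across all documents and sort descending
term_counts <- VectorSource(docs) |>
  VCorpus() |>
  DocumentTermMatrix(control = list(tolower = TRUE,
                                    stopwords = TRUE,
                                    removePunctuation = TRUE,
                                    removeNumbers = TRUE)) |>
  as.matrix() |>
  colSums() |>
  sort(decreasing = TRUE)

# Keep the five most frequent terms; head() also copes with short vectors
top_terms <- head(term_counts, 5)

# las = 2 rotates the axis labels so the terms stay readable
barplot(top_terms, main = "Toy Corpus - Most Frequent Terms", las = 2)
```

Printing term_counts before plotting is a quick way to check that stopwords such as "the" and "and" were dropped and that punctuation did not split a term into separate entries. One design note: head() returns fewer elements when the vector is short, whereas indexing with [1:50] pads the result with NA values if fewer than 50 terms exist.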