quanteda Basics

Learn how to implement basic NLP tasks with quanteda.

In an earlier lesson, we compared tm, tidytext, and quanteda to get a brief overview of how each package implements text mining and natural language processing tasks. Here is the quanteda code presented in that lesson:

# install.packages("quanteda")
# install.packages("readtext")
library(quanteda, quietly = TRUE)
library(readtext)
textDF <- readtext(file = "data/mws*txt", docvarsfrom = "filenames")
quantCorpus <- corpus(textDF)
tokens(quantCorpus)

The code above uses the following readtext and quanteda functions:

  • Line 5: readtext reads the text files in the data directory whose names begin with mws and end in txt. This produces a data.frame with one row per document, in which the file name serves as the doc_id. Setting docvarsfrom = "filenames" additionally derives document variables (docvars) from the parts of each file name.

  • Line 6: corpus converts the data.frame into a quanteda corpus.

  • Line 7: tokens is used to tokenize the corpus. ...
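The corpus and tokens steps described above can be tried without the lesson's data files. The following is a minimal sketch, not the lesson's own code: it assumes a small hand-built data.frame in place of the readtext output, and the two file names and sentences are hypothetical placeholders. The remove_punct argument is also an addition that does not appear in the original call.

```r
library(quanteda, quietly = TRUE)

# Hypothetical stand-in for the data.frame that readtext() returns:
# one row per document, with doc_id and text columns.
textDF <- data.frame(
  doc_id = c("mws_01.txt", "mws_02.txt"),
  text = c("It was a dreary night.", "The rain pattered dismally."),
  stringsAsFactors = FALSE
)

# corpus() accepts a data.frame and, by default, takes document names
# from the doc_id column and the document text from the text column.
quantCorpus <- corpus(textDF)

# tokens() splits each document into word tokens.
# remove_punct = TRUE drops punctuation marks such as the final period.
quantTokens <- tokens(quantCorpus, remove_punct = TRUE)

ndoc(quantCorpus)      # number of documents in the corpus
docnames(quantTokens)  # document names carried over from doc_id
ntoken(quantTokens)    # number of tokens in each document
```

Building the data.frame by hand keeps the sketch self-contained. With real files, the readtext call from the lesson would supply the same doc_id and text columns.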
