Using a Suitable Corpus Class
Learn about the different types of corpora in the tm package and plug-in packages for efficient text mining and NLP analysis in R.
Let’s take a closer look at the corpus classes provided by the tm package and its plug-in packages.
Corpus
Corpus is a convenient alias that creates either a SimpleCorpus or a VCorpus, depending on the arguments provided. For example, a SimpleCorpus can’t contain XML, so if we were to call Corpus with an XML source, it would create a VCorpus instead. Here is an example of Corpus:
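A minimal sketch of this behavior, assuming the tm package is installed. The sample sentences and variable names are made up for illustration. The second call uses a URISource that points at the loremipsum.txt sample text shipped with tm; it stands in for an XML source as an input type that SimpleCorpus is not expected to handle.

```r
library(tm)

# Two made-up documents held in memory as a character vector.
docs <- c("Text mining is fun.", "R makes it approachable.")

# Plain text from a VectorSource: Corpus() should hand back a SimpleCorpus.
simple_corpus <- Corpus(VectorSource(docs))
print(class(simple_corpus))

# A source type that SimpleCorpus does not handle (assumption: a URISource
# pointing at a sample file shipped with tm): Corpus() should fall back
# to a VCorpus.
sample_file <- system.file("texts", "loremipsum.txt", package = "tm")
volatile_corpus <- Corpus(URISource(sprintf("file://%s", sample_file)))
print(class(volatile_corpus))
```

Printing the class of each object shows which corpus type Corpus chose, without having to call SimpleCorpus or VCorpus directly.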