Removing Other Unnecessary Terms

Learn to use the data cleaning tools available with the tm package to remove unnecessary terms.

Handling punctuation and numbers

You may have noticed several instances of the newline character (\n) in the text. In most cases, punctuation, numbers, and extra whitespace are unnecessary for NLP analysis. These elements inflate the word count but don’t add meaning. In this lesson, we’ll remove them as well.
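As a minimal sketch of this cleanup, the example below applies three transformations from the tm package (removePunctuation, removeNumbers, and stripWhitespace) to a small corpus. The sample sentences and the variable names docs, corpus, and cleaned are assumptions made for illustration. They are not the lesson's actual data.

```r
library(tm)

# Hypothetical sample text; the lesson's real data set is not shown here.
docs <- c("Hello,\n world! It is 2021.",
          "Numbers like 42   add   little meaning.")

# Build an in-memory corpus with one document per string.
corpus <- VCorpus(VectorSource(docs))

# Apply each cleaning transformation to every document in the corpus.
corpus <- tm_map(corpus, removePunctuation)  # drop punctuation characters
corpus <- tm_map(corpus, removeNumbers)      # drop digits
corpus <- tm_map(corpus, stripWhitespace)    # collapse runs of whitespace, including newlines

# Extract the cleaned text as a character vector for inspection.
cleaned <- sapply(corpus, as.character)
print(cleaned)
```

Whitespace is stripped last here because removing punctuation and numbers can leave extra gaps behind.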

Overview of transformations in the tm package

In NLP, stopwords are removed to make significant words more visible. However, stopwords aren’t the only problem when cleaning text data. ...
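For orientation, the sketch below lists the transformations that tm provides through getTransformations() and then removes English stopwords from a single document. The sample sentence and the variable names available, corpus, and result are hypothetical. This is one possible way to combine the calls, not necessarily the approach the lesson goes on to take.

```r
library(tm)

# List the names of the transformations that ship with tm.
available <- getTransformations()
print(available)

# Hypothetical one-document corpus, used only for illustration.
corpus <- VCorpus(VectorSource("The cat sat on the mat, 3 times!"))

# tolower is a base R function, so it is wrapped with content_transformer()
# to keep the documents in the form tm expects.
corpus <- tm_map(corpus, content_transformer(tolower))

# removeWords takes the words to drop as an extra argument.
corpus <- tm_map(corpus, removeWords, stopwords("english"))

result <- as.character(corpus[[1]])
print(result)
```

Lowercasing comes first because the stopword list returned by stopwords("english") is lowercase, so a capitalized "The" would otherwise be kept.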
