Solution Explanations: Advanced Text Preprocessing

Review solution explanations for the code challenges on advanced text preprocessing.

We'll cover the following...

Solution 1: Part-of-speech tagging
Solution 2: Named entity recognition
Solution 3: Text classification


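import string

import nltk
import pandas as pd
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Download the resources used below: tokenizer models, the stop-word list, and the POS tagger.
# Resource names vary by NLTK release (newer ones may ask for 'punkt_tab' and 'averaged_perceptron_tagger_eng').
for resource in ('punkt', 'stopwords', 'averaged_perceptron_tagger'):
    nltk.download(resource, quiet=True)

feedback_df = pd.read_csv('feedback.csv')

# Lowercase and tokenize each feedback entry
feedback_df['tokens'] = feedback_df['feedback'].apply(lambda text: word_tokenize(text.lower()))

# Drop English stop words, then punctuation tokens
stop_words = set(stopwords.words('english'))
feedback_df['tokens'] = feedback_df['tokens'].apply(lambda tokens: [token for token in tokens if token not in stop_words])
feedback_df['tokens'] = feedback_df['tokens'].apply(lambda tokens: [token for token in tokens if token not in string.punctuation])

# Tag each remaining token with its part of speech
feedback_df['pos_tags'] = feedback_df['tokens'].apply(nltk.pos_tag)
print(feedback_df['pos_tags'])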
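
Each row of the pos_tags column holds a list of (token, tag) tuples, so a common next step is to select tokens by tag. The following is a minimal sketch of that step, not part of the original solution. It assumes the Penn Treebank tag names that NLTK's default English tagger uses (for example, NN for nouns and JJ for adjectives). The sample list is hand-written for illustration and is not real tagger output, and the helper name tokens_with_prefix is hypothetical.

```python
from collections import Counter

# Hand-written sample in the same shape nltk.pos_tag returns:
# a list of (token, tag) tuples. Illustrative only, not real tagger output.
sample_pos_tags = [
    ("delivery", "NN"),
    ("arrived", "VBD"),
    ("late", "RB"),
    ("packaging", "NN"),
    ("excellent", "JJ"),
]


def tokens_with_prefix(pos_tags, prefix):
    """Return the tokens whose tag starts with the given prefix.

    Matching on a prefix groups related tags, e.g. "NN" also
    matches "NNS" and "NNP".
    """
    return [token for token, tag in pos_tags if tag.startswith(prefix)]


nouns = tokens_with_prefix(sample_pos_tags, "NN")
adjectives = tokens_with_prefix(sample_pos_tags, "JJ")

# Count how often each tag occurs in the sample
tag_counts = Counter(tag for _, tag in sample_pos_tags)

print(nouns)
print(adjectives)
print(tag_counts.most_common(1))
```

To run the same selection over the whole DataFrame, the helper could be applied to each row, for instance with feedback_df['pos_tags'].apply(lambda tags: tokens_with_prefix(tags, 'NN')).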

Solution 1: Part-of-speech tagging