Solution Explanations: N-Grams
Review solution explanations for the code challenges on n-grams.
Solution 1: Introduction to n-grams
Here’s the solution:
main.py
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from nltk.tokenize import word_tokenize  # requires NLTK's tokenizer data to be downloaded
import string

feedback_df = pd.read_csv('feedback.csv')
def preprocess(text):
    text = text.lower()  # lowercase the text
    translator = str.maketrans('', '', string.punctuation)  # table that deletes ASCII punctuation
    text = text.translate(translator)  # strip the punctuation characters
    return text
feedback_df['feedback'] = feedback_df['feedback'].apply(preprocess)
vectorizer = CountVectorizer(tokenizer=word_tokenize, ngram_range=(2, 3))  # bigrams and trigrams
X = vectorizer.fit_transform(feedback_df['feedback'])
grams = vectorizer.get_feature_names_out()  # get_feature_names() was removed in scikit-learn 1.2
print(grams)
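To make it easier to picture what ngram_range=(2, 3) collects, here is a minimal pure-Python sketch of the same idea. It is not the scikit-learn implementation: the ngrams() helper and the two sample sentences are hypothetical, and it assumes plain whitespace tokenization instead of word_tokenize.

```python
def ngrams(tokens, n_min=2, n_max=3):
    """Return every contiguous n-gram of length n_min..n_max as a space-joined string."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

# Hypothetical, already preprocessed feedback entries (not taken from feedback.csv).
docs = ["great product fast delivery", "fast delivery"]

# The vocabulary is the sorted set of unique n-grams across all documents.
vocabulary = sorted({g for doc in docs for g in ngrams(doc.split())})
print(vocabulary)
# ['fast delivery', 'great product', 'great product fast',
#  'product fast', 'product fast delivery']
```

The second document is too short to yield a trigram, so it contributes only the bigram "fast delivery", which is already in the vocabulary.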
Let’s go through the solution explanation:
Lines 7–11: We define the preprocess() function, which lowercases the text and removes its punctuation characters.
Line 12: We then apply the ...
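As a quick check of the preprocess() step described above, the following sketch runs the same function on a single string. The sample sentence is made up for illustration and does not come from feedback.csv.

```python
import string

def preprocess(text):
    # Lowercase, then delete every character listed in string.punctuation.
    text = text.lower()
    translator = str.maketrans('', '', string.punctuation)
    return text.translate(translator)

sample = "Great product, fast delivery!"  # hypothetical feedback entry
print(preprocess(sample))  # great product fast delivery
```

Note that string.punctuation contains only ASCII punctuation, so characters such as curly quotes pass through this function unchanged.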