Sentiment Analysis with spaCy
Explore how to train spaCy's TextCategorizer for sentiment analysis using Amazon Fine Food Reviews data. Learn to prepare and balance datasets, create training examples, and implement a multilabel classification pipeline. This lesson also introduces transitioning to Keras and TensorFlow for further text classification tasks.
In this lesson, we'll train spaCy's TextCategorizer on a real-world dataset: the Amazon Fine Food Reviews dataset from Kaggle. The original dataset is huge, with 100,000 rows, so we work with a sample of 4,000 rows. The dataset contains customer reviews of fine food sold on Amazon. Each review includes user and product information, the user's rating, and the review text.
We can load the dataset through the following method:
Exploring the dataset
Now, we're ready to explore the dataset step by step:
First, we'll do the imports for reading and visualizing the dataset:
We'll read the CSV file into a pandas DataFrame and output the shape of the DataFrame:
Next, we examine the rows and the columns of the dataset by printing the first five rows:
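The three steps above can be sketched as follows. This is a minimal illustration, not the lesson's original code: the file name of the sampled dataset is not given here, so the sketch reads a small in-memory CSV with made-up placeholder rows instead. The variable name `reviews_df` and the column subset (`Id`, `Score`, `Summary`, `Text`) are assumptions chosen for illustration, and plotting imports are left out because only loading and inspection are shown.

```python
import io

import pandas as pd

# Placeholder rows standing in for the sampled reviews file.
# These values are invented for illustration and are NOT from the real dataset.
SAMPLE_CSV = """Id,Score,Summary,Text
1,5,Great,Loved these cookies
2,1,Awful,Arrived stale and broken
3,4,Good,Tasty but a little pricey
4,2,Meh,Too sweet for my taste
5,5,Excellent,Best coffee I have bought online
6,3,Okay,Nothing special
"""

# Read the CSV data into a pandas DataFrame.
# With a file on disk, pass its path instead of the StringIO object.
reviews_df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# shape is a (rows, columns) tuple; here (6, 4) for the placeholder rows.
print(reviews_df.shape)

# head() returns the first five rows by default.
print(reviews_df.head())
```

With the actual 4,000-row sample, the same calls apply; only the argument to `pd.read_csv` changes, and the reported shape reflects the real row and column counts.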
We'll be using the Text and Score...