Exploratory Data Analysis
Learn how to perform exploratory data analysis for natural language processing on a dataset and plot data on bar charts and word clouds.
We’ll perform exploratory data analysis (EDA) on the BBC News dataset.
Bar chart
By using the value_counts() and plot() pandas functions, we can create a bar chart that visualizes class proportions.
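# Plotting the bar chart
import matplotlib.pyplot as plt

# `data` is assumed to be the BBC News DataFrame with a 'category' column
color = ['C0', 'C1', 'C2', 'C3', 'C4']  # one default-cycle color per bar
categories = data['category'].value_counts()  # number of rows per class
categories.plot(kind='bar', figsize=(12, 8), color=color)
plt.show()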
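The bar chart shows raw counts, so it can help to look at the same information as numbers. The following is a minimal sketch, not part of the original lesson: the helper names class_proportions and imbalance_ratio are hypothetical, and the toy DataFrame is made-up data standing in for the BBC News dataset. It uses value_counts(normalize=True) to get each class's share of the rows, and the ratio of the largest to the smallest class as a rough measure of imbalance.

```python
import pandas as pd


def class_proportions(df: pd.DataFrame, column: str = "category") -> pd.Series:
    """Return each class's share of the rows, largest class first."""
    return df[column].value_counts(normalize=True)


def imbalance_ratio(df: pd.DataFrame, column: str = "category") -> float:
    """Return the size of the largest class divided by the smallest."""
    counts = df[column].value_counts()
    return float(counts.max() / counts.min())


# Made-up toy labels for illustration only, NOT the BBC News data
toy = pd.DataFrame({"category": ["sport"] * 5 + ["business"] * 3 + ["tech"] * 2})

proportions = class_proportions(toy)
ratio = imbalance_ratio(toy)

print(proportions)
print(f"Largest class is {ratio:.1f}x the smallest")
```

Calling class_proportions(data) on the lesson's DataFrame would give the shares behind the bars in the chart above. A ratio close to 1 means the classes are roughly balanced.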
As we can see in the output, the dataset is imbalanced: the classes aren’t evenly distributed. We’ll deal with this issue later, because it may cause problems when training a classification model. The two most common ...