Exploratory Data Analysis

Learn how to perform exploratory data analysis for classification on our dataset and plot data on pie charts, box plots, color-encoded matrices, and scatter plots.

We will now perform exploratory data analysis (EDA) on the Iris dataset. EDA is a fundamental part of every machine learning project because visualizations help us understand the basic statistical properties of a dataset.

Pie charts

Pie charts let us easily visualize the proportions of categorical variables.

import matplotlib.pyplot as plt

# Plot the share of each species as a pie chart
data['species'].value_counts().plot(kind='pie')
plt.ylabel('')  # hide the default y-axis label
plt.show()

As we can see in the output, the Iris classes are evenly distributed. Each one is 33.3% ...
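The even split can also be checked numerically rather than read off the chart. The following is a minimal sketch, not the lesson's own code: it assumes pandas is available and builds a stand-in `data` DataFrame that contains only a `species` column with 50 samples per class, as in the standard Iris dataset. The names `species`, `proportions`, and `percentages` are introduced here for illustration.

```python
import pandas as pd

# Stand-in for the lesson's `data` DataFrame (assumption): only the species
# column, with the 50 samples per class found in the standard Iris dataset.
species = ['setosa'] * 50 + ['versicolor'] * 50 + ['virginica'] * 50
data = pd.DataFrame({'species': species})

# normalize=True returns each class's share of the rows instead of raw counts
proportions = data['species'].value_counts(normalize=True)

# Convert the shares to percentages rounded to one decimal place
percentages = (proportions * 100).round(1)
print(percentages)
```

If the percentages should appear on the chart itself, one option is to pass `autopct='%1.1f%%'` to `plot(kind='pie')`; pandas forwards that keyword to matplotlib's pie function.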