Exploratory Data Analysis
Learn how to perform exploratory data analysis for clustering on our dataset and plot data on histograms, color-encoded matrices, and scatter plots.
Histogram
The pandas hist() function lets us easily visualize the distribution of each variable.
# Plotting a histogram of every variable in the DataFrame
import matplotlib.pyplot as plt

data.hist(bins=30, figsize=(10, 7), grid=False)
plt.show()
The output shows that every variable's distribution is either bimodal or multimodal, meaning it has two peaks or more than two peaks, respectively. This typically happens when the dataset contains multiple groups with different characteristics. In this case, the dataset was specifically created to contain ...
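To illustrate why mixed groups produce more than one peak, here is a minimal sketch on synthetic data. It does not use the lesson's dataset: the two groups, their means (0 and 6), the sample sizes, and the variable names are all assumptions chosen for illustration. The sketch bins the values with NumPy instead of plotting them, so it runs without a display.

```python
import numpy as np

# Hypothetical data: two groups with clearly different means (not the lesson's dataset)
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=0.0, scale=1.0, size=500)
group_b = rng.normal(loc=6.0, scale=1.0, size=500)
values = np.concatenate([group_a, group_b])

# Bin the combined values into 30 bins, matching the bins=30 setting used above
counts, edges = np.histogram(values, bins=30)
centers = (edges[:-1] + edges[1:]) / 2

# Tallest bin on each side of the midpoint between the two group means
left_peak = counts[centers < 3.0].max()
right_peak = counts[centers >= 3.0].max()

# Emptiest bin in the gap between the two groups
valley = counts[(centers > 2.0) & (centers < 4.0)].min()

print(f"left peak: {left_peak}, valley: {valley}, right peak: {right_peak}")
```

The valley between the two groups is far lower than the peak on either side, which is the bimodal shape described above. To see it as a plot, the same values can be wrapped in a DataFrame and passed to hist(), for example pd.DataFrame({"feature": values}).hist(bins=30).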