K-Means on Two-Dimensional Data
This lesson will focus on K-Means on two-dimensional data in Python.
K-means in Python
We do not need to implement the algorithm above ourselves because scikit-learn provides it in sklearn.cluster. We will first cluster a dummy dataset. The dataset has three columns: feature_1, feature_2, and label. It has 4 classes, which means each row can have a label of 0, 1, 2, or 3. First, we will draw a scatter plot of the two features.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('dummy.csv')
print(df.head())
X = df.drop(columns=['label'])
Y = df['label']
plt.scatter(x=X['feature_1'], y=X['feature_2'])
We drop the label column in ...
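As a rough illustration of how the sklearn.cluster implementation mentioned earlier might be applied to data of this shape, here is a minimal sketch. It does not read dummy.csv. Instead it assumes a synthetic stand-in generated with make_blobs, using the same column names (feature_1, feature_2, label) and four classes. The sample count, the random seeds, and the n_init value are illustrative choices, not values taken from the lesson.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for dummy.csv (assumption: the real file is not available here).
features, labels = make_blobs(n_samples=200, centers=4, n_features=2, random_state=42)
df = pd.DataFrame(features, columns=["feature_1", "feature_2"])
df["label"] = labels

# K-means is unsupervised, so only the two feature columns are passed to it.
X = df.drop(columns=["label"])

# n_clusters=4 matches the four classes; n_init and random_state are fixed for repeatability.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# One (feature_1, feature_2) centroid per cluster.
print(kmeans.cluster_centers_)
# Number of rows assigned to each cluster.
print(pd.Series(cluster_ids).value_counts())
```

The cluster ids returned by fit_predict are arbitrary, so cluster 0 need not correspond to label 0. To see the result, the scatter call from the lesson could be reused with the cluster ids passed as the color argument, for example plt.scatter(x=X['feature_1'], y=X['feature_2'], c=cluster_ids).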