K-Means Clustering

Learn about the k-means algorithm, its initialization, NP-hardness, and variance computation with examples.

We'll cover the following

Objective
- Variance computation code example
K-means is NP-hard
Lloyd’s algorithm
K-means++
K-means variants
- K-medoids clustering
- K-center clustering

Traditionally, in machine learning, we start with the popular partitional clustering algorithm called $k$ -means clustering. This algorithm divides the data into $k$ clusters based on a similarity score. The objective is to minimize the total variance of the $k$ clusters. The number of clusters, $k$ , must be specified.

Note: The choice of similarity score is a hyperparameter.

Objective

Given a set $D=\{\bold x_1, \bold x_2, \dots, \bold x_n \}$ of $n$ data points in $\R^d$ , the goal is to partition $D$ into the given $k \ge 2$ partitions, say, $C_1, C_2, \dots, C_k$ such that $\sum_{j=1}^K var(C_j)$ is minimum. The $var(C_j)$ using Euclidean distance can be defined as follows:

var(C_j) = \sum_{\bold x \in C_j}\|\bold x-\bold \mu_j\|_2^2

Here, $\bold \mu_j$ is the centroidThe centroid of a set of points is the mean of the data points in the set. of the the partition $C_j$ .

Note: The variance of a partition $C_j$ is defined in terms of Euclidean distance here; however, other distance/similarity measures can also be used.

Variance computation code example

Let’s compute the variance of a set of points using numpy:

Get hands-on with 1200+ tech skills courses.

Course Overview

Supervised Learning

Clustering

Project: Bag of Visual Words

Generalized Linear Regression

Face Recognition Using Kernel Linear Discriminant

Support Vector Machine

Logistic Regression

Ensemble Learning

Early Stage Diabetes Prediction Using Ensemble Learning

Decoding Dimensions: PCA and Autoencoders

Image Reconstruction Using PCA

Image Colorization using Autoencoders

Colorful Face Generation with VAEs

Appendix

Wrapping Up

K-Means Clustering

Objective

Variance computation code example