Customer Segmentation

Learn how to segment customer bases using k-means clustering.

There are a number of unsupervised clustering algorithms, but k-means is one of the easiest to understand and implement. It segments an unlabeled dataset into a predetermined number of groups. The input parameter k is the number of clusters, or groups, we would like to form. Choosing k takes some care. If k is too small, distinct groups get merged and a centroid can end up between them rather than inside any one group. If k is too large, some natural clusters may be split into several smaller ones.
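
To make the effect of k concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a production implementation: the customers list is made-up data (annual spend in thousands, store visits per month), and the helper names sq_dist and kmeans are hypothetical. The sketch runs a basic k-means loop for several values of k and prints the within-cluster sum of squared distances, which typically drops sharply until k reaches the number of natural groups and then levels off.

```python
import random

# Hypothetical customers: (annual spend in thousands, visits per month).
# The data is invented for illustration and forms three loose groups.
customers = [
    (2.0, 1.0), (3.0, 2.0), (2.5, 1.5), (3.5, 1.0),
    (10.0, 8.0), (11.0, 9.0), (10.5, 8.5), (9.5, 9.0),
    (20.0, 3.0), (21.0, 2.0), (19.5, 2.5), (20.5, 3.5),
]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans(points, k, n_iter=50, seed=0):
    """Basic k-means. Returns (centroids, labels, within-cluster sum of squares)."""
    rng = random.Random(seed)
    # Assumption: start from k observations picked at random.
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        # Assign every observation to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Move each centroid to the mean of the observations assigned to it.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep the old centroid if no observation was assigned to it.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving
        centroids = new_centroids
    labels = [nearest(p, centroids) for p in points]
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


for k in range(1, 6):
    _, _, inertia = kmeans(customers, k)
    print(f"k={k}: within-cluster sum of squares = {inertia:.2f}")
```

Because the result depends on the random starting centroids, a single run can settle on a poor solution. In practice it is common to repeat the procedure with several seeds and keep the run with the lowest sum of squares.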

Implementing k-means clustering

The k-means algorithm follows these steps:

  1. Choose the number of clusters (k).

  2. Randomly initialize a centroid for each cluster.

  3. Assign each observation to a cluster for which the centroid is the ...