Introduction to Clustering

In everyday life, we group things to understand them better or to simplify them. Let’s say you and your friend are trying to classify video games. Your friend might classify them by genre and end up with all the video games in neat little clusters. If you classify them by developer instead, you will likely end up with a different collection of clusters from your friend’s.

> Note: Both collections originate from the same data pool (video games), and each reveals something interesting about the data.

In machine learning, we do something similar: we ask the machine to group the data so that we can extract meaningful information from it. Grouping unlabeled data in this way is called clustering. The grouping is based on similarity among data points, so after clustering, data points should be similar within a cluster and dissimilar across clusters.
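To make the idea of similarity-based grouping concrete, here is a minimal sketch in Python. It assumes k-means as the clustering technique and Euclidean distance as the similarity measure; the lesson does not prescribe either. The sample points, the starting centroids, and the function name `k_means` are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def k_means(points, centroids, iterations=10):
    """Group points around the given starting centroids (illustrative sketch)."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return clusters, centroids


# Hypothetical unlabeled data: two loose groups of 2-D points.
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]

# Assumption: start from one point in each visible group to keep the run deterministic.
clusters, centroids = k_means(points, centroids=[points[0], points[3]])

for i, cluster in enumerate(clusters):
    print(f"Cluster {i}: {cluster}")
```

With this data, the points near (1, 1) end up in one cluster and the points near (5, 5) in the other: close together within a cluster, far apart across clusters. In practice, the number of clusters and the starting centroids are choices you have to make, and different choices can produce different groupings, much like grouping video games by genre versus by developer.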

We can see a lot of data points below:
