Clustering Algorithms Comparison
Learn about the comparison of three famous clustering algorithms.
We'll cover the following
An overview of three clustering algorithms
Clustering is an unsupervised learning technique that divides the dataset into distinct groups such that the points within each cluster are more similar to each other than they are to points in other clusters. There are many different clustering algorithms, and each has its own strengths and weaknesses. Here’s a comparison of three popular clustering algorithms: -means, DBSCAN, and agglomerative clustering.
-means
This is a centroid-based algorithm or a distance-based algorithm, where we calculate the distances to assign a point to a cluster. -means starts by randomly selecting initial centroids and then assigns each data point to the cluster corresponding to the nearest centroid. The centroids are then updated to the mean of the points in the corresponding cluster, and the process is repeated until convergence. One of the main advantages of -means is that it’s computationally efficient and easy to implement. However, it can get stuck in local optima, and it’s sensitive to the initial centroids chosen. Additionally, it assumes that the clusters are spherical and equally sized, which might not always be the case.
Example
Let’s try a scenario where -means performs better as compared to DBSCAN and agglomerative clustering:
Get hands-on with 1400+ tech skills courses.