DBSCAN
Learn about DBSCAN clustering, density, dense regions, point types, and algorithm.
After learning the famous -means clustering algorithm, a type of partitional clustering, we’ll move towards a more robust approach still used in the industry. This approach is called density-based clustering.
DBSCAN clustering
DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise. In density-based clustering, the set of data points is partitioned into dense regions separated by regions of low density. DBSCAN is one popular density-based clustering algorithm. It’s a robust algorithm that doesn’t require prior knowledge of the number of clusters in the data, unlike the -means algorithm.
How to define density
Density is defined by two parameters, epsilon () and minPoints (), which quantify density for individual points and a set of points.
Epsilon specifies the maximum distance between two points to be considered neighbors. If the distance between two points is less than or equal to , they’re considered to be in each other’s neighborhood. MinPoints specifies the minimum number of neighbors a point must have within the distance to be considered a core point. Any point with fewer than ...