An Introductory Guide to Data Science and Machine Learning/

...

Hierarchical Clustering

Let’s look at another famous clustering algorithm called hierarchical clustering, which involves grouping instances into a hierarchy or tree of clusters.

We'll cover the following...

Hierarchical clustering

Agglomerative hierarchical clustering

Basic steps
An example

Dendrogram
Linkage criteria
Divisive hierarchical clustering

Hierarchical clustering

Hierarchical clustering involves creating a hierarchy of clusters. Representing data objects in the form of a hierarchy is useful for data summarization and visualization. For example, consider we want to organize the people in an organization into major groups such as executives, managers, and staff. We can further partition these above into smaller sub-groups. This is the basic idea of hierarchical clustering.

Agglomerative hierarchical clustering

Agglomerative hierarchical clustering uses a bottom-up strategy. In this strategy, each instance is initially treated as a cluster. At each successive iteration of this clustering algorithm, a cluster is merged with other most similar clusters until only one large cluster is formed.

Basic steps

Agglomerative hierarchical clustering works as follows.

We compute the similarities between the instances and represent these in the form of a matrix, which is also called similarity or proximity matrix.
Now we treat each instance as a single cluster.
We merge the closest two clusters and update the proximity matrix. Updating the matrix involves computing the cluster distance between the new merged cluster and every other cluster. This is so that in the next iteration, this new updated matrix is used, and we perform the merge operation using the updated matrix again.
We keep on repeating step 3 until a condition is met or one cluster is formed. The condition might be the minimum clusters remaining once the merge operation has been performed.

What is Data Science ?

Applications of Data Science

Overview of Libraries

Probability and Statistics

Machine Learning Part-1

Machine Learning Part-2

Machine Learning Part-3

Deep Learning

Machine Learning Tools and Libraries

Big Data Tools and Technologies

Where to go next ?

Hierarchical Clustering

Hierarchical clustering

Agglomerative hierarchical clustering

Basic steps

An example