Probabilistic Clustering

Learn how to perform clustering with a probabilistic technique. This lesson introduces the Gaussian distribution and the expectation-maximization algorithm.

Probabilistic approach for clustering

KMeans performs a hard assignment: each data point belongs to exactly one cluster. It also tends to produce spherical clusters, which does not always suit the data, because clusters can take any shape.

Probabilistic models perform a soft assignment: for each data point, they give the probability that it belongs to each cluster. They also account for cluster shape and learn how much weight to give each dimension.
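To make the contrast concrete, here is a minimal sketch of hard versus soft assignment for a single point. It assumes each cluster is modeled as a one-dimensional Gaussian with equal mixing weights; the cluster parameters, the helper names `soft_assign` and `hard_assign`, and the query point are illustrative choices, not part of the lesson.

```python
from statistics import NormalDist

def soft_assign(x, clusters):
    """Return the probability that x belongs to each cluster.

    clusters is a list of (weight, mean, std) tuples (hypothetical values).
    """
    # Weighted density of x under each cluster's Gaussian.
    scores = [w * NormalDist(mu, sigma).pdf(x) for w, mu, sigma in clusters]
    total = sum(scores)
    # Normalize so the probabilities sum to 1.
    return [s / total for s in scores]

def hard_assign(x, centers):
    """KMeans-style assignment: index of the nearest center."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))

# Two assumed clusters with equal weight and unit standard deviation.
clusters = [(0.5, 0.0, 1.0), (0.5, 4.0, 1.0)]
x = 1.5

label = hard_assign(x, [mu for _, mu, _ in clusters])
probs = soft_assign(x, clusters)
print("hard label:", label)
print("soft probabilities:", [round(p, 3) for p in probs])
```

The hard assignment reports only the nearest cluster, while the soft assignment keeps a nonzero probability for the farther cluster, so it retains information about how ambiguous the point is.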

Gaussian distribution

A Gaussian distribution is specified by its mean and variance. The mean is the center of the distribution, and the variance measures its spread.

The Gaussian distribution is also known as the normal distribution; it was introduced in the statistics module. Its density is a bell-shaped curve.

The probability density function (pdf) f(x) is given by:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$$ ...
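As a quick check of the density formula, the sketch below evaluates f(x) directly from the mean and variance and compares it with Python's standard-library `statistics.NormalDist`. The function name `gaussian_pdf` and the sample values are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def gaussian_pdf(x, mu, sigma2):
    """Evaluate the Gaussian pdf at x for mean mu and variance sigma2 (> 0)."""
    # Normalizing constant 1 / sqrt(2 * pi * sigma^2).
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma2)
    # Exponent -(x - mu)^2 / (2 * sigma^2).
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma2))

# Standard normal (mu = 0, sigma^2 = 1), chosen only as an example.
peak = gaussian_pdf(0.0, 0.0, 1.0)  # density at the mean
tail = gaussian_pdf(2.0, 0.0, 1.0)  # density two standard deviations away

# Cross-check against the standard library, which takes the standard deviation.
reference = NormalDist(mu=0.0, sigma=1.0).pdf(2.0)
print(peak, tail, reference)
```

The density is highest at the mean and falls off symmetrically on either side, which produces the bell shape described above.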