The Gaussian distribution can be used to model the probability of a continuous random variable. In a classical machine learning problem, our ultimate goal is to train a model so that it can predict the most likely outcome, given the probability distribution function that best approximates our training dataset.
Similarly, Gaussian mixture models build on the underlying inductive bias that the entire dataset can be generated from a finite combination of Gaussian distributions.
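As a minimal sketch of this generative assumption (the two-component weights, means, and standard deviations below are arbitrary values chosen purely for illustration), data can be produced by first picking a component and then sampling from that component's Gaussian:

import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-component mixture: mixing weights, means, standard deviations
weights = np.array([0.3, 0.7])      # mixing coefficients, must sum to 1
means = np.array([-2.0, 3.0])
stds = np.array([0.5, 1.0])

# Generative process: choose a component for each sample, then draw from that Gaussian
n_samples = 1000
components = rng.choice(len(weights), size=n_samples, p=weights)
samples = rng.normal(loc=means[components], scale=stds[components])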
K-means clustering is another unsupervised technique, one that hard-classifies every data point in the training set. In other words, each data point is assigned to exactly one cluster, and no data point is a member of more than one cluster.
On the other hand, Gaussian mixture models do not rigidly assign each instance to one class or another. The algorithm fits K Gaussian distributions that together account for the entire training space, and every point can be associated with one or more of these distributions. The deciding factor is the probability that each point belongs to a given Gaussian distribution, and the distribution with the greatest likelihood wins.
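The difference can be seen directly in scikit-learn: KMeans returns exactly one label per point, whereas GaussianMixture can also report a membership probability for every component. The snippet below is a small sketch of that contrast; the toy data and the choice of two components are assumptions made only for illustration.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Toy 1-D data with two loose groups and one ambiguous point in between
X = np.array([[0.0], [0.5], [1.0], [4.0], [4.5], [5.0], [2.5]])

# Hard assignment: every point receives exactly one cluster label
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Soft assignment: every point receives a probability for each component
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
membership = gmm.predict_proba(X)    # shape (n_points, n_components); rows sum to 1

print(kmeans_labels)                 # one label per point
print(membership.round(3))           # probabilities per component for each point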
Furthermore, a Gaussian mixture model can account for missing data points, whereas K-means clustering depends on a dataset that contains a proportionally rich and varied mix of data points.
A few major real-life applications of Gaussian mixture models include speaker identification, anomaly detection, image segmentation, and density estimation.
The Gaussian probability density function can be modeled as follows, where $\mu$ is the mean and $\sigma^2$ is the variance of the distribution:

$$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$

The probability that a data point $x$ is generated by the $k$-th Gaussian is weighted by the mixing coefficient of that component, $\pi_k$, which represents the proportion of the data that the $k$-th Gaussian accounts for. Given that our algorithm is a mixture of $K$ Gaussians, the density of a point $x$ under the full model is

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)$$

Given that we now have the probability distribution of the data, the task is to find out what values of the parameters $\pi_k$, $\mu_k$, and $\Sigma_k$ make the observed data most likely. If we add up the mixing coefficients over all $K$ components, they must sum to one:

$$\sum_{k=1}^{K} \pi_k = 1$$

Hence, it follows that for all $k$, $0 \le \pi_k \le 1$. The likelihood of the whole training set $X = \{x_1, \dots, x_N\}$ is the product of the densities of the individual points. Taking the natural logarithm of both sides preserves the location of the maxima, since the logarithm is monotonically increasing, and turns the product into a sum, so we can take derivatives of a simpler objective and find the parameters with the greatest likelihood:

$$\ln p(X \mid \pi, \mu, \Sigma) = \sum_{n=1}^{N} \ln\left( \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right)$$

Applying Bayes' rule, and given that $\pi_k$ acts as the prior probability of the $k$-th component, the posterior probability (or responsibility) that point $x_n$ belongs to component $k$ is

$$\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}$$
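As a minimal sketch of how these responsibilities can be computed (assuming one-dimensional data and hand-picked parameters used purely for illustration, not values fitted to any dataset):

import numpy as np
from scipy.stats import norm

# Assumed parameters of a K = 2 mixture (illustrative only)
pi = np.array([0.4, 0.6])          # mixing coefficients, sum to 1
mu = np.array([0.0, 5.0])          # component means
sigma = np.array([1.0, 2.0])       # component standard deviations

x = np.array([0.2, 2.5, 6.0])      # a few sample points

# Numerator of Bayes' rule: pi_k * N(x_n | mu_k, sigma_k) for every point and component
weighted = pi * norm.pdf(x[:, None], loc=mu, scale=sigma)

# Responsibilities: normalise so the probabilities for each point sum to 1
responsibilities = weighted / weighted.sum(axis=1, keepdims=True)
print(responsibilities)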
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn import mixture

# Generate a synthetic dataset of four well-separated blobs
features, true_values = make_blobs(n_samples=800, centers=4, cluster_std=0.50, random_state=0)

# Fit a Gaussian mixture model with four components and assign each point a label
gmm_model = mixture.GaussianMixture(n_components=4).fit(features)
labels = gmm_model.predict(features)

# Plot the points coloured by their predicted component
plt.scatter(features[:, 0], features[:, 1], c=labels, s=40, cmap='viridis')
plt.savefig('./output/svsrelu.png', dpi=500)
plt.show()
Here, n_samples indicates the number of data samples in the training data, centers specifies the number of cluster centres (and hence Gaussian blobs) to generate, cluster_std specifies the standard deviation of each cluster, and random_state ensures that we get the same results across multiple executions of the code despite the randomization involved.

On the whole, Gaussian mixture models are a refinement of earlier approaches such as k-NN and K-means clustering. They treat the closeness between points belonging to the same class as a measure of association between data points. Defining distance in terms of probability, and removing the requirement that each point belong to exactly one cluster, adds to the learning capabilities of the model and allows non-binary trends to be captured more faithfully.
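To inspect the probabilistic assignments described above, the fitted model from the earlier snippet can be queried directly; the lines below assume the gmm_model and features variables defined there are still in scope.

# Per-component membership probabilities for each point (each row sums to 1)
soft_assignments = gmm_model.predict_proba(features)

# Parameters learned for the four fitted Gaussians
print(gmm_model.weights_)               # mixing coefficients
print(gmm_model.means_)                 # component means
print(soft_assignments[:5].round(3))    # first few points: probability per component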