The Gaussian distribution, commonly referred to as the normal distribution, is a widely used continuous probability distribution. Many types of data can be modeled with a Gaussian distribution, such as the heights of a population and the test scores of students.
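The Gaussian distribution is defined by the following probability density function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation (so $\sigma^2$ is the variance).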
The normal probability distribution graph is a "bell-shaped" curve, as illustrated below.
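To make the bell shape concrete, the density can be evaluated at a few points. The sketch below assumes the standard textbook form of the normal density; the function name `normal_pdf` and the example values (mean 70, standard deviation 3) are illustrative choices, not taken from any library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Hypothetical parameters chosen only for illustration
mu, sigma = 70.0, 3.0

# The density peaks at the mean and falls off equally on both sides
for x in (64, 67, 70, 73, 76):
    print(x, round(normal_pdf(x, mu, sigma), 4))
```

The printed values rise towards the mean and fall away from it by the same amount on either side, which is what gives the curve its bell shape.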
The shape of the normal distribution curve depends on two parameters: the mean and the standard deviation.
Let's look at each of them in detail:
The mean is the normal distribution's central tendency which specifies the location of the bell curve's peak. The majority of values cluster around the mean. A slight change in the mean on a graph will cause the entire curve to shift left or right on the X-axis.
The standard deviation measures how spread out the data's values are around the mean. A small standard deviation gives a tall, narrow curve, while a large one gives a short, wide curve. To compute it, we first calculate the variance: subtract the mean from each data point, square the differences, and take their average. The standard deviation is the square root of the variance.
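The calculation of the variance and standard deviation can be sketched step by step. The data values below are a made-up sample, and the code computes the population variance (dividing by the number of points), which is also what `np.var` does by default.

```python
import math

# Hypothetical sample, chosen only for illustration
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Step 1: the mean
mean = sum(data) / len(data)

# Step 2: squared differences from the mean
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: the variance is the average of the squared differences
variance = sum(squared_diffs) / len(data)

# Step 4: the standard deviation is the square root of the variance
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)  # prints 5.0 4.0 2.0
```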
The properties of Gaussian distribution are as follows:
The shape of a Gaussian distribution is perfectly symmetrical. This means that the distribution curve can be cut in half at the mean to yield two equal halves. In other words, the values to the left and right of the center are mirror images.
The Gaussian distribution's mean, median, and mode are all the same and sit at the center of the curve. For the standard normal distribution, this common value is 0 (zero) on the horizontal axis.
The empirical rule describes where most data in a normal distribution will appear. In statistics, the empirical rule states that virtually every piece of data in a normal data set will fall within three standard deviations of the mean.
For all normal distributions, about 68.3% of the observations will fall within one standard deviation of the mean, about 95.4% will fall within two standard deviations, and about 99.7% will fall within three standard deviations.
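These percentages can be checked numerically. For a normal variable, the probability of falling within k standard deviations of the mean equals erf(k / √2), so a minimal sketch using only Python's `math` module looks like this (the helper name `prob_within` is an illustrative choice):

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * prob_within(k):.2f}%")
```

The result does not depend on the particular mean or standard deviation, which is why the rule holds for every normal distribution.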
Skewness measures how asymmetric a distribution is, whereas kurtosis measures the thickness of its tails relative to the tails of a normal distribution. The Gaussian distribution is symmetric, so it has zero skewness, and it has a kurtosis equal to 3.0 (that is, an excess kurtosis of 0).
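As a rough check, the skewness and kurtosis of a large normal sample can be estimated from its standardized third and fourth moments. This is a sketch with an arbitrary seed and sample size; the estimates are only approximately 0 and 3 because the sample is random.

```python
import numpy as np

# Seed and sample size are arbitrary choices for reproducibility
rng = np.random.default_rng(seed=0)
sample = rng.normal(size=100_000)

# Standardize the sample: subtract the mean, divide by the standard deviation
z = (sample - sample.mean()) / sample.std()

skewness = np.mean(z ** 3)  # third standardized moment, close to 0
kurtosis = np.mean(z ** 4)  # fourth standardized moment, close to 3

print("Skewness:", skewness)
print("Kurtosis:", kurtosis)
```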
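The standard Gaussian, or standard normal, distribution is a special case of the normal distribution where the mean is $\mu = 0$ and the standard deviation is $\sigma = 1$. The probability density function becomes:

$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$$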
It would be impossible to keep separate tables for each combination of mean and standard deviation. So, one method for computing probabilities for a normal distribution is to use tables that give probabilities for the standard normal distribution.
The standard normal distribution can represent any normal distribution if we measure values in standard deviations above or below the mean, $z = (x - \mu)/\sigma$, rather than in the problem's actual units.
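In code, the role of the table is played by the standard normal cumulative distribution function, which can be written with `math.erf`. The sketch below converts a value to standard units and then looks up its probability; the function names and the parameters (mean 70, standard deviation 3) are hypothetical values for illustration.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal variable X, computed via standard units."""
    z = (x - mu) / sigma  # number of standard deviations above the mean
    return standard_normal_cdf(z)

# Hypothetical parameters: mean 70, standard deviation 3
mu, sigma = 70.0, 3.0
p = normal_cdf(73.0, mu, sigma)  # 73 is one standard deviation above the mean
print("P(X <= 73):", round(p, 4))
```

Any other mean and standard deviation would reuse the same `standard_normal_cdf`, which is why a single table suffices.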
    import numpy as np
    from numpy import random

    # Generate a random, normally distributed array of size 5
    arr = random.normal(size=(1, 5))

    print("The distribution's mean is: ", np.mean(arr))
    print("And variance: ", np.var(arr))
In the above code, we use the np.mean() and np.var() functions to return the first moment (the mean) and the second central moment (the variance), respectively.
The graph on the right depicts the relationship between the adult male population and height. This graph’s data is normally distributed. We can deduce from the graph data that the mean, median, and mode height for adult men is 70 inches.
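In order to sample the general normal distribution (with mean $\mu$ and standard deviation $\sigma$), we can scale and shift a standard normal variable $X$: $Y = \sigma X + \mu$.
It has a mean $\mu$ and a variance $\sigma^2$.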
    import numpy as np
    from numpy import random

    # Standard normal distribution (mean 0, standard deviation 1)
    X = random.normal(size=(1, 5))
    print("----Standard Normal Distribution----")
    print("The matrix's mean is: ", X.mean())
    print("And variance: ", np.var(X))

    # General normal distribution, obtained by scaling and shifting X
    print("----General Normal Distribution----")
    SD = 2
    Mean = 0.5
    Y = SD * X + Mean
    print("The matrix's mean is: ", Y.mean())
    print("And variance: ", np.var(Y))
The code above shows how we can use the linearity property to build a general Gaussian distribution from the standard Gaussian distribution.
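NumPy can also draw from a general normal distribution directly, through the `loc` (mean) and `scale` (standard deviation) parameters of `Generator.normal`. The sketch below uses the same mean and standard deviation as above with a larger sample, so that the sample statistics land close to the theoretical values; the seed and sample size are arbitrary choices.

```python
import numpy as np

# Seed and sample size are arbitrary choices for reproducibility
rng = np.random.default_rng(seed=42)

mean, sd = 0.5, 2.0
Y = rng.normal(loc=mean, scale=sd, size=100_000)

# With a large sample, these should be close to 0.5 and 2**2 = 4
print("Sample mean:", Y.mean())
print("Sample variance:", Y.var())
```

With only five samples, as in the earlier snippets, the printed mean and variance can differ noticeably from the theoretical values.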