What is Gaussian Distribution?

The Gaussian distribution, also known as the normal distribution, is a widely used continuous probability distribution. Many types of data can be modeled with a Gaussian distribution, such as the heights of a population and the test scores of students.

PDF

The Gaussian distribution is defined by the following probability density function:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}\right)$$

where:

  • $\mu$ is the distribution’s mean
  • $\sigma$ is the distribution’s standard deviation
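As a rough illustration, the density formula can be evaluated directly in code. The function name `normal_pdf` and the sample parameter values below are assumptions made for this sketch, not part of any library.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma, evaluated at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in units of sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Density at the peak (x equal to the mean) for two illustrative parameter choices
peak_standard = normal_pdf(0.0)                  # mu = 0, sigma = 1
peak_wide = normal_pdf(0.5, mu=0.5, sigma=2.0)   # mu = 0.5, sigma = 2
print(peak_standard, peak_wide)
```

Doubling the standard deviation halves the height of the peak, because the total area under the curve must remain 1.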

Parameters

The normal probability distribution graph is a "bell-shaped" curve, as illustrated below.

The shape of the normal distribution curve depends on two parameters:

  • $\mu$, the mean (or expected value) of the continuous random variable $X$ that models the phenomenon.

  • $\sigma^2$, the variance of $X$, which is the square of the standard deviation $\sigma$.

Let's look at each of them in detail:

Mean ($\mu$)

The mean is the normal distribution's central tendency which specifies the location of the bell curve's peak. The majority of values cluster around the mean. A slight change in the mean on a graph will cause the entire curve to shift left or right on the X-axis.

Standard deviation ($\sigma$)

The standard deviation measures how far the data's values typically lie from the mean, that is, how spread out they are. To calculate the variance, we subtract the mean from each data point, square the differences, and take their average. The standard deviation is the square root of the variance.
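The calculation can be sketched step by step as follows. The data values are made up for illustration, and the sketch computes the population variance (dividing by the number of points), which is also what `np.var` does by default.

```python
import math

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up example values

mean = sum(data) / len(data)

# Squared difference between each data point and the mean
squared_diffs = [(x - mean) ** 2 for x in data]

# Population variance: the average of the squared differences
variance = sum(squared_diffs) / len(data)

# Standard deviation: the square root of the variance
std_dev = math.sqrt(variance)

print(mean, variance, std_dev)
```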

Properties

The properties of Gaussian distribution are as follows:

Symmetrical

The shape of a Gaussian distribution is perfectly symmetrical. This means that the distribution curve can be cut at the mean to yield two equal halves. In other words, the values to the left and right of the center are mirror images.

Mean, median, and mode are equal

The Gaussian distribution's mean, median, and mode are all equal to $\mu$ and sit at the center of the curve. For the standard normal distribution, this common value is 0 (zero) on the horizontal axis.

Empirical rule

The empirical rule describes where most data in a normal distribution will appear. In statistics, the empirical rule states that virtually every piece of data in a normal data set will fall within three standard deviations of the mean.

For all normal distributions, about 68.3% of the observations will fall within one standard deviation of the mean, about 95.4% will fall within two standard deviations, and about 99.7% will fall within three standard deviations.
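These percentages can be checked numerically. The sketch below relies on the standard identity that, for a normal variable, the probability of landing within k standard deviations of the mean equals erf(k/√2). The helper name `prob_within` is an assumption for this example.

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

coverage = {k: prob_within(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} standard deviation(s): {p:.2%}")
```

The result does not depend on the particular mean or standard deviation, which is why the rule holds for every normal distribution.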

Skewness and kurtosis

Skewness measures a distribution's asymmetry, whereas kurtosis measures the thickness of its tails relative to those of a normal distribution. The Gaussian distribution is symmetric, so it has zero skewness, and it has a kurtosis of 3.0 (equivalently, an excess kurtosis of 0).
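As a hedged check, we can estimate both quantities from simulated data. The sketch below assumes the usual definitions of skewness and kurtosis as the averages of the third and fourth powers of the standardized values. The seed and sample size are arbitrary choices, and the estimates will only be close to 0 and 3, not exact.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
sample = [rng.gauss(0.0, 1.0) for _ in range(200_000)]

n = len(sample)
mean = sum(sample) / n
std = (sum((x - mean) ** 2 for x in sample) / n) ** 0.5

# Average third and fourth powers of the standardized values
skewness = sum(((x - mean) / std) ** 3 for x in sample) / n
kurtosis = sum(((x - mean) / std) ** 4 for x in sample) / n

print(skewness, kurtosis)
```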

Standard Gaussian distribution

The standard Gaussian or standard normal distribution is a special case of the normal distribution, where $\mu = 0$ and $\sigma = 1$. The probability density function becomes:

$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^{2}}{2}\right)$$

It would be impossible to keep separate tables for each combination of mean and standard deviation. So, one method for computing probabilities for a normal distribution is to use tables that give probabilities for the standard distribution.

The standard normal distribution may represent any normal distribution if we consider standard deviations above or below the mean rather than actual problem units.
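To make the idea of measuring in standard deviations concrete, here is a minimal sketch that converts a value to a z-score and then uses the standard normal cumulative distribution in place of a printed table. The function names and the height parameters (mean 70 inches, standard deviation 3 inches) are assumptions chosen for illustration.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal variable Z, computed with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal variable X, obtained by standardizing x first."""
    z = (x - mu) / sigma  # number of standard deviations above or below the mean
    return standard_normal_cdf(z)

# Hypothetical parameters: heights with mean 70 inches and standard deviation 3 inches
p_at_most_73 = normal_cdf(73.0, mu=70.0, sigma=3.0)
print(p_at_most_73)
```

Because 73 inches is exactly one standard deviation above the assumed mean, the result is the same as looking up z = 1 in a standard normal table.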

import numpy as np

# Draw 5 samples from the standard normal distribution (mean 0, standard deviation 1)
arr = np.random.normal(size=(1, 5))

# With so few samples, the estimates only roughly approximate 0 and 1
print("The sample mean is:", np.mean(arr))
print("The sample variance is:", np.var(arr))

In the above code, we use NumPy's mean() and var() functions to compute the sample mean (the first moment) and the sample variance (the second central moment), respectively.

Example

The graph on the right depicts the distribution of heights among adult men. This graph’s data is normally distributed, so the mean, median, and mode coincide. We can deduce from the graph that all three are 70 inches.

General normal distribution

To sample from a general normal distribution (with mean $\mu$ and standard deviation $\sigma$), we can use the linearity property of the normal distribution. In other words, if $X$ is a standard normal random variable, then:

$$Y = \sigma X + \mu$$

is a general normal random variable with mean $\mu$ and variance $\sigma^2$.

import numpy as np

# Draw 5 samples from the standard normal distribution
X = np.random.normal(size=(1, 5))
print("----Standard Normal Distribution----")
print("The matrix's mean is:", X.mean())
print("And variance:", np.var(X))

# Scale and shift the samples to get a general normal distribution
print("----General Normal Distribution----")
SD = 2
Mean = 0.5
Y = SD * X + Mean
print("The matrix's mean is:", Y.mean())
print("And variance:", np.var(Y))

The code above shows how we can use the linearity property to generate samples from a general Gaussian distribution out of samples from the standard Gaussian distribution.
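With only five samples, the printed mean and variance can land far from their theoretical values. The following sketch, which assumes NumPy's `default_rng` generator with an arbitrary seed and sample size, repeats the transformation on a much larger sample and compares it with drawing from the general distribution directly.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed, for repeatability
mu, sigma = 0.5, 2.0                  # same mean and standard deviation as above
n = 100_000

# Scale and shift standard normal samples
x = rng.standard_normal(n)
y = sigma * x + mu

# Draw from the general normal distribution directly, for comparison
direct = rng.normal(loc=mu, scale=sigma, size=n)

print("Transformed: mean", y.mean(), "variance", y.var())
print("Direct:      mean", direct.mean(), "variance", direct.var())
```

Both approaches should give a mean near 0.5 and a variance near 4 (the square of the standard deviation), up to sampling noise.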

Copyright ©2024 Educative, Inc. All rights reserved