...

Skewness and Kurtosis

In this lesson, you will learn about skewness and kurtosis, two important properties of distributions.

Skewness

Skewness measures the asymmetry of a distribution, that is, how far its shape departs from a symmetric bell curve such as the Normal Distribution. A Normal Distribution has a skewness of zero. A distribution can be right (positively) skewed, with a longer tail on the right, or left (negatively) skewed, with a longer tail on the left.

Consequences of Skewness

  • Skewness can hint at the presence of outliers (data points that deviate markedly from the rest of the data). For example, if the amount of a credit card transaction abruptly jumps far above the amounts of the card's normal transactions, that transaction is an outlier.

  • The mean of a positively skewed distribution is typically greater than the median, while the mean of a negatively skewed distribution is typically less than the median.

  • Likewise, for a distribution with a single peak, the mean of a positively skewed distribution is typically greater than the mode, while the mean of a negatively skewed distribution is typically less than the mode.

  • Regression models in Machine Learning are affected by the presence of outliers, which can be indicated by a skewed distribution. So, it sometimes becomes necessary to remove the skewness.
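The relationship between the mean, median, and mode described above can be checked on a small sample. The following is a minimal sketch using only the Python standard library; the data values and the helper name `summarize` are made up for illustration and do not come from the lesson.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, one is very large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]
# Mirror the sample around 21 to obtain a left-skewed sample.
left_skewed = [21 - x for x in right_skewed]


def summarize(values):
    """Return (mean, median, mode) for a list of numbers."""
    return mean(values), median(values), mode(values)


right_stats = summarize(right_skewed)
left_stats = summarize(left_skewed)

print("right-skewed (mean, median, mode):", right_stats)
print("left-skewed  (mean, median, mode):", left_stats)
```

In this sample, the single large value pulls the mean above the median and the mode, and the mirrored sample shows the opposite ordering.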

How to check Skewness?

Naive approach

  • The naive approach is to plot a histogram or density curve of a column of the dataset and check whether the curve looks Gaussian-like or skewed.

  • Beyond visual inspection, there are mathematical measures that quantify the skewness of a column.
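As a rough illustration of the visual check, the sketch below bins a column into fixed-width intervals and prints a text histogram. It uses only the Python standard library; in practice a plotting library would normally be used. The sample values, the bin width, and the helper name `bin_counts` are assumptions made for this example.

```python
from collections import Counter


def bin_counts(values, bin_width):
    """Count values per fixed-width bin; keys are the bin start values."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical column of data with a long right tail.
sample = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]
width = 5
hist = bin_counts(sample, width)

# One row per non-empty bin; the bar length is the number of values in it.
for start, count in hist.items():
    print(f"[{start:>3}, {start + width:>3})  {'#' * count}")
```

Most of the bars pile up in the lowest bin while a few values sit far to the right, which is the shape one would read as right (positive) skew.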

Method 1

Skewness = $\frac{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^3}{\left[\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2\right]^{\frac{3}{2}}}$ ...
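The visible part of the Method 1 formula can be sketched in plain Python as follows. This is an illustrative implementation only: it follows the expression as written above (a 1/n third central moment divided by the 1/(n-1) variance raised to the power 3/2) and does not attempt to cover whatever follows the elided part of the lesson. The function name `skewness` and the sample data are assumptions for this example.

```python
def skewness(values):
    """Skewness as in the Method 1 expression:
    (1/n) * sum((x_i - mean)^3) / [(1/(n-1)) * sum((x_i - mean)^2)]^(3/2)
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    x_bar = sum(values) / n
    # Numerator: third central moment, normalized by n.
    third_moment = sum((x - x_bar) ** 3 for x in values) / n
    # Denominator base: sample variance, normalized by n - 1.
    variance = sum((x - x_bar) ** 2 for x in values) / (n - 1)
    if variance == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return third_moment / variance ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]   # hypothetical sample
left_tailed = [21 - x for x in right_tailed]     # mirror image of the above

print("symmetric:   ", skewness(symmetric))
print("right-tailed:", skewness(right_tailed))
print("left-tailed: ", skewness(left_tailed))
```

A symmetric sample gives a value of zero, a sample with a long right tail gives a positive value, and its mirror image gives a negative value of the same magnitude. Statistical libraries may use slightly different normalizations, so their results can differ from this sketch by a constant factor.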