Skewness and Kurtosis
In this lesson, you will learn about skewness and kurtosis, two important properties of distributions.
Skewness
Skewness measures the asymmetry of a distribution relative to a symmetric bell curve such as the Normal Distribution. A Normal Distribution has a skewness of zero, as does any perfectly symmetric distribution. A distribution can be right (positively) skewed, with a longer tail on the right, or left (negatively) skewed, with a longer tail on the left.
Consequence of Skewness
- Skewness can point to outliers (data points that deviate sharply from the rest). For example, if the amounts charged to a credit card abruptly jump far above the usual transaction amounts, those transactions are outliers and stretch the right tail of the distribution.
- The mean of a positively skewed distribution is typically greater than the median, while the mean of a negatively skewed distribution is typically less than the median.
- Likewise, the mean of a positively skewed distribution is typically greater than the mode, while the mean of a negatively skewed distribution is typically less than the mode.
- Regression models in Machine Learning are affected by outliers, whose presence can be indicated by a skewed distribution. So it is sometimes necessary to reduce or remove the skewness.
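The relationship between the mean and the median can be checked on synthetic data. The sketch below is illustrative only: the exponential sample, the seed, and the helper name `moment_skewness` are assumptions made for this example, and the helper uses the third standardized moment, which is one common skewness measure and not necessarily the one this lesson goes on to define.

```python
import random
import statistics


def moment_skewness(values):
    """Third standardized moment (population form): mean of ((x - mean) / std) ** 3."""
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((x - mean) / std) ** 3 for x in values) / n


random.seed(0)  # fixed seed so the run is repeatable

# Right-skewed sample: exponential draws have a long tail to the right.
right_skewed = [random.expovariate(1.0) for _ in range(10_000)]
# Mirroring the sample around zero gives a left-skewed one.
left_skewed = [-x for x in right_skewed]

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(
        name,
        "| mean:", round(statistics.fmean(data), 3),
        "| median:", round(statistics.median(data), 3),
        "| skewness:", round(moment_skewness(data), 3),
    )
```

On the right-skewed sample the mean sits above the median and the skewness is positive. On the mirrored sample both relations flip.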
How to check Skewness?
Naive approach
- The naive approach is to plot a histogram or density curve of a column of the dataset and check whether the curve looks Gaussian-like or skewed.
- There are also mathematical measures for checking the skewness of a column.
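Before turning to the numerical measures, the naive visual check can be sketched with a plain-text histogram. This is a minimal sketch under stated assumptions: `text_histogram` is a hypothetical helper written for this example, the "column" is synthetic log-normal data standing in for a real dataset column, and the helper assumes the values are not all identical.

```python
import random


def text_histogram(values, bins=10, width=40):
    """Return (counts, lines): equal-width bin counts and one text bar per bin."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins  # assumes hi > lo, i.e. values are not all equal
    counts = [0] * bins
    for x in values:
        # Clamp the index so the maximum value falls in the last bin.
        idx = min(int((x - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left_edge = lo + i * step
        lines.append(f"{left_edge:8.2f} | " + "#" * round(width * c / peak))
    return counts, lines


random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical column: log-normal draws, which have a long right tail.
column = [random.lognormvariate(0.0, 0.75) for _ in range(5_000)]

counts, lines = text_histogram(column)
print("\n".join(lines))
```

If the bars pile up at one end and thin out into a long tail at the other, the column is skewed toward the tail. A roughly symmetric hump suggests a more Gaussian-like shape.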
Method 1
Skewness = ...