Measures of Variability
You'll learn about measures of variability, which describe the dispersion in a dataset.
Measures of Variability or Spread
Measures of variability, also known as measures of spread, show the dispersion in a dataset and how the data is distributed around its center (the measure of location). The most commonly used measures of variability are discussed below.
Variance
The variance is the expected value (mean) of the squared differences between the data values and their mean. It shows how close to or far from the mean the values in a dataset are, expressed in squared units.
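Formula
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
Where:
- $\sigma^2$ is the variance.
- $n$ is the total number of values in the dataset.
- $\sum_{i=1}^{n}$ is the sum over the values from 1 to $n$.
- $(x_i - \bar{x})^2$ is the square of the difference of each value in the dataset from the mean $\bar{x}$.
- $\sum_{i=1}^{n}(x_i - \bar{x})^2$ is the sum of all the squared differences of the values from the mean.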
Example
- Let's say we have the list of numbers 34, 56, 190, 10000, and 45.
- Here $n = 5$ (number of values).
- The mean of the list is $\bar{x} = \frac{34 + 56 + 190 + 10000 + 45}{5} = \frac{10325}{5} = 2065$.
- The squared differences from the mean are calculated in the table below.
x | x-x̄ | (x-x̄)² |
---|---|---|
34 | -2031 | 4124961 |
56 | -2009 | 4036081 |
190 | -1875 | 3515625 |
10000 | 7935 | 62964225 |
45 | -2020 | 4080400 |
Sum | | Σ(x-x̄)² = 78721292 |
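Dividing this sum by the number of values gives the variance: $\sigma^2 = \frac{78721292}{5} = 15744258.4$. This is a squared value: if the data had a unit, the variance would be expressed in that unit squared.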
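The variance calculation above can be reproduced with a short Python sketch. It assumes the population form of the variance (dividing by n, as in the worked example); the variable names are illustrative only. The standard-library function `statistics.pvariance` is used as a cross-check.

```python
import statistics

# The example dataset from the text.
data = [34, 56, 190, 10000, 45]

n = len(data)                      # number of values
mean = sum(data) / n               # 2065.0

# Squared difference of each value from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Population variance: mean of the squared differences (divide by n).
variance = sum(squared_diffs) / n

print(mean)                # 2065.0
print(sum(squared_diffs))  # 78721292.0
print(variance)            # 15744258.4

# Cross-check with the standard library (population variance).
print(statistics.pvariance(data))
```

If the data is treated as a sample rather than a whole population, `statistics.variance` divides by n - 1 instead and gives a larger value.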
Standard Deviation
The standard deviation is calculated by taking the square root of the variance. It measures the same spread, but in the original units of the data rather than in squared units, which makes it easier to interpret.
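Formula
$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
Where:
- $\sigma$ is the standard deviation.
- $n$ is the total number of values in the dataset.
- $\sum_{i=1}^{n}$ is the sum over the values from 1 to $n$.
- $(x_i - \bar{x})^2$ is the square of the difference of each value in the dataset from the mean $\bar{x}$.
- $\sum_{i=1}^{n}(x_i - \bar{x})^2$ is the sum of all the squared differences of the values from the mean.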
Example
- Let's say we have the list of numbers 34, 56, 190, 10000, and 45.
- Here $n = 5$ (number of values).
- The mean of the list is $\bar{x} = \frac{34 + 56 + 190 + 10000 + 45}{5} = 2065$.
- From the variance calculation above, $\sigma^2 = \frac{78721292}{5} = 15744258.4$.
- So, $\sigma = \sqrt{15744258.4} \approx 3967.90$.
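As a minimal sketch, the same standard deviation can be computed in Python by taking the square root of the population variance. The variable names are illustrative, and `statistics.pstdev` from the standard library serves as a cross-check.

```python
import math
import statistics

# The example dataset from the text.
data = [34, 56, 190, 10000, 45]

n = len(data)
mean = sum(data) / n

# Population variance (divide by n), as in the variance example.
variance = sum((x - mean) ** 2 for x in data) / n

# Standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(round(std_dev, 2))                  # 3967.9
print(round(statistics.pstdev(data), 2))  # same value from the standard library
```

The result is in the same units as the data, and here it is dominated by the single very large value, 10000.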
Range
The range is defined as the difference between the largest and the smallest values in the dataset. It tells us the width of the interval that contains all the values in the dataset.
Formula
Range = $x_{\max} - x_{\min}$
Where:
- $x_{\max}$ is the largest value in the list of numbers.
- $x_{\min}$ is the smallest value in the list of numbers.
Example
- Let's say we have the list of numbers 34, 56, 190, 10000, and 45, so $n = 5$.
- Here $x_{\max} = 10000$ and $x_{\min} = 34$.
- Range = $x_{\max} - x_{\min} = 10000 - 34 = 9966$
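The range calculation is a one-liner in Python. The sketch below uses the built-in `max` and `min` functions; the variable names are illustrative.

```python
# The example dataset from the text.
data = [34, 56, 190, 10000, 45]

largest = max(data)    # 10000
smallest = min(data)   # 34

# Range is the largest value minus the smallest value.
value_range = largest - smallest

print(value_range)     # 9966
```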
Quartiles
Quartiles are numbers that divide the data values into quarters, breaking the dataset into four segments. As with the median, the data must be sorted before the quartiles are determined.
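As a rough illustration, the sketch below sorts the example dataset and asks Python's standard library for the three cut points that split it into quarters. Several conventions exist for computing quartiles, and `statistics.quantiles` with its default settings is only one of them, so the first and third quartiles may differ from values obtained by another method.

```python
import statistics

# The example dataset used throughout this lesson.
data = [34, 56, 190, 10000, 45]

# Quartiles are defined on ordered data, so sort first.
sorted_data = sorted(data)

# n=4 requests the three cut points that divide the data into four parts.
q1, q2, q3 = statistics.quantiles(sorted_data, n=4)

print(sorted_data)   # [34, 45, 56, 190, 10000]
print(q1, q2, q3)    # the second cut point is the median
```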