Key Concepts

Population vs. sample

The population is the collection of all observations relevant to the problem at hand. Gathering every observation is usually impractical, so we select a subset of observations from the population that represents it well. This subset is our sample.

Statistical Inference

The process of estimating a population parameter from a sample statistic is called statistical inference. It has two major areas: estimation and statistical hypothesis testing.

Parameter vs. statistic

Values such as the mean and standard deviation are called parameters when they describe the population and statistics when they are computed from a sample. We can estimate a parameter's value using the corresponding statistic. The gap between the sample statistic and the population parameter is called the sampling error.
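As a rough illustration of the distinction, the sketch below computes a parameter, a statistic, and the resulting sampling error. The population values, the sample size of 100, and all variable names are assumptions made up for this example; in practice the full population is rarely available.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 values, invented for illustration.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameter: computed from the whole population.
mu = statistics.fmean(population)

# Statistic: computed from a random sample of 100 observations.
sample = random.sample(population, 100)
x_bar = statistics.fmean(sample)

# Sampling error: the gap between the statistic and the parameter.
sampling_error = x_bar - mu
print(f"parameter mu = {mu:.2f}, statistic x_bar = {x_bar:.2f}, "
      f"sampling error = {sampling_error:.2f}")
```

A different seed gives a different sample and therefore a different sampling error, which is why the statistic is only an estimate of the parameter.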

Central Limit Theorem

The Central Limit Theorem states that if you have a population with mean $\mu$ and finite standard deviation $\sigma$, and you take sufficiently large random samples from it with replacement (so that the draws are independent), then the distribution of the sample means will be approximately normal, whatever the shape of the population distribution.
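One way to see the theorem at work is a small simulation. The sketch below assumes a strongly right-skewed population (exponential, chosen only for illustration) and uses skewness as a crude check of symmetry; the `skewness` helper, the sample size of 50, and the 2,000 repetitions are all assumptions of this example rather than part of the theorem.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

def skewness(values):
    """Skewness: about 0 for a symmetric, bell-shaped distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical, strongly right-skewed population (exponential with mean 1).
population = [random.expovariate(1.0) for _ in range(20_000)]

# Draw 2,000 samples of size 50 with replacement and keep each sample mean.
sample_means = [
    statistics.fmean(random.choices(population, k=50))
    for _ in range(2_000)
]

print(f"population skewness:  {skewness(population):.2f}")
print(f"sample-mean skewness: {skewness(sample_means):.2f}")
```

The sample means should be far less skewed than the population itself, which is consistent with their distribution approaching a normal shape.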

Sampling Distribution

The probability distribution of a statistic is called a sampling distribution. It depends on the distribution of the population, the size of the samples, and the method of choosing the samples. The standard deviation of the sampling distribution is called the standard error. The standard error of the sampling distribution decreases as the sample size increases.
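The shrinking standard error can be checked empirically. The sketch below is a minimal example under assumed conditions: a made-up uniform population, draws with replacement, and the sample mean as the statistic. For that case the standard error is known to equal $\sigma / \sqrt{n}$, so the code prints both the simulated value and that formula for comparison; the function name `empirical_standard_error` is hypothetical.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical population, invented for illustration.
population = [random.uniform(0, 100) for _ in range(10_000)]
sigma = statistics.pstdev(population)

def empirical_standard_error(n, repetitions=2_000):
    """Standard deviation of `repetitions` sample means, each from n draws."""
    means = [statistics.fmean(random.choices(population, k=n))
             for _ in range(repetitions)]
    return statistics.stdev(means)

for n in (10, 40, 160):
    se = empirical_standard_error(n)
    print(f"n = {n:3d}: empirical SE = {se:.2f}, "
          f"sigma / sqrt(n) = {sigma / n ** 0.5:.2f}")
```

Each fourfold increase in the sample size should roughly halve the standard error.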

Constructing the sampling distribution

  1. Choose a sample size $n$ and a sample statistic, say the mean $\bar{x}$.

  2. Randomly draw $n$ values from the population.

  3. Calculate the chosen statistic $\bar{x}$ on that sample and store it.

  4. Repeat steps 2 and 3 many times. The stored values form the sampling distribution.
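The steps above can be sketched in a few lines of Python. This is one possible reading of the procedure, not a definitive implementation: the function name `sampling_distribution`, its parameters, the choice to draw with replacement, and the toy population of the integers 1 to 100 are all assumptions made for illustration.

```python
import random
import statistics

def sampling_distribution(population, n, statistic=statistics.fmean,
                          repetitions=1_000, seed=None):
    """Return `repetitions` values of `statistic`, each from a sample of size n."""
    rng = random.Random(seed)  # private generator so results are reproducible
    results = []
    for _ in range(repetitions):                 # step 4: repeat many times
        sample = rng.choices(population, k=n)    # step 2: draw n values at random
        results.append(statistic(sample))        # step 3: compute and store
    return results

# Step 1: choose the sample size (n = 30) and the statistic (the mean).
population = list(range(1, 101))  # hypothetical population
means = sampling_distribution(population, n=30, seed=42)

print(f"mean of the sample means: {statistics.fmean(means):.2f}")
print(f"standard error (SD of the sample means): {statistics.stdev(means):.2f}")
```

Passing a different function, for example `statistic=statistics.median`, builds the sampling distribution of that statistic instead.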

