Key Concepts

Population vs. sample

The population is the collection of all observations relevant to the problem at hand. Gathering data on every observation is usually impractical, so we select a sufficiently large, representative subset of the population, which is called the sample.

Statistical Inference

The process of estimating a population parameter from a sample statistic is called statistical inference. It has two major areas: estimation and statistical hypothesis testing.

Parameter vs. statistic

Values such as the mean and standard deviation are called parameters when they are computed for the population and statistics when they are computed for a sample. We can estimate a parameter's value using the corresponding statistic. The difference between a sample statistic and the population parameter it estimates is called the sampling error.
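As a minimal sketch of the distinction, the snippet below builds a hypothetical population (the normal distribution, its mean of 50, and the sample size of 100 are all assumptions chosen for illustration), computes the mean once as a parameter and once as a statistic, and reports the sampling error.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values drawn from a normal distribution.
population = [random.gauss(50, 10) for _ in range(10_000)]

# Parameter: computed from the whole population.
population_mean = statistics.fmean(population)

# Statistic: computed from a random sample of the population.
sample = random.sample(population, 100)
sample_mean = statistics.fmean(sample)

# Sampling error: gap between the statistic and the parameter.
sampling_error = sample_mean - population_mean
print(f"parameter={population_mean:.2f} statistic={sample_mean:.2f} error={sampling_error:.2f}")
```

Rerunning with a different seed gives a different sample and therefore a different sampling error, while the parameter stays fixed for a given population.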

Central Limit Theorem

The Central Limit Theorem states that if you have a population with mean $\mu$ and finite standard deviation $\sigma$ and you take sufficiently large random samples from it with replacement (so that the draws are independent), then the distribution of the sample means will be approximately normal, regardless of the shape of the population distribution.
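The theorem can be checked informally with a small simulation. The sketch below assumes a deliberately skewed (exponential) population, a sample size of 50, and 2,000 repetitions; these numbers and the helper name `sample_means` are illustrative choices, not part of the theorem. It then checks one property of a normal distribution: roughly 68% of values fall within one standard deviation of the mean.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical skewed population (exponential), far from normal.
population = [random.expovariate(1.0) for _ in range(20_000)]

def sample_means(pop, n, repeats):
    """Means of `repeats` random samples of size n, drawn with replacement."""
    return [statistics.fmean(random.choices(pop, k=n)) for _ in range(repeats)]

means = sample_means(population, n=50, repeats=2_000)

center = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal distribution, about 68% of values lie within one
# standard deviation of the mean.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(f"mean of sample means={center:.3f}, share within 1 sd={within_one_sd:.2f}")
```

The mean of the sample means should also land close to the population mean, even though the population itself is strongly skewed.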

Sampling Distribution

The probability distribution of a statistic is called a sampling distribution. It depends on the distribution of the population, the size of the samples, and the method of choosing the samples. The standard deviation of the sampling distribution is called the standard error. The standard error of the sampling distribution decreases as the sample size increases.
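To illustrate how the standard error shrinks with sample size, the sketch below estimates it empirically for the sample mean and compares it with the standard result $\sigma/\sqrt{n}$ for independent draws. That formula is not stated in the text above, and the uniform population, the sample sizes, and the helper name `standard_error` are assumptions made for the example.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical population: uniform values between 0 and 100.
population = [random.uniform(0, 100) for _ in range(20_000)]
sigma = statistics.pstdev(population)  # population standard deviation

def standard_error(pop, n, repeats=1_000):
    """Empirical standard error: std dev of the means of many samples of size n."""
    means = [statistics.fmean(random.choices(pop, k=n)) for _ in range(repeats)]
    return statistics.stdev(means)

for n in (10, 40, 160):
    empirical = standard_error(population, n)
    theoretical = sigma / n ** 0.5
    print(f"n={n:>3} empirical SE={empirical:.2f} sigma/sqrt(n)={theoretical:.2f}")
```

Each fourfold increase in $n$ should roughly halve the standard error, which is what the $\sqrt{n}$ in the denominator predicts.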

Constructing the sampling distribution

Choose a sample size $n$ and a sample statistic, say the mean $\bar{x}$.
Randomly draw $n$ values from the population to form a sample.
Calculate the chosen sample statistic $\bar{x}$ ...
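The steps above can be sketched as a short function. Because the list is cut off, repeating the draw many times and collecting the results is an assumption about the remaining steps; the function name `sampling_distribution`, the die-roll population, and the repeat count are likewise hypothetical.

```python
import random
import statistics
from collections import Counter

def sampling_distribution(population, n, statistic=statistics.fmean,
                          repeats=1_000, rng=None):
    """Return `repeats` values of `statistic`, one per random sample of size n."""
    rng = rng or random.Random()
    values = []
    for _ in range(repeats):               # assumed: repeat the draw many times
        sample = rng.sample(population, n)  # randomly choose n sample values
        values.append(statistic(sample))    # calculate the chosen statistic
    return values

rng = random.Random(3)  # seeded generator so the run is repeatable
population = [rng.randint(1, 6) for _ in range(5_000)]  # hypothetical die rolls
x_bars = sampling_distribution(population, n=30, rng=rng)

# Rough text histogram of the sample means, binned to one decimal place.
bins = Counter(round(x, 1) for x in x_bars)
for value in sorted(bins):
    print(f"{value:.1f} {'#' * (bins[value] // 10)}")
```

Passing a different function as `statistic`, for example `statistics.median`, gives the sampling distribution of that statistic instead of the mean.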

