Probability refers to the chance or likelihood that an event will occur. When all outcomes are equally likely, it is calculated by dividing the number of ways the event can occur by the total number of possible outcomes.
Probabilities give a sense of success and failure. Normally, they are described as values between 0 and 1, where the probabilities of all possible outcomes sum to 1. They can also be described as percentages, in which case the percentages of all possible outcomes sum to 100. The following table maps different probability values to the likelihood of occurrence.
Probability | Likelihood of Occurrence |
0% | Impossible |
0–50% | Unlikely |
50% | Equal chance |
50–100% | Likely |
A probability distribution is a mathematical function that describes the probability of occurrence of all possible outcomes of a statistical experiment. All possible outcomes of an experiment make up its sample space, and the probability distribution describes those outcomes numerically. Simply put, a probability distribution records all the possible outcomes and their probabilities in a table, much like a frequency distribution table. It can also be expressed as a graph or a mathematical function to quickly look up the probability of an outcome.
A probability distribution can be used to calculate the probability of landing on heads while flipping a fair coin, drawing a specific card from a 52-card deck, or getting a specific number while rolling a fair die.
Flipping a fair coin: For example, when a fair coin is flipped, landing on heads and landing on tails are equally likely. Since the probabilities of all possible outcomes always sum to 1, the probability of landing on heads is exactly 1/2, and so is the probability of landing on tails. Representing these outcomes in a table gives the probability distribution of the experiment, i.e., flipping a coin.
Rolling a 6-sided die: Similarly, the chances of landing on any number while rolling a 6-sided die are equally likely. The respective probability distribution in the form of a table is given below:
Outcome | Probability |
1 | 1/6 |
2 | 1/6 |
3 | 1/6 |
4 | 1/6 |
5 | 1/6 |
6 | 1/6 |
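The die table above can be reproduced with a few lines of Python. This is a minimal sketch rather than a definitive implementation; the variable names (`die_distribution`, `total_probability`) are illustrative choices, and exact fractions are used so the sum comes out to exactly 1.

```python
from fractions import Fraction

# Faces of a fair 6-sided die; each face is assumed to be equally likely.
faces = range(1, 7)
die_distribution = {face: Fraction(1, len(faces)) for face in faces}

# Print the distribution in the same "Outcome | Probability" layout as the table.
for face, probability in die_distribution.items():
    print(f"{face} | {probability}")

# The probabilities of all outcomes in the sample space sum to 1.
total_probability = sum(die_distribution.values())
print("Sum =", total_probability)
```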
Measuring height in a class: Let’s take an example of a simple experiment that measures and records the height of all students in a particular class. Recording heights will form a frequency distribution that can readily explain the spread of different heights and likelihood of different values.
No. of Students | Height (cm) |
12 | 160 |
8 | 150 |
4 | 152 |
4 | 161 |
3 | 156 |
3 | 154 |
7 | 155 |
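As a rough illustration of how this frequency table relates to likelihood, the sketch below divides each count by the total number of students to get a relative frequency. The counts are copied from the table; the names `height_counts` and `relative_frequency` are assumptions made for this example.

```python
# Number of students recorded at each height (cm), copied from the table above.
height_counts = {160: 12, 150: 8, 152: 4, 161: 4, 156: 3, 154: 3, 155: 7}

total_students = sum(height_counts.values())

# Relative frequency = count / total; this estimates the likelihood of each height.
relative_frequency = {
    height: count / total_students
    for height, count in sorted(height_counts.items())
}

for height, frequency in relative_frequency.items():
    print(f"{height} cm | {frequency:.3f}")
```

The relative frequencies sum to 1, which is what lets the table be read as a probability distribution for this class.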
In a nutshell, a probability distribution describes the probabilities of the values of a random variable, which are determined by the outcome of a random experiment. For a single random variable, probability distributions are divided into two types: discrete and continuous.
A discrete probability distribution describes a random variable that takes countable categories or distinct values. For example, coin tosses and die rolls are discrete events because there are no values in between. In a coin flip the only outcomes are heads or tails; there is no in-between value.
The probabilities in a discrete probability distribution always sum to one. Each possible outcome has a non-zero probability, and the random variable takes exactly one of the possible values in any trial.
Example: The likelihood of rolling any given number on a fair die is 1/6, and the likelihoods of all six numbers sum to 1.
Let’s take an example of flipping a fair coin twice, where X is the number of heads that occurred. Let’s find the probability distribution of X and the probability of getting at least one head.
All of the possible outcomes of this experiment (the sample space) are {tt, ht, th, hh}, where t represents a tail and h represents a head. All four outcomes are equally likely and mutually exclusive. The probability distribution table for X is shown below:
X | Frequency | P(X) |
0 | 1 | 0.25 |
1 | 2 | 0.50 |
2 | 1 | 0.25 |
Sum = 1 |
The above table describes the probability distribution of the discrete variable.
Since we are interested in the probability of having at least one head, and the values of X are mutually exclusive, the probability is the sum of the probabilities for X = 1 and X = 2: P(X ≥ 1) = P(X = 1) + P(X = 2) = 0.50 + 0.25 = 0.75.
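The two-flip experiment can also be checked by enumerating the sample space directly. The sketch below is one possible way to do it, assuming each of the four outcomes is equally likely; the names `sample_space`, `distribution`, and `p_at_least_one_head` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate every outcome of two flips: 'h' is a head, 't' is a tail.
sample_space = ["".join(flips) for flips in product("ht", repeat=2)]

# X is the number of heads in an outcome; count how many outcomes give each X.
frequency = Counter(outcome.count("h") for outcome in sample_space)

# Each outcome is equally likely, so P(X) = frequency / size of the sample space.
distribution = {
    x: Fraction(frequency[x], len(sample_space)) for x in sorted(frequency)
}

# At least one head means X >= 1; the cases are mutually exclusive, so add them.
p_at_least_one_head = sum(p for x, p in distribution.items() if x >= 1)

print(distribution)
print("P(X >= 1) =", p_at_least_one_head)
```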
In a continuous probability distribution, a random variable can take infinitely many values over an interval. The random variable in a continuous probability distribution is measured on a continuous scale.
Example: Measuring height, temperature, and weight.
Unlike a discrete probability distribution, where each possible outcome has a non-zero probability, a continuous probability distribution assigns zero probability to any single exact value. For example, the probability of measuring a temperature of exactly 32.00 degrees on a particular day is zero.
A continuous probability distribution (CPD) is calculated on intervals or ranges of values rather than on a single value. The probability in a CPD indicates the likelihood of a value falling within a specific interval or range, which corresponds to the area under the curve over that interval when the CPD is drawn as a probability plot.
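To make the difference between an exact value and an interval concrete, here is a hedged sketch using Python's standard `statistics.NormalDist`. It assumes, purely for illustration, that the daily temperature follows a normal distribution with a mean of 30 degrees and a standard deviation of 2 degrees; neither the choice of distribution nor these parameters come from the text.

```python
from statistics import NormalDist

# Hypothetical model: temperature ~ Normal(mean=30, standard deviation=2).
temperature = NormalDist(mu=30, sigma=2)

def probability_between(dist, low, high):
    """Probability that a value falls in [low, high], via the CDF difference."""
    return dist.cdf(high) - dist.cdf(low)

# An interval has positive probability (the area under the curve over it).
p_interval = probability_between(temperature, 31, 33)

# A single exact value is an interval of zero width, so its probability is zero.
p_exact = probability_between(temperature, 32.00, 32.00)

print(f"P(31 <= T <= 33) = {p_interval:.4f}")
print(f"P(T = 32.00)     = {p_exact}")
```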
Consider a machine that fills 100 bottles of juice in an hour. Each bottle is meant to hold a fixed quantity of juice, but due to mechanical variation the actual quantity varies between 0.950 L and 1.050 L. The following table records the quantity of liquid filled in the last hour, grouped into ranges.
Quantity X (L) | No. of Bottles |
0.950–0.975 | 30 |
0.975–1.000 | 35 |
1.000–1.025 | 20 |
1.025–1.050 | 15 |
Total | 100 |
Here the quantities are represented as ranges of values, and so far the table is only a simple frequency table. To turn it into a probability distribution, we simply divide each frequency (the number of bottles) by the total of 100. The new table is shown below.
Quantity X (L) | No. of Bottles | Probability |
0.950–0.975 | 30 | 0.30 |
0.975–1.000 | 35 | 0.35 |
1.000–1.025 | 20 | 0.20 |
1.025–1.050 | 15 | 0.15 |
Total | 100 | 1.00 |
Now we can calculate the probability of the machine filling a bottle with a quantity in the range 0.975–1.025 L by adding the probabilities of the two intervals that cover it: P(0.975 ≤ X ≤ 1.025) = 0.35 + 0.20 = 0.55.
The above computation shows that the probability of the machine filling a bottle with a quantity in this range is 55%.
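The bottle calculation can be sketched in Python as follows. The counts are taken from the frequency table; the helper `probability_between` and the other names are hypothetical, and the helper only handles ranges that line up with the table's interval boundaries.

```python
from fractions import Fraction

# (lower bound, upper bound) in litres -> number of bottles, from the table.
bottle_counts = {
    (0.950, 0.975): 30,
    (0.975, 1.000): 35,
    (1.000, 1.025): 20,
    (1.025, 1.050): 15,
}

total_bottles = sum(bottle_counts.values())

# Probability of each interval = its frequency divided by the total.
probabilities = {
    interval: Fraction(count, total_bottles)
    for interval, count in bottle_counts.items()
}

def probability_between(low, high):
    """Sum the probabilities of the table intervals lying inside [low, high]."""
    return sum(
        p for (start, end), p in probabilities.items()
        if start >= low and end <= high
    )

p_range = probability_between(0.975, 1.025)
print(f"P(0.975 <= X <= 1.025) = {float(p_range):.2f}")
```

Exact fractions are used here to sidestep floating-point rounding when the interval probabilities are added.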