Rules of Probability

Understand the rules of probability and how they apply to discriminative and generative models.

At the simplest level, a model, be it for machine learning or a more classical method such as linear regression, is a mathematical description of how various kinds of data relate to one another.

In the task of modeling, we usually think about separating the variables of our dataset into two broad classes:

  1. Independent data: These are the inputs to a model, denoted by $X$. They could be categorical features (such as a $0$ or $1$ in six columns indicating which of six schools a student attends), continuous (such as the heights or test scores of the same students), or ordinal (the rank of a student in the class).

  2. Dependent data: This refers to the outputs of our models, denoted by $Y$. As with the independent variables, these can be continuous, categorical, or ordinal, and they can be an individual element or a multidimensional matrix (tensor) for each element of the dataset (see the sketch after this list).
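To make this concrete, here is a minimal sketch of how such a dataset might be arranged, using NumPy. The numbers, column layout, and variable names are made up for illustration and are not part of the lesson:

```python
import numpy as np

# Hypothetical dataset of six students.
# Columns 0-5: one-hot indicators (0 or 1) for which of six schools a student attends
# Column 6:   height in cm (continuous)
# Column 7:   rank in the class (ordinal)
X = np.array([
    [1, 0, 0, 0, 0, 0, 172.0, 3],
    [0, 1, 0, 0, 0, 0, 165.5, 1],
    [0, 0, 1, 0, 0, 0, 180.2, 5],
    [0, 0, 0, 1, 0, 0, 158.9, 2],
    [0, 0, 0, 0, 1, 0, 170.3, 4],
    [0, 0, 0, 0, 0, 1, 176.1, 6],
])

# Dependent data Y: one output per row of X,
# here a continuous value such as a final test score.
Y = np.array([74.0, 88.5, 62.3, 91.0, 70.4, 66.8])

assert X.shape[0] == Y.shape[0]  # one output for each input example
```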

In some cases, $Y$ is a label that can be used to condition a generative output, such as in a conditional GAN.

So, how can we describe the data in our model using statistics? In other words, how can we quantitatively describe what values we are likely to see, how frequently, and which values are more likely to appear together? One way is by asking the likelihood of observing a particular value in the data, or the probability of that value. For example, if we were to ask what the probability is of observing a roll of $4$ on a six-sided die, the answer is that, on average, we would observe a $4$ once every six rolls. We write this as follows:

$$P(X = 4) = \frac{1}{6}$$
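As a quick sanity check, here is a short Monte Carlo sketch (the simulation setup is ours, not part of the lesson) that estimates this probability by simulating many rolls of a fair die:

```python
import random

# Estimate P(roll = 4) for a fair six-sided die by simulation.
# The estimate should converge toward 1/6 ≈ 0.1667 as n_rolls grows.
n_rolls = 100_000
count_fours = sum(1 for _ in range(n_rolls) if random.randint(1, 6) == 4)
print(f"Estimated P(roll = 4): {count_fours / n_rolls:.4f}")  # ~0.1667
```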
