...

Probability Distribution Monad (Continued)

In this lesson, we will continue our discussion of probability distributions as monads in C# and implement the Weight function. We will also highlight the advantages of the probability distribution monad.

In the previous lesson, we achieved two significant results in our effort to build better probability tools. First, we demonstrated that the SelectMany implementation, which applies a likelihood function to a prior probability, is the bind operation of the probability monad. Second, we gave an implementation of a wrapper object that implements it. Its action can be summed up as:

  1. Sample from the prior distribution
  2. Use the likelihood function to get the conditional distribution
  3. Sample from the conditional distribution
  4. Run the projection on the pair of samples to get the result
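The four steps above can be sketched in a few lines. This is a minimal sketch, not the wrapper class from the previous lesson: a distribution is modeled here as a hypothetical `Func<Random, T>` sampling delegate instead of `IDiscreteDistribution<T>`, and the name `Combine` is an assumption made for illustration.

```csharp
using System;

// Hypothetical stand-in for IDiscreteDistribution<T>: a distribution is
// modeled as a function that draws one sample from a random source.
Func<Random, C> Combine<A, B, C>(
    Func<Random, A> prior,
    Func<A, Func<Random, B>> likelihood,
    Func<A, B, C> projection)
{
    return random =>
    {
        A a = prior(random);                          // 1. sample from the prior
        Func<Random, B> conditional = likelihood(a);  // 2. get the conditional distribution
        B b = conditional(random);                    // 3. sample from the conditional
        return projection(a, b);                      // 4. project the pair of samples
    };
}

// Usage: a fair coin as the prior; the conditional always yields a + 10.
Func<Random, int> coin = random => random.Next(2);
Func<Random, int> combined = Combine<int, int, int>(
    coin,
    a => random => a + 10,
    (a, b) => a * 100 + b);

int sample = combined(new Random());
Console.WriteLine(sample); // 10 if the coin gave 0, 111 if it gave 1
```

Sampling only needs the four steps to be run in order. Computing a weight is harder because it has to account for every pair of samples that leads to a given result.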

Implementing the Weight Function

You will probably recall that we did not implement the Weight function.

It’s a little tricky to do so, for two reasons.

  1. First, we decided to make weights integers.
    • If the weights are fractions between 0.0 and 1.0, you can multiply the weight of the prior sample by the weight of the conditional sample. (And if the weights are logarithms, you can add them.) It’s trickier with integers.
  2. Second, the projection at the end reintroduces the possibility of “collisions”: the projection could map distinct combinations of samples to the same value, which then has to be weighted as the sum of the weights of those combinations.
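In symbols, the fractional case looks like this. The notation is introduced here for illustration and does not come from the lesson: $p(a)$ is the prior weight of sample $a$ as a fraction, $q(b \mid a)$ is the fractional weight of $b$ in the conditional distribution chosen for $a$, and $f$ is the projection. The weight of a result $c$ is the sum over every pair of samples that projects to it:

$$
w(c) = \sum_{(a,\,b)\,:\,f(a,\,b) = c} p(a)\, q(b \mid a)
$$

With integer weights, $p(a)$ and $q(b \mid a)$ are only available as ratios of integers, so the products have to be rescaled to a common denominator before they can be summed as integers.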

That’s all a little abstract, so let’s work an example.

Example

Suppose we have a population of people who have been diagnosed with Frob Syndrome, which seems to be linked with height. We’ll divide the population of Frob Syndrome patients into three categories:

enum Height { Tall, Medium, Short}

and let’s suppose in our study population there are five tall people, two medium-height people, and one short person in every eight:

var prior = new List<Height>() { Tall, Medium, Short}.ToWeighted(5, 2, 1);

Now let’s suppose we’ve surveyed each of the tall, medium, and short people to learn the severity of their symptoms:

enum Severity { Severe, Moderate, Mild}

At this point, we are going to make the numbers a bit odd to illustrate the mathematics more clearly. How likely is a member of each group to report each level of symptoms? Let’s say that 10 out of every 21 tall people report severe symptoms and the remaining 11 report moderate symptoms. For the medium-height people, 12 out of 17 report moderate symptoms and 5 report mild symptoms. And all the short people report mild symptoms:

var severity = new List<Severity> { Severe, Moderate, Mild};

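// Weights are given in the order Severe, Moderate, Mild.
Func<Height, IDiscreteDistribution<Severity>> likelihood = h =>
{
  switch (h)
  {
    case Tall:    // 10 of 21 severe, 11 of 21 moderate
      return severity.ToWeighted(10, 11, 0);
    case Medium:  // 12 of 17 moderate, 5 of 17 mild
      return severity.ToWeighted(0, 12, 5);
    default:      // Short: all mild
      return severity.ToWeighted(0, 0, 1);
  }
};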

And now let’s suppose we have a recommended prescription level:

enum Prescription { DoubleDose, NormalDose, HalfDose}

Taller people or people with more severe symptoms get a higher dose; shorter people or people with mild symptoms get a smaller dose:

Func<Height, Severity, Prescription> projection = (h, s) =>
{
  switch (h)
  {
    case Tall: return s == Severe ? DoubleDose : NormalDose;
    case Medium: return s == Mild ? HalfDose : NormalDose;
    default: return HalfDose;
  }
};

The question now is: what is the probability distribution on prescriptions for this study population? That is, if we picked a random member of this population, how likely is it that they’d have a double, normal, or half dose prescription?

IDiscreteDistribution<Prescription> doses = prior.SelectMany(likelihood, projection);

The problem is to work out the weightings of the three possible outcomes.
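As a cross-check on the arithmetic, the weightings can also be found by brute force: enumerate every pair of height and severity, scale each product to an integer, and sum the pairs that project to the same prescription. The following is only a sketch under stated assumptions, not the Weight implementation this lesson is building. Strings stand in for the enums, dictionaries stand in for the distributions, zero-weight outcomes are left out, and the helper names `Projection`, `Gcd`, and `Lcm` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Integer weights from the example (strings stand in for the enums).
var prior = new Dictionary<string, int> { ["Tall"] = 5, ["Medium"] = 2, ["Short"] = 1 };
var likelihood = new Dictionary<string, Dictionary<string, int>>
{
    ["Tall"]   = new Dictionary<string, int> { ["Severe"] = 10, ["Moderate"] = 11 },
    ["Medium"] = new Dictionary<string, int> { ["Moderate"] = 12, ["Mild"] = 5 },
    ["Short"]  = new Dictionary<string, int> { ["Mild"] = 1 },
};

// Same rule as the projection in the lesson, written over strings.
string Projection(string height, string severity)
{
    switch (height)
    {
        case "Tall": return severity == "Severe" ? "DoubleDose" : "NormalDose";
        case "Medium": return severity == "Mild" ? "HalfDose" : "NormalDose";
        default: return "HalfDose";
    }
}

long Gcd(long x, long y) => y == 0 ? x : Gcd(y, x % y);
long Lcm(long x, long y) => x / Gcd(x, y) * y;

// Least common multiple of the conditional totals (21, 17 and 1), so that
// every conditional fraction scales to a whole number.
long scale = likelihood.Values
    .Select(d => (long)d.Values.Sum())
    .Aggregate(1L, (x, y) => Lcm(x, y));

var doses = new Dictionary<string, long>();
foreach (var entry in prior)
{
    var conditional = likelihood[entry.Key];
    long total = conditional.Values.Sum();
    foreach (var outcome in conditional)
    {
        string dose = Projection(entry.Key, outcome.Key);
        long weight = (long)entry.Value * outcome.Value * (scale / total);
        doses.TryGetValue(dose, out long soFar);  // soFar is 0 when dose is new
        doses[dose] = soFar + weight;             // colliding pairs are summed
    }
}

foreach (var entry in doses)
    Console.WriteLine($"{entry.Key}: {entry.Value}");
```

Under these assumptions the weights come out as 850 for DoubleDose, 1439 for NormalDose, and 567 for HalfDose, which sum to 2856 = 8 × 21 × 17.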

As we mentioned before, it’s easiest to do this when the weights are fractions because we can then just multiply them and then add them up:

Height                  Severity                    Prescription
Tall ($\frac{5}{8}$)    Severe ($\frac{10}{21}$)    DoubleDose ($\frac{25}{84}$)
...