Better Estimation of the Expected Value (Continued)
In this lesson, we are going to stick to the restriction to distributions with support over 0.0 to 1.0 for pedagogic reasons, but our aim is to find a technique that gets us back to sampling over arbitrary distributions.
In the previous lesson, we implemented a better technique for estimating the expected value of a function `f` applied to samples from a distribution `p`:
- Compute the total area (including negative areas) under the function `x => f(x) * p.Weight(x)`.
- Compute the total area under `x => p.Weight(x)`. This is 1.0 for a normalized PDF, or the normalizing constant of a non-normalized PDF; if we already know it, we don't have to compute it.
- The quotient of these areas is the expected value.
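The steps above can be sketched in a few lines. This is a minimal illustration, not the lesson's actual implementation; the `Area` helper, its midpoint-rule quadrature, and the example distribution are all assumptions made here for demonstration:

```csharp
using System;

class QuadratureEstimate
{
    // Hypothetical helper: approximate the area under g over [lo, hi]
    // with n equal-width slices, using the midpoint rule.
    static double Area(Func<double, double> g, double lo, double hi, int n = 1000)
    {
        double width = (hi - lo) / n;
        double total = 0.0;
        for (int i = 0; i < n; i += 1)
            total += g(lo + (i + 0.5) * width) * width;
        return total;
    }

    // Expected value of f over p on [0.0, 1.0]: the area under
    // x => f(x) * p.Weight(x) divided by the area under p.Weight.
    public static double ExpectedValue(Func<double, double> f, Func<double, double> pWeight) =>
        Area(x => f(x) * pWeight(x), 0.0, 1.0) / Area(pWeight, 0.0, 1.0);

    static void Main()
    {
        // Example: a non-normalized weight function 2x on [0, 1].
        // Analytically, the expected value of x under it is 2/3.
        Console.WriteLine(ExpectedValue(x => x, x => 2.0 * x)); // ≈ 2/3
    }
}
```

Note that dividing by `Area(pWeight, ...)` is exactly the "quotient of these areas" step; for an already-normalized PDF that divisor is 1.0 and could be skipped.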
Drawbacks of Using Quadrature to Get an Approximate Numerical Solution
Essentially our technique was to use quadrature to get an approximate numerical solution to an integral calculus problem.
However, we also noted that it seems like there might still be room for improvement, in two main areas:
- This technique only works when we have a good bound on the support of the distribution; for our contrived example, we chose a “profit function” and a distribution where we said that we were only interested in the region from 0.0 to 1.0.
- Our initial intuition, that we could implement an estimate of “the average of many samples” by simply averaging many samples, seems correct; can we get back to that approach?
The argument that we are going to make here (several times!) is: two things that are both equal to the same third thing are also equal to each other.
Recall that we arrived at our quadrature implementation by estimating that our continuous distribution’s expected value is close to the expected value of a very similar discrete distribution. We are going to make our argument a little bit more general here by removing the assumption that `p` is a normalized distribution. That means that we’ll need to know the normalizing factor `np`, which as we’ve noted is `Area(p.Weight)`.
We said that we could estimate the expected value like this:
- Imagine that we create a 1000-sided “unfair die” discrete distribution.
- Each side corresponds to a 0.001-wide slice from the range 0.0 to 1.0; let’s say that we have a variable `x` that takes on values 0.000, 0.001, 0.002, and so on, corresponding to the 1000 sides.
- The weight of each side is the probability of choosing this slice: `p.Weight(x) / 1000 / np`.
- The value of each side is the “profit function” `f(x)`.
- The expected value of “rolling this die” is the sum of (value times weight): the sum of `f(x) * (p.Weight(x) / 1000 / np)` over our thousand values of `x`.
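The sum over the thousand sides can be written out directly. The sketch below is an illustration under assumptions made here (the weight and profit functions are passed as plain `Func<double, double>` delegates, and `np` is computed with the same 1000-slice sum), not the lesson's actual code:

```csharp
using System;

class UnfairDieEstimate
{
    // Expected value of "rolling the 1000-sided unfair die":
    // the sum of f(x) * (weight(x) / 1000 / np) over x = 0.000, 0.001, ...
    public static double ExpectedValue(Func<double, double> f, Func<double, double> weight)
    {
        // np is the normalizing factor: the total area under the weight
        // function, approximated here with the same thousand slices.
        double np = 0.0;
        for (int i = 0; i < 1000; i += 1)
            np += weight(i / 1000.0) / 1000.0;

        double sum = 0.0;
        for (int i = 0; i < 1000; i += 1)
        {
            double x = i / 1000.0; // 0.000, 0.001, 0.002, ...
            sum += f(x) * (weight(x) / 1000.0 / np);
        }
        return sum;
    }
}
```

Because the weights already divide by `np`, they sum to 1.0; the die is a genuine discrete distribution even when `p` was not normalized.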
Here’s the trick:
- Consider the standard continuous uniform distribution `u`. That’s a perfectly good distribution with support 0.0 to 1.0.
- Consider the function `w(x)`, which is `x => f(x) * p.Weight(x) / np`. That’s a perfectly good function from `double` to `double`.
Question: What is an estimate of the expected value of `w` over samples from `u`?
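One way to explore that question is simply to try it: draw many samples from the standard uniform distribution and average `w` over them. The sketch below is hypothetical (the fixed seed, sample count, and delegate-based signature are assumptions made here), and it assumes `np` is already known:

```csharp
using System;

class UniformSampleEstimate
{
    // Average w(x) = f(x) * weight(x) / np over many samples drawn
    // from the standard continuous uniform distribution u.
    public static double Estimate(
        Func<double, double> f,
        Func<double, double> weight,
        double np,
        int sampleCount = 100_000)
    {
        var random = new Random(12345); // fixed seed for reproducibility
        double sum = 0.0;
        for (int i = 0; i < sampleCount; i += 1)
        {
            double x = random.NextDouble(); // a sample from u on [0.0, 1.0)
            sum += f(x) * weight(x) / np;   // w(x)
        }
        return sum / sampleCount;
    }
}
```

If this average comes out close to the quadrature answer, that is the "two things equal to the same third thing" argument in action: both are estimates of the same sum.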