...

/

Introduction to Optimal Transport and OTT

Introduction to Optimal Transport and OTT

This lesson will introduce Optimal Transport and the OTT library.

We talked about Wasserstein distance in the last lesson. It’s quite important to get a background of the underlying theory of optimal transport (OT) first.

Note: This lesson is a basic introduction to optimal transport (OT). It can be skipped if needed.

Optimal transport

In 1781, French mathematician Gaspard Monge presented the following problem:

_“A worker with a shovel in hand has to move a large pile of sand lying on a construction site. The worker’s goal is to erect a target pile with a prescribed shape (for example, a giant sandcastle). Naturally, the worker wishes to minimize her total effort, quantified, for instance, as the total distance or time spent carrying shovelfuls of sand.”
[G. Monge, Mémoire sur la théorie des déblais et des remblais (De l’Imprimerie Royale, 1781)]

So the problem is quite clear:

  • We have a distribution X\mathcal{X} (the raw pile of sand in this case).
  • We want to transport it to another distribution Y\mathcal{Y}.
  • We want to reduce the cost (be it time or cost) of the transportation, T:XYT:\mathcal{X} \to \mathcal{Y}.

This problem applies in almost any area of daily life whenever we have to move a distribution rather than a single item. Common areas include computer vision, bioinformatics (especially in single-cell sequencing), and machine learning.

Wasserstein distance

Having seen the optimal transport problem, we are now in a better position to understand the Wasserstein distance.

In 1939, Russian mathematician Leonid Kantorovich defined this metric to address the OT problem. However, his work remained obscure and was only rediscovered 30 years later by another Russian mathematician, Leonid Vaseršteĭn and hence named Wasserstein - which is a bit unfair to Kantorovich. So some scholars encourage the use of his name).

Note: Wasserstein distance is often referred to as Earth-mover distance in computer science literature because of the original Monge formulation.

Like norms, Wasserstein distance can be defined over a range of natural numbers. Generally,

Wp(μ,ν)=(infE[(d(X, ...