Sequence Alignment is the Manhattan Tourist Problem in Disguise
Let’s relate sequence alignment to the Manhattan Tourist Problem.
We'll cover the following...
Sequence comparison as a graph
In the figure below, we add two arrays of integers to an alignment of ATGTTATA and
ATCGTCC. The array [0 1 2 2 3 4 5 6 7 8] holds the number of symbols of ATGTTATA
used up to a given column in the alignment. Similarly, the array [0 1 2 3 4 5 5 6 6 7]
holds the number of symbols of ATCGTCC used up to a given column. In figure below, we add a third array [↘ ↘ → ↘ ↘ ↓ ↘
↓ ↘
] recording whether each column represents a match/mismatch (↘/↘
), an insertion (→
), or a deletion (↓
).
This third array corresponds to a path from source to sink in an rectangular grid, shown in the figure below (left). The i-th node of this path is made up of the i-th element of [0 1 2 2 3 4 5 6 7 8] and ...