...

/

String Reconstruction With Overlap Graph: The Genome Vanishes

String Reconstruction With Overlap Graph: The Genome Vanishes

Explore how to find the disappeared genome path.

The genome can still be traced out in the graph in the figureFIGURE3_7 by following the horizontal path from TAA to GTT. But in genome sequencing, we don’t know in advance how to correctly order reads. Therefore we’ll arrange the 3-mers lexicographically, which produces the overlap graph shown in the figure below. The genome path has disappeared!

STOP and Think: Can any other strings be reconstructed by following a path visiting all the nodes in the above figure?

Finding genome path

The genome path may have disappeared to the naked eye, but it must still be there, since we’ve simply rearranged the nodes of the graph. Indeed, the figure below (top) highlights the genome path spelling out TAATGCCATGGGATGTT. However, ...