...

/

String Reconstruction With Overlap Graph: From String To Graph

String Reconstruction With Overlap Graph: From String To Graph

Let’s explore how a string can be represented using a directed graph.

From a string to a graph

Repeats in a genome necessitate some way of looking ahead to see the correct assembly in advance. Returning to our previous example, you may have already found that TAATGCCATGGGATGTT is a solution to the String Reconstruction Problem for the collection of fifteen 3-mers in the last section, as illustrated below.

Note that we use a different color for each interval of the string between occurrences of ATG.

STOP and Think: Is this the only solution to the String Reconstruction Problem for this collection of 3-mers?

In the figure above, consecutive 3-mers in TAATGCCATGGGATGTT are linked together to form the genome path.

String spelled by a genome path

String Spelled by a Genome Path Problem

Problem overview:

Reconstruct a string from its genome path.

Input: A sequence of k-mers Pattern1_{1}, . . . , Patternn_{n} ...