String Reconstruction With Overlap Graph: From String To Graph
Let’s explore how a string can be represented using a directed graph.
From a string to a graph
Repeats in a genome necessitate some way of looking ahead to see the correct assembly in advance. Returning to our previous example, you may have already found that TAATGCCATGGGATGTT is a solution to the String Reconstruction Problem for the collection of fifteen 3-mers in the last section, as illustrated below.
Note that we use a different color for each interval of the string between occurrences of ATG.
STOP and Think: Is this the only solution to the String Reconstruction Problem for this collection of 3-mers?
In the figure above, consecutive 3-mers in TAATGCCATGGGATGTT are linked together to form the genome path.
String spelled by a genome path
String Spelled by a Genome Path Problem
Problem overview:
Reconstruct a string from its genome path.
Input: A sequence of k-mers Pattern, . . . , Pattern ...