...

String Reconstruction with Gluing Nodes and De Bruijn Graphs

Let’s learn how we can reconstruct strings by gluing labeled nodes.

We'll cover the following...

Gluing nodes and de Bruijn graphs
Gluing labeled nodes
- Solution explanation
Charging Station (The Effect of Gluing on the Adjacency Matrix)

This time, instead of assigning these 3-mers to nodes, we’ll assign them to edges, as shown in the figure below. You can once again reconstruct the genome by following this path from left to right, adding one new nucleotide at each step. Since each pair of consecutive edges represent consecutive 3-mers that overlap in two nucleotides, we’ll label each node of this graph with a 2-mer representing the overlapping nucleotides shared by the edges on either side of the node. For example, the node with incoming edge CAT and outgoing edge ATG is labeled AT.

Gluing labeled nodes

Nothing seems new here until we start gluing identically labeled nodes. In the figure given below (top panels), we bring the three AT nodes closer and closer to each other until they’ve been glued into a single node. Note that there are also three nodes labeled by TG, which we glue together in the figure below (middle panels). Finally, we glue together the two nodes labeled GG (GG and GG), as shown in the figure below(bottom panels), which produces a special type of edge called a loop connecting GG to itself.

Before Getting Started

Where in the Genome Does DNA Replication Begin?

DNA Replication: Open Problems, Charging Stations, and Detours

How Do We Assemble Genomes?

Assemble Genomes: Charging Stations, and Detours

How Do We Compare Biological Sequences?

Biological Sequences: Detours

Conclusion

String Reconstruction with Gluing Nodes and De Bruijn Graphs

Gluing nodes and de Bruijn graphs

Gluing labeled nodes