Counting Words

Explore how to count frequent k-mers (nucleotide strings of length k) in DNA to uncover regions vital for biological processes, such as protein binding during DNA replication. Learn how a sliding window counts overlapping occurrences of a sequence and how to express that method in pseudocode.

Identifying frequent words

Operating under the assumption that DNA is a language of its own, let’s borrow Legrand’s method and see if we can find any surprisingly frequent “words” within the ori of Vibrio cholerae. We’ve added reason to look for frequent words in the ori because for various biological processes, certain nucleotide strings appear surprisingly often in small regions of the genome. This is because certain proteins can only bind to DNA if a specific string of nucleotides ...
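As a rough illustration of what counting "words" in a DNA string can look like, here is a minimal Python sketch. The function names `pattern_count` and `frequent_words` are assumptions made for this example, not names defined in the lesson text shown here. It slides a window of width k across the text one position at a time, so overlapping occurrences are counted.

```python
from collections import Counter


def pattern_count(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    k = len(pattern)
    # Slide a window of width k across text, one position at a time.
    return sum(1 for i in range(len(text) - k + 1) if text[i:i + k] == pattern)


def frequent_words(text: str, k: int) -> set[str]:
    """Return every k-mer that occurs the maximum number of times in text."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # Tally each window of width k; Counter maps k-mer -> number of occurrences.
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    if not counts:
        return set()  # text is shorter than k, so it contains no k-mers
    top = max(counts.values())
    return {kmer for kmer, n in counts.items() if n == top}


if __name__ == "__main__":
    print(pattern_count("GCGCG", "GCG"))  # 2 (the two matches overlap)
    print(frequent_words("GCGCG", 3))     # {'GCG'}
```

The sketch makes a single pass to build the tally rather than recounting every k-mer from scratch. This is one reasonable design choice among several and is not necessarily the approach the lesson goes on to present.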
