Charging Station: Conversions between Patterns and Numbers
Learn how to convert patterns into numbers and decode these numbers to get the original sequence.
Our approach to computing PatternToNumber(Pattern) is based on a simple observation. If we remove the final symbol from every k-mer in the lexicographically ordered list of all k-mers, the resulting list is still ordered lexicographically (think about removing the final letter from every word in a dictionary). In the case of DNA strings, every (k − 1)-mer in the resulting list appears four times in a row, once for each possible final nucleotide (A, C, G, or T).
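The observation can be checked directly for small k. The following is a minimal sketch, not part of the original lesson: the helper name ordered_kmers and the constant ALPHABET are hypothetical, chosen only for illustration.

```python
from itertools import product

ALPHABET = "ACGT"  # assumed nucleotide alphabet, already in lexicographic order

def ordered_kmers(k):
    """Return all k-mers over ALPHABET in lexicographic order (hypothetical helper)."""
    return ["".join(symbols) for symbols in product(ALPHABET, repeat=k)]

k = 3
# Drop the final symbol from every 3-mer, keeping the original order.
prefixes = [kmer[:-1] for kmer in ordered_kmers(k)]

# The truncated list is still in lexicographic order ...
assert prefixes == sorted(prefixes)
# ... and every (k - 1)-mer appears exactly four times in a row.
assert prefixes == [p for p in ordered_kmers(k - 1) for _ in range(4)]

print(prefixes[:8])  # ['AA', 'AA', 'AA', 'AA', 'AC', 'AC', 'AC', 'AC']
```

Both assertions pass for k = 3, and changing k to another small positive value should give the same behavior.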
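Thus, the 3-mers occurring before AGT fall into two groups: those whose first two symbols form a 2-mer occurring before AG (four 3-mers for each such 2-mer), and those that begin with AG and end in a symbol occurring before T. Their total is four times the number of 2-mers occurring before AG plus the number of 1-mers occurring before T. Therefore,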
PatternToNumber( AGT ) = 4 · PatternToNumber( ...
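The reasoning above (four times the number of the prefix, plus the number of 1-mers occurring before the final symbol) can be sketched as a short recursive function. This is an illustrative sketch under stated assumptions, not necessarily the lesson's own implementation: the name pattern_to_number and the table SYMBOL_TO_NUMBER are hypothetical, the table assumes the lexicographic order A < C < G < T, and the empty pattern is assumed to map to 0.

```python
# Assumed mapping: the number of 1-mers occurring before each symbol
# in the lexicographic order A < C < G < T.
SYMBOL_TO_NUMBER = {"A": 0, "C": 1, "G": 2, "T": 3}

def pattern_to_number(pattern):
    """Return the number of k-mers that occur before `pattern` in lexicographic order."""
    if not pattern:
        return 0  # assumed base case: the empty pattern maps to 0
    prefix, last_symbol = pattern[:-1], pattern[-1]
    # Each k-mer before the prefix accounts for four longer patterns;
    # the final symbol then adds the count of symbols that precede it.
    return 4 * pattern_to_number(prefix) + SYMBOL_TO_NUMBER[last_symbol]

print(pattern_to_number("AG"))   # 2
print(pattern_to_number("AGT"))  # 11
```

For AGT, the call unwinds as 4 · 2 + 3 = 11, where 2 is the value for AG and 3 is the number of 1-mers (A, C, G) occurring before T.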