Solution: Edit Distance

Solution 1: Recursive algorithm

An alignment of two strings is a two-row matrix such that the first (or second) row contains the ordered symbols of the first (or second) string, interspersed with the space symbols () in such a way that no two space symbols appear in the same column.

Exercise break: Compute the number of different alignments of two strings of lengths nn and mm.

We classify the columns of an alignment as follows:

  • A column with a symbol and a space is a deletion\color{green}\text{deletion}.
  • A column with a gap and a symbol is an insertion\color{blue}\text{insertion}.
  • A column with two equal symbols is a match\color{orange}\text{match}.
  • A column with two different symbols is a mismatch\color{purple}\text{mismatch}.

An alignment is optimal if it minimizes the total number of mismatches, deletions, and insertions among all possible alignments.

Exercise break: Prove that the Edit Distance Problem can be reduced to finding an optimal alignment of two strings.

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.