How to find the edit distance between two strings

The edit distancealso known as the Levenshtein distance is the measure of how dissimilar two strings are. It counts the minimum number of operations required to turn one string into another. The following operations are taken into consideration:

  1. Insertion
  2. Deletion
  3. Substitution

For example, the edit distance between “life” and “love” would be 2 since it requires two operations of substitution.

To start with the algorithm, we first initialize an empty matrix of zeros with the width and height equal to the lengths of the two strings respectively. The first row and column are filled with numbered values to represent the placement of each character.

We run two for loops to traverse through every element of the matrix. If two letters are found to be the same, the new value at position [i, j] is set as the minimum value between position [i-1, j] + 1, position [i-1, j-1], and position [i, j-1] + 1.

If the two letters are not the same, the new value at position [i, j] is set as the minimum value between position [i-1, j] + 1, position [i-1, j-1] + 1, and position [i, j-1] + 1.

The edit distance will be the value at the lower right corner of the matrix.

import numpy as np
#initialize two strings
str1 = "lying"
print(str1)
str2 = "layout"
print(str2)
def lev(str1,str2):
#create matrix of zeros
len1=len(str1)+1
len2=len(str2)+1
matrix = np.zeros((len1,len2))
#number x and y indices in matrix
for i in range(len1):
matrix[i,0] = i
for j in range(len2):
matrix[0,j] = j
#two for loops to compare strings letter by letter
for i in range(1, len1):
for j in range(1, len2):
if str1[i-1]==str2[j-1]:
matrix[i,j] = min(
matrix[i-1,j] + 1,
matrix[i,j-1] + 1,
matrix[i-1,j-1])
else:
matrix[i,j] = min(
matrix[i-1,j] + 1,
matrix[i-1,j-1] + 1,
matrix[i,j-1] + 1)
print(matrix)
return(matrix[len1-1,len2-1])
m=lev(str1,str2)
print('Levenshtein distance is:',m)

Free Resources