Solution: Longest Common Subsequence of Two Sequences

Solution for the Longest Common Subsequence of Two Sequences Problem.

We'll cover the following

Solution

Consider a longest common subsequence C=(c1,,cp)C = (c_1 , \ldots, c_p ) specified by indices 1i1<i2<<ipn1≤i_1 <i_2 <\ldots<i_p ≤n and 1j1<j2<<jpm1≤j_1 <j_2 <\ldots<j_p ≤ m (therefore, for every 1qp,aiq=bjq=cq1≤q≤p, a_{i_{q}}=b_{j_{q}} =c_q).

  • Last symbols of AA and BB appear in CC. In this case, ip=ni_ p =n and jp=mj_p =m. Then, (c1,,cp1)(c_1,\ldots,c_{p−1}) is a longest common subsequence of (a1,,an1)(a_1,\ldots,a_{n−1}) and (b1,,bm1)(b_1,\ldots,b_{m−1}).

  • At least one of the last symbols of AA and BB doesn’t appear in CC. In this case, either ip<ni_p < n or jp<mj_p < m. Then, (c1,,cp1)(c_1,\ldots,c_{p−1}) lies entirely in either (a1,,an1)(a_1,\ldots,a_{n−1}) or (b1,,bm1)(b_1,\ldots,b_{m−1}).

This way, we reduce the problem for the initial strings AA and BB to the same problem on their prefixes. This motivates the following definition: LCS(i,j)LCS(i,j) is the length of the longest common subsequence of A[1i]A[1\ldots i] and B[1j]B[1\ldots j]. The discussion above implies that this function satisfies the following recurrence relation:

LCS(i,j)=max{LCS(i1,j)LCS(i,j1)LCS(i1,j1)+1if   A[i]=B[j]LCS(i , j) = \text{max} \begin{cases} LCS(i-1 , j) \\ LCS(i , j-1) \\ LCS(i-1 , j-1)+1 & \text{if }~~ A[i]=B[j] \end{cases}

The base case for this recurrence relation is i=0i = 0 or j=0j = 0:

LCS(0,j)=LCS(i,0)=0.LCS(0,j) = LCS(i,0) = 0.

The resulting algorithm is shown below. Its running time is O(nm)O(nm).


 LCS(A[1n],B[1m])LCS(A[1\ldots n],B[1 \ldots m]):
 tabletable ← 2d array of size (n+1)×(m+1)(n + 1) × (m + 1)
 table[i][0]0table[i][0] ← 0 and table[0][j]0table[0][j] ← 0 for all i,ji,j
 for all i,ji,j for ii from 11 to nn:
 for jj from 11 to mm:
  table[i][j]table[i1][j]table[i][j] ← table[i − 1][j]
  table[i][j]max(table[i][j],table[i][j1])table[i][j] ← max(table[i][j], table[i][j − 1])
  if A[i]=B[j]A[i] = B[j]:
   table[i][j]max(table[i][j],table[i1][j1]+1)table[i][j] ← max(table[i][j], table[i − 1][j − 1] + 1)
 return table[n][m]table[n][m]


Stop and think: Can you reduce the Longest Common Subsequence Problem to the Edit Distance Problem?

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.