...

/

Solving the Longest Common Substring Problem

Solving the Longest Common Substring Problem

Let's solve the Longest Common Substring problem using Dynamic Programming.

Statement

Given two strings s1 and s2, you have to find the length of the Longest Common Substring (LCS) in both these strings.

Let’s say we have two strings, “helloworld” and “yelloword”, there are multiple common substrings, such as “llo”, “ello”, “ellowor”, “low”, and “d”. The longest common substring is “ellowor”, with length 7.

Constraints:

  • 11 \leq s1.length , s2.length \leq 400400

  • s1 and s2 consist of only lowercase English characters.

  • Spaces are not allowed.

Examples

No.

s1

s2

Length of LCS

1

helloworld

helloold

5

2

educativeexplore

educativeexpress

12

3

educativeisfun

algorithmsarefun

3

Try it yourself

Implement your solution in the following playground.

Press + to interact
Java
usercode > Main.java
import java.util.*;
public class Main{
public static int lcsLength (String input1, String input2) {
// Write Your Code Here
return 0;
}
}
Longest Common Substring

Note: If you clicked the “Submit” button and the code timed out, this means that your solution needs to be optimized in terms of running time.

Hint: Use dynamic programming and see the magic.

Solution

We will first explore the naive recursive solution to this problem and then see how it can be improved using dynamic programming.

Naive approach

Let’s look at this problem on a smaller scale. We are comparing each character one by one with the other string. There can be three possibilities for the ith^{th} character of s1 and the jth^{th} character of s2:

  1. If both characters match, these could be part of a common substring, meaning we should count this character towards the length of the longest common substring.

  2. ith^{th}character of s1 might match with (j+1)th^{th}character of s2.

  3. jth^{th}character of s2 might match with (i+1)th^{th}character of s1.

Therefore, we take the maximum among all three of these possibilities. Look at the following visualization.

Let’s implement the algorithm as discussed above:

import java.util.*;
import java.util.stream.*;
class Solution {
public static int lcsLengthRec(String s1, String s2, int i, int j, int count) {
// base case of when either string has been exhausted
if (i >= s1.length() || j >= s2.length()) {
return count;
}
// if i and j characters match, increment the count and compare the rest of the strings
if (s1.charAt(i) == s2.charAt(j)) {
count = lcsLengthRec(s1, s2, i + 1, j + 1, count + 1);
}
// compare s1[i:] with s2, s1 with s2[j:], and take max of current count and these two results
return Math.max(count, Math.max(lcsLengthRec(s1, s2, i + 1, j, 0), lcsLengthRec(s1, s2, i, j + 1, 0)));
}
public static int lcsLength(String s1, String s2) {
return lcsLengthRec(s1, s2, 0, 0, 0);
}
// Driver code
public static void main(String[] args) {
String[] s1 = {
"educative",
"bcdcdcd",
"arefun",
"yourocks",
"abc"
};
String[] s2 = {
"education",
"aacdcdcd",
"isfun",
"youawesome",
"def"
};
// You can uncomment the lines below and check how this recursive solution causes a time-out
// String temp1[] = Arrays.copyOf(s1, s1.length + 1);
// temp1[s1.length] = "ypzrvyigwdiqrnbglvviozqzruvmwivgvqvrfhqi";
// s1 = temp1;
// String temp2[] = Arrays.copyOf(s2, s2.length + 1);
// temp2[s2.length] = "wdiqrnbglvviozqzruvmwivgvqvrfhqiypzrvyigwdiqrn";
// s2 = temp2;
for (int i = 0; i < s1.length; i++) {
System.out.print(i + 1);
System.out.println(".\tInput string 1: \"" + s1[i] + "\"");
System.out.println("\tInput string 2: \"" + s2[i] + "\"");
System.out.println("\n\tThe Length of Longest Common Substring is: " + lcsLength(s1[i], s2[i]));
Stream.generate(() -> "-").limit(100).forEach(System.out::print);
System.out.println();
}
}
}
Longest Common Substring using recursion

Note: Please observe that if you include the test case commented out in the driver code, the solution is likely to time out. Try to solve the larger problem using the dynamic programming solutions provided ...