Problem Solving: Substring Search

Test your problem-solving skills in this challenging exercise.

Substring searching and its variants

Substring index

Write a program that takes two strings, S and sub, and returns the index of sub in S (if sub is the subset of S).

Sample program

Str1:	helloaaabbc	 
Str2:	abbc

helloaaabbcabbc
       ^
       abbc
Found at index: 7

Sliding window technique

Let’s suppose we have a window of the length of the substring, sub length, and we are placing that window at every index of S (giving the effect of sliding by moving one step forward) and checking if the window matches exactly, at the current index. We immediately return that index. Otherwise, we return -1, meaning that the substring is not found.

Look at the animation below:

#include <iostream>
using namespace std;
#include <cstring>
int substring_index(char S[ ], char sub[])
{
    // If substring found, return index otherwise return -1
    // Write your code here
}
int main()
{
    char S[ ] = "helloaaabbcabbc";
    char sub[] = "abbc";
    
    int fi = substring_index(S, sub);
    if(fi!=-1)
    {
        cout << S<<endl;
        for(int i=0;i<fi; i++)
            cout << ' ';
        cout << '^'<<endl;
        for(int i=0;i<fi; i++)
            cout << ' ';
        cout<< sub<<endl;
        cout<<"index: "<< fi<<endl;
    }
    return 0; 
}
Substring index

Look at the solution below:

Substring occurrences in a string

Write a program that takes two strings, S and sub, and returns all the occurrences of sub in S.

Sample input

Str1:	helloaaabbcabbc	 
Str2:	abbc
helloaaabbcabbc
       ^   ^
abbc found 2 times!

What changes do we need to make in the function above?

We just need to add an int array Is[], that saves the index when every substring character matches with the string.

Instead of returning immediately after finding the substring, we’ll save that index i inside Is[] and move the window forward, and exhaust all the indexes.

Look at the complete code below:

#include <iostream>
using namespace std;
#include <cstring>
int substring_index_all(char S[ ], char sub[], int Is[])
{
    int sub_length = strlen(sub);
    int S_length    = strlen(S);
    int ii = 0;
    for(int si=0; si<=S_length-sub_length; si++)
    {
        int count=0;
        for(int i=0; i<sub_length; i++)
        {   // if ith character of sub matches the ith character 
            if(S[si+i]==sub[i])  // of the window starting at si in S
            {
                count++;
            }
        }
        // if every character of substring sub is matched
        if(count==sub_length)  
        {
            Is[ii] = si; // save this index
            ii++;   // increment in ii so that the next index 
        }           // is saved at the next index in Is
    }
    return ii;
}
void printSpecial(int times, char S[], char sub[], int Is[])
{
    if(times!=0)
    {
        cout << S<<endl;
        int ti=0;
        for(int i=0;ti!=times; i++)   // Printing location where exactly found.
        {
            if(Is[ti]!=i)
                cout << ' ';
            else
                cout << '^', ti++;
                
        }
        cout<< endl << sub <<" found " << times << " times!" <<endl;
    }
    else
    {
        cout<< endl << sub <<" was not found "<<endl;
    }
}
int main()
{
    const int capacity = 100;
    char S[ ] = "helloaaabbcabbc";
    char sub[] = "abbc";
    
    int Is[capacity] = {};
    int times = substring_index_all(S, sub,Is);
    printSpecial(times, S, sub, Is);

    return 0;    
}

Subset occurrence in a string