Combined Words
Solve a hard-level problem of checking if a word combines two or more words using tries.
Problem statement
A combined word is defined as a string that is created by joining at least two shorter words given in the array. Given an array of unique strings, return all the combined words present in the given array.
Example 1
Sample input
words: ["rain", "drop", "raindrop"]
Sample output
["raindrop"]
Explanation
"raindrop" is a combination of two shorter words "rain" and "drop".
Example 2
Sample input
["cat", "fat", "sat", "mat", "on", "catsatonmat", "fatcatsat"]
Sample output
["catsatonmat", "fatcatsat"]
Explanation
"catsatonmat" is a combination of four shorter words "cat","sat","on" and "mat". "fatcatsat" is a combination of three shorter words "fat","cat" and "sat".
Example 3
Sample input
["pen", "pens", "ink", "pensinkpens", "inkpensink"]
Sample output
["pensinkpens", "inkpensink"]
Explanation
"pensinkpens" is a combination of three shorter words "pens","ink" and "pens". "inkpensink" is a combination of three shorter words "ink","pens" and "ink".
Try it yourself
Try to solve the problem yourself before reading the solution.
#include <iostream>#include <vector>using namespace std;vector<string> findCombinedWords(vector<string>& words) {// your code goes herevector<string> answer;return answer;}
Intuition
Observing the examples, we can see that a combined word is created by adding other words as prefixes. Let's consider the word W
and divide it into two parts at an index i
. Let's call the substring from index 0
to i
as P
and the substring from i+1 to length(w)
as S
. If we can find the P
word in the given list of words, it means we've found a valid prefix for the word S
. Now we can move ahead and break down the word S
into more such substrings and extract smaller words out of it.
So, essentially, we have to answer this question recursively each time: is a substring(word[i, word. length()-1])
prefixed with another word given in the list of words? Because we consider the smaller words as prefixes to the combined word, the trie data structure can naturally help us identify valid combined words.
To summarize, we iterate through the letters of the word, finding the nodes in the trie, until we encounter a node with isEndOfWord
set as true
. This implies we've found a shorter valid word in the given word. We increment the count of words and then recursively follow the same search procedure for the remaining substring from the start of the trie.
Let's dive into an example, where the words list = {"pen", "pens", "ink", "pensinkpens", "inkpensink"}
and we try to find other small words present in pensinkpens
.
We call a search method search("pensinkpens", 0)
, which results in two possible paths:
We can consider one of the smaller words as
pen
and further perform a search for the stringsinkpens
.We can consider
pens
as the word and perform a search forinkpens
.
The first path leads to no valid answers since sinkpens
cannot be broken down further into smaller words. The second path leads to finding a word (pens
), increasing the combined words count to 1
, and then performing search("inkpens", 1)
, which leads to another word (ink
) being found. The combined word count now increases to 2
and then a call to search("pens", 2)
again results in the word pens
being found and leads to the combined words count increasing to 3
. The entire word string is thus divided into three smaller words ("pens" + "ink" + "pens")
successfully.
Algorithm
Step 1: Insert the words into a trie
Insert all the words into a trie. Then, iterate again over the list and traverse the trie for each word to check if it's a combined word recursively.
Step 2: Recursively check for prefixes
Move from the root node downward with each character in the word; if you encounter isEndOfWord
as true, it means a valid prefix has been traversed. We can now jump to the root node and keep moving forward in the word or ignore the isEndOfWord
flag and keep moving in the word forward without jumping to the trie root node. We ...