Problem statement

Given a list of sentences and an integer $K$, return the $K$ most frequent words in those sentences, sorted by frequency in descending order. If two words have the same frequency, order them lexicographically.

Example

Sample input


// custom comparator for sorting
compare(pair a, pair b)
{
    // a and b are pairs of the form <int, string> that store a frequency (cnt) and a word
    // sort by frequency first (higher frequency first),
    // then by lexicographical order of the word
    if (a.cnt is not equal to b.cnt)
        return a.cnt > b.cnt
    else
        return a.word < b.word
}
topKFrequent(sentences, k)
{
    // create a hashmap from word to frequency
    mp = map(string, int)
    // iterate through all sentences
    for every sentence in sentences
    {
        // fetch the words in the current sentence
        for every word in sentence
        {
            // increment the count for this word in map mp
            increment mp[word]
        }
    }
    // create a list of pairs (int, string) to store frequency related data
    freqData = list(pair<int, string>)
    // push map data into freqData list
    for (word, cnt) pair in mp
        freqData.add(new_pair{cnt, word})
    // sort the freqData list using the custom comparator
    sort(freqData, compare)
    answer = list(string)
    // pick the top K elements from the sorted list
    // and add those to the final answer
    for every (cnt, word) pair in freqData
    {
        if (k > 0)
        {
            answer.add(word)
            decrement k
        }
    }
    return answer
}
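
The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the lesson's reference solution: the function name `top_k_frequent` is hypothetical, and it assumes that words are separated by whitespace and compared case-sensitively, which the problem statement does not specify.

```python
from collections import Counter


def top_k_frequent(sentences, k):
    """Return the k most frequent words, ties broken lexicographically."""
    # Count every word across all sentences.
    # Assumption: words are whitespace-separated and case-sensitive.
    counts = Counter(
        word for sentence in sentences for word in sentence.split()
    )
    # Sort by descending frequency first, then by the word itself.
    # Negating the count plays the role of the custom comparator.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # Keep only the first k words.
    return [word for word, _ in ordered[:k]]
```

Sorting on the key `(-count, word)` gives the same ordering as the `compare` function: higher counts come first, and equal counts fall back to lexicographical order.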

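Let $N$ be the total number of words across all the sentences. Counting the words takes $O(N)$ time, and sorting up to $N$ distinct words with a standard comparison sort such as merge sort takes $O(N \log N)$ time in the worst case. The overall time complexity is therefore $O(N \log N)$.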

The frequency of each word is stored in an unordered map. The map holds at most $N$ entries (when every word is unique), so it takes $O(N)$ space. The space required by the sorting algorithm is also $O(N)$. So the final space complexity is $O(N)$.

Hashmap and priority queue-based approach

We ignored one helpful parameter provided in the input in the previous sorting-based approach. We only need to find the top $K$ ...
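
Since the description of this approach is cut off here, the following is only one possible sketch of a hash map plus heap solution, not necessarily the lesson's own. It relies on Python's standard `heapq.nsmallest`, which selects the `k` smallest items under a key without sorting every entry. The function name `top_k_frequent_heap` is hypothetical, and words are again assumed to be whitespace-separated and case-sensitive.

```python
import heapq
from collections import Counter


def top_k_frequent_heap(sentences, k):
    """Return the k most frequent words using a heap-based selection."""
    # Hash map from word to frequency.
    # Assumption: words are whitespace-separated and case-sensitive.
    counts = Counter(
        word for sentence in sentences for word in sentence.split()
    )
    # Select the k entries with the smallest (-count, word) key, that is,
    # the highest counts with ties broken lexicographically.
    # The result comes back already ordered by that key.
    best = heapq.nsmallest(
        k, counts.items(), key=lambda item: (-item[1], item[0])
    )
    return [word for word, _ in best]
```

When $K$ is much smaller than the number of unique words, selecting with a heap avoids the cost of fully sorting every entry.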
