Frequent Words

Solve a medium-level problem to find the K most frequent words in sentences.

Problem statement

Given a list of sentences and an integer KK, return the KK most frequent words present in those sentences, sorted by their frequency. In case there's a tie in frequency, sort the words lexicographically.

Example

Sample input

words: ["i learn to code", "i love to code"]
K: 2

Sample output

["code", "i"]

Explanation

"i", "code" and "to" are the three most frequent words with a frequency of two. But only "i" and "code" are returned as output due to lexicographical sorting.

Try it yourself

Try to solve the problem yourself before reading the solution.

#include <iostream>
#include <vector>
using namespace std;
vector<string> topKFrequentWords(vector<string> words, int k) {
// your code goes here
vector<string> answer;
return answer;
}
Solve frequent words in C++

Intuition

Sorting-based approach

The most intuitive method to solve this problem is sorting with a custom comparator. The comparator first compares the frequency of two words and sorts them in descending order of frequency. If the frequency for two words is the same, then they are sorted in alphabetically ascending (lexicographical) order.

Press + to interact
// custom comparator for sorting
compare(pair a, pair b)
{
// a and b are pairs of format<int,string > to store frequency and word
// sort based on frequency
// first and then lexicographical order
if (a.cnt is not equal to b.cnt)
return a.cnt > b.cnt
else
return a.word < b.word
}
topKFrequent(sentences, k)
{
// create a hashmap
mp = map(string, int)
// iterate through all sentences
for every sentence in sentences
{
//fetch the words in the sentences
for every word in sentences
{
increment count
for word in map mp
}
}
// create a list of pairs (int, string) to store frequency related data
freqData = list(pair<int, string>)
// push map data into freqData list
for (word, cnt) pair in mp
freqData.add(new_pair{cnt, word})
// sort the freqData list using custom comparator
sort(freqData, compare)
answer = list(string)
// pick the top K elements from the list
// add those to final answer
for every(cnt, word) pair in frequencyData
{
if (k > 0)
{
answer.add(word)
decrement k
}
}
return answer
}

Let the total number of words combined from all the sentences be NN. The best-case time complexity of a standard sorting algorithm like merge sort is O(N log(N))O(N \space log(N)).

The frequency of each word is stored in an unordered map. The size of the map can be O(N)O(N), where NN is the number of unique words. Also, the space required by a sorting algorithm is O(N)O(N). So the final space complexity is O(N)O(N) ...

Access this course and 1400+ top-rated courses and projects.