Solution: Top K Frequent Elements

Let's solve the Top K Frequent Elements problem using the Top K Elements pattern.

Statement

Given an array of integers, arr, and an integer, k, return the kk most frequent elements.

Note: You can return the answer in any order.

Constraints:

  • 11 ≤\leq arr.length ≤\leq 10310^{3}
  • −10−4-10^{-4} ≤\leq arr[i] ≤\leq 10410^{4}
  • 1 ≤\leq k ≤\leq number of unique elements in an array.
  • It is guaranteed that the answer is unique.

Solution

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach

The naive approach to finding the k most frequent elements involves iterating through each element in the input array and storing their frequencies in a hash map. For each element in the array, the algorithm checks if the element is already in the hash map. If it is not in the hash map, it is added with a frequency of 1. Otherwise, its frequency is incremented.

After counting the frequencies of each element in the array, the algorithm searches through the hash map for the element with the highest frequency. This involves iterating through the hash map, which takes O(n)O(n) time. Once the element with the highest frequency is found, it is removed from the hash map and added to a final list of the k most frequent elements.

The time complexity for this approach is O(k∗n)O(k*n), and the space complexity is O(n+k)O(n+k), where nn is the length of the input array.

Optimized approach using top k elements

The intuition behind this approach lies in using a min-heap. The key idea is to maintain the top kk elements in the min-heap based on their frequency. Initially, a frequency map is created to count the occurrences of each element. Then, we use a min-heap, which keeps elements ordered based on frequency. At first, up to kk, elements are added to the min-heap. When the size exceeds kk, and the current element has a higher frequency than the top of the heap, we replace the least common element from the heap (top of the min-heap) with the current element. This ensures that only the most frequent kk elements remain in the min-heap. By the end of the process, the min-heap contains exactly the k most frequent elements.

Now, let’s look at the workflow of the implementation:

The hash map will store the element as the key, and its corresponding frequency in the array as the value. When inserting elements from the hash map into the min-heap, the following steps are taken:

  • We’ll store a pair (a,b)(a, b) as a tuple (this can be in the form of any data structure like an array or a tuple) in the heap where aa will be the frequency of the element, and bb will be the element itself. This ensures that the elements in the min-heap are always sorted by frequency.

  • We’ll make sure that if the size of the min-heap becomes greater than k, that is, there are more than k elements in the min-heap, we pop the pair with the lowest frequency element from the min-heap. This ensures that we always have the k maximum frequent elements in the min-heap.

Once we have added the pairs from the hash map to the min-heap, the min-heap will have the pairs with the top k frequencies. We will traverse the min-heap and add the second element of each pair from the min-heap to this new array. This is done since we only need the top k frequent elements, not their respective frequencies. Finally, we will return this new array.

The illustration below shows the whole process: