Solution: Sliding Window Median

Let's solve the Sliding Window Median problem using the Two Heaps pattern.

Statement

Given an integer array, nums, and an integer, k, there is a sliding window of size k, which is moving from the very left to the very right of the array. We can only see the k numbers in the window. Each time the sliding window moves right by one position.

Given this scenario, return the median of the each window. Answers within 10510^{-5} of the actual value will be accepted.

Constraints:

  • 11 \leq k \leq nums.length 103\leq 10^3
  • 231-2^{31} \leq nums[i] 2311\leq 2^{31} - 1

Solution

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach

The naive solution is to use nested loops to traverse over the array. The outer loop ranges over the entire array, and the nested loop is used to iterate over windows of kk elements. For each window, we’ll first sort the elements and then compute the median. We’ll append this value to the median list and move the window one step forward.

The above algorithm will have a total time complexity of O(nk+1)O(klogk)O(n - k + 1) \cdot O(k \log k), that is, O((nk)(klogk))O((n-k) \cdot (k \log k)):

  • Traversal: O(nk+1)O(n - k + 1), where kk is the window size and nn is the number of elements in the array.
  • Sorting: Assuming we perform sorting using quicksort, the complexity would be O(klogk)O(k \log k).

The space complexity would be O((nk)logk)O((n-k) \cdot \log k) because the space complexity of quicksort is O(logk)O(\log k).

Optimized approach using two heaps

Since the median, let’s call it x, is the middle value in a list of sorted elements, we know that half of the elements will be smaller than (or equal to) x, and the other half will be greater than (or equal to) x.

We can divide the list into two halves. One half will store the smaller numbers—let’s call it small_list. The other half will store the larger numbers—let’s call it large_list. If the total number of elements is odd, we keep the n2th\lceil \frac{n}{2} \rceil ^{th} element in small_list. The median will be the largest number in small_list if the number of elements is odd. Otherwise, if the total number of elements is even, the median will be the mean of the largest number in small_list and the smallest number in large_list. The best data structure for finding the smallest or largest number in a list of numbers is a heap.

  • We store the first half of the numbers of the window in a max-heap. We use a max-heap because we want to know the largest number in the first half of the window.

  • We use a min-heap to store the second half of the numbers, since we want to know the smallest number in the second half of the window.

Having said that, the two heaps pattern is a perfect fit for this problem.

While sliding the window over the array, as soon as an element leaves the window, we need to remove it form the heap, too. Removing an element from the heap takes O(m)O(m) time, where mm is the size of the heap. This is the time spent in finding the element in the heap. This makes our sliding function very costly. Therefore, we don’t remove an element right away. Instead, we maintain a hash map to keep track of the elements to be removed from the heaps. We only remove an element from the heap if it’s at the top of the heap, which reduces the time complexity to O(logn)O(logn). In addition, if the element to be removed is not at the top of the heap, it doesn’t disturb the calculation of the medians.

Here’s what the algorithm will do:

  1. Add k elements to small_list. Since we’re implementing a max-heap, we’ll multiply each element with -1.

  2. Transfer the top k2\lfloor \frac{k}{2} \rfloor elements from small_list to large_list. We’ll multiply each element by -1 again to convert it to its original value.

  3. Append the median to the median list. In case of an odd window size, the median is the top element of small_list. Otherwise, it’s the mean of the top elements of the two lists.

  4. We use a balance variable to check if k2\lceil \frac{k}{2} \rceil members belonging to the sliding window are present in the max-heap (small_list). If there’s an extra element, we transfer the top element from the max-heap to the min-heap. The second highest element then springs to the top of the max-heap. Similarly, if more than k2\lfloor \frac{k}{2} \rfloor elements end up in the min-heap (large_list), we restore the balance by popping the smallest element from the min-heap and adding it to the max-heap.

  5. Move the window one step forward and add the outgoing element to the hash map. If the top element of the small list or the large list is present in the hash map with a frequency greater than 0, we remove it from the respective heap.

  6. Repeat steps 3–5 until all the numbers in the nums have been processed.

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.