Solution: Sliding Window Median
Let's solve the Sliding Window Median problem using the Two Heaps pattern.
Statement
Given an integer array, nums
, and an integer, k
, there is a sliding window of size k
, which is moving from the very left to the very right of the array. We can only see the k
numbers in the window. Each time the sliding window moves right by one position.
Given this scenario, return the median of the each window. Answers within of the actual value will be accepted.
Constraints:
-
k
nums.length
-
nums[i]
Solution
So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.
Naive approach
The naive solution is to use nested loops to traverse over the array. The outer loop ranges over the entire array, and the nested loop is used to iterate over windows of elements. For each window, we’ll first sort the elements and then compute the median. We’ll append this value to the median list and move the window one step forward.
The above algorithm will have a total time complexity of , that is, :
- Traversal: , where is the window size and is the number of elements in the array.
- Sorting: Assuming we perform sorting using quicksort, the complexity would be .
The space complexity would be because the space complexity of quicksort is .
Optimized approach using two heaps
Since the median, let’s call it x, is the middle value in a list of sorted elements, we know that half of the elements will be smaller than (or equal to) x, and the other half will be greater than (or equal to) x.
We can divide the list into two halves. One half will store the smaller numbers—let’s call it small_list
. The other half will store the larger numbers—let’s call it large_list
. If the total number of elements is odd, we keep the element in small_list
. The median will be the largest number in small_list
if the number of elements is odd. Otherwise, if the total number of elements is even, the median will be the mean of the largest number in small_list
and the smallest number in large_list
. The best data structure for finding the smallest or largest number in a list of numbers is a heap.
-
We store the first half of the numbers of the window in a max-heap. We use a max-heap because we want to know the largest number in the first half of the window.
-
We use a min-heap to store the second half of the numbers, since we want to know the smallest number in the second half of the window.
Having said that, the two heaps pattern is a perfect fit for this problem.
While sliding the window over the array, as soon as an element leaves the window, we need to remove it form the heap, too. Removing an element from the heap takes time, where is the size of the heap. This is the time spent in finding the element in the heap. This makes our sliding function very costly. Therefore, we don’t remove an element right away. Instead, we maintain a hash map to keep track of the elements to be removed from the heaps. We only remove an element from the heap if it’s at the top of the heap, which reduces the time complexity to . In addition, if the element to be removed is not at the top of the heap, it doesn’t disturb the calculation of the medians.
Here’s what the algorithm will do:
-
Add
k
elements tosmall_list
. Since we’re implementing a max-heap, we’ll multiply each element with -1. -
Transfer the top elements from
small_list
tolarge_list
. We’ll multiply each element by -1 again to convert it to its original value. -
Append the median to the median list. In case of an odd window size, the median is the top element of
small_list
. Otherwise, it’s the mean of the top elements of the two lists. -
We use a
balance
variable to check if members belonging to the sliding window are present in the max-heap (small_list
). If there’s an extra element, we transfer the top element from the max-heap to the min-heap. The second highest element then springs to the top of the max-heap. Similarly, if more than elements end up in the min-heap (large_list
), we restore the balance by popping the smallest element from the min-heap and adding it to the max-heap. -
Move the window one step forward and add the outgoing element to the hash map. If the top element of the small list or the large list is present in the hash map with a frequency greater than 0, we remove it from the respective heap.
-
Repeat steps 3–5 until all the numbers in the
nums
have been processed.
Level up your interview prep. Join Educative to access 70+ hands-on prep courses.