
Ace the Netflix Interview: Must-Know Python Coding Problems

29 min read
May 31, 2024
Contents
Pattern 1: Modified Binary Search
Random Pick with Weight
Solution
Pattern 2: Stacks
Flatten Nested List Iterator
Solution
Pattern 3: Custom Data Structures
Time-Based Key-Value Store
Solution
Insert Delete GetRandom O(1)
Solution
Implement LRU Cache
Solution
Pattern 4: Hash Maps
Logger Rate Limiter
Solution
Pattern 5: Cyclic Sort
First Missing Positive
Solution
Pattern 6: Top K Elements
Top K Frequent Elements
Solution
Kth Largest Element in an Array
Solution
Pattern 7: Union Find
Longest Consecutive Sequence
Solution

Netflix’s software engineers are some of the most data-driven in FAANG. No change is fully incorporated without rigorous A/B testing, and to enable this constant data stream, Netflix Engineering ships thousands of lines of code daily.

Netflix describes its engineering org as a "dream team," where players push each other to facilitate rapid innovation. Engineers have tons of freedom to experiment and find the next great solution. As a result, engineering managers look for candidates who excel in creative problem-solving.

One way interviewers screen for this quality is through coding problems. While the interview process varies by role, you can expect to work through a series of coding problems within the first two rounds of your interview.

So, how can you tackle coding problems like a member of the Netflix dream team?

Look for underlying patterns.

Even if a coding problem is brand new, you can spot familiar patterns and then use this knowledge to apply relevant algorithms and strategies.

To help you prepare, we've identified the 7 most important patterns to understand for your technical interview at Netflix. We'll walk you through each pattern and provide examples of coding problems that incorporate it.

By familiarizing yourself with these patterns, you'll be equipped to tackle a wide range of coding problems in your Netflix interview. Let's get started!

Pattern 1: Modified Binary Search#

In many coding interviews, candidates encounter problems where binary search comes in handy. It's known for its logarithmic time complexity, which makes it very efficient. However, it only works when the input data is already sorted. That's where the Modified Binary Search pattern steps in. It is an advanced adaptation of the traditional binary search algorithm, adjusted to handle more complex scenarios where elements may not strictly meet the standard sorted criteria. This pattern excels at efficiently locating elements or conditions that are hard to find with a linear scan, particularly when dealing with rotated arrays, finding boundaries, or performing weighted random selection.

By dividing the search space in half, this method significantly reduces the time complexity to O(log n), making it highly effective for large datasets. Like the traditional binary search, it operates by comparing midpoints and systematically narrowing down the search range, but it incorporates additional logic to handle irregularities and special cases.

The adaptability of the Modified Binary Search pattern makes it a powerful tool in software development. It enhances the ability to manage and retrieve data efficiently in scenarios where direct comparisons and typical ordering do not apply. This pattern not only streamlines data retrieval processes but also aids in optimizing performance across various programming tasks.
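
To make the boundary-search idea concrete, here is a minimal sketch (independent of any specific interview problem) of a modified binary search that returns the index of the first element greater than or equal to a target in a sorted list:

def first_at_least(arr, target):
    # Returns the index of the first element >= target, or len(arr) if none exists
    low, high = 0, len(arr)
    while low < high:
        mid = low + (high - low) // 2
        if arr[mid] < target:
            low = mid + 1   # the answer must lie to the right of mid
        else:
            high = mid      # mid could be the answer; keep searching left
    return low

print(first_at_least([2, 4, 4, 7, 9], 4))  # 1
print(first_at_least([2, 4, 4, 7, 9], 8))  # 4

The same skeleton, with the comparison adapted, drives the weighted random pick below.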

Let’s see how the following example illustrates the application of the Modified Binary Search pattern to efficiently solve the given coding problem:

Random Pick with Weight#

We’re given an array of positive integers, weights, where weights[i] is the weight of the i-th index. The task here is to write a function, pick_index(), which performs weighted random selection to return an index from the weights array. The larger the value of weights[i], the heavier the weight and the higher the chance of its index being picked.

Suppose that the array consists of the weights [12, 84, 35]. In this case, the probabilities of picking the indexes will be as follows:

  • Index 0: 12 / (12 + 84 + 35) ≈ 9.2%

  • Index 1: 84 / (12 + 84 + 35) ≈ 64.1%

  • Index 2: 35 / (12 + 84 + 35) ≈ 26.7%

Note: Since we’re randomly choosing from the options, there is no guarantee that in any specific run of the program, any of the elements will be selected with the exact expected frequency.

Solution#

We can use the Modified Binary Search pattern to speed up the random index-picking process. It reduces the index searching time to O(log n), which otherwise would have been O(n) with linear scanning. We start by preprocessing the given weights by summing them to obtain the total weight. Then, we construct a prefix sum array where each index i stores the cumulative sum of weights up to index i. Next, we generate a random number between 1 and the total weight. Finally, we use binary search to find the index corresponding to the randomly generated number in the prefix sum array. This approach ensures that elements with higher weights have a proportionally higher chance of being selected while maintaining randomness.

Here’s how the algorithm works:

  • The Init() method generates a list of cumulative sums using the given list of weights.

  • The pick_index() method returns a randomly selected index while considering the provided weights. It works as follows:

    • Generates a random number, target, between 1 and k, where k is the largest value in the list of cumulative sums of weights (that is, the total weight).

    • Uses binary search to find the index of the first cumulative sum greater than or equal to the random value. Initialize the low index to 0 and the high index to the length of the list of cumulative sums of weights. While the low index is less than the high index, the algorithm:

      • Calculates the mid index as low + (high - low) // 2.

      • If the cumulative sum at the mid index is less than the target, the low index is updated to mid + 1.

      • Otherwise, the high index is updated to mid.

    • At the end of the binary search, the low pointer will point to the index of the first cumulative sum greater than or equal to the target. Return this index as the chosen index.

Let’s look at the code for this solution below:

import random

class RandomPickWithWeight:
    # Constructor
    def __init__(self, weights):
        # List to store running sums of weights
        self.running_sums = []
        # Variable to calculate a running sum
        running_sum = 0

        # Iterate through the given weights
        for w in weights:
            # Add the current weight to the running sum
            running_sum += w
            # Append the running sum to the running_sums list
            self.running_sums.append(running_sum)

        # Store the total sum of weights
        self.total_sum = running_sum

    # Method to pick an index based on the weights
    def pick_index(self):
        # Generate a random number between 1 and the total sum of the array
        target = random.randint(1, self.total_sum)

        # Initialize low and high variables for binary search
        low = 0
        high = len(self.running_sums)

        # Perform binary search to find the first value higher than the target
        while low < high:
            mid = low + (high - low) // 2
            if target > self.running_sums[mid]:
                low = mid + 1
            else:
                high = mid

        # Return the index (low) found
        return low

# Driver code
def main():
    counter = 900
    weights = [[1, 2, 3, 4, 5],
               [1, 12, 23, 34, 45, 56, 67, 78, 89, 90],
               [10, 20, 30, 40, 50],
               [1, 10, 23, 32, 41, 56, 62, 75, 87, 90],
               [12, 20, 35, 42, 55],
               [10, 10, 10, 10, 10],
               [10, 10, 20, 20, 20, 30],
               [1, 2, 3],
               [10, 20, 30, 40],
               [5, 10, 15, 20, 25, 30]]
    dict = {}
    for i in range(len(weights)):
        print(i + 1, ".\tList of weights: ", weights[i], ", pick_index() called ", counter, " times", "\n", sep="")
        [dict.setdefault(l, 0) for l in range(len(weights[i]))]
        sol = RandomPickWithWeight(weights[i])
        for j in range(counter):
            index = sol.pick_index()
            dict[index] += 1
        print("-"*105)
        print("\t{:<10}{:<5}{:<10}{:<5}{:<15}{:<5}{:<20}{:<5}{:<15}".format(
            "Indexes", "|", "Weights", "|", "Occurrences", "|", "Actual Frequency", "|", "Expected Frequency"))
        print("-"*105)
        for key, value in dict.items():
            print("\t{:<10}{:<5}{:<10}{:<5}{:<15}{:<5}{:<20}{:<5}{:<15}".format(key, "|", weights[i][key], "|", value, "|",
                str(round((value/counter)*100, 2)) + "%", "|", str(round(weights[i][key]/sum(weights[i])*100, 2))+"%"))
        dict = {}
        print("\n", "-"*105, "\n", sep="")

if __name__ == '__main__':
    main()

Now that we've discussed Modified Binary Search, let's focus on another important coding pattern.

Pattern 2: Stacks#

A stack is a linear data structure that organizes and manages data in a Last In, First Out (LIFO) manner. This means the last element added to the stack is the first to be removed. Think of it like a stack of plates where you can only add or remove plates from the top.

Using stacks as a coding pattern involves the following fundamental operations:

  • Push (O(1)): Adds the element at the top of the stack.

  • Pop (O(1)): Removes and returns the element from the top of the stack.

  • Peek (O(1)): Returns the element at the top of the stack without removing it.

  • IsEmpty (O(1)): Checks whether the stack is empty or not. Returns TRUE if the stack is empty, FALSE otherwise.

  • Size (O(1)): Returns the total number of elements in the stack.

Stacks are commonly used for tasks like expression evaluation, syntax parsing, or tracking state changes in algorithms. To identify if a problem can be solved using the Stacks pattern, look for scenarios where the last in, first out property is advantageous or where tracking state changes in a last in, first out manner is necessary. Examples of common interview problems that can be tackled using the Stacks pattern include evaluating arithmetic expressions, checking balanced parentheses, or implementing a browser’s back functionality.
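
As a quick illustration of the pattern, here is a minimal sketch of the balanced-parentheses check mentioned above, using a plain Python list as the stack:

def is_balanced(expression):
    # Map each closing bracket to its matching opening bracket
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in expression:
        if ch in "([{":
            stack.append(ch)  # push an opening bracket
        elif ch in pairs:
            # A closing bracket must match the most recently pushed opening bracket
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # every opening bracket must have been closed

print(is_balanced("{[()]}"))  # True
print(is_balanced("([)]"))    # False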

Let’s see how the following example illustrates the application of the Stacks pattern to efficiently solve the given coding problem:

Flatten Nested List Iterator#

Given a nested list of integers where each element is either an integer or a list whose elements may also be integers or other integer lists, implement an iterator to flatten the nested list.

Implement the NestedIterator class that has the following functions:

  • Init: This initializes the iterator with the nested list.

  • next(): This returns the next integer in the nested list.

  • has_next(): This returns TRUE if there are still some integers in the nested list. Otherwise, it returns FALSE.

Solution#

In this solution, we use a stack to store individual integers and nested lists of integers. During initialization, we push all the elements of the nested list onto the stack in reverse order. This ensures that when we pop elements off the stack, they come out in the same order as they appeared in the original nested list.

The has_next() function runs a loop of push and pop operations on the stack. It checks whether the top element of the stack is an integer. If so, it returns TRUE. Otherwise, if the top element is a list of integers, it pops the list from the stack and pushes each of its elements onto the stack in reverse order. This way, the lists at the top of the stack are converted into individual integers whenever has_next() is called. If the stack is empty, the function returns FALSE.

The next() function first calls has_next() to check whether there is an integer in the stack. If has_next() returns TRUE, it pops the top of the stack and returns this popped value.

Let’s look at the code for this solution:

main.py
nested_integers.py
from nested_integers import NestedIntegers

class NestedIterator:
    # NestedIterator constructor initializes the stack using the
    # given nested_list list
    def __init__(self, nested_list):
        self.nested_list_stack = list(reversed([NestedIntegers(val) for val in nested_list]))

    # has_next() will return True if there are still some integers in the
    # stack (that has nested_list elements) and, otherwise, will return False.
    def has_next(self):
        # Iterate in the stack while the stack is not empty
        while len(self.nested_list_stack) > 0:
            # Save the top value of the stack
            top = self.nested_list_stack[-1]
            # Check if the top value is integer, if true return True,
            # if not continue
            if top.is_integer():
                return True
            # If the top is not an integer, it must be the list of integers
            # Pop the list from the stack and save it in the top_list
            top_list = self.nested_list_stack.pop().get_list()
            # Save the length of the top_list in i and iterate in the list
            i = len(top_list) - 1
            while i >= 0:
                # Append the values of the nested list into the stack
                self.nested_list_stack.append(top_list[i])
                i -= 1
        return False

    # next will return the integer from the nested_list
    def next(self):
        # Check if there is still an integer in the stack
        if self.has_next():
            # If true pop and return the top of the stack
            return self.nested_list_stack.pop().get_integer()
        return None

# Driver code
def create_nested_iterator_structure(input_list):
    def parse_input(nested, input_list):
        if isinstance(input_list, int):
            nested.set_integer(input_list)
        else:
            for item in input_list:
                child = NestedIntegers()
                nested.add(child)
                parse_input(child, item)

    nested_structure = NestedIntegers()
    parse_input(nested_structure, input_list)
    return nested_structure

def create_nested_iterator_from_structure(nested_structure):
    def flatten(nested, result):
        if nested.is_integer():
            result.append(nested.get_integer())
        else:
            for child in nested.get_list():
                flatten(child, result)

    flattened_list = []
    flatten(nested_structure, flattened_list)
    return NestedIterator(flattened_list)

# Driver code
def main():
    inputs = [
        [1, [2, 3], 4],
        [3, [2, 3, 4], 4, [2, 3]],
        [[2, 3], 3, [2, 3], 4, [2, 3, 4, 5]],
        [1, [3, [4, [5, 6], 7], 8], 9],
        [[2, 3, [2, 3]]]
    ]
    for i, test_case_input in enumerate(inputs, start=1):
        print(i, ".\tOriginal structure: ", inputs[i-1], sep="")
        print("\n\tOutput:\n")
        nested_structure = create_nested_iterator_structure(test_case_input)
        test_case = create_nested_iterator_from_structure(nested_structure)
        while test_case.has_next():
            print("\titr.next(): ", test_case.next())
        print("-"*100)

if __name__ == '__main__':
    main()
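
The nested_integers.py helper isn't shown above. Judging purely from how the solution uses it (is_integer(), get_integer(), set_integer(), add(), and get_list()), a minimal sketch of what it might contain is given below; the actual file may differ:

class NestedIntegers:
    # Holds either a single integer or a nested list of NestedIntegers
    def __init__(self, integer=None):
        self.integer = integer
        self.n_list = []

    # Returns True if this object holds a single integer
    def is_integer(self):
        return isinstance(self.integer, int)

    # Returns the stored integer, or None if this object holds a list
    def get_integer(self):
        return self.integer

    # Stores a single integer in this object
    def set_integer(self, value):
        self.integer = value

    # Adds a nested element to the list held by this object
    def add(self, elem):
        self.n_list.append(elem)

    # Returns the nested list (empty if this object holds an integer)
    def get_list(self):
        return self.n_list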

We've looked at how stacks can be used as a coding pattern to solve problems that require data to be processed in LIFO order. Now, let's move on to the next pattern.

Pattern 3: Custom Data Structures#

Custom data structures are essentially modified versions of existing data structures tailored to address specific needs. We often need to go beyond standard data structures like arrays and hash tables to tackle unique challenges more effectively. For instance, a web crawler that processes numerous pages and URLs might use a specialized "URL queue" to manage these URLs efficiently, ensuring they are unique and prioritized based on relevance.

Custom data structures involve creating custom classes that encapsulate the necessary functionality and properties to efficiently manage and manipulate the data. By designing data structures optimized for the problem domain, we can improve the performance and readability of our code while simplifying complex operations.

To determine if a problem can benefit from the Custom Data Structures pattern, consider scenarios where standard data structures like arrays, lists, or maps are not sufficient or where specialized operations need to be performed frequently. Common problems suitable for this pattern include implementing priority queues, disjoint-set data structures, or specialized graph representations.
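
As a tiny illustration of the idea, here is a sketch of such a "URL queue," combining a deque with a set so each URL is enqueued at most once (the class and method names here are made up for illustration):

from collections import deque

class URLQueue:
    # A FIFO queue of URLs that silently ignores duplicates
    def __init__(self):
        self.queue = deque()
        self.seen = set()

    def push(self, url):
        # Only enqueue URLs we haven't seen before
        if url not in self.seen:
            self.seen.add(url)
            self.queue.append(url)

    def pop(self):
        # Return the next URL to crawl, or None if the queue is empty
        return self.queue.popleft() if self.queue else None

crawler_queue = URLQueue()
crawler_queue.push("https://example.com")
crawler_queue.push("https://example.com")    # duplicate, ignored
crawler_queue.push("https://example.com/a")
print(crawler_queue.pop())  # https://example.com
print(crawler_queue.pop())  # https://example.com/a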

Let’s see how the following examples illustrate the application of the Custom Data Structures pattern to efficiently solve these problems:

Time-Based Key-Value Store#

Implement a data structure that can store multiple values of the same key at different timestamps and retrieve the key’s value at a certain timestamp.

We’ll need to implement the TimeStamp class. This class has the following functions:

  • Init(): This function initializes the values dictionary and timestamp dictionary.

  • set_value(key, value, timestamp): This function stores the key and value at any given timestamp.

  • get_value(key, timestamp): This function returns the value set for this key at the specified timestamp.

Note: When a query requests the value of a key at a timestamp that is more recent than the most recent entry for that key, our data structure should return the value corresponding to the most recent timestamp.

Solution#

In this solution, we will use two dictionaries to implement the required class effectively. The first dictionary, values_dict, stores the values against a specific key. The second dictionary, timestamps_dict, stores the timestamps corresponding to the same key to keep track of the values stored at a specific timestamp.

Additionally, we aim to reduce the overall time complexity by using the binary search instead of linear search.

Now, let's go over the implementation of two of our fundamental functions:

set_value(key, value, timestamp): This function adds the key with the value for the given timestamp. For this, we check if the key already exists in the values_dict dictionary.

  • If the key exists and the provided value does not equal the last value stored in the values_dict for this key, we append the value and timestamp to the lists associated with that key in the values_dict and timestamps_dict, respectively.

  • If the key does not exist in the values_dict dictionary, it creates a new entry in both the values_dict and timestamps_dict dictionaries.

get_value(key, timestamp): This function returns the value previously set for the respective key whose timestamp is closest to, but not after, the requested timestamp. It uses the search_index function, which performs a binary search. To implement it, we initialize the left and right variables to the starting and ending positions of the list of timestamps stored for that key in timestamps_dict. We then compute the middle position and move these pointers to narrow down the search: if the timestamp at the middle position is less than or equal to the requested timestamp, we move the left pointer past the middle; otherwise, we move the right pointer to the middle.

We check the following conditions to get the required value:

  • We first verify whether or not the required key exists. If it does not exist, then return the empty string.

  • If the key exists, we check the following conditions:

    • If the given timestamp does not exist but is greater than the timestamp that was set previously, it returns the value associated with the nearest smaller timestamp.

    • If the given timestamp exists, it returns the value associated with the given key and timestamp.

Let’s look at the code for this solution below:

import random

class TimeStamp:
    def __init__(self):
        self.values_dict = {}
        self.timestamps_dict = {}

    # Set TimeStamp data variables
    def set_value(self, key, value, timestamp):
        # Check if the given key already exists in the values dictionary.
        if key in self.values_dict:
            # Check if the given value of the key already exists in the values dictionary.
            if value != self.values_dict[key][len(self.values_dict[key])-1]:
                # Store values for the given key in the values dictionary.
                self.values_dict[key].append(value)
                # Store the timestamp for the given key in the timestamp dictionary.
                self.timestamps_dict[key].append(timestamp)
        else:
            # Store the value and key for the given key in the values dictionary.
            self.values_dict[key] = [value]
            # Store the timestamp for the given key in the timestamp dictionary.
            self.timestamps_dict[key] = [timestamp]

    # Find the index of the rightmost timestamp that is not greater than
    # the given timestamp using binary search
    def search_index(self, n, key, timestamp):
        left = 0
        right = n
        mid = 0
        while left < right:
            # Compute the mid of the current search range.
            mid = (left + right) >> 1
            # Move the left index past mid if the timestamp at mid is not
            # greater than the required timestamp; otherwise, move the right index to mid.
            if self.timestamps_dict[key][mid] <= timestamp:
                left = mid + 1
            else:
                right = mid
        return left - 1

    # Get time_stamp data variables
    def get_value(self, key, timestamp):
        # Check if the given key is present in the values dictionary.
        if key not in self.values_dict:
            # Return empty string if item does not exist.
            return ""
        else:
            # Find the rightmost occurrence of the given timestamp.
            index = self.search_index(len(self.timestamps_dict[key]),
                                      key, timestamp)
            # If the timestamp exists in the timestamp dictionary, return the
            # value with that timestamp.
            if index > -1:
                return self.values_dict[key][index]
            # Return empty string if the timestamp does not exist.
            return ""

# Driver code
def main():
    ts = TimeStamp()
    num = 1
    random_value = 0
    input = (
        ("course", "OOP", 3),
        ("course", "PF", 5),
        ("course", "OS", 7),
        ("course", "ALGO", 9),
        ("course", "DB", 10)
    )
    for i in input:
        print(num, ".\tAdd value: (", '"', i[0], '"',
              ", ", '"', i[1], '"', ", ", i[2], ")", sep="")
        ts.set_value(i[0], i[1], i[2])
        random_value = random.randint(0, 10)
        print("\n\tGet value for:")
        print("\t\tKey = course \n\t\tTimestamp = ", random_value, sep="")
        print("\n\tReturned value = ", '"',
              ts.get_value("course", random_value), '"', sep="")
        num += 1
        print("-" * 100, sep="")

if __name__ == "__main__":
    main()

Now, let's look at another problem that can be solved using the Custom Data Structures pattern.

Insert Delete GetRandom O(1)#

Implement a Random Set data structure that can perform the following operations:

  • Init(): This initializes the Random Set object.

  • Insert(): This function takes an integer, data, as its parameter and, if it does not already exist, adds it to the set, returning TRUE. If the integer already exists in the set, the function returns FALSE.

  • Delete(): This function takes an integer, data, as its parameter and, if it exists in the set, removes it, returning TRUE. If the integer does not exist in the set, the function returns FALSE.

  • GetRandom(): This function takes no parameters. It returns an integer chosen at random from the set.

Note: Your implementation should aim to have a running time of O(1) (on average) for each operation.

Solution#

Arrays support constant time access by index, but deletion operations in arrays typically take O(n) time due to the need to shift elements. To overcome this limitation, we can pair the array with a data structure that offers constant time lookups and deletions: a hash map. We can design our custom data structure by combining an array for efficient random access with a hash map for rapid lookups and deletions. The implementation of our custom data structure's methods is as follows:

  • Insert(): When inserting data, we will store the new element in our array. Then, insert a key-value pair in our hash map, where the key will be the new element, and the value will be its index in the array. Both of these operations have a time complexity of O(1).

  • Delete(): Using the hash map, we find the index in the array at which the element is located. Swap the last element in the array with the one to be removed. Then, in the hash map, update the location of the element we just swapped with the target element and remove the entry for the target element from the hash map. Lastly, pop out the target element from the array. Each of these five operations has a time complexity of O(1).

  • GetRandom(): For this operation, we can choose a number at random between 0 and n - 1, where n is the size of the array. We return the integer located at this random index.

Let’s take a look at the code for this solution below:

from random import choice

class RandomSet():
    def __init__(self):
        self.indexor = {}  # Maps the actual value to its index
        self.store = []    # Store the actual values in an array

    # Function to insert a value
    def insert(self, val):
        """
        Inserts a value in the data structure.
        Returns True if it does not already contain the specified element.
        """
        if val in self.indexor:
            return False
        # Insert the actual value as a key and its index as a value
        self.indexor[val] = len(self.store)
        # Append a new value to the array, store
        self.store.append(val)
        return True

    # Function to remove a value
    def delete(self, val):
        """
        Removes a value from the data structure.
        Returns True if it contains the specified element.
        """
        if val in self.indexor:
            # Swap the last element in the array with the element
            # to delete, and update the location of the moved element
            # in its entry in the hash map
            last, i = self.store[-1], self.indexor[val]
            self.store[i], self.indexor[last] = last, i
            # Delete the last element
            del self.indexor[val]
            self.store.pop()
            return True
        return False

    # Function to generate a random value
    def get_random(self):
        """
        Choose an element at random from the data structure.
        """
        return choice(self.store)

# Driver code
def main():
    commands = [["I", "G", "I", "I", "R", "G"],
                ["I", "I", "R", "G", "R", "I"]]
    values = [[10, -1, 100, 1000, 200, -1], [30, 60, 10, -1, 30, 90]]
    for i in range(len(commands)):
        dataset = RandomSet()
        print(i + 1, ". Starting operations:", "\n", sep="")
        for j in range(len(commands[i])):
            if commands[i][j] == "I":
                print("\tInsert (", values[i][j], ") returns ",
                      dataset.insert(values[i][j]), sep="")
            elif commands[i][j] == "R":
                print("\tDelete (", values[i][j], ") returns ",
                      dataset.delete(values[i][j]), sep="")
            elif commands[i][j] == "G":
                print("\tGenerate Random() returns ",
                      dataset.get_random(), sep="")
        print("-" * 100)

if __name__ == '__main__':
    main()

Now, let's move to the next example of Custom Data Structures.

Implement LRU Cache#

Implement an LRU cache class with the following functions:

  • Init(capacity): Initializes an LRU cache with the capacity size.

  • Set(key, value): Adds a new key-value pair or updates an existing key with a new value.

  • Get(key): Returns the value of the key, or −1 if the key does not exist.

If the number of keys has reached the cache capacity, evict the least recently used key and add the new key.

As caches use relatively expensive, faster memory, they are not designed to store large data sets. Whenever the cache becomes full, we must evict some data from it. There are several caching algorithms to implement a cache eviction policy. LRU is a very simple and commonly used algorithm. The core concept of the LRU algorithm is to evict the oldest data from the cache to accommodate more data.

Solution#

This problem can be solved efficiently if we combine two data structures and use their respective functionalities, as well as the way they interact with each other, to our advantage. A doubly linked list allows us to arrange nodes by the time they were last accessed. However, accessing a value in a linked list is O(n). On the other hand, a hash table has O(1) lookup time but has no concept of recency. We can combine these two data structures to get the best properties of both.

Here is the algorithm for the LRU cache:

Set:

  • If the element exists in the hash map, then update its value and move the corresponding linked list node to the head of the linked list.

  • Otherwise, if the cache is already full, remove the tail element from the doubly linked list. Then delete its hash map entry, add the new element at the head of the linked list, and add the new key-value pair to the hash map.

Get:

  • If the element exists in the hash map, move the corresponding linked list node to the head of the linked list and return the element value.

  • Otherwise, return -1.

Note that the doubly linked list keeps track of the most recently accessed elements. The element at the head of the doubly linked list is the most recently accessed element. All newly inserted elements (in Set) go to the head of the list. Similarly, any element accessed (in the Get operation) goes to the head of the list.

Let’s look at the code for this solution below:

main.py
linked_list_node.py
linked_list.py
from linked_list import LinkedList

# We will use a linked list having pair of integers
# where the first integer will be the key
# and the second integer will be the value
class LRUCache:
    # Initializes an LRU cache with the capacity size
    def __init__(self, capacity):
        self.cache_capacity = capacity
        self.cache_map = {}
        self.cache_list = LinkedList()

    # Returns the value of the key, or -1 if the key does not exist.
    def get(self, key):
        # If the key doesn't exist, we return -1
        found_itr = None
        if key in self.cache_map:
            found_itr = self.cache_map[key]
        else:
            return -1
        list_iterator = found_itr
        # If the key exists, we must move it to the front of the list
        self.cache_list.move_to_head(found_itr)
        return list_iterator.pair[1]

    # Adds a new key-value pair or updates an existing key with a new value
    def set(self, key, value):
        # Check if the key exists in the cache hashmap
        # If the key exists
        if key in self.cache_map:
            found_iter = self.cache_map[key]
            list_iterator = found_iter
            # Move the node corresponding to key to front of the list
            self.cache_list.move_to_head(found_iter)
            # We then update the value of the node
            list_iterator.pair[1] = value
            return

        # If key does not exist and the cache is full
        if len(self.cache_map) == self.cache_capacity:
            # We will need to evict the LRU entry
            # Get the key of the LRU node
            # The first element of each cache entry is the key
            key_tmp = self.cache_list.get_tail().pair[0]
            # This is why we needed to store a <key, value> pair
            # in the cacheList. We would not have been able to get
            # the key if we had just stored the values
            # Remove the last node in list
            self.cache_list.remove_tail()
            # Remove the entry from the cache
            del self.cache_map[key_tmp]

        # The insert_at_head function inserts a new element at the front
        # of the list in constant time
        self.cache_list.insert_at_head([key, value])
        # We set the value of the key as the list beginning
        # since we added the new element at head of the list
        self.cache_map[key] = self.cache_list.get_head()

    def print(self):
        print("Cache current size: ", self.cache_list.size,
              ", ", end="")
        print("Cache contents: {", end="")
        node = self.cache_list.get_head()
        while node:
            print("{", str(node.pair[0]), ",", str(node.pair[1]),
                  "}", end="")
            node = node.next
            if node:
                print(", ", end="")
        print("}")
        print("-"*100, "\n")

# Driver code
def main():
    # Creating a cache of size 2
    cache_capacity = 2
    cache = LRUCache(cache_capacity)
    print("Initial state of cache")
    print("Cache capacity: " + str(cache_capacity))
    cache.print()

    keys = [10, 10, 15, 20, 15, 25, 5]
    values = ["20", "get", "25", "40", "get", "85", "5"]

    for i in range(len(keys)):
        if values[i] == "get":
            print("Getting by Key: ", keys[i])
            print("Cached value returned: ", cache.get(keys[i]))
        else:
            print("Setting cache: Key: ", keys[i], ", Value: ", values[i])
            cache.set(keys[i], int(values[i]))
        cache.print()

if __name__ == '__main__':
    main()
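
The linked_list_node.py and linked_list.py helpers aren't shown above. Based on how LRUCache uses them (insert_at_head(), move_to_head(), get_head(), get_tail(), remove_tail(), the size attribute, and nodes exposing pair and next), a minimal doubly linked list sketch consistent with that usage might look like this; the actual files may differ:

class LinkedListNode:
    # A node storing a [key, value] pair with links in both directions
    def __init__(self, pair):
        self.pair = pair
        self.prev = None
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.size = 0

    def get_head(self):
        return self.head

    def get_tail(self):
        return self.tail

    # Insert a new [key, value] pair at the head in O(1)
    def insert_at_head(self, pair):
        node = LinkedListNode(pair)
        node.next = self.head
        if self.head:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.size += 1

    # Unlink a node from its current position in O(1)
    def _detach(self, node):
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        node.prev = node.next = None
        self.size -= 1

    # Move an existing node to the head (most recently used position)
    def move_to_head(self, node):
        if node is self.head:
            return
        self._detach(node)
        node.next = self.head
        if self.head:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.size += 1

    # Remove the least recently used node from the tail in O(1)
    def remove_tail(self):
        if self.tail:
            self._detach(self.tail)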

Now that we've explored the design and implementation of custom data structures, let's move on to the next coding pattern in the list.

Pattern 4: Hash Maps#

The Hash Maps pattern is a tool for storing and retrieving key-value pairs, offering constant-time complexity for insertion, deletion, and lookup operations on average. Quicker lookup time makes hash maps ideal for tasks like caching, indexing, or frequency counting. 

The core operations of hash maps are the following:

1. Insert: A key-value pair is added, with the hash function determining the index for storage. While typically quick (O(1) on average), this can degrade to O(n) if many keys collide at the same index.

2. Search: Values are retrieved by applying the hash function to compute the key’s index, typically a quick operation (O(1)), though it can degrade to O(n) when handling collisions or during resizing.

3. Remove: Values are removed by deleting the entry at the key’s computed index. Usually quick (O(1)), the process can slow down to O(n) in certain scenarios like collisions or resizing.

Using hash maps significantly reduces lookup times to an average of O(1) compared to managing separate arrays for related data. This efficiency makes hash maps a useful tool for optimizing performance in algorithmic problems. Common interview questions where the Hash Maps pattern demonstrates effectiveness include Logger Rate Limiter, Next Greater Element, Isomorphic Strings, and Longest Palindrome.
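
In Python, the built-in dict is the usual hash map. Here's a quick sketch of the three core operations, using a word-frequency map as the running example:

# Insert/update: average O(1) per word
words = ["stream", "cache", "stream", "index", "cache", "stream"]
frequency = {}
for word in words:
    frequency[word] = frequency.get(word, 0) + 1

# Search: average O(1) lookup by key
print(frequency.get("stream", 0))   # 3
print(frequency.get("missing", 0))  # 0

# Remove: average O(1) deletion by key
frequency.pop("cache", None)
print(frequency)                    # {'stream': 3, 'index': 1}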

Let’s see how the following example illustrates the application of the Hash Maps pattern to efficiently solve the given coding problem:

Logger Rate Limiter#

For the given stream of message requests and their timestamps as input, you must implement a logger rate limiter system that decides whether the current message request is displayed. The decision depends on whether the same message has been displayed in the last S seconds. If yes, then the decision is FALSE, as this message is considered a duplicate. Otherwise, the decision is TRUE.

Note: Several message requests, though received at different timestamps, may carry identical messages.

Solution#

We need to know if a message has been displayed and keep track of its time limit. For problems where two associated values need to be checked, we can use a hash map.

We can use all incoming messages as keys and their respective time limits as values. This will help us eliminate duplicates while respecting the time limit of S seconds.

Here is how we’ll implement our algorithm using hash maps:

  1. Initialize a hash map.

  2. When a request arrives, check if it’s a new request (the message is not among the keys stored in the hash map) or a repeated request (an entry for this message already exists in the hash map). If it’s a new request, accept it and add it to the hash map.

  3. If it’s a repeated request, compare the difference between the timestamp of the incoming request and the timestamp of the previous request with the same message. If this difference is greater than or equal to the time limit S, accept it and update the timestamp for that specific message in the hash map. Otherwise, reject it.

Let’s look at the code for this solution below:

class RequestLogger:
    # Initialization of the requests hash map
    def __init__(self, time_limit):
        self.requests = {}
        self.limit = time_limit

    # Function to accept and deny message requests
    def message_request_decision(self, timestamp, request):
        # Check whether the specific request exists in
        # the hash map or not; if it exists, check whether its
        # time duration lies within the defined timestamp
        if request not in self.requests or timestamp - self.requests[request] >= self.limit:
            # Store this new request in the hash map, and return true
            self.requests[request] = timestamp
            return True
        else:
            # The request already exists within the timestamp
            # and is identical, so it should
            # be rejected, return false
            return False

# Driver code
def main():
    # Here we will set the time limit to 7
    new_requests = RequestLogger(7)
    times = [1, 5, 6, 7, 15]
    messages = ["good morning",
                "hello world",
                "good morning",
                "good morning",
                "hello world"]

    # Loop to execute over the input message requests
    for i in range(len(messages)):
        print(i + 1, ".\t Time, Message: {",
              times[i], ", '", messages[i], "'}", sep="")
        print("\t Message request decision: ",
              new_requests.message_request_decision(
                  times[i], messages[i]), sep="")
        print("-" * 100)

if __name__ == '__main__':
    main()

With our understanding of Hash Maps established, let's discuss the next coding pattern.

Pattern 5: Cyclic Sort#

The Cyclic Sort pattern is a strategy for sorting arrays containing a known range of numbers from 1 to n. This method involves iterating through the array and placing each element at its correct position, where the value of each element matches with its index in the array. Through a series of swaps, each element is shifted to its rightful position, guaranteeing a sorted array without needing additional space. This sorts the array differently than comparison-based algorithms like Quicksort or Merge Sort. To determine if a problem can be solved using the Cyclic Sort pattern, examine scenarios where the array consists of distinct numbers within a specific range. Common problems well-suited for this pattern include finding missing or duplicate numbers, sorting arrays of numbers from 1 to n, or identifying the smallest missing positive integer.
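
Here is a minimal sketch of the core idea: an array containing the numbers 1 to n is sorted in O(n) time and O(1) extra space by repeatedly swapping each value into the slot its value dictates:

def cyclic_sort(nums):
    # nums is assumed to contain each number from 1 to n exactly once
    i = 0
    while i < len(nums):
        correct_spot = nums[i] - 1  # value v belongs at index v - 1
        if nums[i] != nums[correct_spot]:
            # Swap the current value into its correct slot
            nums[i], nums[correct_spot] = nums[correct_spot], nums[i]
        else:
            i += 1  # already in place; move on
    return nums

print(cyclic_sort([3, 1, 5, 4, 2]))  # [1, 2, 3, 4, 5]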

Let’s explore the following problem to understand how easy it is to use Cyclic Sort to solve the given coding problem:

First Missing Positive#

Given an unsorted integer array, nums, return the smallest missing positive integer. Create an algorithm that runs with an O(n) time complexity and utilizes a constant amount of space.

Note: The smallest missing positive isn’t the first positive number that’s missing in the range of elements in the input, but the first positive number that’s missing if we start from 1.

Solution#

An array of size n containing numbers ranging from 1 to n can be sorted using the cyclic sort approach in O(n). Here, we have two challenges. First, the array may have negative numbers as well. Second, the array does not have all consecutive numbers.

The first challenge can be solved by simply ignoring the negative values. The second problem can be solved by linearly scanning the sorted array in O(n). So, overall, our problem is solved in O(n) time and O(1) space because cyclic sort sorts the elements in place with O(1) extra space. Now, we can proceed further and discuss our algorithm.

The algorithm iterates over each element of the array, and for each element, it determines the correct position where the element should be placed. It subtracts 1 from the element value because of 0-based indexing. It checks the following conditions for each element after determining its correct position:

  • If the current element is in the [1, n] range, and if it is not already at its correct position, swap the current element with the element at its correct position.

  • Otherwise, move on to the next element.

After placing all the elements at the correct positions, traverse the array again and check if the value at each index is equal to its index plus 1. If it is not, return the index plus 1 as the smallest missing positive integer.

If all the values are in order, return the length of the array plus 1 as the smallest missing positive integer.

Let’s look at the code for this solution below:

def smallest_missing_positive_integer(nums):
    i = 0
    while i < len(nums):
        correct_spot = nums[i] - 1  # determining the correct position of the current element
        # check if the current element is in the range [1, n] and is not already at its correct position
        if 0 <= correct_spot < len(nums) and nums[i] != nums[correct_spot]:
            # swap the current element with the element at its correct position
            nums[i], nums[correct_spot] = nums[correct_spot], nums[i]
        else:
            i += 1  # move on to the next element if the current element is already at its correct position or out of range

    # iterate through the array again and check if each element is equal to its index plus 1
    for i in range(len(nums)):
        if i + 1 != nums[i]:
            return i + 1  # return the smallest missing positive integer

    return len(nums) + 1  # if all the elements are in order, return the next positive integer

# Driver code
def main():
    input_array = [[1, 2, 3, 4], [-1, 3, 5, 7, 1], [1, 5, 4, 3, 2], [-1, 0, 2, 1, 4], [1, 4, 3]]
    x = 1
    for i in range(len(input_array)):
        print(x, ".\tThe first missing positive integer in the list ", input_array[i], " is: ", sep="")
        print("\t" + str(smallest_missing_positive_integer(input_array[i])))
        print("-" * 100)
        x = x + 1

if __name__ == '__main__':
    main()

After understanding how to use the Cyclic Sort effectively, it's time to explore the next coding pattern.

Pattern 6: Top K Elements#

Unlike other techniques that require sorting the entire data to find the top or bottom k elements, this technique uses a heap to access the required elements from unsorted data. It begins by processing the elements of the dataset and inserting them into the heap. As new elements are encountered, they are compared with those in the collection. If the new element is larger (or smaller, depending on whether we are finding the top k largest or smallest elements), it replaces one of the elements in the collection. This ensures that the collection always contains the top k elements seen so far. This makes it useful for handling large datasets or real-time data where performance matters. Using this pattern saves time and computational resources while solving problems similar to finding the K Largest Elements in a Stream, Top K Frequent Words, and K Smallest Elements in an Array.
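
As a quick sketch of the pattern, here's how a size-k min-heap can track the k largest values seen so far in a stream (the "stream" here is just a Python list for illustration):

import heapq

def k_largest_in_stream(stream, k):
    min_heap = []
    for value in stream:
        if len(min_heap) < k:
            heapq.heappush(min_heap, value)    # heap not full yet
        elif value > min_heap[0]:
            # The new value beats the smallest of the current top k; replace it
            heapq.heapreplace(min_heap, value)
    return sorted(min_heap, reverse=True)

print(k_largest_in_stream([4, 1, 9, 7, 3, 8, 5], 3))  # [9, 8, 7]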

Let’s see how the following examples illustrate the application of the Top K Elements pattern to efficiently solve these problems:

Top K Frequent Elements#

Given an array of integers, arr, and an integer, k, return the k most frequent elements.

Note: You can return the answer in any order.

Solution#

Finding the top k frequent elements typically involves iterating through all elements. We can optimize this process using the Top K Elements pattern. We begin by determining the frequency of each element using a hash map. Once we have the frequency map, the min-heap is the best data structure to keep track of the top k frequencies. After storing the frequencies of the elements, we’ll iterate through the hash map, insert each element into our heap, and find the top k frequent elements.

The hash map will store the element as the key, and its corresponding frequency in the array as the value. When inserting elements from the hash map into the min-heap, the following steps are taken:

  • We’ll store a pair (a, b) as a tuple (this can be in the form of any data structure like an array or a tuple) in the heap, where a will be the frequency of the element, and b will be the element itself. This ensures that the elements in the min-heap are always sorted by frequency.

  • We’ll make sure that if the size of the min-heap becomes greater than k, that is, there are more than k elements in the min-heap, we pop the pair with the lowest frequency from the min-heap. This ensures we always have the k most frequent elements in the min-heap.

Once we have added the pairs from the hash map to the min-heap, the min-heap will hold the pairs with the top k frequencies. We will then create a new array, traverse the min-heap, and add the second element of each pair to this new array. This is done since we only need the top k frequent elements, not their respective frequencies. Finally, we will return this new array.

Let’s look at the code for this solution below:

from heapq import heappush, heappop

def top_k_frequent(arr, k):
    # find the frequency of each number
    num_frequency_map = {}
    for num in arr:
        num_frequency_map[num] = num_frequency_map.get(num, 0) + 1

    top_k_elements = []
    # go through all numbers of the num_frequency_map
    # and push them in the top_k_elements, which will have
    # top k frequent numbers. If the heap size is more than k,
    # we remove the smallest (top) number
    for num, frequency in num_frequency_map.items():
        heappush(top_k_elements, (frequency, num))
        if len(top_k_elements) > k:
            heappop(top_k_elements)

    # create a list of top k numbers
    top_numbers = []
    while top_k_elements:
        top_numbers.append(heappop(top_k_elements)[1])

    return top_numbers

# Driver code
def main():
    arr = [[1, 3, 5, 12, 11, 12, 11, 12, 5], [1, 3, 5, 14, 18, 14, 5],
           [2, 3, 4, 5, 6, 7, 7], [9, 8, 7, 6, 6, 5, 4, 3, 2, 1],
           [2, 4, 3, 2, 3, 4, 5, 4, 4, 4], [1, 1, 1, 1, 1, 1], [2, 3]]
    k = [3, 2, 1, 1, 3, 1, 2]
    for i in range(len(k)):
        print(i+1, ". \t Input: (", arr[i], ", ", k[i], ")", sep="")
        print("\t Top", k[i], "frequent Elements: ",
              top_k_frequent(arr[i], k[i]))
        print("-"*100)

if __name__ == '__main__':
    main()

Now, let's move to the next example for Top K Elements.

Kth Largest Element in an Array#

Given an unsorted array, find the kth largest element.

Note: We need to find the kth largest element in the sorted order, not the kth distinct element.

Solution#

We can use a min-heap to efficiently return the kth largest number from the given unsorted array. A min-heap ensures the minimum element is always at the root, and we use it to keep the k largest elements seen so far. Once we have processed all the elements of the array, the root of the min-heap is the kth largest element.

The algorithm works as follows:

  • Insert the first k elements onto the min-heap.

  • For each subsequent element in the array, compare it with the root (minimum element) of the min-heap.

    • If the current element in the array is greater than the root element in our min-heap, remove the root element and insert the current element from the array.

    • Continue inserting and removing elements in the min-heap until we have processed all elements in the array.

  • After processing all the elements, the min-heap will contain the k largest elements from the input array. The smallest of them, the kth largest element, will be at the root of the min-heap.

Let’s look at the code for this solution:

import heapq

def find_kth_largest(nums, k):
    # Initialize an empty list
    k_numbers_min_heap = []
    # Add first k elements to the list
    for i in range(k):
        k_numbers_min_heap.append(nums[i])
    # Convert the list into a min-heap
    heapq.heapify(k_numbers_min_heap)

    # Loop through the remaining elements in the 'nums' array
    for i in range(k, len(nums)):
        # Compare the current element with the minimum
        # element (root) of the min-heap
        if nums[i] > k_numbers_min_heap[0]:
            # Remove the smallest element
            heapq.heappop(k_numbers_min_heap)
            # Add the current element
            heapq.heappush(k_numbers_min_heap, nums[i])

    # The root of the heap has the kth largest element
    return k_numbers_min_heap[0]

# Driver code
def main():
    input_array = [
        [1, 5, 12, 2, 11, 9, 7, 30, 20],
        [5, 2, 9, -3, 7],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 4, 6, 0, 2],
        [3, 5, 2, 3, 8, 5, 3]
    ]
    k = [3, 1, 9, 1, 4]
    for i in range(len(input_array)):
        print(i + 1, ".\tInput array: ", input_array[i], sep="")
        print("\tValue of k: ", k[i], sep="")
        print("\tkth largest number: ", find_kth_largest(input_array[i], k[i]), sep="")
        print("-" * 100)

if __name__ == '__main__':
    main()

Now that we've explored the Top K Elements pattern, let's explore the last, but certainly not the least, coding pattern from the list of frequently asked patterns by Netflix.

Pattern 7: Union Find#

The Union Find pattern groups elements into non-overlapping sets, with each set forming a tree where every element points to its parent and the root points to itself. Through optimized union and find operations, sets can be merged, and their relationships determined as follows:

  • Union operation: The union operation merges two sets into a single set by connecting the roots of their respective trees. This process involves finding the roots of the sets to be merged and then updating the parent pointers accordingly.

  • Find operation: The find operation determines the set to which a particular element belongs by tracing its parent pointers until it reaches the root of the tree, which represents the set itself. 

To further enhance performance, additional optimizations such as path compression and union by rank are commonly employed, streamlining the process by shortening the paths between nodes and prioritizing the merging of smaller trees, respectively. Union by rank involves linking smaller trees under larger ones to maintain balance, while path compression flattens the structure during the find operation, directly connecting nodes to the root.  

To determine if a problem can use the Union Find pattern, consider situations where we need to efficiently manage disjoint sets, perform union operations, or determine the connected components within a dataset. Common problems suitable for this pattern include detecting cycles in graphs, implementing Kruskal’s algorithm for minimum spanning trees, or solving dynamic connectivity problems.
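
Here is a minimal, generic sketch of a union-find (disjoint-set) structure with both optimizations described above, path compression and union by rank:

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # upper bound on the tree height

    def find(self, x):
        # Path compression: point x directly at the root of its set
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        root_x, root_y = self.find(x), self.find(y)
        if root_x == root_y:
            return  # already in the same set
        # Union by rank: attach the shorter tree under the taller one
        if self.rank[root_x] < self.rank[root_y]:
            root_x, root_y = root_y, root_x
        self.parent[root_y] = root_x
        if self.rank[root_x] == self.rank[root_y]:
            self.rank[root_x] += 1

ds = DisjointSet(5)
ds.union(0, 1)
ds.union(3, 4)
print(ds.find(1) == ds.find(0))  # True: 0 and 1 are connected
print(ds.find(2) == ds.find(0))  # False: 2 is in its own set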

Let’s see how the following example illustrates the application of the Union Find pattern to efficiently solve the given coding problem:

Longest Consecutive Sequence#

Given an unsorted array, nums, return the length of the longest consecutive sequence of elements. The consecutive sequence of elements is such that there are no missing elements. The consecutive elements can be present anywhere in the input array.

Note: Two elements, x and y, are called consecutive if the difference between them is equal to 1.

Solution#

The consecutive sequences of elements in the given array can be considered as connected components. The idea is to combine all the consecutive elements into single connected components, which makes it easier to determine the length of consecutive sequences. We can streamline this process using the union find pattern. We iterate through each number in the array, and for each number encountered, we check if the next consecutive number exists in the array. If it does, the current number is merged with the next consecutive number into the same set within the union-find structure using the union method. We also maintain the size of each set, which reflects the total number of elements within that set. Once all numbers have been processed, we return the length of the connected component with the most numbers, which is the length of the longest consecutive sequence within the given array.

The algorithm works as follows:

  • Initialize a union find data structure, ds, with the given array, nums. This creates the data structures necessary for the algorithm: two dictionaries, parent and size, and a variable, max_length.

    • The parent dictionary is initialized such that each number in nums is its own parent initially. This means that each number is considered to be in its own sequence, and it is the root of that sequence.

    • The size dictionary is initialized with each number having a size of 1. This represents that each sequence initially contains only one element.

    • The max_length variable is initialized to 1. This variable will keep track of the length of the longest consecutive sequence found so far.

  • Loop through each number, nums[i], in nums, and check if nums[i] + 1 is present in the parent dictionary. If it is present, we’ll apply the union method of the UnionFind class on nums[i] and nums[i] + 1 as follows:

    • Find the roots of sequences containing the given elements using the find method. If they are not equal, combine the roots of these consecutive sequences by setting the parent of the root of the smaller sequence to the root of the larger sequence.

    • Update the length of the larger sequence by adding the current sequence’s length and the smaller sequence’s length.

    • Update max_length as shown below. This compares the current value of max_length with the updated length of the larger sequence (x_root) and stores the larger of the two values in max_length.

max_length = max(max_length, size[x_root])

After iterating through all the numbers in nums, return max_length, which represents the length of the longest consecutive sequence.

Let’s take a look at the code for this solution below:

main.py
union_find.py
from union_find import *

def longest_consecutive_sequence(nums):
    if len(nums) == 0:
        return 0
    # data structure for implementing union find
    ds = UnionFind(nums)
    for num in nums:
        # check if the next consecutive number
        # is in the input array
        if num + 1 in ds.parent:
            ds.union(num, num + 1)
    return ds.max_length

# driver code
def main():
    input_nums = [
        [150, 14, 200, 1, 3, 2],
        [1, 2, 3, 4, 5, 6, 7],
        [1, 3, 5, 7],
        [7, 6, 5, 4, 3, 2, 1],
        [7, 6, 5, 1],
    ]
    for i in range(0, len(input_nums)):
        print(i+1, ".\tnums = ", input_nums[i], sep="")
        print("\tThe length of the longest consecutive sequence is:", longest_consecutive_sequence(input_nums[i]))
        print("-"*100)

if __name__ == '__main__':
    main()
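
The union_find.py module isn't shown above. Based on the algorithm description (a parent dictionary, a size dictionary, a max_length variable, and find/union methods) and how the solution uses it, a sketch of what it might contain is below; the actual file may differ:

class UnionFind:
    def __init__(self, nums):
        # Each number starts as its own parent, in a sequence of size 1
        self.parent = {num: num for num in nums}
        self.size = {num: 1 for num in nums}
        self.max_length = 1

    # Find the root of the sequence containing x, with path compression
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    # Merge the sequences containing x and y, keeping the larger one as the root
    def union(self, x, y):
        x_root, y_root = self.find(x), self.find(y)
        if x_root == y_root:
            return
        # Union by size: attach the smaller sequence to the larger one
        if self.size[x_root] < self.size[y_root]:
            x_root, y_root = y_root, x_root
        self.parent[y_root] = x_root
        self.size[x_root] += self.size[y_root]
        # Track the longest sequence built so far
        self.max_length = max(self.max_length, self.size[x_root])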

That wraps up our exploration of the coding patterns behind the questions Netflix asks most frequently.

To ace your Netflix interview, mastering the patterns we have just discussed is important. Understanding the underlying patterns behind the solutions you devise will not only help you tackle similar problems in the future but also demonstrate your depth of understanding to interviewers. We have explored some of the most common coding patterns with the help of interview questions frequently asked by Netflix, but it’s just a start. Remember, practice makes perfect, so dedicate time to solving problems regularly and seek feedback to improve further. You may explore the following courses by Educative for even better preparation; they cover a wide range of coding patterns as well as Dynamic Programming patterns, in various programming languages:

Moreover, if you are looking for a customized learning plan, take a look at the following paths by Educative:

With determination, preparation, and a solid grasp of coding patterns, you’ll be well-equipped to tackle any coding challenge that comes your way during the Netflix interview process. Best of luck!


Written By:
Minahil Yaser