Google looks for software engineers who want to grow and evolve with their fast-paced business. If you're eager to embrace new problems, you can find many opportunities to switch teams, lead projects, and pursue your career goals at Google.
Google coding interviews are crucial for screening candidates who will thrive in the long term. These interviews let interviewers evaluate your technical skills and your ability to think through complex problems.
The exact setup of your interviews can vary across teams. However, software engineering candidates can expect to write code several times during the interview. You'll need to be ready to solve brand-new coding problems in real time.
So, how can you prepare to tackle coding problems in your Google interview?
Fortunately, you can solve most of Google's coding questions by identifying the underlying patterns. These will help you break down problems into more manageable parts. From there, you can apply strategies and algorithms that help you reach a solution quickly.
To streamline your interview prep, we’ve identified 9 patterns that you're most likely to see in Google coding problems.
We'll review each pattern in depth and share examples of common coding problems that use it. Let's get started!
The Hash Maps pattern is a tool for storing and retrieving key-value pairs. On average, it offers constant-time complexity for insertion, deletion, and lookup operations. Quicker lookup time makes hash maps ideal for tasks like caching, indexing, or frequency counting.
The core operations of hash maps are the following:
1. Insert: A key-value pair is added, with the hash function determining the index for storage. This is typically a quick O(1) operation on average, though hash collisions can degrade it to O(n) in the worst case.
2. Search: Values are retrieved by applying the hash function to compute the key’s index, typically a quick O(1) operation on average.
3. Remove: Values are removed by deleting the entry at the key’s computed index, usually a quick O(1) operation on average.
Using hash maps significantly reduces lookup times to an average of O(1), compared to the O(n) cost of scanning an unsorted array.
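The three operations above map directly onto Python's built-in dict, which is implemented as a hash map. A quick sketch (the variable names are ours for illustration):

```python
# Python's dict is a hash map, so each operation below is O(1) on average.
freq = {}

# Insert: add key-value pairs.
freq["apple"] = 3
freq["banana"] = 1

# Search: look up a value by its key; membership tests are also O(1) on average.
print(freq["apple"])          # 3
print("cherry" in freq)       # False

# Remove: delete the entry for a key.
del freq["banana"]
print(freq)                   # {'apple': 3}
```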
Let’s see how the following example illustrates the application of the Hash Maps pattern to efficiently solve the given coding problem:
For the given stream of message requests and their timestamps as input, you must implement a logger rate limiter system that decides whether the current message request is displayed. The decision depends on whether the same message has already been displayed within the last time_limit seconds, where time_limit is specified when the logger is initialized.
Note: Though received at different timestamps, several message requests may carry identical messages.
We need to know if a message already exists and keep track of its time limit. For problems where two associated values need to be checked, we can use a hash map.
We can use all incoming messages as keys and their most recent display timestamps as values. This will help us eliminate duplicates while respecting the specified time limit.
Here is how we’ll implement our algorithm using hash maps:
Initialize a hash map.
When a request arrives, check whether it’s a new request (the message is not among the keys stored in the hash map) or a repeated request (an entry for this message already exists in the hash map). If it’s a new request, accept it and add it to the hash map with its timestamp. If it’s a repeated request, accept it only if at least time_limit seconds have passed since it was last displayed, updating its stored timestamp; otherwise, reject it.
Let’s look at the code for this solution below:
class RequestLogger:
    # Initialization of requests hash map
    def __init__(self, time_limit):
        self.requests = {}
        self.limit = time_limit

    # Function to accept and deny message requests
    def message_request_decision(self, timestamp, request):
        # Check whether the specific request exists in
        # the hash map or not; if it exists, check whether its
        # time duration lies within the defined timestamp
        if request not in self.requests or timestamp - self.requests[request] >= self.limit:
            # Store this new request in the hash map and return True
            self.requests[request] = timestamp
            return True
        else:
            # The request already exists within the timestamp
            # and is identical, so the request should
            # be rejected; return False
            return False

# Driver code
def main():
    # Here we will set the time limit to 7
    new_requests = RequestLogger(7)
    times = [1, 5, 6, 7, 15]
    messages = ["good morning",
                "hello world",
                "good morning",
                "good morning",
                "hello world"]

    # Loop to execute over the input message requests
    for i in range(len(messages)):
        print(i + 1, ".\t Time, Message: {", times[i], ", '", messages[i], "'}", sep="")
        print("\t Message request decision: ",
              new_requests.message_request_decision(times[i], messages[i]), sep="")
        print("-" * 100)

if __name__ == '__main__':
    main()
With our understanding of Hash Maps established, let's discuss the next coding pattern.
The Merge Intervals pattern is a powerful coding technique for problems involving meeting times or intervals of some nature. This technique is particularly useful when we need to deal with a set of intervals and perform operations such as merging overlapping intervals or determining their intersections.
In this technique, we typically start by sorting the given intervals based on their start or end times, which helps identify the overlapping intervals efficiently. Once we have this interval information, we can swiftly perform the tasks based on the problem's requirements. The Merge Intervals pattern has many applications in multiple scenarios, including scheduling algorithms, resource allocation problems, and calendar management systems. From analyzing time-based data to consolidating meeting schedules, this coding technique offers an elegant solution for handling interval-related operations effectively.
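Before the worked examples, here is a minimal sketch of the core operation the pattern is named for: merging overlapping intervals, represented here as plain [start, end] lists (the function name and representation are ours for illustration):

```python
def merge_intervals(intervals):
    # Sort by start time so that overlapping intervals become adjacent.
    intervals.sort(key=lambda x: x[0])
    merged = []
    for start, end in intervals:
        # Overlaps with the last merged interval: extend its end.
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged

print(merge_intervals([[1, 4], [2, 5], [7, 9]]))  # [[1, 5], [7, 9]]
```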
Let’s see how the following examples illustrate the application of the Merge Intervals pattern to efficiently solve these problems:
We're given a list containing the schedules of multiple employees. Each person's schedule is a list of non-overlapping intervals in sorted order. An interval is specified with the start and end times, both positive integers. Find the list of finite intervals representing the free time for all the employees.
The intuition behind this solution involves merging the individual schedules of all employees into a unified timeline. By doing this, we can identify the common free time intervals where none of the employees are occupied. The key idea is to find the gaps or intervals between the time slots of these merged schedules.
We use the following variables in our solution:

- previous: Stores the end time of the previously processed interval.
- i: Stores the employee’s index value.
- j: Stores the index of the current interval in employee i’s schedule.
- result: Stores the free time intervals.
The steps of the algorithm are given below:

1. For each employee, we store the start time of their first interval in a min-heap, along with the employee’s index value and the interval index 0.
2. We set previous to the start time of the first interval present in the heap.
3. Then we iterate a loop until the heap is empty, and in each iteration, we do the following:
   - Pop an element from the min-heap and set i and j to the second and third values, respectively, from the popped element.
   - Select the interval of employee i at index j from the input.
   - If the selected interval’s start time is greater than previous, it means that the time from previous to the selected interval’s start time is free. So, add the interval (previous, start) to the result array.
   - Now, update previous as the maximum of previous and the selected interval’s end time.
   - If the current employee has any other interval, push it into the heap.
4. After all the iterations, when the heap becomes empty, return the result array.
Let’s look at the code for this solution below:
from interval import Interval
import heapq

def employee_free_time(schedule):
    heap = []

    # Iterate over all employees' schedules
    # and add the start of each schedule's first interval along with
    # its index value and a value 0.
    for i in range(len(schedule)):
        heap.append((schedule[i][0].start, i, 0))

    # Create heap from array elements.
    heapq.heapify(heap)

    # Take an empty array to store results.
    result = []

    # Set 'previous' to the start time of the first interval in the heap.
    previous = schedule[heap[0][1]][heap[0][2]].start

    # Iterate till heap is empty
    while heap:
        # Pop an element from heap and set the values of i and j
        _, i, j = heapq.heappop(heap)

        # Select an interval
        interval = schedule[i][j]

        # If the selected interval's start value is greater than the
        # previous value, it means that this interval is free.
        # So, add this interval (previous, interval's start value) into result.
        if interval.start > previous:
            result.append(Interval(previous, interval.start))

        # Update the previous as the maximum of previous and interval's end value.
        previous = max(previous, interval.end)

        # If there is another interval in the current employee's schedule,
        # push that into the heap.
        if j + 1 < len(schedule[i]):
            heapq.heappush(heap, (schedule[i][j + 1].start, i, j + 1))

    # When the heap is empty, return result.
    return result

# Function for displaying an interval list
def display(vec):
    string = "["
    if vec:
        for i in range(len(vec)):
            string += str(vec[i])
            if i + 1 < len(vec):
                string += ", "
    string += "]"
    return string

# Driver code
def main():
    inputs = [
        [[Interval(1, 2), Interval(5, 6)], [Interval(1, 3)], [Interval(4, 10)]],
        [[Interval(1, 3), Interval(6, 7)], [Interval(2, 4)], [Interval(2, 5), Interval(9, 12)]],
        [[Interval(2, 3), Interval(7, 9)], [Interval(1, 4), Interval(6, 7)]],
        [[Interval(3, 5), Interval(8, 10)], [Interval(4, 6), Interval(9, 12)], [Interval(5, 6), Interval(8, 10)]],
        [[Interval(1, 3), Interval(6, 9), Interval(10, 11)], [Interval(3, 4), Interval(7, 12)],
         [Interval(1, 3), Interval(7, 10)], [Interval(1, 4)], [Interval(7, 10), Interval(11, 12)]],
        [[Interval(1, 2), Interval(3, 4), Interval(5, 6), Interval(7, 8)],
         [Interval(2, 3), Interval(4, 5), Interval(6, 8)]],
        [[Interval(1, 2), Interval(3, 4), Interval(5, 6), Interval(7, 8), Interval(9, 10), Interval(11, 12)],
         [Interval(1, 2), Interval(3, 4), Interval(5, 6), Interval(7, 8), Interval(9, 10), Interval(11, 12)],
         [Interval(1, 2), Interval(3, 4), Interval(5, 6), Interval(7, 8), Interval(9, 10), Interval(11, 12)],
         [Interval(1, 2), Interval(3, 4), Interval(5, 6), Interval(7, 8), Interval(9, 10), Interval(11, 12)]],
    ]

    i = 1
    for schedule in inputs:
        print(i, '.\tEmployee Schedules:', sep="")
        for s in schedule:
            print("\t\t", display(s), sep="")
        print("\tEmployees' free time", display(employee_free_time(schedule)))
        print('-' * 100)
        i += 1

if __name__ == "__main__":
    main()
Now, let's look at another problem that can be solved using the Merge Intervals pattern.
Given an input array of meeting time intervals, intervals, where each interval has a start time and an end time, find the minimum number of meeting rooms required to hold these meetings.
An important thing to note here is that the specified end time for each meeting is exclusive.
The optimized approach to solve this problem is to use the Merge Intervals technique. In this approach, we sort the given meetings by their start time and keep track of the end times of each meeting. We do this by initializing a min-heap and adding the end time of the first meeting to the heap. The heap data structure enables us to efficiently keep track of the meeting with the earliest end time, thereby providing insight into the availability of meeting rooms.
Then, for each subsequent meeting, we check if the room occupied by the earliest ending meeting, the minimum element of the heap, is free at the time of the current meeting. If the meeting with the earliest end time has already ended before the start of the current meeting, then we can use the same meeting room again for the current meeting. We remove the meeting with the earliest end time from the heap and add the end time of the current meeting to the heap. If the earliest ending meeting has not ended by the start of the current meeting, then we know that we have to allocate a new room for the current meeting; therefore, we add its end time to the heap.
After processing all the meeting intervals, the size of the heap will be equal to the number of meeting rooms allocated. This will be the minimum number of rooms needed to accommodate all the meetings.
Let’s look at the code for this solution:
import heapq

def find_sets(intervals):
    if not intervals:
        return 0

    # Meetings are sorted according to their start time
    intervals.sort(key=lambda x: x[0])

    # Initialize a new heap and add the ending time of the first meeting to the heap
    end_times_heap = []
    heapq.heappush(end_times_heap, intervals[0][1])

    for i in range(1, len(intervals)):
        # Check if the minimum element of the heap (i.e., the earliest ending meeting) is free
        if intervals[i][0] >= end_times_heap[0]:
            # If the room is free, extract the earliest ending meeting
            heapq.heappop(end_times_heap)
        # Add the ending time of the current meeting to the heap
        heapq.heappush(end_times_heap, intervals[i][1])

    # The size of the heap tells us the number of rooms allocated
    return len(end_times_heap)

# Driver code
def main():
    schedule_meetings = [
        [[0, 10], [2, 10], [11, 30]],
        [[3, 7], [2, 12], [10, 20], [8, 24]],
        [[1, 9], [5, 8], [4, 14], [3, 10], [11, 25]],
        [[1, 4], [3, 8], [8, 11], [3, 17], [9, 15], [16, 18]],
        [[4, 12], [5, 11], [4, 9], [2, 12], [9, 22]],
    ]

    for i in range(len(schedule_meetings)):
        print(str(i + 1), ".\tScheduled meetings:", schedule_meetings[i])
        print("\tRooms required:", find_sets(schedule_meetings[i]))
        print("-" * 100)

if __name__ == '__main__':
    main()
Now that we've covered the Merge Intervals pattern, let's move on to another frequently asked coding pattern.
The Knowing What to Track pattern is a strategy for efficiently solving problems by tracking certain properties of the input elements. An example of such a property is the frequency of the occurrence of elements in an array or a string. Tracking such a property can often help derive efficient solutions. This pattern operates in two main phases: the tracking phase and the utilization phase. During the tracking phase, we iterate through the dataset and tally the frequency of each element using appropriate data structures like hash maps or arrays. Once frequencies are calculated, we transition to the utilization phase, where we apply this frequency information to solve specific problems. These problems often involve finding the most frequent element, identifying unique occurrences, or detecting patterns within the dataset. To determine if a problem can be solved using this pattern, look for scenarios where frequency tracking or pattern recognition is essential. The famous interview problems that can be solved with this pattern include Palindrome Permutation, Valid Anagram, Design Tic-Tac-Toe, and Group Anagrams.
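For instance, the Valid Anagram problem mentioned above reduces almost entirely to the tracking phase: tally character frequencies in both strings, then compare the tallies. A minimal sketch (the function name is ours for illustration):

```python
from collections import Counter

def is_anagram(s, t):
    # Tracking phase: tally the character frequencies of each string.
    # Utilization phase: two strings are anagrams exactly when the tallies match.
    return Counter(s) == Counter(t)

print(is_anagram("listen", "silent"))  # True
print(is_anagram("rat", "car"))        # False
```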
Let’s take a closer look at how the following coding problem can be efficiently solved with the Knowing What to Track pattern:
Given an array of integers, arr, and a target, t, identify and return the two indexes of the two elements that add up to the target t. Moreover, the same index can’t be used twice, and there will be only one solution.
Note: We will assume that the array is zero-indexed and the output order doesn’t matter.
We will use a hash map to solve the two-sum problem because it allows us to perform lookups in a constant time, enabling us to quickly check if the difference between the target value and each value of the array already exists in the hash map. If the difference exists in the hash map, we have found the two numbers that add up to the target value, and we can return their indexes. If not, we add the current number and index to the hash map and continue iterating through the input array.
We will first create an empty hash map to store the numbers and their indexes to implement this algorithm. Then, we will iterate over the input array, and for each number in the array, we will calculate its difference (the difference between the target value and the number). Next, we will check if the difference exists in the hash map as a key. If it does, we will retrieve the value of the difference from the hash map, which is the index of the difference value in the array, and return it with the current index as the solution to the problem. If the difference is not in the hash map, we will add the current number as a key and its index i as a value to the hash map. We will continue iterating through the input array until we find a pair of numbers adding to the target value. Once we find such a pair, we will return their indexes.
Let’s look at the code for this solution below:
def two_sum(arr, t):
    # Create an empty hash map to store numbers and their indices
    hashmap = {}

    # Iterating over the array of numbers
    for i in range(len(arr)):
        # Calculating the difference between the current and target number
        difference = t - arr[i]

        # Checking if the difference already exists in the hash map
        if difference in hashmap:
            # Returning the indices of the two numbers that add up to the target
            return [i, hashmap[difference]]

        # Adding the current number and its index to the hash map
        hashmap[arr[i]] = i

# Driver code
def main():
    inputs = [
        [1, 10, 8, 4, 9],
        [5, 12, 15, 21, 6, 17],
        [2, 4, 6, 8, 10, 19],
        [-4, -8, 0, -7, -3, -10],
        [49, 17, 15, 22, -45, 29, 18, -15, 11, 37, 12, -52],
    ]
    targets = [17, 33, 21, -15, 0]

    for i in range(len(targets)):
        print(i + 1, ". Input array = ", inputs[i], sep="")
        print("   Target = ", targets[i], sep="")
        print("   Indices of two numbers = ", two_sum(inputs[i], targets[i]), sep="")
        print("-" * 100)

if __name__ == "__main__":
    main()
Now that we've discussed Knowing What to Track, let's focus on another important coding pattern.
Custom data structures are essentially modified versions of existing data structures tailored to address specific needs. We often must go beyond standard data structures like arrays and hash tables to tackle unique challenges more effectively. For instance, a web crawler that processes numerous pages and URLs might use a specialized "URL queue" to manage these URLs efficiently, ensuring they are unique and prioritized based on relevance. Custom data structures involve creating custom classes that encapsulate the necessary functionality and properties to efficiently manage and manipulate the data. By designing data structures optimized for the problem domain, we can improve the performance and readability of our code while simplifying complex operations. To determine if a problem can benefit from the Custom Data Structures pattern, consider scenarios where standard data structures like arrays, lists, or maps are not sufficient or where specialized operations need to be performed frequently. Common problems suitable for this pattern include implementing priority queues, disjoint-set data structures, or specialized graph representations.
Let’s see how the following example illustrates the application of the Custom Data Structures pattern to efficiently solve the given coding problem:
Implement an LRU cache class with the following functions:
- Init(capacity): Initializes an LRU cache with the capacity size.
- Set(key, value): Adds a new key-value pair or updates an existing key with a new value. If the number of keys has reached the cache capacity, evicts the least recently used key before adding the new key.
- Get(key): Returns the value of the key, or −1 if the key does not exist.
As caches use relatively expensive, faster memory, they are not designed to store large data sets. Whenever the cache becomes full, we must evict some data from it. There are several caching algorithms to implement a cache eviction policy. LRU is a very simple and commonly used algorithm. The core concept of the LRU algorithm is to evict the oldest data from the cache to accommodate more data.
This problem can be solved efficiently if we combine two data structures and use their respective functionalities, as well as the way they interact with each other, to our advantage. A doubly linked list allows us to arrange nodes by the time they were last accessed. However, accessing a value in a linked list takes O(n) time, so we pair the list with a hash map that stores each key along with a reference to its linked list node, bringing access time down to O(1).
Here is the algorithm for the LRU cache:
Set:
If the element exists in the hash map, then update its value and move the corresponding linked list node to the head of the linked list.
Otherwise, if the cache is already full, remove the tail element from the doubly linked list. Then delete its hash map entry, add the new element at the head of the linked list, and add the new key-value pair to the hash map.
Get:
If the element exists in the hash map, move the corresponding linked list node to the head of the linked list and return the element value.
Otherwise, return -1.
Note that the doubly linked list keeps track of the most recently accessed elements. The element at the head of the doubly linked list is the most recently accessed element. All newly inserted elements (in Set) go to the head of the list. Similarly, any element accessed (in the Get operation) goes to the head of the list.
Let’s look at the code for this solution below:
from linked_list import LinkedList

# We will use a linked list of a pair of integers
# where the first integer will be the key
# and the second integer will be the value
class LRUCache:
    # Initializes an LRU cache with the capacity size
    def __init__(self, capacity):
        self.cache_capacity = capacity
        self.cache_map = {}
        self.cache_list = LinkedList()

    # Returns the value of the key, or -1 if the key does not exist.
    def get(self, key):
        # If the key doesn't exist, we return -1
        found_itr = None
        if key in self.cache_map:
            found_itr = self.cache_map[key]
        else:
            return -1

        list_iterator = found_itr

        # If the key exists, we need to move it to the front of the list
        self.cache_list.move_to_head(found_itr)
        return list_iterator.pair[1]

    # Adds a new key-value pair or updates an existing key with a new value
    def set(self, key, value):
        # Check if the key exists in the cache hash map
        if key in self.cache_map:
            found_iter = self.cache_map[key]
            list_iterator = found_iter

            # Move the node corresponding to the key to the front of the list
            self.cache_list.move_to_head(found_iter)

            # We then update the value of the node
            list_iterator.pair[1] = value
            return

        # If the key does not exist and the cache is full
        if len(self.cache_map) == self.cache_capacity:
            # We will need to evict the LRU entry.
            # Get the key of the LRU node; the first element of each
            # cache entry is the key. This is why we needed to store a
            # <key, value> pair in the cache_list. We would not have been
            # able to get the key if we had just stored the values.
            key_tmp = self.cache_list.get_tail().pair[0]

            # Remove the last node in the list
            self.cache_list.remove_tail()

            # Remove the entry from the cache
            del self.cache_map[key_tmp]

        # The insert_at_head function inserts a new element at the front
        # of the list in constant time
        self.cache_list.insert_at_head([key, value])

        # We set the value of the key as the list beginning
        # since we added the new element at the head of the list
        self.cache_map[key] = self.cache_list.get_head()

    def print(self):
        print("Cache current size: ", self.cache_list.size, ", ", end="")
        print("Cache contents: {", end="")
        node = self.cache_list.get_head()
        while node:
            print("{", str(node.pair[0]), ",", str(node.pair[1]), "}", end="")
            node = node.next
            if node:
                print(", ", end="")
        print("}")
        print("-" * 100, "\n")

def main():
    # Creating a cache of size 2
    cache_capacity = 2
    cache = LRUCache(cache_capacity)
    print("Initial state of cache")
    print("Cache capacity: " + str(cache_capacity))
    cache.print()

    keys = [10, 10, 15, 20, 15, 25, 5]
    values = ["20", "get", "25", "40", "get", "85", "5"]

    for i in range(len(keys)):
        if values[i] == "get":
            print("Getting by Key: ", keys[i])
            print("Cached value returned: ", cache.get(keys[i]))
        else:
            print("Setting cache: Key: ", keys[i], ", Value: ", values[i])
            cache.set(keys[i], int(values[i]))
        cache.print()

if __name__ == '__main__':
    main()
Now that we've explored the design and implementation of Custom Data Structures, let's move on to the next coding pattern in the list.
The Sliding Window pattern is a useful tool for efficiently solving problems involving sequential data such as arrays or strings, where computations on subsets of data must be repeated. In this technique, a window is defined as a contiguous subset of elements within the data that adjusts its boundaries as it moves through it. Sequential information processing is efficient because the window only focuses on relevant subsets of the data at any given time, avoiding unnecessary computations on the entire dataset. Computations are typically updated in constant time by considering elements entering or exiting the window. By subtracting leaving elements and adding new ones, the computational time remains constant with each movement of the window. Problems like Find Maximum in Sliding Window, Repeated DNA Sequences, and Best Time to Buy and Sell Stock are commonly solved using the Sliding Window pattern.
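As a quick illustration of the constant-time window update described above (a toy example of ours, not one of the listed problems), consider finding the maximum sum of any k consecutive elements:

```python
def max_window_sum(nums, k):
    # Sum of the first window of size k.
    window_sum = sum(nums[:k])
    best = window_sum
    # Slide the window one step at a time:
    # add the entering element and subtract the leaving one.
    for i in range(k, len(nums)):
        window_sum += nums[i] - nums[i - k]
        best = max(best, window_sum)
    return best

print(max_window_sum([2, 1, 5, 1, 3, 2], 3))  # 9 (window [5, 1, 3])
```

Each slide costs O(1) regardless of k, so the whole pass is O(n) instead of the O(n·k) a naive recomputation would take.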
Let’s see how the following example illustrates the application of the Sliding Window pattern to efficiently solve the given coding problem:
Given a string s and an integer k, find the length of the longest substring in s where all characters are identical, after replacing, at most, k characters with any other lowercase English character.
We can use the Sliding Window pattern, which utilizes two pointers to slide a window over the input string. We initialize the start and end pointers to 0. Then, we repeat the following two steps:

1. Increment the end pointer until the window becomes invalid.
2. Increment the start pointer only if the window is invalid, to make it valid again.

We keep track of the frequency of characters in the current window using a hash map. We also maintain a variable, lengthOfMaxSubstring, to keep track of the longest substring with the same characters after replacements, and mostFreqChar to keep track of the frequency of the most occurring character.

In each iteration, we check whether the new character is in the hash map. If it is present in the hash map, we increment its frequency by 1; otherwise, we add it with a frequency of 1. We then update the frequency of the most occurring character so far using the following expression:

mostFreqChar = max(mostFreqChar, frequency of the current character)

Then, we use the following expression to check if the number of characters in the window other than the most occurring character is greater than k:

end - start + 1 - mostFreqChar > k

If the expression above returns TRUE, the number of replacements required in the current window has exceeded our limit, that is, k. In this case, we decrement the frequency of the character to be dropped out of the window and adjust the window by moving the start pointer one position to the right.

Then, we update lengthOfMaxSubstring with the current window size if the window size is greater than lengthOfMaxSubstring.
Finally, when the entire input string has been traversed, we return the length of the longest substring such that all the characters in the substring are the same.
Let’s have a look at the code for the algorithm we just discussed.
def longest_repeating_character_replacement(s, k):
    # Initialize variables
    string_length = len(s)
    length_of_max_substring = 0
    start = 0
    char_freq = {}
    most_freq_char = 0

    # Iterate over the input string
    for end in range(string_length):
        # If the new character is not in the hash map, add it; else, increment its frequency
        if s[end] not in char_freq:
            char_freq[s[end]] = 1
        else:
            char_freq[s[end]] += 1

        # Update the most frequent char
        most_freq_char = max(most_freq_char, char_freq[s[end]])

        # If the number of replacements in the current window has exceeded the limit, slide the window
        if end - start + 1 - most_freq_char > k:
            char_freq[s[start]] -= 1
            start += 1

        # If this window is the longest so far, update the length of the max substring
        length_of_max_substring = max(end - start + 1, length_of_max_substring)

    # Return the length of the max substring with the same characters after replacement(s)
    return length_of_max_substring

# Driver code
def main():
    input_strings = ["aabccbb", "abbcb", "abccde", "abbcab", "bbbbbbbbb"]
    values_of_k = [2, 1, 1, 2, 4]

    for i in range(len(input_strings)):
        print(i + 1, ".\tInput String: ", input_strings[i], sep="")
        print("\tk: ", values_of_k[i], sep="")
        print("\tLength of longest substring with repeating characters: ",
              longest_repeating_character_replacement(input_strings[i], values_of_k[i]))
        print("-" * 100)

if __name__ == '__main__':
    main()
With our understanding of Sliding Window established, let's move on to discussing the next coding pattern.
In many coding interviews, candidates often encounter problems where binary search comes in handy. It's known for its logarithmic time complexity which makes it super efficient. However, it only works when the input data is already sorted. That's where the Modified Binary Search pattern steps in. It is an advanced adaptation of the traditional binary search algorithm, modified to handle more complex scenarios where elements may not strictly meet the standard sorted criteria. This pattern excels in efficiently locating elements or conditions that are not straightforward to find through linear searching, particularly when dealing with rotated arrays, finding boundaries, or solving the random pick weight problem.
By dividing the search space in half, this method significantly reduces the time complexity to O(log n).
The adaptability of the Modified Binary Search pattern makes it a powerful tool in software development, enhancing the ability to manage and retrieve data efficiently in scenarios where direct comparisons and typical ordering do not apply. This pattern not only streamlines data retrieval processes but also aids in optimizing performance across various programming tasks.
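To make the "rotated array" case concrete, here is a sketch of searching a rotated sorted array: at every step, at least one half of the current window is sorted, and that half tells us where the target can be (the function name is ours for illustration):

```python
def search_rotated(nums, target):
    low, high = 0, len(nums) - 1
    while low <= high:
        mid = low + (high - low) // 2
        if nums[mid] == target:
            return mid
        # The left half [low..mid] is sorted.
        if nums[low] <= nums[mid]:
            if nums[low] <= target < nums[mid]:
                high = mid - 1      # target lies in the sorted left half
            else:
                low = mid + 1       # target must be in the right half
        # Otherwise, the right half [mid..high] is sorted.
        else:
            if nums[mid] < target <= nums[high]:
                low = mid + 1       # target lies in the sorted right half
            else:
                high = mid - 1      # target must be in the left half
    return -1

print(search_rotated([4, 5, 6, 7, 0, 1, 2], 0))  # 4
```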
Let’s see how the following example illustrates the application of the Modified Binary Search pattern to efficiently solve the given coding problem:
We’re given an array of positive integers, weights, where weights[i] is the weight of the index i. We need to implement a function that randomly picks an index in proportion to its weight: the larger the value of weights[i], the heavier the weight is, and the higher the chances of its index being picked.

Suppose, for example, that the array consists of the weights [1, 2, 3]. The total weight is 6, so the expected pick frequencies are:

Index 0: 1/6 of the time
Index 1: 2/6 of the time
Index 2: 3/6 of the time
Note: Since we’re randomly choosing from the options, there is no guarantee that in any specific run of the program, any of the elements will be selected with the exact expected frequency.
We can use the Modified Binary Search pattern to speed up the random index-picking process, reducing the index-searching time from O(n) to O(log n). First, we preprocess the weights into a prefix sum array, where the element at index i stores the cumulative sum of weights up to index i. Next, we generate a random number between 1 and the total weight. Finally, we use binary search to find the index corresponding to the randomly generated number in the prefix sum array. This approach ensures that elements with higher weights have a proportionally higher chance of being selected while maintaining randomness.
Here’s how the algorithm works:

1. The Init() method generates a list of cumulative sums using the given list of weights.
2. The Pick Index() method returns a randomly selected index while considering the provided weights. It works as follows:
   - Generates a random number, target, between 1 and the total sum of the weights.
   - Uses binary search to find the index of the first cumulative sum greater than or equal to the random value. Initialize the low index to 0 and the high index to the length of the list of cumulative sums of weights. While the low index is less than the high index, the algorithm:
     - Calculates the mid index as low + (high - low) // 2.
     - If the cumulative sum at the mid index is less than target, updates the low index to mid + 1.
     - Otherwise, updates the high index to mid.
   - At the end of the binary search, the low pointer points to the index of the first cumulative sum greater than or equal to target. Return this index as the chosen index.
Let’s look at the code for this solution below:
import random

class RandomPickWithWeight:
    # Constructor
    def __init__(self, weights):
        # List to store running sums of weights
        self.running_sums = []
        # Variable to calculate the running sum
        running_sum = 0

        # Iterate through the given weights
        for w in weights:
            # Add the current weight to the running sum
            running_sum += w
            # Append the running sum to the running_sums list
            self.running_sums.append(running_sum)

        # Store the total sum of weights
        self.total_sum = running_sum

    # Method to pick an index based on the weights
    def pick_index(self):
        # Generate a random number between 1 and the total sum of the array
        target = random.randint(1, self.total_sum)

        # Initialize low and high variables for binary search
        low = 0
        high = len(self.running_sums)

        # Perform binary search to find the first value higher than or equal to the target
        while low < high:
            mid = low + (high - low) // 2
            if target > self.running_sums[mid]:
                low = mid + 1
            else:
                high = mid

        # Return the index (low) found
        return low

# Driver code
def main():
    counter = 900
    weights = [
        [1, 2, 3, 4, 5],
        [1, 12, 23, 34, 45, 56, 67, 78, 89, 90],
        [10, 20, 30, 40, 50],
        [1, 10, 23, 32, 41, 56, 62, 75, 87, 90],
        [12, 20, 35, 42, 55],
        [10, 10, 10, 10, 10],
        [10, 10, 20, 20, 20, 30],
        [1, 2, 3],
        [10, 20, 30, 40],
        [5, 10, 15, 20, 25, 30],
    ]
    freq = {}

    for i in range(len(weights)):
        print(i + 1, ".\tList of weights: ", weights[i], ", pick_index() called ", counter, " times", "\n", sep="")
        [freq.setdefault(l, 0) for l in range(len(weights[i]))]
        sol = RandomPickWithWeight(weights[i])
        for j in range(counter):
            index = sol.pick_index()
            freq[index] += 1
        print("-" * 105)
        print("\t{:<10}{:<5}{:<10}{:<5}{:<15}{:<5}{:<20}{:<5}{:<15}".format(
            "Indexes", "|", "Weights", "|", "Occurrences", "|", "Actual Frequency", "|", "Expected Frequency"))
        print("-" * 105)
        for key, value in freq.items():
            print("\t{:<10}{:<5}{:<10}{:<5}{:<15}{:<5}{:<20}{:<5}{:<15}".format(
                key, "|", weights[i][key], "|", value, "|",
                str(round((value / counter) * 100, 2)) + "%", "|",
                str(round(weights[i][key] / sum(weights[i]) * 100, 2)) + "%"))
        freq = {}
        print("\n", "-" * 105, "\n", sep="")

if __name__ == '__main__':
    main()
Now that we've discussed Modified Binary Search, let's focus on another important coding pattern.
The Two Pointers technique is one of the must-know techniques for effective problem-solving. It involves traversing linear data structures, such as arrays or linked lists, using two pointers that move in a coordinated way. Depending on the problem's requirements, the pointers move in the same direction or in opposite directions until a condition is met or the input is exhausted. This technique is the go-to solution for problems involving sequentially arranged data, such as arrays, strings, or linked lists, or for finding pairs of elements that satisfy certain constraints or conditions. From verifying a palindrome to detecting cycles in the given data, the Two Pointers technique showcases its efficiency by providing solutions with linear time complexity.
The dynamic movement of these pointers makes the technique both efficient and versatile. Pointers can move independently of each other based on specific criteria, advancing through the same or different data structures. Whether they move in tandem or diverge, their synchronized traversal enables swift problem resolution.
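The palindrome check mentioned above is the classic illustration of two pointers moving in opposite directions. Here's a minimal sketch (the function name `is_palindrome` is ours, for illustration):

```python
def is_palindrome(s):
    # One pointer starts at each end of the string
    left, right = 0, len(s) - 1
    while left < right:
        # A mismatched pair means the string cannot be a palindrome
        if s[left] != s[right]:
            return False
        # Move both pointers toward the center
        left += 1
        right -= 1
    return True

print(is_palindrome("racecar"))  # True
print(is_palindrome("google"))   # False
```

Because each character is examined at most once, the check runs in linear time with constant extra space.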
Let’s see how the following example illustrates the application of the Two Pointers pattern to efficiently solve the given coding problem:
Given a sequence of non-negative integers representing the heights of bars in an elevation map, the goal is to determine the amount of rainwater that can be trapped between the bars after rain.
An optimized approach to solving this problem utilizes the Two Pointers technique. Instead of separately processing each element's left and right sides, we simplify it into a single iteration using two pointers, `left` and `right`, initially positioned at the elevation map's extremes. The key idea is to maintain two variables, `left_max` and `right_max`, which track the maximum heights encountered on the left and right. As the pointers move inwards, they calculate the trapped water for each bar based on the lower of the two maximum heights.
Here's the step-by-step algorithm to find the solution:
1. Start iterating the `heights` array using two pointers, `left` and `right`. To keep track of the maximum heights on the leftmost and rightmost sides, use two variables, `left_max` and `right_max`.
2. If `left_max` is greater than `right_max`, the maximum height on the left side is greater than the maximum height on the right side. Hence, we proceed on the right side and calculate the trapped water at the current `right` position based on `right_max`. Otherwise, we proceed on the left side.
3. Store the amount of water that can be accumulated by taking the difference between the maximum of the respective side (`left_max` or `right_max`) and the current bar's height.
4. Keep iterating and updating the pointers at each step until `left` becomes greater than `right`.
Let’s look at the code for this solution below:
```python
def rain_water(heights):
    # Initialize two pointers at the leftmost and rightmost positions of the elevation map
    left = 0
    right = len(heights) - 1
    # Variables to store the accumulated rainwater and maximum heights
    stored_water = 0
    left_max, right_max = 0, 0
    while left <= right:
        # If the maximum height on the left is greater, process the right side
        if left_max > right_max:
            stored_water += max(0, right_max - heights[right])
            # Update the right maximum height if necessary
            right_max = max(right_max, heights[right])
            right -= 1
        # Otherwise, process the left side
        else:
            stored_water += max(0, left_max - heights[left])
            # Update the left maximum height if necessary
            left_max = max(left_max, heights[left])
            left += 1
    return stored_water

# Driver code
def main():
    input_list = [[1, 0, 1, 2, 1, 4, 0, 3, 5],
                  [2, 0, 9, 6],
                  [3, 1, 2, 0, 2],
                  [4, 2, 5, 3],
                  [3, 0]]
    index = 1
    for i in input_list:
        print(str(index) + ".\tHeights: " + str(i))
        print("\tMaximum rainwater: " + str(rain_water(i)))
        index += 1
        print("-" * 100)

if __name__ == "__main__":
    main()
```
Now that we've covered the Two Pointers pattern, let's move on to another frequently asked coding pattern.
The Graphs pattern offers a structured approach for solving problems involving the graph data structure, where entities are represented as nodes and their relationships as edges. This pattern involves traversing the graph using algorithms like Depth-First Search (DFS) or Breadth-First Search (BFS) to explore its vertices and edges systematically.
Breadth-First Search (BFS) starts at a selected node and explores all its neighboring nodes at the current depth level before moving to the nodes at the next depth level. It traverses the graph level by level, often using a queue data structure to keep track of the nodes to be visited. BFS is well-suited for finding the shortest path between two nodes in an unweighted graph.
Depth-First Search (DFS) starts at a selected node and explores as far as possible along each branch before backtracking. It explores one branch fully before moving on to the next branch. DFS is often implemented using recursion or a stack data structure to keep track of the visited nodes and explore the unvisited ones. This algorithm is useful for tasks like finding cycles in a graph, topological sorting, or exploring all possible paths from a starting node.
Common problems suitable for this pattern include finding the shortest path between two nodes, determining the connectivity of a network, or identifying cycles within a graph.
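Before we apply the pattern to an interview problem, the two traversals can be sketched on a small adjacency-list graph as follows (the `graph` dictionary and function names are illustrative, not from any specific library):

```python
from collections import deque

def bfs(graph, start):
    # Visit nodes level by level, using a queue to hold the frontier
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.append(neighbor)
                queue.append(neighbor)
    return visited

def dfs(graph, start, visited=None):
    # Explore each branch fully before backtracking, using recursion
    if visited is None:
        visited = []
    visited.append(start)
    for neighbor in graph[start]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited

# A small undirected graph as an adjacency list
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(bfs(graph, 0))  # [0, 1, 2, 3]
print(dfs(graph, 0))  # [0, 1, 3, 2]
```

Note that BFS relies on an explicit queue (`collections.deque`) while DFS relies on the call stack; both visit every node exactly once.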
Let’s check out the following interview problem to see how the Graphs pattern works:
Imagine an island with a rectangular shape that touches both the Pacific and Atlantic oceans. The northern and western sides meet the Pacific, while the southern and eastern sides touch the Atlantic. This island is divided into square cells.
To depict the height above sea level of each cell, we use an integer matrix, `heights`, where `heights[r][c]` represents the elevation of the cell at coordinate `(r, c)`.
When heavy rain pours down on the island every few months, water flows from the island to both the Pacific and Atlantic oceans. The path of flow depends on the heights of the cells.
Water can flow from a cell to any of its neighboring cells (north, south, east, or west) only if the neighboring cell's height is equal to or lower than the current cell's height.
Note: Any cell adjacent to an ocean can channel water into the ocean.
With this information, our task is to return a 2-D array of coordinates. Each entry in this array denotes the coordinates of a cell from which rainwater can flow to both the Pacific and the Atlantic oceans.
The idea stems from the observation that water flows downhill, seeking the lowest possible path. By starting the exploration from the ocean edges and traversing cells equal to or higher in elevation, we can identify regions where water can reach both the Pacific and Atlantic Oceans. The depth-first search (DFS) algorithm is chosen since it naturally emulates the flow of water, recursively exploring cells and marking those that are reachable. The approach focuses on identifying cells that can be reached from both ocean edges, implying that water can flow in both directions, and by finding the intersection of the sets of reachable cells from each ocean, the algorithm effectively pinpoints locations where water can flow to both oceans.
Here’s how the algorithm works:
We initialize the following variables to assist us in performing DFS:

- `num_rows`: The total number of rows in the matrix.
- `num_cols`: The total number of columns in the matrix.
- `pacific_reach`: A hash set that will contain the coordinates of the cells that can be reached from the Pacific Ocean.
- `atlantic_reach`: A hash set that will contain the coordinates of the cells that can be reached from the Atlantic Ocean.
The `dfs` function has the following parameters:

- `row`, `col`: The coordinates of the initial cell where DFS will begin.
- `reach`: The hash set of the respective ocean for which DFS is called.
Here’s how the `dfs` function works:

The very first cell passed to the function is always reachable, since it lies on the border of the matrix adjacent to the respective ocean. Therefore, the coordinates of this cell are added to `reach`.

For the cell that has just been added to `reach`, we perform DFS by exploring all four directions (top, bottom, left, right) from the cell. This is achieved through a loop that adds the following offsets to the current `(row, col)`:

- `(1, 0)`: The cell immediately below the current cell is explored.
- `(0, 1)`: The cell immediately to the right of the current cell is explored.
- `(-1, 0)`: The cell immediately above the current cell is explored.
- `(0, -1)`: The cell immediately to the left of the current cell is explored.
The following conditions are checked for each cell that is explored:

- If the cell is out of bounds, it is skipped since it doesn’t exist in the matrix.
- If the cell is already in `reach`, it is skipped since it has already been visited.
- If the cell’s height is less than the height of the current cell, it is skipped, since water cannot flow from it down to the current cell.

If none of the three conditions above is met, the explored cell is a valid cell on which we can continue DFS. So the `dfs` function is called again for this cell: it is added to `reach`, and its neighbors are explored in turn.
The `dfs` function is then called for each cell of the Pacific and Atlantic borders of the matrix, which populates `pacific_reach` and `atlantic_reach` with the coordinates of the cells that can flow water to the respective oceans.

Finally, we determine the common coordinates in `pacific_reach` and `atlantic_reach` and store them in the output array. Because these cells can flow water to both oceans, we return the output array containing them.
Let’s look at the code for this solution below:
```python
# Helper to display a matrix row by row
# (replaces the course's local helper import: from print import *)
def print_matrix(matrix):
    for row in matrix:
        print("\t", row)

def estimate_water_flow(heights):
    # Get the number of rows and columns in the heights matrix
    num_rows, num_cols = len(heights), len(heights[0])
    # Initialize sets to track cells reachable from the Pacific and Atlantic oceans
    pacific_reach = set()
    atlantic_reach = set()

    # Define a Depth-First Search (DFS) function
    def dfs(row, col, reach):
        reach.add((row, col))
        # Explore all 4 adjacent directions: down, right, up, left
        for (x, y) in [(1, 0), (0, 1), (-1, 0), (0, -1)]:
            new_row, new_col = row + x, col + y
            # Check if the new coordinates are outside the matrix bounds
            if new_row < 0 or new_row >= num_rows or new_col < 0 or new_col >= num_cols:
                continue
            # Skip if the cell is already visited
            if (new_row, new_col) in reach:
                continue
            # Skip if the cell's height is lower than the current cell's height
            if heights[new_row][new_col] < heights[row][col]:
                continue
            # Recursively explore the reachable cells
            dfs(new_row, new_col, reach)

    # Start DFS from the border cells for the Pacific and Atlantic oceans
    for i in range(num_rows):
        dfs(i, 0, pacific_reach)                  # Start from the left column
        dfs(i, num_cols - 1, atlantic_reach)      # Start from the right column
    for i in range(1, num_cols, 1):
        dfs(0, i, pacific_reach)                  # Start from the top row
        dfs(num_rows - 1, i - 1, atlantic_reach)  # Start from the bottom row

    # Find the cells that are reachable from both oceans
    return list(pacific_reach.intersection(atlantic_reach))

# Driver code
def main():
    matrices = [[[1, 2, 2, 3, 5], [3, 2, 3, 4, 4], [2, 4, 5, 3, 1], [6, 7, 1, 4, 5], [5, 1, 1, 2, 4]],
                [[4, 4, 4, 3, 1], [1, 5, 3, 7, 7], [3, 1, 3, 7, 5], [1, 2, 4, 4, 7], [4, 3, 1, 7, 1]],
                [[7, 3, 5, 2, 8], [2, 3, 4, 5, 6], [3, 9, 6, 8, 4]],
                [[1, 0, 1], [1, 1, 0], [1, 1, 1]],
                [[2, 3, 4, 5, 6, 7, 8], [2, 9, 3, 8, 4, 5, 6], [3, 4, 6, 7, 8, 5, 4], [9, 1, 5, 6, 7, 5, 5]]]
    for i in range(len(matrices)):
        print(i + 1, ".\t Input heights: ", sep="")
        print_matrix(matrices[i])
        print("\n\t Common coordinates: ", estimate_water_flow(matrices[i]))
        print("-" * 100)

if __name__ == "__main__":
    main()
```
With our understanding of Graphs established, let’s explore the last, but certainly not least, coding pattern from Google’s list of frequently asked patterns.
The Dynamic Programming pattern is a technique that helps solve complex problems by breaking them down into simpler subproblems and storing their solutions to avoid redundant computations. It applies when a problem exhibits overlapping subproblems and optimal substructure. By storing and reusing intermediate results, dynamic programming enables us to solve problems with improved time and space complexity. For instance, a naive recursive approach to check whether a string like "rotator" is a palindrome, or to calculate Fibonacci numbers, can be inefficient because it repeatedly recomputes the same subproblems. Dynamic programming addresses these inefficiencies through two main strategies:
Memoization (top-down approach): This technique optimizes recursion by storing the results of subproblems the first time they are computed, preventing redundant calculations.
Tabulation (bottom-up approach): Tabulation constructs a table to store the results of smaller subproblems, gradually building up to solve the larger problem.
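The two strategies can be sketched with the Fibonacci computation mentioned above (the function names `fib_memo` and `fib_tab` are ours, for illustration):

```python
# Memoization (top-down): recurse as usual, but cache each result
# the first time it is computed so it is never recomputed
def fib_memo(n, cache=None):
    if cache is None:
        cache = {}
    if n < 2:
        return n
    if n not in cache:
        cache[n] = fib_memo(n - 1, cache) + fib_memo(n - 2, cache)
    return cache[n]

# Tabulation (bottom-up): fill a table from the smallest
# subproblems up to the answer, with no recursion at all
def fib_tab(n):
    if n < 2:
        return n
    table = [0] * (n + 1)
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]

print(fib_memo(30))  # 832040
print(fib_tab(30))   # 832040
```

Both run in linear time, whereas the naive recursion takes exponential time because it solves the same subproblems over and over.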
Let’s see how the following example illustrates the application of the Dynamic Programming pattern to efficiently solve the given coding problem:
Given a string `s`, return the longest palindromic substring in `s`.
If we look at the example above, we notice that whether a substring is a palindrome depends on whether its inner substring is one. We therefore create a lookup table, `dp`, of size `n × n` (where `n` is the length of `s`), in which `dp[i][j]` stores whether the substring `s[i..j]` is a palindrome. If the cell `dp[i][j]` holds the result of an earlier computation, we utilize it instead of recomputing it.
Create a resultant array, `res`, to store the starting and ending indexes of the longest palindromic substring. Initialize it with `[0, 0]`.

Initialize a lookup table, `dp`, with FALSE.

Base case 1: The diagonal in the lookup table is populated with TRUE, because any cell in the diagonal corresponds to a substring of length 1, and a single character is always a palindrome.
Base case 2: We check whether all two-letter substrings are palindromes and update `res` and `dp` accordingly. We do this by iterating over the string, comparing `s[i]` and `s[i+1]`, and storing the result at `dp[i][i+1]`. After that, we also update `res` if the value of `dp[i][i+1]` is TRUE, which tells us that the two-letter substring was a palindrome.
After these base cases, we check all substrings of lengths greater than 2. For example, to check whether the three-letter substring “zin”, represented by `s[0..2]`, is a palindrome, we compare `s[0]` and `s[2]` and look up `dp[1][1]`, which tells us whether the remaining string “i”, represented by `s[1..1]`, is a palindrome. We take the logical AND of these two results and store it at `dp[0][2]`. This way, we avoid redundant computations and check all possible substrings using the lookup table.
Let's implement the algorithm as discussed above:
```python
def longest_palindromic_substring(s):
    # To store the starting and ending indexes of the longest palindromic substring
    res = [0, 0]
    n = len(s)
    # Initialize a lookup table of dimensions n * n
    dp = [[False for _ in range(n)] for _ in range(n)]
    # Base case: A string with one letter is always a palindrome
    for i in range(n):
        dp[i][i] = True
    # Base case: Substrings of length 2
    for i in range(n - 1):
        dp[i][i + 1] = (s[i] == s[i + 1])  # Check if the two characters are equal
        if dp[i][i + 1]:
            res = [i, i + 1]  # Update the resultant array
    # Substrings of lengths greater than 2
    for length in range(3, n + 1):
        i = 0
        # Check every possible substring of this specific length
        for j in range(length - 1, n):  # Iterate over possible ending indexes
            dp[i][j] = dp[i + 1][j - 1] and (s[i] == s[j])
            if dp[i][j]:
                res = [i, j]
            i += 1
    return s[res[0]:res[1] + 1]  # Return the longest palindromic substring

# Driver code
def main():
    strings = ['cat', 'lever', 'xyxxyz', 'wwwwwwwwww', 'tattarrattat']
    for i in range(len(strings)):
        print(i + 1, ".\t Input string: '", strings[i], "'", sep="")
        result = longest_palindromic_substring(strings[i])
        print("\t Longest palindromic substring: ", result, sep="")
        print("-" * 100)

if __name__ == '__main__':
    main()
```
That wraps up our exploration of the coding patterns behind the questions Google asks most frequently.
Mastering the patterns we have just covered is important for acing your Google interview. Understanding the underlying patterns behind the solutions you devise will not only help you tackle similar problems in the future but also demonstrate your depth of understanding to interviewers. We have explored some of the most common coding patterns through interview questions frequently asked by Google, but it’s just a start. Remember, practice makes perfect, so dedicate time to solving problems regularly and seek feedback to improve further. Don’t neglect the Google behavioral interview either. For even better preparation, you may explore the following courses by Educative, which cover a wide range of coding patterns, including Dynamic Programming patterns, in various programming languages:
Moreover, if you are looking for a customized learning plan, take a look at the following paths by Educative:
With determination, preparation, and a solid grasp of coding patterns, you’ll be well-equipped to tackle any coding challenge that comes your way during the Google interview process. Best of luck!
Free Resources