Graph algorithms are commonly used in problem-solving for interviews at major tech companies. They demonstrate a candidate’s ability to handle complex data structures, solve connectivity problems, and work with optimization techniques.
When preparing for coding interviews at major tech companies, it is essential to have an efficient, targeted study plan. One effective strategy is to focus on common data structures and their associated problems; given the vast number of coding problems available, concentrating on a few well-chosen data structures is especially valuable when your time is limited. In today’s blog, we’ll cover the essentials of graph algorithms, their practical applications, and strategies for approaching graph-related problems in interviews.
Graph algorithms are critical to technical interviews, especially for positions requiring strong problem-solving skills and algorithmic knowledge. Understanding and mastering these algorithms can significantly boost your chances of acing coding interviews.
We will cover the following topics in this blog:
Overview of graph types (undirected, directed, weighted).
Key algorithms: DFS, BFS, Dijkstra’s, A*, Kruskal’s, and Prim’s.
Real-world coding examples and interview applications for each algorithm.
Practical strategies for mastering these algorithms effectively.
A graph is a collection of nodes (or vertices) and edges connecting pairs of nodes. In graph theory, entities are represented as nodes, and their connections are expressed through edges.
Let’s review some basic components and common properties of graphs:
| Name | Description |
| --- | --- |
| Vertex (node) | The fundamental unit of a graph, usually represented as a point that holds some information or data. |
| Edge | A connection between two nodes. An edge may have a direction (directed edge) or not (undirected edge). |
| Weight | In a weighted graph, each edge has a weight assigned to it, indicating a cost or value associated with the connection. |
| Degree | The number of edges incident to a node. |
| In-degree | The number of edges coming into a node (directed graphs only). |
| Out-degree | The number of edges going out of a node (directed graphs only). |
| Adjacent nodes | Two nodes directly connected by an edge. |
| Path | A sequence of nodes in which each adjacent pair is connected by an edge. |
| Cycle | A path that starts and ends at the same node. |
Graphs come in various forms, each with unique properties influencing the algorithms used to traverse and analyze them. Look at the following illustration to better understand the types of graphs that we usually see in most coding problems:
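Before diving into the algorithms, it helps to fix a concrete representation. The examples in this blog use Python dictionaries as adjacency lists; here’s a minimal sketch of how an undirected, a directed, and a weighted graph might each be encoded (the node labels and weights are just for illustration):

```python
# Undirected graph: every edge appears in both directions
undirected = {
    'A': ['B', 'C'],
    'B': ['A', 'C'],
    'C': ['A', 'B'],
}

# Directed graph: each edge is listed only under its source node
directed = {
    'A': ['B'],
    'B': ['C'],
    'C': [],
}

# Weighted graph: map each neighbor to the weight of the connecting edge
weighted = {
    'A': {'B': 4, 'C': 1},
    'B': {'A': 4},
    'C': {'A': 1},
}

# The degree of a node is the number of edges incident to it
degree_A = len(undirected['A'])
print(degree_A)  # 2
```

The same dictionary-of-lists (or dictionary-of-dictionaries, for weighted graphs) shape is what all the code samples below operate on.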
Learning key graph algorithms will help you solve many problems effectively and efficiently. This section will examine the most important graph algorithms, exploring their mechanics, applications, and implementation details.
Depth-first search (DFS) is a fundamental algorithm widely used for traversing or searching tree or graph data structures. DFS starts at a selected node and explores as far as possible along each branch before backtracking. It can be implemented using a stack (iteratively) or recursion.
How does DFS work?
Initialization: Start from the root node (or an arbitrary node in a graph) and push it onto a stack.
Traversal: Pop the top node from the stack, visit it, and push all its unvisited adjacent nodes onto the stack.
Backtracking: Continue this process until the stack is empty, meaning all nodes reachable from the starting node have been visited.
Applications of DFS:
Paths in Maze That Lead to Same Room: Given a maze represented as a 2D grid, find a path from the start position to the end position.
Number of Connected Components in an Undirected Graph: Return the number of connected components given an undirected graph.
Here’s an implementation of the DFS algorithm in Python:
```python
def dfs(graph, start):
    visited_nodes = set()  # Set to keep track of visited nodes
    stack = [start]        # Stack to manage the DFS traversal

    while stack:
        node = stack.pop()
        if node not in visited_nodes:
            visited_nodes.add(node)  # Mark the node as visited
            for neighbor in graph[node]:
                if neighbor not in visited_nodes:
                    stack.append(neighbor)  # Add unvisited neighbors to the stack
    return visited_nodes

# Driver code
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}
start_node = 'A'
test_visited_nodes = dfs(graph, start_node)
print(f"Nodes visited from {start_node}: {test_visited_nodes}")
```
Line 1: We define a function `dfs` that takes a graph represented as an adjacency list and a starting node as input.

Lines 2 and 3: We initialize an empty set, `visited_nodes`, to keep track of visited nodes, and a stack containing the starting node.

Lines 5–11: We iterate as long as the stack is not empty.

Line 6: In each iteration, we pop a node from the stack, mark it as visited, and explore its neighbors.

Lines 10 and 11: If a neighbor has not been visited, we add it to the stack for further exploration.

Line 12: Finally, we return the set of all nodes visited during the DFS traversal.
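As noted earlier, DFS can also be written recursively, letting the call stack play the role of the explicit stack. Here’s a minimal recursive sketch over the same kind of adjacency-list graph (the visit order may differ from the iterative version, which pops the most recently pushed neighbor first):

```python
def dfs_recursive(graph, node, visited_nodes=None):
    if visited_nodes is None:
        visited_nodes = set()
    visited_nodes.add(node)  # Mark the current node as visited
    for neighbor in graph[node]:
        if neighbor not in visited_nodes:
            dfs_recursive(graph, neighbor, visited_nodes)  # Recurse into unvisited neighbors
    return visited_nodes

graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}
print(dfs_recursive(graph, 'A'))  # All six nodes are reachable from 'A'
```

One caveat worth mentioning in interviews: Python’s default recursion limit (around 1,000 frames) makes the iterative version safer for very deep graphs.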
Breadth-first search (BFS) is another algorithm for traversing or searching tree or graph data structures. It starts at a selected node and explores all neighbor nodes at the present depth before moving on to nodes at the next depth level. BFS uses a queue data structure.
How does BFS work?
Initialization: Start from the root node (or an arbitrary node in a graph) and enqueue it.
Traversal: Dequeue the front node, visit it, and enqueue all its unvisited adjacent nodes.
Level-wise traversal: Continue this process until the queue is empty, ensuring that nodes are visited level by level.
Applications of BFS:
Shortest Bridge: Given a binary grid containing two islands, find the minimum number of 0s to flip to connect them (a classic BFS grid problem).
Level Order Traversal of a Binary Tree: Common in tree data structures to traverse nodes level by level.
Bipartite Graph: Determine if a graph is bipartite.
Here’s an implementation of the BFS algorithm in Python:
```python
def bfs(graph, start):
    visited_nodes = set()  # Set to keep track of visited nodes
    queue = [start]        # Queue to manage the BFS traversal

    while queue:
        node = queue.pop(0)
        if node not in visited_nodes:
            visited_nodes.add(node)  # Mark the node as visited
            for neighbor in graph[node]:
                if neighbor not in visited_nodes:
                    queue.append(neighbor)  # Add unvisited neighbors to the queue
    return visited_nodes

# Driver code
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}
start_node = 'A'
test_visited_nodes = bfs(graph, start_node)
print(f"Nodes visited from {start_node}: {test_visited_nodes}")
```
Line 1: We define a function `bfs` that takes a graph (represented as a dictionary of nodes and their neighbors) and a starting node as input.

Lines 2 and 3: We initialize a set, `visited_nodes`, to keep track of visited nodes, and a queue containing the starting node `start`.

Lines 5–11: We iterate as long as the queue is not empty.

Line 6: During each iteration, we dequeue (remove and return) the first element from the queue and assign it to the variable `node`.

Lines 7 and 8: We then check whether `node` has been visited before. If not, we add it to the `visited_nodes` set and explore its neighbors.

Lines 9–11: For each neighbor of the current node, if the neighbor has not been visited yet, we add it to the queue.

Line 12: Once all nodes reachable from `start` have been visited, we return the set `visited_nodes` containing all the nodes visited during the BFS traversal.
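One practical note: `queue.pop(0)` on a Python list is O(n), because every remaining element shifts left. In interviews it’s worth mentioning `collections.deque`, whose `popleft()` is O(1). Here’s a sketch of the same traversal with a deque, returning the visit order for illustration:

```python
from collections import deque

def bfs(graph, start):
    visited_nodes = {start}
    queue = deque([start])      # O(1) appends and pops from both ends
    order = []                  # Record the visit order for illustration
    while queue:
        node = queue.popleft()  # O(1), unlike list.pop(0)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited_nodes:
                visited_nodes.add(neighbor)  # Mark when enqueued to avoid duplicates
                queue.append(neighbor)
    return order

graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}
print(bfs(graph, 'A'))  # ['A', 'B', 'C', 'D', 'E', 'F']
```

This version also marks nodes as visited when they are enqueued rather than when they are dequeued, which avoids putting the same node in the queue twice.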
Continuing with the key graph algorithms, next up is Dijkstra’s Algorithm. This algorithm is used to find the shortest path between nodes in a graph, which may represent, for example, road networks. It works by maintaining a set of nodes whose shortest distance from the source is known and continuously expanding this set.
How does Dijkstra’s algorithm work?
Initialization: Set the distance to the source node to 0 and all other nodes to infinity. Add the source node to a priority queue.
Relaxation: Extract the node with the minimum distance from the priority queue and update the distance to its neighbors if a shorter path is found.
Expansion: Repeat the relaxation step until all nodes have been processed.
Applications of Dijkstra’s algorithm:
Shortest paths: Find the minimum-cost route between two nodes in a weighted graph.
Routing: Used in routing protocols like OSPF (Open Shortest Path First) in networks.
Geographical mapping: Find the shortest path in maps and navigation systems.
Here’s an implementation of Dijkstra’s algorithm in Python:
```python
import heapq

def dijkstra(graph, start):
    distances = {vertex: float('infinity') for vertex in graph}  # Initialize distances to infinity
    distances[start] = 0  # Distance to the start node is 0
    pq = [(0, start)]     # Priority queue to manage the nodes to be explored

    while pq:
        current_distance, current_vertex = heapq.heappop(pq)  # Get the node with the smallest distance

        if current_distance > distances[current_vertex]:
            continue

        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight

            if distance < distances[neighbor]:  # If a shorter path is found
                distances[neighbor] = distance  # Update the shortest distance
                heapq.heappush(pq, (distance, neighbor))  # Push the neighbor to the priority queue

    return distances

# Driver code
graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'A': 2, 'C': 4, 'D': 6},
    'C': {'A': 3, 'B': 3, 'D': 5},
    'D': {'B': 2, 'C': 3}
}
start_node = 'A'
distances = dijkstra(graph, start_node)
print(f"Shortest distances from {start_node}: {distances}")
```
Line 3: We define a function `dijkstra()` that takes two arguments: `graph`, a dictionary representing the graph, and `start`, the starting node for the algorithm.

Lines 4–6: We initialize a dictionary named `distances` with every vertex in the graph set to infinity, except the starting vertex, which we set to `0`. We also initialize a priority queue `pq` with a tuple containing the starting distance (`0`) and the starting vertex.

Lines 8–19: We iterate as long as the priority queue `pq` is not empty.

Line 9: During each iteration, we pop the vertex with the smallest distance from the priority queue using `heapq.heappop`.

Lines 11 and 12: We skip the current iteration if the popped distance is greater than the distance already stored in the `distances` dictionary (a stale queue entry).

Lines 14–19: For each neighbor of the current vertex, we calculate the distance to reach that neighbor from the starting vertex through the current vertex.

Lines 17–19: If this calculated distance is less than the distance currently stored for the neighbor in the `distances` dictionary, we update the neighbor’s distance and push the new distance and the neighbor onto the priority queue using `heapq.heappush`.

Line 21: Finally, we return the `distances` dictionary, which contains the shortest distances from the starting vertex to all other vertices in the graph.
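A common interview follow-up is “return the path, not just its length.” One way to handle this, sketched below, is to record each node’s predecessor whenever its distance improves and then walk backward from the target (the graph and node names are illustrative, and the sketch assumes the target is reachable from the start):

```python
import heapq

def dijkstra_with_path(graph, start, target):
    distances = {v: float('infinity') for v in graph}
    distances[start] = 0
    predecessors = {}  # Remember where each best-known path came from
    pq = [(0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > distances[u]:
            continue
        for v, w in graph[u].items():
            if d + w < distances[v]:
                distances[v] = d + w
                predecessors[v] = u  # Best-known way to reach v is via u
                heapq.heappush(pq, (d + w, v))

    # Walk predecessors backward from the target to rebuild the path
    path, node = [], target
    while node != start:
        path.append(node)
        node = predecessors[node]
    path.append(start)
    return distances[target], path[::-1]

graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'A': 2, 'C': 4, 'D': 6},
    'C': {'A': 3, 'B': 3, 'D': 5},
    'D': {'B': 2, 'C': 3}
}
cost, path = dijkstra_with_path(graph, 'A', 'D')
print(cost, path)  # 8 ['A', 'B', 'D']
```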
How about we take a break and try to solve the following “Quick Quiz” on graph algorithms?
Quiz on graph algorithms
In a directed graph, what is true about the edges?
Each edge has a direction associated with it.
Edges can only connect adjacent nodes.
Each node must be connected to every other node.
Edges have no direction and can be traversed in both ways.
The A* algorithm is an informed search algorithm that finds the shortest path from a start node to a goal node using heuristics to improve performance. It combines the features of Dijkstra’s algorithm and greedy best-first search.
To further clarify your understanding of the A* algorithm, check out the What are informed search algorithms? blog.
How does the A* algorithm work?
Initialization: Set the start node’s cost to 0 and add it to a priority queue. Use a heuristic function to estimate the cost from each node to the goal.
Traversal: Extract the node with the lowest total cost (actual cost + heuristic cost) from the priority queue.
Expansion: For each neighbor of the extracted node, if a shorter path is found, update its cost and add it to the priority queue.
Termination: Repeat until the goal node is reached or the priority queue is empty.
Applications of A* algorithm:
Pathfinding: This is common in AI for games and robotics to find the shortest path.
Navigation: This is used in GPS and mapping systems to find the optimal route.
Puzzle solving: This solves puzzles like the 8-puzzle problem.
Here’s an implementation of the A* algorithm in Python:
```python
import heapq

def a_star(graph, start, goal, h):
    open_list = [(0, start)]            # Priority queue to manage the nodes to be explored
    g_scores = {start: 0}               # Cost from start to each node
    f_scores = {start: h(start, goal)}  # Estimated cost from start to goal through each node

    while open_list:
        _, current = heapq.heappop(open_list)  # Get the node with the smallest f_score

        if current == goal:
            return g_scores[current]  # Return the cost if the goal is reached

        for neighbor, weight in graph[current].items():
            tentative_g_score = g_scores[current] + weight
            if neighbor not in g_scores or tentative_g_score < g_scores[neighbor]:
                g_scores[neighbor] = tentative_g_score  # Update the g_score
                f_scores[neighbor] = tentative_g_score + h(neighbor, goal)  # Update the f_score
                heapq.heappush(open_list, (f_scores[neighbor], neighbor))  # Push the neighbor to the priority queue

    return float('infinity')  # The goal is unreachable

# Toy heuristic for letter-labeled nodes (for grid-based graphs, the Manhattan distance is a common choice)
def heuristic(a, b):
    return abs(ord(a) - ord(b))

# Driver code
graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'A': 2, 'C': 3, 'D': 6},
    'C': {'A': 5, 'B': 3, 'D': 2},
    'D': {'B': 6, 'C': 2}
}
start_node = 'A'
goal_node = 'D'
shortest_path_cost = a_star(graph, start_node, goal_node, heuristic)
print(f"Shortest path cost from {start_node} to {goal_node}: {shortest_path_cost}")
```
Using actual costs and heuristic estimates, the A* algorithm finds the shortest path from a start node to a goal node. Here’s a step-by-step explanation of the code:
We initialize a priority queue (`open_list`) with the start node and a cost of `0`. This queue manages which nodes to explore next.

We also initialize `g_scores` with the start node at cost `0`. This dictionary tracks the actual cost from the start node to each node.

We further initialize `f_scores` with the start node’s cost set to the heuristic estimate to the goal. This dictionary combines actual costs and heuristic estimates to prioritize nodes.

We iterate over the priority queue, repeatedly popping the node with the lowest `f_score`.

If the current node is the goal, we return its `g_score` as the shortest path cost.

For each neighbor of the current node, we calculate a tentative `g_score` as the current node’s `g_score` plus the edge weight.

If the tentative `g_score` is lower than the neighbor’s known `g_score`, we update the neighbor’s `g_score` and `f_score` and push it onto the priority queue.

If the goal is not reachable, we return `infinity`.
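The heuristic in the example above (the difference of character codes) is only a stand-in so the code runs on letter-labeled nodes. For A* to guarantee the shortest path, the heuristic must be admissible: it must never overestimate the true remaining cost. On a grid with 4-directional movement, the Manhattan distance is the classic admissible choice; a minimal sketch (the coordinates are hypothetical):

```python
def manhattan(a, b):
    """Admissible heuristic for 4-directional grid movement:
    it never exceeds the true number of steps between two cells."""
    (x1, y1), (x2, y2) = a, b
    return abs(x1 - x2) + abs(y1 - y2)

# Moving from (0, 0) to (3, 2) needs at least 3 + 2 = 5 steps
print(manhattan((0, 0), (3, 2)))  # 5
```

With an admissible (and consistent) heuristic, A* explores no more nodes than Dijkstra’s algorithm on the same graph, which is exactly why it is preferred for pathfinding.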
Take a look at the following visualization for the A* algorithm. In this visual example, we have 6 nodes with weighted edges. This algorithm aims to consider the shortest path from node 0 to node 5 while keeping track of the weights of edges between nodes.
Kruskal’s algorithm is used to find the minimum spanning tree (MST) of a graph. The MST is a subset of the edges that connects all vertices without any cycles and with the minimum possible total edge weight.
How does Kruskal’s algorithm work?
Initialization: Sort all edges in the graph by their weights.
Edge selection: Pick the smallest remaining edge, and add it to the MST if it doesn’t form a cycle with the already included edges.
Cycle detection: Use the union-find (disjoint set) data structure to detect cycles.
Completion: Repeat until there are (V - 1) edges in the MST (where V is the number of vertices).
Applications of Kruskal’s algorithm:
Network design: Designing least-cost networks like telecommunication and electrical grids.
Clustering: Used in hierarchical clustering algorithms in machine learning.
Approximation algorithms: Useful in approximation algorithms for NP-hard problems.
Here’s an implementation of Kruskal’s algorithm in Python:
```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, u):
        """Find the root representative of the set containing u"""
        if u != self.parent[u]:
            self.parent[u] = self.find(self.parent[u])  # Path compression
        return self.parent[u]

    def union(self, u, v):
        """Union operation to merge the sets containing u and v"""
        root_u = self.find(u)
        root_v = self.find(v)
        if root_u != root_v:
            if self.rank[root_u] > self.rank[root_v]:
                self.parent[root_v] = root_u
            else:
                self.parent[root_u] = root_v
                if self.rank[root_u] == self.rank[root_v]:
                    self.rank[root_v] += 1

def kruskal(graph):
    """Kruskal's algorithm to find the minimum spanning tree (MST)"""
    mst = []
    edges = sorted(graph['edges'], key=lambda edge: edge[2])  # Sort edges by weight
    ds = DisjointSet(graph['vertices'])  # Initialize the disjoint set

    # Process each edge in sorted order
    for u, v, weight in edges:
        if ds.find(u) != ds.find(v):  # Check if u and v belong to different sets
            mst.append((u, v, weight))  # Include the edge in the MST
            ds.union(u, v)  # Union the sets containing u and v

    return mst

# Example usage and demonstration
if __name__ == "__main__":
    # Example graph representation (vertices and edges)
    graph = {
        'vertices': 4,
        'edges': [
            (0, 1, 9),
            (0, 2, 5),
            (0, 3, 4),
            (1, 3, 14),
            (2, 3, 3)
        ]
    }

    # Applying Kruskal's algorithm to find the MST
    minimum_spanning_tree = kruskal(graph)
    print("Minimum Spanning Tree (MST):\n")
    for u, v, weight in minimum_spanning_tree:
        print(f"\tEdge: {u} - {v}, Weight: {weight}")
```
The `DisjointSet` class provides methods for efficient union-find operations, which are crucial for implementing Kruskal’s algorithm for finding a graph’s MST.

The `DisjointSet` class (Lines 1–22):

We initialize a `self.parent` array in which each vertex initially points to itself, and a `self.rank` array to keep track of tree depths when performing union by rank.

We then define two methods:

The `find(self, u)` method finds the root representative of the set containing vertex `u`.

The `union(self, u, v)` method merges the two sets containing vertices `u` and `v`. It uses the rank heuristic to keep the trees flat and optimize the union operation.
Kruskal’s algorithm (Lines 24–36):

We define the `kruskal(graph)` function, which takes a graph representation and returns the MST as a list of edges.

Next, we sort the edges by their weights so lighter edges are processed first (`edge[2]` is an edge’s weight).

We use a `DisjointSet` instance (`ds`) to keep track of connected components.

Finally, we iterate through the sorted edges, adding an edge to `mst` whenever its endpoints belong to different sets (checked using the `find` and `union` operations of `DisjointSet`).
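The same union-find idea powers a classic interview question on its own: detecting whether an undirected graph contains a cycle. If an edge’s two endpoints already share a root, that edge closes a cycle. A minimal sketch, independent of the `DisjointSet` class above and using hypothetical edge lists:

```python
def has_cycle(n, edges):
    """Return True if the undirected graph with n vertices and the
    given (u, v) edge list contains a cycle."""
    parent = list(range(n))

    def find(u):  # Find the root, halving the path as we go
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:          # Endpoints already connected: this edge closes a cycle
            return True
        parent[ru] = rv       # Otherwise, merge the two sets
    return False

print(has_cycle(4, [(0, 1), (1, 2), (2, 3)]))  # False: a simple path
print(has_cycle(4, [(0, 1), (1, 2), (2, 0)]))  # True: triangle 0-1-2
```

This is exactly the check Kruskal’s algorithm performs before accepting an edge into the MST.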
Prim’s algorithm is also used to find a graph’s MST. It grows the MST one edge at a time, starting from an arbitrary vertex and adding the cheapest edge from the tree to a vertex not yet in the tree.
How does Prim’s algorithm work?
Initialization: Start from an arbitrary vertex and add it to the MST.
Edge selection: Add the smallest edge that connects a vertex in the MST to a vertex outside the MST.
Priority queue: Use a priority queue to efficiently select the smallest edge.
Completion: Repeat until all vertices are included in the MST.
Applications of Prim’s algorithm:
Network design: Designing minimum-cost networks.
Approximation algorithms: Useful in approximation algorithms for certain optimization problems.
Computer graphics: Used in constructing MST in graphics algorithms.
Here’s an implementation of Prim’s algorithm in Python:
```python
import heapq

def prim(graph, start):
    """Prim's algorithm to find the minimum spanning tree (MST) of a graph"""
    mst = []         # List to store the edges of the MST
    visited = set()  # Set to track visited vertices
    min_heap = [(0, start, None)]  # Min-heap to prioritize edges by weight

    while min_heap:
        weight, u, prev = heapq.heappop(min_heap)  # Get the edge with the minimum weight
        if u in visited:
            continue  # Skip if the vertex has already been visited
        visited.add(u)  # Mark the vertex as visited
        if prev is not None:
            mst.append((prev, u, weight))  # Add the edge to the MST

        # Push all adjacent edges to the min-heap
        for neighbor, edge_weight in graph[u].items():
            if neighbor not in visited:
                heapq.heappush(min_heap, (edge_weight, neighbor, u))

    return mst

# Example usage and demonstration
if __name__ == "__main__":
    # Example graph representation (using an adjacency list)
    graph = {
        0: {1: 9, 2: 5, 3: 4},
        1: {0: 9, 3: 14},
        2: {0: 5, 3: 3},
        3: {0: 4, 1: 14, 2: 3}
    }

    # Applying Prim's algorithm to find the MST starting from vertex 0
    minimum_spanning_tree = prim(graph, 0)
    print("Minimum Spanning Tree (MST):\n")
    for u, v, weight in minimum_spanning_tree:
        print(f"\tEdge: {u} - {v}, Weight: {weight}")
```
Lines 5–7: We initialize a list, `mst`, to store the edges of the MST; a set, `visited`, to keep track of vertices already included in the MST; and a min-heap, `min_heap`, to manage candidate edges, initialized with a tuple containing the weight `0`, the starting vertex, and `None` (indicating no previous vertex).

Lines 9–20: We iterate as long as the min-heap is not empty, popping the edge with the smallest weight in each iteration.

Lines 11 and 12: If the popped vertex `u` has already been visited, we skip to the next iteration.

Lines 13–15: Otherwise, we mark `u` as visited, and if `prev` is not `None`, we add the edge `(prev, u, weight)` to the MST.

Lines 18–20: We iterate through all unvisited neighbors of the current vertex `u` and push the edge `(edge_weight, neighbor, u)` onto the min-heap.

Line 22: Once the heap is empty, we return the list of edges that form the MST.
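Since Kruskal’s and Prim’s algorithms both produce a minimum spanning tree, a handy sanity check in practice is that, on a connected graph, both results have exactly V − 1 edges and the same total weight. A small helper sketched against the `(u, v, weight)` edge-list format used in this section (the example MST below is illustrative):

```python
def mst_summary(mst_edges):
    """Return (edge count, total weight) for a list of (u, v, weight) edges."""
    total = sum(weight for _, _, weight in mst_edges)
    return len(mst_edges), total

# A hypothetical MST of a 4-vertex graph: 3 edges, as expected for V - 1
example_mst = [(0, 3, 4), (2, 3, 3), (0, 1, 9)]
count, total = mst_summary(example_mst)
print(count, total)  # 3 16
```

Note that while the total weight of any two MSTs of the same graph is identical, the edge sets themselves can differ when the graph has ties in edge weights.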
In this section, we’ve listed the most frequently asked interview questions on graphs.
| Problem Name | Algorithm(s) That Can Be Used |
| --- | --- |
|  | DFS, BFS |
|  | DFS, BFS, Dijkstra’s |
|  | DFS |
|  | Kruskal’s algorithm, Prim’s algorithm |
|  | DFS |
|  | DFS |
|  | A*, Dijkstra’s |
|  | DFS, A*, Dijkstra’s |
|  | BFS |
How about taking an AI-based mock interview to further strengthen and practice your knowledge of graphs?
Go ahead and check out our mock interviews page, where you’ll find a bundle of interviews. Search for “Graphs” on the page, select your experience level, and polish your skills.
Mastering graph algorithms is essential for success in coding interviews. Understanding the fundamental algorithms, practicing common problems, and applying problem-solving strategies will enhance your ability to tackle graph-related interview questions confidently.
After reading this blog, the next step in your interview preparation journey is to learn more coding algorithms. As you familiarize yourself with each algorithm by practicing interview problems, you’ll notice a sharp increase in your problem-solving and analytical skills.
Consider using Educative’s coding interview patterns course series to guide you through this process. The series covers 26 essential patterns and algorithms with clear explanations, example code, and practical exercises. By following these steps, you’ll build a strong foundation and be well prepared for any technical interview.
Best of luck with your next technical interview!
Free Resources