
Mastering graph algorithms for coding interviews

14 min read
Dec 02, 2024
Contents
What is a graph?
Types of graphs
Key graph algorithms
Depth-first search (DFS)
Pseudocode for DFS
Explanation
Breadth-first search (BFS)
Pseudocode for BFS
Explanation
Dijkstra’s algorithm
Pseudocode for Dijkstra’s algorithm
Explanation
A* algorithm
Pseudocode for the A* algorithm
Explanation
A* algorithm visualization
Kruskal’s algorithm
Pseudocode for Kruskal’s algorithm
Explanation
Prim’s algorithm
Pseudocode for Prim’s algorithm
Explanation
Most popular graph-based interview questions
Test your understanding of graphs
Conclusion and further readings

When preparing for coding interviews with major tech companies, it is essential to have an efficient and targeted study plan. One effective strategy is to focus on common data structures and their associated problems. However, given the vast number of coding problems available, concentrating on specific, well-selected data structures can be particularly beneficial for those with limited time. In today’s blog, we’ll cover the essentials of graph algorithms, their practical applications, and strategies to approach graph-related problems in interviews.

Graph algorithms are critical to technical interviews, especially for positions requiring strong problem-solving skills and algorithmic knowledge. Understanding and mastering these algorithms can significantly boost your chances of acing coding interviews.

We will cover the following topics in this blog:

  • Overview of graph types (undirected, directed, weighted).

  • Key algorithms: DFS, BFS, Dijkstra’s, A*, Kruskal’s, and Prim’s.

  • Real-world coding examples and interview applications for each algorithm.

  • Practical strategies for mastering these algorithms effectively.

What is a graph?#

A graph is a collection of nodes (or vertices) and edges connecting pairs of nodes. In graph theory, entities are represented as nodes, and their connections are expressed through edges.

Let’s review some basic components and common properties of graphs:

  • Vertex (node): The fundamental unit of a graph, usually represented as a point that holds some information or data.

  • Edge: A connection between two nodes. An edge may have a direction (directed edge) or not (undirected edge).

  • Weight: In a weighted graph, each edge has a weight assigned to it, indicating the cost or value of the connection.

  • Degree: The number of edges incident to a node.

  • In-degree: The number of edges coming toward a node (directed graphs only).

  • Out-degree: The number of edges going away from a node (directed graphs only).

  • Adjacent nodes: Two nodes directly connected by an edge.

  • Path: A sequence of nodes in which each adjacent pair is connected by an edge.

  • Cycle: A path that starts and ends at the same node.
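
To make these terms concrete, here’s a minimal sketch of how a small directed, weighted graph can be stored as a Python dictionary of adjacency maps, and how out-degree and in-degree fall out of that representation (the example graph and helper functions are illustrative, not from the original blog):

# A small directed, weighted graph stored as an adjacency dictionary.
# Each key is a vertex; each value maps a neighbor to the edge weight.
graph = {
    'A': {'B': 4, 'C': 1},  # edges A->B (weight 4) and A->C (weight 1)
    'B': {'D': 2},
    'C': {'B': 1, 'D': 5},
    'D': {}
}

def out_degree(graph, node):
    """Number of edges going away from the node."""
    return len(graph[node])

def in_degree(graph, node):
    """Number of edges coming toward the node."""
    return sum(1 for neighbors in graph.values() if node in neighbors)

print(out_degree(graph, 'A'))  # 2
print(in_degree(graph, 'B'))   # 2 (from A and C)
print(in_degree(graph, 'D'))   # 2 (from B and C)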

Types of graphs#

Graphs come in various forms, each with unique properties influencing the algorithms used to traverse and analyze them. Look at the following illustration to better understand the types of graphs that we usually see in most coding problems:

[Illustration: common graph types, such as undirected, directed, and weighted graphs]
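
As a rough sketch of how these types differ in code, here is how an undirected, a directed, and a weighted graph might each be stored as a Python adjacency structure (the small example graphs are illustrative):

# Undirected, unweighted: each edge appears in both adjacency lists
undirected = {'A': ['B', 'C'], 'B': ['A'], 'C': ['A']}

# Directed, unweighted: an edge A->B does not imply B->A
directed = {'A': ['B'], 'B': ['C'], 'C': []}

# Weighted (here directed): neighbors map to edge weights
weighted = {'A': {'B': 4}, 'B': {'C': 1}, 'C': {}}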

Key graph algorithms#

Learning key graph algorithms will help you solve many problems effectively and efficiently. This section will examine the most important graph algorithms, exploring their mechanics, applications, and implementation details.

Depth-first search (DFS)#

Depth-first search (DFS) is a fundamental algorithm widely used for traversing or searching tree or graph data structures. DFS starts at a selected node and explores as far as possible along each branch before backtracking. It can be implemented using a stack (iteratively) or recursion.

How does DFS work?

  1. Initialization: Start from the root node (or an arbitrary node in a graph) and push it onto a stack.

  2. Traversal: Pop the top node from the stack, visit it, and push all its unvisited adjacent nodes onto the stack.

  3. Backtracking: Continue this process until the stack is empty, meaning all nodes reachable from the starting node have been visited.

Applications of DFS:

  • Cycle detection: Detecting cycles in directed and undirected graphs.

  • Topological sorting: Ordering tasks with dependencies in directed acyclic graphs.

  • Connected components: Finding all nodes reachable from a given node.

Pseudocode for DFS#

Here’s the pseudocode for the DFS algorithm in Python.

def dfs(graph, start):
    visited_nodes = set()  # Set to keep track of visited nodes
    stack = [start]  # Stack to manage the DFS traversal

    while stack:
        node = stack.pop()
        if node not in visited_nodes:
            visited_nodes.add(node)  # Mark the node as visited
            for neighbor in graph[node]:
                if neighbor not in visited_nodes:
                    stack.append(neighbor)  # Add unvisited neighbors to the stack
    return visited_nodes

# Driver code
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}

start_node = 'A'
test_visited_nodes = dfs(graph, start_node)
print(f"Nodes visited from {start_node}: {test_visited_nodes}")

Explanation#

  • Line 1: We define a function dfs that takes a graph represented as an adjacency list and a starting node as input.

  • Lines 2 and 3: Then, we initialize an empty set, visited_nodes, to keep track of visited nodes and a stack with the starting node.

  • Lines 5–11: We iterate as long as the stack is not empty.

    • Line 6: In each iteration, we pop a node from the stack, mark it as visited, and explore its neighbors.

      • Lines 10 and 11: And if a neighbor is not visited, we add it to the stack for further exploration.

  • Line 12: Finally, we return a set of all nodes visited during the DFS traversal.
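
As noted above, DFS can also be implemented with recursion. Here’s a short recursive sketch of that approach using the same example graph; it visits the same set of nodes, though possibly in a different order than the stack-based version (the function name dfs_recursive is illustrative):

def dfs_recursive(graph, node, visited_nodes=None):
    """Recursive DFS: visit the node, then recurse into unvisited neighbors."""
    if visited_nodes is None:
        visited_nodes = set()
    visited_nodes.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited_nodes:
            dfs_recursive(graph, neighbor, visited_nodes)
    return visited_nodes

graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}
print(dfs_recursive(graph, 'A'))  # {'A', 'B', 'C', 'D', 'E', 'F'}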

Breadth-first search (BFS)#

Breadth-first search is another algorithm for traversing or searching tree or graph data structures. It starts at a selected node and explores all neighbor nodes at the present depth before moving on to nodes at the next depth level. BFS uses a queue data structure.

How does BFS work?

  1. Initialization: Start from the root node (or an arbitrary node in a graph) and enqueue it.

  2. Traversal: Dequeue the front node, visit it, and enqueue all its unvisited adjacent nodes.

  3. Level-wise traversal: Continue this process until the queue is empty, ensuring that nodes are visited level by level.

Applications of BFS:

  • Shortest paths in unweighted graphs: Finding the minimum number of edges between two nodes.

  • Level-order traversal: Visiting the nodes of a tree or graph level by level.

  • Connectivity checks: Finding all nodes reachable from a given node.

Pseudocode for BFS#

Here’s the pseudocode for the BFS algorithm in Python.

def bfs(graph, start):
    visited_nodes = set()  # Set to keep track of visited nodes
    queue = [start]  # Queue to manage the BFS traversal

    while queue:
        node = queue.pop(0)
        if node not in visited_nodes:
            visited_nodes.add(node)  # Mark the node as visited
            for neighbor in graph[node]:
                if neighbor not in visited_nodes:
                    queue.append(neighbor)  # Add unvisited neighbors to the queue
    return visited_nodes

# Driver code
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}

start_node = 'A'
test_visited_nodes = bfs(graph, start_node)
print(f"Nodes visited from {start_node}: {test_visited_nodes}")

Explanation#

  • Line 1: We define a function bfs(graph, start) that takes in a graph (represented as a dictionary of nodes and their neighbors) and a starting node as input.

  • Lines 2 and 3: We initialize a set, visited_nodes, to keep track of visited nodes, and a queue with the starting node start to perform BFS traversal.

  • Lines 5–11: We then iterate as long as the queue is not empty.

    • Line 6: During each iteration, we dequeue (remove and return) the first element from the queue and assign it to the variable node.

    • Lines 7 and 8: Then, we check if the node has not been visited before. If so, we add it to the visited_nodes set and explore its neighbors.

      • Lines 9–11: For each neighbor of the current node, if the neighbor has not been visited yet, we add the neighbor to the queue.

  • Line 12: Once all reachable nodes from the start node have been visited, we return the set visited_nodes containing all the nodes visited during the BFS traversal.
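
One practical note: queue.pop(0) on a Python list takes O(n) time because the remaining elements shift left. A common refinement, sketched below as an addition rather than part of the original code, is to use collections.deque, whose popleft() runs in O(1):

from collections import deque

def bfs_deque(graph, start):
    """BFS using collections.deque so dequeuing is O(1) instead of O(n)."""
    visited_nodes = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()  # O(1) dequeue
        if node not in visited_nodes:
            visited_nodes.add(node)
            for neighbor in graph[node]:
                if neighbor not in visited_nodes:
                    queue.append(neighbor)
    return visited_nodes

graph = {
    'A': ['B', 'C', 'D'],
    'B': ['D', 'E'],
    'C': ['E', 'F'],
    'D': [],
    'E': ['F'],
    'F': []
}
print(bfs_deque(graph, 'A'))  # {'A', 'B', 'C', 'D', 'E', 'F'}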

If you’d like to dive deeper into graph algorithms or explore algorithmic problem solving in general, Educative offers a range of popular courses on these topics.

Dijkstra’s algorithm#

Continuing with the key graph algorithms, next up is Dijkstra’s algorithm. This algorithm finds the shortest path between nodes in a weighted graph, which may represent, for example, a road network. It works by maintaining a set of nodes whose shortest distance from the source is known and continuously expanding this set.

How does Dijkstra’s algorithm work?

  1. Initialization: Set the distance to the source node to 0 and all other nodes to infinity. Add the source node to a priority queue.

  2. Relaxation: Extract the node with the minimum distance from the priority queue and update the distance to its neighbors if a shorter path is found.

  3. Expansion: Repeat the relaxation step until all nodes have been processed.

Applications of Dijkstra’s algorithm:

  • Shortest path queries: Finding the minimum-cost path between two nodes in a weighted graph with nonnegative edge weights.

  • Routing: Used in routing protocols like OSPF (Open Shortest Path First) in networks.

  • Geographical mapping: Find the shortest path in maps and navigation systems.

Pseudocode for Dijkstra’s algorithm#

Here’s the pseudocode for Dijkstra’s algorithm in Python.

import heapq

def dijkstra(graph, start):
    distances = {vertex: float('infinity') for vertex in graph}  # Initialize distances to infinity
    distances[start] = 0  # Distance to the start node is 0
    pq = [(0, start)]  # Priority queue to manage the nodes to be explored

    while pq:
        current_distance, current_vertex = heapq.heappop(pq)  # Get the node with the smallest distance

        if current_distance > distances[current_vertex]:
            continue

        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight

            if distance < distances[neighbor]:  # If a shorter path is found
                distances[neighbor] = distance  # Update the shortest distance
                heapq.heappush(pq, (distance, neighbor))  # Push the neighbor to the priority queue

    return distances

# Driver code
graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'A': 2, 'C': 4, 'D': 6},
    'C': {'A': 3, 'B': 3, 'D': 5},
    'D': {'B': 2, 'C': 3}
}

start_node = 'A'
distances = dijkstra(graph, start_node)
print(f"Shortest distances from {start_node}: {distances}")

Explanation#

  • Line 3: We define a function dijkstra() that takes two arguments: graph, which is a dictionary representing the graph, and start, which is the starting node for the algorithm.

  • Lines 4–6: We then initialize a dictionary named distances with all vertices in the graph set to infinity, except for the starting vertex, which we set to 0. We also initialize a priority queue pq with a tuple containing the starting vertex and its distance (0).

  • Lines 8–19: We iterate as long as the priority queue pq is not empty.

    • Line 9: We pop the vertex with the smallest distance from the priority queue during each iteration using heapq.heappop.

    • Lines 11 and 12: We skip the current iteration if the distance to the vertex is greater than the distance already stored in the distances dictionary.

    • Lines 14–19: For each neighbor of the current vertex, we calculate the distance to reach that neighbor from the starting vertex through the current vertex.

      • Lines 17–19: If this calculated distance is less than the distance currently stored for the neighbor in the distances dictionary, we update the distance for that neighbor and add the new distance and the neighbor to the priority queue using heapq.heappush.

  • Line 21: Finally, we return the distances dictionary, which contains the shortest distances from the starting vertex to all other vertices in the graph.
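
Interviews often ask for the actual route, not just its length. Here’s a minimal sketch of one way to extend the code above (the predecessors dictionary and the dijkstra_with_path name are illustrative additions): it records where each improved distance came from, then walks backward from the target to rebuild the path.

import heapq

def dijkstra_with_path(graph, start, target):
    """Dijkstra's algorithm that also reconstructs the shortest path to target."""
    distances = {vertex: float('infinity') for vertex in graph}
    distances[start] = 0
    predecessors = {}  # best known predecessor of each vertex
    pq = [(0, start)]
    while pq:
        current_distance, current_vertex = heapq.heappop(pq)
        if current_distance > distances[current_vertex]:
            continue
        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                predecessors[neighbor] = current_vertex
                heapq.heappush(pq, (distance, neighbor))

    # Walk backward from the target to rebuild the path
    # (assumes the target is reachable from the start)
    path, node = [], target
    while node != start:
        path.append(node)
        node = predecessors[node]
    path.append(start)
    return distances[target], path[::-1]

graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'A': 2, 'C': 4, 'D': 6},
    'C': {'A': 3, 'B': 3, 'D': 5},
    'D': {'B': 2, 'C': 3}
}
cost, path = dijkstra_with_path(graph, 'A', 'D')
print(f"Cost: {cost}, Path: {path}")  # Cost: 8, Path: ['A', 'B', 'D']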

How about we take a break and try to solve the following “Quick Quiz” on graph algorithms?

Quiz on graph algorithms

Question 1: In a directed graph, what is true about the edges?

  A) Each edge has a direction associated with it.

  B) Edges can only connect adjacent nodes.

  C) Each node must be connected to every other node.

  D) Edges have no direction and can be traversed in both ways.

A* algorithm#

The A* algorithm is an informed search algorithm that finds the shortest path from a start node to a goal node using heuristics to improve performance. It combines the features of Dijkstra’s algorithm and greedy best-first search.

To further clarify your understanding of the A* algorithm, check out the What are informed search algorithms? blog.

How does the A* algorithm work?

  1. Initialization: Set the start node’s cost to 0 and add it to a priority queue. Use a heuristic function to estimate the cost from each node to the goal.

  2. Traversal: Extract the node with the lowest total cost (actual cost + heuristic cost) from the priority queue.

  3. Expansion: If a shorter path is found, update the cost to the node’s neighbors and add them to the priority queue.

  4. Termination: Repeat until the goal node is reached or the priority queue is empty.

Applications of A* algorithm:

  • Pathfinding: This is common in AI for games and robotics to find the shortest path.

  • Navigation: This is used in GPS and mapping systems to find the optimal route.

  • Puzzle solving: This solves puzzles like the 8-puzzle problem.

Pseudocode for the A* algorithm#

Here’s the pseudocode for the A* algorithm in Python.

import heapq

def a_star(graph, start, goal, h):
    open_list = [(0, start)]  # Priority queue to manage the nodes to be explored
    g_scores = {start: 0}  # Cost from start to each node
    f_scores = {start: h(start, goal)}  # Estimated cost from start to goal through each node

    while open_list:
        _, current = heapq.heappop(open_list)  # Get the node with the smallest f_score
        if current == goal:
            return g_scores[current]  # Return the cost if goal is reached
        for neighbor, weight in graph[current].items():
            tentative_g_score = g_scores[current] + weight
            if neighbor not in g_scores or tentative_g_score < g_scores[neighbor]:
                g_scores[neighbor] = tentative_g_score  # Update the g_score
                f_scores[neighbor] = tentative_g_score + h(neighbor, goal)  # Update the f_score
                heapq.heappush(open_list, (f_scores[neighbor], neighbor))  # Push the neighbor to the priority queue
    return float('infinity')

# Toy heuristic for this example: absolute difference of character codes
# (for grid-based graphs, the Manhattan distance is a common choice)
def heuristic(a, b):
    return abs(ord(a) - ord(b))

# Driver code
graph = {
    'A': {'B': 2, 'C': 5},
    'B': {'A': 2, 'C': 3, 'D': 6},
    'C': {'A': 5, 'B': 3, 'D': 2},
    'D': {'B': 6, 'C': 2}
}

start_node = 'A'
goal_node = 'D'
shortest_path_cost = a_star(graph, start_node, goal_node, heuristic)
print(f"Shortest path cost from {start_node} to {goal_node}: {shortest_path_cost}")

Explanation#

Using actual costs and heuristic estimates, the A* algorithm finds the shortest path from a start node to a goal node. Here’s a step-by-step explanation of the code:

  • We initialize a priority queue (open_list) with the start node and a cost of 0. This queue will help manage which nodes to explore next.

  • We also initialize g_scores with the start node having a cost of 0. This dictionary will help track the actual cost from the start node to each node.

  • We further initialize f_scores with the start node having a cost equal to the heuristic estimate of the goal. Using this dictionary, we combine actual cost and heuristic estimates to prioritize nodes.

  • We loop while the priority queue is not empty, and in each iteration we pop the node with the lowest f_score from the queue.

    • If the current node is the goal, we return its g_score as the shortest path cost.

    • We calculate the tentative g_score using the formula current g_score + edge weight for each neighbor of the current node.

      • If the tentative g_score is lower than the known g_score for the neighbor, we update the g_score and f_score for the neighbor and push it onto the priority queue.

  • If the goal is never reached, we return infinity.
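
The heuristic used in the driver code above is only a toy based on character codes. On grid-based graphs, where A* is most commonly applied, the Manhattan distance is a standard admissible heuristic; here’s a brief sketch with illustrative grid coordinates:

def manhattan(cell, goal):
    """Admissible heuristic for 4-directional grid movement with unit step cost."""
    (r1, c1), (r2, c2) = cell, goal
    return abs(r1 - r2) + abs(c1 - c2)

# Example: estimated remaining cost from cell (1, 2) to goal (4, 6)
print(manhattan((1, 2), (4, 6)))  # 3 + 4 = 7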

A* algorithm visualization#

Take a look at the following visualization for the A* algorithm. In this visual example, we have six nodes connected by weighted edges, and the algorithm incrementally finds the shortest path from node 0 to node 5 while keeping track of the edge weights between nodes.


Kruskal’s algorithm#

Kruskal’s algorithm is used to find the minimum spanning tree (MST) of a graph. The MST is a subset of the edges that connects all vertices without any cycles and with the minimum possible total edge weight.

How does Kruskal’s algorithm work?

  1. Initialization: Sort all edges in the graph by their weights.

  2. Edge selection: Pick the smallest remaining edge and add it to the MST if it doesn’t form a cycle with the edges already included.

  3. Union-Find: Use the Union-Find (disjoint set) data structure to detect cycles.

  4. Completion: Repeat until there are (V - 1) edges in the MST (where V is the number of vertices).

Applications of Kruskal’s algorithm:

  • Network design: Designing least-cost networks like telecommunication and electrical grids.

  • Clustering: Used in hierarchical clustering algorithms in machine learning.

  • Approximation algorithms: Useful in approximation algorithms for NP-hard problems.

Pseudocode for Kruskal’s algorithm#

Here’s the pseudocode for Kruskal’s algorithm in Python.

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, u):
        """ Find the root representative of the set containing u """
        if u != self.parent[u]:
            self.parent[u] = self.find(self.parent[u])  # Path compression
        return self.parent[u]

    def union(self, u, v):
        """ Union operation to merge sets containing u and v """
        root_u = self.find(u)
        root_v = self.find(v)
        if root_u != root_v:
            if self.rank[root_u] > self.rank[root_v]:
                self.parent[root_v] = root_u
            else:
                self.parent[root_u] = root_v
                if self.rank[root_u] == self.rank[root_v]:
                    self.rank[root_v] += 1

def kruskal(graph):
    """ Implementation of Kruskal's algorithm to find Minimum Spanning Tree (MST) """
    mst = []
    edges = sorted(graph['edges'], key=lambda edge: edge[2])  # Sort edges by weight
    ds = DisjointSet(graph['vertices'])  # Initialize Disjoint Set

    # Process each edge in sorted order
    for u, v, weight in edges:
        if ds.find(u) != ds.find(v):  # Check if u and v belong to different sets
            mst.append((u, v, weight))  # Include the edge in MST
            ds.union(u, v)  # Union the sets containing u and v

    return mst

# Example usage and demonstration
if __name__ == "__main__":
    # Example graph representation (vertices and edges)
    graph = {
        'vertices': 4,
        'edges': [
            (0, 1, 9),
            (0, 2, 5),
            (0, 3, 4),
            (1, 3, 14),
            (2, 3, 3)
        ]
    }

    # Applying Kruskal's algorithm to find MST
    minimum_spanning_tree = kruskal(graph)

    print("Minimum Spanning Tree (MST):\n")
    for u, v, weight in minimum_spanning_tree:
        print(f"\tEdge: {u} - {v}, Weight: {weight}")

Explanation#

The DisjointSet class provides methods for efficient union-find operations, crucial for implementing Kruskal’s algorithm for finding a graph’s MST.

DisjointSet class (Lines 1–22):

  • We initialize a self.parent array where each vertex initially points to itself and a self.rank array to keep track of the depth of trees when performing union by rank.

  • We define two methods:

    • The find(self, u) method finds the root representative of the set containing vertex u.

    • The union(self, u, v) merges two sets containing vertices u and v. The union method uses the rank heuristic to keep the tree flat and optimize the union operation.

Kruskal’s algorithm (Lines 24–36):

  • We define the kruskal(graph) method, which takes a graph representation and returns the MST as a list of edges.

  • Next, we sort the edges by their weights to process lighter edges first (edge[2] represents the edge’s weight).

  • We use DisjointSet (ds) to keep track of connected components.

  • Finally, we iterate through sorted edges, adding each edge to mst if it connects two sets (checked using find and union operations of DisjointSet).
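
To see how the union-find check prevents cycles, here’s a tiny illustrative usage of the DisjointSet class above: once the edges 0-1 and 1-2 are taken, both endpoints of the edge 0-2 report the same root, so adding it would close a cycle.

ds = DisjointSet(3)              # three vertices: 0, 1, 2
ds.union(0, 1)                   # take edge 0-1
ds.union(1, 2)                   # take edge 1-2
print(ds.find(0) == ds.find(2))  # True, so edge 0-2 would create a cycle and is skipped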

Prim’s algorithm#

Prim’s algorithm is also used to find a graph’s MST. It grows the MST one edge at a time, starting from an arbitrary vertex and adding the cheapest edge from the tree to a vertex not yet in the tree.

How does Prim’s algorithm work?

  1. Initialization: Start from an arbitrary vertex and add it to the MST.

  2. Edge selection: Add the smallest edge that connects a vertex in the MST to a vertex outside the MST.

  3. Priority queue: Use a priority queue to efficiently select the smallest edge.

  4. Completion: Repeat until all vertices are included in the MST.

Applications of Prim’s algorithm:

  • Network design: Designing minimum-cost networks.

  • Approximation algorithms: Useful in approximation algorithms for certain optimization problems.

  • Computer graphics: Used in constructing MST in graphics algorithms.

Pseudocode for Prim’s algorithm#

Here’s the pseudocode for Prim’s algorithm in Python.

import heapq

def prim(graph, start):
    """ Prim's Algorithm to find the Minimum Spanning Tree (MST) of a graph """
    mst = []  # List to store the edges of the MST
    visited = set()  # Set to track visited vertices
    min_heap = [(0, start, None)]  # Min-heap to prioritize edges by weight

    while min_heap:
        weight, u, prev = heapq.heappop(min_heap)  # Get the edge with the minimum weight
        if u in visited:
            continue  # Skip if the vertex has already been visited

        visited.add(u)  # Mark the vertex as visited
        if prev is not None:
            mst.append((prev, u, weight))  # Add the edge to the MST
        # Push all adjacent edges to the min-heap
        for neighbor, edge_weight in graph[u].items():
            if neighbor not in visited:
                heapq.heappush(min_heap, (edge_weight, neighbor, u))

    return mst

# Example usage and demonstration
if __name__ == "__main__":
    # Example graph representation (using adjacency list)
    graph = {
        0: {1: 9, 2: 5, 3: 4},
        1: {0: 9, 3: 14},
        2: {0: 5, 3: 3},
        3: {0: 4, 1: 14, 2: 3}
    }

    # Applying Prim's algorithm to find MST starting from vertex 0
    minimum_spanning_tree = prim(graph, 0)

    print("Minimum Spanning Tree (MST):\n")
    for u, v, weight in minimum_spanning_tree:
        print(f"\tEdge: {u} - {v}, Weight: {weight}")

Explanation#

  • Lines 5–7: We initialize a list, mst, to store the edges that are part of the MST; a set, visited, to keep track of vertices already included in the MST; and a heap, min_heap, to manage candidate edges, initialized with a tuple containing the weight, the starting vertex, and None (indicating no previous vertex).

  • Lines 9–20: We iterate while the min-heap is not empty, each time popping the edge with the smallest weight.

    • Lines 11 and 12: If the popped vertex u has already been visited, we skip to the next iteration.

    • Lines 14 and 15: If u is not visited, we mark u as visited, and if prev is not None, we add the edge (prev, u, weight) to the MST.

    • Lines 18–20: We iterate through all neighbors of the current vertex u, and if we find a neighbor that has not been visited, we push the edge (edge_weight, neighbor, u) onto the min-heap.

  • Line 22: Once all the iterations are done, we return the list of edges that form the MST.
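
As a quick sanity check tying the two MST algorithms together: on the same undirected graph, Prim’s and Kruskal’s algorithms may pick edges in a different order, but the total MST weight comes out the same. Here’s a minimal sketch, assuming the prim and kruskal functions defined above are available in the same script:

# Same undirected graph in the two representations used above
adj_list = {
    0: {1: 9, 2: 5, 3: 4},
    1: {0: 9, 3: 14},
    2: {0: 5, 3: 3},
    3: {0: 4, 1: 14, 2: 3}
}
edge_list = {
    'vertices': 4,
    'edges': [(0, 1, 9), (0, 2, 5), (0, 3, 4), (1, 3, 14), (2, 3, 3)]
}

prim_weight = sum(w for _, _, w in prim(adj_list, 0))
kruskal_weight = sum(w for _, _, w in kruskal(edge_list))
print(prim_weight, kruskal_weight)  # both 16 (edges 2-3, 0-3, and 0-1)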

Most popular graph-based interview questions#

In this section, we’ve summarized the algorithms that cover the most frequently asked interview questions on graphs.

  • DFS and BFS, alone or together, cover most traversal and connectivity questions.

  • Dijkstra’s algorithm and A*, sometimes alongside DFS or BFS, cover shortest-path questions on weighted graphs.

  • Kruskal’s and Prim’s algorithms cover minimum spanning tree questions.

Test your understanding of graphs#

How about taking an AI-based mock interview to further strengthen and practice your knowledge of graphs?

Go ahead and check out our mock interviews. On the mock interviews page, you’ll find a bundle of interviews; search for “Graphs,” select your experience level, and polish your skills on these algorithms.

Conclusion and further readings#

Mastering graph algorithms is essential for success in coding interviews. Understanding the fundamental algorithms, practicing common problems, and applying problem-solving strategies will enhance your ability to tackle graph-related interview questions confidently.

After reading this blog, the next step in your interview preparation journey is to learn more coding algorithms. As you familiarize yourself with each algorithm by practicing interview problems, you’ll notice a sharp increase in your problem-solving and analytical skills.

Consider using Educative’s coding interview patterns course series to guide you through this process. The series covers 26 essential patterns and algorithms with clear explanations, example code, and practical exercises. By following these steps, you’ll build a strong foundation and be well prepared for any technical interview.

Best of luck with your next technical interview!

Frequently Asked Questions

Why do tech interviews focus on graph algorithms?

Graph algorithms are commonly used in problem-solving for interviews at major tech companies. They demonstrate a candidate’s ability to handle complex data structures, solve connectivity problems, and work with optimization techniques.

What are the most important graph algorithms to study?

The algorithms covered in this blog appear most often in interviews: DFS, BFS, Dijkstra’s algorithm, A*, Kruskal’s algorithm, and Prim’s algorithm.

How do DFS and BFS differ?

DFS explores as far as possible along each branch before backtracking and is typically implemented with a stack or recursion, while BFS explores all neighbors at the current depth before moving to the next level and uses a queue.

What is the best way to approach graph-related problems in interviews?

Start by identifying the type of graph (directed or undirected, weighted or unweighted), then match the question to the right tool: traversal and connectivity problems to DFS or BFS, shortest-path problems to BFS, Dijkstra’s, or A*, and minimum-cost connection problems to Kruskal’s or Prim’s algorithm.

How can I practice graph algorithms for coding interviews?

Work through common graph interview problems, take mock interviews, and follow a structured resource such as Educative’s coding interview patterns course series.


Written By:
Dian Us Suqlain