...

Implementation of DFS

Learn how to implement depth first search.

We'll cover the following...

Implementation notes

Avoiding stack overflows

Implementing cycle detection with DFS

Runtime analysis
Trying out the cycle detection
Detecting cycles in undirected graphs

In this lesson, we’ll solve the problem of detecting cycles in directed graphs by implementing DFS.

Implementation notes

The easiest way to implement DFS is as a recursive procedure:

Mark the current node as “in progress.”
Recursively process all nodes behind discovery edges.
Mark the current node as done.

Due to the recursion occuring between starting and finishing a node, we automatically explore deeper nodes first before going up to shallower nodes again.

So, in principle, our depth-first search procedure should be a recursive function

void dfs(int u) {}

that performs DFS from the starting node u.

As we mostly need to perform operations on all neighboring edges of a node, we’ll make use of the adjacency list graph representation. We are considering unweighted graphs here, but the implementation for weighted graphs is very similar. We’ll simply ignore the values of the weights.

But now our DFS procedure additionally needs to access some “global” state such as:

the adjacency list of the graph.
the current state or color of each node.

One option would be to pass this state to the dfs routine additionally, like in

void dfs(int u, vector<vector<int>> const &adjacencyList, vector<NodeState> &state) {}

However, this is unwieldy, especially when we want to keep track of more state during the search, such as node predecessors or distances.

Instead, we’ll implement our cycle detection algorithm as a class where this state is stored in the member variables.

Avoiding stack overflows

There is one downside of implementing DFS recursively. As each recursive function call creates a stack frame, it is possible to run into stack overflows on very large graphs.

An easy workaround for this problem is to simply increase the maximum stack size, for example, using the ulimit -s command on Linux systems.

But some operating systems do not support raising the stack size past a hard limit. In that case, the only remedy is to implement DFS iteratively by keeping track of a vector of “in progress” nodes. This vector can be allocated on the heap and thus we can avoid stack overflow issues.

Implementing cycle detection with DFS

Let’s start by setting up a basic class to run the cycle detection.

Press + to interact

#include <vector>
#include <iostream>
using namespace std;
// shorthand for adjacency list type
typedef vector<vector<int>> AdjacencyList;
// define an enum for node states during DFS execution
enum class NodeState {UNVISITED, IN_PROGRESS, FINISHED};
class CycleDetector {
private:
    const AdjacencyList adjacencyList;
    vector<NodeState> nodeStates;
    bool cycleFound = false;
public: 
    CycleDetector(AdjacencyList _adjacencyList) 
        : adjacencyList {_adjacencyList} 
        // initialize all nodes as unvisited
        , nodeStates {vector<NodeState>(adjacencyList.size(), NodeState::UNVISITED)}
    {}
};

To keep track of the node states during DFS, we have defined the enumeration NodeState.

enum class NodeState {UNVISITED, IN_PROGRESS, FINISHED};

The states correspond to the colors white, gray, and blue from the previous lesson.

In our class, we use a vector to keep track of the current state of each node. Initially, all nodes are in the UNVISITED state. We also maintain a boolean variable cycleFound, initially false, which keeps track of whether we have discovered a cycle yet.

Next, let’s write the DFS routine.

Press + to interact

void dfs(int u) {
    // mark current node as in progress
    this->nodeStates[u] = NodeState::IN_PROGRESS;
    for (int v : this->adjacencyList[u]) {
        switch (this->nodeStates[v]) {
            // discovery edge: recursively call dfs
            case NodeState::UNVISITED: dfs(v); break;
            // back edge: mark cycle as found
            case NodeState::IN_PROGRESS: this->cycleFound = true; break;
            // redundant edge: skip
            case NodeState::FINISHED: break;
        }
    }
    // mark current node as done 
    this->nodeStates[u] = NodeState::FINISHED;
}

Introduction

Graph Representations

Graph Traversal

Shortest Paths

Spanning Trees

Flow Problems

Conclusion

Appendix

Implementation of DFS

Implementation notes

Avoiding stack overflows

Implementing cycle detection with DFS