...

Tarjan’s Algorithm

Learn about Tarjan's algorithm for finding strongly connected components.

We'll cover the following...

Building intuition
Using a stack for identification of an SCC
- Reflections on non-tree edges and reachability
Implementation details
Explanation
The last mile

Press + to interact

Building intuition

The first vertex of an SCC to be visited by depth-first search becomes an ancestor of all other vertices in that SCC (and potentially vertices of other SCCs visited subsequently). We call this vertex the root of the SCC or scc-root, since it is the root of the depth-first subtree rooted at $v$ .

An scc-root is the first vertex of an SCC to be visited and the last vertex of that SCC to be marked explored.

The idea behind Tarjan’s algorithm is to follow the natural flow of depth-first search and identify the strongly connected components as soon as the last vertex of that strongly connected component is marked explored.

For example, if the boxes in the following illustration represent the strongly connected components $C_i$ in a tree-like hierarchy (corresponding to a depth-first tree), and the dotted edges represent non-tree edges going between different SCCs, then the algorithm identifies $C_1$ first, then $C_2$ , then $C_3$ , and so on, as indicated by increasing subscripts—in the order specified by the $ended$ timestamps of the scc-roots.

Press + to interact

Using a stack for identification of an SCC

To implement Tarjan’s algorithm, each vertex is pushed onto the stack as soon as it’s visited by depth-first search. When a vertex is marked explored, it is not removed from the stack unless it is an scc-root.

As soon as an scc-root is marked explored, we pop all of its descendants present on the stack in addition to the scc-root itself and create a new strongly connected component containing these vertices.

So, at any point in time, the only vertices on the stack are the ones that have been visited but whose SCCs have not been determined.

Don’t worry about how to identify an scc-root at this point. For now, the interest is only in seeing how the SCCs are identified whenever an scc-root is identified.

Press + to interact

The magic in Tarjan’s algorithm is in figuring out which vertices are the scc-roots. Once an scc-root is identified, the scc-root and the descendants present above it on the stack can be popped to create a new SCC.

To see how to identify scc-roots, it’s useful to think about how a vertex in the directed graph can be reached from another vertex.

Reflections on non-tree edges and reachability

Suppose depth-first search is run on a directed graph, and every edge is classified into one of four types. The tree-edges offer us limited insight into how some of the vertices can be reached from others. So, let’s think about how the non-tree edges can be useful for reaching one vertex from another:

Forward-edge

A forward-edge $(u,v)$ just gives an alternate path from $u$ to $v$ , since by definition of a forward-edge, another path exists from $u$ to $v$ consisting entirely of tree-edges. So, it adds nothing more to our knowledge that $v$ is reachable from $u$ .

Since forward-edges offer us no new insights into strongly connected components, they are unimportant for our present purpose.

Back-edge

A back-edge $(u,v)$ creates a cycle, and all vertices in a cycle must belong to the same SCC.

A back-edge can only lead to a vertex in the same SCC.

Cross-edge

The end-vertices of a cross-edge may belong to the same SCC, but they may very well belong to two different ones.

Claim: For a cross-edge $(u,v)$ , if $u$ and $v$ belong to different SCCs, the SCC containing $v$ ...

Introduction

Review: Asymptotic Notation and Math Prerequisites

Working with Graphs

Depth-First Search and Applications

Prelude: Shortest Paths

Single-source Shortest Paths in Weighted Digraphs

All Pairs Shortest Paths

Minimum Spanning Tree

Flows

Matchings and Vertex Covers

Conclusion

Use Simulated Annealing for the Traveling Salesman Problem

Tarjan’s Algorithm

Building intuition

Using a stack for identification of an SCC

Reflections on non-tree edges and reachability

Forward-edge

Back-edge

Cross-edge