...

/

Implementation of Disjoint-Set Data Structure

Implementation of Disjoint-Set Data Structure

Learn about the disjoint-set data structure.

The disjoint-set data structure (also known as the union-find data structure) is useful for implementing Kruskal’s algorithm, but it is also useful, in general, for checking equivalences. Two elements that are equivalent, under any notion of equivalence, can be thought of as belonging to the same equivalence class and equivalence classes are essentially disjoint sets.

Operations

We go over one of the ways to implement the following methods supported by the disjoint-set data structure.

  • DisjointSet(x): Create a set containing the element x.

  • Find(x): Find the name of the set containing the element x.

  • Union(x, y): Merge the sets named x and y.

Implementation as rooted trees

A single set is implemented as a rooted tree, where the tree is named after the name of the element present at the root.

  • Each node has a parent pointer that points to the parent of that node in the rooted tree.

  • The parent pointer of the root points to the root itself.

Press + to interact
Two disjoint sets of size 1 and 3
Two disjoint sets of size 1 and 3

The Find method

The Find operation, when called on a node, recursively traces the parent pointers all the way up to the root to find the name of that set.

The Find operation can also be implemented using a heuristic called the path-compression heuristic. In such an implementation, as the recursion unwinds after executing the base case, the parent pointers along that recursive path are updated to point to the root. This ensures that future calls on any nodes along that path take O(1)O(1) ...