Home/Blog/Programming/Equivalence classes and the real world

Equivalence classes and the real world

21 min read

Dec 26, 2023

content

Motivation

Relation on a set

Reflexive relation

Symmetric relation

Transitive relation

Equivalence relation

Pop quiz

Equivalence classes

Use cases in computer science

Become a Software Engineer in Months, Not Years

From your first line of code, to your first day on the job — Educative has you covered. Join 2M+ developers learning in-demand programming skills.

Motivation#

For many of us, our first introduction to the union-find data structure (also known as the disjoint-set data structure) comes from seeing a well-known implementation of Kruskal’s algorithm. In many situations, the union-find data structure is used for storing objects that are connected. These objects can be pixels of an image, elements of a grid, or nodes of a network. However, there are many other situations that warrant the use of the union-find data structure. To be able to identify the union-find data structure as an appropriate data structure for a particular problem, we need to keep in mind that the primary use case of the union-find data structure is representing what are known as equivalence classes.

Objects in a set can be related to each other in many different ways. Some of these relationships are so pervasive and useful that they are given special names. In this blog, we talk about one such relation called an equivalence relation. We also discuss how an equivalence relation on a set causes some of its elements to be considered equivalent in some sense. These subsets of equivalent elements are called equivalence classes. We conclude by looking at some of the ways in which these abstract ideas are useful in the real world.

Relation on a set#

Let’s consider a set $S$ of objects. We can express a relationship between some elements of $S$ as another set $R$ . The set $R$ contains ordered pairs of the form $(x,y)$ , representing that the element $x$ of $S$ is related to the element $y$ of $S$ in some sense.

Example 1: Suppose $S_1$ is the set of all individuals. We define a relation $R_1$ on $S_1$ as:
$R_1 = \{(a, b) \mid a \text{ was born in the same month as }b, \text{ where } a, b \in S_1. \}$
It is the set of all ordered pairs $(a, b)$ , where $a$ and $b$ are any individuals that are born in the same month. For instance, both Charles Babbage and Ada Lovelace were born in December, so we have:
$\begin{align*} (\text{Charles Babbage}, \text{Ada Lovelace}) &\in R_1 \\ (\text{Ada Lovelace}, \text{Charles Babbage}) &\in R_1\end{align*}$

But beware! In general, if $(a, b)$ is in a relation, we may not necessarily have $(b, a)$ in that relation.
Example 2: Suppose $S_2$ is a set of all people. We define a relation $R_2$ on $S_2$ as:
$R_2 = \{(a, b) \mid a \text{ is a sister of }\text{ }b \text{, where } a, b \in S_2 \}$
$R_2$ contains all ordered pairs $(a, b)$ , where $a$ and $b$ are individuals such that $a$ is a sister of $b$ .

Example 3: Let $S_3$ be the set of all people living in the same US state. We define a relation $R_3$ on $S_3$ as:
$R_3 = \{(a, b) \mid a \text{ is a friend of }b, \text{ where } a, b \in S_3. \}$
$R_3$ contains all ordered pairs $(a, b)$ , where $a$ and $b$ live in the same US state, and are friends with each other.

All of the above relations are defined on sets containing individuals, but they may very well be defined on other kinds of collections, such as a set of numbers, a collection of virtual machines, a set of variables, and so on. We’ll show four more examples of relations defined on sets of numbers.

Example 4: Let $\mathbb{R}$ be the set of real numbers. We can define a relation $R_4$ that contains all pairs containing a real number and its additive inverse:
$R_4 = \{(a, b)\mid a + b = 0, \text{where }a, b \in \mathbb{R} \}$
This is the set of all ordered pairs $(a, b)$ , where $a$ and $b$ are reals, and $a$ is the additive inverse of $b$ .

To understand equivalence relations, we first need to understand three other types of relations:

Reflexive
Symmetric
Transitive

Reflexive relation#

A relation $R$ on a set $S$ is called reflexive if for each element $x \in S$ , the pair $(x, x) \in R$ .

Example: The relation $R_6 = \{(a, b) \mid a \bmod b = 0, \text{where }a, b \in \mathbb{Z^+}\}$ is a reflexive relation on the set of positive integers because each positive integer divides itself. So, $R_6$ contains $(1,1)$ , $(2,2)$ , and so on.

Nonexample: The following relation is not a reflexive relation on the set of reals because it does not contain all pairs of the kind $(a, a)$ , where $a$ is a real:

R_5 = \{(a, b) \mid a > b, \text{where }a, b \in \mathbb{R} \}

For instance, it does not contain the pair $(0,0)$ or $(1,1)$ .

Notice that we gave a general argument to show that a relation is reflexive. But to show that a relation is not reflexive, a single counterexample is enough.

Symmetric relation#

A relation $R$ is called symmetric if for each pair $(x, y) \in R$ , it’s also the case that $(y,x)\in R$ .

Example: The following is a symmetric relation on the set $S_1$ of all individuals:

R_1 = \{(a, b) \mid a \text{ was born in the same month as }b, \text{ where } a, b \in S_1. \}

To see this for any two individuals $a$ and $b$ , if $a$ and $b$ are born in the same month, then it’s perfectly legitimate to say that $b$ and $a$ are born in the same month. In other words, if $(a, b)\in R_1$ , then $(b, a) \in R_1$

Nonexample: The following relation is not a symmetric relation on the set of positive integers.

R_6 = \{(a, b) \mid a \bmod b = 0, \text{where }a, b \in \mathbb{Z^+} \}

To see this, observe that $(10, 2)\in R_6$ as $10 \bmod 2=0$ , but $(2, 10)\notin R_6$ as $2 \bmod 10 = 2$ .

Transitive relation#

A relation $R$ is called transitive if for each $(x, y) \in R$ and $(y, z) \in R$ , it’s also the case that $(x, z)\in R$ .

Example: The following is a transitive relation on the set of all positive integers:
$R_6 = \{(a, b)\mid a \bmod b = 0, \text{where }a \text{ and } b \in \mathbb{Z^+}\}$
Suppose $(a, b)$ and $(b, c) \in R_6$ . Since $(a, b) \in R_6$ , it follows that $a \bmod b = 0$ , and so $b$ is a factor of $a$ . Similarly, $(b, c) \in R_6$ implies that $b \bmod c = 0$ , which implies that $c$ is a factor of $b$ . As $c$ is a factor of $b$ , and $b$ is a factor of $a$ , it follows that $c$ is a factor of $a$ . In other words, $a \bmod c = 0$ , implying $(a, c)\in R_6$ .
Nonexample: The following relation, $R_4$ , is not a transitive relation on the set of all real numbers.
$R_4 = \{(a, b)\mid a + b = 0, \text{where }a, b \in \mathbb{R} \}$
The pairs $(9, -9)$ and $(-9, 9)$ belong to $R_4$ . If $R_4$ was transitive, then $(9, 9)$ and $(-9,-9)$ would be present in $R_4$ , but they aren’t. So, $R_4$ is not transitive.

Equivalence relation#

An equivalence relation on a set is a relation that is reflexive, symmetric, and transitive. Of all the relations shown at the top of this page, only two are equivalence relations.

Example: The following is an equivalence relation on the set $S_1$ of all individuals.

$R_1 = \{(a, b) \mid a \text{ was born in the same month as }b, \text{ where } a, b \in S_1. \}$
- It’s reflexive because each person is born in the same month as themselves.
- It’s symmetric because if an individual $a$ is born in the same month as an individual $b$ , then $b$ is born in the same month as $a$ .
- It’s transitive because if an individual $a$ is born in the same month as an individual $b$ , and $b$ is born in the same month as an individual $c$ , then clearly, $a$ and $c$ are born in the same month.

Pop quiz#

Test your understanding of equivalence relations:

Equivalence classes#

Two elements $x$ and $y$ in a set $S$ are said to be equivalent under an equivalence relation $R$ if $(x, y)\in R$ . For example, under the following equivalence relation on the set of all individuals, all individuals born in December are considered equivalent.

R_1 = \{(a, b) \mid a, b \in S_1, \text{ and }a \text{ was born in the same month as }b. \}

A subset of a set $S$ that contains all elements that are equivalent under an equivalence relation is called an equivalence class.

Under the relation $R_1$ defined above, all individuals born in the same month fall into one equivalence class. So, there are 12 equivalence classes corresponding to the 12 months of the year, and each individual falls into exactly one of those 12 classes.

Notice how these equivalence classes are actually disjoint subsets since no person falls into two different equivalence classes. Notice also that since each person belongs to some equivalence class, if we take a union of these disjoint equivalence classes, we get the entire set of individuals.

This observation is of utmost importance because equivalence classes are primarily useful for thinking of a set as being partitioned into these disjoint subsets, with each subset consisting of objects that are deemed equivalent. We express this observation as the following fact:

Notation: In mathematics, equivalence classes are typically represented using a notation $[x]$ , where $x$ is an arbitrary element in the equivalence class.

Example: Consider the following equivalence relation:
$R_7 = \{(a, b)\mid a \bmod 7 = b \bmod 7, \text{where }a, b \in \mathbb{N} \}$
Since the remainder of any natural number by $7$ is an integer in the range $0$ to $6$ , it follows that there are seven equivalence classes:
$[0], [1], [2], [3], [4], [5], [6]$
Note that all seven equivalence classes here are infinite sets. It would not be incorrect to write $[0]$ as $[7]$ or as $[14]$ because all multiples of $7$ belong to $[0]$ , and any number in this equivalence class can equally well serve as its representative.

Use cases in computer science#

The idea of equivalence classes gives us a useful abstraction for looking at the world around us. We’ll briefly touch upon a few of the ways in which the notion of equivalence classes has been put to use in different applications of computer science.

An important use case is the applications that use the union-find data structure to either help identify equivalence classes or enable working with equivalence classes. In fact, this is the reason the data structure was invented in the first place: to identify variables that are equivalent in the sense that they represent the same memory locations.

The most significant ways in which the union-find data structure has been effectively used in real-world applications centers around definitions of equivalence relations that can be abstracted as “an entity $x$ is connected to entity $y$ in a specific way.”

For instance, the connected components of a graph form equivalence classes under the following equivalence relation defined on the set of vertices of a graph $G$ as:

$\{(u,v)\mid u \text{ and } v \text{ are in the same component of }G \}$
- Reflexive: Each vertex of a graph is in the same component as itself.
- Symmetric: If $u$ and $v$ are vertices in the same component, $v$ and $u$ are vertices in the same component.
- Transitive: If $u$ is in the same component as $v$ , and $v$ is in the same component as $w$ , then $u$ is in the same component as $w$ .
Well-known implementations of Kruskal’s algorithm use the union-find data structure to keep track of connected subgraphs that coalesce into a minimum spanning tree.

There are similar applications in image processing. The Matlab function bwlabel takes a 2-dimensional boolean matrix as an input and returns a matrix containing different labels for each connected structure in the input matrix. The precise definition of “connected” can vary depending on a function parameter. For example, the illustration below shows this for an image that has three 4-connected regions, where two pixels that are adjacent along the horizontal or vertical axis are said to be part of the same 4-connected region.

Each colored region, labeled 1, 2, and 3 in the illustration on the right below, is an equivalence class showing pixels that are 4-connected to each other. The function bwlabel uses the union-find data structure to identify these. The function can also be used for identifying equivalence classes under other equivalence relations, such as the equivalence relation of belonging to the same 8-connectedPixels that are adjacent diagonally, vertically or horizontally. region.

Many other similar applications require either identifying equivalence classes as their final goal or as a means to an end.

One instance where the concept of equivalence classes is useful is in black-box testing. To test a software the test cases are designed so that they correspond to equivalence classes. Each equivalence class contains elements that are equivalent in the sense that they are expected to elicit a similar response from the system. The idea is to test software more effectively by testing representative data elements from equivalence classes instead of having to test the entire input space. Having said this, figuring out which data falls in the same equivalence class requires skills. This use case is a little more unusual than the other abovementioned use cases because the focus is on identifying equivalence classes based on the requirement specifications and the engineer’s judgment call of what’s considered equivalent rather than beginning with a mathematically crisp definition of what constitutes an equivalence relation.
There are similar instances in experimental works where looking at the search space or a dataset through the lens of equivalence classes has provided some kind of theoretical justification.

One of the takeaways here is what was left unsaid: it’s important to have the basics covered and be well-prepared because one never knows where one simple abstract idea can be applied in so many diverse ways.

That’s all for this blog. We hope this was useful for you!

Written By:

Mehvish Poshni

Free AI Mock Interviews

Coding Interview

Coding PatternsFree Interview

Gain insights and practical experience with coding patterns through targeted MCQs and coding problems, designed to match and challenge your expertise level.

System Design

YouTubeFree Interview

Learn to design a video streaming platform like YouTube by tackling functional and non-functional requirements, core components, and high-level to detailed design challenges.

Free Resources