Home/Blog/Programming/Equivalence classes and the real world
Home/Blog/Programming/Equivalence classes and the real world

Equivalence classes and the real world

21 min read
Dec 26, 2023
content
Motivation
Relation on a set
Reflexive relation
Symmetric relation
Transitive relation
Equivalence relation
Pop quiz
Equivalence classes
Use cases in computer science

Become a Software Engineer in Months, Not Years

From your first line of code, to your first day on the job — Educative has you covered. Join 2M+ developers learning in-demand programming skills.

Motivation#

For many of us, our first introduction to the union-find data structure (also known as the disjoint-set data structure) comes from seeing a well-known implementation of Kruskal’s algorithm. In many situations, the union-find data structure is used for storing objects that are connected. These objects can be pixels of an image, elements of a grid, or nodes of a network. However, there are many other situations that warrant the use of the union-find data structure. To be able to identify the union-find data structure as an appropriate data structure for a particular problem, we need to keep in mind that the primary use case of the union-find data structure is representing what are known as equivalence classes.

Objects in a set can be related to each other in many different ways. Some of these relationships are so pervasive and useful that they are given special names. In this blog, we talk about one such relation called an equivalence relation. We also discuss how an equivalence relation on a set causes some of its elements to be considered equivalent in some sense. These subsets of equivalent elements are called equivalence classes. We conclude by looking at some of the ways in which these abstract ideas are useful in the real world.

Relation on a set#

Let’s consider a set SS of objects. We can express a relationship between some elements of SS as another set RR. The set RR contains ordered pairs of the form (x,y)(x,y), representing that the element xx of SS is related to the element yy of SS in some sense.

  • Example 1: Suppose S1S_1 is the set of all individuals. We define a relation R1R_1 on S1S_1 as:

    R1={(a,b)a was born in the same month as b, where a,bS1.}R_1 = \{(a, b) \mid a \text{ was born in the same month as }b, \text{ where } a, b \in S_1. \}

    It is the set of all ordered pairs (a,b)(a, b), where aa and bb are any individuals that are born in the same month. For instance, both Charles Babbage and Ada Lovelace were born in December, so we have:

    (Charles Babbage,Ada Lovelace)R1(Ada Lovelace,Charles Babbage)R1\begin{align*} (\text{Charles Babbage}, \text{Ada Lovelace}) &\in R_1 \\ (\text{Ada Lovelace}, \text{Charles Babbage}) &\in R_1\end{align*}

    But beware! In general, if (a,b)(a, b) is in a relation, we may not necessarily have (b,a)(b, a) in that relation.

  • Example 2: Suppose S2S_2 is a set of all people. We define a relation R2R_2 on S2S_2 as:

    R2={(a,b)a is a sister of  b, where a,bS2}R_2 = \{(a, b) \mid a \text{ is a sister of }\text{ }b \text{, where } a, b \in S_2 \}

    R2R_2 contains all ordered pairs (a,b)(a, b), where aa and bb are individuals such that aa is a sister of bb.

  • Example 3: Let S3S_3 be the set of all people living in the same US state. We define a relation R3R_3 on S3S_3 as:

    R3={(a,b)a is a friend of b, where a,bS3.}R_3 = \{(a, b) \mid a \text{ is a friend of }b, \text{ where } a, b \in S_3. \}

    R3R_3 contains all ordered pairs (a,b)(a, b), where aa and bb live in the same US state, and are friends with each other.

All of the above relations are defined on sets containing individuals, but they may very well be defined on other kinds of collections, such as a set of numbers, a collection of virtual machines, a set of variables, and so on. We’ll show four more examples of relations defined on sets of numbers.

  • Example 4: Let R\mathbb{R} be the set of real numbers. We can define a relation R4R_4 that contains all pairs containing a real number and its additive inverse:

    R4={(a,b)a+b=0,where a,bR}R_4 = \{(a, b)\mid a + b = 0, \text{where }a, b \in \mathbb{R} \}

    This is the set of all ordered pairs (a,b)(a, b), where aa and bb are reals, and aa is the additive inverse of bb.

  • Example 5: We can define an “is greater than” relation as the following relation on the set of reals:

    R5={(a,b)a>b,where a,bR}R_5 = \{(a, b)\mid a > b, \text{where }a, b \in \mathbb{R} \}

    This is the set of all ordered pairs (a,b)(a, b), where aa and bb are reals, and aa is greater than bb.


  • Example 6: Suppose Z+\mathbb{Z^+} is a set of positive integers. We define a relation “is divisible by” on Z+\mathbb{Z^+} as follows:

    R6={(a,b)amodb=0,where a,bZ+}R_6 = \{(a, b)\mid a \bmod b = 0, \text{where }a, b \in \mathbb{Z^+} \}

    This is the set of all ordered pairs (a,b)(a, b), where aa is divisible by bb for positive integers aa and bb.

  • Example 7: Suppose N\mathbb{N} is a set of natural numbersDefined here as the set of nonnegative integers {0, 1, 2, … }. We define a relation “is congruent modulo 7” on N\mathbb{N} as:

    R7={(a,b)amod7=bmod7,where a,bN}R_7 = \{(a, b)\mid a \bmod 7 = b \bmod 7, \text{where }a, b \in \mathbb{N} \}

    This relates all natural numbers aa and bb that have the same remainder when divided by 77.

To understand equivalence relations, we first need to understand three other types of relations:

  • Reflexive
  • Symmetric
  • Transitive

Reflexive relation#

A relation RR on a set SS is called reflexive if for each element xSx \in S, the pair (x,x)R(x, x) \in R.

Example: The relation R6={(a,b)amodb=0,where a,bZ+}R_6 = \{(a, b) \mid a \bmod b = 0, \text{where }a, b \in \mathbb{Z^+}\} is a reflexive relation on the set of positive integers because each positive integer divides itself. So, R6R_6 contains (1,1)(1,1), (2,2)(2,2), and so on.

Nonexample: The following relation is not a reflexive relation on the set of reals because it does not contain all pairs of the kind (a,a)(a, a), where aa is a real:

R5={(a,b)a>b,where a,bR}R_5 = \{(a, b) \mid a > b, \text{where }a, b \in \mathbb{R} \}

For instance, it does not contain the pair (0,0)(0,0) or (1,1)(1,1).

All pairs of the form (x, x) are present in a reflexive relation
All pairs of the form (x, x) are present in a reflexive relation

Notice that we gave a general argument to show that a relation is reflexive. But to show that a relation is not reflexive, a single counterexample is enough.

Symmetric relation#

A relation RR is called symmetric if for each pair (x,y)R(x, y) \in R, it’s also the case that (y,x)R(y,x)\in R.

Example: The following is a symmetric relation on the set S1S_1 of all individuals:

R1={(a,b)a was born in the same month as b, where a,bS1.}R_1 = \{(a, b) \mid a \text{ was born in the same month as }b, \text{ where } a, b \in S_1. \}

To see this for any two individuals aa and bb, if aa and bb are born in the same month, then it’s perfectly legitimate to say that bb and aa are born in the same month. In other words, if (a,b)R1(a, b)\in R_1, then (b,a)R1(b, a) \in R_1

Nonexample: The following relation is not a symmetric relation on the set of positive integers.

R6={(a,b)amodb=0,where a,bZ+}R_6 = \{(a, b) \mid a \bmod b = 0, \text{where }a, b \in \mathbb{Z^+} \}

To see this, observe that (10,2)R6(10, 2)\in R_6 as 10mod2=010 \bmod 2=0, but (2,10)R6(2, 10)\notin R_6 as 2mod10=22 \bmod 10 = 2.

For each pair (x, y) in a symmetric relation, the pair (y, x) must also be in it
For each pair (x, y) in a symmetric relation, the pair (y, x) must also be in it

Transitive relation#

A relation RR is called transitive if for each (x,y)R(x, y) \in R and (y,z)R(y, z) \in R, it’s also the case that (x,z)R(x, z)\in R.

  • Example: The following is a transitive relation on the set of all positive integers:

    R6={(a,b)amodb=0,where a and bZ+}R_6 = \{(a, b)\mid a \bmod b = 0, \text{where }a \text{ and } b \in \mathbb{Z^+}\}

    Suppose (a,b)(a, b) and (b,c)R6(b, c) \in R_6. Since (a,b)R6(a, b) \in R_6, it follows that amodb=0a \bmod b = 0, and so bb is a factor of aa. Similarly, (b,c)R6(b, c) \in R_6 implies that bmodc=0b \bmod c = 0, which implies that cc is a factor of bb. As cc is a factor of bb, and bb is a factor of aa, it follows that cc is a factor of aa. In other words, amodc=0a \bmod c = 0, implying (a,c)R6(a, c)\in R_6.

  • Nonexample: The following relation, R4R_4, is not a transitive relation on the set of all real numbers.

    R4={(a,b)a+b=0,where a,bR}R_4 = \{(a, b)\mid a + b = 0, \text{where }a, b \in \mathbb{R} \}

    The pairs (9,9)(9, -9) and (9,9)(-9, 9) belong to R4R_4. If R4R_4 was transitive, then (9,9)(9, 9) and (9,9)(-9,-9) would be present in R4R_4, but they aren’t. So, R4R_4 is not transitive.

For any two pairs of the form (x, y) and (y, z) in a relation, the pair (x, z) must also be in it in order to be called a transitive relation
For any two pairs of the form (x, y) and (y, z) in a relation, the pair (x, z) must also be in it in order to be called a transitive relation

Equivalence relation#

An equivalence relation on a set is a relation that is reflexive, symmetric, and transitive. Of all the relations shown at the top of this page, only two are equivalence relations.

  • Example: The following is an equivalence relation on the set S1S_1 of all individuals.

    R1={(a,b)a was born in the same month as b, where a,bS1.}R_1 = \{(a, b) \mid a \text{ was born in the same month as }b, \text{ where } a, b \in S_1. \}

    • It’s reflexive because each person is born in the same month as themselves.
    • It’s symmetric because if an individual aa is born in the same month as an individual bb, then bb is born in the same month as aa.
    • It’s transitive because if an individual aa is born in the same month as an individual bb, and bb is born in the same month as an individual cc, then clearly, aa and cc are born in the same month.

Pop quiz#

Test your understanding of equivalence relations:

1.

Is the following an equivalence relation on the set of all natural numbers?

R7={(a,b)amod7=bmod7,where a,bN}R_7 = \{(a, b)\mid a \bmod 7 = b \bmod 7, \text{where }a, b \in \mathbb{N} \}

Show Answer
Q1 / Q3

Equivalence classes#

Two elements xx and yy in a set SS are said to be equivalent under an equivalence relation RR if (x,y)R(x, y)\in R. For example, under the following equivalence relation on the set of all individuals, all individuals born in December are considered equivalent.

R1={(a,b)a,bS1, and a was born in the same month as b.}R_1 = \{(a, b) \mid a, b \in S_1, \text{ and }a \text{ was born in the same month as }b. \}

A subset of a set SS that contains all elements that are equivalent under an equivalence relation is called an equivalence class.

Under the relation R1R_1 defined above, all individuals born in the same month fall into one equivalence class. So, there are 12 equivalence classes corresponding to the 12 months of the year, and each individual falls into exactly one of those 12 classes.

Notice how these equivalence classes are actually disjoint subsets since no person falls into two different equivalence classes. Notice also that since each person belongs to some equivalence class, if we take a union of these disjoint equivalence classes, we get the entire set of individuals.

This observation is of utmost importance because equivalence classes are primarily useful for thinking of a set as being partitioned into these disjoint subsets, with each subset consisting of objects that are deemed equivalent. We express this observation as the following fact:

A partitioning of a set into six disjoint equivalence classes
A partitioning of a set into six disjoint equivalence classes

Fact: Any equivalence relation RR on a set S imposes a grouping of the elements of SS into disjoint subsets of SS, called its equivalence classes, such that a union of these equivalence classes is SS.

A proof of this fact follows from the definitions.

Notation: In mathematics, equivalence classes are typically represented using a notation [x][x], where xx is an arbitrary element in the equivalence class.

  • Example: Consider the following equivalence relation:

    R7={(a,b)amod7=bmod7,where a,bN}R_7 = \{(a, b)\mid a \bmod 7 = b \bmod 7, \text{where }a, b \in \mathbb{N} \}

    Since the remainder of any natural number by 77 is an integer in the range 00 to 66, it follows that there are seven equivalence classes:

    [0],[1],[2],[3],[4],[5],[6][0], [1], [2], [3], [4], [5], [6]

    Note that all seven equivalence classes here are infinite sets. It would not be incorrect to write [0][0] as [7][7] or as [14][14] because all multiples of 77 belong to [0][0], and any number in this equivalence class can equally well serve as its representative.

Equivalence classes for the congruence modulo 7 relation
Equivalence classes for the congruence modulo 7 relation

Use cases in computer science#

The idea of equivalence classes gives us a useful abstraction for looking at the world around us. We’ll briefly touch upon a few of the ways in which the notion of equivalence classes has been put to use in different applications of computer science.

An important use case is the applications that use the union-find data structure to either help identify equivalence classes or enable working with equivalence classes. In fact, this is the reason the data structure was invented in the first place: to identify variables that are equivalent in the sense that they represent the same memory locations.

The most significant ways in which the union-find data structure has been effectively used in real-world applications centers around definitions of equivalence relations that can be abstracted as “an entity xx is connected to entity yy in a specific way.”

  • For instance, the connected components of a graph form equivalence classes under the following equivalence relation defined on the set of vertices of a graph GG as:

    {(u,v)u and v are in the same component of G}\{(u,v)\mid u \text{ and } v \text{ are in the same component of }G \}

    • Reflexive: Each vertex of a graph is in the same component as itself.
    • Symmetric: If uu and vv are vertices in the same component, vv and uu are vertices in the same component.
    • Transitive: If uu is in the same component as vv, and vv is in the same component as ww, then uu is in the same component as ww.

    Well-known implementations of Kruskal’s algorithm use the union-find data structure to keep track of connected subgraphs that coalesce into a minimum spanning tree.

Connected components of a graph
Connected components of a graph
  • There are similar applications in image processing. The Matlab function bwlabel takes a 2-dimensional boolean matrix as an input and returns a matrix containing different labels for each connected structure in the input matrix. The precise definition of “connected” can vary depending on a function parameter. For example, the illustration below shows this for an image that has three 4-connected regions, where two pixels that are adjacent along the horizontal or vertical axis are said to be part of the same 4-connected region.

    Each colored region, labeled 1, 2, and 3 in the illustration on the right below, is an equivalence class showing pixels that are 4-connected to each other. The function bwlabel uses the union-find data structure to identify these. The function can also be used for identifying equivalence classes under other equivalence relations, such as the equivalence relation of belonging to the same 8-connectedPixels that are adjacent diagonally, vertically or horizontally. region.

Connected structures in the input are assigned labels
Connected structures in the input are assigned labels
  • Games like Hex or Havannah require two players to take turns to place pebbles on the game board. The first player whose pebbles connect to form a special pattern wins the game. In such a case, a union-find data structure can be maintained for each player, where the pebbles that form connected structures fall in the same disjoint-set.

  • The coding interview courses in Educative, such as this course, contain lots of interesting problems involving the union-find data structure and equivalence classes.

Many other similar applications require either identifying equivalence classes as their final goal or as a means to an end.

  • One instance where the concept of equivalence classes is useful is in black-box testing. To test a software the test cases are designed so that they correspond to equivalence classes. Each equivalence class contains elements that are equivalent in the sense that they are expected to elicit a similar response from the system. The idea is to test software more effectively by testing representative data elements from equivalence classes instead of having to test the entire input space. Having said this, figuring out which data falls in the same equivalence class requires skills. This use case is a little more unusual than the other abovementioned use cases because the focus is on identifying equivalence classes based on the requirement specifications and the engineer’s judgment call of what’s considered equivalent rather than beginning with a mathematically crisp definition of what constitutes an equivalence relation.

  • There are similar instances in experimental works where looking at the search space or a dataset through the lens of equivalence classes has provided some kind of theoretical justification.

One of the takeaways here is what was left unsaid: it’s important to have the basics covered and be well-prepared because one never knows where one simple abstract idea can be applied in so many diverse ways.

That’s all for this blog. We hope this was useful for you!


Written By:
Mehvish Poshni
Join 2.5 million developers at
Explore the catalog

Free Resources