...

/

Equivalence of Regular Expressions and Regular Languages

Equivalence of Regular Expressions and Regular Languages

Learn about the process of converting a regular expression to a regular language.

Regular expressions are hard to find

Not all languages lend themselves to the easy discovery of a regular expression to describe them. Consider a language consisting of an even number of both aa’s and bb’s. We know that each letter must occur in pairs but not necessarily in consecutive pairs. In addition, we can introduce the aa’s and bb’s independently—having no aa's and/or bb's is valid since zero is an even number. The answer is not easy to come up with by ad hoc means:

(aa+bb+(ab+ba)(aa+bb)(ab+ba))(aa+bb+(ab+ba)(aa+bb)^*(ab+ba))^*

It is especially difficult to come up with a regular expression for the complement language of a given regular expression. Regular expressions describe what is in a language, not what is outside of it. Fortunately, we can convert any finite automaton to a regular expression and vice versa.

To show that regular expressions represent the class of regular languages, we’ll demonstrate the following:

  • For every regular expression, there is an NFA (and, therefore, a DFA) that accepts the same language.
  • For every finite automaton (DFA or NFA), there is a regular expression that describes the language the automaton accepts.

From regular expression to NFA

To show that there is an automaton for every regular expression, it is sufficient to exhibit an automaton for each component of the formal definition of regular expressions. The NFAs in the figure below represent the base cases.

Press + to interact
NFAs for part 1 of the definition of regular expressions (empty set, empty string, single letter)
NFAs for part 1 of the definition of regular expressions (empty set, empty string, single letter)

We now need to show how to construct the union, concatenation, and Kleene star of arbitrary NFAs. The grouping operation is merely a notational convenience for regular expressions and is not part of the proof.

To construct the union of two NFAs, we create a new initial state and use lambda expressions to the start states of each of the NFAs, allowing a choice of which machine to use. If x1x_1 and y1y_1 are the initial states of the two NFAs in question, the following figure illustrates the process.

Press + to interact
NFA fragment representing the union of two other NFAs
NFA fragment representing the union of two other NFAs

To construct the concatenation of two NFAs, we place a lambda transition from each accepting state of the first machine to the start state of the second and make those final states in the first machine nonfinal. Of course, the start state of the second machine is also no longer a start state. See the figure below, which assumes, without loss of generality, that xix_i and xjx_j are the final states of the first machine and y1y_1 is the start state of the second.

Press + to interact
NFA fragment representing the concatenation of two other NFAs
NFA fragment representing the concatenation of two other NFAs

To construct the Kleene star of an NFA involves two considerations:

  1. If the empty string is not already accepted, we must make it accepted.
  2. We must allow arbitrary “restarts” of the NFA.

If the initial state of an NFA is also an accepting state, the first consideration above is already fulfilled. If not, we can make it accept only if there are no incoming edges from other states. Otherwise, making the start state accepting would change the language of the machine. In such a case, we create a new accepting start state, followed by a lambda transition to the original start state.

Press + to interact
NFA accepting the empty string
NFA accepting the empty string

To allow arbitrary repetition of a machine, we add lambda transitions from each accepting state to the original initial state, since repetition is just concatenation with itself.

Press + to interact
NFA allowing repetition
NFA allowing repetition

This constitutes a constructive proof of the first part of Kleene’s theorem. Let’s see it in action by constructing an NFA for the regular expression (aba+b)(ab^*a+b)^*. We begin with an NFA accepting the singleton string aa.

Press + to interact
NFA accepting a
NFA accepting a

Next, we allow zero or more bb’s (we are omitting a lambda transition to a separate machine here for simplicity).

Press + to interact
Concatenating b* after a in the NFA shown in the previous figure
Concatenating b* after a in the NFA shown in the previous figure

Now we concatenate another aa.

Press + to interact
Concatenating a in the figure above
Concatenating a in the figure above

Next, we allow the alternative of just accepting bb ...

Access this course and 1400+ top-rated courses and projects.