Equivalence of Regular Expressions and Regular Languages

Regular expressions are hard to find

Not all languages lend themselves to the easy discovery of a regular expression to describe them. Consider a language consisting of an even number of both aa’s and bb’s. We know that each letter must occur in pairs but not necessarily in consecutive pairs. In addition, we can introduce the aa’s and bb’s independently—having no aa's and/or bb's is valid since zero is an even number. The answer is not easy to come up with by ad hoc means:

(aa+bb+(ab+ba)(aa+bb)(ab+ba))(aa+bb+(ab+ba)(aa+bb)^*(ab+ba))^*

It is especially difficult to come up with a regular expression for the complement language of a given regular expression. Regular expressions describe what is in a language, not what is outside of it. Fortunately, we can convert any finite automaton to a regular expression and vice versa.

To show that regular expressions represent the class of regular languages, we’ll demonstrate the following:

  • For every regular expression, there is an NFA (and, therefore, a DFA) that accepts the same language.
  • For every finite automaton (DFA or NFA), there is a regular expression that describes the language the automaton accepts.

From regular expression to NFA

To show that there is an automaton for every regular expression, it is sufficient to exhibit an automaton for each component of the formal definition of regular expressions. The NFAs in the figure below represent the base cases.

Get hands-on with 1400+ tech skills courses.