...

/

The Chomsky Hierarchy

The Chomsky Hierarchy

Learn about the Chomsky hierarchy of formal languages and the existence of languages that are not recursively enumerable.

We'll cover the following...

We can now summarize the entire hierarchy of formal languages. We emphasize “hierarchy” because each language class is a subset of the class above it. This classification originated with Noam Chomsky and is aptly called the Chomsky hierarchy. See the following table.

Language Class Recognizing Machine Grammar Type
Recursively Enumerable TM\text{TM} (recognizer) Type 0: Unrestricted
Recursive TM\text{TM} (decider) Type 0: Unrestricted
Context Sensitive LBA (bounded TM\text{TM}) Type 1: Non-contracting
Context Free PDA Type 2: Context Free
Deterministic Context Free DPDA Type 2: Context Free
Regular DFA Type 3: Right/Left Linear

We have yet to describe an RE language that is not recursive, and a language that is non-RE. Here we will show that there are more languages outside the Chomsky hierarchy (i.e., non-RE languages) than there are inside of it (the “recursively enumerable or better” languages). Our argument is based on the countability of infinite sets.

Countable sets

A set is countable if there is a one-to-one correspondence between its elements and the natural numbers, N\mathcal{N} (or a subset of N\mathcal{N}). Any finite set is therefore countable, and an infinite set is countable if its elements can be enumerated. The set of strings {a,b}\{a,b\}^* is countably infinite because its elements can be arranged in a well-defined sequence (i.e., the familiar canonical order):

In this case, the subscripts are the natural numbers themselves, exhibiting the one-to-one correspondence between N\mathcal{N} and {a,b}\{a,b\}^* (i.e., e(0)=λ,e(1)=a,e(2)=b,e(3)=aa,e(0)=\lambda, e(1)=a, e(2)=b, e(3)=aa,\cdots).

We can conclude that all formal languages are countable sets, since they can be arranged in canonical order. A formal language is, after all, a subset of the countable set, Σ\Sigma^*.

Remember: All formal languages are countable sets of strings.

Now, recall that Turing machines can be represented as strings over some alphabet. In particular, we know how to encode any TM\text{TM} ...

Access this course and 1400+ top-rated courses and projects.