The Halting Problem

Learn about the languages that are not recursive.

Librarian Paradox

To complete our coverage of formal languages, we need to exhibit a recursively enumerable (RE) language that is not recursive. We’ll form a language that can be recognized by a Turing machine, but that is not decidable (i.e., the TM\text{TM} may not halt on strings that are not in the language). This language will therefore be RE but not recursive. This language will allow us to discover other nonrecursive languages, and, by complementation, languages that are not RE. But first, we consider some interesting features of self-reference.

There once was a library intern tasked with creating a list of all books in the library that didn’t mention themselves. The intern found that the list became exceedingly long, and thought it best to expand the list itself into a book, to be housed in the same library for reference.

Satisfied with their lengthy and tedious effort, the intern proudly placed the book of non-self-referencing books on a shelf in the reference section. After a while, it occurred to them and others that this new book itself, being a book in the library, also needed to be classified. Should this new book be listed inside itself? It certainly seemed so, because it did not mention itself within its pages. But once added, it now mentions itself, and therefore should be removed from itself, which means it no longer mentions itself, so it should be added to itself…

This story is known as the Librarian Paradox. Things get very interesting when self-reference and a negative proposition are involved.

The halting problem

The canonical example of a computation that is not decidable is the halting problem, which was originally proposed by Alan Turing himself. Suppose there is an algorithm, and therefore, a Turing machine, HH, that takes an encoding of another TM\text{TM}, MM, as input, along with an arbitrary string, ww, and always halts, accepting if MM will halt when processing ww as input, and rejecting if it will not halt on ww. This needs to be done without running MM on ww, of course, since that may hang, and then HH wouldn’t be an algorithm. So HH could be a very useful algorithm. We could call H(M,w)H(M,w) to decide whether we should actually bother executing M(w)M(w). This could lead to software that reads other programs and finds infinite loops, a nice, preemptive debugging tactic.

Intuitively, we suspect that the halting problem is unsolvable since not all TM\text{TM}s halt on every possible input string. After all, what kind of oracle can foresee whether a TM\text{TM} will halt on an arbitrary input? But, as usual, we would like a proof. We will show there can be no such HH by assuming that it does exist, and then deriving a contradiction from that assumption.

If HH exists, then we can use it as a subroutine to build other machines. For simplicity, we will write a new TM\text{TM}, HH', that takes only one input, the string encoding of an arbitrary input TM\text{TM}, and feeds that encoding as both arguments to HH. Put simply, H(M)H'(M) calls H(M,M)H(M,M). The input string is immaterial in what follows, so no harm done. The following figures depict HH and HH', respectively.

Get hands-on with 1300+ tech skills courses.