It's fair to assume that those who read this blog have experience in writing programs to solve problems. They've likely solved enough problems to justifiably believe that any problem that requires programming is solvable, as long as it’s well-defined. With the advent of generative AI like ChatGPT—that can significantly aid in solving programming problems—this belief will likely become even more pervasive.
However, there are limitations on what can be solved through programming! This blog will discuss the exciting and profound field of computer science, which aims to study computability itself.
Let's ask some obvious but important questions that are foundational.
Given a well-defined problem, is there always an algorithm to solve it?
The answer to the question is no, not always!
This isn't just about limitations of resources, like processing power or memory constraints, or even about the underlying physics of the computers. Some problems just can’t be solved algorithmically in any sort of computer, whether classical or quantum, irrespective of the hardware specifications.
Are these problems esoteric, residing only in the minds of theoreticians and of no practical significance?
No, they aren’t. They are very real and relevant problems, such as automated software testing, determining algorithmically whether a specific piece of code in a program is ever executed, or determining whether a given program ever halts.
But there are static analysis tools that warn us about unused code, infinite loops, and the like—don’t they follow an algorithm to determine that?
When a problem is said to be algorithmically unsolvable, this is not to say that it’s impossible to solve all instances of that problem. Some special instances are rather easy to solve. For example, the code reachability problem can be solved for cases where there’s no terminating condition in a loop. Clearly, any code that appears after the loop won’t ever be executed.
However, when a problem is said to be solved algorithmically, that means that there’s an algorithm that can solve all instances of that problem. For example, a sorting program that can sort arrays of only specific types can’t claim to have solved the general sorting problem.
It's simple to tell when a problem can be solved algorithmically. The algorithm is written and tested, or the programmer can utilize techniques learned in an analysis of algorithms course. But how can one know that there’s no algorithm for a problem?
A formal answer to this question would have to begin by defining a computational model, like Alan Turing's Turing machines, Alonzo Church's lambda calculus, or Kurt Gödel's recursive functions. However, the focus will remain on computer programs for the purposes of this blog.
A framework is necessary to answer these questions. The scope will be limited to specific types of problems and programs. This will help keep the explanations crisp and will avoid needless details.
This discussion will be limited to decision problems, where the objective is to test whether a given instance of the problem satisfies certain properties. For example, the following problems are decision problems:
Determine whether a given number is a prime number, where the number defines an instance of the problem.
Determine whether divides , where the pair and defines an instance of the problem.
Determine whether the list is a sorted version of a list , where the pair and defines an instance of the problem.
This discussion will be limited to programs with a specific structure because these suffice for the decision problems. Our programs will have the following features:
They take only one input parameter in string format.
Their return type is boolean, i.e. True/False.
They can simulate another program on an input if an encoding of both the program and its input are passed as an input to our program.
They have no limitations on their memory usage.
The first condition doesn't limit generality because multiple inputs of different types can be consolidated by casting and encoding them all in one string with a delimiter distinguishing between different parameters. This condition is relaxed in some of the examples below to keep the explanations simple. The third assumption should also not cause any distress. Programs can be easily passed as inputs to other programs. Think of compilers that take the source code of another program as input, or of functions that can be passed as parameters in programming languages in which functions are first-class citizens. If we were working with Turing machines, this assumption would be justified by describing the mechanics of a universal Turing machine that can easily simulate other Turing machines.
Decision problems are natural candidates for the kind of programs described above. Given an instance of a decision problem, a program can be written that returns True if the instance has the specific properties and otherwise, False.
Importantly, if there's no such program that can solve a decision problem, it's definitely unsolvable.
Note that when such a program is run on an input, there can be three kinds of results:
It halts and returns True.
It halts and returns False.
It doesn't halt and continues computation forever.
This third condition holds, for instance, if there is a loop that never terminates.
Naturally, when a program solves a decision problem, it should either return True or False, and it shouldn't run into any infinite computation on any input.
Given a program
takes two inputs (encoded as one, as discussed above): a program and a string . returns True if the program halts on with either True or False, and returns False if the program never halts on input .
The halting problem isn't solvable algorithmically.
Suppose such a program
If
If
It’s a straightforward matter to write
Now, to address the question "Why was
Note: If it seems strange seeing a program being passed as input to itself, think of passing the code of a compiler to itself as input when using it to compile itself!
Replacing for above,
That is,
Both of these possibilities are self-contradictory. Therefore, they can’t happen. Logic dictates that such a program can’t exist. But every step in writing is clearly doable, assuming that exists. Hence, can’t exist.
Thus, the halting problem is unsolvable!
What's achieved here is quite remarkable. The above section just demonstrated that it is impossible to solve a problem algorithmically!
This argument employs the diagonalization argument that was initially presented by Georg Cantor to prove that real numbers are uncountable. That is, there're so many real numbers that it’s impossible to even specify a way of listing them all in one infinite list, as opposed to, say, natural numbers that can be listed, like
Now, it’s time to face the consequences! This impossibility leads to other impossibilities.
Suppose there’s a program , a piece of code that appears in , and an input string . Can someone write a program to determine whether the code is ever executed during ’s execution on input ?
The answer is no.
We'll demonstrate below that if the code reachability problem is solvable, so is the halting problem. As was illustrated above, the halting problem isn’t solvable, this would imply that the code reachability problem isn’t solvable either. In other words, we'll demonstrate that the halting problem is reducible to the code reachability problem, thereby establishing it isn't solvable.
Suppose a program
As the first step,
Now,
If
If
In other words,
One can use similar reductions to show that almost every meaningful question about computer programs is unanswerable programmatically.
The software verification problem asks: given a bunch of formally written requirements, , and a program , is there a program that can tell whether the program satisfies all of the requirements ?
If such an algorithm existed, quality assurance departments all over the world would have to fight hard to justify their existence! To test whether the programs are working according to the given requirements, the requirements would need to be formally specified and the verifier
But unfortunately, this problem can’t be solved either. Here’s why:
Assume the software verification problem is solvable, that is, the program
Just like before, it will construct another program
Now, if
On the other hand, if
Clearly, this shows that the halting problem can be solved, which is impossible. Hence, the software verification problem is unsolvable as well.
This is arguably the most ambitious question ever asked in the history of mathematics. In 1928, David Hilbert and Wilhelm Ackermann—based on a musing of Leibniz—asked whether it was possible to automate the process of finding proofs of mathematical statements. That is, given a mathematical claim, is it possible to determine algorithmically whether it’s provable or unprovable? Thereby reducing the jobs of mathematicians to asking meaningful questions and relegating the grunt work of proving or disproving them to this algorithm. This never came to be as this problem too is unsolvable. It was shown to be unsolvable by Alan Turing in the same paper that introduced Turing machines.
Free Resources