Reproduction is a quality of all living beings, but can we create machines with the ability to reproduce and, thereby, ensure their perpetual survival?
The implications, both practical and philosophical, are endless. For example, think of Von Neumann probes (space probes that can replicate themselves) that could potentially chart out all of space!
Self-replicating machines were postulated by Von Neumann in the 1940s; he thought such machines might be constructed using computers and other automated manufacturing techniques. To date, though, there’s no working model of a physical machine that can do so. But there’s no reason why such a machine can’t be built.
Or is there?
Surely, one may think, a machine that can create another machine has to be more complicated than the machine it creates. A part can’t be as complex as the whole, right?
Not necessarily. All that implies is that a machine cannot contain a copy of itself. It says nothing about a machine “creating” or “generating” a copy of itself. Instead of tackling the general problem, we’ll demonstrate this point by focusing on computer programs, for which it’s certainly possible!
Let’s first try to understand the task. We’re not asking you to read the source file of the code and print it. The program should be able to print its own code using just print statements (or their equivalents in your favorite language).
Now the problem looks more interesting!
When we start to look at this problem, it becomes apparent that we can’t just put the entire code of the program inside a print statement. This is because the code that needs to be printed would then include this print statement as well. If we try to include it in the print statement, then that print statement would also have to be printed. This pattern will continue endlessly. Clearly, this approach cannot work.
Let’s try to circumvent this by trying out different code organizations, say, by dividing the program into two functions: function A and function B. Function A will be responsible for printing the code of function B, and function B will be responsible for printing the code of function A. We can then run function B first to print function A, and then run function A to print function B.
Writing A is simple enough because we can just copy the code of B, whatever it may be, and simply paste it into A.
function A() {
  print <code of function B here>
}
As long as we have the code of function B available, this can be done.
Now, how should we go about writing function B? One might be tempted to write it similarly, as follows:
function B() {
  print <code of function A here>
}
But this won’t work. The reason is the same infinite regress we encountered earlier. Since function A contains the code of function B, function B can’t contain the code of function A, because that would imply function B contains its own code. And this, as we’ve already realized, is impossible. So what should we do?
If it’s possible at all for a program to print its own code, it’d have to compute or generate at least some part of its code. Any other attempt will run into some sort of infinite regress. There’s no way around it.
We’ll now try doing just that.
As our final attempt, let’s write function A in the same way, but let’s write function B so that, instead of containing a copy of function A, it computes the code of function A.
To obtain the code of function A, all we need is the code of function B. If we have access to it, we know how to generate the code of A: we’d simply add function A() { and a print statement before this code, and append a closing } at the end.
function B(code_of_B) {
  print <function A() {
    print <code_of_B>
  }>
}
As long as we have the code of function B, we can generate the code of A computationally. But how can we obtain the code of function B without running into the infinite regress we encountered in all of our earlier attempts?

We can obtain the code of function B by running function A, of course!

Function A is responsible for just one task: printing the code of function B. So if we run A first and pass its output as input to function B, we can generate the code of function A!
Function A is assisting in generating its own code. How cool is that!

To give the complete structure of the code, we’d have to worry about exactly how and where to run the two functions to obtain the code. But those are just details that can be easily taken care of. We’ve made some minor adjustments to the two functions: function A is now responsible for providing the code of everything except itself, and function B is responsible for providing the code of function A.
With these adjustments, the program would have the following structure:
function A() {
  return <function B(code_of_B_and_the_rest) {
    code_of_A = <function A() {
      return <code_of_B_and_the_rest>
    }>
    return <code_of_A>
  }
  code_of_B_and_the_rest = A()
  code_of_A = B(code_of_B_and_the_rest)
  print code_of_A
  print code_of_B_and_the_rest>
}
function B(code_of_B_and_the_rest) {
  code_of_A = <function A() {
    return <code_of_B_and_the_rest>
  }>
  return <code_of_A>
}
code_of_B_and_the_rest = A()
code_of_A = B(code_of_B_and_the_rest)
print code_of_A
print code_of_B_and_the_rest
Note that, as we observed earlier, all code starting with function B needs to be duplicated inside function A.
Now, in an actual implementation in some language, we’d need to handle string delimiters carefully as well: the code being printed contains string literals of its own, so the quotes and escape characters inside them must themselves be escaped correctly.
For example, see the following implementation in JavaScript in which we’ve taken care of this problem:
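Here’s one way such an implementation might look in JavaScript, following the A/B structure developed above (the identifiers A, B, and rest are our own, and this is a sketch rather than a canonical listing). A returns the text of everything that follows it as a string, and B rebuilds the text of A by quoting that string with JSON.stringify, which escapes the quotes, backslashes, and newlines for us:

```javascript
function A() {
  return "function B(rest) {\n  return \"function A() {\\n  return \" + JSON.stringify(rest) + \";\\n}\\n\";\n}\nconst rest = A();\nconsole.log(B(rest) + rest);";
}
function B(rest) {
  return "function A() {\n  return " + JSON.stringify(rest) + ";\n}\n";
}
const rest = A();
console.log(B(rest) + rest);
```

The string literal inside A is written exactly as JSON.stringify would produce it, which is what makes the printed text match the source. Feeding the program’s output back through node prints the same text again, which is precisely the quine property.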
Programs that can print their own code are referred to as quines, after the logician and philosopher Willard Van Orman Quine. You can find examples of quines in almost every programming language. You can use the mechanism we’ve described here to endow any program with the ability to acquire its own code, thereby turning it into an equivalent program that’s also a quine.
Consider this program that finds the minimum element in an array:
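A minimal version of such a program might look like this (the name findMin is our own):

```javascript
// Find the minimum element of an array by scanning it once.
function findMin(arr) {
  let min = arr[0];
  for (let i = 1; i < arr.length; i++) {
    if (arr[i] < min) min = arr[i];
  }
  return min;
}

console.log(findMin([5, 2, 8, 1, 9])); // prints 1
```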
We can convert it into a quine by adding the apparatus developed earlier:
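A sketch of what the converted program might look like, again with our own identifiers: since A provides the code of everything except itself, the body of findMin must appear twice, once as live code and once, verbatim, inside the string that A returns:

```javascript
function A() {
  return "function B(rest) {\n  return \"function A() {\\n  return \" + JSON.stringify(rest) + \";\\n}\\n\";\n}\nfunction findMin(arr) {\n  let min = arr[0];\n  for (const x of arr) if (x < min) min = x;\n  return min;\n}\nconst rest = A();\nconsole.log(B(rest) + rest);";
}
function B(rest) {
  return "function A() {\n  return " + JSON.stringify(rest) + ";\n}\n";
}
function findMin(arr) {
  let min = arr[0];
  for (const x of arr) if (x < min) min = x;
  return min;
}
const rest = A();
console.log(B(rest) + rest);
```

The program still computes minimums via findMin, and running it now prints its own source as well.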
Now that we have access to the entire code towards the end of this program, we can perform arbitrary computations with it, instead of just frivolously printing it, provided that we have some way of executing the code. Fortunately, since JavaScript is an interpreted language, we can do so using the eval function. Following is an implementation of a “recursive” factorial function that doesn’t use the built-in recursion of JavaScript.
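Here is a self-contained sketch of the idea. The string codeOfFactorial stands in for the code that the full program would obtain through the A/B apparatus; each “recursive” step re-evaluates that code with a call on n - 1 appended, and eval returns the value of that final call:

```javascript
// The code of factorial, as the full program would recover it via the
// quine apparatus. Here we supply it directly for brevity.
const codeOfFactorial =
  "function factorial(n) {\n" +
  "  if (n <= 1) return 1;\n" +
  "  // No direct recursive call: re-evaluate our own code followed by\n" +
  "  // a call on n - 1; eval returns the value of that last expression.\n" +
  "  return n * eval(codeOfFactorial + 'factorial(' + (n - 1) + ')');\n" +
  "}\n";

console.log(eval(codeOfFactorial + "factorial(5)")); // prints 120
```

Each level of the computation evaluates a fresh copy of the function’s code, so no function ever calls itself directly.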
Take care to copy any additional code you may write and paste it inside function A as well. This is because function A is responsible for generating the entire file (except itself).
We define the factorial function in the usual way, except we don’t use the built-in recursion of the language. Instead, we evaluate the entire code of the file with a factorial(n-1) call appended to it. This evaluates the factorial function obtained from the program’s own code on n-1, without ever making a direct recursive call!
This can come in handy in languages that do not support recursion natively.
This implementation is inspired by the strategy used in the proof of Kleene’s recursion theorem.
Though we haven’t given a formal proof of this theorem here, we’ve looked at some examples to see that it can be done. The proof is a generalization of the techniques apparent in our examples.
It’s time to explore some of the consequences of the recursion theorem.
In addition to its obvious applications in enabling a computer virus to self-replicate, the recursion theorem helps in obtaining—often constructive—proofs of some important theorems. Let’s look at two of these below.
It’s well-known that the halting problem (i.e., the problem of determining whether a given program will halt on a given input) is unsolvable algorithmically. The standard proof makes use of a tricky diagonalization technique, originally developed by Georg Cantor, that has been used to great effect in mathematics and computer science ever since. Using the recursion theorem, we can come up with a simpler proof of the unsolvability of the halting problem.
We’ll show that if we assumed that the halting problem were solvable by a program H, we’d arrive at a contradiction (something that we know to be false or impossible). We can do so simply by creating an impossible program P as follows:
function P(w) {
  1. Obtain the code of P via the recursion theorem.
  2. Run H on the program P and input w.
  3. If H says P will halt on w, go into an infinite loop.
  4. If H says P will not halt on w, simply halt.
}
Now it’s clear that P will halt on w if and only if P will not halt on w, which is clearly impossible. Hence, H cannot exist! QED.
We’ll end this blog by stating and proving a very implausible result. Suppose you have a transformer of computer programs. That is, you have a computer program T that takes in a program P and outputs another program T(P) after performing some arbitrary changes in the code of P. In case the transformation introduces syntax errors, the convention is to regard T(P) as a program that just returns false on all inputs. The fixed-point theorem states that no matter how arbitrary such a transformer T is, there’ll always be a program F whose behavior remains unchanged by this transformer. Sure, the code will change, but both F and T(F) will be equivalent, in the sense that they’d both give the same outputs on the same inputs.
Surely that’s not possible! We could easily write a completely random transformer that just garbles up the whole code! Right?
Wrong! The transformer can be as random as we like. The theorem guarantees that some such “fixed points” will always exist.
The proof is constructive. That is, we’ll describe a program F whose behavior remains unchanged by the transformation T.
function F(w) {
  1. Obtain the code of F using the techniques discussed above.
  2. Apply the transformation T on the obtained code of F to obtain a program T(F).
  3. Simulate T(F) on the input w, and do what T(F) does.
}
Clearly, both F and T(F) have the same behavior, since both compute the same outputs on the same inputs. This is because F is just simulating T(F). Hence, the program F is our fixed point. QED.
Note that using the technique suggested in this proof and the quine techniques discussed in this blog, you can actually construct such fixed points!
This theorem is one of the many fixed-point theorems found in mathematics. For amusement, let’s briefly talk about an implication of Brouwer’s influential fixed-point theorem from topology.
Suppose two identical pieces of paper are placed on top of each other, perfectly aligned. If the top paper is picked up, crumpled in any way that does not involve tearing, and put back anywhere on top of the bottom paper, there’ll always be a point on the two papers that is still aligned. That is, f(x) = x, where the function f represents the crumpling-and-putting-back transformation we described above.
Randomness, apparently, can’t be too random!