Algorithmic Efficiency: Guessing Game
Learn how to develop efficient strategies of problem-solving and their beautiful connection with computer science.
Description
Let us play a guessing game where one player thinks of a random number within a range, and the rest of the players try to guess that number.
We will try to simulate this game but with a slightly modified objective which is to develop an
Algorithm development
Imagine a classroom where a teacher asks the students to guess a number within the range of to . The students will bombard the teacher with random tries. Someone may ask, Is it 1? Or 21? Or 65? Or 330? Or 500? Or maybe 999? Or even 1000? Whatever the number really is, it will be extremely hard to guess right. In the worst case, students may have to make tries to guess that number.
Now, the question is, can we guess this number in fewer tries? What if the teacher allows True/False questions too?
If someone asks, “Is it a prime number?” and the teacher replies that it is not a prime number. Then this attempt will reduce the number of possibilities to , as the number of primes within the range of to is .
What if someone asks “Is the number even or odd?” and the teacher replies that it is an even number. Now this question cuts down tries (odd numbers) at once and leaves only even numbers to guess from.
Someone may think of dividing the problem space into intervals of say 100s. For example, a student asks, “Is the number less than a 100?” If the teacher replies yes, we can guess the number in only a hundred tries (asking about each number from 1–99). But in case the teacher replies no, a student will probably ask if the number is in the next interval, such as less than 200 but greater than 100. Now, if the teacher replies yes, then again we’ll be able to guess the number in tries: 2 for determining the interval and the rest for trying every possibility within that interval.
If the number is between to , this strategy of searching by interval will take tries to guess the number which is much better than our previous naive strategies.
Now that we understand the questions, let’s jump into the classroom.
So far, we have found that the most efficient way is to identify the interval first, and then within the range of s, we try every possibility. There are two points to consider:
- Trying every possibility leads to too many tries. Can we shrink the interval size further and repeat the same procedure?
- In our interval search strategy, our question answers (“yes/no”) strategy was creating unequal distributions. Like in our first question: Does the number lie within the range , “no” was yielding a remaining range of tries, and “yes” was yielding 100 tries.
In a worst-case scenario, it is possible that the desired number always lies in the bigger interval range consuming more tries.
Can we do better?
Yes, we can. What if we first divide the number (in general it should be the
Can we still do better?
Yes, we can repeat the step of dividing the range into two parts and select the one for which the teacher nods yes. For example, we first take the middle of the range and ask the teacher if the number is in the first or second half. Whatever the teacher answers, select that range. Let’s suppose that the number is in the second half; now take the middle of the 2nd half and ask the teacher again by taking the middle and repeating the same procedure until we find the number (keep shrinking the range until the range becomes number and that is our target).
Example
Let’s consider that the target number is and the range is again . First, we divide by , which is , and ask if this number is greater than or not, or if it is in the first half or second half? As is in the second half we’ll find the middle of the second half by adding the two extremes and dividing by 2: . Then, the question is whether it is greater than or not? As it is not, we’ll divide this half again by and again ask if this number is greater than or not and so on. Here is the table of the questions and the halving range with respect to that the guessed range is shrinking slowly.
Step | Range | Guess | Answer | Shrunk Range |
---|---|---|---|---|
1 | 1–1000 | <501 | No | 501-1000 |
2 | 501–1000 | <751 | Yes | 501-750 |
3 | 501–750 | <626 | Yes | 501-625 |
4 | 501–625 | <562 | No | 562-625 |
5 | 562–625 | <594 | Yes | 562-593 |
6 | 562–593 | <578 | Yes | 562-577 |
7 | 562–577 | <569 | No | 569-577 |
8 | 569–577 | >573 | No | 569-573 |
9 | 569–573 | >570 | Yes | 571-572 |
10 | 571–572 | >571 | No | 571 (answer) |
In computer science, we call this halving strategy of finding a number within a range as binary searching.
Let us write down the step-by-step algorithm, in
Algorithm
if is the starting element, and is the ending element of that range, then will be .
Step 1: Initialize
Step 2: Repeat the whole process until
Step 2.1: If , change to
Step 2.2: If , change to
Step 2.3: Recompute
Step-3: mid
holds the target
value
Let’s take a look at the following interactive animation for better understanding.
If finding in the range of took 10 tries through binary search, then how many tries will it take to find a number for the range .
How many steps will it take to find a number for the range .
How efficient is the binary search!
Assuming that was the original range, of size , in which we needed to find a target, binary search will shrink the range size by following the sequence:
Hence our binary search algorithm will take steps. For example: In our case where is , binary search will take only .
Exponentiation vs. logarithm
Exponentiation is a mathematical operation, written as , involving two numbers: the base “b” and the exponent “n”. Where n is a positive integer, exponentiation corresponds to repeated multiplication of the base b. That is, is the product of multiplying b with itself n times. Similarly, if we start from a number N and we keep dividing that by b until it becomes , that quantity is called , e.g. is as = 1024. In the same vein, is . Similarly, is , if the number N is not a perfect base of b. Then, the result will be an approximation. For example, is
Both logarithm and exponentiation are related. A beautiful analogy to understand the two quantities is the as follows. If we start from a quantity , and keep multiplying by a constant, (let us say 10) then the quantity will increase in the following fashion
The number of steps, it will take to reach from to will be . If we say something is growing exponentially, it means it is growing very fast.
Just a little fact to remember; the total number of atoms in the known universe is around . Now, if you start from a quantity of and keep multiplying by then in only steps we’ll reach as big a number as the number of atoms in the universe.
Naive approach vs. binary search
To realize how efficient binary searching is, consider that we have a supercomputer that has a processor with a clock speed of GHz (Which means it generate clock ticks in one second). Assume it can compute execute instructions in 1 second (for simplicity assume that in every clock tick it executes one instruction, though in reality a computer takes several thousand clock ticks to execute one instruction).
Now, imagine that our teacher has labeled the entire atoms in the universe with a unique number, hence a range between , and they have picked a random atom, and we are challenged with finding the label of that atom. Now, this problem is the same as finding a number from a range of .
If the computer finds that number using
On the other hand, if we use the binary search algorithm, only steps are required. Hence, our supercomputer will solve it in, seconds (less than one-millionth part of a second) by that same computer.
This is the difference in efficiency we talked about.
Why search for an efficient strategy?
A computer scientist’s job is not just to come up with a solution, rather the real challenge is to come up with an Algorithm that is most efficient in terms of space and time. That is the standard every computer scientist aspires to achieve.
We’ll discover in this journey several computational problem-solving algorithms and learn strategies through writing codes in C++.
Exercise (Covid-101)
Consider the following scenario: we have an infinite population of a unicellular organism called Dinos (N Dinos) living in a single-dimensional world in a single row.
Accidently, the first dino has been infected by a deadly virus named Covid-101. This virus spreads from one dino to the other in a linear order (an un-infected dino can get an infection from the infected neighbor only). It took a while to become aware of this viral infection, and it has already infected a huge number of dinos.
Fortunately, Dr. Hakeem Luqman has made the antidote known as phakkee for that virus. There are some important considerations to make:
-
The phakkee is very difficult and time consuming to make. Hence, making an infinite number of them and giving it to everyone will not be plausible.
-
During the time it takes to test one dino, it can infect its neighbor.
-
The infected dino can not be saved.
-
If a dino is vaccinated the virus cannot infect it. Hence, infection will not proceed through a vaccinated dino (who is yet to be attacked by the virus). Therefore, a vaccinated dino is a guarantee to safeguard the rest of the population to its right.
We need to save the maximum number of dinos from Covid-101. How do we design an efficient algorithm that can save as many dinos as possible?