Beginner to Advanced Computing and Logic Building/

...

Algorithmic Efficiency: Guessing Game

Learn how to develop efficient strategies of problem-solving and their beautiful connection with computer science.

We'll cover the following...

Description
Algorithm development
- Can we do better?
- Can we still do better?
Example
Algorithm
How efficient is the binary search!
- Exponentiation vs. logarithm
- Naive approach vs. binary search
Why search for an efficient strategy?
Exercise (Covid-101)

Description

Let us play a guessing game where one player thinks of a random number within a range, and the rest of the players try to guess that number.

We will try to simulate this game but with a slightly modified objective which is to develop an algorithmA set of rules that must be followed when solving a particular problem (ref. Oxford Dictionary) that reduces the number of guesses to a minimum number of tries.

Algorithm development

Imagine a classroom where a teacher asks the students to guess a number within the range of $1$ to $1000$ . The students will bombard the teacher with random tries. Someone may ask, Is it 1? Or 21? Or 65? Or 330? Or 500? Or maybe 999? Or even 1000? Whatever the number really is, it will be extremely hard to guess right. In the worst case, students may have to make $1000$ tries to guess that number.

Now, the question is, can we guess this number in fewer tries? What if the teacher allows True/False questions too?

If someone asks, “Is it a prime number?” and the teacher replies that it is not a prime number. Then this attempt will reduce the number of possibilities to $832$ , as the number of primes within the range of $1$ to $1000$ is $168$ .

What if someone asks “Is the number even or odd?” and the teacher replies that it is an even number. Now this question cuts down $500$ tries (odd numbers) at once and leaves only $500$ even numbers to guess from.

Someone may think of dividing the problem space into intervals of say 100s. For example, a student asks, “Is the number less than a 100?” If the teacher replies yes, we can guess the number in only a hundred tries (asking about each number from 1–99). But in case the teacher replies no, a student will probably ask if the number is in the next interval, such as less than 200 but greater than 100. Now, if the teacher replies yes, then again we’ll be able to guess the number in $101$ tries: 2 for determining the interval and the rest for trying every possibility within that interval.

If the number is between $900$ to $1000$ , this strategy of searching by interval will take $109$ tries to guess the number which is much better than our previous naive strategies.

Now that we understand the questions, let’s jump into the classroom.

So far, we have found that the most efficient way is to identify the interval first, and then within the range of $100$ s, we try every possibility. There are two points to consider:

Trying every possibility leads to too many tries. Can we shrink the interval size further and repeat the same procedure?
In our interval search strategy, our question answers (“yes/no”) strategy was creating unequal distributions. Like in our first question: Does the number lie within the range $1-100$ , “no” was yielding a remaining range of $900$ tries, and “yes” was yielding 100 tries.

In a worst-case scenario, it is possible that the desired number always lies in the bigger interval range consuming more tries.

Can we do better?

Yes, we can. What if we first divide the number (in general it should be the range size1000 in our case. So it would be 1000/2 = 500) by $2$ and then ask if the number is in the first half or in the second half (by asking something like this: Is the number less than or equal to the middle number?). After that we can again divide it into ranges of $100$ numbers at a time and, in the worst case, we can solve it in $105$ steps. Now, we have a systematic way to solve a problem: divide the number in different parts and then solve it.

Can we still do better?

Yes, we can repeat the step of dividing the range into two parts and select the one for which the teacher nods yes. For example, we first take the middle of the range and ask the teacher if the number is in the first or second half. Whatever the teacher answers, select that range. Let’s suppose that the number is in the second half; now take the middle of the 2nd half and ask the teacher again by taking the middle and repeating the same procedure until we find the number (keep shrinking the range until the range becomes $1$ number and that is our target).

Example

Let’s consider that the target number is $571$ and the range is again $1-1000$ . First, we divide $1000$ by $2$ , which is $500$ , and ask if this number is greater than $500$ or not, or if it is in the first half or second half? As $571$ is in the second half we’ll find the middle of the second half $($ by adding the two extremes and dividing by 2: $\frac{501+1000}{2}=750$ $)$ . Then, the question is whether it is greater than $750$ or not? As it is not, we’ll divide this half again by $2$ and again ask if this number is greater than $625$ or not and so on. Here is the table of the questions and the halving range with respect to that the guessed range is shrinking slowly.

Step	Range	Guess	Answer	Shrunk Range
1	1–1000	<501	No	501-1000
2	501–1000	<751	Yes	501-750
3	501–750	<626	Yes	501-625
4	501–625	<562	No	562-625
5	562–625	<594	Yes	562-593
6	562–593	<578	Yes	562-577
7	562–577	<569	No	569-577
8	569–577	>573	No	569-573
9	569–573	>570	Yes	571-572
10	571–572	>571	No	571 (answer)

In computer science, we call this halving strategy of finding a number within a range as binary searching.

Let us write down the step-by-step algorithm, in pseudocodea technique used to describe the steps of an algorithm or a coding logic in plain English.

Algorithm

if $low$ is the starting element, and $high$ is the ending element of that range, then $mid$ will be $\bm{\frac{low+high}{2}}$ .

Step 1: Initialize $low = 1, high = maximum\_size, mid = (low + high) / 2$

Step 2: Repeat the whole process until $mid = target$
$\space \space \space \space$ Step 2.1: If $mid < target$ , change $high$ to $mid-1$
$\space \space \space \space$ Step 2.2: If $mid > target$ , change $low$ to $mid+1$
$\space \space \space \space$ Step 2.3: Recompute $mid = (low + high) / 2$

Step-3: mid holds the target value

How efficient is the binary search!

Assuming that $1-N$ was the original range, of size $N$ , in which we needed to find a target, binary search will shrink the range size by following the sequence:

N, \space \frac{N}{2}, \space \frac{N}{4}, \space \frac{N}{8}, \space ...\space , \space 8, \space 4, \space 2, \space 1

Hence our binary search algorithm will take $log_2 N$ steps. For example: In our case where $N$ is $1000$ , binary search will take only $log_2 1000 \approx 10$ .

Exponentiation vs. logarithm

Exponentiation is a mathematical operation, written as $b^n$ , involving two numbers: the base “b” and the exponent “n”. Where n is a positive integer, exponentiation corresponds to repeated multiplication of the base b. That is, $b^n$ is the product of multiplying b with itself n times. Similarly, if we start from a number N and we keep dividing that by b until it becomes $1$ , that quantity is called $\log_b N$ , e.g. $\log_2 1024$ is $10$ as $2^{10}$ = 1024. In the same vein, $\log_2 8$ is $3$ . Similarly, $\log_3 27$ is $3$ , if the number N is not a perfect base of b. Then, the result will be an approximation. For example, $\log_3 28$ is $3.0331....$
Both logarithm and exponentiation are related. A beautiful analogy to understand the two quantities is the as follows. If we start from a quantity $1$ , and keep multiplying by a constant, (let us say 10) then the quantity will increase in the following fashion
$1, 10, 100, 1000, 10000, 100000, 1000000, …$ $1, 10, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, …$
The number of steps, it will take to reach from $1$ to $10^{82}$ will be $log_{10} 10^{82} => 82$ . If we say something is growing exponentially, it means it is growing very fast.
Just a little fact to remember; the total number of atoms in the known universe is around $10^{82}$ . Now, if you start from a quantity of $1$ and keep multiplying by $10$ then in only $82$ steps we’ll reach as big a number as the number of atoms in the universe.

Naive approach vs. binary search

To realize how efficient binary searching is, consider that we have a supercomputer that has a processor with a clock speed of $10^6$ GHz (Which means it generate $10^{6} \times 10^{9} = 10^{15}$ clock ticks in one second). Assume it can compute execute $10^{15}$ instructions in 1 second (for simplicity assume that in every clock tick it executes one instruction, though in reality a computer takes several thousand clock ticks to execute one instruction).

Now, imagine that our teacher has labeled the entire atoms in the universe with a unique number, hence a range between $1, \space ... \space,\space 10^{82}$ , and they have picked a random atom, and we are challenged with finding the label of that atom. Now, this problem is the same as finding a number from a range of $1 ... 10^{82}$ .

If the computer finds that number using iterative approachchecking each number one by one (by exhaustively trying every possibility, which was our first approach) then it will take $\frac{10^{82}}{10^{15}.24.60.60}\approx1.2 \times 10^{62}\text{days} \approx 3.2\times 10^{59}$ years. In other words, it will take millions of millions of centuries for our supercomputer to find that number.

On the other hand, if we use the binary search algorithm, only $82 \times \log_2 10 \approx 273$ steps are required. Hence, our supercomputer will solve it in, $\frac{273}{10^{15}}$ seconds (less than one-millionth part of a second) by that same computer.

This is the difference in efficiency we talked about.

Why search for an efficient strategy?

A computer scientist’s job is not just to come up with a solution, rather the real challenge is to come up with an Algorithm that is most efficient in terms of space and time. That is the standard every computer scientist aspires to achieve.

We’ll discover in this journey several computational problem-solving algorithms and learn strategies through writing codes in C++.

Exercise (Covid-101)

Consider the following scenario: we have an infinite population of a unicellular organism called Dinos (N Dinos) living in a single-dimensional world in a single row.

Accidently, the first dino has been infected by a deadly virus named Covid-101. This virus spreads from one dino to the other in a linear order (an un-infected dino can get an infection from the infected neighbor only). It took a while to become aware of this viral infection, and it has already infected a huge number of dinos.

Fortunately, Dr. Hakeem Luqman has made the antidote known as phakkee for that virus. There are some important considerations to make:

The phakkee is very difficult and time consuming to make. Hence, making an infinite number of them and giving it to everyone will not be plausible.
During the time it takes to test one dino, it can infect its neighbor.
The infected dino can not be saved.
If a dino is vaccinated the virus cannot infect it. Hence, infection will not proceed through a vaccinated dino (who is yet to be attacked by the virus). Therefore, a vaccinated dino is a guarantee to safeguard the rest of the population to its right.

We need to save the maximum number of dinos from Covid-101. How do we design an efficient algorithm that can save as many dinos as possible?

Getting Started

Computing in the Real World

Introduction to C++ (Table Printing)

GNU Debugger Tutorial

Memory, Operations and Control Structure

Divide, Conquer, and Rule

Iteration, Reusability and Extensibility

Introduction to Arrays

Problem Solving With Arrays

Multi-dimensional Arrays

Sliding Window

Conclusion