Problem Solving: Searching in Sorted Arrays

Does sorted data give any advantage in terms of searching?

Yes! We can make an extremely efficient algorithm for searching if the data is sorted.


Instruction: Use the following playground to write the complete code for the task in this lesson.

2 4 6 8 9 10 11 12 15 18 -1
Editor for your implementation

Search in a sorted array

Task 1: Given a sorted (ascending order) array and a value target, search the index of target inside this array.

Program: Search in a sorted array

Data: 2 4 6 8 9 10 11 12 15 18 
Element: 11
11 is found at index :6

Naive approach 1

Here is the idea:

  • To search a value target, we will compare the target with each element of the sorted array, starting from the 0th index onwards.
    • If the D[di]==target, then we will return the index di.
    • If D[di] > target, then we immediately return -1 (as the values coming up will be increasing further because the array is sorted).
    • Otherwise, we keep traversing the array.
  • If we traverse the array completely and don’t find the target, we’ll return -1.

Let’s write the implementation.

Press + to interact
int searchInSorted(int D[ ] , int size, int target)
{
for(int di=0; di<size; di++) // iterate from the 0 to the size-1.
{
if(D[di] == target) return di; // value is found no need to loop further
if(D[di] > target) return -1; // Not only D[di] is bigger but also all the
} // the remaining values hence it should return -1
return -1; // the loop terminates, it means `t` is still bigger than
// the entire data, hence `t` does not exist and returns `-1`.
}

Instruction: Write the code in the following playground and test it on the main() given above.

Naive approach 2

In the above program, we can see that if during the iteration we reached a value that is bigger than the target, we don’t need to check further. Similarly, we can design another algorithm in which the loop runs in reverse order and, if D[di]<target, then we don’t need to look further towards the left of the array again. You may check this in the above playground with this implementation:

int searchInSorted(int D[ ] , int size, int target)  
{
    for(int di=size-1; di>=0; di--) // iterate from the 0 to the size-1.
    {
       if(D[di] == target) return di;  // value is found no need to loop further 
       if(D[di] < target) return -1;   // Not only D[di] is smaller but also all the
    }                      // the remaining values hence it should return -1
   return -1;   // the loop terminates, it means `t` is still smaller than 
                // the entire data, hence `t` does not exist and returns `-1`.
}

So the question arises: can we do it better by combining both the naive approaches?

Yes, by doing a binary search!

Binary search

Given a sorted array and value target, we have to search the target in a sorted array using the binary search method (also known as the halving technique).

Sample input

Program: Binary Search

Data: 2 4 6 8 9 10 11 12 15 18 
Element: 11
11 is found at index: 6

Binary search is an efficient algorithm to find an element from the sorted array.

The technique repeatedly divides the search range inside the array into halves until we get the exact location.

Look at the animation below to understand the working of the binary search technique.

The idea’s inspiration is from the above two naive ideas. Let us look into the algorithm.

Algorithm’s idea:

  • Step 0: First, we define the range starting with the index si = 0 and ending with the index ei = size-1.

  • Step 1: Assuming we’re using integer division, we calculate mid = (si+ei)/2. If the result of this expression is not an integer, then it will be rounded down to the nearest integer. This is because the mid variable is an integer and so it can only store whole number values.

    • If D[mid] is equal to the target, it means we have found the index that we were looking for. Therefore, we return mid.
    • If the target is smaller than D[mid], we search on the left side of the middle element. Therefore, we must shrink the space by updating end with mid-1.
    • If the target is greater than D[mid], we search on the right side of the middle element. Therefore, we must shrink the space again by updating start with mid+1.
    • Keep repeating Step 1, until the range finishes.
  • Step 2: If step 1 ends up failing the search, it means that the value t does not exist. So we return -1.

The prototype of the function is the following:

Press + to interact
int binarySearching(int D[], int size, int target);

Let’s implement our function:

Press + to interact
int binarySearching(int D[ ], int size, int target)
{
int si = 0, ei = size-1; // si: starting index, ei: ending index
// mi: mid index
while(si<=ei)
{
int mi = (si+ei)/2; // Calculate the midpoint
if(D[mi]==target) // If middle element equals t return middle index
{
return mi;
}
else if(D[mi]>target) // If t is smaller than middle element
{ // search in left side of middle element
ei = mi-1;
}
else // Search in right side of middle element
{
si = mi+1;
}
}
return -1;
}
  • Line 7: We calculate a midpoint on the basis of the start and endpoint.
  • Lines 9–12: If the target equals a middle element D[mi], then we return the middle index mi.
  • Lines 13–16: If the target is smaller than the middle element D[mi], then we search on the left half of the middle element. Hence, the range will shrink from si to mi-1. The way to fix this is by updating ei to mi-1.
  • Lines 17–20: If the target is greater than the middle element D[mi], then we search on the right half of the middle element. Hence, the new range is mi+1 to ei. This means that si should be moved to mi+1.

Instruction: Write the code in the exercise playground and test it.

Remember: The halving procedure, i.e., the loop of the binary search, is always boundedThe maximum number of times something can occur by log2N\log_2 N, where NN is the size of the data array D[ ].


How fast is binary searching?

Imagine if someone tells you that the entire data of the internet (including Google, Microsoft, Facebook, and Amazon) is around 1200 million terabytes, i.e., approximately 1.2×10181.2 \times 10^{18}, which is approximately 1.2×2502511.2 \times 2^{50} \leq 2^{51}. If someone then tells you that all this data is stored and sorted in some structured form where binary searching can be applied, then you know that you can definitely find a specific element within this humongous data quickly and efficiently.

To give you an idea, the number of times our binary searching loop will execute to search inside this massive data will be as follows:

log2251=51\log_2 2^{51} = 51

This is extremely fast, and the entire data can be pruned with this entire range-shrinking algorithm within only 51 iterations.

This is incredibly efficient and super-fast. Isn’t it?

Note: By the time you will read this lesson, maybe some more thousands of million terabytes will be added to this massive data; that will not disturb our calculation above because it will only add 2 to 3 more iterations. The difference between 51, 53, or 55 does not matter much.


Application: Searching in a k-shift sorted array

Task 2: Given a sorted array, shifted k times to the right (circularly, i.e., the last element comes to the first), search efficiently in this array.

Data: 2 4 6 8 9 10 11 12 15 18 

How can we shift the array k times? Look at the example below:

Press + to interact
k=1
Data: 18 2 4 6 8 9 10 11 12 15
k=2
Data: 15 18 2 4 6 8 9 10 11 12
k=3
Data: 12 15 18 2 4 6 8 9 10 11

Sample input

Program: Searching in a sorted array with k shifts to the right.

Data: 12 15 18 2 4 6 8 9 10 11 
K: 3
Element: 4

Sample output

4 is found at index:4

Idea:

If we observe the k shifted sorted array, it is divided into two sorted arrays: first, k elements and then the remaining size-k elements. We can actually solve this problem using two calls to the binary search method and by combining the results. Here is how we do it:

  • We search the target in the shifted elements, which start from D[0] and go up to D[k-1], i.e., BinarySearch(D, k, target).

  • If we find the element in the first array, we return with the found index.

  • If the target does not exist in the first part, then we will search the number in the second sorted array, which starts from D+k, while the remaining size of the array after D+k will be size-k. So, we’ll have BinarySearch(D+k, size-k, target).

  • If the second binary search returns the target index, then we must add k to it as the actual index. This is because the second array is starting from the kth index which is the 0th index of the second array.

For example, we have a k-shifted array with k=3:

12 15 18 2 4 6 8 9 10 11

These are basically two arrays:

i) 12 15 18  ---------  ii) 2 4 6 8 9 10 11
Case 1: Searching for 15
Binary Search result: 2
The answer is:        2

Case 2: Searching for 9
Binary Search result: 4
The answer should be: 7 (the size of the first array should be added 3+4=7)

Let’s implement the function SearchingKShifted():

Press + to interact
int searchingKShifted(int D[ ], int size, int k, int target)
{
// K Shifted array has basically two regions hence we can call two Binary Searches separately
int si;
si = binarySearching(D, k,target);
if(si==-1) // value not found in the first k-subarray
{
si = binarySearching(D+k, size-k,target);
// binary searching starting at D[k] as base address
// till size-k entries i.e. BinarySearching(D[k...size-1])
if(si!=-1)
{
si+=k;
}
}
return si;
}

Complete implementation

Let’s write the complete code below:

2 4 6 8 9 10 11 12 15 18 -1
Complete implementation