Problem Solving: Searching in Sorted Arrays
We will learn efficient ways of searching in sorted arrays.
Does sorted data give any advantage in terms of searching?
Yes! We can make an extremely efficient algorithm for searching if the data is sorted.
Instruction: Use the following playground to write the complete code for the task in this lesson.
2 4 6 8 9 10 11 12 15 18 -1
Search in a sorted array
Task 1: Given a sorted (ascending order) array and a value target
, search the index of target
inside this array.
Program: Search in a sorted array
Data: 2 4 6 8 9 10 11 12 15 18
Element: 11
11 is found at index :6
Naive approach 1
Here is the idea:
- To search a value
target
, we will compare thetarget
with each element of the sorted array, starting from the 0th index onwards.- If the
D[di]==target
, then we will return the indexdi
. - If
D[di] > target
, then we immediately return-1
(as the values coming up will be increasing further because the array is sorted). - Otherwise, we keep traversing the array.
- If the
- If we traverse the array completely and don’t find the
target
, we’ll return-1
.
Let’s write the implementation.
int searchInSorted(int D[ ] , int size, int target){for(int di=0; di<size; di++) // iterate from the 0 to the size-1.{if(D[di] == target) return di; // value is found no need to loop furtherif(D[di] > target) return -1; // Not only D[di] is bigger but also all the} // the remaining values hence it should return -1return -1; // the loop terminates, it means `t` is still bigger than// the entire data, hence `t` does not exist and returns `-1`.}
Instruction: Write the code in the following playground and test it on the main()
given above.
Naive approach 2
In the above program, we can see that if during the iteration we reached a value that is bigger than the target
, we don’t need to check further. Similarly, we can design another algorithm in which the loop runs in reverse order and, if D[di]<target
, then we don’t need to look further towards the left of the array again. You may check this in the above playground with this implementation:
int searchInSorted(int D[ ] , int size, int target)
{
for(int di=size-1; di>=0; di--) // iterate from the 0 to the size-1.
{
if(D[di] == target) return di; // value is found no need to loop further
if(D[di] < target) return -1; // Not only D[di] is smaller but also all the
} // the remaining values hence it should return -1
return -1; // the loop terminates, it means `t` is still smaller than
// the entire data, hence `t` does not exist and returns `-1`.
}
So the question arises: can we do it better by combining both the naive approaches?
Yes, by doing a binary search!
Binary search
Given a sorted array and value target
, we have to search the target
in a sorted array using the binary search method (also known as the halving technique).
Sample input
Program: Binary Search
Data: 2 4 6 8 9 10 11 12 15 18
Element: 11
11 is found at index: 6
Binary search is an efficient algorithm to find an element from the sorted array.
The technique repeatedly divides the search range inside the array into halves until we get the exact location.
Look at the animation below to understand the working of the binary search technique.
The idea’s inspiration is from the above two naive ideas. Let us look into the algorithm.
Algorithm’s idea:
-
Step 0: First, we define the range starting with the index
si = 0
and ending with the indexei = size-1
. -
Step 1: Assuming we’re using integer division, we calculate
mid = (si+ei)/2
. If the result of this expression is not an integer, then it will be rounded down to the nearest integer. This is because themid
variable is an integer and so it can only store whole number values.- If
D[mid]
is equal to thetarget
, it means we have found the index that we were looking for. Therefore, we returnmid
. - If the
target
is smaller thanD[mid]
, we search on the left side of the middle element. Therefore, we must shrink the space by updatingend
withmid-1
. - If the
target
is greater thanD[mid]
, we search on the right side of the middle element. Therefore, we must shrink the space again by updatingstart
withmid+1
. - Keep repeating Step 1, until the range finishes.
- If
-
Step 2: If step 1 ends up failing the search, it means that the value
t
does not exist. So we return-1
.
The prototype of the function is the following:
int binarySearching(int D[], int size, int target);
Let’s implement our function:
int binarySearching(int D[ ], int size, int target){int si = 0, ei = size-1; // si: starting index, ei: ending index// mi: mid indexwhile(si<=ei){int mi = (si+ei)/2; // Calculate the midpointif(D[mi]==target) // If middle element equals t return middle index{return mi;}else if(D[mi]>target) // If t is smaller than middle element{ // search in left side of middle elementei = mi-1;}else // Search in right side of middle element{si = mi+1;}}return -1;}
- Line 7: We calculate a
midpoint
on the basis of the start and endpoint. - Lines 9–12: If the
target
equals a middle elementD[mi]
, then we return the middle indexmi
. - Lines 13–16: If the
target
is smaller than the middle elementD[mi]
, then we search on the left half of the middle element. Hence, the range will shrink fromsi
tomi-1
. The way to fix this is by updatingei
tomi-1
. - Lines 17–20: If the
target
is greater than the middle elementD[mi]
, then we search on the right half of the middle element. Hence, the new range ismi+1
toei
. This means thatsi
should be moved tomi+1
.
Instruction: Write the code in the exercise playground and test it.
Remember: The halving procedure, i.e., the loop of the binary search, is always
by , where is the size of the data array bounded The maximum number of times something can occur D[ ]
.
How fast is binary searching?
Imagine if someone tells you that the entire data of the internet (including Google, Microsoft, Facebook, and Amazon) is around 1200 million terabytes, i.e., approximately , which is approximately . If someone then tells you that all this data is stored and sorted in some structured form where binary searching can be applied, then you know that you can definitely find a specific element within this humongous data quickly and efficiently.
To give you an idea, the number of times our binary searching loop will execute to search inside this massive data will be as follows:
This is extremely fast, and the entire data can be pruned with this entire range-shrinking algorithm within only 51 iterations.
This is incredibly efficient and super-fast. Isn’t it?
Note: By the time you will read this lesson, maybe some more thousands of million terabytes will be added to this massive data; that will not disturb our calculation above because it will only add 2 to 3 more iterations. The difference between 51, 53, or 55 does not matter much.
Application: Searching in a k-shift sorted array
Task 2: Given a sorted array, shifted k
times to the right (circularly, i.e., the last element comes to the first), search efficiently in this array.
Data: 2 4 6 8 9 10 11 12 15 18
How can we shift the array k
times? Look at the example below:
k=1Data: 18 2 4 6 8 9 10 11 12 15k=2Data: 15 18 2 4 6 8 9 10 11 12k=3Data: 12 15 18 2 4 6 8 9 10 11
Sample input
Program: Searching in a sorted array with k shifts to the right.
Data: 12 15 18 2 4 6 8 9 10 11
K: 3
Element: 4
Sample output
4 is found at index:4
Idea:
If we observe the k
shifted sorted array, it is divided into two sorted arrays: first, k
elements and then the remaining size-k
elements. We can actually solve this problem using two calls to the binary search method and by combining the results. Here is how we do it:
-
We search the
target
in the shifted elements, which start fromD[0]
and go up toD[k-1]
, i.e.,BinarySearch(D, k, target)
. -
If we find the element in the first array, we return with the found index.
-
If the
target
does not exist in the first part, then we will search the number in the second sorted array, which starts fromD+k
, while the remaining size of the array afterD+k
will besize-k
. So, we’ll haveBinarySearch(D+k, size-k, target)
. -
If the second binary search returns the target index, then we must add
k
to it as the actual index. This is because the second array is starting from thek
th index which is the0
th index of the second array.
For example, we have a k
-shifted array with k=3
:
12 15 18 2 4 6 8 9 10 11
These are basically two arrays:
i) 12 15 18 --------- ii) 2 4 6 8 9 10 11
Case 1: Searching for 15
Binary Search result: 2
The answer is: 2
Case 2: Searching for 9
Binary Search result: 4
The answer should be: 7 (the size of the first array should be added 3+4=7)
Let’s implement the function SearchingKShifted()
:
int searchingKShifted(int D[ ], int size, int k, int target){// K Shifted array has basically two regions hence we can call two Binary Searches separatelyint si;si = binarySearching(D, k,target);if(si==-1) // value not found in the first k-subarray{si = binarySearching(D+k, size-k,target);// binary searching starting at D[k] as base address// till size-k entries i.e. BinarySearching(D[k...size-1])if(si!=-1){si+=k;}}return si;}
Complete implementation
Let’s write the complete code below:
2 4 6 8 9 10 11 12 15 18 -1