Asymptotic Complexity and Big O Notation
Explore the complexity of algorithms and Big O notation for representation in this lesson.
Asymptotic complexity
There is usually more than one way to solve a problem, and if efficiency is a concern, we should first focus on high-level optimizations by choosing the right algorithms and data structures. A useful way of evaluating and comparing algorithms is by analyzing their asymptotic computational complexity—that is, analyzing how the running time or memory consumption grows when the size of the input increases. In addition, the C++ standard library specifies the asymptotic complexity for all containers and algorithms, which means that a basic understanding of this topic is a must if you are using this library.
Linear search
Let's start off with an example. Suppose we want to write an algorithm that returns true if it finds a specific key in an array, or false otherwise. In order to find out how our algorithm behaves when passed different sizes of the array, we would like to analyze the running time of this algorithm as a function of its input size:
bool linear_search(const std::vector<int>& vals, int key) noexcept {
  for (const auto& v : vals) {
    if (v == key) {
      return true;
    }
  }
  return false;
}
The algorithm is straightforward. It iterates over the elements in the array and compares each element with the key. If we are lucky, we find the key at the beginning of the array and it returns immediately, but we might loop through the entire array without finding the key at all. This would be the worst case for the algorithm, and in general, that is the case we want to analyze.
But what happens with the running time when we increase the input size? Say we double the size of the array. Well, in the worst case, we need to compare all elements in the array, which would double the running time. There seems to be a linear relationship between the input size and the running time. We call this a linear growth rate:
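One way to make this linear relationship concrete is to count the comparisons directly. The sketch below is a hypothetical instrumented variant of the search (the function name and counter are additions for illustration, not part of the lesson's code) that returns how many elements were inspected; in the worst case, when the key is absent, doubling the array size doubles the count.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical instrumented search: returns the number of elements
// inspected instead of a bool, to make the growth rate measurable.
std::size_t linear_search_count(const std::vector<int>& vals, int key) {
  std::size_t inspected = 0;
  for (const auto& v : vals) {
    ++inspected;
    if (v == key) {
      break;
    }
  }
  return inspected;
}
```

Searching a 100-element array for an absent key inspects 100 elements; a 200-element array, 200. The running time grows in direct proportion to the input size.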
Now consider the following algorithm:
struct Point {
  int x_{};
  int y_{};
};

bool linear_search(const std::vector<Point>& a, const Point& key) {
  for (size_t i = 0; i < a.size(); ++i) {
    if (a[i].x_ == key.x_ && a[i].y_ == key.y_) {
      return true;
    }
  }
  return false;
}
We are comparing points instead of integers and we are using an index with the subscript operator to access each element. How is the running time affected by these changes? The absolute running time is probably higher compared to the first algorithm since we are doing more work—for example, the comparison of points involves two integers instead of just one for each element in the array. However, at this stage, we are interested in the growth rate the algorithm exhibits, and if we plot the running time against the input size, we will still end up with a straight line, as shown in the preceding figure.
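If we give Point an equality operator, the same linear scan can be expressed with std::find from <algorithm>. This is a sketch of an alternative formulation, not code from the lesson; the growth rate is still linear, since every element may be compared in the worst case.

```cpp
#include <algorithm>
#include <vector>

struct Point {
  int x_{};
  int y_{};
};

bool operator==(const Point& lhs, const Point& rhs) {
  return lhs.x_ == rhs.x_ && lhs.y_ == rhs.y_;
}

// Same worst case as the handwritten loop: std::find compares
// elements one by one until a match is found or the range ends.
bool linear_search(const std::vector<Point>& a, const Point& key) {
  return std::find(a.begin(), a.end(), key) != a.end();
}
```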
Binary search
As the last example of searching for integers, let's see whether we can find a better algorithm if we assume that the elements in the array are sorted. Our first algorithm works regardless of the order of the elements, but if we know that they are sorted, we can use a binary search. It works by inspecting the element in the middle to determine whether the search should continue in the first or second half of the array. For simplicity, the indexes high, low, and mid are of type int, which requires a static_cast. A better option would be to use iterators, which will be covered in succeeding chapters. Here follows the algorithm:
bool binary_search(const std::vector<int>& a, int key) {
  auto low = 0;
  auto high = static_cast<int>(a.size()) - 1;
  while (low <= high) {
    const auto mid = std::midpoint(low, high); // C++20
    if (a[mid] < key) {
      low = mid + 1;
    } else if (a[mid] > key) {
      high = mid - 1;
    } else {
      return true;
    }
  }
  return false;
}
As you can see, this algorithm is harder to get right than a simple linear scan. It looks for the specified key by first probing the element in the middle of the array. If that element is not the key, it compares the key with the middle element to decide which half of the array to keep searching. So, in each iteration, it cuts the array in half.
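To see the halving at work, here is a hypothetical counting variant of binary_search() (the name and counter are additions for illustration); the midpoint is computed manually so the sketch does not require C++20's std::midpoint, to which it is equivalent for these non-negative indexes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical counting variant: how many times does the loop body run?
std::size_t binary_search_iterations(const std::vector<int>& a, int key) {
  auto low = 0;
  auto high = static_cast<int>(a.size()) - 1;
  std::size_t iterations = 0;
  while (low <= high) {
    ++iterations;
    const auto mid = low + (high - low) / 2;  // same as std::midpoint(low, high)
    if (a[mid] < key) {
      low = mid + 1;
    } else if (a[mid] > key) {
      high = mid - 1;
    } else {
      break;  // found
    }
  }
  return iterations;
}
```

Searching a sorted 64-element array for a key larger than every element takes 7 iterations; 128 elements take 8. Each doubling of the input adds only a single extra iteration.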
Assume we call binary_search() with an array containing 64 elements. In the first iteration, we reject 32 elements; in the next iteration, we reject 16 elements; in the next, 8, and so on, until either there are no more elements to compare or we find the key. For an input size of 64, there will be, at most, 7 loop iterations. What if we double the input size to ...