Why Data Structures and Algorithms are Important

Learn about the need for knowledge of algorithms in problem-solving.

The need for knowledge of algorithms

Algorithms are everywhere. Every branch of computer science is highly influenced by the use of good algorithms. Whether it’s web security or managing operating system tasks, algorithms are needed in all areas. So, learning about algorithms is a must.

Like computer science, algorithms are also used heavily in Data science. In data science, we use many different algorithms: recommendation algorithms to recommend movies to customers, prediction algorithms to predict house prices, classification algorithms to classify an image as cat or dog, and much more. Each of these algorithms is composed of smaller algorithms that need to be written efficiently so that the master algorithm can perform better.

Tech companies also understand the importance of algorithmic knowledge. That’s why tech companies will conduct one or more rounds of algorithm-related interviews before hiring for any data science position. The aim of these interviews is to understand the thinking and problem-solving capabilities of the interviewee.

In general, the interviewer starts by asking a question inspired by real life. It is expected of the interviewee to solve the problem efficiently (both in terms of time and space complexity).

After a verbal or pseudocode discussion, the next challenge is to write the code in a programming language.


In this section, we will learn about different types of algorithms, and we will discuss a general algorithm toolbox that can be applied to solve a problem and write the effective code in Python. Solving the problem verbally is not enough to clear the interviews. You have to put your solution into working code, and make sure that it will pass all the boundary conditions.

Your first challenge

Now, let’s understand the need for the right technique by using an example. Consider an array of n elements. The array consists of numbers from 11 to (n1)(n-1) in random order. All numbers in this array are unique, except one. You have to find the repeating number.

Method 1: Going easy! Use loops

The first solution that may come to mind is to use two loops. Start with the first element in the first loop, and the second element in the second loop, and go until the end of the array.

If the value of the ith element of the first loop is equal to the jth element of the second loop, we are done. We report that element and come out of the loops.

Let’s implement this solution.

Press + to interact
#Defination of the method
def method1(numbers):
#Checking for boundary case
if len(numbers)<2:
return "Invalid!!!"
#Outer Loop
for i in range(0,len(numbers)):
#Inner Loop
for j in range(i+1,len(numbers)):
#If both the numbers are equal return it
if numbers[i]==numbers[j]:
return numbers[i]
#Test cases
numbers = [2,3,4,1,2]
print(method1(numbers))
numbers = [5,2,3,5,4,1]
print(method1(numbers))
numbers = [1,1]
print(method1(numbers))
numbers = [0]
print(method1(numbers))
numbers = [1,9,2,8,3,7,4,7,5,6]
print(method1(numbers))

This is great! We got the solution on the first attempt. Should we start celebrating?

Not quite yet. Let’s analyze the algorithm more deeply.

In the above algorithms, we used two loops (one inside of the other). That means for each array of size n, we are running it nnn*n times. So, the complexity of the code is OO(nn22).

Can we do better?

Method 2: Use a hash table or the Python dictionary

Instead of using two loops we can use a Python dictionary (which works like a hash table) or set to get the repeated element.

First, define an empty dictionary. Each time we get an element, check if it is already available in the dictionary. If yes, the problem is solved: we got the numbers we wanted to find. If not, add the item to the dictionary and keep going. Let’s implement this solution.

Press + to interact
#Defination of the method
def method2(numbers):
#Checking for boundary conditions
if len(numbers)<2:
return "Invalid!!!"
#Defining a hash table
solution_hash = dict()
#Checking the existance of the number in the dictionary
for item in numbers:
if item in solution_hash:
return item
#If number is not available, add it.
else:
solution_hash[item] = 1
#Test cases
numbers = [2,3,4,1,2]
print(method2(numbers))
numbers = [5,2,3,5,4,1]
print(method2(numbers))
numbers = [1,1]
print(method2(numbers))
numbers = [0]
print(method2(numbers))
numbers = [1,9,2,8,3,7,4,7,5,6]
print(method2(numbers))

Much better. We found a way to get our solution using only one loop instead of two. Accessing to the elements contained in the dictionary takes O(n)O(n) time.

But wait! In this solution, we use a data structure dictionary. We are creating a dictionary of the approximate same size as the array of numbers. Doing so takes up some space in the memory, so the space complexity also increases to O(n)O(n).

Can we do better? Can we stay at an O(n)O(n) running time complexity without using any additional space?

Yes, we can.

Method 3: Mathematics are here to help

Now, let’s use our mathematical knowledge to better understand the problem. We have given that for an n size array, we have numbers from 11 to n1n-1. We know that one number is repeating. So, we can calculate the sum of the numbers from 1 to n1n-1 using a series sum formula. We can also calculate the sum of the elements of the given array. Subtracting a series sum from the array sum will give you the required number.

Press + to interact
#Defination of the method
def method3(numbers):
#Checking for boundary cases
if len(numbers)<2:
return "Invalid!!!"
#Calculating series sum
series_sum = (len(numbers)*(len(numbers)-1))/2
#Storing sum in running array
array_element_sum = 0
#Performing the running sum
for item in numbers:
array_element_sum+=item
#Calculation for repeating number
repeated_number = array_element_sum - series_sum
#Returning the repeated number
return int(repeated_number)
#Test cases
numbers = [2,3,4,1,2]
print(method3(numbers))
numbers = [5,2,3,5,4,1]
print(method3(numbers))
numbers = [1,1]
print(method3(numbers))
numbers = [0]
print(method3(numbers))
numbers = [1,9,2,8,3,7,4,7,5,6]
print(method3(numbers))

This is good. We got the solution without using any extra space and still solved the problem in O(n)O(n).

It is always necessary to design the best solution for the problem. In the following lessons, we will help you build your algorithm toolbox, which can be applied to problems to find the optimal solution.

Interview question:

Question

Why should we avoid writing algorithms with exponential complexity?

Show Answer