Why Data Structures and Algorithms are Important
Learn about the need for knowledge of algorithms in problem-solving.
We'll cover the following
The need for knowledge of algorithms
Algorithms are everywhere. Every branch of computer science is highly influenced by the use of good algorithms. Whether it’s web security or managing operating system tasks, algorithms are needed in all areas. So, learning about algorithms is a must.
Like computer science, algorithms are also used heavily in Data science. In data science, we use many different algorithms: recommendation algorithms to recommend movies to customers, prediction algorithms to predict house prices, classification algorithms to classify an image as cat or dog, and much more. Each of these algorithms is composed of smaller algorithms that need to be written efficiently so that the master algorithm can perform better.
Tech companies also understand the importance of algorithmic knowledge. That’s why tech companies will conduct one or more rounds of algorithm-related interviews before hiring for any data science position. The aim of these interviews is to understand the thinking and problem-solving capabilities of the interviewee.
In general, the interviewer starts by asking a question inspired by real life. It is expected of the interviewee to solve the problem efficiently (both in terms of time and space complexity).
After a verbal or pseudocode discussion, the next challenge is to write the code in a programming language.
In this section, we will learn about different types of algorithms, and we will discuss a general algorithm toolbox that can be applied to solve a problem and write the effective code in Python. Solving the problem verbally is not enough to clear the interviews. You have to put your solution into working code, and make sure that it will pass all the boundary conditions.
Your first challenge
Now, let’s understand the need for the right technique by using an example. Consider an array of n elements. The array consists of numbers from to in random order. All numbers in this array are unique, except one. You have to find the repeating number.
Method 1: Going easy! Use loops
The first solution that may come to mind is to use two loops. Start with the first element in the first loop, and the second element in the second loop, and go until the end of the array.
If the value of the ith element of the first loop is equal to the jth element of the second loop, we are done. We report that element and come out of the loops.
Let’s implement this solution.
#Defination of the methoddef method1(numbers):#Checking for boundary caseif len(numbers)<2:return "Invalid!!!"#Outer Loopfor i in range(0,len(numbers)):#Inner Loopfor j in range(i+1,len(numbers)):#If both the numbers are equal return itif numbers[i]==numbers[j]:return numbers[i]#Test casesnumbers = [2,3,4,1,2]print(method1(numbers))numbers = [5,2,3,5,4,1]print(method1(numbers))numbers = [1,1]print(method1(numbers))numbers = [0]print(method1(numbers))numbers = [1,9,2,8,3,7,4,7,5,6]print(method1(numbers))
This is great! We got the solution on the first attempt. Should we start celebrating?
Not quite yet. Let’s analyze the algorithm more deeply.
In the above algorithms, we used two loops (one inside of the other). That means for each array of size n, we are running it times. So, the complexity of the code is ().
Can we do better?
Method 2: Use a hash table or the Python dictionary
Instead of using two loops we can use a Python dictionary (which works like a hash table) or set to get the repeated element.
First, define an empty dictionary. Each time we get an element, check if it is already available in the dictionary. If yes, the problem is solved: we got the numbers we wanted to find. If not, add the item to the dictionary and keep going. Let’s implement this solution.
#Defination of the methoddef method2(numbers):#Checking for boundary conditionsif len(numbers)<2:return "Invalid!!!"#Defining a hash tablesolution_hash = dict()#Checking the existance of the number in the dictionaryfor item in numbers:if item in solution_hash:return item#If number is not available, add it.else:solution_hash[item] = 1#Test casesnumbers = [2,3,4,1,2]print(method2(numbers))numbers = [5,2,3,5,4,1]print(method2(numbers))numbers = [1,1]print(method2(numbers))numbers = [0]print(method2(numbers))numbers = [1,9,2,8,3,7,4,7,5,6]print(method2(numbers))
Much better. We found a way to get our solution using only one loop instead of two. Accessing to the elements contained in the dictionary takes time.
But wait! In this solution, we use a data structure dictionary. We are creating a dictionary of the approximate same size as the array of numbers. Doing so takes up some space in the memory, so the space complexity also increases to .
Can we do better? Can we stay at an running time complexity without using any additional space?
Yes, we can.
Method 3: Mathematics are here to help
Now, let’s use our mathematical knowledge to better understand the problem. We have given that for an n size array, we have numbers from to . We know that one number is repeating. So, we can calculate the sum of the numbers from 1 to using a series sum formula. We can also calculate the sum of the elements of the given array. Subtracting a series sum from the array sum will give you the required number.
#Defination of the methoddef method3(numbers):#Checking for boundary casesif len(numbers)<2:return "Invalid!!!"#Calculating series sumseries_sum = (len(numbers)*(len(numbers)-1))/2#Storing sum in running arrayarray_element_sum = 0#Performing the running sumfor item in numbers:array_element_sum+=item#Calculation for repeating numberrepeated_number = array_element_sum - series_sum#Returning the repeated numberreturn int(repeated_number)#Test casesnumbers = [2,3,4,1,2]print(method3(numbers))numbers = [5,2,3,5,4,1]print(method3(numbers))numbers = [1,1]print(method3(numbers))numbers = [0]print(method3(numbers))numbers = [1,9,2,8,3,7,4,7,5,6]print(method3(numbers))
This is good. We got the solution without using any extra space and still solved the problem in .
It is always necessary to design the best solution for the problem. In the following lessons, we will help you build your algorithm toolbox, which can be applied to problems to find the optimal solution.
Interview question:
Interview question:
Why should we avoid writing algorithms with exponential complexity?