8 data structures every Python programmer needs to know

Feb 16, 2021

What are data structures?

Data structures are code structures for storing and organizing data that make it easier to modify, navigate, and access information. Data structures determine how data is collected, the functionality we can implement, and the relationships between data.

Data structures are used in almost all areas of computer science and programming, from operating systems, to front-end development, to machine learning.

Data structures help to:

Manage and utilize large datasets

Quickly search for particular data from a database

Build clear hierarchical or relational connections between data points

Simplify and speed up data processing

Data structures are vital building blocks for efficient, real-world problem solving. Data structures are proven and optimized tools that give you an easy frame to organize your programs. After all, there’s no need for you to remake the wheel (or structure) every time you need it.

Each data structure has a task or situation it is best suited to. Python has four built-in data structures: lists, dictionaries, tuples, and sets. These built-in data structures come with default methods and behind-the-scenes optimizations that make them easy to use.

Most data structures in Python are modified forms of these or use the built-in structures as their backbone.

List: An array-like, mutable, ordered collection. Elements can be of any type, although lists usually hold items of the same type.
Tuple: Tuples are immutable sequences, meaning the elements cannot be changed after creation. They’re declared with parentheses instead of square brackets.
Set: Sets are unordered collections of unique elements, meaning that elements are unindexed and have no set sequence. They’re declared with curly braces.
Dictionary (dict): Similar to hash maps or hash tables in other languages, a dictionary is a collection of key/value pairs. You initialize an empty dictionary with empty curly braces and fill it with colon-separated keys and values. All keys are unique, hashable (typically immutable) objects.
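As a quick illustration of the four built-in structures described above, here is a minimal sketch. The variable names and sample values are made up for this example.

```python
# A list is mutable and can hold elements of different types.
items = [1, "two", 3.0]
items.append(4)

# A tuple is immutable: assigning to an index raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# A set keeps only unique elements and has no index positions.
unique = {1, 2, 2, 3}

# A dictionary starts empty with {} and maps unique keys to values.
capitals = {}
capitals["France"] = "Paris"
capitals["Italy"] = "Rome"

print(items)
print(tuple_is_immutable)
print(len(unique))
print(capitals["Italy"])
```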

Now, let’s see how we can use these structures to create all the advanced structures interviewers are looking for.

Queues in Python

We could use a Python list with the append() and pop(0) methods to implement a queue. However, this is inefficient because lists must shift all remaining elements by one index whenever you add or remove an element at the beginning.

Instead, it’s best practice to use the deque class from Python’s collections module. Deques are optimized for append and pop operations at both ends. The deque implementation also allows you to create double-ended queues, which can add and remove items on either side through the append(), appendleft(), pop(), and popleft() methods.
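A minimal sketch of a queue built on collections.deque might look like the following. The element values are placeholders; append() and popleft() give first-in, first-out behavior, while appendleft() and pop() work on the opposite ends.

```python
from collections import deque

queue = deque()

# Enqueue at the right end.
queue.append("a")
queue.append("b")
queue.append("c")

# Dequeue from the left end: the oldest element comes out first.
first = queue.popleft()

# A deque is double-ended, so both sides are accessible.
queue.appendleft("z")  # add to the front
last = queue.pop()     # remove from the back

remaining = list(queue)
print(first, last, remaining)
```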

Stacks in Python

A stack is a sequential data structure that acts as the last-in, first-out (LIFO) counterpart of a queue. The last element inserted in a stack is considered to be at the top of the stack and is the only accessible element. To access a middle element, you must first remove enough elements to make the desired element the top of the stack.

Many developers imagine stacks as a stack of dinner plates; you can add or remove plates to the top of the stack but must move the whole stack to place one at the bottom.
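The plate analogy can be sketched with a plain Python list, using append() to push and pop() to remove from the top. This is one simple way to model a stack, not the only one; the plate names are invented for the example.

```python
stack = []

# Push plates onto the top of the stack.
stack.append("plate 1")
stack.append("plate 2")
stack.append("plate 3")

# Peek at the top element without removing it.
top = stack[-1]

# Pop removes the most recently added element (last in, first out).
removed = stack.pop()

print(top, removed, stack)
```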

Trees in Python

Trees are another relation-based data structure, one that specializes in representing hierarchical structures. Like a linked list, a tree is populated with Node objects that each contain a data value and one or more pointers that define the node’s relation to its immediate neighbors.

Each tree has a root node that all other nodes branch off from. The root contains pointers to all elements directly below it, which are known as its child nodes. These child nodes can then have child nodes of their own. Binary trees cannot have nodes with more than two child nodes.

Any nodes on the same level are called sibling nodes. Nodes with no connected child nodes are known as leaf nodes.

The most common application of the binary tree is a binary search tree. Binary search trees excel at searching large collections of data, as the time complexity depends on the depth of the tree rather than the number of nodes.

Binary search trees have four strict rules:

The left subtree contains only nodes with elements less than the root.

The right subtree contains only nodes with elements greater than the root.

Left and right subtrees must also be binary search trees. They must follow the above rules with the “root” of their tree.

There can be no duplicate nodes, i.e. no two nodes can have the same value.

class Node:
    def __init__(self, data):
        self.left = None
        self.right = None
        self.data = data

    def insert(self, data):
        # An empty node (data is None) simply takes the new value
        if self.data is None:
            self.data = data
        # Smaller values go into the left subtree
        elif data < self.data:
            if self.left is None:
                self.left = Node(data)
            else:
                self.left.insert(data)
        # Larger values go into the right subtree
        elif data > self.data:
            if self.right is None:
                self.right = Node(data)
            else:
                self.right.insert(data)
        # Equal values are ignored, so the tree never holds duplicates

    # Print the tree in sorted order (left subtree, node, right subtree)
    def PrintTree(self):
        if self.left:
            self.left.PrintTree()
        print(self.data)
        if self.right:
            self.right.PrintTree()

# Use the insert method to add nodes
root = Node(12)
root.insert(6)
root.insert(14)
root.insert(3)

root.PrintTree()
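To show why search cost follows the depth of the tree, here is a hedged sketch of a lookup that counts how many nodes it visits. BSTNode, insert, and contains are hypothetical names introduced for this example; they mirror the Node class above but are not part of it.

```python
class BSTNode:
    """Hypothetical minimal node used only for this illustration."""

    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def insert(node, data):
    """Insert data under node and return the (possibly new) subtree root."""
    if node is None:
        return BSTNode(data)
    if data < node.data:
        node.left = insert(node.left, data)
    elif data > node.data:
        node.right = insert(node.right, data)
    return node  # duplicates are ignored


def contains(node, target):
    """Return (found, nodes_visited) by walking one path from the root."""
    steps = 0
    while node is not None:
        steps += 1
        if target == node.data:
            return True, steps
        # Each comparison discards one whole subtree.
        node = node.left if target < node.data else node.right
    return False, steps


root = None
for value in (12, 6, 14, 3):
    root = insert(root, value)

found, steps = contains(root, 3)   # path: 12, then 6, then 3
missing, _ = contains(root, 7)
print(found, steps, missing)
```

The lookup only ever follows a single root-to-leaf path, so the number of visited nodes is bounded by the height of the tree.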

Advantages:

Good for representing hierarchical relationships
Dynamic size, great at scale
Quick insert and delete operations
In a binary search tree, inserted nodes are sequenced immediately.
Binary search trees are efficient at searches; search time is only $O(\text{height})$.

Disadvantages:

Time expensive, $O(\log n)$, to modify or “balance” trees or retrieve elements from a known location
Child nodes hold no information on their parent node and can be hard to traverse backwards
Fast search only works while the ordering rules hold and the tree stays reasonably balanced. Otherwise, searching degrades into a linear search.

Graphs in Python

Graphs are primarily used to convey visual web-structure networks in code form. These structures can model many different types of relationships, such as hierarchies, branching structures, or simply an unordered relational web. The versatility and intuitiveness of graphs make them a favorite for data science with Python.

When written in plain text, graphs have a list of vertices and edges:

V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}

In Python, graphs are best implemented using a dictionary with the name of each vertex as a key and a list of the vertices it shares an edge with as the value.
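The vertex and edge sets above can be turned into such a dictionary as follows. This sketch assumes the edges are undirected, which the plain-text notation does not state, so each edge is recorded on both of its endpoints.

```python
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]

# Start every vertex with an empty list of neighbors.
graph = {vertex: [] for vertex in vertices}

# Assumption: edges are undirected, so add each edge in both directions.
for u, v in edges:
    graph[u].append(v)
    graph[v].append(u)

for vertex, neighbors in graph.items():
    print(vertex, neighbors)
```

For a directed graph, you would drop the second append so that each edge is stored only on its starting vertex.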

Hash tables in Python

Each input key goes through a hash function that converts it from its starting form into an integer value, called a hash. Hash functions must always produce the same hash from the same input, must compute quickly, and must produce fixed-length values. Python includes a built-in hash() function that speeds up implementation.

The table then uses the hash to find the general location of the desired value, called a storage bucket. The program then only has to search this subgroup for the desired value rather than the entire data pool.
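The key-to-bucket step can be sketched in a few lines. bucket_index and num_buckets are hypothetical names for this example, and taking the hash modulo the bucket count is one common way to pick a bucket, assumed here rather than required.

```python
num_buckets = 4


def bucket_index(key, num_buckets):
    """Map a key to a bucket number in the range [0, num_buckets)."""
    return hash(key) % num_buckets


# The same key always lands in the same bucket within one program run.
# (String hashes can differ between runs, so no fixed value is shown.)
index_a = bucket_index("Italy", num_buckets)
index_b = bucket_index("Italy", num_buckets)

# In CPython, small integers hash to themselves, so 42 maps to 42 % 4.
int_index = bucket_index(42, num_buckets)

print(index_a == index_b, int_index)
```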

Beyond this general framework, hash tables can be very different depending on the application. Some may allow keys of different data types, while others may set up their buckets differently or use different hash functions.

Here is an example of a hash table in Python code:

import pprint


class Hashtable:
    def __init__(self, elements):
        # One bucket per initial element (at least one, so the modulo below is safe)
        self.bucket_size = max(1, len(elements))
        self.buckets = [[] for _ in range(self.bucket_size)]
        self._assign_buckets(elements)

    def _assign_buckets(self, elements):
        for key, value in elements:
            hashed_value = hash(key)  # calculates the hash of each key
            index = hashed_value % self.bucket_size  # positions the element in a bucket using the hash
            self.buckets[index].append((key, value))  # adds a (key, value) tuple to the bucket

    def get_value(self, input_key):
        hashed_value = hash(input_key)
        index = hashed_value % self.bucket_size
        bucket = self.buckets[index]
        # Only the matching bucket is searched, not the whole table
        for key, value in bucket:
            if key == input_key:
                return value
        return None

    def __str__(self):
        return pprint.pformat(self.buckets)  # pformat returns a printable representation of the object


if __name__ == "__main__":
    capitals = [
        ('France', 'Paris'),
        ('United States', 'Washington D.C.'),
        ('Italy', 'Rome'),
        ('Canada', 'Ottawa')
    ]
    hashtable = Hashtable(capitals)
    print(hashtable)
    print(f"The capital of Italy is {hashtable.get_value('Italy')}")

What to learn next

There are dozens of interview questions and formats for each of these 8 data structures. The best way to learn Python for the interview process is to keep trying hands-on practice problems.

To help you become a data structure expert, Educative has created the Ace the Python Coding Interview Path. This curated Learning Path includes hands-on practice material for all the most discussed concepts like data structures, recursion, and concurrency.

By the end, you’ll have completed over 250 practice problems and have the hands-on experience to crack any interview question.

Happy learning!

Written By:

Ryan Thelin
