Data structures are an important part of programming and coding interviews. Working with them shows your ability to reason about complex problems, resolve ambiguity, and recognize coding patterns.
Programmers use data structures to organize data, so the more efficient your data structures are, the better your programs will be.
Today, we will take a deep dive into one of the most popular data structures out there: trees.
Today, we will cover:
Get hands-on with data structures
Data structures are amongst the fundamentals of Computer Science and an important decision in every program. Consequently, they are also largely categorized as a vital benchmark of computer science knowledge when it comes to industry interviews. This course contains a detailed review of all the common data structures and provides implementation level details in Java to allow readers to become well equipped. Now with more code solutions, lessons, and illustrations than ever, this is the course for you!
Data structures are used to store and organize data. We can use algorithms to manipulate and use our data structures. Different types of data are organized more efficiently by using different data structures.
Trees are non-linear data structures. They are often used to represent hierarchical data. For a real-world example, think of a company’s organizational chart: each employee reports to one manager, and the chain of managers leads up to a single person at the top.
Trees are a collection of nodes (vertices), and they are linked with edges (pointers), representing the hierarchical connections between the nodes. A node contains data of any type, but all the nodes must be of the same data type. Trees are similar to graphs, but a cycle cannot exist in a tree. What are the different components of a tree?
Root: The root of a tree is a node that has no incoming link (i.e. no parent node). Think of this as a starting point of your tree.
Children: A child is a node with one incoming link from a node above it (i.e. a parent node). If two child nodes share the same parent, they are called siblings.
Parent: The parent node has an outgoing link connecting it to one or more child nodes.
Leaf: A leaf has a parent node but has no outgoing link to a child node. Think of this as an endpoint of your tree.
Subtree: A subtree is a smaller tree held within a larger tree. The root of that tree can be any node from the bigger tree.
Depth: The depth of a node is the number of edges between that node and the root. Think of this as how many steps there are between your node and the tree’s starting point.
Height: The height of a node is the number of edges in the longest path from a node to a leaf node. Think of this as how many steps there are between your node and the tree’s endpoint. The height of a tree is the height of its root node.
Degree: The degree of a node is the number of children it has, which is the same as the number of sub-trees rooted directly below it.
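To make these terms concrete, here is a minimal sketch that computes them. The TreeNode class and its method names are hypothetical and exist only for this illustration; they are not part of Java’s standard library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type used only for this illustration.
class TreeNode {
    final String name;
    TreeNode parent;                                  // null for the root
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) { this.name = name; }

    // Links the child to this node and returns the child.
    TreeNode add(TreeNode child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    boolean isRoot() { return parent == null; }        // no incoming link
    boolean isLeaf() { return children.isEmpty(); }    // no outgoing link

    // Degree: number of children (one sub-tree per child).
    int degree() { return children.size(); }

    // Depth: number of edges from this node up to the root.
    int depth() {
        int d = 0;
        for (TreeNode n = this; n.parent != null; n = n.parent) d++;
        return d;
    }

    // Height: number of edges on the longest path down to a leaf.
    int height() {
        int h = -1;
        for (TreeNode c : children) h = Math.max(h, c.height());
        return h + 1;                                  // a leaf has height 0
    }
}
```

With a root that has children a and b, and a node c under a, c.depth() and root.height() are both 2, and root.degree() is 2.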
Trees can be applied to many things. The hierarchical structure gives a tree unique properties for storing, manipulating, and accessing data. Trees form some of the most basic organization of computers. We can use a tree for the following:
But, how does that all look in code? To build a tree in Java, for example, we start with the root node.
Node<String> root = new Node<>("root");
Once we have our root, we can add our first child node using addChild, which adds a child node and assigns it to a parent node. Adding nodes is called insertion, and removing nodes is called deletion.
Node<String> node1 = root.addChild(new Node<String>("node 1"));
We continue adding nodes using that same process until we have a complex hierarchical structure. In the next section, let’s look at the different kinds of trees we can use.
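Java’s standard library does not ship a general-purpose Node class, so the snippets above assume one exists. Here is one possible sketch of such a class. Everything beyond the constructor and addChild (for example removeChild and the getters) is a hypothetical addition for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generic node; one way the Node<T> used above could be written.
class Node<T> {
    private final T data;
    private Node<T> parent;
    private final List<Node<T>> children = new ArrayList<>();

    Node(T data) { this.data = data; }

    // Insertion: attach the child to this node and return it for chaining.
    Node<T> addChild(Node<T> child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    // Deletion: detach a direct child, along with its whole sub-tree.
    boolean removeChild(Node<T> child) {
        boolean removed = children.remove(child);
        if (removed) child.parent = null;
        return removed;
    }

    T getData() { return data; }
    Node<T> getParent() { return parent; }
    List<Node<T>> getChildren() { return children; }
}

class NodeDemo {
    public static void main(String[] args) {
        Node<String> root = new Node<>("root");
        Node<String> node1 = root.addChild(new Node<String>("node 1"));
        node1.addChild(new Node<>("node 1.1"));
        System.out.println(root.getChildren().size());    // prints 1
        System.out.println(node1.getParent().getData());  // prints root
    }
}
```

Because each node keeps a list of children, this sketch places no limit on how many children a node can have.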
There are many types of trees that we can use to organize data differently within a hierarchical structure. The tree we use depends on the problem we are trying to solve. Let’s take a look at the trees we can use in Java. We will be covering:
In an N-ary tree, each node can have anywhere from 0 to N child nodes. For example, in a 2-ary tree (also called a Binary Tree), each node has at most 2 child nodes.
Note: The balance factor of a node is the height difference between the left and right subtrees.
A balanced tree is a tree whose leaf nodes are all at, or very close to, the same level. The condition applies to every sub-tree as well, meaning that all sub-trees must be balanced. In other words, the tree must be height-balanced: for every node, the difference between the heights of its left and right subtrees does not exceed one. Here is a visual representation of a balanced tree.
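The height-balance condition can be checked with a short recursive sketch. The BinNode and BalanceCheck names are hypothetical, and this version recomputes heights at every node for clarity rather than speed.

```java
// Hypothetical binary node for this illustration.
class BinNode {
    int value;
    BinNode left, right;
    BinNode(int value) { this.value = value; }
}

class BalanceCheck {
    // Height in edges; an empty subtree has height -1.
    static int height(BinNode n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Balance factor: height of the left subtree minus height of the right.
    static int balanceFactor(BinNode n) {
        return height(n.left) - height(n.right);
    }

    // Balanced when every node's balance factor is -1, 0, or +1.
    static boolean isBalanced(BinNode n) {
        if (n == null) return true;
        return Math.abs(balanceFactor(n)) <= 1
                && isBalanced(n.left)
                && isBalanced(n.right);
    }
}
```

A root with one child on each side is balanced, while a chain of three nodes that all hang to the right is not.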
There are three main types of binary trees based on their structures.
A complete binary tree exists when every level, excluding the last, is filled and all nodes at the last level are as far left as they can be. Here is a visual representation of a complete binary tree.
A full binary tree (sometimes called a proper binary tree) exists when every node, excluding the leaves, has two children. In other words, every node has either zero or two children; unlike a complete tree, the levels do not have to be filled from the left. Look at this diagram to understand how a full binary tree looks.
A perfect binary tree should be both full and complete. All interior nodes should have two children, and all leaves must have the same depth. Look at this diagram to understand how a perfect binary tree looks.
Note: You can also have a skewed binary tree, where every node has only a left child or only a right child. It is best practice to avoid this shape, because the tree degenerates into a linked list and searching for a node takes time proportional to the number of nodes.
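The three shapes can be told apart in code. The sketch below is one possible way to test them; ShapeNode and TreeShapes are hypothetical names. Note that isPerfect compares the node count against the maximum possible for the tree’s height, because a tree can be both full and complete without having all of its leaves at the same depth.

```java
import java.util.LinkedList;
import java.util.Queue;

// Hypothetical node that only records the tree's shape.
class ShapeNode {
    ShapeNode left, right;
    ShapeNode(ShapeNode left, ShapeNode right) { this.left = left; this.right = right; }
}

class TreeShapes {
    // Full: every node has either zero or two children.
    static boolean isFull(ShapeNode n) {
        if (n == null) return true;
        if ((n.left == null) != (n.right == null)) return false;
        return isFull(n.left) && isFull(n.right);
    }

    // Complete: reading level by level, no node appears after the first gap.
    static boolean isComplete(ShapeNode root) {
        Queue<ShapeNode> queue = new LinkedList<>();  // LinkedList accepts null entries
        queue.add(root);
        boolean seenGap = false;
        while (!queue.isEmpty()) {
            ShapeNode n = queue.remove();
            if (n == null) { seenGap = true; continue; }
            if (seenGap) return false;
            queue.add(n.left);
            queue.add(n.right);
        }
        return true;
    }

    static int height(ShapeNode n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    static int size(ShapeNode n) {
        return n == null ? 0 : 1 + size(n.left) + size(n.right);
    }

    // Perfect: a tree of height h holds the maximum 2^(h+1) - 1 nodes.
    static boolean isPerfect(ShapeNode n) {
        return size(n) == (1 << (height(n) + 1)) - 1;
    }
}
```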
A Binary Search Tree is a binary tree in which every node has a key and an associated value. This allows for quick lookup and edits (additions or removals), hence the name “search”. A Binary Search Tree has strict conditions based on its node value. It’s important to note that every Binary Search Tree is a binary tree, but not every binary tree is a Binary Search Tree.
What makes them different? In a Binary Search Tree, the left subtree of a node must contain only nodes with keys less than that node’s key, while the right subtree must contain only nodes with keys greater than that node’s key. Take a look at this visual to understand this condition.
In this example, the node Y is a parent node with two child nodes. All nodes in subtree 1 must have a value less than node Y, and all nodes in subtree 2 must have a value greater than node Y.
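To see why the condition covers whole subtrees and not just direct children, here is a minimal sketch of a validity check. KeyNode and BstCheck are hypothetical names, and the sketch assumes integer keys with no duplicates.

```java
// Hypothetical node holding only a key, for this illustration.
class KeyNode {
    int key;
    KeyNode left, right;
    KeyNode(int key, KeyNode left, KeyNode right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }
}

class BstCheck {
    static boolean isBst(KeyNode root) {
        // long bounds so that any int key fits strictly inside them
        return isBst(root, Long.MIN_VALUE, Long.MAX_VALUE);
    }

    // Every key in this subtree must lie strictly between low and high.
    private static boolean isBst(KeyNode n, long low, long high) {
        if (n == null) return true;
        if (n.key <= low || n.key >= high) return false;
        return isBst(n.left, low, n.key)      // left keys must be smaller than n.key
            && isBst(n.right, n.key, high);   // right keys must be larger than n.key
    }
}
```

A tree with root 5, left child 3, and a node 6 as the right child of 3 fails the check: 6 is greater than its parent 3, but it sits in the left subtree of 5.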
AVL trees are a special type of Binary Search tree that are self-balanced by checking the balance factor of every node. The balance factor should either be +1, 0, or -1. The maximum height difference between the left and right sub-trees can only be one.
If this difference becomes more than one, we must re-balance our tree to make it valid using rotation techniques. These are most common for applications where searching is the most important operation. Look at this visual to see a valid AVL tree.
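As a rough sketch of what a rotation does, the code below shows single right and left rotations on a hypothetical AvlNode class. A full AVL implementation also needs double rotations and the insertion logic that decides which rotation to apply, both of which are left out here.

```java
// Hypothetical AVL node that caches its own height.
class AvlNode {
    int key;
    int height;              // height in edges; a leaf has height 0
    AvlNode left, right;
    AvlNode(int key) { this.key = key; }
}

class AvlRotation {
    static int height(AvlNode n) { return n == null ? -1 : n.height; }

    // Recomputes a node's cached height from its children.
    static void update(AvlNode n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    static int balanceFactor(AvlNode n) {
        return height(n.left) - height(n.right);
    }

    // Right rotation: lifts the left child when the left side is too tall.
    static AvlNode rotateRight(AvlNode y) {
        AvlNode x = y.left;
        y.left = x.right;    // x's right subtree moves under y
        x.right = y;
        update(y);           // y is now lower, so update it first
        update(x);
        return x;            // x is the new root of this subtree
    }

    // Left rotation: the mirror image, for a right side that is too tall.
    static AvlNode rotateLeft(AvlNode x) {
        AvlNode y = x.right;
        x.right = y.left;
        y.left = x;
        update(x);
        update(y);
        return y;
    }
}
```

For the left-leaning chain 3, 2, 1, the root has a balance factor of +2. After rotateRight, node 2 becomes the root with 1 and 3 as its children, and its balance factor is 0.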
A red-black tree is another type of self-balancing Binary Search Tree, but its balancing rules differ from those of an AVL tree. Each node is colored either red or black, and the colors are used to re-balance the tree after an insertion or deletion. Because the balance requirement is looser than in an AVL tree, re-balancing typically needs fewer rotations. So, how do we color our nodes?
A 2-3 tree is very different from what we’ve learned so far. Unlike a Binary Search Tree, a 2-3 Tree is a self-balancing, ordered, multiway search tree. It is always perfectly balanced, so every leaf node is equidistant from the root. Every node, other than leaf nodes, can be either a 2-Node (a node with a single data element and two children) or a 3-node (a node with two data elements and three children). A 2-3 tree will remain balanced no matter how many insertions or deletions occur.
A 2-3-4 tree is a search tree that can accommodate more keys than a 2-3 tree. It covers the same basics as a 2-3 tree, but adds the following properties:
To use trees, we can traverse them by visiting/checking every node of a tree. If a tree is “traversed”, this means every node has been visited. There are four ways to traverse a tree. These four processes fall into one of two categories: breadth-first traversal or depth-first traversal.
Inorder: You traverse the left sub-tree first, then visit the node itself, and then traverse the right sub-tree. In a Binary Search Tree, this visits the keys in ascending order. This is a depth-first traversal.
Preorder: You visit the root first, then traverse the left sub-tree, and then move over to the right sub-tree. This is a depth-first traversal.
Postorder: Begin with the left sub-tree and move over to the right sub-tree. Then, move up to visit the root node. This is a depth-first traversal.
Level order: This will traverse the nodes by their levels instead of subtrees. First, we visit the root, then all children of that root, left to right. We then move down to the next level and repeat until we reach the last level, where the nodes have no children. These are the leaf nodes. This is a breadth-first traversal.
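The four traversals can be sketched in a few lines each. TNode and Traversals are hypothetical names; the three depth-first orders use recursion, and level order uses a queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical binary node for this illustration.
class TNode {
    int value;
    TNode left, right;
    TNode(int value, TNode left, TNode right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

class Traversals {
    // Inorder: left sub-tree, node, right sub-tree.
    static void inorder(TNode n, List<Integer> out) {
        if (n == null) return;
        inorder(n.left, out);
        out.add(n.value);
        inorder(n.right, out);
    }

    // Preorder: node, left sub-tree, right sub-tree.
    static void preorder(TNode n, List<Integer> out) {
        if (n == null) return;
        out.add(n.value);
        preorder(n.left, out);
        preorder(n.right, out);
    }

    // Postorder: left sub-tree, right sub-tree, node.
    static void postorder(TNode n, List<Integer> out) {
        if (n == null) return;
        postorder(n.left, out);
        postorder(n.right, out);
        out.add(n.value);
    }

    // Level order: visit nodes level by level, left to right.
    static List<Integer> levelOrder(TNode root) {
        List<Integer> out = new ArrayList<>();
        Queue<TNode> queue = new ArrayDeque<>();
        if (root != null) queue.add(root);
        while (!queue.isEmpty()) {
            TNode n = queue.remove();
            out.add(n.value);
            if (n.left != null) queue.add(n.left);
            if (n.right != null) queue.add(n.right);
        }
        return out;
    }
}
```

For a tree with root 4, left child 2 (with children 1 and 3), and right child 6, the orders are: inorder 1, 2, 3, 4, 6; preorder 4, 2, 1, 3, 6; postorder 1, 3, 2, 6, 4; level order 4, 2, 6, 1, 3.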
So, what’s the difference between a breadth-first and depth-first traversal? Let’s take a look at the algorithms Depth-First Search (DFS) and Breadth-First Search (BFS) to understand this better.
Note: Algorithms are a sequence of instructions for performing certain tasks. We use algorithms with data structures to manipulate our data, in this case, to traverse our data.
Overview: We follow a path from the starting node as deep as it goes, then backtrack and start another path until all nodes are visited. DFS is commonly implemented using a stack (or recursion), and it typically requires less memory than BFS. It is well suited to topological sorting, backtracking, and cycle detection.
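Here is a minimal sketch of DFS with an explicit stack. It assumes the data is given as an adjacency map from node ids to neighbor lists, which is a hypothetical representation chosen for this example. A tree has no cycles, so the visited set is not strictly needed there, but it matters as soon as the same code runs on a general graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DepthFirstSearch {
    // Returns the nodes in the order DFS visits them, starting from start.
    static List<Integer> dfs(Map<Integer, List<Integer>> graph, int start) {
        List<Integer> order = new ArrayList<>();
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            int node = stack.pop();
            if (!visited.add(node)) continue;   // already visited: skip it
            order.add(node);
            List<Integer> neighbors = graph.getOrDefault(node, Collections.emptyList());
            // Push in reverse so the first neighbor is explored first.
            for (int i = neighbors.size() - 1; i >= 0; i--) {
                if (!visited.contains(neighbors.get(i))) stack.push(neighbors.get(i));
            }
        }
        return order;
    }
}
```

With edges 1 to 2, 1 to 3, 2 to 4, 3 to 4, and 4 back to 1, a search from node 1 visits 1, 2, 4, 3 and terminates despite the cycle.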
When implementing the DFS algorithm, mark each node as visited before proceeding, or you will be stuck in an infinite loop.
Overview: We proceed level-by-level to visit all nodes at one level before going to the next. The BFS algorithm is commonly implemented using queues, and it requires more memory than the DFS algorithm. It is best for finding the shortest path between two nodes.
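Here is a minimal sketch of BFS used to find the length of a shortest path. As an assumption for this example, the data is given as an adjacency map from node ids to neighbor lists, and the distance map doubles as the record of visited nodes.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class BreadthFirstSearch {
    // Returns the number of edges on a shortest path from start to goal,
    // or -1 if goal cannot be reached.
    static int shortestPath(Map<Integer, List<Integer>> graph, int start, int goal) {
        Map<Integer, Integer> distance = new HashMap<>();  // also marks visited nodes
        Queue<Integer> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            int node = queue.remove();
            if (node == goal) return distance.get(node);
            for (int next : graph.getOrDefault(node, Collections.emptyList())) {
                if (!distance.containsKey(next)) {         // not visited yet
                    distance.put(next, distance.get(node) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }
}
```

Because the queue hands nodes back in the order they were discovered, every node at distance d is processed before any node at distance d + 1, which is why the first time we reach the goal is along a shortest path.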
When implementing the BFS algorithm, mark each node as visited before proceeding, or you will be stuck in an infinite loop.
It’s important to know how to perform a search in a tree. Searching means we are locating a specific element or node in our data structure. Since data in a Binary Search Tree is ordered, searching is quite easy. Let’s see how it’s done.
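Start at the root and compare the value you are looking for with the current node’s value. Move to the left child if it is smaller, or to the right child if it is larger, and repeat until you find a node with that value or reach a leaf node without a match, meaning that the value doesn’t exist.
In the example below, we are searching for the value 3 in our tree. Take a look.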
Let’s see that in Java code now!
public class BinarySearchTree {
    …
    public boolean search(int value) {
        // An empty tree contains no values.
        if (root == null)
            return false;
        else
            return root.search(value);
    }
}

public class BSTNode {
    …
    public boolean search(int value) {
        if (value == this.value)
            return true;
        else if (value < this.value) {
            // Smaller values can only be in the left subtree.
            if (left == null)
                return false;
            else
                return left.search(value);
        } else if (value > this.value) {
            // Larger values can only be in the right subtree.
            if (right == null)
                return false;
            else
                return right.search(value);
        }
        return false;
    }
}
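The listing above leaves out how the tree gets built. To try the search end to end, here is a self-contained sketch with a hypothetical insert helper and an iterative version of the same search. SimpleBst and Entry are illustrative names, not the classes from the listing.

```java
class SimpleBst {
    // Hypothetical node type, private to this sketch.
    private static class Entry {
        int value;
        Entry left, right;
        Entry(int value) { this.value = value; }
    }

    private Entry root;

    // Hypothetical insert helper; duplicate values are ignored.
    void insert(int value) { root = insert(root, value); }

    private Entry insert(Entry n, int value) {
        if (n == null) return new Entry(value);
        if (value < n.value) n.left = insert(n.left, value);
        else if (value > n.value) n.right = insert(n.right, value);
        return n;
    }

    // Iterative search: walk down from the root, going left or right each step.
    boolean search(int value) {
        Entry n = root;
        while (n != null) {
            if (value == n.value) return true;
            n = value < n.value ? n.left : n.right;
        }
        return false;   // ran past a leaf without finding the value
    }

    public static void main(String[] args) {
        SimpleBst tree = new SimpleBst();
        for (int v : new int[] {6, 4, 9, 3, 5}) tree.insert(v);
        System.out.println(tree.search(3));  // prints true
        System.out.println(tree.search(7));  // prints false
    }
}
```

Each comparison discards one whole subtree, so on a balanced tree the number of steps grows with the height of the tree rather than with the number of nodes.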
Congratulations on completing your next step into the world of Java and data structures. However, there are still many interview questions and data structures to practice.
Here are some of the common data structures challenges you should look into to get a better sense of how to use trees:
To get started learning these challenges, check out Data Structures for Coding Interviews in Java, which breaks down all the data structures common to Java interviews alongside hands-on, interactive quizzes and coding challenges.
This course gets you up-to-speed on all the fundamentals of computer science in an accessible, personalized way, helping you effectively learn to code in Java.