Binary Trees & Binary Search Trees
We'll cover the following
- Introduction
- Height and Depth of the Tree
- Key of the node and Subtree
- Binary Search Tree
- Search in a Binary Search Tree
- Defining the Tree Node
- Defining Binary Search Tree
- Inserting into BST
- Implementing Search in a BST
- BST Insert and Search in action
- Tree traversals
- Traversals in action
- In-Order Predecessor and In-Order Successor
- Deleting a node in a BST
- Implementing Node Removal in BST
- Exercise
- Summary
Introduction
A binary tree is a linked data structure where each node points to two child nodes (at most). The child nodes are called the left child and right child.
Binary tree is a hierarchical data structure. Here are the properties of a binary tree.
- Each node can point to two children at most.
- The top most element in the tree is called root.
- Two Children are usually referred as Left Child and Right Child.
- The nodes which don’t point to any children are called leaf nodes.
- Non-leaf nodes are called internal nodes. Root is also an internal node if it’s not a leaf node. Let’s step through a few visualizations to internalize these properties.
Tree Basics Quiz
If a Tree has only 1 node, what type of node is it (select all that apply)?
A root node
An internal node
A leaf node
Height and Depth of the Tree
The Depth of the node is the number of nodes on the path from the root to the node. Root has depth 0.
The Height of the node is the number of nodes on the path from the node to the deepest leaf. All leaf nodes have height 0.
You can see that for depth, we begin counting from the root, down towards to the node, starting at 0. For height, we start counting from the node, up towards the root, again starting at 0. Some people start counting depth and height from 1 instead of 0 and it's just a matter of preference.
The Height of the tree is the maximum height of any node in the tree.
Let's step through a visualization to look at the height and depth properties.
Key of the node and Subtree
In a binary tree, the node might contain a lot of data with a key. Our examples are a little simplistic as we are only working with a single integer which acts both as a key and value. However, in many practical situations, a node contains several fields of data with a key. In our examples below, we will continue to use a single value as data and key, but it's a good thing to remember that, in practice, a node might contain more data.
A subtree of the node is the tree formed by its left and right children. Each node in a binary tree has a left subtree and a right subtree.
Let's step through the following visualization to understand the subtrees.
Binary Search Tree
The most common binary tree is called a Binary Search Tree. Why are we discussing a specialized form of Binary Tree without even getting to the code? Because the structure and layout of data in a Binary Tree would make more sense to you if we discuss insertion/deletion/lookup in a Binary Search Tree.
So what's a Binary Search Tree (BST hereafter)? The name already gives a decent hint. It's a binary tree that helps in searching the data in the tree. We'll shortly look at how a BST facilitates efficient lookups. To understand this, first let's understand the properties of a BST.
A binary tree is a BST if the key of the node is greater than all the nodes in its left subtree and is smaller than all the nodes in its right subtree. Let's look at a binary search tree.
Let's take a closer look at the above BST. Root has the key 50. Look at all the nodes in its left subtree. They are all less than 50. Now, look at the right subtree of the root. All the keys in the right subtree are greater than 50. Hence, it's a BST.
Let's look at another example and see whether it's a BST or not.
The tree above is not a BST. Why? Because of the leaf node containing 51. It's in the left subtree of 50 which breaks our invariant.
Search in a Binary Search Tree
Now that we know what a BST looks like, let's see how can it help with lookups. It's pretty simple actually. Suppose you are looking for key X. Here's what you do at each step starting at the root.
- Compare current node's key with X. If it's equal, we've found the key. All done.
- If X is less than node's key, we start looking at node's left subtree. It's because we know that right subtree cannot contain anything greater than X.
- If X is greater than node's key, we start looking in the right subtree.
- We repeat this process until we find the key or we reach the leaf node. If we reach the leaf node and haven't found the key as yet, we return not found.
Let's look at search in a BST in action.
Notice that tree has six elements but we only needed to do three comparisons before we found our desired node. If you look closely, you can see that at each step, we eliminate almost half of the tree.
One thing to remember here is that this is the best case for a BST. There are many cases where your BST might not give you an optimal performance, but they are beyond the scope of this introduction.
We have seen a lot of theory regarding Binary trees and Binary Search trees. Let's look at some code.
Defining the Tree Node
We'll define a Node class with a data field and two fields to point to the left and the right child.
function Node(data) {this.data = data;this.left = null;this.right = null;}
Defining Binary Search Tree
The following code defines how the tree would be represented in Javascript.
function BST() {this._root = null;}
Inserting into BST
In a BST, we need to find the correct location for the Node to be inserted. It depends on the value of the key. Assume, we want to insert a node with key K. Starting at the root, here's a quick description of how we are going to locate the correct location.
- Compare current node's key with K.
- If K is less than the current node,
- If left child of current node is Null, we insert K as the left child of current node and return.
- If the left child is not Null, the left child becomes the new current node, and we repeat the process from step 1.
- If K is greater than the current node,
- If right child of current node is Null, we insert K as the right child of the current node and return.
- If the right child is not Null, the right child becomes the new current node, and we repeat the process from step 1.
Using the steps described above, we insert K at the correct location so that BST is quickly searchable later on. Let's step through a quick visualization to see insertion in action.
You might be wondering what do we do in case of a duplicate. For now, we simply ignore it. There are some binary trees in which we can allow duplicates but if you are searching/sorting on a Key, then duplicated keys don't make sense in many other scenarios.
Let's look at the insertion code for a BST.
BST.prototype.insert = function(data) {var node = new Node(data);// If it's the first nodeif (this._root === null) {this._root = node;return;}var current = this._root;while (current) {if (data < current.data) {if (current.left === null) {current.left = node;return;}current = current.left;} else if (data > current.data) {if (current.right === null) {current.right = node;return;}current = current.right;} else {// Duplicates are not supportedreturn;}}};
Implementing Search in a BST
We've already described how lookup works in Binary Search Tree. Let's look at the code for the lookup method as well. It returns True if the key is found and returns false otherwise.
BST.prototype.contains = function(data) {var current = this._root;while (current) {if (data === current.data) {return true;}if (data < current.data) {current = current.left;} else {current = current.right;}}return false;};
BST Insert and Search in action
> Run the following code to see BST insert and search.
Tree traversals
We've covered search in a BST. When you want to traverse all the elements in a tree, there are a few well-known techniques to do that. These methods define the order in which nodes are visited, and each one of these traversal orders helps us solve different problems. Let's look at the three most common traversal techniques.
Pre-Order Traversal
In Pre-Order traversal, we traverse in the following order:
- First visit the node itself.
- Then visit the left subtree of the node.
- Then visit the right subtree of the node.
Let's step through a visualization to understand it better (Highly recommended to go to the last slide to see the complete order).
Tree traversals are implemented using recursion. Here's the code that fills the array in pre-order traversal.
// Returns keys in pre-order traversalBST.prototype.preOrder = function() {var output = [];function preOrderImpl(node) {if (node === null) {return;}// Visit the node itself.output.push(node.data);// Visit left subtreepreOrderImpl(node.left);// Visit the right subtreepreOrderImpl(node.right);}// Call the internal function// with Root as the starting point.preOrderImpl(this._root);return output;}
In-Order Traversal
In in-Order traversal, we traverse in the following order:
- First visit the left subtree of the node.
- Then visit the node itself.
- Then visit the right subtree of the node.
Let's step through a visualization to understand it better (Highly recommended to go to the last slide to see the complete order).
If you haven't noticed as yet - there is a nice property of In-Order traversal. It traverses the BST in ascending order. Now you know what traversal to use if you want to print/get all the nodes of tree in sorted order.
Let's look at the code for In-Order traversal.
// Returns Keys in the InOrder traversalBST.prototype.inOrder = function() {var output = [];function inOrderImpl(node) {if (node === null) {return;}// Visit left subtreeinOrderImpl(node.left);// Visit the node itself.output.push(node.data);// Visit the right subtreeinOrderImpl(node.right);}// Call the internal function// with Root as the starting point.inOrderImpl(this._root);return output;}
Post-Order Traversal
In Post-Order traversal, we traverse in the following order:
- First visit the left subtree of the node.
- Then visit the right subtree of the node.
- Then visit the node itself.
Let's step through a visualization to understand it better (Highly recommended to go to the last slide to see the complete order).
Let's take a look at the code for Post Order implementation (which is pretty obvious if you've already looked at the pre-order and in-order traversals).
// Returns Keys in the Post Order traversalBST.prototype.postOrder = function() {var output = [];function postOrderImpl(node) {if (node === null) {return;}// Visit left subtreepostOrderImpl(node.left);// Visit the right subtreepostOrderImpl(node.right);// Visit the node itself.output.push(node.data);}// Call the internal function// with Root as the starting point.postOrderImpl(this._root);return output;}
Traversals in action
Let's look at the traversals in action. We'll insert into a BST and then run all three traversals.
> Run the following code to see traversals in action
In-Order Predecessor and In-Order Successor
Why do we need to understand In-Order predecessor and In-Order successor?
You might have noticed that we haven't discussed deletion in a BST (we only discussed insertion and search). The reason is that deletion is a little tricky in a BST and understanding In-Order predecessor and successor, helps in implementing deletion from a BST.
For a given node X,
Node X's predecessor is the node that comes just before X in InOrder traversal. Also remember that In-Order traversal visits the nodes in a sorted order. Hence, X's predecessor is the node with the largest key smaller than the key of X.
Similarly,
Time for some examples.Node X's successor is the node that comes right after X in tree's InOrder traversal. As In-Order traversal visits the nodes in a sorted order, it means that X's successor is the node with the smallest key larger than the key of X.
Deleting a node in a BST
Now that we've understood a lot of terminology about BST's and traversals, let's dive into deleting a Node in a BST. It's a little complicated as we need to ensure that the removal of the node doesn't break the invariant of the BST (left children smaller than the current key and right children larger than the current key).
When deleting a node in BST, there are three cases.
- It's a leaf node and has no children.
- The node has one child (either left or right).
- The node has both left and right children.
First two cases are easier to handle than the third one. Let's look at the easy ones first.
Deleting a Leaf Node
This is the easiest to delete. We just delete it and make sure that it's parent is not pointing to it anymore.
Deleting a node with 1 child
When deleting a node with 1 child, we connect the node's parent with node's child node. In other words, grand parent connects directly with grand child.
Deleting a node with two children
When we have to delete a node with two children, we have two options. Let's say that the node to be deleted has key X.
- Replace the current node's key with its predecessor and then trigger delete for predecessor in node's left subtree.
- Replace the current node's key with its successor and then trigger delete for successor in node's right subtree.
Why would it work?
Remember predecessor is the highest value smaller than the current key. Now we know that this node has a left subtree. Hence, the predecessor to the current node is somewhere in the left subtree. In fact predecessor is the highest key in the node's left subtree. In addition, as it's the highest node in left subtree, it cannot have a right child. Otherwise that right child would have been the predecessor which ensures that we are now trying to delete a node with zero or one child. We already know that deleting a node with zero or one child is simpler and hence we've reduced the problem from a node with two children to deletion of node with zero or one child.
Similarly, successor is the smallest value larger than the current key. We already know that the node has a right subtree. Hence, the successor to the current node is somewhere in the right subtree. In fact, successor is the smallest key in node's right subtree. Hence, it's also a node with zero or one child.
Let's look at a quick visualization
Implementing Node Removal in BST
We'll implement a remove function that would take a key and delete the key ensuring that BST is left in the correct state after removal of the node. However, before we implement remove function, we need to decide what to do when the node to be deleted has two children. We have the option of either going the route of replacing it with predecessor or with its successor.
Let's pick predecessor. We know that the predecessor in the left subtree of the node will be the highest value of this subtree. Hence, we'll implement a quick maximum function.
BST.prototype.maximum = function(node) {while (node.right) {node = node.right;}return node.data;}
Now let's implement the actual removal function.
BST.prototype.remove = function(key) {this.removeImpl(key, this._root);}BST.prototype.removeImpl = function removeImpl(key, node) {if (node != null) {if (key < node.data) {// Key might be in the left subtree.node.left = this.removeImpl(key, node.left);} else if (key > node.data) {node.right = this.removeImpl(key, node.right);} else {// Node found.// Let's see if it has two children.if (node.left && node.right) {// Replace current node with// predecessor datanode.data = this.maximum(node.left);node.left = this.removeImpl(node.data, node.left);} else {// Only 1 child.// Let's return the child that's valid.node = node.left || node.right;}}}return node;}
Let's run the removal code. We'll insert into BST, then delete a few nodes and see that In-Order traversal always returns sorted data.
> Run the following code to try out node removal
Exercise
Implement a function to count the number of elements in the tree.
> Some Tests are failing as size method is not implemented. Implement it to fix the test cases.
Summary
- Binary tree is a hierarchical data structure.
- Binary Search Tree (BST) is a binary tree where a node has all smaller keys to its left and keys greater than the node on its right.
- Insertion and Search in a BST are relatively straight forward.
- Removal is a little tricky as deletion of a node can cause the BST to violate its properties.
- Binary Trees can be traversed in many orders. Most common are Pre-Order, In-Order and Post-Order traversals.