Solution: Serialize and Deserialize Binary Tree

Let's solve the Serialize and Deserialize Binary Tree problem using the Tree Depth-first Search pattern.

Statement

Serialize a given binary tree to a file and deserialize it back to a tree. Make sure that the original and the deserialized trees are identical.

  • Serialize: Write the tree to a file.

  • Deserialize: Read from a file and reconstruct the tree in memory.

Serialize the tree into a list of integers, and then, deserialize it back from the list to a tree. For simplicity’s sake, there’s no need to write the list to the files.

Constraints:

  • The number of nodes in the tree is in the range [0,500][0, 500].
  • −1000≤-1000 \leq Node.value ≤1000\leq 1000

Solution

So far, you’ve probably brainstormed some approaches and have an idea of how to solve this problem. Let’s explore some of these approaches and figure out which one to follow based on considerations such as time complexity and any implementation constraints.

Naive approach

An initial idea may be to store one of the traversals of the tree into a file when serializing a tree and read that traversal back to create a tree when deserializing. However, any one of the traversals is not unique. That is, two different trees can have the same in-order traversal. The same goes for pre-order or post-order traversal as well. As a simple example, consider a right-slanting degenerate tree and a left-slanting degenerate tree. Both of these trees have the same in-order, pre-order as well as post-order traversals, but they are different trees. However, two traversals are sufficient to uniquely represent and reconstruct a binary tree.

For serialization, this approach will store both the inorder and preorder traversal and place a delimiter to separate them.

For deserialization, pick each node from the preorder traversal, make it root, and find its index in the inorder traversal. The nodes to the left of that index will be the part of the left subtree, and the nodes to the right of that index will be the part of the right subtree.

We got the required solution but at what cost? Can we avoid using twice the space? Can we avoid using tree traversal twice?

Optimized approach using tree depth-first search

We can save the extra space by storing the preorder traversal of the tree and a marker for NULL pointers. The marker is used to identify the NULL nodes of the tree, which is helpful in deserializing the binary tree. We use the preorder (root, left, right) traversal to serialize the whole tree. While deserializing the binary tree, it’s easy to reconstruct the entire tree using preorder traversal. We start by creating the root, and then its left and right children, respectively. The preorder traversal comes under depth-first search.

Here is how we’ll implement this algorithm:

We’ll serialize the tree as follows:

  • We perform a depth-first traversal and serialize individual nodes to the stream.

  • We use a preorder (root, left, right) traversal here.

  • We serialize the marker to represent a NULL pointer, which helps deserialize the tree.

Consider the binary tree below as an example: