How many traversals need to be known to construct a BST - algorithm

I am very confused by a number of articles on different sites about constructing a Binary Search Tree from any one traversal (pre-, post- or in-order), or from a combination of any two of them. For example, this page says that given the pre-, post- or level-order traversal, along with the in-order traversal, one can construct the BST. But here and there, they show how to construct a BST from pre-order alone. Also, here they show how to construct the BST from given pre- and post-order traversals. On some other site, I found a solution for constructing a BST from the post-order traversal only.
Now I know that given the in-order and pre-order traversals, it is possible to uniquely form a BST. As regards the first link I provided, although they say that we can't construct the BST from pre-order and post-order, can't I just sort the post-order array to get its in-order traversal, and then use that and the pre-order array to form the BST? Will that be the same as the solution in the 4th link, or different? And given pre-order only, I can sort that to get the in-order, then use that and the pre-order to get the BST. Again, does that have to be different from the solution at links 2 and 3?
Specifically, what is sufficient to uniquely generate the BST? If uniqueness is not required, then I can simply sort the keys to get the in-order traversal, and recursively build one of the many possible BSTs from it.

To construct a BST you need only one (not in-order) traversal.
In general, to build a binary tree you need two traversals, in-order and pre-order for example. However, in the special case of a BST, the in-order traversal is always the sorted array of the elements, so you can always reconstruct it and then use an algorithm that rebuilds a generic tree from pre-order and in-order traversals.
So, the information that the tree is a BST, together with the elements in it (even unordered), is equivalent to an in-order traversal.
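A minimal sketch of that idea in Python, assuming distinct keys and representing each node as a hypothetical `(value, left, right)` tuple (the function names are illustrative, not from any of the linked articles): sort the pre-order list to recover the in-order list, then run the usual pre-order + in-order reconstruction.

```python
def build(preorder, inorder):
    """Rebuild a binary tree with distinct keys as nested (value, left, right) tuples."""
    position = {value: i for i, value in enumerate(inorder)}
    next_root = iter(preorder)  # pre-order yields roots in the order we need them

    def helper(lo, hi):
        # inorder[lo:hi] holds exactly the keys of the subtree being built
        if lo >= hi:
            return None
        root = next(next_root)
        mid = position[root]
        left = helper(lo, mid)        # keys before the root in in-order
        right = helper(mid + 1, hi)   # keys after the root in in-order
        return (root, left, right)

    return helper(0, len(inorder))


def bst_from_preorder(preorder):
    # For a BST the in-order traversal is simply the sorted keys.
    return build(preorder, sorted(preorder))
```

For example, `bst_from_preorder([4, 2, 1, 3, 5, 6])` yields a tree rooted at 4 with 2 and 5 as its children.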
Bonus: why is one traversal not enough for a general tree (without the information that it is a BST)?
Answer: Assume we have n distinct elements, with n >= 2. There are n! possible lists of these n elements, but the number of possible trees is larger. Consider only the degenerate trees: those where node.left = null in every node, so the tree is really a list running to the right, and those where node.right = null in every node. There are n! trees of each kind, so there are at least 2 * n! trees in total. By the pigeonhole principle, at least one list must be the traversal of two different trees, so we cannot reconstruct the tree from a single traversal.
(QED)

If the values for the nodes of the BST are given then only one traversal is enough because the rest of the data is provided by the values of the nodes. But if the values are unknown then, as per my understanding, constructing a unique BST from a single traversal is not possible. However, I am open to suggestions.

Related

Binary Tree Serialization and Deserialization using in-order traversal

Below is an excerpt from GeeksforGeeks:
If the given Binary Tree is Binary Search Tree, we can store it by
either storing preorder or postorder traversal. In case of Binary
Search Trees, only preorder or postorder traversal is sufficient to
store structure information.
Questions
Is it not possible to use in-order traversal for serialization and deserialization of Binary Trees? If so, why?
What is the distinction between Binary Tree and BST serialization? The above statement is not clear about this distinction.
In-order traversal of a BST produces the sorted list of data, regardless of the shape of the tree, so it carries no structure information.
On the contrary, given a list produced by pre-order traversal, the BST can be reconstructed:
The first element is a root.
Split the rest by the value of the root. The search property of the BST guarantees that the split point exists, and that the first and second slices encompass the left and right subtrees respectively.
Recursively apply the procedure to the first and second slices.
This procedure relies heavily on the search property of the BST. An arbitrary Binary Tree doesn't have such a split point.
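The procedure above can be sketched as follows, assuming distinct keys and a hypothetical `(value, left, right)` tuple for each node. The slicing makes this quadratic in the worst case; it is meant to mirror the three steps, not to be efficient.

```python
def from_preorder(preorder):
    """Rebuild a BST from its pre-order list by splitting on the root value."""
    if not preorder:
        return None
    root, rest = preorder[0], preorder[1:]
    # Split point: index of the first key greater than the root.
    # Everything before it belongs to the left subtree, the rest to the right.
    split = next((i for i, key in enumerate(rest) if key > root), len(rest))
    return (root, from_preorder(rest[:split]), from_preorder(rest[split:]))
```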

Use case of different traversal order in binary tree

There are preorder, inorder and postorder traversal for a binary tree, but no matter what order, it just traverses the tree to find a matched path. Is there any use case where I have to use any of the orders? Or are they just different ways but no difference regarding practical usage? Thanks.
These traversals have definite practical uses.
A few specific use cases:
In-order traversal gives you the node values of a BST in sorted order, if your requirement needs sorted information.
Pre-order traversal can be used to create a copy of the tree, and to get the prefix expression of an expression tree.
Post-order traversal is used to delete the tree, and is also useful for getting the postfix expression of an expression tree.
The appropriate traversal should be chosen based on which nodes must be fetched first for the requirement or design at hand. If roots must be processed before leaf nodes, pre-order traversal is helpful. If leaf nodes must be processed before root nodes, post-order traversal is helpful.
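As an illustration of the expression-tree use case, here is a small sketch. The tree encoding is an assumption made for this example: an operand is a plain string and an operator node is an `(op, left, right)` tuple.

```python
def prefix(node):
    """Pre-order walk: operator first, then its operands."""
    if isinstance(node, str):  # leaf: an operand
        return [node]
    op, left, right = node
    return [op] + prefix(left) + prefix(right)


def postfix(node):
    """Post-order walk: operands first, then the operator."""
    if isinstance(node, str):
        return [node]
    op, left, right = node
    return postfix(left) + postfix(right) + [op]


expr = ("*", ("+", "a", "b"), "c")  # the expression (a + b) * c
print(" ".join(prefix(expr)))   # * + a b c
print(" ".join(postfix(expr)))  # a b + c *
```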

Proving that one binary tree is a subtree of another

Assume you have two binary trees and you want to know whether one is a subtree of the other. One solution is to get the inorder and preorder traversals of both trees and check whether the traversals of the candidate subtree are substrings of the corresponding traversals of the other tree. I read several posts about this solution. One discussion shows that inorder AND preorder traversal are both necessary. Can someone explain why they are sufficient? Why is it the case that if the inorder and preorder traversals of tree2 are substrings of those of tree1, then tree2 is a subtree of tree1?
Q: One discussion shows that inorder AND preorder traversal are both
necessary. Can someone explain why they are sufficient?
Because of the simple fact that it is possible to uniquely reconstruct a binary tree from these two traversals (or inorder and postorder, as well). Check this example:
Inorder : [1,2,3,4,5,6]
Preorder : [4,2,1,3,5,6]
From preorder, you know that 4 is the root of the tree. From inorder, you can determine the left and right subtree, and you proceed recursively from this point:
              4
            /   \
Left subtree         Right subtree
Inorder : [1,2,3]    Inorder : [5,6]
Preorder: [2,1,3]    Preorder: [5,6]
Check for more details in this excellent article:
Reconstructing binary trees from tree traversal. Since these two serializations (traversals actually serialize tree to a string) of the tree combined together have to be unique for a binary tree, we get that one tree is a subtree of another if and only if these traversals are substrings of other two serializations.
People agree that a binary tree can represent an order on its nodes through the left/right relation: the left part comes before the right part. You may call two trees equivalent if that order is the same. The in-order string represents this order, so if you want to check equivalence, it is sufficient to check only the in-order strings (by definition).
But when you want to check full equality of trees, you have to find a way to distinguish equivalent trees. For example, it could be a level-order check. But level order doesn't fit subtrees, because the level-order string of a subtree is split up within that of the full tree. In pre-order, you walk the subtree from its root before visiting other parts of the tree.
Suppose two equivalent trees are not equal. Then, traversing both in pre-order, everything will be equal up to the first difference. Two situations can happen:
1) The value of a node in one tree differs from that in the other. That means the pre-order strings differ, because you walk the trees in pre-order.
2) The children signature (no children, only left, only right, both children) differs. But in this situation it is easy to see that the in-order changes, so the trees are not equivalent, which contradicts the assumption.
Note that this works only when all the nodes are unique. If all nodes have the same value, such as "a", then no matter how you walk, your string is always "aa...a". So you have to distinguish the nodes somehow, not only by "value".

Algorithm to identify if tree is subtree of other tree

I am reading cracking the coding interview, and I have a question on the solution of the following problem:
You have two very large binary trees: T1, with millions of nodes, and T2, with hundreds of nodes. Create an algorithm to decide if T2 is a subtree of T1.
The simple solution that it suggests is to create strings representing the in-order and pre-order traversals and check whether T2's pre-order/in-order traversal is a substring of T1's pre-order/in-order traversal.
What I wonder is why we need to compare both traversals. And why exactly those two traversals, why not, for example, in-order and post-order? And mainly, won't only one traversal be enough? Say only the in-order or only the pre-order traversal?
One traversal isn't enough. Consider the trees 1->2->3 and 2<-1->3. If you start at node 1 and do a pre-order traversal, you encounter the nodes in the order 1, 2, 3 in both. So if you simply create a string showing the pre-order, the two give the same result: 1,2,3.
On the other hand, if you use a post-order, the two give different results: 3,2,1 and 2,3,1.
I bet for any one ordering, you can find two different trees with the same result.
So the question you need to answer for yourself for any other pair you want to look at is: would there be a tree that would give the same order for both traversals? I'm going to leave that as something to think about and come back later to see if you've got it.
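The two trees from this answer can be checked directly. This is a small sketch that assumes a node is a `(value, left, right)` tuple and that the chain 1->2->3 hangs off right children (the answer does not say which side).

```python
def preorder(node):
    if node is None:
        return []
    value, left, right = node
    return [value] + preorder(left) + preorder(right)


def postorder(node):
    if node is None:
        return []
    value, left, right = node
    return postorder(left) + postorder(right) + [value]


chain = (1, None, (2, None, (3, None, None)))  # 1 -> 2 -> 3
fork = (1, (2, None, None), (3, None, None))   # 2 <- 1 -> 3

print(preorder(chain), preorder(fork))    # [1, 2, 3] [1, 2, 3]
print(postorder(chain), postorder(fork))  # [3, 2, 1] [2, 3, 1]
```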
I think a preorder traversal with a sentinel to represent each null node is enough.
We can use this approach to serialize and deserialize a binary tree. That means there is a one-to-one mapping between a binary tree and its preorder+sentinel representation.
After we get the strings for both the small tree and the big tree, we do a string match using the KMP algorithm.
I know people are saying that we have to use both preorder and inorder (or postorder and inorder), but most of them just follow what others are saying rather than thinking independently.
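A rough sketch of the preorder-plus-sentinel idea, under several assumptions: nodes are `(value, left, right)` tuples, values are integers (so they never contain `,` or `#`), and Python's built-in substring search stands in for KMP. Each token is prefixed with a comma so that a value such as 3 cannot match the tail of 23. The recursion would need to be made iterative for a tree with millions of nodes.

```python
def serialize(node):
    """Pre-order string in which '#' marks every missing child."""
    if node is None:
        return ",#"
    value, left, right = node
    return f",{value}" + serialize(left) + serialize(right)


def is_subtree(big, small):
    # `in` is used here in place of an explicit KMP implementation.
    return serialize(small) in serialize(big)


big = (1, (2, (4, None, None), None), (3, None, None))
print(serialize(big))                               # ,1,2,4,#,#,#,3,#,#
print(is_subtree(big, (2, (4, None, None), None)))  # True
print(is_subtree(big, (2, None, None)))             # False: 2 has a child in big
```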

Tree traversal and serialization

I am trying to get straight in my head how tree traversals can be used to uniquely identify a tree, and the crux of it seems to be whether the tree is a vanilla Binary Tree (BT), or if it also has the stricter stipulation of being a Binary Search Tree (BST). This article seems to indicate that for BTs, a single inorder, preorder or postorder traversal will not uniquely identify a tree (uniquely means structure and values of keys in this context). Here is a quick summary of the article:
BTs
1. We can uniquely reconstruct a BT with preorder + inorder and postorder + inorder.
2. We can also use preorder + postorder if we also stipulate that the traversals keep track of the null children of a node.
(an open question (for me) is if the above is still true if the BT can have non-unique elements)
BSTs
3. We cannot use inorder for a unique id. We need inorder + preorder, or inorder + postorder.
Now, (finally) my question is, can we use just pre-order or just post-order to uniquely identify a BST? I think that we can, since this question and answer seems to say yes, we can use preorder, but any input much appreciated.
I can't tell what's being asked here. Any binary tree, whether it's ordered or not, can be serialized by writing out a sequence of operations needed to reconstruct the tree. Imagine a simple stack machine with just two instructions:
Push an empty tree (or NULL pointer if you like) onto the stack
Allocate a new internal node N, stuff a value into N, pop the top two trees off the stack and make them N's left and right children, and finally push N onto the stack.
Any binary tree can be serialized as a "program" for such a machine.
The serialization algorithm uses a postorder traversal.
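A minimal sketch of such a stack machine, assuming nodes are `(value, left, right)` tuples and using the made-up instruction names `EMPTY` and `NODE` for the two instructions described above. Values do not need to be unique or ordered for the round trip to work.

```python
def serialize(node):
    """Post-order walk that emits the program rebuilding `node`."""
    if node is None:
        return [("EMPTY",)]
    value, left, right = node
    return serialize(left) + serialize(right) + [("NODE", value)]


def run(program):
    """Execute the program on a stack and return the rebuilt tree."""
    stack = []
    for instruction in program:
        if instruction[0] == "EMPTY":
            stack.append(None)
        else:
            right = stack.pop()  # the right child was pushed last
            left = stack.pop()
            stack.append((instruction[1], left, right))
    return stack.pop()


tree = (1, (2, None, None), (3, None, (1, None, None)))  # note the duplicate value 1
assert run(serialize(tree)) == tree
```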
Okay, you can use preorder alone to identify a BST. This is possible because only in a preorder traversal does the id of the current node come before the ids of its children. So you can read the traversal output from the root to the leaves.
You can check http://en.wikipedia.org/wiki/Tree_traversal#Pre-order to confirm.
So you can consider a preorder traversal as a list of insertions into a tree. Because insertion into a BST is deterministic, when you insert a given list of values into an empty tree, you always get the same tree.
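To make the insertion view concrete, here is a small sketch assuming distinct keys, `(value, left, right)` tuples for nodes, and plain unbalanced BST insertion. The function names are illustrative.

```python
def insert(node, key):
    """Return a new BST with `key` inserted (plain insertion, no rebalancing)."""
    if node is None:
        return (key, None, None)
    value, left, right = node
    if key < value:
        return (value, insert(left, key), right)
    return (value, left, insert(right, key))


def from_insertions(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def preorder(node):
    if node is None:
        return []
    value, left, right = node
    return [value] + preorder(left) + preorder(right)


tree = from_insertions([4, 2, 1, 3, 5, 6])
# Replaying the pre-order output as insertions reproduces the same tree.
assert from_insertions(preorder(tree)) == tree
```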

Resources