Is it always possible to turn one BST into another using tree rotations? - algorithm

Given a set of values, it's possible for there to be many different possible binary search trees that can be formed from those values. For example, for the values 1, 2, and 3, there are five BSTs we can make from those values:
1 1 2 3 3
\ \ / \ / /
2 3 1 3 1 2
\ / \ /
3 2 2 1
Many data structures that are based on balanced binary search trees use tree rotations as a primitive for reshaping a BST without breaking the required binary search tree invariants. Tree rotations can be used to pull a node up above its parent, as shown here:
rotate
u right v
/ \ -----> / \
v C A u
/ \ <----- / \
A B rotate B C
left
Given a BST containing a set of values, is it always possible to convert that BST into any arbitrary other BST for the same set of values? For example, could we convert between any of the five BSTs above into any of the other BSTs just by using tree rotations?

The answer to your question depends on whether you are allowed to have equal values in the BST that can appear different from one another. For example, if your BST stores key/value pairs, then it is not always possible to turn one BST for those key/value pairs into a different BST for the same key/value pairs.
The reason for this is that the inorder traversal of the nodes in a BST remains the same regardless of how many tree rotations are performed. As a result, it's not possible to convert from one BST to another if the inorder traversal of the nodes would come out differently. As a very simple case, suppose you have a BST holding two copies of the number 1, each of which is annotated with a different value (say, A or B). In that case, there is no way to turn these two trees into one another using tree rotations:
1:a 1:b
\ \
1:b 1:a
You can check this by brute-forcing the (very small!) set of possible trees you can make with the rotations. However, it suffices to note that an inorder traversal of the first tree gives 1:a, 1:b and an inorder traversal of the second tree gives 1:b, 1:a. Consequently, no number of rotations will suffice to convert between the trees.
On the other hand, if all the values are different, then it is always possible to convert between two BSTs by applying the right number of tree rotations. I'll prove this using an inductive argument on the number of nodes.
As a simple base case, if there are no nodes in the tree, there is only one possible BST holding those nodes: the empty tree. Therefore, it's always possible to convert between two trees with zero nodes in them, since the start and end tree must always be the same.
For the inductive step, let's assume that for any two BSTs of 0, 1, 2, .., n nodes with the same values, that it's always possible to convert from one BST to another using rotations. We'll prove that given any two BSTs made from the same n + 1 values, it's always possible to convert the first tree to the second.
To do this, we'll start off by making a key observation. Given any node in a BST, it is always possible to apply tree rotations to pull that node up to the root of the tree. To do this, we can apply this algorithm:
while (target node is not the root) {
if (node is a left child) {
apply a right rotation to the node and its parent;
} else {
apply a left rotation to the node and its parent;
}
}
The reason that this works is that every time a node is rotated with its parent, its height increases by one. As a result, after applying sufficiently many rotations of the above forms, we can get the root up to the top of the tree.
This now gives us a very straightforward recursive algorithm we can use to reshape any one BST into another BST using rotations. The idea is as follows. First, look at the root node of the second tree. Find that node in the first tree (this is pretty easy, since it's a BST!), then use the above algorithm to pull it up to the root of the tree. At this point, we have turned the first tree into a tree with the following properties:
The first tree's root node is the root node of the second tree.
The first tree's right subtree contains the same nodes as the second tree's right subtree, but possibly with a different shape.
The first tree's left subtree contains the same nodes as the second tree's left subtree, but possibly with a different shape.
Consequently, we could then recursively apply this same algorithm to make the left subtree have the same shape as the left subtree of the second tree and to make the right subtree have the same shape as the right subtree of the second tree. Since these left and right subtrees must have strictly no more than n nodes each, by our inductive hypothesis we know that it's always possible to do this, and so the algorithm will work as intended.
To summarize, the algorithm works as follows:
If the two trees are empty, we are done.
Find the root node of the second tree in the first tree.
Apply rotations to bring that node up to the root.
Recursively reshape the left subtree of the first tree to have the same shape as the left subtree of the second tree.
Recursively reshape the right subtree of the first tree to have the same shape as the right subtree of the second tree.
To analyze the runtime of this algorithm, note that applying steps 1 - 3 requires at most O(h) steps, where h is the height of the first tree. Every node will be brought up to the root of some subtree exactly once, so we do this a total of O(n) times. Since the height of an n-node tree is never greater than O(n), this means that the algorithm takes at most O(n2) time to complete. It's possible that it will do a lot better (for example, if the two trees already have the same shape, then this runs in time O(n)), but this gives a nice worst-case bound.
Hope this helps!

For binary search trees this can actually be done in O(n).
Any tree can be "straightened out", ie put into a form in which all nodes are either the root or a left child.
This form is unique (reading down from root gives the ordering of the elements)
A tree is straightened out as follows:
For any right child, perform a left rotation about itself. This decreases the number of right children by 1, so the tree is straightened out in O(n) rotations.
If A can be straightened out into S in O(n) rotations, and B into S in O(n) rotations, then since rotations are reversible one can turn A -> S -> B in O(n) rotations.

Related

Is this new sorting algorithm based on Binary Search Tree useful?

If we some how transform a Binary Search Tree into a form where no node other than root may have both right and left child and the nodes the right sub-tree of the root may only have right child, and vice versa, such a configuration of BST is inherently sorted with its root being approximately in the middle (in case of nearly complete BST’s). To to this we need to do reverse rotations. Unlike AVL and red black trees, where roatations are done to make the tree balanced, we would do reversed rotations.
I would like to explain the pseudo code and logical implementation of the algorithm through the following images. The algorithm is to first sort the left subtree with respect to the root and then the right subtree. These two subparts will be opposite to each other, that is, left would interchange with right. For simplicity I have taken a BST with right subtree, with respect to root, sorted.
To improve the complexity as compared to tree sort we can augment the above algorithm. We can add a flag to each node where 0 stands for a normal node while 1 is when the node has non null right child, in the original unsorted BST. The nodes with flag 1 have an entry in a hash table with key being their pointers and the values being the right most node. For example node 23's pointer would map to 30.5's pointer. Then we would not have to traverse all the nodes in between for the iteration. If we have 23's pointer and 30.5's pointer we can do the required operation in O(1). This will bring down time complexity , as compared to tree sort.
Please review the algorithm and give suggestion if this algorithm is usefull.

Prove or disprove: If you got pre-order traversal of a binary search tree, you can uniquely determine that binary search tree

This is task I found from old exam and I try it for solved because they can ask similar question for me on friday.
For solution I have cheap solution but I think it all matter of definition of binary search tree.
I make first tree:
1
\
1
\
1
and here is second tree
1
/
1
/
1
When you make pre order traversal you have same output for both tree.. because same element, and both have degenerated tree. But you don't have same tree! So statement is false.
Only problem is are my tree binary search tree... I think yes because binary search tree the element can have greater / lower equal element?
Please halp when I asked our teacher he say I can ask him when vacation is over but when vacation is over my exam is over.... No good thing for me.
Your answer is perfectly fine given the standard definition of BSTs. By the standard definition, BSTs can have repeated elements and identical elements can go in either subtree.
If the question intends the case where there are no duplicates, or where duplicates must exist in only the left (or only the right) subtree, then a pre-order traversal is sufficient to reconstruct the tree.
If duplicates are not allowed, construct the tree recursively as follows: make the root the first node, and then make the left and right subtrees recursively using as input traversals the portions of the original traversal with nodes less than (for left subtree) and greater than (for right subtree) the root node. If duplicates are allowed but are constrained to either the left or the right subtree, then use the same procedure but adding "or equal" to either the less than or the greater than, but not both.

Running time to check if a binary tree is subtree of another binary tree

I've come across a naive solution for the problem of checking if a binary tree is subtree of another binary tree:
Given two binary trees, check if the first tree is subtree of the second one. A subtree of a tree T is a tree S consisting of a node in T and all of its descendants in T. The subtree corresponding to the root node is the entire tree; the subtree corresponding to any other node is called a proper subtree.
For example, in the following case, tree S is a subtree of tree T:
Tree 2
10
/ \
4 6
\
30
Tree 1
26
/ \
10 3
/ \ \
4 6 3
\
30
The solution is to traverse the tree T in preorder fashion. For every visited node in the traversal, see if the subtree rooted with this node is identical to S.
It is said in the post that the algorithm has a running time of n^2 or O(m*n) in the worst case where m and n are the sizes of both trees involved.
The point of confusion here is that, if we are traversing through both trees at the same time, in the worst case, it would seem that you would simply have to recurse through all of the nodes in the larger tree to find the subtree. So how could this version of the algorithm (not this one) have a quadratic running time?
Well, basically in the isSubTree() function you only traverse T tree (the main one, not a subtree). You do nothing with S, so in the worst case this function would be executed for every node in T. However (in the worst case) for each execution, it will check if areIdentical(T, S), which in the worst case has to fully traverse one of the given trees (till one of those is zero-sized).
Trees passed to areIdentical() function are obviously smaller and smaller, but in this case it doesn't matter if it comes to time complexity. Either way this gives you O(n^2) or O(n*m) (where n,m - number of nodes in those trees).
To solve reasonably optimally, flatten the two trees. Using Lisp notation,
we get
(10 (4(30) (6))
and
(26 (10 (4(30) (6)) (3 (3))
So the subtree is a substring of the parent. Using strstr we can
complete normally in O(N) time, it might take a little bit longer
if we have lots and lots of near sub-trees. You can use a suffix
tree if you need to do lots of searches and that gets it down to O(M)
time where M is the size of the subtree.
But actually runtime doesn't improve. It's the same algorithm,
and it will have N M behaviour if, for example, all the trees
have the same node id and structure, except for the last right
child of the query sub-tree. it's just that the operations
become a lot faster.

Count nodes bigger then root in each subtree of a given binary tree in O(n log n)

We are given a tree with n nodes in form of a pointer to its root node, where each node contains a pointer to its parent, left child and right child, and also a key which is an integer. For each node v I want to add additional field v.bigger which should contain number of nodes with key bigger than v.key, that are in a subtree rooted at v. Adding such a field to all nodes of a tree should take O(n log n) time in total.
I'm looking for any hints that would allow me to solve this problem. I tried several heuristics - for example when thinking about doing this problem in bottom-up manner, for a fixed node v, v.left and v.right could provide v with some kind of set (balanced BST?) with operation bigger(x), which for a given x returns a number of elements bigger than x in that set in logarihmic time. The problem is, we would need to merge such sets in O(log n), so this seems as a no-go, as I don't know any ordered set like data structure which supports quick merging.
I also thought about top-down approach - a node v adds one to some u.bigger for some node u if and only if u lies on a simple path to the root and u<v. So v could update all such u's somehow, but I couldn't come up with any reasonable way of doing that...
So, what is the right way of thinking about this problem?
Perform depth-first search in given tree (starting from root node).
When any node is visited for the first time (coming from parent node), add its key to some order-statistics data structure (OSDS). At the same time query OSDS for number of keys larger than current key and initialize v.bigger with negated result of this query.
When any node is visited for the last time (coming from right child), query OSDS for number of keys larger than current key and add the result to v.bigger.
You could apply this algorithm to any rooted trees (not necessarily binary trees). And it does not necessarily need parent pointers (you could use DFS stack instead).
For OSDS you could use either augmented BST or Fenwick tree. In case of Fenwick tree you need to preprocess given tree so that values of the keys are compressed: just copy all the keys to an array, sort it, remove duplicates, then substitute keys by their indexes in this array.
Basic idea:
Using the bottom-up approach, each node will get two ordered lists of the values in the subtree from both sons and then find how many of them are bigger. When finished, pass the combined ordered list upwards.
Details:
Leaves:
Leaves obviously have v.bigger=0. The node above them creates a two item list of the values, updates itself and adds its own value to the list.
All other nodes:
Get both lists from sons and merge them in an ordered way. Since they are already sorted, this is O(number of nodes in subtree). During the merge you can also find how many nodes qualify the condition and get the value of v.bigger for the node.
Why is this O(n logn)?
Every node in the tree counts through the number of nodes in its subtree. This means the root counts all the nodes in the tree, the sons of the root each count (combined) the number of nodes in the tree (yes, yes, -1 for the root) and so on all nodes in the same height count together the number of nodes that are lower. This gives us that the number of nodes counted is number of nodes * height of the tree - which is O(n logn)
What if for each node we keep a separate binary search tree (BST) which consists of nodes of the subtree rooted at that node.
For a node v at level k, merging the two subtrees v.left and v.right which both have O(n/2^(k+1)) elements is O(n/2^k). After forming the BST for this node, we can find v.bigger in O(n/2^(k+1)) time by just counting the elements in the right (traditionally) subtree of the BST. Summing up, we have O(3*n/2^(k+1)) operations for a single node at level k. There are a total of 2^k many level k nodes, therefore we have O(2^k*3*n/2^(k+1)) which is simplified as O(n) (dropping the 3/2 constant). operations at level k. There are log(n) levels, hence we have O(n*log(n)) operations in total.

Partitioning a weighted tree to equally weighted subtrees

Input:
a rooted tree with n nodes;
each node p has positive integer weight w(p);
a node can have more than two children.
Problem:
divide the tree into k subtrees/partitions (obviously by removing k-1 edges);
subtree weight W(p) is the weight of all the nodes in a subtree rooted at node p;
all the subtrees should be weighted as evenly as possible - the difference between min(W(p)) and max(W(p)) should be as small as possible.
I've yet to find a suitable algorithm for this. Where should I start? Tips, instructions and pseudocode appreciated.
Assume you can't modify the tree other than to remove edges to create subtrees.
First understand that you cannot guarantee that by simply removing edges that you will have subtrees within an arbitrary bound. You can create tree that when you split them there is no way to create subtrees within a target bound. For example:
a(b(c,d,e,f),g)
You cannot split that into two balanced sections. The best you can do is remove the edge from a to b:
a(g) and b(c,d,e,f)
Also this criteria is a little underdefined when k > 2. What is better a split 10,10,10,1 or 10,10,6,5?
But you can come up with a method to split trees up in the most balanced way possible.
Implement you tree such that each node holds a count of all of its children. You can add this pretty efficiently to any tree. ( E.g. when you add a node you have to iterate up the chain of parent node incrementing the count. Remove a node and you iterate up subtracting from the count )
Then starting from the root iterate down, in a breadth first manner until you find a set of nodes that dominate child nodes in a way that is most balanced. I don't have an algorithm for this at the ready - but I think you can find one pretty readily.
I think something where when you want to divide into k subtrees you create an array of k tree roots. One of those nodes must always be the root of the current tree, then you iterate down looking for nodes to replace on of the k-1 candidates that improves the partitioning. You'll want some kind of terminating condition where you don't interate down to every leaf node. E.g. it never makes sense to subdivide anything by the largest candidate node.

Resources