Ordering binary search trees - binary-tree

When trying to order a set of numbers into a binary search tree, is there always exactly one way to order them so the tree has the shortest height, in other words most efficient?

A set of numbers can be converted to a BST by taking one element as the root of the tree and arranging all other numbers around it. I could see the following situation contradicting this theory:
Picking one root leads to a tree of height h, with the left subtree being 'taller' than the right subtree.
Picking another root leads to a different tree, also of height h, with the right subtree being 'taller' than the left subtree.
Another simple example involves swapping the order of insertion of two consecutive elements that are not directly related, and thus do not affect each other's position in the tree.
Disproof by counter-example.
Let the set S = {0, 1, 2, 3}.
Insert the elements into a binary search tree in the following order: 1, 0, 2, 3
1
/ \
0 2
\
3
Insert the elements into a binary search tree in the following order: 1, 2, 0, 3
1
/ \
0 2
\
3
Because these two trees have different orders of insertion, and yet both have minimum height, the statement that there is only one order of insertion that provides a binary search tree of minimum height is false.
If the actual ordering of elements on the tree is what you're concerned about, insert the elements of the set in the following order: 2, 1, 0, 3
2
/ \
1 3
/
0
Again, this tree has the same height as the previous trees, thus showing that a different ordering of items in the tree can also produce a tree of minimum height.
(An aside)
You can always build a minimum height tree by first sorting the elements of the set, then continually subdividing the sorted set to ensure balance and complete filling of each row.
Take the median element of the set. In the case of an even number of elements, take the larger of the two 'middle' elements. This will become the root of the tree.
Take all the elements below the median. This will become the left subtree of the root.
Take all the elements above the median. This will become the right subtree of the root.
Recursively create the left and right subtrees from these sets.
This should ensure that you have a complete binary tree, which will always be of minimum height.

Related

Printing a binary tree in Prolog based on a matrix

I wanted to print a binary tree by making a matrix based on coordinates which will corresponding nodes have in the tree. The binary tree is balanced and for every node goes that the nodes on the left are less than it and the nodes on the right are greater than it.
My tree is represented by a structure s(L,K,D), where L denotes the left branch N the node, and D the right branch. When it comes to an end the tree finishes with nil. For example, a simple tree with 1 as its first node and 2 and 3 its left and right node, will be represented as s(s(nil,2,nil),1,s(nil,3,nil)).
Binary tree example
In this example, node 7 will have coordinates (3,2). 2 because that is its depth in the binary tree (counting from 0), and 3 because of its index in the monotonically increasing list of nodes of the tree.( the list is [1,2,3,7...]).
Basically I can pack a tree in a matrix of dimensions NxM (N being the number of nodes, and M being the absolute depth of the tree[in this case 3]).
For this example the desired matrix would be:
Matrix example
, where I put ' ' character - where the blanks are, in order for the printing to be achieved in the desired way.
Is there any way to code a function which will generate this matrix?

Why segment tree needs to be a full binary tree?

When constructing a segment tree, why it needs to be a full binary tree? I took some example input arrays and when made them to complete binary tree i am getting the same minimum in a range result. Then why make it a full binary tree when complete binary tree is also giving the same result.
Input array- 1, 3, 5, 7, 9, 11
The array representation of the segment tree for example array 1, 3, 5, 7, 9, 11 can be stored like (assuming the range query is finding the minimum element)
Sorry for a lazy diagram.
The number of tree nodes is calculated as pow*2-1 where pow = smallest power of 2 greater than or equal to n.
Clearly the above segment tree representation is a full binary tree and not a complete binary tree.
Can you share your segment tree array representation how are you storing it as a complete binary tree?
A segment tree is not a complete binary tree. For a binary tree to be complete, all levels except the last one need to be completely filled. Also, for the last level, nodes should be as left as possible.
Considering your input array with 6 elements, the tree will be as follows ( assuming 1-6 indices):-
Clearly, the last level has leaf nodes 1 and 2 followed by missing nodes that could have been children of node 3 (hypothetically) and then leaf nodes 4 and 5. This level has missing nodes with nodes 4 and 5 not as left as possible.
Whereas, every node has either 0 or 2 children making it a full binary tree.

Given a number n, how many balanced binary trees (not binary search trees) are there?

The definition of balanced in this question is
The number of nodes in its left subtree and the number of nodes in its
right subtree are almost equal, which means their difference is not
greater than one
if given a n as the number of nodes in total, how many are there such trees?
Also what if we replace the number of nodes with height? Given a height, how many height balanced trees are there?
Well the difference will be made only by the last level, hence you can just find how many nodes should be left for that one, and just consider all possible combinations. Having n nodes you know that the height should be floor(log(n)) hence the same tree at depth k = floor(log(n)) - 1 is fully balanced, hence you know that is needs (m = sum(i=0..k)2^i) nodes, hence n-m nodes are left for the last level. Some definition of a balanced binary tree force "all the nodes to be left aligned", in this case it is obvious that there can be only one possibility, without this constraint you have combinations of 2^floor(log(n)) chooses n-m, because you have to pick which of the 2^floor(log(n)) possible slots you will assign with nodes, forcing a total of n-m nodes to be assigned.
For the height story you consider a sum of combinations of 2^floor(log(n)) chooses i as i goes from 1 to 2^floor(log(n)). You consider all possibilities of having either 1 node at the last level, then 2 and so on, until you don't make it a fully balanced binary tree, hence having all 2^floor(log(n)) slots assigned.

Disjoint-set forests - why should the rank be increased by one when the find of two nodes are of same rank?

I am implementing the disjoint-set datastructure to do union find. I came across the following statement in Wikipedia:
... whenever two trees of the same rank r are united, the rank of the result is r+1.
Why should the rank of the joined tree be increased by only one when the trees are of the same rank? What happens if I simply add the two ranks (i.e. 2*r)?
First, what is rank? It is almost the same as the height of a tree. In fact, for now, pretend that it is the same as the height.
We want to keep trees short, so keeping track of the height of every tree helps us do that. When unioning two trees of different height, we make the root of the shorter tree a child of the root of the taller tree. Importantly, this does not change the height of the taller tree. That is, the rank of the taller tree does not change.
However, when unioning two trees of the same height, we make one root the child of the other, and this increases the height of that overall tree by one, so we increase the rank of that root by one.
Now, I said that rank was almost the same as the height of the tree. Why almost? Because of path compression, a second technique used by the union-find data structure to keep trees short. Path compression can alter an existing tree to make it shorter than indicated by its rank. In principle, it might be better to make decisions based on the actual height than using rank as a proxy for height, but in practice, it is too hard/too slow to keep track of the true height information, whereas it is very easy/fast to keep track of rank.
You also asked "What happens if I simply add the two ranks (i.e. 2*r)?" This is an interesting question. The answer is probably nothing, meaning everything will still work just fine, with the same efficiency as before. (Well, assuming that you use 1 as your starting rank rather than 0.) Why? Because the way rank is used, what matters is the relative ordering of ranks, not their absolute magnitudes. If you add them, then your ranks will be 1,2,4,8 instead of 1,2,3,4 (or more likely 0,1,2,3), but they will still have exactly the same relative ordering so all is well. Your rank is simply 2^(the old rank). The biggest danger is that you run a larger risk of overflowing the integer used to represent the rank when dealing with very large sets (or, put another way, that you will need to use more space to store your ranks).
On the other hand, notice that by adding the two ranks, you are approximating the size of the trees rather than the heights of the trees. By always adding the two ranks, whether they are equal or not, then you are exactly tracking the sizes of the trees. Again, everything works just fine, with the same caveats about the possibility of overflowing integers if your trees are very large.
In fact, union-by-size is widely recognized as a legitimate alternative to union-by-rank. For some applications, you actually want to know the sizes of the sets, and for those applications union-by-size is actually preferabe to union-by-rank.
Because in this case - you add one tree is a "sub tree" of the other - which makes the original subtree increase its size.
Have a look at the following example:
1 3
| |
2 4
In the above, the "rank" of each tree is 2.
Now, let's say 1 is going to be the new unified root, you will get the following tree:
1
/ \
/ \
3 2
|
4
after the join the rank of "1" is 3, rank_old(1) + 1 - as expected.1
As for your second question, because it will yield false height for the trees.
If we take the above example, and merge the trees to get the tree of rank 3. What would happen if we then want to merge it with this tree2:
9
/ \
10 11
|
13
|
14
We'll find out both ranks are 4, and try to merge them the same way we did before, without favoring the 'shorter' tree - which will result in trees with higher height, and ultimately - worse time complexity.
(1) Disclaimer: The first part of this answer is taken from my answer to a similar question (though not identical due to your last part of the question)
(2) Note that the above tree is syntatically made, it cannot be created in an optimized disjoint forests algorithms, but it still demonstrates the issues needed for the answer.
If you read that paragraph in a little more depth, you'll realize that rank is more like depth, not size:
Since it is the depth of the tree that affects the running time, the tree with smaller depth gets added under the root of the deeper tree, which only increases the depth if the depths were equal. In the context of this algorithm, the term "rank" is used instead of "depth" ...
and a merge of equal depth trees only increases the depth of the tree by one since the root of the one is added to the root of the other.
Consider:
A D
/ \ merged with / \
B C E F
is:
A
/|\
B C D
/ \
E F
The depth was 2 for both, and it's 3 for the merged one.
Rank represents the depth of the tree, not the number of nodes in it. When you join a tree with a smaller rank with a tree with a larger rank, the overall rank remains the same.
Consider adding a tree with rank 4 to the root of the tree of rank 6: since we added a node above the root of the depth-4 tree, that subtree now has a rank of 5. The subtree to which we've added our depth-4 tree, however, is 6, so the rank does not change.
Now consider adding a tree with rank 6 to the root of a second tree of rank 6: since the root of the first depth-6 tree now has an extra node above it, the rank of that subtree (and the tree overall) changes to 7.
Since the rank of the tree determines the processing speed, the algorithm tries to keep the rank as low as possible by always attaching a shorter tree to the taller one, keeping the overall rank unchanged. The rank changes only when the trees have identical ranks, in which case one of them gets attached to the root of the other, bumping up the rank by one.
Actually Here two important properties should be known very well to us ....
1) What is Rank ?
2) Why Rank is Used ???
Rank is nothing but the depth of a tree .U can say rank as depth (level) of a tree . When we make union nodes then these (graph nodes ) will be formed as a tree with an ultimate root node.Rank is expressed only for those root nodes .
A merged with D
Initially A has rank (level) 0 and D has rank(level) 0 . So u can merge them making anyone of them as a root . Because if u make A as root the rank(level) will be 1
and if u make D as a root then the rank will also be 1
A
`D
Here rank ( level ) is 1 when root is A .
Now think for another ,
A merge B -----> A
`D `C / \
D B
\
C
So the level will be increased by 1 , see exactly without root (A) there is at most height / depth / rank is 2 . rank[ 1] -> {D,B} and rank [2] -> {C} ................
Now our main objective is to make tree with minimum rank(depth) as possible while merging ..
Now when two differnt rank tree merge ,then
A(rank 0) merge B(rank 1)---> B Here merged tree rank is 1 same as high rank (1)
`C / \
A C
When small rank goes under over high rank . Then the merged tree's rank(height/depth) will be the same rank associated with higher rank tree .That means the rank will not increase , the merged tree rank will be same as higher rank before ...
But if we will do the reverse work means high rank tree goes under over low rank tree then see ,
A ( rank 0 ) merge B (rank 1 ) --> A ( merged tree rank 2 greater than both )
`C `B
`C
So , whatever is seen from following observation is that if we try to keep rank (height) of merged tree as minimum possible then , we have to choose the first process. i think this part is clear !!
Now u have to understand what is our objective to keep tree's height minimum as possible ..........
when we use disjoint set union then for path compression ( finding ultimate root with whom a node is connected ) when we traverse from a node to it's root node then if it's height (rank) is long then time processing will be slow .That's why when we try to merge two trees then we try to keep heigh/depth/rank as minimum as possible

Is it always possible to turn one BST into another using tree rotations?

Given a set of values, it's possible for there to be many different possible binary search trees that can be formed from those values. For example, for the values 1, 2, and 3, there are five BSTs we can make from those values:
1 1 2 3 3
\ \ / \ / /
2 3 1 3 1 2
\ / \ /
3 2 2 1
Many data structures that are based on balanced binary search trees use tree rotations as a primitive for reshaping a BST without breaking the required binary search tree invariants. Tree rotations can be used to pull a node up above its parent, as shown here:
rotate
u right v
/ \ -----> / \
v C A u
/ \ <----- / \
A B rotate B C
left
Given a BST containing a set of values, is it always possible to convert that BST into any arbitrary other BST for the same set of values? For example, could we convert between any of the five BSTs above into any of the other BSTs just by using tree rotations?
The answer to your question depends on whether you are allowed to have equal values in the BST that can appear different from one another. For example, if your BST stores key/value pairs, then it is not always possible to turn one BST for those key/value pairs into a different BST for the same key/value pairs.
The reason for this is that the inorder traversal of the nodes in a BST remains the same regardless of how many tree rotations are performed. As a result, it's not possible to convert from one BST to another if the inorder traversal of the nodes would come out differently. As a very simple case, suppose you have a BST holding two copies of the number 1, each of which is annotated with a different value (say, A or B). In that case, there is no way to turn these two trees into one another using tree rotations:
1:a 1:b
\ \
1:b 1:a
You can check this by brute-forcing the (very small!) set of possible trees you can make with the rotations. However, it suffices to note that an inorder traversal of the first tree gives 1:a, 1:b and an inorder traversal of the second tree gives 1:b, 1:a. Consequently, no number of rotations will suffice to convert between the trees.
On the other hand, if all the values are different, then it is always possible to convert between two BSTs by applying the right number of tree rotations. I'll prove this using an inductive argument on the number of nodes.
As a simple base case, if there are no nodes in the tree, there is only one possible BST holding those nodes: the empty tree. Therefore, it's always possible to convert between two trees with zero nodes in them, since the start and end tree must always be the same.
For the inductive step, let's assume that for any two BSTs of 0, 1, 2, .., n nodes with the same values, that it's always possible to convert from one BST to another using rotations. We'll prove that given any two BSTs made from the same n + 1 values, it's always possible to convert the first tree to the second.
To do this, we'll start off by making a key observation. Given any node in a BST, it is always possible to apply tree rotations to pull that node up to the root of the tree. To do this, we can apply this algorithm:
while (target node is not the root) {
if (node is a left child) {
apply a right rotation to the node and its parent;
} else {
apply a left rotation to the node and its parent;
}
}
The reason that this works is that every time a node is rotated with its parent, its height increases by one. As a result, after applying sufficiently many rotations of the above forms, we can get the root up to the top of the tree.
This now gives us a very straightforward recursive algorithm we can use to reshape any one BST into another BST using rotations. The idea is as follows. First, look at the root node of the second tree. Find that node in the first tree (this is pretty easy, since it's a BST!), then use the above algorithm to pull it up to the root of the tree. At this point, we have turned the first tree into a tree with the following properties:
The first tree's root node is the root node of the second tree.
The first tree's right subtree contains the same nodes as the second tree's right subtree, but possibly with a different shape.
The first tree's left subtree contains the same nodes as the second tree's left subtree, but possibly with a different shape.
Consequently, we could then recursively apply this same algorithm to make the left subtree have the same shape as the left subtree of the second tree and to make the right subtree have the same shape as the right subtree of the second tree. Since these left and right subtrees must have strictly no more than n nodes each, by our inductive hypothesis we know that it's always possible to do this, and so the algorithm will work as intended.
To summarize, the algorithm works as follows:
If the two trees are empty, we are done.
Find the root node of the second tree in the first tree.
Apply rotations to bring that node up to the root.
Recursively reshape the left subtree of the first tree to have the same shape as the left subtree of the second tree.
Recursively reshape the right subtree of the first tree to have the same shape as the right subtree of the second tree.
To analyze the runtime of this algorithm, note that applying steps 1 - 3 requires at most O(h) steps, where h is the height of the first tree. Every node will be brought up to the root of some subtree exactly once, so we do this a total of O(n) times. Since the height of an n-node tree is never greater than O(n), this means that the algorithm takes at most O(n2) time to complete. It's possible that it will do a lot better (for example, if the two trees already have the same shape, then this runs in time O(n)), but this gives a nice worst-case bound.
Hope this helps!
For binary search trees this can actually be done in O(n).
Any tree can be "straightened out", ie put into a form in which all nodes are either the root or a left child.
This form is unique (reading down from root gives the ordering of the elements)
A tree is straightened out as follows:
For any right child, perform a left rotation about itself. This decreases the number of right children by 1, so the tree is straightened out in O(n) rotations.
If A can be straightened out into S in O(n) rotations, and B into S in O(n) rotations, then since rotations are reversible one can turn A -> S -> B in O(n) rotations.

Resources