I want to insert 7 items into my tree: -3, -2, -1, 0, 1, 2 and 3. I get a well-balanced tree of height 3 without any rotations when I insert in this order: 0, -2, 2, -1, 1, -3, 3. But when I insert all items in ascending order, the right part of the root node gets rebalanced, but the left part of the root node doesn't. All rebalancing algorithms I have seen rebalance from the inserted node up to the root node, and then they stop. Shouldn't they continue into the opposite part of the root node? And I have the feeling it gets worse if I insert lots of items in ascending order (like 0 to 100). At the end the tree is balanced, but its height is not optimal.
None of the balanced binary search trees (red-black trees, AVL trees, etc.) provide absolute balancing; that is, none of them guarantee the minimum possible height. This is because such a "complete" rebalancing cannot be done fast. If we wanted to always keep the height at the minimum possible, tree operations would often require heavy rebalancing, so rebalancing would no longer run in O(log N). As a result, the operations themselves (insert, update, delete, etc.) would no longer run in O(log N) time either, which would defeat the whole idea of a balanced tree.
What balanced trees do guarantee is a looser requirement on tree height: the height is O(log N), that is, at most C*log N for some constant C. So the tree is not guaranteed to be ideally balanced, but its balance will never be far from ideal.
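To see how far from ideal an AVL tree actually ends up, here is a minimal sketch of textbook AVL insertion that compares the resulting height with the smallest height any binary tree of that size could have. The names (Node, insert, build, minimal_height) are made up for this illustration and heights are counted in levels, so a single node has height 1.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height in levels: a lone node has height 1


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)
    update(y)
    return y


def insert(node, key):
    """Insert key and rebalance on the way back up to the root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left side too tall
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)  # left-right case
        return rotate_right(node)
    if balance < -1:  # right side too tall
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def minimal_height(n):
    """Smallest possible height (in levels) of a binary tree with n nodes."""
    return n.bit_length()


if __name__ == "__main__":
    for keys in (range(-3, 4), range(0, 101)):
        n = len(keys)
        print(n, "keys: AVL height", height(build(keys)),
              "minimal height", minimal_height(n))
```

Only the nodes on the path from the new leaf to the root are ever touched, which is what keeps each insertion at O(log N) work.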
I was asked this question in a phone screen interview and I was not able to answer it. For example, in a BST, I know that the maximum number of nodes at height h is 2^h (assuming the root node is at height 0).
I wanted to ask: is there a similar mathematical result for a balanced binary search tree as well (AVL trees, red-black trees), i.e. for the number of nodes at a particular level k?
Thanks!
A perfectly balanced binary tree starts with one node, which has two children. Each of those then has two children again. So there will be at most 1, 2, 4, 8 and so on nodes per level.
As a formula you can use 2^(level-1) if the root is level 1, or 2^k for level k if the root is level 0 as in your question. The last level might not be completely full, so it can have fewer nodes.
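As a quick sanity check of the formula, here is a small sketch that uses the question's convention (root at level 0). The function names are made up for this example.

```python
def max_nodes_at_level(level):
    """Maximum number of nodes on one level, with the root at level 0."""
    return 2 ** level


def max_nodes_in_tree(height):
    """Maximum total number of nodes when the deepest level is `height`."""
    return 2 ** (height + 1) - 1


if __name__ == "__main__":
    for level in range(4):
        print("level", level, "holds at most", max_nodes_at_level(level), "nodes")
    print("a tree of height 3 holds at most", max_nodes_in_tree(3), "nodes")
```

The total is just the sum of the per-level maximums, 1 + 2 + 4 + ... + 2^height.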
As the balancing step is costly, implementations usually do not rebalance after every mutation of the tree. Instead they apply a heuristic to find out when rebalancing makes the most sense. So in practice, levels might have fewer nodes than if the tree were perfectly balanced, and there might be additional levels from nodes being inserted in less than ideal places.
Here is a red-black tree which seems unbalanced. If that is the case, can someone please explain why it is unbalanced?
The term "balanced" is a bit ambiguous, since different kinds of balanced trees have different constraints.
A red-black tree ensures that every path to a leaf has the same number of black nodes, and at least as many black nodes as red nodes. The result is that the longest path is at most twice as long as the shortest path, which is good enough to guarantee O(log N) time for search, insert, and delete operations.
Most other kinds of balanced trees have tighter balancing constraints. An AVL tree, for example, ensures that the lengths of the longest paths on either side of every node differ by at most 1. This is more than you need, and that has costs -- inserting or deleting in an AVL tree (after finding the target node) takes O(log N) operations on average, while inserting or deleting in a red-black tree takes O(1) operations on average.
If you wanted to keep a tree completely balanced, so that you had the same number of descendants on either side of every node, +/- 1, it would be very expensive -- insert and delete operations would take O(N) time.
Yes, it is balanced. The rule says that, counting the black NIL leaves, the longest possible path should consist of at most 2*B-1 nodes, where B is the number of black nodes on the shortest possible path from the root to any leaf. In your example the shortest path has 2 black nodes, so B = 2, so the longest path can have up to 3 nodes, but it has just 2.
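The "at most twice as long" property can be checked mechanically. Below is a minimal sketch that assumes a tree written as nested tuples (color, left, key, right), with None standing for a NIL leaf. The example tree is hypothetical (it is not the one from the question), and NIL leaves are not counted in path lengths here.

```python
RED, BLACK = "R", "B"


def black_height(node):
    """Black nodes on every path down to a NIL leaf; raises if a rule is broken."""
    if node is None:
        return 0
    color, left, _, right = node
    if color == RED:
        for child in (left, right):
            if child is not None and child[0] == RED:
                raise ValueError("red node with a red child")
    left_bh = black_height(left)
    right_bh = black_height(right)
    if left_bh != right_bh:
        raise ValueError("paths with different black counts")
    return left_bh + (1 if color == BLACK else 0)


def path_lengths(node):
    """(shortest, longest) root-to-NIL path length, counted in real nodes."""
    if node is None:
        return 0, 0
    _, left, _, right = node
    left_short, left_long = path_lengths(left)
    right_short, right_long = path_lengths(right)
    return 1 + min(left_short, right_short), 1 + max(left_long, right_long)


# Hypothetical example: it looks lopsided, yet it obeys the red-black rules.
tree = (BLACK,
        (BLACK, None, 1, None),
        2,
        (RED,
         (BLACK, None, 3, None),
         4,
         (BLACK, None, 5, (RED, None, 6, None))))

if __name__ == "__main__":
    shortest, longest = path_lengths(tree)
    print("black height:", black_height(tree))
    print("shortest path:", shortest, "longest path:", longest)
```

The right side of this tree is twice as deep as the left side, and that is still within what a red-black tree allows.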
When trying to order a set of numbers into a binary search tree, is there always exactly one way to order them so the tree has the shortest height, in other words most efficient?
A set of numbers can be converted to a BST by taking one element as the root of the tree and arranging all other numbers around it. I could see the following situation contradicting this theory:
Picking one root leads to a tree of height h, with the left subtree being 'taller' than the right subtree.
Picking another root leads to a different tree, also of height h, with the right subtree being 'taller' than the left subtree.
Another simple example involves swapping the order of insertion of two consecutive elements that are not directly related, and thus do not affect each other's position in the tree.
Disproof by counter-example.
Let the set S = {0, 1, 2, 3}.
Insert the elements into a binary search tree in the following order: 1, 0, 2, 3
  1
 / \
0   2
     \
      3
Insert the elements into a binary search tree in the following order: 1, 2, 0, 3
  1
 / \
0   2
     \
      3
Because these two trees have different orders of insertion, and yet both have minimum height, the statement that there is only one order of insertion that provides a binary search tree of minimum height is false.
If the actual ordering of elements on the tree is what you're concerned about, insert the elements of the set in the following order: 2, 1, 0, 3
    2
   / \
  1   3
 /
0
Again, this tree has the same height as the previous trees, thus showing that a different ordering of items in the tree can also produce a tree of minimum height.
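The three insertion orders above are easy to replay. This is a minimal sketch of a plain (unbalanced) BST stored as nested dicts. The helper names are made up for the example, and height is counted in edges, so a single node has height 0.

```python
def insert(tree, key):
    """Insert key into an unbalanced BST stored as nested dicts."""
    if tree is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < tree["key"] else "right"
    tree[side] = insert(tree[side], key)
    return tree


def build(keys):
    tree = None
    for key in keys:
        tree = insert(tree, key)
    return tree


def height(tree):
    """Height in edges; an empty tree has height -1."""
    if tree is None:
        return -1
    return 1 + max(height(tree["left"]), height(tree["right"]))


def shape(tree):
    """Nested-tuple view (left, key, right), handy for comparing two trees."""
    if tree is None:
        return None
    return (shape(tree["left"]), tree["key"], shape(tree["right"]))


if __name__ == "__main__":
    for order in ([1, 0, 2, 3], [1, 2, 0, 3], [2, 1, 0, 3]):
        tree = build(order)
        print(order, "-> height", height(tree), "shape", shape(tree))
```

The first two orders give the identical tree, and the third gives a different tree of the same height.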
(An aside)
You can always build a minimum height tree by first sorting the elements of the set, then continually subdividing the sorted set to ensure balance and complete filling of each row.
Take the median element of the set. In the case of an even number of elements, take the larger of the two 'middle' elements. This will become the root of the tree.
Take all the elements below the median. This will become the left subtree of the root.
Take all the elements above the median. This will become the right subtree of the root.
Recursively create the left and right subtrees from these sets.
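This should ensure that you get a balanced binary tree of minimum height. (It is not always a complete tree in the strict sense, since the last level may have gaps, but no other arrangement of the same elements is shorter.)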
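The recursive steps above can be sketched in a few lines. This assumes the input is already sorted and has no duplicates. The tree is returned as nested tuples (left, key, right), and the function names are made up for the example.

```python
def build_min_height(sorted_keys):
    """Build a minimum-height BST from an already sorted list of keys."""
    if not sorted_keys:
        return None
    # For an even number of keys this picks the larger of the two middle ones.
    mid = len(sorted_keys) // 2
    return (build_min_height(sorted_keys[:mid]),
            sorted_keys[mid],
            build_min_height(sorted_keys[mid + 1:]))


def height(tree):
    """Height in levels; an empty tree has height 0."""
    if tree is None:
        return 0
    left, _, right = tree
    return 1 + max(height(left), height(right))


if __name__ == "__main__":
    tree = build_min_height(sorted([3, -1, 0, 2, -3, 1, -2]))
    print("root:", tree[1], "height:", height(tree))
```

The list slicing copies data at every level, so this version takes O(n log n) time. Passing index bounds instead of slices would bring it down to O(n).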
I observed that the height of a 2-3-4 tree can differ depending on the order in which nodes are inserted.
E.g. inserting 1,2,3,4,5,6,7,8,9,10 will yield a tree of height 2,
while inserting in this order:
1, 5, 10, 2, 3, 8, 9, 4, 7, 8 will yield a tree of height 1.
Is this a normal property of a 2-3-4 tree? Inserting nodes in sequence will yield a very imbalanced tree in that case. I thought 2-3-4 trees should be balanced trees?
Thanks.
2-3-4 trees are indeed "balanced" trees in the sense that the height of the tree never exceeds some fixed bound relative to the number of nodes (which, if each node has exactly two values in it, is O(log n)). The term "balanced" here should be contrasted with "imbalanced," which would be a tree in which the height is "large" relative to the number of nodes. For example, this tree is highly imbalanced:
1
 \
  2
   \
    3
     \
      4
       \
        5
         \
          6
I think that you are assuming that the term "balanced" means "as compact as possible," which is not the case. It is absolutely possible to have multiple different insertion orders into a 2-3-4 tree produce trees of different height, some of which will have lower heights than others. However, the maximum possible height achievable is not too great compared to the total number of nodes in the tree, and therefore 2-3-4 trees are indeed considered balanced trees.
Hope this helps!
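A balanced tree usually means its height is O(log n).
A valid B-tree (including a 2-3-4 tree) satisfies the following constraints:
Every non-root node holds at least ceil(m/2) - 1 keys (so every non-root internal node has at least ceil(m/2) children), where m is the maximum number of children per node.
All leaves are at the same depth.
With these two constraints, a valid B-tree is guaranteed to have O(log n) height.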
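For a 2-3-4 tree specifically (m = 4), these constraints pin the height into a narrow range. A rough sketch follows, assuming height is counted in edges (a lone root has height 0). The function name is made up for the example.

```python
def height_bounds_234(n):
    """Return (lowest, highest) possible height of a 2-3-4 tree with n >= 1 keys."""
    # Shortest case: every node is a 4-node (3 keys), so a tree of
    # height h holds at most 4**(h + 1) - 1 keys.
    low = 0
    while 4 ** (low + 1) - 1 < n:
        low += 1
    # Tallest case: every node is a 2-node (1 key), so a tree of
    # height h holds at least 2**(h + 1) - 1 keys.
    high = 0
    while 2 ** (high + 2) - 1 <= n:
        high += 1
    return low, high


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, "keys -> height between", *height_bounds_234(n))
```

For 10 keys this gives a range of 1 to 2, which matches the two heights reported in the question if height is counted the same way there.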
The insertion algorithm for 2-3-4 trees splits 4-nodes "on the way" to the leaf node, since they cannot take another item. This allows insertion to be done in one pass and the tree remains balanced.
I have read in a couple of places that AVL trees search faster, but I am not able to understand why. As I understand it:
max height of red-black tree = 2*log(N+1)
max height of AVL tree = 1.44*log(N+1)
Is it because the AVL tree is shorter?
Yes.
The number of steps required to find an item depends on the distance between the item and the root.
Since the AVL tree is packed tighter (i.e. it has a lower max height) it means more items are closer to the root than in the red-black case.
The extra tight packing also means the AVL tree requires more work when inserting elements.
The best choice for any app depends on whether it is insert intensive or search intensive...
An AVL tree is better than a red-black tree if the input keys are almost ascending or descending, because then each insertion needs only a single rotation (the left-left or right-right case). Also, since the tree is more tightly balanced, searches are faster.
But for randomly ordered input keys, red-black trees are better, since they require fewer rotations per insertion than AVL trees.
Overall, it depends on the input sequence, which decides how tilted the tree is, and on the operations performed. For insert-intensive use, pick a red-black tree; for search-intensive use, pick an AVL tree.
AVL tree and RBTree do have respective advantages as well as disadvantages. You'll perceive that better if you've already learned how they work.
AVL is slightly faster than RBTree in insert operation because there would be at most one rotation involved in insertion, while there may be two for RBTree.
RBTree only require at most three rotations in deletion, but this is not guaranteed in AVL. So it can delete nodes faster than AVL.
However, above all, they both have strict logarithmic tree height.
Pick up any subtree, the property that makes AVL "balanced" guarantees that the difference of height between two child subtrees is at most one, which is to say, intuitively, the whole tree is rigidly balanced.
But when it comes to an RBTree, the rule is looser, since the red-black properties only guarantee that the height of the tree is at most about twice the logarithm of the total number of nodes.
Here're some facts that may be more precise:
An AVL tree's height is strictly less than approximately 1.44*log2(n+2) - 0.328.
A red-black tree's height is at most 2*log2(n+1).
See https://en.wikipedia.org/wiki/AVL_tree#Comparison_to_other_structures for detailed information.
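To get a feel for the gap between the two bounds, here is a small sketch that just evaluates both formulas with base-2 logarithms. The function names are made up, and the AVL constants are the rounded ones quoted above, so treat the AVL figures as approximate.

```python
import math


def avl_height_bound(n):
    """Approximate upper bound on the height of an AVL tree with n nodes."""
    return 1.44 * math.log2(n + 2) - 0.328


def rb_height_bound(n):
    """Upper bound on the height of a red-black tree with n nodes."""
    return 2 * math.log2(n + 1)


if __name__ == "__main__":
    for n in (15, 1023, 1_000_000):
        print(f"n={n}: AVL < {avl_height_bound(n):.1f}, "
              f"red-black <= {rb_height_bound(n):.1f}, "
              f"perfectly balanced = {math.ceil(math.log2(n + 1))}")
```

Both bounds grow logarithmically. The AVL worst case is about 1.44 times the ideal height, while the red-black worst case is about 2 times.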