2-3-4 tree height imbalanced - data-structures

I observed that the height of a 2-3-4 tree can be different depending on the order in which nodes are inserted.
e.g. inserting 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 will yield a tree of height 2,
while inserting in this order:
1, 5, 10, 2, 3, 8, 9, 4, 7, 8 will yield a tree of height 1.
Is this a normal property of a 2-3-4 tree? Inserting nodes in sequence will yield a very imbalanced tree in that case. I thought 2-3-4 trees should be balanced trees?
Thanks.

2-3-4 trees are indeed "balanced" trees in the sense that the height of the tree never exceeds some fixed bound relative to the number of nodes; that bound is O(log n). The term "balanced" here should be contrasted with "imbalanced," which would describe a tree whose height is large relative to the number of nodes. For example, this tree is highly imbalanced:
1
 \
  2
   \
    3
     \
      4
       \
        5
         \
          6
I think that you are assuming that the term "balanced" means "as compact as possible," which is not the case. It is absolutely possible for different insertion orders into a 2-3-4 tree to produce trees of different heights, some lower than others. However, the maximum achievable height is still only logarithmic in the total number of nodes, and therefore 2-3-4 trees are indeed considered balanced trees.
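For comparison, a plain binary search tree with no rebalancing collapses into exactly the chain drawn above when the keys arrive in sorted order. A minimal sketch (plain Python with a throwaway node class, purely for illustration):

```
class BSTNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    """Naive BST insertion -- no rebalancing of any kind."""
    if root is None:
        return BSTNode(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root

def bst_height(root):
    """Height in edges; an empty tree has height -1."""
    if root is None:
        return -1
    return 1 + max(bst_height(root.left), bst_height(root.right))

root = None
for k in range(1, 7):       # insert 1..6 in ascending order
    root = bst_insert(root, k)
print(bst_height(root))      # 5 -- one node per level, the chain above
```

A 2-3-4 tree never degrades like this, because its insertion rule keeps all leaves at the same depth.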
Hope this helps!

A balanced tree usually means its height is O(log n).
A valid B-tree (including a 2-3-4 tree) has the following constraints:
every non-root node has at least ⌈m/2⌉ children (where m is the maximum number of children per node);
all leaves are at the same depth.
With these two constraints, a valid B-tree can be proved to have O(log n) height.
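For the 2-3-4 case (m = 4) the counting argument is easy to spell out: since all leaves share the same depth, a tree of height h holds at least 2^(h+1) - 1 keys (every node a 2-node) and at most 4^(h+1) - 1 keys (every node a 4-node), so n >= 2^(h+1) - 1 and therefore h <= log2(n+1) - 1. A quick sketch evaluating those bounds:

```
from math import log2

def min_keys(h):
    """Fewest keys in a 2-3-4 tree of height h: every node is a 2-node."""
    return 2 ** (h + 1) - 1

def max_keys(h):
    """Most keys in a 2-3-4 tree of height h: every node is a 4-node."""
    return 4 ** (h + 1) - 1

for h in range(5):
    print(h, min_keys(h), max_keys(h))

# n >= min_keys(h) rearranges to h <= log2(n + 1) - 1, i.e. height is O(log n)
print(log2(10 + 1) - 1)   # ~2.46: with 10 keys the height can be at most 2
```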

I observed that the height of a 2-3-4 tree can be different depending on the order in which nodes are inserted.
The insertion algorithm for 2-3-4 trees splits 4-nodes "on the way" to the leaf node, since they cannot take another item. This allows insertion to be done in one pass and the tree remains balanced.
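As a rough illustration of that one-pass, split-on-the-way-down insertion, here is a minimal sketch in Python (the Node class, key layout, and helper names are my own, chosen for illustration; it assumes distinct keys and is not production code):

```
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # 1 to 3 sorted keys (0 only in an empty root)
        self.children = children or []  # either empty (leaf) or len(keys) + 1 children

    def is_leaf(self):
        return not self.children

    def is_full(self):
        return len(self.keys) == 3      # a 4-node

def split_child(parent, i):
    """Split the 4-node parent.children[i], pushing its middle key into parent."""
    child = parent.children[i]
    left = Node(child.keys[:1], child.children[:2])
    right = Node(child.keys[2:], child.children[2:])
    parent.keys.insert(i, child.keys[1])
    parent.children[i:i + 1] = [left, right]

def insert(root, key):
    """One-pass insertion: any full (4-)node met on the way down is split first."""
    if root.is_full():                  # splitting the root is the only way height grows
        root = Node(children=[root])
        split_child(root, 0)
    node = root
    while not node.is_leaf():
        i = sum(k < key for k in node.keys)
        if node.children[i].is_full():
            split_child(node, i)
            i = sum(k < key for k in node.keys)
        node = node.children[i]
    node.keys.insert(sum(k < key for k in node.keys), key)
    return root

def height(node):                       # all leaves sit at the same depth
    return 0 if node.is_leaf() else 1 + height(node.children[0])

root = Node()
for k in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:
    root = insert(root, k)
print(height(root))                     # 2, matching the question's sequential example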

Related

How many permutations of 1, 2,..., n yield a skew tree? [duplicate]

I know what a binary search tree is and I know how it works. But what does it take for it to become a skewed tree? What I mean is, do all nodes have to go on one side, or are there other combinations?
Is having a tree in this shape (see below) the only way to make it a skewed tree? If not, what other skewed trees are possible?
Skewed tree example:
Also, I searched but couldn't find a good solid definition of a skewed tree. Does anyone have a good definition?
I figured out that a skewed tree is the worst case of a tree.
The number of permutations of 1, 2, ..., n = n!
The number of BST shapes = (1/(n+1)) * (2n)! / (n! * n!)  (the n-th Catalan number)
The number of skewed trees of 1, 2, ..., n = 2^(n-1)
Here is an example I was shown:
http://i61.tinypic.com/4gji9u.png
A good definition for a skew tree is a binary tree such that all the nodes except one have one and only one child. (The remaining node has no children.) Another good definition is a binary tree of n nodes such that its depth is n-1.
A binary tree that is dominated solely by left child nodes or solely by right child nodes is called a skewed binary tree, more specifically a left-skewed or right-skewed binary tree.
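Those counts are easy to sanity-check by brute force for small n; a quick sketch (BST nodes are just (key, left, right) tuples, purely for illustration):

```
from itertools import permutations
from math import comb, factorial

def insert(t, key):
    """Plain BST insertion on (key, left, right) tuples; keys are distinct."""
    if t is None:
        return (key, None, None)
    k, left, right = t
    return (k, insert(left, key), right) if key < k else (k, left, insert(right, key))

def shape(t):
    """The shape of a tree with the keys stripped out."""
    return None if t is None else (shape(t[1]), shape(t[2]))

def is_skewed(t):
    """Every node has at most one child (so exactly one node is childless)."""
    return t is None or ((t[1] is None or t[2] is None)
                         and is_skewed(t[1]) and is_skewed(t[2]))

n = 4
shapes, skewed = set(), 0
for p in permutations(range(1, n + 1)):
    t = None
    for x in p:
        t = insert(t, x)
    shapes.add(shape(t))
    skewed += is_skewed(t)

print(factorial(n), len(shapes), skewed)        # 24 permutations, 14 shapes, 8 skewed
print(comb(2 * n, n) // (n + 1), 2 ** (n - 1))  # the formulas above: 14 and 8
```

For n = 4 this prints 24 14 8 followed by 14 8; each skewed shape happens to be produced by exactly one insertion order, which is why the number of permutations yielding a skewed tree matches 2^(n-1).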

Do Red-Black tree and AVL tree have same balance condition?

For example:
An unbalanced tree:
  4
 / \
1   11
    /
   7
  / \
 5   9
After balancing, it will become an AVL tree:
    11
   /  \
  4    7
 /    / \
1    5   9
However, if the unbalanced tree is an R-B tree like this:
(\\ means red pointers. \ means black pointers.)
    4
   / \\
  1    11
       /
      7
    // \\
   5     9
Is this a legal R-B tree? Or should I rebalance it just like I did with the tree above?
The balance condition of AVL trees is different from the balance condition of Red-Black trees. An AVL tree is, in a sense, more balanced than a Red-Black tree.
In an AVL tree, for every node v, the difference between the height of v's right sub-tree and the height of v's left sub-tree must be at most 1. This is a very strict balance property compared to the one imposed by the Red-Black tree. In a Red-Black tree, no simple path from the root to a leaf is allowed to be more than twice as long as any other. This property results from the five color conditions a binary search tree must satisfy in order to be considered a valid Red-Black tree.
Your example Red-Black tree is indeed not balanced in the AVL sense because the difference between the height of the root's left sub-tree and the height of its right sub-tree is 2. Nevertheless, the tree is balanced in the Red-Black sense as it satisfies the five Red-Black color conditions.
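That per-node check is easy to state directly; a small sketch (trees here are just (key, left, right) tuples, not a real AVL implementation):

```
def height(t):
    return -1 if t is None else 1 + max(height(t[1]), height(t[2]))

def is_avl_balanced(t):
    """Every node's two sub-trees must differ in height by at most 1."""
    if t is None:
        return True
    return (abs(height(t[1]) - height(t[2])) <= 1
            and is_avl_balanced(t[1])
            and is_avl_balanced(t[2]))

leaf = lambda k: (k, None, None)
# The unbalanced tree from the question: 4 -> (1, 11), 11 -> (7, .), 7 -> (5, 9)
tree = (4, leaf(1), (11, (7, leaf(5), leaf(9)), None))
print(is_avl_balanced(tree))   # False: the root's sub-trees differ in height by 2
```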
The AVL balance condition implies that the height of an AVL tree is bounded by roughly 1.44log(n) while there's nothing preventing the height of a Red-Black tree from being greater: the Red-Black tree balance condition only implies a bound of 2log(n) on the height.
The fact that AVL trees tend to be shorter than Red-Black trees seems to suggest they must perform better. This is not necessarily the case: an AVL tree is indeed generally faster to search because it's more balanced than a Red-Black tree. But the reason an AVL tree is so well balanced is that keeping it balanced is computationally harder than keeping a Red-Black tree balanced. In particular, the AVL rotations make insertions and deletions in an AVL tree slower in comparison to these operations in the corresponding Red-Black tree.

Red Black Trees Rebalancing

I want to insert 7 items into my tree: -3, -2, -1, 0, 1, 2 and 3. I get a well balanced tree of height 3 without doing rotations when I insert in this order: 0, -2, 2, -1, 1, -3, 3. But when I insert all items in ascending order, the right part of the root node gets rebalanced, but the left part of the root node doesn't. All rebalancing algorithms I have seen do rebalancing from the inserted node up to the root node, and then they stop. Shouldn't they continue to the opposite part of the root node? And I have the feeling it gets worse if I insert lots of items in ascending order (like 0 to 100). In the end the tree is balanced, but its height is not optimal.
None of the balanced binary search trees (R-B trees, AVL trees, etc.) provide absolute balancing, that is, none of them provide the minimal possible height. This is because such "complete" rebalancing cannot be done quickly. If we always wanted to keep the height at the minimum possible value, heavy rebalancing would often be required on tree operations, and therefore the rebalancing would not work in O(log N). As a result, all the operations (insert, update, delete, etc.) would no longer work in O(log N) time either, and this would destroy the whole idea of a balanced tree.
What balanced trees do guarantee is a not-so-strict requirement on tree height: the tree height is O(log N), i.e. at most C*log N for some constant C. So the tree is not guaranteed to be ideally balanced, but its balance will always be not far from ideal.
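To put numbers on "not far from ideal", here is a quick sketch comparing the minimum possible height of a binary tree with n nodes against the classic worst-case bound of 2*log2(n+1) for red-black trees (so the constant C from above is 2):

```
from math import ceil, log2

def ideal_height(n):
    """Minimum possible height (in edges) of any binary tree with n nodes."""
    return ceil(log2(n + 1)) - 1

def rb_height_bound(n):
    """Classic worst-case height bound for a red-black tree with n nodes."""
    return 2 * log2(n + 1)

for n in (7, 101, 1_000_000):
    print(n, ideal_height(n), round(rb_height_bound(n), 1))
# 7        2   6.0
# 101      6  13.3
# 1000000 19  39.9
```

So even in the worst case, a red-black tree holding the 101 keys 0..100 is at most about twice as tall as a perfectly height-optimized tree.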

Height difference between leaves in an AVL tree

What is the maximum difference between any two leaves in an AVL tree? If I take an example, my tree becomes unbalanced if the height difference is more than 2 (for any two leaves), but the answer is that the difference can be any value. I really don't understand how this is possible. Can anyone explain with examples?
The difference in levels of any two leaves can be any value! The definition of an AVL tree constrains the height difference only between the two sub-trees of a single node.
So you fill the sub-trees to equal height and then add new nodes just to create that single-node difference. But nothing says those sub-trees can't contain further sub-trees built in exactly the same way. Of course the tree is self-balanced, but if we are careful not to disturb its balance we can create any height difference between some pair of leaves.
Example with leaf 24 on level 3 and leaf 10 on level 6:
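One concrete family of examples (a sketch, shapes only): Fibonacci trees are the sparsest shapes that still satisfy the AVL condition, and in them the shallowest and deepest leaves drift further and further apart as the height grows. Nodes below are bare (left, right) pairs, so this only demonstrates shape:

```
def fib_tree(h):
    """Minimal-size AVL shape of height h (a Fibonacci tree)."""
    if h == 0:
        return (None, None)            # a single leaf
    if h == 1:
        return ((None, None), None)    # two nodes
    return (fib_tree(h - 1), fib_tree(h - 2))

def leaf_depths(t, d=0):
    if t == (None, None):
        return [d]
    left, right = t
    depths = []
    if left is not None:
        depths += leaf_depths(left, d + 1)
    if right is not None:
        depths += leaf_depths(right, d + 1)
    return depths

for h in range(2, 8):
    ds = leaf_depths(fib_tree(h))
    print(h, min(ds), max(ds))   # the gap max - min keeps growing with the height
```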
As explained in the Wikipedia article on AVL trees, the balancing operations rearrange the tree so that, at every node, the heights of the two child sub-trees differ by at most one. This is the key property of the data structure which makes the retrieval of nodes efficient (namely logarithmic in the number of nodes of the tree, as a path from the root to a leaf is traversed in the worst case).

Why is avl tree faster for searching than red black tree?

I have read in a couple of places that AVL trees are faster to search, but I am not able to understand why. As I understand it:
max height of red-black tree = 2*log(N+1)
max height of AVL tree = 1.44*log(N+1)
Is it because AVL is shorter?
Yes.
The number of steps required to find an item depends on the distance between the item and the root.
Since the AVL tree is packed tighter (i.e. it has a lower max height) it means more items are closer to the root than in the red-black case.
The extra tight packing also means the AVL tree requires more work when inserting elements.
The best choice for any app depends on whether it is insert intensive or search intensive...
An AVL tree is better than a red-black tree if the input keys are almost ascending/descending, because then we only have to do single rotations (the left-left or right-right case) to add each element. Also, since the tree would be tightly balanced, searches would also be faster.
But for randomly ordered input keys, an RB tree is better, since it requires fewer rotations per insertion in comparison to AVL.
Overall, it depends on the input sequence, which decides how tilted our tree is, and on the operations performed. For insert-intensive use, pick a Red-Black tree, and for search-intensive use, pick AVL.
AVL trees and RB trees each have their own advantages as well as disadvantages. You'll perceive that better once you've learned how they work.
AVL is slightly faster than an RB tree for insertion because there is at most one rotation involved per insertion, while there may be two for an RB tree.
An RB tree only requires at most three rotations for a deletion, but this is not guaranteed in AVL. So it can delete nodes faster than AVL.
However, above all, both have strictly logarithmic tree height.
Pick any subtree: the property that makes AVL "balanced" guarantees that the difference in height between its two child subtrees is at most one, which is to say, intuitively, that the whole tree is rigidly balanced.
But when it comes to an RB tree, the rule is "looser", since the RB tree properties can only guarantee that the height of the tree is no larger than twice the logarithm of the total number of nodes.
Here are some facts that may be more precise:
An AVL tree's height is strictly less than 1.44*log(n+2) - 0.328 (approximately).
A red-black tree's height is at most 2*log(n+1).
See https://en.wikipedia.org/wiki/AVL_tree#Comparison_to_other_structures for detailed information.
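Plugging a few values of n into those two bounds makes the gap concrete; a quick sketch (using the constants quoted above, with logarithms taken base 2):

```
from math import log2

def avl_bound(n):
    """Upper bound on AVL height: ~1.44*log2(n + 2) - 0.328."""
    return 1.44 * log2(n + 2) - 0.328

def rb_bound(n):
    """Upper bound on red-black height: 2*log2(n + 1)."""
    return 2 * log2(n + 1)

for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, round(avl_bound(n), 1), round(rb_bound(n), 1))
# 1000        14.0  19.9
# 1000000     28.4  39.9
# 1000000000  42.7  59.8
```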

Resources