about AVL tree insertion operation - algorithm

In the standard AVL tree insertion process, after we insert a new node, we adjust the tree from the bottom up. During this process, is it possible for a sub-tree's height to increase by one (because of the insertion and rotation operations) while its left and right children still have the same height afterwards? If so, an example would be appreciated; if not, it would be great if someone could explain why. Thanks. :)
Here is a reference to AVL tree (https://en.wikipedia.org/wiki/AVL_tree)
regards,
Lin

From Wikipedia Binary Tree page:
A balanced binary tree has the minimum possible maximum height (a.k.a.
depth) for the leaf nodes, because for any given number of leaf nodes
the leaf nodes are placed at the greatest height possible.
One common balanced tree structure is a binary tree structure in which
the left and right subtrees of every node differ in height by no more
than 1
For example:
This is a balanced tree.
And if we insert 1, its height increases by 1. Yet it is still a balanced tree, because the left and right subtrees differ in height by no more than 1.
BTW, an AVL tree is a self-balancing binary search tree, so it cannot lose balance after an insertion: after every insertion, the tree rebalances itself by making the necessary rotations.
I think you are using the term balanced wrongly. You treat balanced as meaning no height difference, but the definition allows a height difference of at most 1.
Your question:
In the standard process of AVL tree insertion, is it possible a sub-tree height increase by one (because of insertion and rotation operation), while the sub-tree (after height increase by one), still have the same height of left/right child?
If we have a tree whose left and right branches have the same height, and we insert a node below a leaf on the left branch so that this branch grows taller, the height of the tree increases, because the height of the tree is 1 + max(height(left_branch), height(right_branch)). After this operation height(left_branch) equals height(right_branch) + 1, so they can't be equal.
In short, your precondition is height(left_branch) == height(right_branch).
Your operation increases the height of left_branch by 1.
So the condition height(left_branch) == height(right_branch) can't be true anymore.

After an insertion, it is not possible for the height of a sub-tree to change while its left and right children remain the same height.
Let's consider a simple example with at most 3 nodes in a sub-tree. The possible balance factors are:
+1 - the root of the sub-tree has one left child and no right child
-1 - the root of the sub-tree has one right child and no left child
0 - the root of the sub-tree has one node on the left and one on the right
For a sub-tree with balance factor +1,
if we insert into the right, we are ok
if we insert into the left, the balance factor changes to 2, so we need to rebalance the tree, and the rotation brings the height of the sub-tree back to what it was before the insertion.
For a sub-tree with balance factor -1,
if we insert into the left, we are ok
if we insert into the right, the balance factor changes to -2, so we need to rebalance the tree, and the rotation brings the height of the sub-tree back to what it was before the insertion.
For a sub-tree with balance factor 0,
if we insert into the left, we are ok. The height changes, but the child heights no longer match (the balance factor becomes +1).
if we insert into the right, we are ok. The height changes, but the child heights no longer match (the balance factor becomes -1).
So it is not possible for the height to change while the left and right child heights stay the same.
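The case analysis above can be checked empirically. The following is a minimal sketch, not a reference implementation: the names `Node`, `insert`, and `violations` are made up for illustration, heights are counted in edges with an empty subtree at -1, and the freshly created leaf is excluded from the check (it trivially has two empty children).

```python
import random

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0  # height in edges; an empty subtree counts as -1

def height(node):
    return node.height if node else -1

def update(node):
    node.height = 1 + max(height(node.left), height(node.right))

def balance(node):
    return height(node.left) - height(node.right)

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y

# Keys whose insertion made some existing subtree taller while leaving
# that subtree with equal child heights (balance factor 0).
violations = []

def insert(node, key):
    if node is None:
        return Node(key)  # the new leaf itself is not checked
    before = node.height
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    update(node)
    b = balance(node)
    if b > 1:  # left-heavy: single or double rotation
        if balance(node.left) < 0:
            node.left = rotate_left(node.left)
        node = rotate_right(node)
    elif b < -1:  # right-heavy: single or double rotation
        if balance(node.right) > 0:
            node.right = rotate_right(node.right)
        node = rotate_left(node)
    if node.height > before and balance(node) == 0:
        violations.append(key)
    return node

random.seed(0)
root = None
for k in random.sample(range(10_000), 2_000):  # distinct keys
    root = insert(root, k)
print(len(violations))  # 0 if the claim holds
```

A random run is of course not a proof, but it exercises all four rotation cases and matches the argument: a rotation restores the old height, and a height increase without rotation always leaves a balance factor of +1 or -1.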

Related

Finding the number of nodes in a 2-3 tree where the left sub-tree of the root has 3 children per node and the right sub-tree of the root has 2 children per node

Suppose there is a 2-3 tree with n nodes.
Each node in the left sub-tree of the root has 3 children. (except the leaves).
Each node in the right sub-tree of the root has 2 children. (except the leaves).
How am I supposed to find how many nodes exist in the right/left sub-tree of the root?
Denote n' := the number of nodes in the right sub-tree of the root.
Then the number of nodes in the left sub-tree of the root is (n-1)-n'.
How am I supposed to find n' (that is, write n' as an expression in n)?
I am a little bit confused.
Thanks !
Let the total height of the tree be h. Since it's a 2-3 tree, all leaves are at the same depth, so both the left and right subtrees have height h−1. The number of nodes in the right subtree is 2^h − 1, and the number of nodes in the left subtree is (3^h − 1)/2. Beyond that, I don't know anything really interesting to say. The quotient nʹ / n doesn't come out very pretty, but it approaches zero quite quickly as h increases.
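To see the numbers, here is a small sketch that evaluates these two formulas for a few heights. The function name `subtree_sizes` is made up for illustration, and h is taken as the height in edges of the whole tree.

```python
def subtree_sizes(h):
    """Return (left, right, total) node counts for a tree of total height h."""
    right = 2**h - 1          # full binary subtree with h levels
    left = (3**h - 1) // 2    # full ternary subtree with h levels
    total = 1 + left + right  # add the root
    return left, right, total

for h in range(1, 8):
    left, right, n = subtree_sizes(h)
    # The last column is n' / n, which shrinks as h grows.
    print(h, left, right, n, round(right / n, 4))
```

Since n = 2^h + (3^h − 1)/2 grows strictly with h, one way to get n' from n is to search for the h that produces the given n and then read off 2^h − 1.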

Is it possible to determine AVL tree balanced or not if instead of height depth is given?

In the question, it is given that we can use only depth, not height.
(As we know, with height we can say that the tree is balanced if, at every node, the heights of the left and right subtrees differ by at most one.)
Using depth, can we find a way to decide whether the tree is balanced or not?
I tried finding a relation between the depths in different trees.
What I got is this:
If the maximum depth is n,
then there must be n nodes whose depth is n-1.
But this is just one condition I found.
It is not a sufficient condition.
(You can ignore my approach and try something else, as there is no restriction on how to approach the problem.)
The principle is the same as with height. Use the following logic:
For each node do:
Get the maximum among the depths of all the nodes in the left subtree. Default (when no left subtree is present) is the current node's depth.
Get the maximum among the depths of all the nodes in the right subtree. Default (when no right subtree is present) is the current node's depth.
The difference between these two should not be more than 1.
If you implement this with a post-order traversal through the tree, you can keep track of the maximum depths -- needed in the first two steps -- as you traverse the tree.
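The steps above can be sketched as a single post-order pass. This is only an illustration: the `Node` class and the `check` function are hypothetical names, and the root is assumed to sit at depth 0.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def check(node, depth=0):
    """Return (max depth found in this subtree, whether it is balanced)."""
    left_max = right_max = depth  # default when a child is missing
    ok = True
    if node.left:
        left_max, left_ok = check(node.left, depth + 1)
        ok = ok and left_ok
    if node.right:
        right_max, right_ok = check(node.right, depth + 1)
        ok = ok and right_ok
    # The two maximum depths may differ by at most 1 at every node.
    ok = ok and abs(left_max - right_max) <= 1
    return max(left_max, right_max), ok

balanced = Node(2, Node(1), Node(3))
chain = Node(1, None, Node(2, None, Node(3)))
print(check(balanced))  # (1, True)
print(check(chain))     # (2, False)
```

Using the current node's depth as the default for a missing child plays the same role as giving an empty subtree height -1 in the height-based formulation.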

Which of the following are avl trees?

In the attached picture there are two binary search trees. When I saw this question, I thought that the first tree is not balanced, so it's not an AVL tree, and that the second one is balanced, so it's obviously an AVL tree.
But the problem is that the answer given for this question says both (i) and (ii) are AVL trees. How can (i) be an AVL tree when it's clearly not balanced?
You are correct that tree (i) is not an AVL tree. The right subtree of the root - the one rooted at 13 - is not balanced. Specifically, its left subtree (the one rooted at 11) has height 1, and its (missing) right subtree has height -1, for a height imbalance of 2.
What’s your source on this problem? Perhaps it’s a typo, or perhaps the tree wasn’t the one that was intended?
Short Answer
Yes, both trees can be considered AVL trees if the height of an empty tree is defined as 0.
Long Answer
Let's take a definition of an AVL tree from here:
A balanced binary search tree where the height of the two subtrees (children) of a node differs by at most one
Now, what is the height of a tree? It's the number of edges on the longest path from the root to a leaf.
Let's take the node in question, 13, from your tree (i). Its right subtree is empty and its left subtree consists of a chain of 2 nodes, 11 and 10:
...
      13
     /
   11 (height = 1)
   /
 10 (height = 0)
...
So, the height of the left subtree is 1 (the longest path from its root 11 down to 10 has 1 edge) and that of the right subtree can be considered 0 (please see more here). Hence, the absolute difference of heights is 1.
I believe it's clear that for every other node in tree (i), the absolute difference of subtree heights is not larger than 1, and so the tree is an AVL tree.
Remark
As pointed out by @templatetypedef, however, if the height of an empty subtree is defined as -1, then the tree is no longer an AVL tree, because the balance factor at 13 is 1 - (-1) = 2. So it all depends on how the height of an empty tree is defined. To make matters worse, the height of an empty tree is not defined - please check here.
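The effect of the two conventions can be shown on the fragment rooted at 13. This is a sketch under assumptions: `Node`, `height`, and `is_balanced` are hypothetical helpers, height is counted in edges (a leaf has height 0), and `empty` is the value assigned to a missing subtree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node, empty):
    """Height in edges; `empty` is the height assigned to an empty subtree."""
    if node is None:
        return empty
    children = [c for c in (node.left, node.right) if c is not None]
    if not children:
        return 0  # a leaf has no edges below it
    return 1 + max(height(c, empty) for c in children)

def is_balanced(node, empty):
    """Check that child heights differ by at most 1 at every node."""
    if node is None:
        return True
    diff = abs(height(node.left, empty) - height(node.right, empty))
    return (diff <= 1
            and is_balanced(node.left, empty)
            and is_balanced(node.right, empty))

# The fragment discussed above: 13 with left child 11, which has left child 10.
sub = Node(13, Node(11, Node(10)))
print(is_balanced(sub, empty=-1))  # False: |1 - (-1)| = 2 at node 13
print(is_balanced(sub, empty=0))   # True:  |1 - 0| = 1 at node 13
```

With `empty=-1` the helper agrees with the usual recursive definition 1 + max(left, right), which is the convention most textbook treatments of AVL trees rely on.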

Proving AVL trees can have children whose number of nodes aren't Θ of one another

Let T be an AVL tree whose left subtree is TL and whose right subtree is TR. Let's let |TL| and |TR| be the number of nodes in the left and right subtrees, respectively.
I need to prove that |TL| is not necessarily Θ(|TR|), and vice versa, but I don't know how. I assume it has to do with the case where one tree is a full AVL tree and the other is a minimal AVL tree (a Fibonacci tree), but I don't know what to do from there.
In an AVL tree of height h (counted in levels), the number of nodes ranges between F_(h+2) − 1 and 2^h − 1. The first quantity is Θ(φ^h) and the second is Θ(2^h), where φ is the golden ratio, approximately 1.618. This means that you can construct AVL trees where the number of nodes in the left subtree is Θ(φ^h) and the number in the right subtree is Θ(2^h), meaning that the left subtree has asymptotically fewer nodes than the right subtree. You can then reverse left and right to show that the right subtree can't be Θ of the left subtree either.
Hope this helps!
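A quick numeric sketch of the two bounds, assuming h counts levels as in F_(h+2) − 1 and 2^h − 1. The function names `min_nodes` and `max_nodes` are made up for illustration.

```python
def min_nodes(h):
    """Fewest nodes in an AVL tree with h levels: N(h) = 1 + N(h-1) + N(h-2)."""
    a, b = 0, 1  # N(0), N(1)
    for _ in range(h):
        a, b = b, 1 + a + b
    return a

def max_nodes(h):
    """Most nodes in a binary tree with h levels (a full tree)."""
    return 2**h - 1

for h in (5, 10, 20, 40):
    lo, hi = min_nodes(h), max_nodes(h)
    # The ratio behaves like (phi / 2)^h and tends to 0.
    print(h, lo, hi, lo / hi)
```

Putting a minimal tree with h levels on one side of the root and a full tree with h levels on the other still gives a valid AVL tree, since both subtrees have the same height, and the shrinking ratio is what rules out a Θ relationship.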

worst case in MAX-HEAPIFY: "the worst case occurs when the bottom level of the tree is exactly half full"

In CLRS, third edition, on page 155, it is stated that in MAX-HEAPIFY,
"the worst case occurs when the bottom level of the tree is exactly half full"
I guess the reason is that in this case, MAX-HEAPIFY has to "float down" through the left subtree.
But the thing I couldn't get is: why "half full"?
MAX-HEAPIFY can also float down if the left subtree has only one leaf. So why not consider this the worst case?
Read the entire context:
The children's subtrees each have size at most 2n/3 - the worst case occurs when the last row of the tree is exactly half full
Since the running time T(n) is analysed in terms of the number of elements in the tree (n), and the recursion steps into one of the subtrees, we need an upper bound on the number of nodes in a subtree relative to n. That yields T(n) ≤ T(max number of nodes in a subtree) + O(1).
The number of nodes in a subtree is largest, relative to n, when the final row is as full as possible on one side and as empty as possible on the other. This is what "half full" means. In that case the left subtree's size is bounded by 2n/3.
If you're proposing a case with only a few nodes, then that's irrelevant, since all base cases can be considered O(1) and ignored.
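The bound can be checked numerically. The sketch below assumes the heap is a complete binary tree with n nodes filled left to right; `left_subtree_size` is a hypothetical helper, not something taken from CLRS.

```python
def left_subtree_size(n):
    """Nodes in the root's left subtree of a complete binary tree with n >= 1 nodes."""
    h = n.bit_length() - 1   # height in edges of the whole tree
    if h == 0:
        return 0
    full_above = 2**h - 1    # nodes on all levels except the bottom one
    last = n - full_above    # nodes on the bottom level
    half = 2**(h - 1)        # bottom-level capacity of the left subtree
    return (full_above - 1) // 2 + min(last, half)

# For each height, find the n that maximises the left subtree's share of
# the nodes and compare it with the "bottom level half full" size.
for h in range(1, 6):
    ns = range(2**h, 2**(h + 1))
    worst = max(ns, key=lambda n: left_subtree_size(n) / n)
    half_full_n = 2**h - 1 + 2**(h - 1)
    print(h, worst, half_full_n, left_subtree_size(worst) / worst)
```

For every height the maximising n coincides with the half-full size, and the ratio climbs towards 2/3 without exceeding it.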
Already there's an accepted answer but this answer is for those people who are still a bit confused (as I was), or something still doesn't click. So here's a little bit longer and more detailed explanation.
Though it might sound redundant, we have to be very clear about the exact definitions, because when we pay attention to the details, proving things becomes much easier.
From CLRS (section 6.1), a binary heap data structure is an array object that can be viewed as a nearly complete binary tree.
From Wikipedia, In a complete binary tree, every level (except possibly the last level) is completely filled, and all the nodes in the last level are as far left as possible.
Again, from Wikipedia, A balanced binary tree is a binary tree structure in which the left and right sub-trees of every node differ in height by no more than 1.
Now that we are armed, let's dive in.
So, relative to the root, the heights of the left and right sub-trees can differ by at most 1.
Let's consider a tree T and let the height of the left sub-tree = h+1 and the height of the right sub-tree = h
What is the worst case in MAX_HEAPIFY? The worst case happens when we end up doing the maximum number of comparisons and swaps while trying to maintain the heap property.
When MAX_HEAPIFY recurses all the way down the longest path, we have a possible worst case, because it does the maximum number of comparisons and swaps along that path.
All of the longest paths are in the left sub-tree (as its height is h+1). But someone might ask: why not the right sub-tree? Remember the definition above: all the nodes in the last level have to be as far left as possible.
Because we have to cover every possibility that can lead to a worst case, we want as many longest paths as possible, so we make the left sub-tree FULL. (Why? So that we have more paths to choose from and can pick the one that gives the worst-case time.)
Since the left sub-tree has height h+1 and is full, it has 2^(h+1) leaf nodes and therefore 2^(h+1) paths from its root to a leaf. This is the maximum number of such paths in a tree of height h+1.
Note: Please bear with me if you are still reading; the rest is spelled out just for the sake of clarity.
Here's the image of the tree structure in the worst-case situation.
In the above image, as you can see, consider that the left (in yellow) sub-tree and the right (in pink) sub-tree each has x nodes. The pink portion is a complete right sub-tree and the yellow portion is the left sub-tree excluding the last level.
Notice that both the left (yellow) and the right (pink) sub-trees have a height of h.
Now, from the start, we have considered the left subtree to be of height h+1 as a whole (i.e. including the yellow portion and the last level).
Now, if I may ask, how many nodes do we have to add in the last level i.e. below the yellow portion to make the left sub-tree completely FULL?
Well, the bottom-most layer of the yellow portion has ⌈x/2⌉ nodes (the total number of leaves in a tree/subtree having n nodes is ⌈n/2⌉; for proof visit this link). If we add 2 children to each of these leaves, about x nodes are added in total (⌈x/2⌉ leaves * 2 ≈ x nodes).
With this addition, we make the left sub-tree of height h+1 (i.e. the yellow portion with height h plus the one added level) FULL, hence meeting the worst-case criteria.
Since the left sub-tree is FULL and the right sub-tree has no nodes on that level, the bottom level of the whole tree is HALF FULL.
Now someone might ask: what if we add more nodes, specifically in the right sub-tree? Well, we don't. If we add more nodes, they go into the right sub-tree (as the left sub-tree is FULL), which balances the tree out more. As the tree gets more balanced, we move towards the best-case scenario and not the worst case.
Final question: How many nodes do we have in total?
Total nodes in the tree, n ≈ x (from the yellow portion) + x (from the pink portion) + x (the last level added below the yellow portion) = 3x. Counting the root and the one extra leaf hidden in the rounding, the exact total is 3x + 2.
Can you notice something? As a by-product, the left sub-tree contains about 2x nodes in total (exactly 2x + 1), which is at most 2n/3 nodes (because x ≈ n/3).

Resources