I came across an exercise in Thomas H. Cormen's book which asks to show that there are at most [n/2^h+1] nodes of height h in any n-element heap.
I am confused by this question: how can there be at most that many nodes?
For instance, consider this problem:
In the above problem at height 2 there are 2 nodes. But if we calculate by formula:
Greatest Integer of (10/2^2+1) = 4
This does not seem to match the statement in Cormen's exercise.
Please correct me if I am wrong here.
Thanks in Advance
In Cormen's book I observed that he numbers heights from 1, not from 0, so the formula is correct and my interpretation was wrong. So a leaf has height 1 and the root has height 4 in the above question.
Reading all the answers, I realized that the confusion comes from the precise definition of height. In page 153 of CLRS book, the height is defined as follows:
Viewing a heap as a tree, we define the height of a node in a heap to be the number of edges on the longest simple downward path from the node to a leaf...
Now let's look at the original heap provided by Nishant. The nodes 8, 9, 10, 6, and 7 are at height 0 (i.e., leaves). The nodes 4, 5 and 3 are at height 1. For example, there is one edge between node 5 and its leaf, node 10. Also there is one edge between node 3 and its leaf node 6. Node 6 looks like it is at height 1 but it is at height 0 and hence a leaf. The node 2 is the only node at height 2. You may wonder node 1 (the root) is two edges away from node 6 and 7 (leaves), and say node 1 is also at height 2. But if we look back at the definition, the bold-face word "longest" suggests that the longest simple downward path from the root to a leaf has 3 edges (passing node 2). Finally, the node 1 is at height 3.
In summary, there are 5, 3, 1, 1 nodes at height 0, 1, 2, 3, respectively.
Let's apply the formula to the observation we made in the above paragraph. I would like to point out that the formula given by Nishant is not correct.
It should be
ceiling(n/2^(h+1)), not ceiling(n/(2^h+1)). Sorry about the terrible formatting. I am not able to post an image yet.
Anyways, using the correct formula,
h = 0, ceiling(10/2) = 5 (nodes 8, 9, 10, 6, and 7)
h = 1, ceiling(10/4) = 3 (nodes 4, 5 and 3)
h = 2, ceiling(10/8) = 2 (node 2, but this is okay because the formula is predicting that there are at most 2 nodes at height 2.)
h = 3, ceiling(10/16) = 1 (node 1)
With the correct definition of height, the formula works.
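To double-check the counts above, here is a minimal C++ sketch. It assumes the heap is stored the usual array way (nodes numbered from 1, node i has children 2i and 2i+1), which matches the 10-node example. The function names nodeHeight, countByHeight, heightBound and boundHolds are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Height of node i (1-based) in a heap of n nodes. In this layout the
// leftmost downward path is the longest one, so follow left children.
int nodeHeight(int i, int n) {
    assert(i >= 1 && i <= n);
    int h = 0;
    while (2 * i <= n) {
        i *= 2;
        ++h;
    }
    return h;
}

// counts[h] = number of nodes of height h in a heap of n nodes.
std::vector<int> countByHeight(int n) {
    assert(n >= 1);
    std::vector<int> counts(nodeHeight(1, n) + 1, 0);
    for (int i = 1; i <= n; ++i) {
        ++counts[nodeHeight(i, n)];
    }
    return counts;
}

// ceiling(n / 2^(h+1)), computed in integer arithmetic.
int heightBound(int n, int h) {
    long long d = 1LL << (h + 1);
    return static_cast<int>((n + d - 1) / d);
}

// True if no height has more nodes than the bound allows.
bool boundHolds(int n) {
    std::vector<int> counts = countByHeight(n);
    for (int h = 0; h < static_cast<int>(counts.size()); ++h) {
        if (counts[h] > heightBound(n, h)) return false;
    }
    return true;
}
```

For n = 10, countByHeight(10) gives 5, 3, 1, 1 and heightBound(10, h) gives 5, 3, 2, 1 for h = 0..3, so the actual count never exceeds the bound.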
It looks like your formula says there are at most [n/2^h+1] nodes of height h. In your example there are two nodes of height 2, which is less than your computed possible maximum of 4(ish).
While calculating the tight bound for Build-Max-Heap author has used this property in the equation.
In this case we call the helper Max-Heapify which takes O(h) where h is the height of the sub-tree rooted at the current node (not the height of node itself with respect to the full tree).
Therefore if we consider the sub-tree rooted at a leaf node, it will have height 0, and the number of nodes in the tree at that level would be at most n/2^(0+1) = n/2 (i.e. h=0 for the sub-tree formed from a node at the leaves).
Similarly, for the sub-tree rooted at the actual root the height of the tree would be log(n), and in that case the number of nodes at that level would be 1, i.e. ceiling(n/2^(log(n)+1)) = ceiling(n/(2n)) = 1.
Formula for
no. of nodes = n/(2^(h+1))
so when h is 2, and n = 10
no. of nodes = 10/(2^(2+1)) = 10/(2^3) = 10/8 = 1.25
But
ceil of 10/8 = 2
Hence there are 2 nodes which you can see from the figure.
Though it is mentioned in Cormen that height of a node is the greatest distance traveled from node to leaf(the number of edges), if you take height to be the distance of a node from the leaf, i.e. at leaf the height is zero and at root the height is log(n). The formula stands correct.
As for the leaves you have h=0; hence by the formula n/(2^(h+1))
h=0; max number of leaves in the heap will be n/2.
What about height 1? Cormen's formula gives ceil(10/(2^(1+1))) = 3, while there are 4 nodes at height 1. This is a contradiction.
It is not true that Thomas H. Cormen is counting the height of the tree starting from one, height is h = 0, 1, ..., log n and it increases as you go upwards:
and in the following formula, he added 1 plus the height:
All the confusion comes from the fact that this works out exactly for perfect binary trees, not for the one you are showing in your question; this is why he says AT MOST.
When you consider Big-O it wouldn't really matter.
This formula is wrong, it gives wrong answers in many cases like in this question for h=1 (ie second last level) it gives maximum number of nodes is 3 but there are 4 nodes. Also let us consider a tree with 4 nodes :
    a
   / \
  b   c
 /
d
node d has height 0, let us consider for height =1 using the formula n/2^(h+1) we get
4/2^(1+1) = 1
which means this level can have at most 1 node which is false !
so this formula is not right.
The formula is quite correct. Nothing is wrong with the formula!!
Lets take the tree(although its not heap yet its complete) in the question posed by Nishant on the top.
For h=0 means all leaves so ceil(10/2^(0+1))=5 so there are 5 leaves
For h=1 means all nodes which have one arc to reach the leaves, so ceil(10/2^(1+1))=3 there are 3 such nodes in your tree.
For h=2 means all nodes which have two consecutive arcs to reach leaves, so ceil(10/2^(2+1))=2, which is only an upper bound: you have only one such node (the left successor of the root)
For h=3 means all nodes which have three arcs to leaves, so ceil(10/2^(3+1))=1 which is the root.
Moral of the story is that you are confused between height and level. Level starts from up to down. Which means you have 4 nodes on level 2. i.e you can reach 4 nodes if you start at root and moves two arcs down.
Whereas height is completely different. Like in above case at height 0 there are 5 nodes (3 on level 3, and 2 on level 2). Hence height h of a node n means how many arcs you can travel to reach a leaf.
regards,
Hope it clarifies the point.
Safdar from Pakistan
I know how to find the minimum numbers of nodes in an AVL tree of height h (which includes external nodes) with the formula n(h) = n(h-1) + n(h-2) + 1 but I was wondering if there was a formula to just find the minimum internal nodes only of an AVL tree with height h.
So for n(3) = 4, if we're only counting internal nodes. n(4) = 7, if we're only counting internal nodes. I can draw it out and count the internal nodes but when you get to bigger AVL trees it's a mess.
I can't seem to find anything on this and trying to find a pattern with consistent answers has only led to hours of frustration. Thanks in advance.
Yep, there’s a nice way to calculate this. Let’s begin with the two simplest AVL trees, which have order 0 and order 1:
*    *
     |
     *
This first tree has no internal nodes, and the second has one internal node. This gives us our base cases for a recurrence relation:
I(0) = 0
I(1) = 1
From here, we notice that the way to get the fewest internal nodes in an AVL tree of order n+2 is to pick two trees of order n and n+1 as children (minimizing the number of nodes) that have the fewest internal nodes possible. The resulting tree will have a number of internal nodes equal to the number of internal nodes in the two subtrees, plus one for the new root. This means that
I(n+2) = I(n) + I(n+1) + 1.
Applying this recurrence gives us the sequence
0, 1, 2, 4, 7, 12, 20, etc.
And hey - have we seen this before somewhere? We have! Adding one to each term gives us
1, 2, 3, 5, 8, 13, 21, etc.
which is the Fibonacci sequence, shifted down two positions! So our hypothesis is that
I(n) = F(n+2) - 1
You can prove that this is the case by induction on n.
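If you'd like to see the hypothesis hold numerically before attempting the induction, here is a small C++ sketch. It simply evaluates the recurrence for I(n) and compares it with F(n+2) - 1, taking F(1) = F(2) = 1. The function names are my own, not from any library.

```cpp
#include <cassert>
#include <vector>

// Fibonacci numbers with F(1) = F(2) = 1, returned as F[0..maxIndex].
std::vector<long long> fibonacci(int maxIndex) {
    assert(maxIndex >= 2 && maxIndex <= 90);  // F(90) still fits in 64 bits
    std::vector<long long> F(maxIndex + 1, 0);
    F[1] = 1;
    F[2] = 1;
    for (int k = 3; k <= maxIndex; ++k) {
        F[k] = F[k - 1] + F[k - 2];
    }
    return F;
}

// I[n] = minimum number of internal nodes in an AVL tree of order n,
// from I(0) = 0, I(1) = 1, I(n+2) = I(n) + I(n+1) + 1.
std::vector<long long> minInternalNodes(int maxOrder) {
    assert(maxOrder >= 1 && maxOrder <= 88);
    std::vector<long long> I(maxOrder + 1, 0);
    I[1] = 1;
    for (int n = 2; n <= maxOrder; ++n) {
        I[n] = I[n - 2] + I[n - 1] + 1;
    }
    return I;
}

// Checks I(n) == F(n+2) - 1 for every n from 0 to maxOrder.
bool matchesFibonacci(int maxOrder) {
    std::vector<long long> I = minInternalNodes(maxOrder);
    std::vector<long long> F = fibonacci(maxOrder + 2);
    for (int n = 0; n <= maxOrder; ++n) {
        if (I[n] != F[n + 2] - 1) return false;
    }
    return true;
}
```

minInternalNodes(6) reproduces the sequence 0, 1, 2, 4, 7, 12, 20 from above. A numeric check like this is of course not a proof, only a sanity check.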
Here’s a different way to arrive at this result. Imagine you take an AVL tree of height n and remove all the leaves. You’re now left with an AVL tree of height n-1 (prove this!), and all of the remaining nodes in this tree are the internal nodes of the original tree. The smallest possible number of nodes in an AVL tree of height n-1 is F(n+2)-1, matching our result.
Is there a formula to calculate what the maximum and minimum height for an AVL tree, given a certain number of nodes?
For example:
Textbook question:
What is the maximum/minimum height for an AVL tree of 3 nodes, 5 nodes, and 7 nodes?
Textbook answer:
The maximum/minimum height for an AVL tree of 3 nodes is 2/2, for 5 nodes is 3/3, for 7 nodes is 4/3
I don't know if they figured it out by some magic formula, or if they draw out the AVL tree for each of the given heights and determined it that way.
The solution below is appropriate for working things out by hand and gaining an intuition; please see the exact formulas at the bottom of this answer for larger trees (54+ nodes). [1]
Well, the minimum height [2] is easy: just fill each level of the tree with nodes until you run out. That height is the minimum.
To find the maximum, do the same as for the minimum, but then go back one step (remove the last placed node) and see if adding that node to the opposite sub-tree (from where it just was) violates the AVL tree property. If it does, your max height is just your min height. Otherwise this new height (which should be min height+1) is your max height.
If you need an overview of what the properties of an AVL tree are, or just a general explanation of an AVL tree, Wikipedia is a great place to start.
Example:
Let's take the 7 node example case. You fill in all levels and find a completely filled tree of height 3. (1 at level 1, 2 at level 2, 4 at level 3. 1+2+4=7 nodes.) That means 3 is your minimum.
Now find the max. Remove that last node and place it on the left subtree instead of the right. The right subtree still has height 3, but the left subtree now has height 4. However these values differ by less than 2, so it is still an AVL tree. Therefore your max height is 4. (Which is min+1)
All three examples worked out below (note that the numbers correspond to order of placement, NOT value):
Formulas:
The technique shown above doesn't hold if you have a tree with a very large number of nodes. In this case, one can use the following formulas to calculate the exact min/max height [2].
Given n nodes [3]:
Minimum: ceil(log2(n+1))
Maximum: floor(1.44*log2(n+2)-.328)
If you're curious, the first time max-min>1 is when n=54.
[1] Thanks to Jamie S for bringing this failure at larger node counts to my attention.
[2] Technically, the height of a tree is the longest path length (in edges) between the root and any leaf node. However the OP's textbook uses a common alternate definition of height as the number of levels in a tree. For consistency with the OP and Wikipedia, we use that definition in this post as well.
[3] These formulas are from the Wikipedia AVL page, with constants plugged in. The original source is Sorting and searching by Donald E. Knuth (2nd Edition).
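For anyone who wants to compute these heights without floating-point logarithms, here is a hedged C++ sketch. Note that it does not use the two formulas above. Instead, the minimum comes from filling levels completely, and the maximum comes from the sparsest-AVL-tree recurrence N(h) = N(h-1) + N(h-2) + 1 with N(0) = 0, N(1) = 1. Heights are counted in levels, as in the rest of this post, and the function names are invented for this example.

```cpp
#include <cassert>

// Minimum height in levels: the smallest h with 2^h - 1 >= n.
int minAvlHeight(int n) {
    assert(n >= 1);
    int h = 0;
    long long capacity = 0;           // nodes in a completely filled tree of h levels
    while (capacity < n) {
        capacity = 2 * capacity + 1;  // 1, 3, 7, 15, ...
        ++h;
    }
    return h;
}

// Maximum height in levels: the largest h whose sparsest AVL tree,
// with N(h) = N(h-1) + N(h-2) + 1 nodes, still fits into n nodes.
int maxAvlHeight(int n) {
    assert(n >= 1);
    long long prev = 0, curr = 1;     // N(0) = 0, N(1) = 1
    int h = 1;
    while (prev + curr + 1 <= n) {    // is N(h+1) <= n ?
        long long next = prev + curr + 1;
        prev = curr;
        curr = next;
        ++h;
    }
    return h;
}
```

For the textbook question this gives max/min heights of 2/2 for 3 nodes, 3/3 for 5 nodes and 4/3 for 7 nodes. For n = 54 it gives 8/6, while n = 53 still gives 7/6.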
It's important to note the following defining characteristics of an AVL Tree.
AVL Tree Property
The nodes of an AVL tree abide by the BST property
AND The heights of the left and right sub-trees of any node differ by no more than 1.
Theorem: The AVL property is sufficient to maintain a worst case tree height of O(log N).
Note the following diagram.
- T1 is comprised of a T0 + 1 node, for a height of 1.
- T2 is comprised of T1 and a T0 + 1 node, giving a height of 2.
- T3 is comprised of a T2 for the left sub-tree and a T1 for the right
sub-tree + 1 node, for a height of 3.
- T4 is comprised of a T3 for the left sub-tree and a T2 for the right
sub-tree + 1 node, for a height of 4.
If you take the ceiling of log2(N), where N represents the number of nodes in one of these AVL trees, you get its height.
Example) T4 contains 12 nodes. ceiling(log2(12)) = 4.
See the pattern developing here??
**The worst-case height is
Lets assume the number of nodes is n
Trying to find out the minimum height of an AVL tree would be the same as trying to make the tree complete i.e. fill all the possible nodes at each level and then move to the next level.
So at each level the number of eligible nodes increases by 2^(h-1) where h is the height of the tree.
So at h=1, nodes(1) = 2^(1-1) = 1 node
for h=2, nodes(2) = nodes(1)+2^(2-1) = 3 nodes
for h=3, nodes(3) = nodes(2)+2^(3-1) = 7 nodes
So just find the smallest h for which nodes(h) is greater than or equal to the given number of nodes n.
Now for the problem of maximum height of an AVL tree:-
lets assume that the AVL tree is of height h, F(h) being the number of nodes in the AVL tree,
for its height to be maximum lets assume that its left subtree FL and right subtree FR have a difference in height of 1(as it satisfies the AVL property).
Now assuming FL is a tree with height h-1 and FR be a tree with height h-2.
now the number of nodes in
F(h)=F(h-1)+F(h-2)+1 (Eq 1)
Adding 1 on both sides :
F(h)+1=(F(h-1)+1)+ (F(h-2)+1) (Eq 2)
So we have reduced the maximum height problem to a Fibonacci sequence. And these trees F(h) are called Fibonacci Trees.
So, F(1)=1 and F(2)=2
So in order to get the maximum height, just find the largest index h for which F(h) is less than or equal to n.
So applying (Eq 1)
F(3)= F(2) + F(1)+ 1=4, so if n is at least 4 but less than 7 the tree can have height at most 3.
F(4)= F(3)+ F(2)+ 1 = 7, similarly if n is at least 7 but less than F(5) = 12 the tree can have height at most 4.
and so on.
http://lcm.csa.iisc.ernet.in/dsa/node112.html
It is roughly 1.44 * log n, where n is the number of nodes.
For a more detailed description of how that was derived, you can refer to this link, starting in the middle of page 13: http://www.compsci.hunter.cuny.edu/~sweiss/course_materials/csci335/lecture_notes/chapter04.2.pdf
The maximum number of items in a B-Tree of order m and height h is defined by the equation
Or, in text format:
m^(h+1) - 1
But I am looking for the formula for the minimum number of items.
I've seen this question, but the answer isn't related.
The following solution uses the definitions by Thomas Cormen.
For a tree of height 0 you will get 1 as a minimum, of course.
For a tree of height 1 you will get 2 successors for your single root node, each containing at least
ceil(t/2)-1
items.
So you can say that layer h of the tree (for h >= 1) will hold at least
2*(ceil(t/2)**(h-1)) * (ceil(t/2)-1)
items. The 2 comes from the first layer, where a single element with two successors is allowed. Every node in between has at least ceil(t/2) successors. In the base layer each node holds ceil(t/2)-1 items. That is the third part of the formula.
Using the geometric sum formula for all heights from one to h, you will get:
2*(ceil(t/2)-1)*((1-ceil((t/2))**h)/(1-ceil(t/2)))+1
t being the order of your tree and h being the height.
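As a sanity check on the geometric sum, here is a short C++ sketch that adds up the layers one by one and compares the total with the simplified closed form 2*ceil(t/2)^h - 1. It assumes, as above, that t is the maximum number of successors per node, that every non-root node has at least ceil(t/2) successors, and that the root has at least 2. The function names are hypothetical.

```cpp
#include <cassert>

// Minimum number of items in a B-tree of order t and height h,
// summed layer by layer.
long long minItemsByLayers(int t, int h) {
    assert(t >= 3 && h >= 0);
    long long d = (t + 1) / 2;       // ceil(t/2), the minimum number of successors
    long long total = 1;             // the root holds at least one item
    long long nodesInLayer = 2;      // the root has at least two successors
    for (int layer = 1; layer <= h; ++layer) {
        total += nodesInLayer * (d - 1);  // each node holds at least d-1 items
        nodesInLayer *= d;                // each node has at least d successors
    }
    return total;
}

// The same quantity from the closed form 2*ceil(t/2)^h - 1.
long long minItemsClosedForm(int t, int h) {
    assert(t >= 3 && h >= 0);
    long long d = (t + 1) / 2;
    long long power = 1;
    for (int i = 0; i < h; ++i) {
        power *= d;
    }
    return 2 * power - 1;
}
```

For example, with t = 5 and h = 3 both functions return 53, and with h = 0 both return 1.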
It's 2*ceil(m/2)^h - 1.
It is well known that deletion from an AVL tree may cause several nodes to eventually be unbalanced. My question is, what is the minimum sized AVL tree such that 2 rotations are required (I'm assuming a left-right or right-left rotation is 1 rotation)? I currently have an AVL tree with 12 nodes where deletion would cause 2 rotations. My AVL tree is inserting in this order:
8, 5, 9, 3, 6, 11, 2, 4, 7, 10, 12, 1.
If you delete the 10, 9 becomes unbalanced and a rotation occurs. In doing so, 8 becomes unbalanced and another rotation occurs. Is there a smaller tree where 2 rotations are necessary after a deletion?
After reading jpalecek's comment, my real question is: Given some constant k, what is the minimum sized AVL tree that has k rotations after 1 deletion?
A tree of four nodes requires a single rotation in the worst case. The worst-case number of rotations increases with each term in the list: 4, 12, 33, 88, 232, 609, 1596, 4180, 10945, 28656, ...
This is Sloane's A027941. It consists of every second term of the Fibonacci-type sequence generated by N(i)=1+N(i-1)+N(i-2) for i>=2, N(1)=2, N(0)=1.
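Here is a minimal C++ sketch that generates the list from the recurrence. As far as I can tell, the listed values are the even-indexed terms N(2), N(4), N(6), ... of that recurrence, so the sketch keeps only those. The function name worstCaseSizes is made up for this example.

```cpp
#include <cassert>
#include <vector>

// First `count` terms of the list: N(2), N(4), N(6), ... where
// N(0) = 1, N(1) = 2 and N(i) = 1 + N(i-1) + N(i-2).
std::vector<long long> worstCaseSizes(int count) {
    assert(count >= 1 && count <= 40);   // N(80) still fits in 64 bits
    std::vector<long long> sizes;
    long long prev = 1, curr = 2;        // N(0), N(1)
    for (int i = 2; static_cast<int>(sizes.size()) < count; ++i) {
        long long next = 1 + curr + prev;  // N(i)
        prev = curr;
        curr = next;
        if (i % 2 == 0) {
            sizes.push_back(next);       // keep only the even-indexed terms
        }
    }
    return sizes;
}
```

worstCaseSizes(10) yields 4, 12, 33, 88, 232, 609, 1596, 4180, 10945, 28656, so entry k would be the tree size associated with k rotations.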
To see why this is so, first note that rotating an imbalanced AVL tree reduces its height by one because its shorter leg is lengthened at the expense of its longer leg.
When a node is removed from an AVL tree, the AVL algorithm checks all of the removed node's ancestors for potential rebalancing. Therefore, to answer your question we need to identify trees with the minimum number of nodes for a given height.
In such a tree every node is either a leaf or has a balance factor of +1 or -1: if a node had a balance factor of zero this would mean that a node could be removed without triggering a rebalancing. And we know rebalancing makes a tree shorter.
Below, I show a set of worst-case trees. You can see that following the first two trees in the sequence, each tree is constructed by joining the previous two trees. You can also see that every node in each tree is either a leaf or has a non-zero balance factor. Therefore, each tree has the maximum height for its number of nodes.
For each tree, a removal in the left subtree will, in the worst case, cause rotations which ultimately reduce the height of that subtree by one. This balances the tree as a whole. On the other hand, removing a node from the right subtree may ultimately imbalance the tree resulting in a rotation of the root. Therefore, the right subtrees are of prime interest.
You can verify that Tree (c) and Tree (d) have one rotation upon removal, in the worst case.
Tree (c) appears as a right subtree in Tree (e) and Tree (d) as a right subtree in Tree (f). When a rotation is triggered in Tree (c) or (d) this shortens the trees, resulting in a root rotation in Trees (e) and (f). Clearly, the sequence continues.
If you count the number of nodes in the trees this matches my original statement and completes the proof.
(In the trees below removing the highlighted node will result in a new maximum number of rotations.)
I am not good at proofs, and I'm sure the below is full of holes, but maybe it will spark something positive.
To effect k rotations on a minimized AVL tree following the deletion of a node, the following conditions must be met:
The target node must exist in a 4-node sub-tree.
The target node must either be on the short branch, or must be the root of the sub-tree and be replaced by the leaf of the short branch.
Each node in the ancestry of the root of the target sub-tree must be slightly out of balance (balance factor of +/-1). That is - when a balance factor of 0 is encountered, the rotation chain will cease.
The height and number of nodes of the minimized tree is calculated with the following equations.
Let H(k) = the minimum height of the tree affected by k rotations.
H(k) = 2k + 1, k > 0
Let N(h) = the number of nodes in a (min-node) AVL tree of height h.
N(0) = 0
N(1) = 1
N(h) = N(h-1) + N(h-2) + 1, h > 1
Let F(k) = the minimum number of nodes in the tree affected by k rotations.
F(k) = N(H(k))
(e.g:)
k = 1, H(k) = 3, N(3) = 4
k = 2, H(k) = 5, N(5) = 12
Proof (such as it is)
Minimum Height
A deletion can only cause a rotation for trees with 4 or more nodes.
A tree of 1 node must have a balance factor of 0.
A tree of 2 nodes must have a balance factor of +/-1, and deletion leads to a balanced tree of 1 node.
A tree of 3 nodes must have a balance factor of 0. Removal of a node results in a balance factor of +/-1 and no rotation occurs.
Therefore, deletion from a tree with fewer than 4 nodes can not result in a rotation.
The smallest sub-tree for which 1 rotation occurs on delete is 4 nodes, which has height of 3. Removal of the node in the short side will result in rotation. Likewise, removal of the root node, using the node on the short side as replacement will cause a rotation. It doesn't matter how the tree is configured:
  B        B        Removal of A or replacement of B with A
 / \      / \       results in rotation. No rotation occurs
A   C    A   D      on removal of C or D, or on replacement
     \      /       of B with C.
      D    C

    C      C        Removal of D or replacement of C with D
   / \    / \       results in rotation. No rotation occurs
  B   D  A   D      on removal of A or B, or on replacement
 /        \         of C with B.
A          B
Deletion from a 4 node tree results in a balanced tree of height 2.
  .
 / \
.   .
To effect a second rotation, the target tree must have a sibling of height 4, so that the balance factor of the root is +/-1 (and therefore has a height of 5). It doesn't matter if the affected tree is on the right or left of the parent, nor is the layout of the sibling tree important (that is, the H3 child of H4 can be on the left or right, and can be any of the 4 orientations above while the H2 child can be either of the 2 possible orientations - this needs proving).
    _._              _._
   /   \            /   \
(H4)    .          .    (H4)
       / \        / \
      .   .      .   .
           \          \
            .          .
It is clear that the third rotation requires that the grandparent of the affected tree be likewise imbalanced by +/-1, and the fourth requires the great-grandparent be imbalanced by +/-1, and so on.
By definition, the height of a sub-tree is the maximum height of each branch plus one for the root. One sibling must be 1 taller than the other to achieve the +/-1 imbalance in the root.
H(1) = 3 (as observed above)
H(k) = 1 + max(H(k - 1), H(k - 1) + 1) = 1 + H(k - 1) + 1 = H(k - 1) + 2
... Inductive proof leading to H(k) = 2k + 1 eludes me.
Minimum Nodes
By definition, the number of nodes in a sub-tree is the number of nodes in the left branch plus the number of nodes in the right branch plus 1 for the root.
Also by definition, a tree of height 0 must have 0 nodes, and a tree of height 1 must have no branches and thus 1 node.
It was shown above that one branch must be one shorter than the other.
Let N(h) = minimum number of nodes required to create a tree of height h:
N(0) = 0
N(1) = 1
// the number of nodes in the two subtrees plus the root
N(h) = N(h-1) + N(h-2) + 1
Corollary
The minimum number of nodes is not necessarily the maximum in large trees. To wit:
Delete A from the following tree and observe that the height doesn't change following rotation. Therefore, the balance factor in the parent would not change and no additional rotation would occur.
  B           B             D
 / \           \           / \
A   D    =>     D    =>   B   E
   / \         / \         \
  C   E       C   E         C
However, in the k = 2 case, it does not matter if H(4) is minimized here - the second rotation will still occur.
    _._              _._
   /   \            /   \
(H4)    .          .    (H4)
       / \        / \
      .   .      .   .
           \          \
            .          .
Questions
What is the position of the target sub-tree? Clearly for k = 1, it is the root, and for k = 2, it is the left if the root's balance factor is -1 otherwise the right. Is there a formula for determining position for k >= 3?
What is the maximum nodes a tree can contain to effect k rotations? Is it possible to have an intermediate node in the ancestry that is not rotated, though its parent is?
Can anybody give me proof how the number of nodes in strictly binary tree is 2n-1 where n is the number of leaf nodes??
Proof by induction.
Base case is when you have one leaf. Suppose it is true for k leaves. Then you should prove it for k+1. So you get the new node, its parent and its other leaf (by definition of a strict binary tree). The remaining leaves are k-1, and then you can use the induction hypothesis. So the actual number of nodes is 2*(k-1) + 3 = 2k+1 == 2*(k+1)-1.
just go with the basics, assuming there are x nodes in total, then we have n nodes with degree 1(leaves), 1 with degree 2(the root) and x-n-1 with degree 3(the inner nodes)
as a tree with x nodes will have x-1 edges. so summing
n + 3*(x-n-1) + 2 = 2(x-1) (equating the total degrees)
solving for x we get x = 2n-1
I'm guessing that what you really want is something like a proof that the depth is log2(N), where N is the number of nodes. In this case, the answer is fairly simple: for any given depth D, the number of nodes is 2^D.
Edit: in response to edited question: the same fact pretty much applies. Since the number of nodes at any depth is 2^D, the number of nodes further up the tree is 2^(D-1) + 2^(D-2) + ... + 2^0 = 2^D - 1. Therefore, the total number of nodes in a balanced binary tree is 2^D + 2^D - 1. If you set n = 2^D, you've gone the full circle back to the original equation.
I think you are trying to work out a proof for: N = 2L - 1 where L is the number
of leaf nodes and N is the total number of nodes in a binary tree.
For this formula to hold you need to put a few restrictions on how the binary
tree is constructed. Each node is either a leaf, which means it has no children, or
it is an internal node. Internal nodes have 3
possible configurations:
2 leaf children
1 leaf child and 1 internal child
2 internal children
All three configurations imply that an internal node connects to two other nodes. This explicitly
rules out the situation where a node connects to a single child as in:
  o
 /
o
Informal Proof
Start with a minimal tree of 1 leaf: L = 1, N = 1. Substitute into N = 2L - 1 and see that
the formula holds true (1 = 1, so far so good).
Now add another minimal chunk to the tree. To do that you need to add another two nodes and
the tree looks like:
  o
 / \
o   o
Notice that you must add nodes in pairs to satisfy the restriction stated earlier.
Adding a pair of nodes always adds
one leaf (two new leaf nodes, but you lose one as it becomes an internal node). Node growth
progresses as the series: 1, 3, 5, 7, 9... but leaf growth is: 1, 2, 3, 4, 5... That is why the formula
N = 2L - 1 holds for this type of tree.
You might use mathematical induction to construct a formal proof, but this works fine for me.
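The growth argument above can be tried out with a small C++ sketch. It builds a tree only by turning a leaf into an internal node with two new leaf children, then counts nodes and leaves. The Node struct and the function names are assumptions made for this illustration, not part of any library.

```cpp
#include <cassert>
#include <memory>

// A node with either zero or two children.
struct Node {
    std::unique_ptr<Node> left;
    std::unique_ptr<Node> right;
};

// Turn a leaf into an internal node by attaching two new leaves.
void split(Node& leaf) {
    assert(!leaf.left && !leaf.right);  // only leaves may be split
    leaf.left.reset(new Node());
    leaf.right.reset(new Node());
}

// Total number of nodes in the tree rooted at n.
int countNodes(const Node& n) {
    int count = 1;
    if (n.left) count += countNodes(*n.left);
    if (n.right) count += countNodes(*n.right);
    return count;
}

// Number of leaves in the tree rooted at n.
int countLeaves(const Node& n) {
    if (!n.left && !n.right) return 1;
    int count = 0;
    if (n.left) count += countLeaves(*n.left);
    if (n.right) count += countLeaves(*n.right);
    return count;
}
```

Starting from a single Node and calling split three times on any leaves gives 4 leaves and 7 nodes, which agrees with N = 2L - 1.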
Proof by mathematical induction:
The statement that there are (2n-1) of nodes in a strictly binary tree with n leaf nodes is true for n=1. { tree with only one node i.e root node }
let us assume that the statement is true for tree with n-1 leaf nodes. Thus the tree has 2(n-1)-1 = 2n-3 nodes
to form a tree with n leaf nodes we need to add 2 child nodes to any of the leaf nodes in the above tree. Thus the total number of nodes = 2n-3+2 = 2n-1.
hence, proved
To prove: A strictly binary tree with n leaves contains 2n-1 nodes.
Show P(1): A strictly binary tree with 1 leaf contains 2(1)-1 = 1 node.
Show P(2): A strictly binary tree with 2 leaves contains 2(2)-1 = 3 nodes.
Show P(3): A strictly binary tree with 3 leaves contains 2(3)-1 = 5 nodes.
Assume P(K): A strictly binary tree with K leaves contains 2K-1 nodes.
Prove P(K+1): A strictly binary tree with K+1 leaves contains 2(K+1)-1 nodes.
2(K+1)-1 = 2K+2-1
= 2K+1
= 2K-1 +2*
* This result indicates that, for each leaf that is added, another node must be added to the father of the leaf , in order for it to continue to be a strictly binary tree. So, for every additional leaf, a total of two nodes must be added, as expected.
#include <iostream>

int main()
{
    int N = 1024; // insert here the number of leaves N; must be a power of two (perfect tree)
    int sum = 0; // the number of total nodes
    // walk the log2(N) + 1 levels from the root (1 node) down to the leaves (N nodes)
    for (int currFactor = 1; currFactor <= N; currFactor *= 2)
    {
        sum += currFactor; // each level has twice as many nodes as the level above
    }
    if (sum == 2*N - 1)
    {
        std::cout << "wow that the number of nodes is 2*N-1";
    }
    return 0;
}