Height of tree with single node - data-structures

I have googled this and found that there is no single right answer for the height of a tree that has only one node. Sometimes the node count is used and sometimes the edge count, so sometimes the height is 1 and other times it is 0. In which cases is the node count used, and in which cases the edge count?

It depends entirely on your definition of (1) tree, and (2) height. But we certainly wish to maintain the property that height is a total function from trees to integers; there should be no tree of undefined height.
Suppose for example we have this definition of a binary tree:
A tree is defined as either (1) the empty tree, or (2) a pair of trees, called the left and right subtrees.
type t = Empty | Node of t * t
Now we can define height, which should be a total function: the height of an empty tree is zero -- what else could it be? -- and the height of a non-empty tree is the larger of the heights of the sub-trees plus one:
let max x y = if x > y then x else y
let rec height tree = match tree with
| Empty -> 0
| Node (left, right) -> 1 + max (height left) (height right)
Now, notice the chain of logic that got us here:
height is a total function
empty is a legal tree
therefore an empty tree must have a height
the only sensible height for an empty tree is zero
therefore the height of a tree with a single node must be one.
If we deny some of those premises then we can come up with other answers. For example, what if there were no empty trees?
A tree is defined as a list, possibly empty, of trees:
type t = Node of t list
And again we could come up with a definition of height: the height of a node with an empty list is defined as zero, and the height of a node with non-empty children is the largest child height plus one.
let max x y = if x > y then x else y
let rec height tree = match tree with
| Node [] -> 0
| Node (h :: t) -> max (1 + height h) (height (Node t))
In this definition the height of a tree with a single node is zero, and we are counting edges. Again, look at our reasoning:
height is a total function
an empty tree is not a legal tree, but a leaf is
therefore a leaf must have a height
a sensible height for a leaf is zero
therefore a tree that is a single leaf could have height zero.
But we could also have said that the height of a leaf is one, with the same definition otherwise, and we'd be counting nodes. There's no objection to that logically.
In which cases is the node count used, and in which cases the edge count?
If an empty tree is legal then plainly only the node count makes sense. If we try to count edges then there is no way to distinguish the height of the empty tree from the height of a single-node tree, and keep height a total function.
If an empty tree is not legal then either makes sense. Since the relationship between the two height functions is "they differ by exactly one", it doesn't matter which definition you use; if you want to use the other definition, just add or subtract one appropriately.
When balancing a tree we don't care about the absolute heights; we care about the differences in heights between two trees. In those algorithms whether we count edges or nodes is irrelevant. The differences will be the same regardless. A lot of the time it doesn't matter, so pick whichever you like better.
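The point about differences can be checked with a small sketch. It is written in Python rather than OCaml, and it assumes that the edge-counting version gives the empty tree a height of -1, a convention the answer above does not use. The names `Node`, `height_nodes`, `height_edges` and `balance` are made up for this illustration.

```python
class Node:
    """A binary tree node; None stands for the empty tree."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def height_nodes(tree):
    """Height counted in nodes: the empty tree has height 0."""
    if tree is None:
        return 0
    return 1 + max(height_nodes(tree.left), height_nodes(tree.right))

def height_edges(tree):
    """Height counted in edges: ASSUMES the empty tree has height -1."""
    if tree is None:
        return -1
    return 1 + max(height_edges(tree.left), height_edges(tree.right))

def balance(tree, height):
    """Difference between the subtree heights under the given height function."""
    return height(tree.left) - height(tree.right)

single = Node()
lopsided = Node(Node(Node()), Node())

print(height_nodes(single), height_edges(single))  # 1 0
print(balance(lopsided, height_nodes), balance(lopsided, height_edges))  # 1 1
```

The two height functions differ by exactly one on every tree, so any difference of heights is the same under both.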

The height of a node is the number of edges on the longest path from the node to a leaf. A leaf node has a height of 0. The height of a tree is the height of its root node.
In your case the height of the tree will be 0.
For a detailed answer, see: What is the difference between tree depth and height?

Related

Finding Height of a node in an AVL Tree for Balance Factor calculation

An AVL tree is a binary search tree that is balanced, i.e. height = O(log(n)). This is achieved by making sure every node satisfies the AVL tree property:
Height of the left subtree (LST) - Height of the right subtree (RST) is in the set {-1, 0, 1}
where Height(LST) - Height(RST) is called the balance factor (BF) of a given node.
The height of a node is usually defined as "the length of the path (#edges) from that node to the deepest node".
By this definition, the height of a leaf is 0.
But almost every time AVL trees are discussed, people consider the height of a leaf to be 1.
My question is: can we take the height of a leaf to be 0? This would make the following BSTs AVL trees as well, right?
Height concept confuses me because of these articles:
https://www.geeksforgeeks.org/minimum-number-of-nodes-in-an-avl-tree-with-given-height/
https://www.tutorialspoint.com/minimum-number-of-nodes-in-an-avl-tree-with-given-height-using-cplusplus
Firstly, they start the height at 0.
Then they say that the minimum number of nodes required for an AVL tree of height 2 is 4. BUT if height starts at zero, I can have the following AVL trees too, right?
By this definition, height of leaf is 0.
This is correct.
BUT if height is starting with zero, I can have the following AVL trees too, right?
No, the parent of the leaf has height 1, as the path from that node to the leaf has 1 edge.
So it should be:
      O    -- height 2, balance factor 2
     /
    O      -- height 1, balance factor 1
   /
  O        -- height 0
The balance factor of the root is 2 because the height of its left subtree is 1 and that of its right subtree is -1 (!). Note that if a single node has height 0, then an empty tree has height -1. This is also what is mentioned in Wikipedia:
The height of a node is the length of the longest downward path to a leaf from that node. [...] The root node has depth zero, leaf nodes have height zero, and a tree with only a single node (hence both a root and leaf) has depth and height zero. Conventionally, an empty tree (tree with no nodes, if such are allowed) has height −1.
And so these are not valid AVL trees: they need to be rebalanced.
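A minimal sketch of this check, assuming trees are written as nested `(left, right)` tuples with `None` for the empty tree and using the convention above that the empty tree has height -1. The helper names are made up for the illustration, and only the shape is checked, not the BST ordering.

```python
def height(tree):
    """Edge-count height; the empty tree (None) has height -1."""
    if tree is None:
        return -1
    left, right = tree
    return 1 + max(height(left), height(right))

def balance_factor(tree):
    """Height of the left subtree minus height of the right subtree."""
    left, right = tree
    return height(left) - height(right)

def is_avl_shape(tree):
    """True if every node has a balance factor of -1, 0 or 1."""
    if tree is None:
        return True
    left, right = tree
    return (abs(balance_factor(tree)) <= 1
            and is_avl_shape(left)
            and is_avl_shape(right))

leaf = (None, None)
chain = ((leaf, None), None)  # three nodes, each the left child of the previous

print(height(chain), balance_factor(chain))  # 2 2
print(is_avl_shape(chain))                   # False
print(is_avl_shape((leaf, None)))            # True
```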

Finding the number of AVL trees of height N

The definition of an AVL tree I have is:
"The balancing factor for vertex x in a binary search tree T is the difference between the height of x's left subtree and right subtree.
A binary tree T is called an AVL tree if the balancing factor of each of its vertices is either 0, -1, or 1."
I need to find a recursive formula for calculating the number of AVL trees of height N. I know the solution is:
V[i] = V[i-1]^2 + 2V[i-1]*V[i-2]
V[0] = 1
V[1] = 3
V[2] = 15
Can someone please explain? I am completely lost.
Got it myself thanks to n.m.'s comments.
The answer is as follows:
v[i] = v[i-1]v[i-1] + v[i-1]v[i-2] + v[i-2]v[i-1]
where the first term counts the trees whose two subtrees have the same height (balance factor 0), the second term those whose left subtree is taller (balance factor 1), and the third term those whose right subtree is taller (balance factor -1).
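As a quick check, the recurrence can be evaluated directly. This sketch assumes V[-1] = 1 (one empty tree), which is what makes V[1] = 3 come out of the same formula; the function name is made up for the illustration.

```python
def count_avl_shapes(h):
    """Number of AVL tree shapes of height h.

    Uses V[i] = V[i-1]^2 + 2*V[i-1]*V[i-2] with V[0] = 1 and the
    ASSUMED value V[-1] = 1 for the empty tree.
    """
    prev2, prev1 = 1, 1  # V[-1], V[0]
    for _ in range(h):
        prev2, prev1 = prev1, prev1 * prev1 + 2 * prev1 * prev2
    return prev1

print([count_avl_shapes(h) for h in range(4)])  # [1, 3, 15, 315]
```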

Convert AVL Trees to Red Black tree

I read somewhere that the nodes of any AVL tree T can be colored "red" and "black" so that T becomes a red-black tree.
This statement seems quite convincing, but I don't understand how to prove it formally.
According to wiki, a red-black tree should satisfy these five properties:
a. A node is either red or black.
b. The root is black. This rule is sometimes omitted, since the root can always be changed from red to black, but not necessarily vice versa.
c. All leaves (NIL) are black.
d. If a node is red, then both its children are black.
e. Every path from a given node to any of its descendant NIL nodes contains the same number of black nodes.
The first four conditions are quite simple; I got stuck on how to prove the fifth one (e).
First, define the height of a tree (as used for AVL trees):
height(leaf) = 1
height(node) = 1 + max(height(node.left), height(node.right))
Also, define the depth of a path (as used for red-black trees, a path is the chain of descendants from a given node to some leaf) to be the number of black nodes on the path.
As you point out, the tricky bit about coloring an AVL tree as a red-black tree is making sure that every path has the same depth. You will need to use the AVL invariant: that the subtrees of any given node can differ in height by at most one.
Intuitively, the trick is to use a coloring algorithm whose depth is predictable for a given height, such that you don't need to do any further global coordination. Then, you can tweak the coloring locally, to ensure the children of each node have the same depth; this is possible only because the AVL condition puts strict limits on their height difference.
This tree-coloring algorithm does the trick:
color_black(x):
    x.color = black
    if x is a node:
        color_children(x.left, x.right)

color_red(x):  // height(x) must be even
    x.color = red
    color_children(x.left, x.right)  // x will always be a node

color_children(a, b):
    if height(a) < height(b) or height(a) is odd:
        color_black(a)
    else:
        color_red(a)
    if height(b) < height(a) or height(b) is odd:
        color_black(b)
    else:
        color_red(b)
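The pseudocode can be tried out with a small Python sketch. It assumes NIL leaves are represented by `None` (height 1, implicitly black) and recomputes heights on every call for simplicity. The `Node` class, the checker `black_depth` and the test-shape generator `sparse_avl` are hypothetical helpers added for this illustration.

```python
class Node:
    """Internal node; None plays the role of a black NIL leaf."""
    def __init__(self, left=None, right=None):
        self.left, self.right, self.color = left, right, None

def height(x):
    """height(leaf) = 1, height(node) = 1 + max of the children's heights."""
    return 1 if x is None else 1 + max(height(x.left), height(x.right))

def color_black(x):
    if x is not None:  # NIL leaves are already black
        x.color = "black"
        color_children(x.left, x.right)

def color_red(x):  # only called on internal nodes of even height
    x.color = "red"
    color_children(x.left, x.right)

def color_children(a, b):
    ha, hb = height(a), height(b)
    (color_black if ha < hb or ha % 2 == 1 else color_red)(a)
    (color_black if hb < ha or hb % 2 == 1 else color_red)(b)

def black_depth(x):
    """Black nodes on every path from x down to a NIL leaf.

    Raises AssertionError if two paths disagree (property e) or if a
    red node has a red child (property d).
    """
    if x is None:
        return 1
    left, right = black_depth(x.left), black_depth(x.right)
    assert left == right
    if x.color == "red":
        assert all(c is None or c.color == "black" for c in (x.left, x.right))
    return left + (x.color == "black")

def sparse_avl(h):
    """A sparsest AVL shape of height h, used here only as test input."""
    if h == 1:
        return None
    if h == 2:
        return Node()
    return Node(sparse_avl(h - 1), sparse_avl(h - 2))

for h in range(2, 8):
    tree = sparse_avl(h)
    color_black(tree)
    print(h, black_depth(tree))
```

For these shapes the printed depth follows the pattern stated in the proof below: n for odd height 2*n-1 and n+1 for even height 2*n when the root is colored black.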
For the root of the AVL tree, call color_black(root) to ensure b.
Note that the tree is traversed in depth-first order, also ensuring a.
Note that red nodes all have even height. Leaves have height 1, so they will be colored black, ensuring c. Children of red nodes will either have odd height or will be shorter than their sibling, and will be marked black, ensuring d.
Finally, to show e. (that all paths from the root have the same depth), use induction on n>=1 to prove:
    for odd height = 2*n-1,
        color_black() creates a red-black tree, with depth n
    for even height = 2*n,
        color_red() sets all paths to depth n
        color_black() creates a red-black tree with depth n+1
Base case, for n = 1:
    for odd height = 1, the tree is a leaf;
        color_black() sets the leaf to black; the sole path has depth 1
    for even height = 2, the root is a node, and both children are leaves, marked black as above;
        color_red() sets the node to red; both paths have depth 1
        color_black() sets the node to black; both paths have depth 2
The induction step is where we use the AVL invariant: sibling trees can differ in height by at most 1. For a node with a given height:
    subcase A: both subtrees are (height-1)
    subcase B: one subtree is (height-1), and the other is (height-2)
Induction step: given that the hypothesis is true for n, show that it holds for n+1:
    for odd height = 2*(n+1)-1 = 2*n+1,
        subcase A: both subtrees have even height 2*n
            color_children() calls color_red() for both children;
            via the induction hypothesis, both children have depth n
            for the parent, color_black() adds a black node, for depth n+1
        subcase B: subtrees have heights 2*n and 2*n-1
            color_children() calls color_red() and color_black(), resp.;
            for even height 2*n, color_red() yields depth n (induction hyp.)
            for odd height 2*n-1, color_black() yields depth n (induction hyp.)
            for the parent, color_black() adds a black node, for depth n+1
    for even height = 2*(n+1) = 2*n+2,
        subcase A: both subtrees have odd height 2*n+1 = 2*(n+1)-1
            color_children() calls color_black() for both children, for depth n+1;
            from the odd height case above, both children have depth n+1
            for the parent, color_red() adds a red node, for unchanged depth n+1
            for the parent, color_black() adds a black node, for depth n+2
        subcase B: subtrees have heights 2*n+1 = 2*(n+1)-1 and 2*n
            color_children() calls color_black() for both children, for depth n+1;
            for odd height 2*n+1, color_black() yields depth n+1 (see above)
            for even height 2*n, color_black() yields depth n+1 (induction hyp.)
            for the parent, color_red() adds a red node, for depth n+1
            for the parent, color_black() adds a black node, for depth n+2 = (n+1)+1
Well, simple case for #5 is a single descendant, which is a leaf, which is black by #3.
Otherwise, the descendant node is red, which is required to have 2 black descendants by #4.
Then these two cases recursively apply at each node, so you'll always have the same amount of black nodes in each path.
Even if you can convert an AVL tree to a red-black tree, the cost is very large. The shape of a tree has nothing to do with the internal structure, which requires a total rebuilding.
In a red-black tree, the heights of the two subtrees of a node can differ by up to a factor of 2.

Finding the minimum and maximum height in a AVL tree, given a number of nodes?

Is there a formula to calculate the maximum and minimum height of an AVL tree, given a certain number of nodes?
For example:
Textbook question:
What is the maximum/minimum height for an AVL tree of 3 nodes, 5 nodes, and 7 nodes?
Textbook answer:
The maximum/minimum height for an AVL tree of 3 nodes is 2/2, for 5 nodes is 3/3, for 7 nodes is 4/3
I don't know if they figured it out by some magic formula, or if they draw out the AVL tree for each of the given heights and determined it that way.
The solution below is appropriate for working things out by hand and gaining an intuition; please see the exact formulas at the bottom of this answer for larger trees (54+ nodes). [1]
Well, the minimum height [2] is easy: just fill each level of the tree with nodes until you run out. That height is the minimum.
To find the maximum, do the same as for the minimum, but then go back one step (remove the last placed node) and see if adding that node to the opposite sub-tree (from where it just was) violates the AVL tree property. If it does, your max height is just your min height. Otherwise this new height (which should be min height+1) is your max height.
If you need an overview of what the properties of an AVL tree are, or just a general explanation of an AVL tree, Wikipedia is a great place to start.
Example:
Let's take the 7 node example case. You fill in all levels and find a completely filled tree of height 3. (1 at level 1, 2 at level 2, 4 at level 3. 1+2+4=7 nodes.) That means 3 is your minimum.
Now find the max. Remove that last node and place it on the left subtree instead of the right. The right subtree still has height 3, but the left subtree now has height 4. However these values differ by less than 2, so it is still an AVL tree. Therefore your max height is 4. (Which is min+1)
All three examples worked out below (note that the numbers correspond to order of placement, NOT value):
Formulas:
The technique shown above doesn't hold if you have a tree with a very large number of nodes. In this case, one can use the following formulas to calculate the exact min/max height [2].
Given n nodes [3]:
Minimum: ceil(log2(n+1))
Maximum: floor(1.44*log2(n+2)-.328)
If you're curious, the first time max-min>1 is when n=54.
[1] Thanks to Jamie S for bringing this failure at larger node counts to my attention.
[2] Technically, the height of a tree is the longest path length (in edges) between the root and any leaf node. However, the OP's textbook uses a common alternate definition of height as the number of levels in a tree. For consistency with the OP and Wikipedia, we use that definition in this post as well.
[3] These formulas are from the Wikipedia AVL page, with constants plugged in. The original source is Sorting and Searching by Donald E. Knuth (2nd Edition).
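For exact values without floating-point constants, the bounds can also be computed directly. This is a sketch under the same "height = number of levels" definition and for n >= 1. It swaps in a different technique from the formulas above: an integer bit length for the minimum, and the recurrence "fewest nodes for height h = fewest for h-1 + fewest for h-2 + 1" for the maximum. The function names are made up for the illustration.

```python
def min_height(n):
    """Smallest number of levels that can hold n nodes: ceil(log2(n + 1))."""
    return n.bit_length()

def max_height(n):
    """Largest h such that the sparsest AVL tree of height h has <= n nodes."""
    fewer, fewest, h = 0, 1, 1  # fewest nodes for heights h-1 and h
    while fewer + fewest + 1 <= n:
        fewer, fewest = fewest, fewer + fewest + 1
        h += 1
    return h

for n in (3, 5, 7):
    print(n, max_height(n), min_height(n))  # 3 2 2, then 5 3 3, then 7 4 3

first_gap = next(n for n in range(1, 1000) if max_height(n) - min_height(n) > 1)
print(first_gap)  # 54
```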
It's important to note the following defining characteristics of an AVL Tree.
AVL Tree Property
The nodes of an AVL tree abide by the BST property
AND The heights of the left and right sub-trees of any node differ by no more than 1.
Theorem: The AVL property is sufficient to maintain a worst case tree height of O(log N).
Note the following diagram.
- T1 is comprised of a T0 + 1 node, for a height of 1.
- T2 is comprised of T1 and a T0 + 1 node, giving a height of 2.
- T3 is comprised of a T2 for the left sub-tree and a T1 for the right sub-tree + 1 node, for a height of 3.
- T4 is comprised of a T3 for the left sub-tree and a T2 for the right sub-tree + 1 node, for a height of 4.
For these small trees, taking the ceiling of log2(N), where N is the number of nodes in the AVL tree, gives the height.
Example: T4 contains 12 nodes, and ceiling(log2(12)) = 4.
See the pattern developing here??
**The worst-case height is
Lets assume the number of nodes is n
Trying to find out the minimum height of an AVL tree would be the same as trying to make the tree complete i.e. fill all the possible nodes at each level and then move to the next level.
So at each level the number of eligible nodes increases by 2^(h-1) where h is the height of the tree.
So at h=1, nodes(1) = 2^(1-1) = 1 node
for h=2, nodes(2) = nodes(1)+2^(2-1) = 3 nodes
for h=3, nodes(3) = nodes(2)+2^(3-1) = 7 nodes
So just find the smallest h for which nodes(h) is greater than or equal to the given number of nodes n.
Now for the problem of maximum height of an AVL tree:-
Let's assume that the AVL tree has height h, with F(h) being the minimum number of nodes in an AVL tree of that height.
For the height to be as large as possible for the number of nodes, assume that its left subtree FL and right subtree FR differ in height by 1 (which still satisfies the AVL property).
Assume FL is a tree with height h-1 and FR is a tree with height h-2.
Then the number of nodes is
F(h)=F(h-1)+F(h-2)+1 (Eq 1)
Adding 1 on both sides :
F(h)+1=(F(h-1)+1)+ (F(h-2)+1) (Eq 2)
So we have reduced the maximum height problem to a Fibonacci sequence. And these trees F(h) are called Fibonacci Trees.
So, F(1)=1 and F(2)=2
So in order to get the maximum height, just find the largest h for which F(h) is less than or equal to n.
So applying (Eq 1):
F(3) = F(2) + F(1) + 1 = 4, so if n is between 4 and 6 the maximum height is 3.
F(4) = F(3) + F(2) + 1 = 7 and F(5) = F(4) + F(3) + 1 = 12, so if n is between 7 and 11 the maximum height is 4.
and so on.
http://lcm.csa.iisc.ernet.in/dsa/node112.html
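The link to the Fibonacci numbers in (Eq 2) can be written out explicitly. The helper name G and the indexing Fib(1) = Fib(2) = 1 are assumptions made for this derivation only:

```latex
\begin{aligned}
G(h) &:= F(h) + 1, \\
G(h) &= G(h-1) + G(h-2), \qquad G(1) = 2,\; G(2) = 3, \\
G(h) &= \mathrm{Fib}(h+2) \quad \text{with } \mathrm{Fib}(1) = \mathrm{Fib}(2) = 1, \\
F(h) &= \mathrm{Fib}(h+2) - 1 .
\end{aligned}
```

For example, F(3) = Fib(5) - 1 = 5 - 1 = 4 and F(4) = Fib(6) - 1 = 8 - 1 = 7, matching the values obtained from (Eq 1).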
It is roughly 1.44 * log2(n), where n is the number of nodes.
For a more detailed description of how that was derived, you can refer to this link, starting in the middle of page 13: http://www.compsci.hunter.cuny.edu/~sweiss/course_materials/csci335/lecture_notes/chapter04.2.pdf

Height of recursion tree vs the levels

I read that the height of the tree is lg n (base 2) and that the number of levels is lg n + 1. What's the difference between the two? Is the topmost level, or the level of the base cases, not included in the height? Can someone explain this with a practical example, using words rather than mathematical equations?
First, the height of a balanced tree is O(lg n); a completely unbalanced tree (such as a linked list) has height O(n).
The depth of a node is its distance from the root (and thus the height of a tree is the maximum depth of any node). From this definition, you can see that the root has depth 0, its children have depth 1, their children have depth 2, and so on. A level can be considered all nodes with the same depth.
Now consider the set of levels in a tree. The only way to have 0 levels is to have an empty tree; as soon as you have even a single node, there will be at least one level, the one containing the root node with depth 0. That is, there is a difference between labelling levels and counting levels. Level 1 is the node with depth 0; level 2 is the set of nodes with depth 1; level i is the set of nodes with depth i-1, until you get to level lg n + 1, consisting of nodes with depth lg n.
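A tiny sketch of the counting, assuming a recursion that halves a problem whose size n is a power of two until the size reaches 1 (the function name is made up for the illustration):

```python
def halving_levels(n):
    """Subproblem size at each level of a recursion that halves n down to 1."""
    sizes = [n]
    while sizes[-1] > 1:
        sizes.append(sizes[-1] // 2)
    return sizes

sizes = halving_levels(8)
print(sizes)           # [8, 4, 2, 1]
print(len(sizes) - 1)  # 3, the height: edges from the root to the base cases, lg 8
print(len(sizes))      # 4, the number of levels: lg 8 + 1
```

The height counts the steps between levels, while the number of levels also counts the starting level, which is where the extra 1 comes from.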
