Number of nodes in the bottom level of a balanced binary tree - algorithm

I am wondering about two questions that came up when studying about binary search trees. They are the following:
What is the maximum number of nodes in the bottom level of a balanced binary search tree with n nodes?
What is the minimum number of nodes in the bottom level of a balanced binary search tree with n nodes?
I cannot find any formulas in my textbook regarding this. Is there any way to answer these questions? Please let me know.

Using notation:
H = Balanced binary tree height
L = Total number of leaves in a full binary tree of height H
N = Total number of nodes in a full binary tree of height H
The relation is L = (N + 1) / 2 as demonstrated below. That would be the maximum number of leaf nodes for a given tree height H. The minimum number of nodes at a given height is 1 (cannot be zero, because then the tree height would be reduced by one).
Drawing trees with increasing heights, one can observe that:
H = 1, L = 1, N = 1
H = 2, L = 2, N = 3
H = 3, L = 4, N = 7
H = 4, L = 8, N = 15
...
The relation between tree height (H) and the total number of leaves (L)
and the total number of nodes (N) becomes apparent:
L = 2^(H-1)
N = (2^H) - 1
The correctness is easily proven using mathematical induction.
Examples above show that it is true for small H.
Simply put in the value of H (e.g. H=1) and compute L and N.
Assuming the formulas are true for some H, one can show they are also true for HH=H+1:
For L, the assumption is that L=2^(H-1) is true.
As each node has two children, increasing the height by one
is going to replace each leaf node with two new leaves, effectively
doubling the total number of leaves. Therefore, in case of HH=H+1,
the total number of leaves (LL) is going to be doubled:
LL = L * 2
= 2^(H-1) * 2
= 2^(H)
= 2^(HH-1)
For N, the assumption is that N=(2^H)-1 is true.
Increasing the height by one (HH=H+1) increases the total number
of nodes by the total number of added leaf nodes. Therefore,
NN = N + LL
= (2^H) - 1 + 2^(HH-1)
= 2^(HH-1) - 1 + 2^(HH-1)
= 2 * 2^(HH-1) - 1
= (2^HH) - 1
Applying the mathematical induction, the correctness is proven.
H can be expressed in terms of N:
N = (2^H) - 1 // +1 to both sides
N + 1 = 2^H // apply log2 monotone function to both sides
log2(N+1) = log2(2^H)
= H * log2(2)
= H
The direct relation between L and N (which is the answer to the question asked) is:
L = 2^(H - 1) // replace H = log2(N + 1)
= 2^(log2(N + 1) - 1)
= 2^(log2(N + 1) - log2(2))
= 2^(log2( (N + 1) / 2 ))
= (N + 1) / 2
For Big O analysis, the constants are discarded, so the Binary Search Tree lookup time complexity (i.e. H with respect to the input size N) is O(log2(N)). Also, keeping in mind the formula for changing the logarithm base:
log2(N) = log10(N) / log10(2)
and discarding the constant factor 1/log10(2), where instead of 10 one can have an arbitrary logarithm base, the time complexity is simply O(log(N)) regardless of the chosen logarithm base constant.
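The three closed forms above can be checked mechanically. The following is a minimal sketch, not part of the original answer: the function names are made up for illustration, heights are counted as in the table (a single root has H = 1), and `heightForNodes` assumes N really has the form 2^H - 1.

```cpp
#include <cassert>
#include <cstdint>

// Leaves in a full binary tree of height H (the root alone counts as H = 1).
std::uint64_t leavesForHeight(unsigned H) {
    assert(H >= 1 && H < 64);
    return std::uint64_t{1} << (H - 1);      // L = 2^(H-1)
}

// Total nodes in a full binary tree of height H.
std::uint64_t nodesForHeight(unsigned H) {
    assert(H >= 1 && H < 64);
    return (std::uint64_t{1} << H) - 1;      // N = 2^H - 1
}

// Height recovered from the node count, H = log2(N + 1).
// Integer shifts are used instead of std::log2 to avoid rounding issues.
// Assumes N = 2^H - 1; for other N this yields floor(log2(N + 1)).
unsigned heightForNodes(std::uint64_t N) {
    assert(N >= 1);
    unsigned H = 0;
    for (std::uint64_t m = N + 1; m > 1; m >>= 1) ++H;
    return H;
}
```

For every H in range, `leavesForHeight(H)` equals `(nodesForHeight(H) + 1) / 2`, which is the relation L = (N + 1) / 2 derived above.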

Assuming that it's a full binary tree, the number of leaf nodes will always be equal to ⌊n/2⌋ + 1, which is the same as (n+1)/2 because n is odd in a full tree.
For the minimum, the bottom level could hold just 1 node (the tree still satisfies the condition that it should be balanced).

I got the answers from my professor.
1) Maximum number of nodes at the last level: ⌈n/2⌉
If there is a balanced binary search tree with 7 nodes, then the answer would be ⌈7/2⌉ = 4 and for a tree with 15 nodes, the answer would be ⌈15/2⌉ = 8.
But what is troubling is the fact that this formula gives the right answer only when the last level of a balanced tree is completely filled from left to right.
For example, for a balanced binary search tree with 5 nodes, the above formula gives an answer of 3, which is not true: the first two levels hold 1 + 2 = 3 nodes, so only 2 nodes are left for the last level. So I am guessing he meant a full balanced binary search tree.
2) Minimum number of nodes at the last level: 1

The maximum number of nodes at level L in a binary tree is 2^L (if you assume that the root is level 0). This is easy to see because at each level you spawn 2 children from each previous leaf. The fact that it is a balanced/search tree is irrelevant. The levels above level L are full and together hold 2^L - 1 nodes, so you have to find the biggest L such that 2^L - 1 < n and subtract that count from n. In math language, the bottom level holds n - (2^L - 1) nodes.
The minimum number of nodes depends on the way you balance your tree. There can be height-balanced trees, weight-balanced trees and, I assume, other kinds of balanced trees. Even with height-balanced trees you have to define what you mean by a balanced tree, because technically a tree of 2^N nodes that has a height of N + 2 is still a balanced tree.
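As a rough illustration of the level-by-level counting, here is a small sketch. It assumes the tree is filled one whole level at a time, so every level above the bottom one is full; the levels above level L then hold 2^L - 1 nodes in total, and that count is what gets subtracted from n. The function name `bottomLevelCount` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of nodes on the bottom level when n nodes are packed level by
// level, i.e. every level above the bottom one is completely full.
std::uint64_t bottomLevelCount(std::uint64_t n) {
    assert(n >= 1 && n < (std::uint64_t{1} << 62));
    std::uint64_t above = 0;   // nodes on the full levels consumed so far
    std::uint64_t width = 1;   // capacity of the current level: 2^L
    // Consume a whole level while more nodes remain than it can hold.
    while (n - above > width) {
        above += width;
        width *= 2;
    }
    return n - above;          // n - (2^L - 1)
}
```

Under this assumption, 7 nodes leave 4 on the bottom level, 5 nodes leave 2, and 8 nodes leave a single node on a new level.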

Related

Proof that a binary tree with n leaves has a height of at least log n

I've been able to create a proof that shows the maximum total number of nodes in a tree of height h is n = 2^(h+1) - 1, and logically I know that the height of a binary tree is log n (I can draw it out to see), but I'm having trouble constructing a formal proof to show that a tree with n leaves has height "at least" log n. Every proof I've come across or been able to put together always deals with perfect binary trees, but I need something for any situation. Any tips to lead me in the right direction?
Lemma: the number of leaves in a tree of height h is no more than 2^h.
Proof: the proof is by induction on h.
Base Case: for h = 0, the tree consists of only a single root node which is also a leaf; here, n = 1 = 2^0 = 2^h, as required.
Induction Hypothesis: assume that all trees of height k or less have no more than 2^k leaves.
Induction Step: we must show that trees of height k+1 have no more than 2^(k+1) leaves. Consider the left and right subtrees of the root. These are trees of height no more than k, one less than the height of the whole tree. Therefore, each has at most 2^k leaves, by the induction hypothesis. Since the total number of leaves is just the sum of the numbers of leaves of the subtrees of the root, we have n <= 2^k + 2^k = 2^(k+1), as required. This proves the claim.
Theorem: a binary tree with n leaves has height at least log(n).
We have already noted in the lemma that the tree consisting of just the root node has one leaf and height zero, so the claim is true in that case. For trees with more nodes, the proof is by contradiction.
Let n = 2^a + b where 0 < b <= 2^a. Now, assume the height of the tree is less than a + 1, contrary to the theorem we intend to prove. Then the height is at most a. By the lemma, the maximum number of leaves in a tree of height a is 2^a. But our tree has n = 2^a + b > 2^a leaves, since 0 < b; a contradiction. Therefore, the assumption that the height was less than a+1 must have been incorrect. This proves the claim.
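The lemma can also be spot-checked on a concrete, deliberately unbalanced tree. The sketch below is only an illustration: `Node`, `makeComb` and `lemmaHolds` are hypothetical names, and height is counted in edges so that a single node has height 0, as in the base case.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>

struct Node {
    std::unique_ptr<Node> left, right;
};

// Height in edges; a single node has height 0.
int height(const Node* t) {
    assert(t != nullptr);
    int h = 0;
    if (t->left)  h = std::max(h, 1 + height(t->left.get()));
    if (t->right) h = std::max(h, 1 + height(t->right.get()));
    return h;
}

// A leaf is a node without children.
long leaves(const Node* t) {
    assert(t != nullptr);
    if (!t->left && !t->right) return 1;
    long n = 0;
    if (t->left)  n += leaves(t->left.get());
    if (t->right) n += leaves(t->right.get());
    return n;
}

// Builds a lopsided example: a left spine of the given depth in which
// every spine node also gets a single leaf as its right child.
std::unique_ptr<Node> makeComb(int depth) {
    auto t = std::make_unique<Node>();
    if (depth > 0) {
        t->left = makeComb(depth - 1);
        t->right = std::make_unique<Node>();
    }
    return t;
}

// True when the lemma's bound, leaves <= 2^height, holds for this tree.
bool lemmaHolds(const Node* t) {
    return leaves(t) <= (1L << height(t));
}
```

A comb of depth 3 has height 3 and 4 leaves, comfortably below the bound of 2^3 = 8; equality is reached only by perfect trees.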

Number of comparisons to find an element in a BST with 635 elements?

I am a first-year Computer Science student, so please give me an understandable justification.
I have a height-balanced binary tree which has 635 nodes. What is the number of comparisons that will occur in the worst case scenario and why?
Here's one way to think about this. Every time you do a comparison in a binary search tree, one of the following happens:
You have walked off the tree. In this case, you're done.
The value you're looking for matches the node you're currently exploring. In this case, you're done.
The value you're looking for does not match the node you're exploring. In that case, you either descend to the left or descend to the right.
The key observation here is that after each step, you either terminate (yay!) or descend lower in the tree. At each point, you make one comparison. Since you can't descend forever, there are only so many comparisons that you can make - specifically, if the tree has height h, the maximum number of comparisons you can make is h + 1, which happens if you do one comparison per level.
In your question, you're given that you have a balanced binary search tree of 635 nodes. It's not 100% clear what "balanced" means in this context, since there are many different ways of determining whether a tree is balanced and they all lead to different tree heights. I'm going to assume that you are given a complete binary search tree, which is one in which all levels except the last are filled.
The reason this is important is that if you have a complete binary search tree of height h, it can have at most 2^(h+1) - 1 nodes in it. If we try to solve for the height of the tree in terms of the number of nodes, we get this:
n = 2^(h+1) - 1
n + 1 = 2^(h+1)
lg (n + 1) = h + 1
lg (n + 1) - 1 = h
Therefore, if you have the number of nodes n, you can determine the minimum height of a complete binary search tree holding n nodes. In your case, n = 635, so we get
lg (635 + 1) - 1 = h
lg (636) - 1 = h
9.312882955 - 1 = h
8.312882955 = h
Therefore, the tree has height 8.312882955. Of course, trees can't have fractional height, so we can take the ceiling to find that the height of the tree would be 9. Since the maximum number of comparisons made is h + 1, there are at most 10 comparisons made when doing a lookup.
Hope this helps!
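To see the bound in action, one can simulate lookups instead of solving for h. This is a sketch under two assumptions that are not in the question: the tree stores the keys 0 .. n-1, and it is the balanced BST obtained by always taking the middle key of a sorted range as the subtree root. One comparison is counted per visited node, as in the explanation above.

```cpp
#include <cassert>

// Number of nodes visited when searching for `key` in the BST obtained by
// recursively taking the middle of the sorted range [lo, hi] as the root.
int comparisons(int lo, int hi, int key) {
    int count = 0;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // root of the current subtree
        ++count;                        // one comparison per level
        if (key == mid) break;          // found: stop descending
        if (key < mid) hi = mid - 1;    // descend to the left
        else           lo = mid + 1;    // descend to the right
    }
    return count;
}

// Worst case over all keys stored in a tree holding 0 .. n-1.
int worstCaseComparisons(int n) {
    assert(n >= 1);
    int worst = 0;
    for (int key = 0; key < n; ++key) {
        int c = comparisons(0, n - 1, key);
        if (c > worst) worst = c;
    }
    return worst;
}
```

Calling `worstCaseComparisons(635)` returns 10, matching the h + 1 = 10 derived above.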
Without any loss of generality you can say that the maximum number of comparisons will be the height of the BST. You don't have to visit every node in the tree, because each comparison takes you closer to the node you are looking for.
Let's say it is a balanced BST (all levels except the last are completely filled).
For instance,
Level 0 --> Height 1 --> Number of nodes = 1
Level 1 --> Height 2 --> Number of nodes = 2
Level 2 --> Height 3 --> Number of nodes = 4
Level 3 --> Height 4 --> Number of nodes = 8
......
......
Level n --> Height n+1 --> Number of nodes = 2^n or 2^(h-1)
Using the above logic, you can derive the search time for best, worst or average case.

Best case height for a binary tree with N internal nodes

I am working through Algorithms in C++ by Robert Sedgewick and came across the following statement:
The height of a binary tree with N internal nodes is at least lg N
and at most N-1. The best case occurs in a balanced tree with 2^i
internal nodes at every level except possibly the bottom level. If the
height is "h" then we must have
2^(h-1) < N+1 <= 2^h
since there are N+1 external nodes.
There wasn't much explanation surrounding the inequality, so my question is: how did the author deduce the inequality and what is it showing exactly?
Thanks!
The inequality 2^(h-1) < N + 1 <= 2^h demonstrates that, for a given height h, there is a range of node quantities that will have h as a minimum height in common. This is indicative of the property: all binary trees containing N nodes will have a height of at least log2(N) rounded up to the next integer (that is, floor(log2(N)) + 1, so an exact power of two such as N = 4 gives 3, not 2).
For example, a tree with either 4, 5, 6 or 7 nodes can have at best a minimum height of 3. One less than this range, and you can have a tree of height 2; one more and the best you can do is a height of 4.
If we map out the minimum height for a tree that grows from 3 nodes to 8 nodes using the base 2 logarithms for N and round up, the inequality becomes clear:
log(3) = 1.58 -> 2 [lower bound]
log(4) = 2 -> 3 [2^(h-1)]
log(5) = 2.32 -> 3
log(6) = 2.58 -> 3
log(7) = 2.81 -> 3
log(8) = 3 -> 4 [2^h | upper bound]
It might be useful to notice that the range (made up of N+1 different quantities) is directly related to the number of external nodes for a given tree. Take a tree with 3 nodes and having a height of 2:
  *
 / \
*   *
add one node to this tree,
    *              *              *              *
   / \            / \            / \            / \
  *   *    or    *   *    or    *   *    or    *   *
 /                \                /                \
*                  *              *                  *
and regardless of where you place it, the height will increase by 1. We can then keep creating leaf nodes without changing the height until the tree contains 7 nodes in total, at which point, any further additions will increase the minimum possible height once more:
     *
   /   \
  *     *
 / \   / \
*   * *   *
Originally, N was equal to 3 nodes, which meant N+1 = 4 and we saw that there were 4 quantities that had a common minimum height.
If you need more information, I suggest you look up the properties of complete and balanced binary trees.
Let's call the minimum height required to fit N nodes in a binary tree minheight(N).
One way to derive a lower bound on the tree height for a given number N of nodes is to work from the other direction: given a tree of height h, what is the maximum number of nodes that can be packed into it?
Let's call this function of height maxnodes(h). Clearly the number of nodes on a binary tree of given height is maximised when the tree is full, i.e. when each internal node has 2 children. Induction will quickly show that maxnodes(h) = 2^h - 1.
So, if we have N nodes, every h for which maxnodes(h) >= N is an upper bound for minheight(N): that is, you could fit all N nodes on a tree of that height. Of all these upper bounds, the best (tightest) one will be the minimum. So what we want to find is the smallest h such that
N <= maxnodes(h) = 2^h - 1
So how to find this smallest satisfying value of h?
The important property of maxnodes(h) is that it is nondecreasing w.r.t. h (in fact it's strictly increasing, but nondecreasing is sufficient). What that means is that you can never fit more nodes into a full binary tree by reducing its height. (Obvious really but it helps to spell things out sometimes!) This makes rearranging the above equation to find the minimum value of h easy:
2^h - 1 >= N
2^h >= N+1 # This is the RHS of your inequality, just flipped around
h >= log2(N+1) # This step is only allowed because log(x) is nondecreasing
h must be integer, so the smallest value of h satisfying h >= log2(N+1) is RoundUp(log2(N+1)).
I find this to be the most useful way to describe the lower bound, but it can be used to derive the LHS of the inequality you're asking about. Starting from the 2nd equation in the previous block:
2^h >= N+1
The set of integer h values that satisfy this inequality begins at h = RoundUp(log2(N+1)) and stretches out to positive infinity. Since that h is the minimum satisfying value in this set, anything lower must not satisfy the inequality, so in particular h-1 will not satisfy it. If a >= inequality does not hold between two real (non-infinite) numbers then the corresponding < inequality must hold, so:
2^(h-1) < N+1
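The two functions from this answer translate almost directly into code. The sketch below follows the answer's convention, in which maxnodes(h) = 2^h - 1; the helper `inequalityHolds` is an added, hypothetical name that tests Sedgewick's inequality for h = minheight(N).

```cpp
#include <cassert>
#include <cstdint>

// Most nodes that fit into a binary tree of height h: 2^h - 1.
std::uint64_t maxnodes(unsigned h) {
    assert(h < 64);
    return (std::uint64_t{1} << h) - 1;
}

// Smallest h with maxnodes(h) >= N, i.e. RoundUp(log2(N + 1)).
// A linear scan is enough here because h never exceeds 62.
unsigned minheight(std::uint64_t N) {
    assert(N < (std::uint64_t{1} << 62));
    unsigned h = 0;
    while (maxnodes(h) < N) ++h;
    return h;
}

// Checks 2^(h-1) < N + 1 <= 2^h for h = minheight(N).
bool inequalityHolds(std::uint64_t N) {
    assert(N >= 1);
    unsigned h = minheight(N);
    std::uint64_t lower = std::uint64_t{1} << (h - 1);
    std::uint64_t upper = std::uint64_t{1} << h;
    return lower < N + 1 && N + 1 <= upper;
}
```

With these definitions, `minheight` gives 2 for N = 3, 3 for N = 4 through 7, and 4 for N = 8, the same steps as in the table of logarithms in the earlier answer.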

binary tree data structures

Can anybody give me a proof that the number of nodes in a strictly binary tree is 2n-1, where n is the number of leaf nodes?
Proof by induction.
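The base case is a single leaf: the tree has one node, and 2*1 - 1 = 1. Suppose the claim is true for k leaves. Then you should prove it for k+1. Take a deepest leaf in a tree with k+1 leaves: by the definition of a strictly binary tree its parent has a second child, and that sibling is also a leaf. Removing these two leaves turns the parent into a leaf, so what remains is a strictly binary tree with k leaves, which has 2k-1 nodes by the induction hypothesis. Adding the two leaves back, the actual number of nodes is (2k-1) + 2 = 2k+1 = 2*(k+1)-1.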
Just go with the basics. Assume there are x nodes in total and n >= 2, so the root is not a leaf. Then n nodes have degree 1 (the leaves), 1 node has degree 2 (the root) and x-n-1 nodes have degree 3 (the inner nodes).
A tree with x nodes has x-1 edges, and every edge contributes 2 to the total degree, so summing:
n + 3*(x-n-1) + 2 = 2(x-1) (equating the total degrees)
solving for x we get x = 2n-1
I'm guessing that what you really want is something like a proof that the depth is log2(N), where N is the number of nodes. In this case, the answer is fairly simple: for any given depth D, the number of nodes is 2^D.
Edit: in response to edited question: the same fact pretty much applies. Since the number of nodes at any depth is 2^D, the number of nodes further up the tree is 2^(D-1) + 2^(D-2) + ... + 2^0 = 2^D - 1. Therefore, the total number of nodes in a balanced binary tree is 2^D + 2^D - 1. If you set n = 2^D, you've gone the full circle back to the original equation.
I think you are trying to work out a proof for: N = 2L - 1 where L is the number
of leaf nodes and N is the total number of nodes in a binary tree.
For this formula to hold you need to put a few restrictions on how the binary
tree is constructed. Each node is either a leaf, which means it has no children, or
it is an internal node. Internal nodes have 3
possible configurations:
2 leaf nodes
1 leaf and 1 internal node
2 internal nodes
All three configurations imply that an internal node connects to two child nodes. This explicitly
rules out the situation where a node connects to a single child as in:
  o
 /
o
Informal Proof
Start with a minimal tree of 1 leaf: L = 1, N = 1. Substitute into N = 2L - 1 and see that
the formula holds true (1 = 1, so far so good).
Now add another minimal chunk to the tree. To do that you need to add another two nodes and
the tree looks like:
  o
 / \
o   o
Notice that you must add nodes in pairs to satisfy the restriction stated earlier.
Adding a pair of nodes always adds
one leaf (two new leaf nodes, but you lose one as it becomes an internal node). Node growth
progresses as the series: 1, 3, 5, 7, 9... but leaf growth is: 1, 2, 3, 4, 5... That is why the formula
N = 2L - 1 holds for this type of tree.
You might use mathematical induction to construct a formal proof, but this works fine for me.
Proof by mathematical induction:
The statement that there are (2n-1) of nodes in a strictly binary tree with n leaf nodes is true for n=1. { tree with only one node i.e root node }
let us assume that the statement is true for tree with n-1 leaf nodes. Thus the tree has 2(n-1)-1 = 2n-3 nodes
to form a tree with n leaf nodes we need to add 2 child nodes to any of the leaf nodes in the above tree. Thus the total number of nodes = 2n-3+2 = 2n-1.
hence, proved
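The induction step, turning one leaf into an internal node with two new leaves, can be simulated directly. This is a sketch only: the index-based `Node` layout, `growStrictTree` and the fixed pattern for choosing which leaf to expand are all made up for illustration; any choice of leaf gives the same counts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    int left = -1;    // index of the left child, -1 if none
    int right = -1;   // index of the right child, -1 if none
};

// Grows a strictly binary tree with `targetLeaves` leaves by repeatedly
// giving some existing leaf two children, as in the induction step.
std::vector<Node> growStrictTree(std::size_t targetLeaves) {
    assert(targetLeaves >= 1);
    std::vector<Node> tree(1);            // the root alone: one leaf
    std::vector<int> leafIds{0};          // indices of the current leaves
    std::size_t step = 0;
    while (leafIds.size() < targetLeaves) {
        // Pick a leaf in a fixed but arbitrary pattern.
        std::size_t slot = (step * 7 + 3) % leafIds.size();
        int id = leafIds[slot];
        int l = static_cast<int>(tree.size());
        tree.push_back(Node{});           // new left leaf
        tree.push_back(Node{});           // new right leaf
        tree[id].left = l;
        tree[id].right = l + 1;
        leafIds[slot] = l;                // the expanded node is now internal
        leafIds.push_back(l + 1);
        ++step;
    }
    return tree;
}

// Counts leaves by inspecting the finished tree, not the bookkeeping above.
std::size_t countLeaves(const std::vector<Node>& tree) {
    std::size_t n = 0;
    for (const Node& node : tree)
        if (node.left == -1 && node.right == -1) ++n;
    return n;
}
```

For any n, `growStrictTree(n).size()` comes out as 2n - 1 while `countLeaves` reports n, which is the statement being proved.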
To prove: A strictly binary tree with n leaves contains 2n-1 nodes.
Show P(1): A strictly binary tree with 1 leaf contains 2(1)-1 = 1 node.
Show P(2): A strictly binary tree with 2 leaves contains 2(2)-1 = 3 nodes.
Show P(3): A strictly binary tree with 3 leaves contains 2(3)-1 = 5 nodes.
Assume P(K): A strictly binary tree with K leaves contains 2K-1 nodes.
Prove P(K+1): A strictly binary tree with K+1 leaves contains 2(K+1)-1 nodes.
2(K+1)-1 = 2K+2-1
= 2K+1
= 2K-1 +2*
* This result indicates that, for each leaf that is added, another node must be added to the father of the leaf , in order for it to continue to be a strictly binary tree. So, for every additional leaf, a total of two nodes must be added, as expected.
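#include <iostream>

int main()
{
    int N = 1024;        // insert here the number of leaves; must be a power of two
    int sum = 0;         // the number of total nodes
    int currFactor = 1;  // the number of nodes in the current level
    // there are log2(N) + 1 levels; the last one holds all N leaves
    while (currFactor <= N)
    {
        sum += currFactor;
        currFactor *= 2; // in each level the number of nodes is double that of the upper level
    }
    if (sum == 2 * N - 1)
    {
        std::cout << "wow that the number of nodes is 2*N-1" << std::endl;
    }
    return 0;
}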

Minimum and maximum height of binary search trees, 2-3-4 trees and B trees

Can anyone please tell me how you find the min/max height of B trees, 2-3-4 trees and binary search trees?
Thanks.
PS: This is not homework.
Minimal and Maximal height of a 2-4 tree
For maximal height of a 2-4 tree, we will be having one key per node, hence it will behave like a Binary Search Tree.
keys at level 0 = 1
keys at level 1 = 2
keys at level 2 = 4 and so on. . . .
Adding total number of keys at each level we get a GP on solving which we will get the maximal height of the tree.
Hence, height = log2(n+1) - 1
Solving it for a total of 10^6 keys we will get :
⇒ 1 * (2^0+ 2^1 + 2^2 + . .. . . . +2^h) = 10^6
⇒ 1*(2^(h+1) - 1) = 10^6
⇒ h = log2(10^6 + 1) - 1
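⇒ h ≈ 18.93, and the height must be a whole number: a tree of height 19 would need at least 2^20 - 1 = 1,048,575 keys, so the maximal height of a 2-4 tree with a total of 10^6 keys is 18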
For minimal height of a 2-4 tree, we will be having three keys(maximum possible number) per node.
keys at level 0 = 3
keys at level 1 = 3*(4)
keys at level 2 = 3*(4^2) and so on . . .
Hence, height = log4(n+1) - 1
Adding total number of keys at each level we will get a GP on solving which we will get the minimal height. Solving it for a total of 10^6 keys we get:
⇒ 3 * (4^0+ 4^1 + 4^2 + . .. . . . +4^h) = 10^6
⇒ (4^(h+1) - 1) = 10^6
⇒ h = log4(10^6 + 1) - 1
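⇒ h ≈ 8.97, rounded up because a tree of height 8 holds at most 4^9 - 1 = 262,143 keys: the minimal height of a 2-4 tree with 10^6 keys is 9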
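Because the height has to be a whole number, the real-valued logarithms need rounding in the right direction. The following sketch avoids floating point altogether; it assumes, as above, that the root is at level 0, that the tallest tree has one key per node and that the shortest tree has three keys per node. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Largest height h of a 2-4 tree with n keys. With one key per node a tree
// of height h needs at least 2^(h+1) - 1 keys, so h grows only while the
// next taller minimal tree still fits into n keys.
unsigned maxHeight24(std::uint64_t n) {
    assert(n >= 1 && n < (std::uint64_t{1} << 60));
    unsigned h = 0;
    while ((std::uint64_t{1} << (h + 2)) - 1 <= n) ++h;
    return h;
}

// Smallest height h of a 2-4 tree with n keys. With three keys per node a
// tree of height h holds at most 4^(h+1) - 1 keys.
unsigned minHeight24(std::uint64_t n) {
    assert(n >= 1 && n < (std::uint64_t{1} << 60));
    unsigned h = 0;
    std::uint64_t capacity = 3;           // 4^(h+1) - 1 for h = 0
    while (capacity < n) {
        capacity = capacity * 4 + 3;      // 4 * (4^(h+1) - 1) + 3 = 4^(h+2) - 1
        ++h;
    }
    return h;
}
```

The maximal height is rounded down (a taller tree would not have enough keys to fill its nodes) and the minimal height is rounded up (a shorter tree would not have enough room).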
Binary Search Tree
For maximal height we will have a continuous chain of length n(total number of nodes) hence giving us a height equal to n-1(as height starts from 0).
For minimal height we will have a perfectly balanced tree and as solved earlier we will have a height equal to log2(n+1)-1
If you want to know the length of the longest branch you have to traverse the whole tree keeping note of "the longest branch so far".
Start from the root node at depth 0 and put it in a stack
while the stack is not empty
    take a node from the stack
    if it has child nodes then
        store all of them in the stack, one level deeper than the node
    else
        if the depth of that node is the maximum till now
            set it as max
        end if
    end if
end while
Once all nodes of the tree have been visited, max is the maximum height.
You can do the same for the minimum, tracking the shallowest leaf instead.
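A possible C++ rendering of this traversal is sketched below. The `Node` type with a vector of children, the `DepthRange` result and the `sampleTree` helper are assumptions made for the example; the approach works for binary trees and for multi-way trees such as B-trees alike.

```cpp
#include <algorithm>
#include <cassert>
#include <stack>
#include <utility>
#include <vector>

struct Node {
    std::vector<Node> children;   // empty for a leaf
};

struct DepthRange {
    int shortest;   // depth of the shallowest leaf
    int longest;    // depth of the deepest leaf, i.e. the tree height
};

// Visits every node once, remembering the shallowest and deepest leaf so far.
DepthRange leafDepths(const Node& root) {
    DepthRange r{-1, -1};
    std::stack<std::pair<const Node*, int>> pending;
    pending.push({&root, 0});
    while (!pending.empty()) {
        const Node* node = pending.top().first;
        int depth = pending.top().second;
        pending.pop();
        if (node->children.empty()) {
            // A leaf ends a branch: update both records.
            r.longest = std::max(r.longest, depth);
            r.shortest = (r.shortest < 0) ? depth : std::min(r.shortest, depth);
        } else {
            for (const Node& child : node->children)
                pending.push({&child, depth + 1});
        }
    }
    assert(r.longest >= 0);   // every non-empty tree has at least one leaf
    return r;
}

// Small example: the root has a leaf child and a child with one leaf below it.
Node sampleTree() {
    Node root;
    root.children.resize(2);
    root.children[1].children.resize(1);
    return root;
}
```

For `sampleTree()` the shallowest leaf is at depth 1 and the deepest at depth 2.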
Minimal height of a binary tree is O(log n), maximal is O(n), depending on how balanced it is.
Wikipedia has a lovely bit about B Tree Heights.
I'm not familiar with 2-3-4 trees, but according to wikipedia they have similar isometry to red-black and B trees, so the above link should educate you on that as well.
As for B trees, the min/max heights depend on the branching factor chosen for the implementation.
Binary trees have a maximum height of n (counted in nodes) when input is inserted in order; the minimum height is of course about log_2(n) when the tree is perfectly balanced. When input is inserted in random order, the average depth of a node is about 1.39 * log_2 n.
I am not too familiar with b trees but the minimum height is of course log_m(n) when perfectly balanced (m is the number of children per node). According to Wikipedia the maximum height is log_(m/2)(n).
2-3 trees have a maximum height of log_2(n) when the tree consists of only 2-nodes and the minimum height is about log_3(n) [~0.631 log_2(n)] when the tree consists of only 3-nodes.

Resources