Proof for number of internal nodes in a tree - algorithm

I was reading about compressed tries and read the following:
a compressed trie is a tree which has L leaves and every internal node in the trie has at least 2 children.
Then the author wrote that a tree with L leaves such that every internal node has alteast 2 children, has atmost L-1 internal nodes. I am really confused why this is true.
Can somebody offer a inductive proof for it?

Inductive proof:
we will prove it by induction on L - the number of leaves in the tree.
base: a tree made out of 1 leaf is actually a tree with only a root. It has 0 internal nodes, and the claim is true.
assume the claim is true for a compressed tree with L leaves.
Step: let T be a tree with L+1 leaves. choose an arbitrary leaf, let it be l, and trim it.
In order to make the tree compressed again - you need to make l's father a leaf [if l's father has more then 2 sons including l, skip this step]. We do it by giving it the same value as l's brother, and trimming the brother.
Now you have a tree T' with L leaves.
By induction: T' has at most L-1 internal nodes.
so, T had L-1+1 = L internal nodes, and L+1 leaves, at most.
Q.E.D.
alternative proof:
a binary tree with L leaves has L-1 internal nodes (1 + 2 + 4 + ... + L/2 = L-1)
Since at "worst case" you have a binary tree [every internal node has at least 2 sons!], then you cannot have more then L-1 internal nodes!

You should try drawing a tree with L internal nodes, where every node has 2 children and there are L leaves. If you see why this is impossible, it won't be hard to figure out why it does work for L-1 internal nodes.

Ok, so I'll have a go.
Define trees for a start:
T_0 = { Leaf }
T_i = T_i-1 union { Node(c1, ..., cn) | n >= 1 && ci in T_i-1 }
Trees = sum T_i
Now, the (sketch) proof of your assertion.
It is easy to check it for T_0
For T_i: if t \in T_i it is either in T_i-1 or in the new elements. In the former case, use IH. In the latter case check the assertion (simple: if cis have L_i leaves, t has L = L_1 + ... + L_n leaves. It also hase no more than L_1 - 1 + L_2 - 1 + ... + L_n - 1 + 1 inner nodes (by IH for children, +1 for self). Because we assume each inner node has at least two children (that's the fact from the trie definition), it is no more than L_1 + l_2 + ... + L_n - 2 + 1 = L - 1).
By induction, if the assertion holds for t in T_i for all i, it holds for t in Trees.

Related

A weight-balanced tree is a binary tree in which for each node

A weight-balanced tree is a binary tree in which for each node, the number of nodes in the left sub tree is at least half and at most twice the number of nodes in the right sub tree. The maximum possible height (number of nodes on the path from the root to the furthest leaf) of such a tree on n nodes is best described by which of the following?
A. log2(n)
B. log4/3(n)
C. log3(n)
D. log3/2(n)
My Try:
The number of nodes in the left sub tree is at least half and at most twice the number of nodes in the right sub tree.
There n nodes in the tree, one node is root now (n-1) nodes are left. to get the maximum height of the tree we divide these (n-1) nodes in three parts each of size n−13
Now keep two parts in LST and one part in RST.
LST = 2∗(n−1)/3, and
RST=(n−1)/3
Therefore, T(n)= T(2/3(n-1) + (n-1)/3) and for maximum height we will only consider H(n)=H(2/3(n-1))+1
and H(1)=0
I tried to solve the H(n) Recurrence using substitution but i'm stuck at a point:
2^k/3^k(n-k)=1 Here how to solve for k ? Please help
You have done almost correct but there is a mistake in the recursion statement.
It is should be like this : Root + Left Sub Tree + Right Sub Tree.
And if we have taken 1 node for root than it will be 1 and rest remaining nodes will be
(n-1). And from the remaining nodes we have to distribute in such a way that we get
maximum height with the following condition.
Let nodes in LST be : nl and Let nodes in RST be : nr
Equation/Condtion : nr/2 <= nl <= 2*nr
From that we can get the value of nr which will be n-1/3. Refer the figure for
calculation.
Now we came to the point where you have done the mistake.
T(n) = T(n-1/3) + T(2(n-1)/3) + 1 Where T(1) = 1 And T(0) = (0)
And then we will get log 3/2 n in the end.
Figure with calculations :

Proof that a binary tree with n leaves has a height of at least log n

I've been able to create a proof that shows the maximum total nodes in a tree is equal to n = 2^(h+1) - 1 and logically I know that the height of a binary tree is log n (can draw it out to see) but I'm having trouble constructing a formal proof to show that a tree with n leaves has "at least" log n. Every proof I've come across or been able to put together always deals with perfect binary trees, but I need something for any situation. Any tips to lead me in the right direction?
Lemma: the number of leaves in a tree of height h is no more than 2^h.
Proof: the proof is by induction on h.
Base Case: for h = 0, the tree consists of only a single root node which is also a leaf; here, n = 1 = 2^0 = 2^h, as required.
Induction Hypothesis: assume that all trees of height k or less have fewer than 2^k leaves.
Induction Step: we must show that trees of height k+1 have no more than 2^(k+1) leaves. Consider the left and right subtrees of the root. These are trees of height no more than k, one less than the height of the whole tree. Therefore, each has at most 2^k leaves, by the induction hypothesis. Since the total number of leaves is just the sum of the numbers of leaves of the subtrees of the root, we have n = 2^k + 2^k = 2^(k+1), as required. This proves the claim.
Theorem: a binary tree with n leaves has height at least log(n).
We have already noted in the lemma that the tree consisting of just the root node has one leaf and height zero, so the claim is true in that case. For trees with more nodes, the proof is by contradiction.
Let n = 2^a + b where 0 < b <= 2^a. Now, assume the height of the tree is less than a + 1, contrary to the theorem we intend to prove. Then the height is at most a. By the lemma, the maximum number of leaves in a tree of height a is 2^a. But our tree has n = 2^a + b > 2^a leaves, since 0 < b; a contradiction. Therefore, the assumption that the height was less than a+1 must have been incorrect. This proves the claim.

Number of nodes in the bottom level of a balanced binary tree

I am wondering about two questions that came up when studying about binary search trees. They are the following:
What is the maximum number of nodes in the bottom level of a balanced binary search tree with n nodes?
What is the minimum number of nodes in the bottom level of a balanced binary search tree with n nodes?
I cannot find any formulas in my textbook regarding this. Is there any way to answers these questions? Please let me know.
Using notation:
H = Balanced binary tree height
L = Total number of leaves in a full binary tree of height H
N = Total number of nodes in a full binary tree of height H
The relation is L = (N + 1) / 2 as demonstrated below. That would be the maximum number of leaf nodes for a given tree height H. The minimum number of nodes at a given height is 1 (cannot be zero, because then the tree height would be reduced by one).
Drawing trees with increasing heights, one can observe that:
H = 1, L = 1, N = 1
H = 2, L = 2, N = 3
H = 3, L = 4, N = 7
H = 4, L = 8, N = 15
...
The relation between tree height (H) and the total number of leaves (L)
and the total number of nodes (N) becomes apparent:
L = 2^(H-1)
N = (2^H) - 1
The correctness is easily proven using mathematical induction.
Examples above show that it is true for small H.
Simply put in the value of H (e.g. H=1) and compute L and N.
Assuming the formulas are true for some H, one can show they are also true for HH=H+1:
For L, the assumption is that L=2^(H-1) is true.
As each node has two children, increasing the height by one
is going to replace each leaf node with two new leaves, effectively
doubling the total number of leaves. Therefore, in case of HH=H+1,
the total number of leaves (LL) is going to be doubled:
LL = L * 2
= 2^(H-1) * 2
= 2^(H)
= 2^(HH-1)
For N, the assumption is that N=(2^H)-1 is true.
Increasing the height by one (HH=H+1) increases the total number
of nodes by the total number of added leaf nodes. Therefore,
NN = N + LL
= (2^H) - 1 + 2^(HH-1)
= 2^(HH-1) - 1 + 2^(HH-1)
= 2 * 2^(HH-1) - 1
= (2^HH) - 1
Applying the mathematical induction, the correctness is proven.
H can be expressed in terms of N:
N = (2^H) - 1 // +1 to both sides
N + 1 = 2^H // apply log2 monotone function to both sides
log2(N+1) = log2(2^H)
= H * log2(2)
= H
The direct relation between L and N (which is the answer to the question asked) is:
L = 2^(H - 1) // replace H = log2(N + 1)
= 2^(log2(N + 1) - 1)
= 2^(log2(N + 1) - log2(2))
= 2^(log2( (N + 1) / 2 ))
= (N + 1) / 2
For Big O analysis, the constants are discarded, so the Binary Search Tree lookup time complexity (i.e. H with respect to the input size N) is O(log2(N)). Also, keeping in mind the formula for changing the logarithm base:
log2(N) = log10(N) / log10(2)
and discarding the constant factor 1/log10(2), where instead of 10 one can have an arbitrary logarithm base, the time complexity is simply O(log(N)) regardless of the chosen logarithm base constant.
Assuming that it's a full binary tree, the number of nodes in the leaf will always be equal to (n/2)+1.
For the minimum number of nodes, the total number of nodes could be 1 (satisfying the condition that it should be a balanced tree).
I got the answers from my professor.
1) Maximum number of nodes at the last level: ⌈n/2⌉
If there is a balanced binary search tree with 7 nodes, then the answer would be ⌈7/2⌉ = 4 and for a tree with 15 nodes, the answer would be ⌈15/2⌉ = 8.
But what is troubling is the fact that this formula gives the right answer only when the last level of a balanced tree is completely filled from left to right.
For example, a balanced binary search tree with 5 nodes, the above formula gives an answer of 3 which is not true because a tree with 5 nodes can contain a maximum nodes of 4 nodes at the last level. So I am guessing he meant full balanced binary search tree.
2) Minimum number of nodes at the last level: 1
The maximum number of nodes at level L in a binary tree is 2^L (if you assume that the vertex is level 0). This is easy to see because at each level you spawn 2 children from each previous leaf. The fact that it is balanced/search tree is irrelevant. So you have to find the biggest L such that 2^L < n and subtract it from n. Which in math language is:
The minimum number of nodes depends on the way you balance your tree. There can be height-balanced trees, weight-balanced trees and I assume other balanced trees. Even with height balanced trees you can define what do you mean by a balanced tree. Because technically a tree of 2^N nodes that has a hight of N + 2 is still a balanced tree.

Mathematical function to find number of leaf nodes in a binary tree [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
Improve this question
There is a unbalanced binary tree of unknown depth. The number of nodes having two child nodes is denoted by T2. The node having only one child is denoted by T1 and leaf nodes are denoted by L. If it is given that T1 = m and T2 = n nodes then can you define a mathematical function f(m, n) which gives number of leaf nodes L?
For example, in the below tree total T2 nodes are m = 3, and total T1 nodes are n = 2. The number of leaf nodes L = f(m,n) = 4. Can you find a mathematical function f(m,n) which gives number of leaf nodes for all trees?
This is pretty easy. In a fully binary tree (that is, every non-leaf has 2 children) of m internal nodes, there are exactly m+1 leaf nodes. Every node that has only one child can be removed, and you still have a binary tree. So, the number of leaf nodes in L is simply m+1. Or answering the question: f(m, n) = m + 1.
It might be useful to give an example of what I mean by "removing T1 nodes". Consider your example. The right 5 has only one child. If we remove the 5 and put the 9 under the 2 directly, the number of leafs does not change.
If we do the same for the 9 (put the 4 directly under the 2), we have a full binary tree, that is, all non-leafs have 2 children.
See the picture for a graphic explanation of how to remove all nodes of type T1 without changing the number of leaf nodes.
All that remains is to prove that in a tree of m internal nodes, where every non-leaf has exactly 2 children, the number of leaf nodes is m+1:
Proof by induction. Induction hypothesis: |L| = |T2|+1
Base: the tree consists of a single node. Clearly, |L|=1 and |T2|=0, so it holds.
Step: Consider a tree with a root that is not a leaf. By the assumption, it has two children, left and right. By the induction hypothesis: |Lleft|=|T2left| + 1 and |Lright| = |T2right| + 1. For the total tree, we have |T2| = |T2left| + |T2right| + 1 and |L| = |Lleft| + |Lright|. Therefore, |L| = |T2left| + 1 + |T2right| + 1 = |T2| + 1.
Alternative proof
The property can also be proved directly, without the handwaving argument of removing the T1 nodes. Again, by induction, with the induction hypothesis |L| = |T2| + 1.
Base: the tree is a single node, so |L| = 1 and |T2| = 0.
Step case 1: the tree has a root with only 1 child, X, then |L| = |LX| and |T2| = |T2X|, so |L| = |T2| + 1 by the induction hypothesis.
Step case 2: the tree has a root with two children, left and right. By the induction hypothesis: |Lleft|=|T2left| + 1 and |Lright| = |T2right| + 1. For the total tree, we have |T2| = |T2left| + |T2right| + 1 and |L| = |Lleft| + |Lright|. Therefore, |L| = |T2left| + 1 + |T2right| + 1 = |T2| + 1.
Therefore, |L| = |T2| + 1 or in other words f(m, n) = m + 1.

binary tree data structures

Can anybody give me proof how the number of nodes in strictly binary tree is 2n-1 where n is the number of leaf nodes??
Proof by induction.
Base case is when you have one leaf. Suppose it is true for k leaves. Then you should proove for k+1. So you get the new node, his parent and his other leaf (by definition of strict binary tree). The rest leaves are k-1 and then you can use the induction hypothesis. So the actual number of nodes are 2*(k-1) + 3 = 2k+1 == 2*(k+1)-1.
just go with the basics, assuming there are x nodes in total, then we have n nodes with degree 1(leaves), 1 with degree 2(the root) and x-n-1 with degree 3(the inner nodes)
as a tree with x nodes will have x-1 edges. so summing
n + 3*(x-n-1) + 2 = 2(x-1) (equating the total degrees)
solving for x we get x = 2n-1
I'm guessing that what you really want is something like a proof that the depth is log2(N), where N is the number of nodes. In this case, the answer is fairly simple: for any given depth D, the number of nodes is 2D.
Edit: in response to edited question: the same fact pretty much applies. Since the number of nodes at any depth is 2D, the number of nodes further up the tree is 2D-1 + 2D-2 + ...20 = 2D-1. Therefore, the total number of nodes in a balanced binary tree is 2D + 2D-1. If you set n = 2D, you've gone the full circle back to the original equation.
I think you are trying to work out a proof for: N = 2L - 1 where L is the number
of leaf nodes and N is the total number of nodes in a binary tree.
For this formula to hold you need to put a few restrictions on how the binary
tree is constructed. Each node is either a leaf, which means it has no children, or
it is an internal node. Internal nodes have 3
possible configurations:
2 child nodes
1 child and 1 internal node
2 internal nodes
All three configurations imply that an internal node connects to two other nodes. This explicitly
rules out the situation where node connects to a single child as in:
o
/
o
Informal Proof
Start with a minimal tree of 1 leaf: L = 1, N = 1 substitute into N = 2L - 1 and the see that
the formula holds true (1 = 1, so far so good).
Now add another minimal chunk to the tree. To do that you need to add another two nodes and
tree looks like:
o
/ \
o o
Notice that you must add nodes in pairs to satisfy the restriction stated earlier.
Adding a pair of nodes always adds
one leaf (two new leaf nodes, but you loose one as it becomes an internal node). Node growth
progresses as the series: 1, 3, 5, 7, 9... but leaf growth is: 1, 2, 3, 4, 5... That is why the formula
N = 2L - 1 holds for this type of tree.
You might use mathematical induction to construct a formal proof, but this works find for me.
Proof by mathematical induction:
The statement that there are (2n-1) of nodes in a strictly binary tree with n leaf nodes is true for n=1. { tree with only one node i.e root node }
let us assume that the statement is true for tree with n-1 leaf nodes. Thus the tree has 2(n-1)-1 = 2n-3 nodes
to form a tree with n leaf nodes we need to add 2 child nodes to any of the leaf nodes in the above tree. Thus the total number of nodes = 2n-3+2 = 2n-1.
hence, proved
To prove: A strictly binary tree with n leaves contains 2n-1 nodes.
Show P(1): A strictly binary tree with 1 leaf contains 2(1)-1 = 1 node.
Show P(2): A strictly binary tree with 2 leaves contains 2(2)-1 = 3 nodes.
Show P(3): A strictly binary tree with 3 leaves contains 2(3)-1 = 5 nodes.
Assume P(K): A strictly binary tree with K leaves contains 2K-1 nodes.
Prove P(K+1): A strictly binary tree with K+1 leaves contains 2(K+1)-1 nodes.
2(K+1)-1 = 2K+2-1
= 2K+1
= 2K-1 +2*
* This result indicates that, for each leaf that is added, another node must be added to the father of the leaf , in order for it to continue to be a strictly binary tree. So, for every additional leaf, a total of two nodes must be added, as expected.
int N = 1000; insert here the value of N
int sum = 0; // the number of total nodes
int currFactor = 1;
for (int i = 0; i< log(N); ++i) //the is log(N) levels
{
sum += currFactor;
currFactor *= 2; //in each level the number of node is double than the upper level
}
if(sum == 2*N - 1)
{
cout<<"wow that the number of nodes is 2*N-1";
}

Resources