Please explain the underlined statement in the picture. It's from Section 6.2 in CLRS. How is the subtree size at most 2n/3?
Remember that balance in binary trees is generally a good thing for time complexities! The worst-case time complexity occurs when the tree is as unbalanced as it can be. We're working with heaps here – heaps are complete binary trees. The most unbalanced a complete binary tree can be is when its bottommost level is half-full. This is shown below.
        -------*-------
        /              \
       *               *
      / \             / \
     /   \           /   \
    /     \         /     \
   /-------\       /-------\
  /---------\  <-- last level is half-full
Suppose there are m nodes in the last level. Then there must be m - 1 further nodes in the left subtree, above that level.
        -------*-------
        /              \
       *               *
      / \             / \
     /   \           /   \
    / m-1 \         /     \
   /-------\       /-------\
  /--- m ---\
Why? Well, in general, a full binary tree with m leaf nodes must have m - 1 internal nodes. Imagine these m leaf nodes represent players in a tournament: if one player is eliminated per game, there must be m - 1 games to determine the winner. Each game corresponds to an internal node, hence there are m - 1 internal nodes.
Because the tree is complete, the right subtree must also have m - 1 nodes.
        -------*-------
        /              \
       *               *
      / \             / \
     /   \           /   \
    / m-1 \         / m-1 \
   /-------\       /-------\
  /--- m ---\
Hence the total number of nodes (including the root) is:
n = 1 + [(m - 1) + m] + (m - 1)
  = 3m - 1
Let x = number of nodes in the left subtree. Then:
x = (m - 1) + m
  = 2m - 1
We can solve these simultaneous equations, eliminating the variable m (2n = 6m - 2 and 3x = 6m - 3):
2n - 3x = 1
x = (2n - 1) / 3
Hence x is less than 2n/3. This explains the original statement:
The children's subtrees each have size at most 2n/3 – the worst case occurs when the bottom level of the tree is exactly half full
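If you want to sanity-check the algebra, here is a quick sketch (my own, not from CLRS) that evaluates both counts for a few values of m:

# Check: with m nodes in the half-full bottom level, the whole heap has
# n = 3m - 1 nodes and the left subtree has x = 2m - 1 of them.
for m in (1, 2, 4, 1024):
    n = 3 * m - 1              # total nodes, including the root
    x = 2 * m - 1              # nodes in the left subtree
    assert x == (2 * n - 1) // 3
    print(f"m={m}: x/n = {x / n:.4f} (bound 2/3 = {2 / 3:.4f})")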
Let me explain as best as I can. This is about a binary tree stored in a vector.
According to the author, the implementation is as follows:
A simple structure for representing a binary tree T is based on a way of numbering
the nodes of T. For every node v of T, let f(v) be the integer defined as follows:
• If v is the root of T, then f(v) = 1
• If v is the left child of node u, then f(v) = 2f(u)
• If v is the right child of node u, then f(v) = 2f(u) + 1
The numbering function f is known as a level numbering of the nodes in a binary
tree T, because it numbers the nodes on each level of T in increasing order from
left to right, although it may skip some numbers (see figures below).
Let n be the number of nodes of T, and let fM be the maximum value of f(v)
over all the nodes of T. The vector S has size N = fM + 1, since the element of S at
index 0 is not associated with any node of T. Also, S will have, in general, a number
of empty elements that do not refer to existing nodes of T. For a tree of height h,
N = O(2^h). In the worst case, this can be as high as 2^n − 1.
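To make sure I understand the numbering, here is a small sketch (my own, not the book's code) computing f(v) from the root-to-node path:

def level_number(path):
    """f(v) for the node reached by a path of 'L'/'R' steps from the root."""
    f = 1                          # the root has f(v) = 1
    for step in path:
        f = 2 * f if step == 'L' else 2 * f + 1
    return f

print(level_number(''))     # 1 (the root)
print(level_number('LR'))   # 5
print(level_number('LRR'))  # 11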
Question:
The last statement, the worst case of 2^n − 1, does not seem right. Here n = number of nodes. I think he meant 2^h − 1 instead of 2^n − 1. Using figure a) as an example, 2^n − 1 would mean 2^15 − 1 = 32768 − 1 = 32767, which does not make sense.
Any insight is appreciated.
Thanks.
The worst case is when the tree degenerates into a chain from the root, where each node has two children, but at least one of them is always a leaf. When this chain has n nodes, the height of the tree is about n/2. The vector must span all the levels and allocate room for full levels, even though in this degenerate tree there is only one internal node per level. The size S of the vector will still be O(2^h), but since in this degenerate case h is n/2 = O(n), this makes S O(2^n) in the worst case.
The formula 2^n − 1 seems to suggest the author does not have a proper binary tree in mind, and then the above reasoning should be done with a degenerate tree that consists of a single chain where every node has at most one child.
Example of worst case
Here is an example tree (not a proper tree, but the principle for proper trees is similar):
  1
 /
2
 \
  5
   \
    11
So n = 4, and h = 3.
The vector however needs to store all the slots where nodes could have been, so something like this:
            _______1_______
           /               \
        __2__             __ __
       /     \           /     \
      _       5         _       _
     / \     / \       / \     / \
    _   _   _   11    _   _   _   _
...so the vector has a size of 1+2+4+8 = 15. (Even 16 when we account for the unused slot 0 in the vector)
This illustrates that the size S of the vector is always O(2^h). In this worst case (worst with respect to n, not with respect to h), S is O(2^n).
Example n=6
When n=6, we could have this as a best case:
    1
   / \
  2   3
 / \   \
4   5   7
This tree can be represented by a vector of size 8, where the entries at index 0 and index 6 are filled with nulls (unused).
However, for n=6 we could have a worst case ("worst" for the impact on the vector size) when the tree is very unbalanced:
1
 \
  2
   \
    3
     \
      4
       \
        5
         \
          7
Now the tree's height is 5 instead of 2, and the vector needs to put node 7 in the slot at index 63, so S is 64. Remember that the vector spans each complete binary level, and each level doubles in size.
So when n is 6, S can be 8, 16, 32, or 64; it depends on the shape of the tree. In each case we have S = O(2^h). But when we express S in terms of n, there is variation: the best case is S = O(n), while the worst case is S = O(2^n).
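To see the exponential blow-up numerically, here is a small sketch (my own) that lists the occupied slots for a chain of right children:

def chain_indices(n):
    """Slots occupied when every node is the right child of the previous one."""
    indices, f = [], 1             # the root has f(v) = 1
    for _ in range(n):
        indices.append(f)
        f = 2 * f + 1              # right child: f(v) = 2 f(u) + 1
    return indices

for n in (1, 2, 3, 6):
    idx = chain_indices(n)
    print(f"n={n}: occupied slots {idx}, vector size S = {idx[-1] + 1}")

For n = 6 this prints slots [1, 3, 7, 15, 31, 63] and S = 64, matching the example above.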
I'm curious whether it is possible to construct a binary search tree in such a way that it has minimal height for its n elements but is not an AVL tree.
Or, in other words: is every binary search tree with minimal height by definition also an AVL tree?
The AVL requirement is that the heights of the left and right subtrees of every node differ by at most 1.
An optimal BST of N elements, where D = log2(N), has the property that the sum of the depths is minimal. The effect is that every element resides at depth at most ceil(D).
To have a minimal sum of depths, the tree must be filled as full as possible from the top down, so that the sum of the individual depths is minimal.
Not optimal BST - and not AVL:

     f
    / \
   a   q
      / \
     n   x
    / \   \
   j   p   y

Elements: 8
Depths: 0 + 1 + 1 + 2 + 2 + 3 + 3 + 3 = 15
Optimal BST - and AVL:

      __ n __
     /       \
    f         q
   / \       / \
  a   j     p   x
                 \
                  y

Elements: 8
Depths: 0 + 1 + 1 + 2 + 2 + 2 + 2 + 3 = 13
So there is no non-AVL optimal BST.
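A quick check of the two depth sums above (my own sketch, representing each node as a (key, left, right) tuple):

def depths(tree, d=0):
    """List the depth of every node in the tree."""
    if tree is None:
        return []
    key, left, right = tree
    return [d] + depths(left, d + 1) + depths(right, d + 1)

non_avl = ('f', ('a', None, None),
                ('q', ('n', ('j', None, None), ('p', None, None)),
                      ('x', None, ('y', None, None))))
optimal = ('n', ('f', ('a', None, None), ('j', None, None)),
                ('q', ('p', None, None),
                      ('x', None, ('y', None, None))))

print(sum(depths(non_avl)))   # 15
print(sum(depths(optimal)))   # 13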
What would be the space complexity of breadth first search on a binary tree? Since it would only store one level at a time, I don't think it would be O(n).
The space complexity is in fact O(n), as witnessed by a perfect binary tree. Consider an example of depth four:
                 ______________14______________
                 /                             \
         ______24_______                 _______8_______
         /             \                 /             \
     __27__          __11__          __23__          __22__
     /     \         /     \         /     \         /     \
    4       5      13       2      17      12      26      25
   / \     / \     / \     / \     / \     / \     / \     / \
  29   0   9   6  16  19  20   1  10   7  21  15  18  30  28   3
Note that the number of nodes at each depth is given by
depth   num_nodes
0       1
1       2
2       4
3       8
4       16
In general, there are 2^d nodes at depth d. The total number of nodes in a perfect binary tree of depth d is n = 1 + 2^1 + 2^2 + ... + 2^d = 2^(d+1) - 1. As d grows, 2^d/n goes to 1/2, so roughly half of all nodes occur at the deepest level. Since the BFS queue must hold that entire level at its widest point, and n/2 = Θ(n), the space complexity is linear in the number of nodes.
The illustration credit goes to the binarytree package.
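To see the peak concretely, here is a small sketch (my own code, assuming a minimal Node class) that measures the largest queue size during BFS on perfect trees:

from collections import deque

class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def perfect(depth):
    """Build a perfect binary tree of the given depth."""
    return None if depth < 0 else Node(perfect(depth - 1), perfect(depth - 1))

def bfs_peak_queue(root):
    queue, peak = deque([root]), 1
    while queue:
        node = queue.popleft()
        for child in (node.left, node.right):
            if child:
                queue.append(child)
        peak = max(peak, len(queue))
    return peak

for d in range(1, 6):
    n = 2 ** (d + 1) - 1                      # nodes in a perfect tree of depth d
    print(d, n, bfs_peak_queue(perfect(d)))   # peak is 2^d, about n/2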
This algorithm is mergesort. I know it may look weird to you, but my main focus is on calculating the space complexity of this algorithm.
If we look at the recursion tree of the mergesort function and trace the algorithm, the stack depth will be log(n). But the code inside mergesort also creates two arrays of size n/2 each before merging, so should I first find the space complexity of the recurrence relation and then add n/2 + n/2 to it, giving O(log(n) + n)?
I know the answer, but I am confused about the process. Can anyone tell me the correct procedure?
The confusion is due to the merge function, which is not recursive but is called inside a recursive function.
And why do we say the space complexity will be O(log(n) + n), when by the definition of recursive-function space complexity we usually calculate the height of the recursion tree?
Merge(Leftarray, Rightarray, Array) {
    nL <- length(Leftarray)
    nR <- length(Rightarray)
    i <- j <- k <- 0
    while (i < nL && j < nR) {
        if (Leftarray[i] <= Rightarray[j])
            Array[k++] <- Leftarray[i++]
        else
            Array[k++] <- Rightarray[j++]
    }
    while (i < nL) {
        Array[k++] <- Leftarray[i++]
    }
    while (j < nR) {
        Array[k++] <- Rightarray[j++]
    }
}

Mergesort(Array) {
    n <- length(Array)
    if (n < 2)
        return
    mid <- n / 2
    Leftarray <- array of size (mid)
    Rightarray <- array of size (n - mid)
    for i <- 0 to mid-1
        Leftarray[i] <- Array[i]
    for i <- mid to n-1
        Rightarray[i-mid] <- Array[i]
    Mergesort(Leftarray)
    Mergesort(Rightarray)
    Merge(Leftarray, Rightarray)
}
MergeSort's time complexity is O(n lg n), which is fundamental knowledge. MergeSort's space complexity is always O(n), including with arrays. If you draw the space tree out, it will seem as though the space complexity is O(n lg n). However, since the code executes depth-first, you are only ever expanding along one branch of the tree at a time, so the total space required is always bounded by O(3n) = O(n).
For example, if you draw the space tree out, it seems like it is O(n lg n):
               16                        | 16
              /  \
             /    \
            /      \
           /        \
          8          8                   | 16
         / \        / \
        /   \      /   \
       4     4    4     4                | 16
      / \   / \  / \   / \
     2   2 2   2 ..............          | 16
    / \ / \ ...................
   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1      | 16
where the height of the tree is O(log n), suggesting a space complexity of O(n log n + n) = O(n log n). However, this is not the case for the actual code, because it does not execute in parallel. For example, this is how the code executes in the case where n = 16:
            16
           /
          8
         /
        4
       /
      2
     / \
    1   1
Notice how the amount of space used is 16 + 8 + 4 + 2 + 1 + 1 = 32 = 2n < 3n.
Then it merges upwards:
          16
         /
        8
       /
      4
     / \
    2   2
       / \
      1   1
which is 34 < 3n. Then it merges upwards:
          16
         /
        8
       / \
      4   4
         /
        2
       / \
      1   1
36 < 16 * 3 = 48.
Then it merges upwards:
        16
       /  \
      8    8
          / \
         4   4
        / \
       2   2
      / \
     1   1
16 + 16 + 14 = 46 < 3*n = 48.
In a larger case, n = 64:
    64
   /  \
  32    32
       /  \
      16    16
           /  \
          8    8
              / \
             4   4
                / \
               2   2
                  / \
                 1   1
which is 64 + 64 + 32 + 16 + 8 + 4 + 2 = 190 < 3*n = 3*64 = 192.
You can prove this by induction for the general case.
Therefore, the space complexity is always bounded by O(3n) = O(n), even if you implement it with arrays, as long as you clean up used space after merging and execute the code sequentially rather than in parallel.
As an example, here is a sketch of such an implementation (a minimal Python version of the same idea, with a counter that tracks live auxiliary cells to make the bound visible):
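peak = live = 0

def track(delta):
    """Update the count of live auxiliary array cells and the peak."""
    global peak, live
    live += delta
    peak = max(peak, live)

def mergesort(a):
    n = len(a)
    if n < 2:
        return
    mid = n // 2
    left, right = a[:mid], a[mid:]   # the two temporary arrays
    track(n)                         # n/2 + n/2 new cells are now live
    mergesort(left)
    mergesort(right)
    i = j = 0                        # merge back into a
    for k in range(n):
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1
    track(-n)                        # temporaries are freed after merging

import random
data = [random.randrange(100) for _ in range(64)]
mergesort(data)
assert data == sorted(data)
print(len(data), peak)   # peak auxiliary cells is 2n - 2; together with the
                         # input array itself that is about 3n, as argued above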
This implementation of MergeSort in the question is quite inefficient in memory space and has some bugs:
- the memory is not freed; I assume you rely on garbage collection.
- the target array Array is not passed to Merge by Mergesort.
Extra space in the amount of the size of Array is allocated by Mergesort at each recursion level, so at least twice the size of the initial array (2*N) is required even if the garbage collection is optimal, for example if it uses reference counts, and up to N*log2(N) space is used if the garbage collector is lazy. This is much more than required, as a careful implementation can use as little as N/2 extra space.
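For reference, here is a sketch (my own, not a definitive implementation) of the classic trick that achieves N/2 extra space: copy only the left half into a temporary buffer and merge it back into the original array.

def mergesort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place, using at most ~len(a)/2 auxiliary cells."""
    if hi is None:
        hi = len(a)
    if hi - lo < 2:
        return
    mid = (lo + hi) // 2
    mergesort(a, lo, mid)
    mergesort(a, mid, hi)
    tmp = a[lo:mid]                   # only the left half is copied
    i, j, k = 0, mid, lo
    while i < len(tmp) and j < hi:
        if tmp[i] <= a[j]:
            a[k] = tmp[i]; i += 1
        else:
            a[k] = a[j]; j += 1
        k += 1
    a[k:k + len(tmp) - i] = tmp[i:]   # any leftover left-half elements

The buffer for a merge is allocated only after its recursive calls return, so at any moment only one buffer is live, and the largest is the top-level one of size N/2.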
Dynamic Programming | Set 33 (Find if a string is interleaved of two other strings)
http://www.geeksforgeeks.org/check-whether-a-given-string-is-an-interleaving-of-two-other-given-strings-set-2/
I found a question on this website, and it said "The worst case time complexity of the recursive solution is O(2^n)." Therefore, I tried to draw a tree diagram of the worst case for this question. I assume that when a and b have the same length and the same characters, we get a worst case. The recursion splits into 2 branches until the length of a/b reaches 0 (using substring).
                aa,aa,aaaa
               /          \
              /            \
       a,aa,aaa          aa,a,aaa
        /     \           /     \
       /       \         /       \
  -,aa,aa    a,a,aa   a,a,aa    aa,-,aa
     |       /    \    /    \       |
     |      /      \  /      \      |
   -,a,a -,a,a  a,-,a -,a,a  a,-,a a,-,a
In this case, it has 13 nodes, which really is a worst case, but how do I calculate that step by step? If the length of c increases by 2, it grows to 49 nodes. If it increases to a large number, it will be difficult to draw a tree diagram.
Can someone explain this in detail, please?
The recurrence for the running time is
T(n) = 2T(n-1) + O(1), with T(0) = O(1)
If you draw the recursion tree, you'll see that you have a binary tree of height n.
Since it is a binary tree, it has up to 2^n leaves, hence the worst-case running time is O(2^n).
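To see this growth without drawing trees, here is a small sketch (my own, not the linked article's code) that counts the calls the naive recursion makes on the all-'a' worst case:

def count_calls(a, b, c):
    """Number of calls made by the naive interleaving recursion."""
    calls = 1
    if not a and not b:
        return calls
    if a and c and a[0] == c[0]:
        calls += count_calls(a[1:], b, c[1:])
    if b and c and b[0] == c[0]:
        calls += count_calls(a, b[1:], c[1:])
    return calls

for k in range(1, 8):
    n = 2 * k                         # n = length of c
    print(n, count_calls('a' * k, 'a' * k, 'a' * n))

# The count roughly quadruples each time k grows by 1 (n grows by 2), i.e. it
# grows like 2^n, matching the O(2^n) bound. (The numbers differ slightly from
# the hand-drawn diagram above, which stops as soon as one string is empty.)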