Merging two Binary Search Trees efficiently to create one BST - algorithm

Say I have to create one BST by merging two BSTs, T1 and T2, where T1 has more nodes than T2, using this algorithm: for each node in T2, remove the node from T2 and insert its value into T1, and finally return T1. Reading other posts, the runtime of this algorithm seems to be O(m * log(n)), where n is the number of nodes in T1 and m is the number of nodes in T2. I am not 100% sure why this results in that time complexity, though. Could anyone explain to me why it's O(m * log(n))? My first guess was O(m * n), because within the iteration we have to remove each of the m nodes from T2, and each following insertion's runtime is O(n), so shouldn't it be O(m * n)?

You said Binary Search Tree, not Balanced Binary Search Tree. The time complexity of inserting a value into a BST (Binary Search Tree) is:
Best case: O(log(n)) (when the tree is fully balanced).
Worst case: O(n).
Since we are moving all the elements from T2 (initially having m elements) into T1 (initially having n elements), the overall time complexity is:
Best case: O(m.log(n)).
Worst case: O(m.n).
In practice we can consider the average time complexity, which lies between the best- and worst-case scenarios.
EDIT: Consider a case for clarification.

T2            T1          # initial BSTs
8             3
 \             \
  9             4
   \             \
    10            5
     \             \
      11            6
                     \
                      7

m = 4 & n = 5 (initial values).
Moving the 8 from T2 to T1:

T2            T1
9             3
 \             \
  10            4
   \             \
    11            5
                   \
                    6
                     \
                      7
                       \
                        8

Steps taken to insert 8 into its right place = 5 or (n)
Total steps taken = 5 or (n)
Moving the 9 from T2 to T1:

T2            T1
10            3
 \             \
  11            4
                 \
                  5
                   \
                    6
                     \
                      7
                       \
                        8
                         \
                          9

Steps taken to insert 9 into its right place = 6 or (n+1)
Total steps taken = 5+6 or (n) + (n+1)
Moving the 10 from T2 to T1:

T2            T1
11            3
               \
                4
                 \
                  5
                   \
                    6
                     \
                      7
                       \
                        8
                         \
                          9
                           \
                            10

Steps taken to insert 10 into its right place = 7 or (n+2)
Total steps taken = 5+6+7 or (n) + (n+1) + (n+2)
Moving the 11 from T2 to T1:

T2            T1
NULL          3
               \
                4
                 \
                  5
                   \
                    6
                     \
                      7
                       \
                        8
                         \
                          9
                           \
                            10
                             \
                              11

Steps taken to insert 11 into its right place = 8 or (n+3)
Total steps taken = 5+6+7+8 or (n) + (n+1) + (n+2) + (n+3)
Therefore, by observation, in the worst case the number of steps taken to move m values from T2 to T1 is:
=> (n) + (n+1) + (n+2) + ... + (n + (m-1))
=> (n + n + n + ... m times) + (1 + 2 + 3 + ... + (m-1))
=> (n.m) + (m.(m-1))/2
Since n > m, the second term never dominates the first, so we can neglect it.
Hence, the worst-case time complexity is O(n.m).
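The degenerate case above can be reproduced with a small sketch (Python, not part of the original answer) that counts the nodes visited by each insertion into a plain, unbalanced BST:

```python
class Node:
    """Plain, unbalanced BST node."""
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """Insert `value` into the BST; return (root, steps), where `steps`
    counts the nodes visited on the way down."""
    if root is None:
        return Node(value), 0
    node, steps = root, 0
    while True:
        steps += 1
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
                return root, steps
            node = node.left
        else:
            if node.right is None:
                node.right = Node(value)
                return root, steps
            node = node.right

def build(values):
    root = None
    for v in values:
        root, _ = insert(root, v)
    return root

# Worst case from the example: both trees are right-leaning chains.
t1 = build([3, 4, 5, 6, 7])        # n = 5
total = 0
for v in [8, 9, 10, 11]:           # T2's values, m = 4, in ascending order
    t1, steps = insert(t1, v)
    total += steps
print(total)                        # 5 + 6 + 7 + 8 = 26 = (n)+(n+1)+(n+2)+(n+3)
```

If T1 were kept balanced (for example, an AVL tree), each insertion would instead cost O(log(n)), giving the O(m.log(n)) bound from the other posts.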

Related

Is there such a BST that has optimal height but does not satisfy the AVL condition?

I'm curious whether it is possible to construct a binary search tree in such a way that it has minimal height for its n elements but it is not an AVL tree.
Or in other words, is every binary search tree with minimal height by definition also an AVL tree?
The AVL requirement is that the left and right subtree heights differ by at most 1.
An optimal BST of N elements, where D = log2(N), has the property that the sum of depths is minimal. The effect is that every element resides at depth at most floor(D).
To have a minimal sum of depths, the tree must be filled as fully as possible from the top down, so that the sum of the individual depths is minimal.
Not an optimal BST - and not AVL:

      f
     / \
    a   q
       / \
      n   x
     / \   \
    j   p   y

Elements: 8
Depths: 0 + 1 + 1 + 2 + 2 + 3 + 3 + 3 = 15
Optimal BST - and AVL:

       f
     /   \
    j     q
   / \   / \
  a   n p   x
             \
              y

Elements: 8
Depths: 0 + 1 + 1 + 2 + 2 + 2 + 2 + 3 = 13
So there is no non-AVL optimal BST.
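The two example trees can be checked mechanically. Below is a small sketch (Python, not from the original answer; the trees above are hard-coded) that computes the depth sum and tests the AVL condition:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def depth_sum(node, depth=0):
    """Sum of the depths of all nodes in the tree."""
    if node is None:
        return 0
    return depth + depth_sum(node.left, depth + 1) + depth_sum(node.right, depth + 1)

def height(node):
    return -1 if node is None else 1 + max(height(node.left), height(node.right))

def is_avl(node):
    """Check the AVL balance condition at every node."""
    if node is None:
        return True
    return (abs(height(node.left) - height(node.right)) <= 1
            and is_avl(node.left) and is_avl(node.right))

# First tree from the answer: not optimal, not AVL.
t_bad = Node('f',
             Node('a'),
             Node('q',
                  Node('n', Node('j'), Node('p')),
                  Node('x', right=Node('y'))))

# Second tree: optimal and AVL.
t_good = Node('f',
              Node('j', Node('a'), Node('n')),
              Node('q', Node('p'), Node('x', right=Node('y'))))

print(depth_sum(t_bad), is_avl(t_bad))    # 15 False
print(depth_sum(t_good), is_avl(t_good))  # 13 True
```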

Space complexity of breadth first search of binary tree?

What would be the space complexity of breadth first search on a binary tree? Since it would only store one level at a time, I don't think it would be O(n).
The space complexity is in fact O(n), as witnessed by a perfect binary tree. Consider an example of depth four:
                ______________14_____________
               /                             \
        _____24______                   ______8______
       /             \                 /             \
   __27__          __11__          __23__          __22__
  /      \        /      \        /      \        /      \
  4       5       13      2       17      12      26      25
 / \     / \     / \     / \     / \     / \     / \     / \
29  0   9   6   16  19  20  1   10  7   21  15  18  30  28  3
Note that the number of nodes at each depth is given by
depth   num_nodes
    0           1
    1           2
    2           4
    3           8
    4          16
In general, there are 2^d nodes at depth d. The total number of nodes in a perfect binary tree of depth d is n = 1 + 2^1 + 2^2 + ... + 2^d = 2^(d+1) - 1. As d goes to infinity, 2^d/n goes to 1/2. So, roughly half of all nodes occur at the deepest level. Since n/2 = O(n), the space complexity is linear in the number of nodes.
The illustration credit goes to the binarytree package.
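To see the n/2 bound concretely, here is a small sketch (Python, not from the original answer) that measures the peak queue size of a BFS over perfect trees of growing depth:

```python
from collections import deque

class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def perfect_tree(depth):
    """Build a perfect binary tree of the given depth."""
    if depth < 0:
        return None
    return Node(perfect_tree(depth - 1), perfect_tree(depth - 1))

def bfs_peak_queue(root):
    """Run a plain BFS and record the largest queue size observed."""
    queue = deque([root])
    peak = 1
    while queue:
        node = queue.popleft()
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)
        peak = max(peak, len(queue))
    return peak

for d in range(1, 8):
    n = 2 ** (d + 1) - 1                           # nodes in a perfect tree of depth d
    print(d, n, bfs_peak_queue(perfect_tree(d)))   # peak is 2**d, about n/2
```

The peak occurs just after the last node of the second-deepest level is dequeued, at which point all 2^d leaves sit in the queue at once.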

space complexity of merge sort using array

This algorithm is mergesort. I know it may look weird to you, but my main focus is on calculating the space complexity of this algorithm.
If we look at the recursion tree of the mergesort function and try to trace the algorithm, the stack depth will be log(n). But since the merge function inside mergesort creates two arrays of size n/2 and n/2, should I first find the space complexity of the recurrence relation and then add n/2 + n/2 to it, giving O(log(n) + n)?
I know the answer, but I am confused about the process. Can anyone tell me the correct procedure?
This confusion is due to the merge function, which is not recursive but is called inside a recursive function.
And why do we say that the space complexity will be O(log(n) + n), when by the definition of recursive-function space complexity we usually calculate the height of the recursion tree?
Merge(Leftarray, Rightarray, Array) {
    nL <- length(Leftarray)
    nR <- length(Rightarray)
    i <- j <- k <- 0
    while (i < nL && j < nR) {
        if (Leftarray[i] <= Rightarray[j])
            Array[k++] <- Leftarray[i++]
        else
            Array[k++] <- Rightarray[j++]
    }
    while (i < nL) {
        Array[k++] <- Leftarray[i++]
    }
    while (j < nR) {
        Array[k++] <- Rightarray[j++]
    }
}

Mergesort(Array) {
    n <- length(Array)
    if (n < 2)
        return
    mid <- n / 2
    Leftarray <- array of size (mid)
    Rightarray <- array of size (n-mid)
    for i <- 0 to mid-1
        Leftarray[i] <- Array[i]
    for i <- mid to n-1
        Rightarray[i-mid] <- Array[i]
    Mergesort(Leftarray)
    Mergesort(Rightarray)
    Merge(Leftarray, Rightarray)
}
Merge sort's time complexity is O(n log n), which is well known. Merge sort's space complexity will always be O(n), including with arrays. If you draw the space tree out, it will seem as though the space complexity is O(n log n). However, because the code executes depth-first, you only ever expand along one branch of the tree at a time, so the total space usage is always bounded by O(3n) = O(n).
For example, if you draw the space tree out, it seems like it is O(n log n):
                16                         | 16
              /    \
             /      \
            /        \
           8          8                    | 16
          / \        / \
         /   \      /   \
        4     4    4     4                 | 16
       / \   / \  / \   / \
      2   2 2   2 ................         | 16
     / \  ........................
    1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1        | 16
where the height of the tree is O(log n) => space complexity is O(n log n + n) = O(n log n). However, this is not the case for the actual code, since it does not execute in parallel. For example, this is how the code for mergesort executes when N = 16:
        16
       /
      8
     /
    4
   /
  2
 / \
1   1
Notice how the amount of space used is 32 = 2n = 2*16 < 3n.
Then it merges upwards:
      16
     /
    8
   /
  4
 / \
2   2
   / \
  1   1

which is 34 < 3n. Then it merges upwards:
      16
     /
    8
   / \
  4   4
     /
    2
   / \
  1   1

36 < 16 * 3 = 48
Then it merges upwards:
     16
    /  \
   8    8
       / \
      4   4
         / \
        2   2
           / \
          1   1

16 + 16 + 14 = 46 < 3*n = 48
In a larger case, n = 64:

      64
     /  \
   32    32
        /  \
      16    16
           /  \
          8    8
              / \
             4   4
                / \
               2   2
                  / \
                 1   1

which is 64 + 64 + 62 = 190 < 3*n = 3*64 = 192
You can prove this by induction for the general case.
Therefore, the space complexity is always bounded by O(3n) = O(n), even if you implement it with arrays, as long as you clean up used space after merging and execute the code sequentially rather than in parallel.
This implementation of MergeSort is quite inefficient in memory space and has some bugs:
the memory is not freed; I assume you rely on garbage collection.
the target array Array is not passed to Merge by MergeSort.
Extra space equal to the size of Array is allocated by MergeSort at each recursion level, so at least twice the size of the initial array (2*N) is required if the garbage collection is optimal (for example, if it uses reference counts), and up to N*log2(N) space is used if the garbage collector is lazy. This is much more than necessary, as a careful implementation can use as little as N/2 extra space.
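A rough way to see the O(n) bound empirically: the sketch below (Python, written for this discussion, not the questioner's code) instruments a top-down merge sort with a counter of live temporary elements and records the peak, which stays below 2n:

```python
live = 0   # elements currently held in temporary arrays
peak = 0   # maximum of `live` observed at any point

def mergesort(a):
    """Top-down merge sort that copies halves into temporaries,
    instrumented to track the peak amount of extra space alive."""
    global live, peak
    n = len(a)
    if n < 2:
        return
    mid = n // 2
    left, right = a[:mid], a[mid:]   # the O(n) temporaries at this level
    live += n
    peak = max(peak, live)
    mergesort(left)
    mergesort(right)
    # merge the two sorted halves back into `a`
    i = j = k = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]; i += 1
        else:
            a[k] = right[j]; j += 1
        k += 1
    a[k:] = left[i:] or right[j:]    # append whichever half still has elements
    live -= n                        # temporaries are dead after the merge

data = list(range(16, 0, -1))
mergesort(data)
print(data == sorted(data), peak)    # True 30  (peak = 16+8+4+2 < 2n = 32)
```

The peak follows the leftmost recursion chain (n + n/2 + n/4 + ... < 2n), matching the answer's point that the depth-first execution, not the full recursion tree, determines the space bound.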

A statement from heapsort algorithm in clrs book

Please explain the underlined statement in the picture. It's from Section 6.2 of CLRS. How is the subtree size at most 2n/3?
Remember that balance in binary trees is generally a good thing for time complexities! The worst-case time complexity occurs when the tree is as unbalanced as it can be. We're working with heaps here, and heaps are complete binary trees. A complete tree is most unbalanced when its bottommost level is half-full. This is shown below.
        -------*-------
       /               \
      *                 *
     / \               / \
    /   \             /   \
   /     \           /     \
  /-------\         /-------\
 /---------\   <-- last level is half-full
Suppose there are m nodes in the last level. Then there must be m - 1 further nodes in the left subtree.

        -------*-------
       /               \
      *                 *
     / \               / \
    /   \             /   \
   / m-1 \           /     \
  /-------\         /-------\
 /--- m ---\
Why? Well, in general, a tree with m leaf nodes must have m - 1 internal nodes. Imagine that these m leaf nodes represent players in a tournament: if one player is eliminated per game, there must be m - 1 games to determine the winner. Each game corresponds to an internal node. Hence there are m - 1 internal nodes.
Because the tree is complete, the right subtree must also have m - 1 nodes.
        -------*-------
       /               \
      *                 *
     / \               / \
    /   \             /   \
   / m-1 \           / m-1 \
  /-------\         /-------\
 /--- m ---\
Hence we have the total number of nodes (including the root):

n = 1 + [(m - 1) + m] + (m - 1)
  = 3m - 1

Let x = the number of nodes in the left subtree. Then:

x = (m - 1) + m
  = 2m - 1

We can solve these simultaneous equations, eliminating the variable m:

2n - 3x = 1
      x = (2n - 1) / 3

Hence x is less than 2n/3. This explains the original statement:
The children's subtrees each have size at most 2n/3 – the worst case occurs when the bottom level of the tree is exactly half full
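Plugging in the formulas above, a tiny sketch (Python, not from the original answer) confirms that the left subtree's share of the nodes approaches 2/3 as the half-full level grows:

```python
def worst_case_sizes(m):
    """Node counts for a complete tree whose last level is exactly half
    full: all m last-level nodes hang under the left child of the root."""
    left = (m - 1) + m      # m-1 internal nodes plus the m last-level nodes
    right = m - 1           # the right subtree is one level shorter
    n = 1 + left + right    # total, including the root
    return n, left

for m in (2, 8, 64, 2 ** 20):
    n, x = worst_case_sizes(m)
    assert 3 * x == 2 * n - 1          # x = (2n - 1) / 3 exactly
    print(n, x, x / n)                 # the ratio approaches 2/3 from below
```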

Converting a tree to Binary index tree

I am reading about the BIT (Binary Indexed Tree), which is useful when we have to do the following tasks on an array:
change the value of the element at position i
compute the cumulative sum up to element i
As the time complexity is O(log n) in the second case, I am wondering: what if the above tasks are to be performed on a simple tree? How would I convert a tree to a Binary Indexed Tree?
For example:

          1
        /   \
       2     8
      / \   / \
     4   5 6   7
    / \
  10   9
How can I convert this to a BIT so that I can perform the above operations in O(log n) time, the same as in the array case?
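For reference, the two array operations listed above are exactly what a standard Fenwick tree provides. A minimal sketch (Python, textbook implementation, not from the original question) of those operations:

```python
class BIT:
    """Fenwick (Binary Indexed) tree over n elements, 1-indexed."""
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def update(self, i, delta):
        """Add `delta` to position i in O(log n)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & (-i)       # jump to the next node responsible for i

    def prefix_sum(self, i):
        """Sum of positions 1..i in O(log n)."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & (-i)       # strip the lowest set bit
        return total

bit = BIT(8)
for pos, val in enumerate([1, 2, 8, 4, 5, 6, 7, 10], start=1):
    bit.update(pos, val)
print(bit.prefix_sum(4))    # 1 + 2 + 8 + 4 = 15
bit.update(3, -3)           # change the value at position 3 from 8 to 5
print(bit.prefix_sum(4))    # 1 + 2 + 5 + 4 = 12
```

Applying this to a tree would additionally require fixing an ordering of the tree's nodes (for example, a DFS numbering) so that the quantities of interest become prefix sums over an array; that mapping depends on which tree queries are needed.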
