Converting a tree to a Binary Indexed Tree - algorithm

I am reading about the BIT (Binary Indexed Tree), which is useful when we have to perform the following tasks on an array:
Change the value of the element at position i
Compute the cumulative sum up to the i-th element
Since the time complexity is O(log n) in the second case, I am wondering: what if the above tasks have to be performed on a simple tree? How would I convert a tree to a Binary Indexed Tree?
For Example
1
/ \
2 8
/ / \ \
4 5 6 7
/ \
10 9
How can I convert this to a BIT so that I can perform the above operations in O(log n) time, the same way I do in the array case?
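For reference, the array version of the two operations mentioned above can be sketched as follows. This is a minimal illustration of a Binary Indexed Tree over an array, not an answer to the tree-conversion question; the class name FenwickTree and the sample values are assumptions made for the example. Setting position i to a new value can be done by adding the difference between the new and the old value.

```python
class FenwickTree:
    """Binary Indexed Tree over positions 1..n (1-based)."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # slot 0 is unused

    def add(self, i, delta):
        """Add delta to the element at position i in O(log n)."""
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # jump to the next node whose range covers i

    def prefix_sum(self, i):
        """Return the sum of the elements at positions 1..i in O(log n)."""
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # drop the lowest set bit
        return total


# Hypothetical sample data, loaded one position at a time.
bit = FenwickTree(8)
for pos, value in enumerate([3, 1, 4, 1, 5, 9, 2, 6], start=1):
    bit.add(pos, value)
```

Both loops touch at most one node per bit of i, which is where the O(log n) bound comes from.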

Related

Data Structures and Algorithms in C++ 2nd Ed - Goodrich. Page 295 question on vector-based structure binary tree worst case for space 2^n - 1

Let me explain as best as I can. This is about a binary tree implemented using a vector.
According to author, the implementation is as follows:
A simple structure for representing a binary tree T is based on a way of numbering
the nodes of T. For every node v of T, let f(v) be the integer defined as follows:
• If v is the root of T, then f(v) = 1
• If v is the left child of node u, then f(v) = 2 f(u)
• If v is the right child of node u, then f(v) = 2 f(u)+ 1
The numbering function f is known as a level numbering of the nodes in a binary
tree T, because it numbers the nodes on each level of T in increasing order from
left to right, although it may skip some numbers (see figures below).
Let n be the number of nodes of T, and let fM be the maximum value of f(v)
over all the nodes of T. The vector S has size N = fM + 1, since the element of S at
index 0 is not associated with any node of T. Also, S will have, in general, a number
of empty elements that do not refer to existing nodes of T. For a tree of height h,
N = O(2^h). In the worst case, this can be as high as 2^n − 1.
Question:
The last statement, that the worst case is 2^n - 1, does not seem right. Here n is the number of nodes. I think he meant 2^h - 1 instead of 2^n - 1. Using figure a) as an example, 2^n - 1 would be 2^15 - 1 = 32768 - 1 = 32767. That does not make sense.
Any insight is appreciated.
Thanks.
The worst case is when the tree degenerates to a chain from the root, where each node has two children, but at least one of them is always a leaf. When this chain has n nodes, the height of the tree is about n/2. The vector must span all the levels and allocate room for full levels, even though this degenerate tree has at most two nodes per level. The size S of the vector will still be O(2^h), but since in this degenerate case h is O(n/2) = O(n), this makes it O(2^n) in the worst case.
The formula 2^n - 1 seems to suggest the author does not have a proper binary tree in mind, in which case the above reasoning should be done with a degenerate tree that consists of a single chain where every node has at most one child.
Example of worst case
Here is an example tree (not a proper tree, but the principle for proper trees is similar):
    1
   /
  2
   \
    5
     \
      11
So n = 4, and h = 3.
The vector however needs to store all the slots where nodes could have been, so something like this:
                _____ 1 _____
               /             \
           __2__             __ __
          /     \           /     \
        _        _5_       _       _
       / \      /   \     / \     / \
      _   _    _     11  _   _   _   _
...so the vector has a size of 1+2+4+8 = 15. (Even 16 when we account for the unused slot 0 in the vector)
This illustrates that the size S of the vector is always O(2^h). In this worst case (worst with respect to n, not with respect to h), S is O(2^n).
Example n=6
When n=6, we could have this as a best case:
      1
    /   \
   2     3
  / \     \
 4   5     7
This tree can be represented by a vector of size 8, where the entries at index 0 and index 6 are filled with nulls (unused).
However, for n=6 we could have a worst case ("worst" for the impact on the vector size) when the tree is very unbalanced:
 1
  \
   2
    \
     3
      \
       4
        \
         5
          \
           7
Now the tree's height is 5 instead of 2, and the vector needs to put that node 7 in the slot at index 63... S is 64. Remember that the vector spans each complete binary level, which doubles in size at each next level.
So when n is 6, S can be 8, 16, 32, or 64. It depends on the shape of the tree. In each case we have that S=O(2^h). But when we express S in terms of n, then there is variation, and the best case is that S=O(n), while the worst case is S=O(2^n).
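The two n=6 cases can be checked with a short sketch of the level numbering f(v) quoted from the book. The tuple representation of a tree (None for a missing child, a (left, right) pair for a node) and the function names are assumptions made for this illustration.

```python
def level_numbers(tree, f=1):
    """Return f(v) for every node; tree is None or a (left, right) pair."""
    if tree is None:
        return []
    left, right = tree
    # Left child gets 2*f(u), right child gets 2*f(u) + 1.
    return [f] + level_numbers(left, 2 * f) + level_numbers(right, 2 * f + 1)


def vector_size(tree):
    """N = fM + 1, because index 0 of the vector is never used."""
    return max(level_numbers(tree)) + 1


leaf = (None, None)

# Best case for n=6: node 2 has two children, node 3 only a right child.
balanced = ((leaf, leaf), (None, leaf))

# Worst case for n=6: a chain of right children only.
chain = None
for _ in range(6):
    chain = (None, chain)

print(sorted(level_numbers(balanced)), vector_size(balanced))
print(sorted(level_numbers(chain)), vector_size(chain))
```

The chain puts its deepest node at index 63, so the vector needs 64 slots for 6 nodes, while the balanced shape needs only 8.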

Building an AVL Tree out of Binary Search Tree

I need to suggest an algorithm that takes a BST (Binary Search Tree) T1 that has 2^(n + 1) - 1 keys, and builds an AVL tree with the same keys. The algorithm should be efficient in terms of worst and average time complexity (as a function of n).
I'm not sure how I should approach this. It is clear that the minimal height of a BST that has 2^(n + 1) - 1 keys is n (and that will be the case if it is full / balanced), but how does that help me?
There is the straightforward method, which is to iterate over the tree, each time adding the root of T1 to the AVL tree and then removing it from T1:
Since T1 may not be balanced, the delete may cost O(n) in the worst case
Insert to the AVL tree will cost O(log n)
There are 2^(n + 1) - 1 keys
So in total that will cost O(n*logn*2^n), and that is ridiculously expensive.
But why should I remove from T1? I'm paying a lot there and for no good reason.
So I figured, why not use a tree traversal over T1 and, for each node I visit, add it to the AVL tree:
There are 2^(n + 1) - 1 nodes, so the traversal will cost O(2^n) (visiting each node once)
Adding the current node to the AVL tree will cost O(logn) each time
So in total that will cost O(logn * 2^n).
That is the best time complexity I could think of. The question is, can it be done faster, like in O(2^n)?
Is there some way to make the insert to the AVL tree cost only O(1)?
I hope I was clear and that my question belongs here.
Thank you very much,
Noam
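A BST can be rebalanced in time linear in the number of keys. The best-known in-place method is the Day-Stout-Warren (DSW) algorithm, which works with rotations. A simpler approach with the same time bound, and the one illustrated here (the calls below are labelled DSW for short), goes through a sorted array.
Basically all it does is convert the BST into a sorted array or linked list by doing an in-order traversal (O(m) for m keys). Then, it recursively takes the middle element of the array, makes it the root, and makes its children the middle elements of the left and right subarrays respectively (O(m)). With m = 2^(n + 1) - 1 keys as in the question, that is O(2^n) in total. Here's an example,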
UNBALANCED BST
    5
   / \
  3   8
     / \
    7   9
   /     \
  6       10
SORTED ARRAY
|3|5|6|7|8|9|10|
Now here are the recursive calls and resulting tree,
DSW(initial array)
7
7.left = DSW(left array) //|3|5|6|
7.right = DSW(right array) //|8|9|10|
  7
 / \
5   9
5.left = DSW(|3|)
5.right = DSW(|6|)
9.left = DSW(|8|)
9.right = DSW(|10|)
     7
   /   \
  5     9
 / \   / \
3   6 8   10
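The two phases above (in-order traversal to a sorted array, then recursive middle-element construction) can be sketched as follows. The Node class and the function names are assumptions made for this illustration. The result is perfectly balanced, so it satisfies the AVL height condition; an AVL implementation that stores balance factors or heights in its nodes would still need those fields filled in.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def in_order(node, out):
    """Append the keys of the subtree to out in sorted order; O(m) for m keys."""
    if node is not None:
        in_order(node.left, out)
        out.append(node.key)
        in_order(node.right, out)
    return out


def build_balanced(keys, lo=0, hi=None):
    """Build a balanced BST from keys[lo:hi]; each key is visited once."""
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2  # the middle element becomes the root
    return Node(keys[mid],
                build_balanced(keys, lo, mid),
                build_balanced(keys, mid + 1, hi))


def height(node):
    """Height in edges; an empty tree has height -1."""
    return -1 if node is None else 1 + max(height(node.left), height(node.right))


# The unbalanced BST from the example above.
unbalanced = Node(5, Node(3), Node(8, Node(7, Node(6)), Node(9, None, Node(10))))
keys = in_order(unbalanced, [])
balanced = build_balanced(keys)
print(keys, height(unbalanced), height(balanced))
```

Passing index bounds instead of slicing the array keeps the construction linear. For a very deep input tree the recursive traversal may need to be replaced by an iterative one.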

Searching in a balanced binary search tree

I was reading about balanced binary search tree. I found this statement about searching in such tree:
It is not true that when you are looking for something in a balanced binary search tree with n elements, it can in the worst case need n/2 comparisons.
Why is it not true?
Isn't it that we look at either the right side or the left side of the tree, so the comparisons should be n/2?
The worst case of a search in a balanced binary search tree is governed by its height. It is O(height), where the height is about log2(n) since the tree is balanced.
In the worst case, the node that we are looking for resides in a leaf or doesn't exist at all, and hence we need to traverse the tree from the root to one of its leaves, which is O(lg n) and not O(n/2).
Consider the following balanced binary tree for n=7 (this is in fact a complete binary search tree, but lets leave that out of this discussion, as a complete binary search tree is also a balanced binary search tree).
          4              depth 1 (root)
      /---+---\
     2         6         depth 2
   /-+-\     /-+-\
  1     3   5     7      depth 3
For searching of any number in this tree, the worst case scenario is that we reach the maximum depth of the tree (e.g., 3 in this case), until we terminate the search. At depth 3, we have performed 3 comparisons, hence, at arbitrary depth l, we would have performed l comparisons.
Now, a complete binary search tree like the one above, of arbitrary size, can hold 2^maxDepth - 1 different numbers (7 = 2^3 - 1 in the example). Now, let's say we have a complete binary search tree with exactly n (distinct) numbers. Then the following holds
n = 2^maxDepth - 1 (+)
Hence
(+) <=> 2^maxDepth = n + 1
<=> log2(2^maxDepth) = log2(n + 1)
<=> maxDepth = log2(n + 1)
Recall from above that maxDepth told us the worst case number of comparisons needed to find a number (or its non-existence) in our complete binary tree. Hence
worst case scenario, n nodes : log2(n + 1)
For studying asymptotic or limiting behaviour of this search, n can be considered sufficiently large, and hence log2(n) ~= log2(n + 1) holds, and subsequently, we can say that a quite good (tight) upper bound for the algorithm is O(log2(n)). Hence
The time complexity for searching in a complete binary tree,
for n nodes, is O(log2(n))
For a non-complete binary search tree, an analogous reasoning as the one above leads to the same time complexity. Note that for a non-balanced search tree the worst case scenario for n nodes is n comparisons.
Answer: From the above, it's clear that O(n/2) is not a proper bound for the time complexity of searching a balanced binary search tree of size n, whereas O(log2(n)) is. (Note that the former might be a correct bound for sufficiently large n, but not a very good/tight one!)
Imagine the tree with 10 nodes: 1,2,3,4,5..10.
If you are looking for 5, how many comparisons would it take? How about if you look for 10?
It's actually never N/2.
The worst case scenario is that the element you are searching for is a leaf (or isn't contained in a tree), and the number of comparisons then is equal to tree height which is log2(n).
The best balanced binary tree is the AVL tree. I say "the best" on the condition that its modifying operations are O(log(n)). If the tree is perfectly balanced, then its height is smaller still (but no way is known to modify it in O(log(n))).
It could be shown that the maximum height of an AVL tree is less than
1.4404 log(n+2) - 0.3277
Consequently the worst case for a search in an AVL tree is an unsuccessful search whose path from the root ends in the deepest node. But by the previous result, this path cannot be longer than 1.4404 log(n+2) - 0.3277.
And since 1.4404 log(n+2) - 0.3277 < n/2, the statement is false (assuming n is large enough).
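A quick numeric check of that inequality, assuming the logarithm in the bound is base 2 (the answer above does not state the base, so this is an assumption of the sketch, as is the function name):

```python
import math


def avl_height_bound(n):
    """Upper bound on the AVL height quoted above; log base 2 is assumed."""
    return 1.4404 * math.log2(n + 2) - 0.3277


# Compare the bound with n/2 for a few tree sizes.
for n in (10, 100, 1000, 10**6):
    print(n, round(avl_height_bound(n), 2), n / 2)
```

Under that assumption the bound is already below n/2 at n = 10, and the gap widens quickly because the bound grows logarithmically while n/2 grows linearly.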
Let's first look at the BST (binary search tree) properties, which state that:
-- the root must be > its left child
-- the root must be < its right child
        10
      /    \
     8      12
    / \    /  \
   5   9  11   15
  / \         /  \
 1   7      14    25
The height of the given tree is 3 (the number of edges on the longest path, 10-14).
Suppose you search for 14 in the given balanced BST:
node -- 10      14 > 10: go to the right subtree, because all nodes
                in the right subtree are > 10
        10
      /    \
     8      12
    / \    /  \
   5   9  11   15       n = 11 nodes in total
  / \         /  \
 1   7      14    25
node -- 12      14 > 12: again go to the right subtree
     12
    /  \
  11    15      n = 5
       /  \
     14    25
node -- 15      14 < 15: this time the node value is > the required value, so
                go to the left subtree
    15
   /  \
 14    25       n = 3
node -- 14      14 == 14: value found
 14             n = 1
From the above example we can see that at every comparison the size of the problem (the number of nodes) is roughly halved. We can also say that at every comparison we move to the next level, so the depth of the current node increases by 1.
As the maximum height of a balanced BST is log(N), in the worst case we need to go down to a leaf of the tree, which takes about log(N) steps.
Hence the big-O of a balanced BST search is O(log(N)).
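The walk-through above can be reproduced with a small sketch that records the nodes compared during a search. The dict-based node layout and the function names are assumptions made for this illustration; the keys are inserted level by level so that plain BST insertion yields exactly the example tree.

```python
def insert(tree, key):
    """Plain BST insert; tree is None or a dict with 'key', 'left', 'right'."""
    if tree is None:
        return {"key": key, "left": None, "right": None}
    side = "left" if key < tree["key"] else "right"
    tree[side] = insert(tree[side], key)
    return tree


def search_path(tree, target):
    """Return (keys compared against, found flag) for a search from the root."""
    path = []
    while tree is not None:
        path.append(tree["key"])
        if target == tree["key"]:
            return path, True
        tree = tree["left"] if target < tree["key"] else tree["right"]
    return path, False


root = None
for key in [10, 8, 12, 5, 9, 11, 15, 1, 7, 14, 25]:
    root = insert(root, key)

print(search_path(root, 14))
print(search_path(root, 13))
```

Searching for 14 visits 10, 12, 15 and 14, one node per level, so the number of nodes visited is bounded by the height plus one rather than by n/2.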

Time complexity of a recursive algorithm which split into two per run mostly

Dynamic Programming | Set 33 (Find if a string is interleaved of two other strings)
http://www.geeksforgeeks.org/check-whether-a-given-string-is-an-interleaving-of-two-other-given-strings-set-2/
I found a question on this website, and it said "The worst case time complexity of recursive solution is O(2^n).". Therefore, I tried to draw a tree diagram of the worst case for this question. I assume that when the length and value of a and b are the same, it will lead to a worst case. It will split into 2 parts until the length of a or b is 0 (using substring).
                  aa,aa,aaaa
                 /          \
         a,aa,aaa            aa,a,aaa
         /      \            /      \
   -,aa,aa    a,a,aa      a,a,aa    aa,-,aa
      |       /    \      /    \       |
    -,a,a  -,a,a  a,-,a -,a,a  a,-,a  a,-,a
In this case, it has 13 nodes, which is really a worst case, but how can I calculate it step by step? If the length of c increases by 2, it will have 49 nodes. If it increases to a large number, it will be difficult to draw a tree diagram.
Can someone explain this in detail, please?
The recurrence for the running time, where n is the length of c, is
T(n) = 2T(n-1) + O(1)
If you draw the recursion tree you'll see that you have a binary tree with height = n.
Since it is a binary tree, it will have at most 2^n leaves, hence the worst case scenario is O(2^n).
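The node counts from the question (13 and 49) can be reproduced without drawing the tree. The sketch below counts the nodes of the recursion tree exactly as it is drawn in the question: it assumes that every call branches on both a and b whenever both are non-empty and that the recursion stops once a single character is left. The function name tree_nodes is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def tree_nodes(i, j):
    """Nodes in the recursion tree when i chars of a and j chars of b remain."""
    if i + j <= 1:
        return 1  # one character left: a leaf, as in the diagram
    total = 1  # the current call itself
    if i > 0:
        total += tree_nodes(i - 1, j)  # branch that consumes a char of a
    if j > 0:
        total += tree_nodes(i, j - 1)  # branch that consumes a char of b
    return total


for k in (2, 3, 4, 5):
    # Node count for |a| = |b| = k, next to the 2^n bound with n = 2k.
    print(k, tree_nodes(k, k), 2 ** (2 * k))
```

Each call has at most two children and the depth is bounded by the length of c, so the count can never exceed 2^n, which matches the recurrence above.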

insert N items into an empty binary search tree

Why is the worst-case big-O for inserting N items into an empty binary search tree n^2? There are no balance checks.
Each item is O(n), and there are n items. Even though the O(n) per item is an "increasing as it goes" n, you still get 0 + 1 + 2 + 3 ... (n-1) which is n(n-1)/2 = O(n^2).
In other words, suppose we're adding 10, 20, 30, 40:
Step 1: empty tree, insert 10:
    10
Step 2: compare 20 with 10; bigger, therefore tree becomes:
    10
      \
       20
Step 3: compare 30 with 10; bigger, so move down to node with 20.
compare 30 with 20; bigger, therefore tree becomes:
    10
      \
       20
         \
          30
Step 4: compare 40 with 10; bigger, so move down to node with 20.
compare 40 with 20; bigger, so move down to node with 30.
compare 40 with 30; bigger, therefore tree becomes:
    10
      \
       20
         \
          30
            \
             40
Notice how we get one more comparison each time - so the first element takes 0 comparisons, the second takes 1, the third takes 2 etc - summing to n(n-1)/2.
Of course, this is only the case if you insert in sort order (either small to big or big to small). Inserting in an order which happens to balance the tree will be significantly cheaper.
In the worst case, your BST is a list, and inserting N items at the end of an initially empty list this way is O(n^2).
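The comparison counts from the steps above can be checked with a short sketch. The list-based node layout [key, left, right] and the function name are assumptions made for this illustration.

```python
def count_insert_comparisons(keys):
    """Insert keys into an unbalanced BST and return the total comparisons."""
    root = None  # each node is a list [key, left, right]
    total = 0
    for key in keys:
        if root is None:
            root = [key, None, None]  # first key: no comparison needed
            continue
        node = root
        while True:
            total += 1  # one comparison against the current node
            side = 1 if key < node[0] else 2
            if node[side] is None:
                node[side] = [key, None, None]
                break
            node = node[side]
    return total


print(count_insert_comparisons([10, 20, 30, 40]))  # sorted input
print(count_insert_comparisons([20, 10, 30, 40]))  # an order that balances better
n = 100
print(count_insert_comparisons(range(n)), n * (n - 1) // 2)
```

For sorted input the total equals n(n-1)/2, while an insertion order that keeps the tree shallow needs fewer comparisons.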
