My professor showed the following problem in class and mentioned that the answer is O(1), while mine was quite different. I hope to get some help understanding what mistakes I made.
Question:
Calculate the amortized time complexity of the method F in an AVL tree, when we start from the minimal node and each time call F on the last node found.
Description of F: starting at the current node, F continues just like an in-order traversal until it reaches the next node of the in-order traversal, which becomes the starting point of the next call.
What I did:
First I took an arbitrary series of m calls to F.
I said that the first call needs O(log n) to find the minimal node; then for the next node we do the in-order walk again, but it continues one more step, so O(log n) + 1, and so on until I have scanned m elements.
Which gets me to:
T(m) = log n + (log n + 1) + (log n + 2) + ... + (log n + m - 1) = m log n + m(m - 1)/2
To calculate the amortized time we compute T(m)/m, so I get:
T(m)/m = log n + (m - 1)/2
Which isn't O(1) for sure.
The algorithm doesn't start by searching for any node, but instead is already passed a node and will start from that node. E.g. pseudocode for F would look like this:
F(n):
    if n has right child
        n = right child of n
        while n has left child
            n = left child of n
        return n
    else
        prev = n
        cur = parent of n
        while prev is right child of cur and cur is not root
            prev = cur
            cur = parent of prev
        if cur is root and prev is right child of cur
            error "Reached end of traversal"
        else
            return cur
The above code performs one step of an in-order traversal: starting from the given node, it walks until the next node in in-order is reached.
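For reference, here is a compilable C++ version of the same logic (a sketch of my own; the Node layout with parent pointers is an assumption, and the end-of-traversal error is replaced by returning nullptr):

// Minimal node with parent links; the field names are assumed, not from the question.
struct Node {
    Node *parent = nullptr, *left = nullptr, *right = nullptr;
    int key = 0;
};

// In-order successor of n; returns nullptr instead of raising the
// "Reached end of traversal" error from the pseudocode above.
Node* F(Node* n) {
    if (n->right) {                  // successor is the leftmost node of the right subtree
        n = n->right;
        while (n->left)
            n = n->left;
        return n;
    }
    while (n->parent && n->parent->right == n) // climb while we are a right child
        n = n->parent;
    return n->parent;                // nullptr means n was the maximum of the tree
}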
Amortized runtime:
Pick an arbitrary tree and m. Let r_0 be the lowest common ancestor of all nodes visited by F. Now define r_(n+1) as the lowest common ancestor of all nodes in the right subtree of r_n that will be returned by F. This recursion bottoms out at some r_u, which is the m-th node in the in-order traversal. Every r_n will be returned by F in some iteration, so all nodes in the left subtree of r_n will be returned by F as well.
All nodes that are visited by F are either also returned by F or lie on the path from r_0 to r_u. Since r_0 is an ancestor of r_1, r_1 is an ancestor of r_2, etc., the path from r_0 to r_u can be at most as long as the tree is high. The height of the tree is bounded by log_phi(m + 2), so in total at most
m + log_phi(m + 2)
nodes will be visited during m iterations of F. All nodes visited by F form a subtree, so there are at most 2 * (m + log_phi(m + 2)) edges that will be traversed by the algorithm, leading to an amortized runtime complexity of
2 * (m + log_phi(m + 2)) / m = 2 + 2 * log_phi(m + 2) / m = O(1)
(The above bounds are actually considerably tighter, but they are completely sufficient for the calculation presented here.)
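To make the bound concrete (my own arithmetic, not part of the original argument): for m = 1000 calls the bound evaluates to 2 + 2 * log_phi(1002) / 1000 ≈ 2 + 2 * 14.4 / 1000 ≈ 2.03, i.e. about two edge traversals per call on average.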
Related
A weight-balanced tree is a binary tree in which for each node, the number of nodes in the left sub tree is at least half and at most twice the number of nodes in the right sub tree. The maximum possible height (number of nodes on the path from the root to the furthest leaf) of such a tree on n nodes is best described by which of the following?
A. log2(n)
B. log4/3(n)
C. log3(n)
D. log3/2(n)
My Try:
The number of nodes in the left sub tree is at least half and at most twice the number of nodes in the right sub tree.
There are n nodes in the tree; one node is the root, so (n-1) nodes remain. To get the maximum height of the tree we divide these (n-1) nodes into three parts, each of size (n-1)/3.
Now keep two parts in the LST and one part in the RST:
LST = 2*(n-1)/3, and
RST = (n-1)/3
Therefore T(n) = T(2(n-1)/3) + T((n-1)/3) + 1, and for the maximum height we only consider H(n) = H(2(n-1)/3) + 1
and H(1) = 0
I tried to solve the H(n) recurrence using substitution but I'm stuck at this point:
(2/3)^k (n - k) = 1. How do I solve for k? Please help.
You have it almost correct, but there is a mistake in the recursion statement.
It should be like this: Root + Left Sub Tree + Right Sub Tree.
If we take 1 node for the root, the remaining (n-1) nodes have to be distributed between the subtrees in such a way that we get the maximum height under the following condition.
Let the number of nodes in the LST be nl and the number of nodes in the RST be nr.
Equation/Condition: nr/2 <= nl <= 2*nr
From that we get nr = (n-1)/3.
Now we come to the point where you made the mistake:
T(n) = T((n-1)/3) + T(2(n-1)/3) + 1, where T(1) = 1 and T(0) = 0
Following the larger subtree for the height, we get log_{3/2}(n) in the end.
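As a quick sanity check of that result (a sketch I added, using integer division and following only the larger subtree; none of this is from the original answer):

#include <cmath>
#include <cstdio>

// Height recurrence for the most skewed weight-balanced tree:
// always follow the larger subtree, which holds 2(n-1)/3 of the nodes.
int H(long long n) {
    if (n <= 1) return 0;
    return H(2 * (n - 1) / 3) + 1;
}

int main() {
    const long long ns[] = {100, 10000, 1000000, 100000000};
    for (long long n : ns)
        std::printf("n = %9lld  H(n) = %2d  log_{3/2}(n) = %5.1f\n",
                    n, H(n), std::log((double)n) / std::log(1.5));
}

Both columns grow at the same logarithmic rate, matching option D.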
I am doing a problem on binary trees: find the rightmost node in the last level of a complete binary tree. Doing it in O(n) is simple by traversing all the elements, but is there a way to do this in any complexity less than O(n)? I have browsed through the internet a lot and couldn't find anything on this.
Thanks in advance.
Yes, you can do it in O(log(n)^2) by doing a variation of binary search.
This can be done by first going to the leftmost element(1), then to the 2nd leftmost element, then to the 4th, the 8th, ... until you find that there is no such element.
Let's say the last element you found was the i-th, and the first one you didn't find was the 2i-th.
Now you can simply do a binary search over that range.
This is O(log(n/2)) = O(logn) total iterations, and since each iteration goes down the entire tree, it's a total of O(log(n)^2) time.
(1) Here and in the following, the "x-th leftmost element" refers only to the nodes in the deepest level of the tree.
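Here is what that idea looks like in C++ (an illustrative sketch under assumptions: a Node with left/right pointers, and leaves of the deepest level indexed from 0, left to right):

struct Node { Node *left = nullptr, *right = nullptr; };

// Height of a complete tree = length of its leftmost path.
int height(Node* root) {
    int h = 0;
    for (Node* n = root; n; n = n->left) ++h;
    return h;
}

// Does the i-th leftmost leaf (0-based) exist at the deepest level?
// The bits of i, from most to least significant, encode the root-to-leaf path.
bool exists(Node* root, int h, long long i) {
    Node* n = root;
    for (int bit = h - 2; bit >= 0 && n; --bit)
        n = (i >> bit & 1) ? n->right : n->left;
    return n != nullptr;
}

// Rightmost node of the last level: gallop (1, 2, 4, ...) to bracket the
// first missing leaf, then binary search. O(log n) probes of O(log n) each.
Node* rightmostDeepest(Node* root) {
    int h = height(root);
    if (h <= 1) return root;
    long long full = 1LL << (h - 1);   // leaf count if the last level were full
    long long lo = 0, hi = 1;          // exists(lo) always holds
    while (hi < full && exists(root, h, hi)) { lo = hi; hi *= 2; }
    while (lo + 1 < hi) {              // invariant: exists(lo), not exists(hi)
        long long mid = (lo + hi) / 2;
        if (exists(root, h, mid)) lo = mid; else hi = mid;
    }
    Node* n = root;                    // descend to leaf number lo
    for (int bit = h - 2; bit >= 0; --bit)
        n = (lo >> bit & 1) ? n->right : n->left;
    return n;
}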
I assume that you know the number of nodes; let n be that number.
In a complete binary tree, level i has twice the number of nodes of level i - 1.
So you can iteratively divide n by 2. If there is a remainder, then the node is a right child; otherwise it is a left child. You store whether there was a remainder into a sequence, preferably a stack.
Something like:
std::stack<char> s; // #include <stack>
while (n > 1)
{
    if (n % 2 == 0)
        s.push('L');
    else
        s.push('R');
    n = n / 2; // n is an int, so the division floors
}
When the while loop finishes, the stack contains the path from the root to the rightmost node.
The while loop executes log_2(n) times.
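To then reach the node itself, pop the stack while walking down from the root (my own continuation of the snippet; the Node type and field names are assumed):

#include <stack>

struct Node { Node *left = nullptr, *right = nullptr; };

// The choices were pushed leaf-to-root, so popping yields them root-to-leaf.
Node* follow(Node* root, std::stack<char>& s) {
    Node* cur = root;
    while (!s.empty()) {
        cur = (s.top() == 'L') ? cur->left : cur->right;
        s.pop();
    }
    return cur; // node number n in level order, i.e. the rightmost node of the last level
}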
This is a recursive solution with O(lg n * lg n) time complexity and O(lg n) space complexity (considering the stack storage space).
The space complexity can be reduced to O(1) using an iterative version of the code below.
// helper function: length of the path following only left children
// (TreeNode is assumed to have left, right and val fields)
int getLeftHeight(TreeNode* node) {
    int c = 0;
    while (node) {
        c++;
        node = node->left;
    }
    return c;
}

int getRightMostElement(TreeNode* node) {
    int h = getLeftHeight(node);
    // base case: reached when the rightmost element, our answer, is found
    if (h == 1)
        return node->val;
    // answer lies in the right subtree
    else if ((h - 1) == getLeftHeight(node->right))
        return getRightMostElement(node->right);
    // answer lies in the left subtree
    else
        return getRightMostElement(node->left);
}
Time complexity derivation:
At each recursion step we consider either the left subtree or the right subtree, i.e. about n/2 elements, so there are at most lg n recursive calls, and each call computes a height in lg n time:
T(n) = T(n/2) + c·lg n
     = T(n/4) + c·lg n + c·(lg n - 1)
     = ...
     = T(1) + c·[lg n + (lg n - 1) + (lg n - 2) + ... + 1]
     = T(1) + c·lg n·(lg n + 1)/2
     = O(lg n · lg n)
Since it's a complete binary tree, going over all the right nodes until you reach the leaves will take O(logN), not O(N). In a regular binary tree it would take O(N), because in the worst case all the nodes are lined up to the right, but since it's a complete binary tree, that cannot happen.
How can we prove that the update and query operations on a segment tree (http://letuskode.blogspot.in/2013/01/segtrees.html) (not to be confused with an interval tree) are O(log n)?
I thought of a way which goes like this - At every node, we make at most two recursive calls on the left and right sub-trees. If we could prove that one of these calls terminates fairly quickly, the time complexity would be logarithmically bounded. But how do we prove this?
Lemma: at most 2 nodes are used at each level of the tree (a level is the set of nodes at a fixed distance from the root).
Proof: Assume that at some level h at least 3 nodes were used (call them L, M and R, from left to right). Then the entire interval from the left bound of L to the right bound of R lies inside the query range. Hence M is fully covered by a node (call it UP) at level h - 1 that also lies fully inside the query range. But this implies that M could not be visited at all, because the traversal would stop at UP or higher. Here are some pictures to clarify this step of the proof:
h - 1:     UP            UP            UP
           /\            /\            /\
h:      L  M  R       L  M  R       L  M  R
That's why at most two nodes at each level are used. There are only log N levels in a segment tree, so at most 2 * log N nodes are used in total.
The claim is that at most 2 nodes are expanded at each level. We will prove this by contradiction.
Suppose 3 nodes are expanded at some level of the tree: a leftmost, a middle and a rightmost node. Then the query range stretches from the leftmost expanded node to the rightmost one. But notice that if the range extends from the leftmost to the rightmost node, then the full range of the middle node is covered, so the middle node immediately returns its value and is not expanded. Thus we expand at most 2 nodes at each level, and since there are log n levels, the number of expanded nodes is 2·log n = Θ(log n).
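Concretely (an illustrative instance, not the original figure): suppose the nodes at some level cover [0..1], [2..3] and [4..5], and the query range is [0..5]. The parent covering [0..3] lies entirely inside the query range, so the recursion returns its stored value there and never descends into [2..3]; only the nodes at the two boundaries of the query range keep being expanded.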
If we prove that at most a constant number of nodes are visited on each level, then, knowing that a binary segment tree has height at most log N, we can say that the query operation is O(log N). Other answers tell you that at most 2 nodes are visited on each level, but I claim that at most 4 nodes are visited per level. You can find the same statement, without proof, in other sources like GeeksforGeeks.
Other answers show you a segment tree that is too small. Consider this example: a segment tree over 16 leaf nodes, indexes starting from zero, where you are looking for the range [0-14].
See the example (the crossed nodes are the ones we visit):
At each level L of the tree there are at most 2 nodes with a partial overlap of the query range. (If you are unable to prove why, please ask.)
So at level (L+1) we have to explore at most 4 nodes, and the total number of levels in the tree is O(log(N)) (where N is the number of nodes). Hence the time complexity is O(4*log(N)) ~ O(log(N)).
PS: Please refer to the diagram attached by @Oleksandr Papchenko to get a better understanding.
I will try to give a simple mathematical explanation. Look at the code below, a standard segment tree implementation of a range minimum query:
int query(int node, int st, int end, int l, int r)
{
    /* if the node's range lies inside the query range */
    if (l <= st && end <= r)
    {
        return tree[node];
    }
    /* if the node's range is totally outside the query range */
    if (st > r || end < l)
        return INT_MAX;
    /* the query range intersects both children */
    int mid = (st + end) / 2;
    int ans1 = query(2 * node, st, mid, l, r);
    int ans2 = query(2 * node + 1, mid + 1, end, l, r);
    return min(ans1, ans2);
}
You go left and right, and whenever a node's range lies inside the query range you return its value.
So at each level at most 2 nodes are expanded; call them leftMost and rightMost. If some other node in between (call it mid) were selected as well, then the whole range between leftMost and rightMost would be inside the query, so the ancestor covering mid would lie fully inside the query range and the recursion would already have returned at that ancestor.
Thus, for a segment tree with log N levels:
Search at each level = 2
Total search = (search at each level) * (number of levels) = 2 log N
Therefore the search complexity is O(2 log N) ~ O(log N).
P.S. For space complexity, see https://codeforces.com/blog/entry/49939
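For completeness, here is a minimal way to drive the query() above (the build routine, the array sizing and the sample data are my own assumptions, not from the original post):

#include <algorithm>
#include <climits>
#include <cstdio>
#include <vector>
using namespace std;

vector<int> a = {5, 2, 8, 1, 9, 3};      // input array
vector<int> tree(4 * a.size(), INT_MAX); // min segment tree, 1-based nodes

void build(int node, int st, int end) {
    if (st == end) { tree[node] = a[st]; return; }
    int mid = (st + end) / 2;
    build(2 * node, st, mid);
    build(2 * node + 1, mid + 1, end);
    tree[node] = min(tree[2 * node], tree[2 * node + 1]);
}

// (the query() function from above goes here)

int main() {
    build(1, 0, a.size() - 1);
    // minimum over indices [1..4] = min(2, 8, 1, 9) = 1
    printf("%d\n", query(1, 0, a.size() - 1, 1, 4));
}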
Suppose a Node (in a BST) is defined as follows (ignore all the setters/getters/inits).
class Node
{
Node parent;
Node leftChild;
Node rightChild;
int value;
// ... other stuff
}
Given a reference to some Node in a BST (called startNode) and another Node (called target), one is to check whether the tree containing startNode has any node whose value is equal to target.value.
I have two algorithms to do this:
Algorithm #1:
- From `startNode`, trace the way to the root node (using the `Node.parent` reference) : O(n)
- From the root node, do a regular binary search for the target : O(log(n))
T(n) = O(log(n) + n)
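A quick C++ sketch of Algorithm #1 (my own illustration; the struct mirrors the Node class above):

struct Node {
    Node *parent = nullptr, *leftChild = nullptr, *rightChild = nullptr;
    int value = 0;
};

bool containsValue(Node* startNode, int targetValue) {
    // Step 1: climb to the root: O(h), which is O(n) in the worst case.
    Node* root = startNode;
    while (root->parent)
        root = root->parent;
    // Step 2: ordinary BST search from the root: O(h), O(log n) if balanced.
    Node* cur = root;
    while (cur) {
        if (targetValue == cur->value) return true;
        cur = targetValue < cur->value ? cur->leftChild : cur->rightChild;
    }
    return false;
}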
Algorithm #2: Basically perform a DFS
(Pseudo-code only)
current_node = startnode
while the root has not been reached:
    go up one level from current_node
    perform a binary search from this node downward (excluding the branch from which we just came up)
What is the time-complexity of this algorithm?
The naive answer would be O(n * log(n)), where n is for the while loop, as there are at most n nodes, and log(n) is for the binary-search. But obviously, that is way-overestimating!
The best (partial) answer I could come up with was:
Suppose each sub-branch has some m_i nodes and that there are k sub-branches. In other words, k is the number of nodes between startNode and the root node.
The total time would be:
T(n) = log(m1) + log(m2) + ... + log(mk)
     = log(m1 * m2 * ... * mk)
where m1 + m2 + ... + mk = n (the total number of nodes in the tree).
(This is the best estimation I could get as I forgot most of my maths to do any better!)
So I have two questions:
0) What is the time-complexity of algorithm #2 in terms of n
1) Which algorithm does better in term of time-complexity?
Ok, after digging through my old maths books, I found that the upper bound of a product of k numbers whose sum is n is p <= (n/k)^k (this follows from the AM-GM inequality).
With that said, the T(n) function would become:
T(n) = O(f(n, k))
Where
f(n, k) = log((n/k)^k)
= k * log(n/k)
= k * log(n) - k * log(k)
(Remember, k is the number of nodes between startNode and the root, while n is the total number of nodes in the tree.)
How would I go on from here? (I.e., how do I simplify f(n, k)? Or is that good enough for Big-O analysis?)
Below is an iterative algorithm to traverse a Binary Search Tree in in-order fashion (first the left child, then the parent, finally the right child) without using a stack:
(Idea: find the left-most node of the tree, then repeatedly find the successor of the node at hand and print its value, until there are no nodes left.)
void In-Order-Traverse(Node root){
    Node current = Min-Tree(root); // start from the left-most node
    while (current != null){
        print-on-screen(current.key);
        current = Successor(current);
    }
    return;
}
Node Min-Tree(Node root){ // find the leftmost child
    Node current = root;
    while (current.leftChild != null)
        current = current.leftChild;
    return current;
}
Node Successor(Node root){
    if (root.rightChild != null) // if root has a right child, find the leftmost node of its right sub-tree
        return Min-Tree(root.rightChild);
    else{
        Node current = root;
        while (current.parent != null && current.parent.leftChild != current)
            current = current.parent; // climb while current is a right child
        return current.parent;
    }
}
It's been claimed that the time complexity of this algorithm is Theta(n), assuming there are n nodes in the BST, which is surely correct. However, I cannot convince myself of it: I suspect some of the nodes are traversed more than a constant number of times (depending on the sizes of their subtrees), and summing up all these visit counts would not obviously give Theta(n).
Any idea or intuition on how to prove it?
It is easier to reason with edges rather than nodes. Let us reason based on the code of the Successor function.
Case 1 (then branch)
For all nodes with a right child, we will visit the right subtree once (a "right-turn" edge), then repeatedly visit left children ("left-turn" edges) with the Min-Tree function. Such a traversal creates a path whose edges are unique: the edges will not be repeated in any traversal made from any other node with a right child, since the traversal ensures that you never visit a "right-turn" edge of any other node in the tree. (Proof by construction.)
Case 2 (else branch)
For all nodes without a right child (else branch), we visit the ancestors by following "right-turn" edges until we have to take a "left-turn" edge or we encounter the root of the tree. Again, the edges in the generated path are unique: they will never be repeated in any other traversal made from any other node without a right child. This is because:
Except for the starting node and the node reached by following the "left-turn" edge, all other nodes in between have a right child (which means they are excluded from the else branch). The starting node of course does not have a right child.
Each node has a unique parent (only the root has no parent), and the path to the parent is either a "left-turn" or a "right-turn" (the node is a left child or a right child). Given any node (ignoring the right-child condition), there is only one path that creates the pattern: many "right-turns" then a "left-turn".
Since the nodes in between have a right child, there is no way for an edge to appear in two traversals starting at different nodes. (We are currently considering nodes without a right child.)
(The proof here is quite hand-waving, but I think it can be formally proven by contradiction).
Since the edges are unique, the total number of edges traversed in case 1 only (or case 2 only) will be O(n) (since the number of edges in a tree is equal to the number of vertices - 1). Therefore, after summing the 2 cases up, In-Order Traversal will be O(n).
Note that I only know each edge is visited at most once - I don't know whether all edges are visited or not from the proof, but the number of edges is bounded by the number of vertices, which is just right.
We can easily see that it is also Omega(n) (each node is visited once), so we can conclude that it is Theta(n).
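If the edge-counting argument still feels abstract, you can instrument the traversal and count every edge move (a self-contained C++ sketch I added; the sample tree and the insert routine are illustrative assumptions):

#include <cstdio>

struct Node {
    int key;
    Node *left = nullptr, *right = nullptr, *parent = nullptr;
    Node(int k) : key(k) {}
};

long long moves = 0; // each increment = one edge traversed (down or up)

Node* minTree(Node* n) {            // leftmost node of a subtree
    while (n->left) { n = n->left; ++moves; }
    return n;
}

Node* successor(Node* n) {
    if (n->right) { ++moves; return minTree(n->right); }
    while (n->parent && n->parent->left != n) { n = n->parent; ++moves; }
    if (n->parent) ++moves;         // the final "left-turn" edge upward
    return n->parent;               // null when n was the maximum
}

Node* insert(Node* root, int key) { // plain BST insert that keeps parent links
    Node* fresh = new Node(key);
    if (!root) return fresh;
    Node* cur = root;
    for (;;) {
        Node*& next = key < cur->key ? cur->left : cur->right;
        if (!next) { next = fresh; fresh->parent = cur; return root; }
        cur = next;
    }
}

int main() {
    Node* root = nullptr;
    const int keys[] = {10, 5, 14, 1, 6, 11, 16, 7};
    int n = 0;
    for (int k : keys) { root = insert(root, k); ++n; }
    for (Node* cur = minTree(root); cur; cur = successor(cur))
        printf("%d ", cur->key);    // prints the keys in sorted order
    printf("\nedge moves: %lld, bound 2*(n-1) = %d\n", moves, 2 * (n - 1));
}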
The given program runs in Θ(N) time. Θ(N) doesn't mean that each node is visited exactly once; remember there is a constant factor. So Θ(N) could actually be bounded by 5N or 10N or even 1000N. As such, it doesn't give you an exact count of the number of times a node is visited.
The Time complexity of in-order iterative traversal of Binary Search Tree can be analyzed as follows,
Consider a Tree with N nodes,
Let the execution time be denoted by the complexity function T(N).
Let the left subtree and the right subtree contain X and N-X-1 nodes, respectively.
Then the time complexity is T(N) = T(X) + T(N-X-1) + c.
Now consider the two extreme cases of a BST,
CASE 1: A BST which is perfectly balanced, i.e. both subtrees have an equal number of nodes. For example, consider the BST shown below:

        10
       /  \
      5    14
     / \   / \
    1   6 11  16
For such a tree the complexity function is
T(N) = 2 T(⌊N/2⌋) + c
The Master Theorem (case 1, with a = 2, b = 2 and f(N) = Θ(1)) gives a complexity of Θ(N) in this case.
CASE 2: A fully unbalanced BST, i.e. either the left subtree or the right subtree is empty, so X = 0. For example, consider the BST shown below:

          10
         /
        9
       /
      8
     /
    7
Now T(N) = T(0) + T(N-1) + c, so
T(N) = T(N-1) + c
     = T(N-2) + c + c
     = T(N-3) + c + c + c
     ...
     = T(0) + N·c
Since T(0) = K, where K is a constant,
T(N) = K + N·c
Therefore T(N) = Θ(N).
Thus the complexity is Θ(N) for all the cases.
We focus on edges instead of nodes.
(To get a better intuition, look at this picture: http://i.stack.imgur.com/WlK5O.png)
We claim that in this algorithm every edge is visited at most twice (actually, it's visited exactly twice):
the first time when it's traversed downward, and the second time when it's traversed upward.
To visit an edge more than twice, we would have to traverse it downward a second time: down, up, down, ...
We prove that a second downward visit of an edge is not possible.
Assume that we traverse an edge (u, v) downward for the second time. This means that one of the ancestors of u has a successor which is a descendant of u.
This is not possible:
We know that when we traverse an edge upward, we are looking for a left-turn edge in order to find a successor. So while u is on the left side of its successor, the successor of that successor is on its right side; by moving to the right of a successor (to find the next successor) we can never reach u again, and therefore we can never reach the edge (u, v) again. (To find a successor we move either to the right or up, never to the left.)