Time complexity for finding the diameter of a binary tree - algorithm

I have seen various posts here that compute the diameter of a binary tree. One such solution can be found here (look at the accepted answer, NOT the code highlighted in the problem).
I'm confused about why the time complexity of that code would be O(n^2). I don't see how traversing the nodes of a tree twice (once for the height, via getHeight(), and once for the diameter, via getDiameter()) would be n^2 rather than n + n, which is 2n. Any help would be appreciated.

As you mentioned, the time complexity of getHeight() is O(n).
The function getHeight() is then called once for each node, so the cost per node is O(n); hence the complexity of the entire algorithm (over all nodes) is O(n*n).
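For concreteness, here is a minimal sketch of the kind of naive routine being discussed (a reconstruction, not the exact code from the link), assuming a bare Node with left/right pointers. Note how getHeight() is re-run inside every getDiameter() call:

#include <algorithm>

struct Node { Node *left, *right; };

// Height in nodes: 0 for an empty subtree. O(n) in the subtree size.
int getHeight(Node *root) {
    if (root == nullptr) return 0;
    return 1 + std::max(getHeight(root->left), getHeight(root->right));
}

// Recomputes heights at every node, so the total work is
// O(n) + O(n-1) + ... + O(1) = O(n^2) on a degenerate tree.
int getDiameter(Node *root) {
    if (root == nullptr) return 0;
    int throughRoot = getHeight(root->left) + getHeight(root->right) + 1; // nodes on the path
    int best = std::max(getDiameter(root->left), getDiameter(root->right));
    return std::max(throughRoot, best);
}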

It should be O(N). To calculate the height of every subtree rooted at every node, you only have to traverse the tree once, using a post-order traversal:
int treeHeight(Node *root)
{
    if (root == nullptr) return -1;  // an empty subtree has height -1
    // height = 1 + the taller of the two subtrees, cached on the node
    root->height = max(treeHeight(root->rChild), treeHeight(root->lChild)) + 1;
    return root->height;
}
This visits each node exactly once, so it has order O(N).
Combine this with the result from the linked source, and you will be able to determine which 2 nodes have the longest path between them in, at worst, one more traversal.
Indeed, this describes the way to do it in O(N).
The difference between this solution (the optimized one) and the referenced one is that the referenced solution recomputes the tree height from scratch at every node, shrinking the problem by only 1 node (the root) each time. Thus, from the above, the complexity will be O(N + (N - 1) + ... + 1).
The sum
1 + 2 + ... + N
is equal to
N(N + 1)/2
and so the total cost of all the operations from the repeated calls to getHeight() is O(N^2).
For completeness' sake: conversely, in the optimized solution getHeight() has complexity O(1) after the precomputation, because each node stores its height as a data member.

All subtree heights may be precalculated (in O(n) time), so the total time complexity of finding the diameter is O(n).
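To make the O(n) approach concrete, here is a common single-pass formulation (a sketch under the same height convention as treeHeight() above, not the code from the linked answer): compute each node's height recursively and track the best diameter seen so far in the same traversal.

#include <algorithm>

struct Node { Node *left, *right; };

// Returns the height of *root (in edges, -1 for an empty subtree) and,
// as a side effect, updates diameter with the longest path (in edges)
// found so far. Each node is visited exactly once: O(n) overall.
int heightAndDiameter(Node *root, int &diameter) {
    if (root == nullptr) return -1;
    int lh = heightAndDiameter(root->left, diameter);
    int rh = heightAndDiameter(root->right, diameter);
    diameter = std::max(diameter, lh + rh + 2);  // path through this node
    return 1 + std::max(lh, rh);
}

Calling heightAndDiameter(root, d) with d initialized to 0 leaves the diameter, measured in edges, in d.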

Related

Why is the time complexity of performing n union find (union by size) operations O(n log n)?

In Tree based Implementation of Union Find operation, each element is stored in a node, which contains a pointer to a set name. A node v whose set pointer points back to v is also a set name. Each set is a tree, rooted at a node with a self-referencing set pointer.
To perform a union, we simply make the root of one tree point to the root of the other. To perform a find, we follow set name pointers from the starting node until reaching a node whose set name pointer refers back to itself.
In union by size, when performing a union we make the root of the smaller tree point to the root of the larger. This implies O(n log n) time for performing n union-find operations: each time we follow a pointer, we move into a subtree at least double the size of the previous subtree, so we follow at most O(log n) pointers for any find.
I do not understand how, for each union operation, the find operation is always O(log n). Can someone please explain how the worst-case complexity is actually computed?
Let's assume, for the moment, that each tree of height h contains at least 2^h nodes. What happens if you join two such trees?
If they are of different heights, the height of the combined tree is the same as the height of the taller one, so the new tree still has more than 2^h nodes (same height, but more nodes).
Now, if they are of the same height, the resulting tree will increase its height by one and will contain at least 2^h + 2^h = 2^(h+1) nodes. So the condition still holds.
The most basic trees (1 node, height 0) also fulfill the condition. It follows that all trees that can be constructed by joining two trees together fulfill it as well.
Now the height is just the maximal number of steps to follow during a find. If a tree has n nodes and height h, then n >= 2^h immediately gives h <= log2(n), so a find follows at most log2(n) pointers.
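A minimal union-by-size sketch matching this argument (an assumed array-based interface, not code from the quoted book):

#include <numeric>
#include <utility>
#include <vector>

// Union-find with union by size (no path compression), for elements
// 0..n-1. By the doubling argument above, a node's depth can only grow
// when its tree is merged into one at least as large, so find() follows
// at most log2(n) parent pointers.
struct UnionFind {
    std::vector<int> parent, size;
    explicit UnionFind(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);  // each element is its own root
    }
    int find(int x) {
        while (parent[x] != x) x = parent[x];  // walk up to the root
        return x;
    }
    void unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return;
        if (size[a] < size[b]) std::swap(a, b);  // attach smaller under larger
        parent[b] = a;
        size[a] += size[b];
    }
};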
You can do n union-find operations (union by rank or size) with complexity O(n lg* n), where lg* n is the iterated logarithm, by using the path compression optimization; with a sharper analysis the bound is even O(n α(n)), where α(n) is the inverse Ackermann function.
Note that O(n lg* n) is already better than O(n log n).
In the question "Why is the Ackermann function related to the amortized complexity of union-find algorithm used for disjoint sets?" you can find details about this relation.
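For illustration, a sketch of the path compression step itself, against the same parent-array representation as above (the recursive formulation is one common way to write it):

#include <vector>

// Path-compression variant of find(): every node on the search path is
// pointed directly at the root, flattening the tree for later queries.
// Combined with union by rank or size, this gives the near-linear
// bounds discussed above.
int findCompress(std::vector<int> &parent, int x) {
    if (parent[x] != x)
        parent[x] = findCompress(parent, parent[x]);  // flatten the path
    return parent[x];
}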
We need to prove that the maximum height of the trees is log(N), where N is the number of items in the union-find structure (1).
In the base case, all trees have a height of 0, so (1) is of course satisfied.
Now, assuming all the trees satisfy (1), we need to prove that joining any 2 trees with i and j nodes (i <= j) creates a new tree with maximum height log(i + j) (2).
Because the joining procedure attaches the root node of the smaller tree to the root node of the bigger one, the height of the new tree will be:
max(log(j), 1 + log(i)) = max(log(j), log(2i)) <= log(i + j), since i <= j implies 2i <= i + j => (2) proved
log(j): the height of the new tree is still the height of the bigger tree
1 + log(i): when the heights of the 2 trees are the same
Ref: the book Algorithms

Big O Time Complexity for Recursive Pattern

I have a question about the runtime of recursive patterns.
Example 1
int f(int n) {
    if (n <= 1) {
        return 1;
    }
    return f(n - 1) + f(n - 1);
}
I can understand that the runtime for the above code is O(2^N): if I pass 5, it calls f(4) twice, then each f(4) calls f(3) twice, and so on until it reaches 1, i.e., something like O(branches^depth).
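One way to confirm the O(2^N) bound: each call does a constant amount of work c on top of two recursive calls on n - 1, so the recurrence unrolls as

T(n) = 2T(n - 1) + c
     = 4T(n - 2) + 2c + c
     = ...
     = 2^(n-1) T(1) + (2^(n-1) - 1)c
     = O(2^n)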
Example 2
Balanced Binary Tree
int sum(Node node) {
    if (node == null) {
        return 0;
    }
    return sum(node.left) + node.value + sum(node.right);
}
I read that the runtime for the above code is O(2^log N) since the tree is balanced, but I still see it as O(2^N). Can anyone explain?
When the number of elements gets halved each time, the runtime is log N. But how does that apply to a binary tree here?
Is it 2^log N just because it is balanced?
What if it is not balanced?
Edit:
We can simplify O(2^log N) = O(N), but I am still seeing it as O(2^N).
Thanks!
A binary tree will have complexity O(n) here, like any other tree, because you ultimately traverse all of the elements of the tree. The halving does nothing special; it just computes the sum for the two children separately.
The term comes about because, if the tree is balanced, 2^(log_2(n)) is the number of elements in the tree (leaf + non-leaf), there being log_2(n) levels.
Again, if it is not balanced it doesn't matter: we are performing an operation for which every element needs to be considered, making the runtime O(n).
Where could balance have mattered? If we were searching for an element, then it would have mattered (whether the tree is balanced or not).
I'll take a stab at this.
In a balanced binary tree, you should have half the child nodes to the left and half to the right of each parent node. The first layer of the tree is the root, with 1 element, then 2 elements in the next layer, then 4 elements in the next, then 8, and so on. So for a tree with L layers, you have 2^L - 1 nodes in the tree.
Reversing this, if you have N elements to insert into a tree, you end up with a balanced binary tree of depth L = log_2(N), so you only ever need to call your recursive algorithm for log_2(N) layers. At each layer, you are doubling the number of calls to your algorithm, so in your case you end up with 2^log_2(N) calls and O(2^log_2(N)) run time. Note that 2^log_2(N) = N, so it's the same either way, but we'll get to the advantage of a binary tree in a second.
If the tree is not balanced, you end up with depth greater than log_2(N), so you have more recursive calls. In the extreme case, when all of your children are to the left (or right) of their parent, you have N recursive calls, but each call returns immediately from one of its branches (no child on one side). Thus you would have O(N) run time, which is the same as before. Every node is visited once.
An advantage of a balanced tree is in cases like search. If the left-hand child is always less than the parent, and the right-hand child is always greater than, then you can search for an element n among N nodes in O(log_2(N)) time (not 2^log_2(N)!). If, however, your tree is severely imbalanced, this search becomes a linear traversal of all of the values and your search is O(N). If N is extremely large, or you perform this search a ton, this can be the difference between a tractable and an intractable algorithm.
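A minimal sketch of such a search (assuming int keys and the usual binary-search-tree ordering invariant):

struct Node { int value; Node *left, *right; };

// Search in a binary *search* tree: every comparison discards one whole
// subtree. On a balanced tree the depth is log_2(N), so this runs in
// O(log N); on a degenerate (list-like) tree it degrades to O(N).
bool contains(const Node *node, int target) {
    while (node != nullptr) {
        if (target == node->value) return true;
        node = (target < node->value) ? node->left : node->right;
    }
    return false;
}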

What's the time complexity of this algorithm (pseudo code)?

Assume the tree T is a binary tree.
Algorithm computeDepths(node, depth)
Input: a node and its depth. To compute all depths, call computeDepths(T.root, 0)
Output: the depths of all the nodes of T
if node != null
    node.depth ← depth
    computeDepths(node.left, depth + 1)
    computeDepths(node.right, depth + 1)
    return depth
end if
I ran it on paper with a full and complete binary tree containing 7 elements, but I still can't wrap my head around its time complexity. If I had to guess, I'd say it's O(n*log n).
It is O(n)
To get an idea of the time complexity, we need to compare the amount of work done by the algorithm with the size of the input. In this algorithm, the work done per function call is constant (only assigning a given value to a variable), so let's count how many times the function is called.
The first time, the function is called on the root.
For each subsequent call, the function checks whether the node is null; if it is not, it sets the depth accordingly and recurses on both children.
Now note that the function is called once per node in the tree, plus once per null child pointer. A binary tree with n nodes has n + 1 null child pointers, so the total number of function calls is:
n + (n + 1) = 2n + 1
This is the amount of work done by the algorithm, and so the time complexity is O(n).
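As a sanity check, here is the same algorithm as runnable code (a sketch assuming a Node type with a depth field); the call-counting argument above applies to it line for line:

struct Node { int depth; Node *left, *right; };

// Direct transcription of the pseudocode: constant work per call, one
// call per node plus one per null child pointer, so 2n + 1 calls: O(n).
void computeDepths(Node *node, int depth) {
    if (node == nullptr) return;
    node->depth = depth;
    computeDepths(node->left, depth + 1);
    computeDepths(node->right, depth + 1);
}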

Disjoint Set in a special ways?

We implement the disjoint-set data structure with trees. In this data structure, makeset() creates a set with one element, and merge(i, j) merges the trees of sets i and j in such a way that the tree with the lower height becomes a child of the root of the other tree. If we do n makeset() operations and n - 1 merge() operations in a random manner, and then do one find operation, what is the cost of that find operation in the worst case?
I) O(n)
II) O(1)
III) O(n log n)
IV) O(log n)
Answer: IV.
Could anyone mention a good tip on how the author arrived at this solution?
The O(log n) find is only true when you use union by rank (here the rank is the tree height, i.e. the rule described in the question). With this optimisation, we always place the tree with the lower rank under the root of the tree with the higher rank. If both have the same rank, we choose arbitrarily, but increase the rank of the resulting tree by one. This gives an O(log n) bound on the depth of the tree. We can prove this by showing that a node that is i levels below the root (equivalent to being in a tree of rank >= i) is in a tree of at least 2^i nodes (this is the same as showing that a tree of size n has depth at most log n). This is easily done with induction.
Induction hypothesis: the tree size is >= 2^j for all levels j < i.
Case i = 0: the node is the root, and the size is 1 = 2^0.
Case i + 1: the length of a path is i + 1 if it was i and the tree was then placed underneath another tree. By the induction hypothesis, the node was in a tree of size >= 2^i at that time. It is being placed under another tree, which by our merge rules means that tree also has rank at least i, and therefore also has >= 2^i nodes. The new tree therefore has >= 2^i + 2^i = 2^(i + 1) nodes.
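A sketch of the merge rule from the question, using parent and height arrays (the array-based representation and names are illustrative, not from the question):

#include <utility>
#include <vector>

int findRoot(const std::vector<int> &parent, int x) {
    while (parent[x] != x) x = parent[x];  // walk up to the set name
    return x;
}

// Union by height, as described in the question: the shorter tree is
// attached under the taller tree's root, and the height grows only on
// ties, so after n - 1 merges the height is O(log n) and the final
// find costs O(log n): answer IV.
void merge(std::vector<int> &parent, std::vector<int> &height, int i, int j) {
    int a = findRoot(parent, i), b = findRoot(parent, j);
    if (a == b) return;
    if (height[a] < height[b]) std::swap(a, b);  // make a the taller root
    parent[b] = a;
    if (height[a] == height[b]) ++height[a];
}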

Time complexity of level order traversal

What is the time complexity of binary tree level order traversal ? Is it O(n) or O(log n)?
#include <queue>

void levelorder(Node *n)
{
    std::queue<Node *> q;
    q.push(n);
    while (!q.empty())
    {
        Node *node = q.front();
        DoSmthwith(node);  // process the current node
        q.pop();
        if (node->left != NULL)
            q.push(node->left);
        if (node->right != NULL)
            q.push(node->right);
    }
}
It is O(n), or to be exact, Theta(n).
Have a look at each node in the tree: each node is "visited" at most 3 times and at least once, when it is discovered (all nodes), when coming back from its left son (non-leaf nodes), and when coming back from its right son (non-leaf nodes). So there are at most 3n and at least n visits in total. Each visit is O(1) (a queue push/pop), totaling Theta(n).
Another way to approach this problem is identifying that a level-order traversal is very similar to the breadth-first search of a graph. A breadth-first traversal has a time complexity that is O(|V| + |E|) where |V| is the number of vertices and |E| is the number of edges.
In a tree, the number of edges is exactly one less than the number of vertices, which makes the traversal linear in the number of nodes overall.
The time and space complexities are both O(n), where n is the number of nodes.
Space complexity: the queue size is proportional to the number of nodes, O(n).
Time complexity: O(n), as each node is visited twice, once during the enqueue operation and once during the dequeue operation.
This is a special case of BFS. You can read about BFS (breadth-first search) at http://en.wikipedia.org/wiki/Breadth-first_search .
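For completeness, a quick usage sketch of the levelorder() routine from the question (assuming a hypothetical Node type with an int value and a DoSmthwith() that just prints):

#include <cstdio>

struct Node { int value; Node *left, *right; };

void DoSmthwith(Node *node) { std::printf("%d ", node->value); }  // hypothetical processing step

// levelorder() as defined in the question above

int main() {
    Node right{3, nullptr, nullptr};
    Node left{2, nullptr, nullptr};
    Node root{1, &left, &right};
    levelorder(&root);  // visits nodes level by level, printing: 1 2 3
    return 0;
}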
