Leetcode 543: Diameter of Binary tree top down or bottom-up? - algorithm

I'm working on diameter of a binary tree and I've copied over the question.
Given the root of a binary tree, return the length of the diameter of the tree.
The diameter of a binary tree is the length of the longest path between any two nodes in a tree. This path may or may not pass through the root. The length of a path between two nodes is represented by the number of edges between them.
The official solution is below.
def diameterOfBinaryTree(self, root: Optional[TreeNode]) -> int:
    max_diameter = 0
    def getDepth(root):
        nonlocal max_diameter
        if not root:
            return 0
        left = getDepth(root.left)
        right = getDepth(root.right)
        diameter = right + left
        max_diameter = max(max_diameter, diameter)
        return max(left, right) + 1
    getDepth(root)
    return max_diameter
The solution has O(N) time complexity because each node in the tree is processed only once. Is this a bottom-up algorithm because the leaf nodes are processed before their parent nodes and return information (essentially their longest downward path) to their callers? If the algorithm were instead top-down, where the diameter through each parent node is calculated before the diameters through its child nodes, would it be O(N^2) because of the repeated calculations?
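For comparison, here is a minimal sketch of what a top-down variant could look like. It is not part of the official solution: `Node`, `height`, `diameter_top_down`, `chain` and the `counter` argument are hypothetical names introduced for illustration. Each node first computes the path through itself by calling a separate height routine on its children, and only then recurses, so subtrees are measured again and again. On a chain of N nodes the height routine touches N(N-1)/2 nodes in total, which is the O(N^2) worst case; on a balanced tree the same code does O(N log N) work.

```python
from typing import List, Optional


class Node:
    def __init__(self, left: "Optional[Node]" = None, right: "Optional[Node]" = None):
        self.left = left
        self.right = right


def height(node: Optional[Node], counter: List[int]) -> int:
    """Number of nodes on the longest downward path starting at node."""
    if node is None:
        return 0
    counter[0] += 1  # count every node touched by a height computation
    return 1 + max(height(node.left, counter), height(node.right, counter))


def diameter_top_down(node: Optional[Node], counter: List[int]) -> int:
    """Diameter in edges; the parent's candidate is computed before the children's."""
    if node is None:
        return 0
    through_node = height(node.left, counter) + height(node.right, counter)
    return max(through_node,
               diameter_top_down(node.left, counter),
               diameter_top_down(node.right, counter))


def chain(n: int) -> Optional[Node]:
    """Left-leaning chain of n nodes (worst case for the top-down variant)."""
    root = None
    for _ in range(n):
        root = Node(left=root)
    return root
```

Calling `diameter_top_down(chain(100), counter)` with `counter = [0]` leaves `counter[0]` at 100 * 99 / 2, whereas the bottom-up solution touches each of the 100 nodes once.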

Related

the center of a tree with weighted edges

I am trying to solve a university assignment. Given a connected tree T=(V,E), every edge e has a specific positive cost c, and d(v,w) is the distance between nodes v and w. I'm asked to give the pseudocode of an algorithm that finds the center of such a tree (the node that minimizes the maximum distance to every other node).
My solution consists, first of all, in finding the two tallest branches of the tree. The center will then be in the tallest branch at a distance of H/2 from the root (H is the difference between the heights of the two tallest branches). The pseudocode is:
Algorithm solution(Node root, int height, List path)
    root: the root of the tree
    height: the height measured for every branch. Initially height=0
    path: the path from the root to a leaf. Initially path={root}
    Result: the center of the tree
    if root==null then
        return "error message"
    endif
    /*a list that will contain an element <h,path> for every
    leaf of the tree. h is the distance of the leaf from the root
    and path is the path*/
    List L = empty
    if isLeaf(root) then
        L = L union {<height,path>}
    endif
    foreach child c of root do
        solution(c,height+d(root,c),path UNION {c})
    endfor
    /*for every leaf in the tree I have stored in L an element containing
    the distance from the root and the relative path. Now I'm going to pick
    the two tallest branches of the tree*/
    Array array = sort(L)
    <h1,path1> = array[0]//corresponding to the tallest branch
    <h2,path2> = array[1]//corresponding to the next tallest branch
    H = h1 - h2;
    /*The center will be the node c in path1 with d(root,c)=H/2. If such a
    node does not exist we can choose the node with the distance from the root
    closest to H/2 */
    int accumulator = 0
    for each element a in path1 do
        if d(root,a)>H/2 then
            return MIN([d(root,a)-H/2],[H/2-d(root,a.parent)])
        endif
    end for
end Algorithm
Is this a correct solution? Is there an alternative and more efficient one?
Thank you.
Your idea is correct. You can arbitrarily pick any vertex to be a root of the tree, and then traverse the tree in 'post-order'. Since the weights are always positive, you can always pick the two longest 'branches' and update the answer in O(1) for each node.
Just remember that you are looking for the 'global' longest path (i.e. diameter of the graph), rather than the 'local' longest paths that go through the roots of the subtrees.
You can find more information if you search for "(weighted) Jordan Center (in a tree)". The optimal algorithm is O(N) for trees, so asymptotically your solution is optimal since you only use a single DFS which is O(|V| + |E|) == O(|V|) for a tree.
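As a rough illustration, the center can also be found with three linear traversals instead of the single post-order pass described above; this is a swapped-in variant, not the answerer's exact method. It relies on the fact that in a tree with positive weights the farthest node from any node x is always one of the two ends u, v of a longest path, so the eccentricity of x is max(d(x,u), d(x,v)). The adjacency-list format and the names `distances_from` and `weighted_center` are assumptions made for this sketch.

```python
from typing import Dict, Hashable, List, Tuple

# Assumed input format: node -> list of (neighbor, positive edge cost)
Graph = Dict[Hashable, List[Tuple[Hashable, float]]]


def distances_from(graph: Graph, source: Hashable) -> Dict[Hashable, float]:
    """Weighted distance from source to every node, via an iterative DFS."""
    dist = {source: 0.0}
    stack = [source]
    while stack:
        node = stack.pop()
        for neighbor, cost in graph[node]:
            if neighbor not in dist:  # in a tree the first visit uses the only path
                dist[neighbor] = dist[node] + cost
                stack.append(neighbor)
    return dist


def weighted_center(graph: Graph) -> Hashable:
    """Node that minimizes the maximum distance to every other node."""
    start = next(iter(graph))
    from_start = distances_from(graph, start)
    u = max(from_start, key=from_start.get)  # one end of a longest path
    from_u = distances_from(graph, u)
    v = max(from_u, key=from_u.get)          # the other end
    from_v = distances_from(graph, v)
    # eccentricity of x is max(d(x, u), d(x, v)); pick the smallest
    return min(graph, key=lambda x: max(from_u[x], from_v[x]))
```

For the path a - b - c - d with costs 1, 1 and 10, the maximum distances are 12, 11, 10 and 12, so the sketch returns c. Each traversal is O(|V|), so the total stays linear.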

Efficiently find the depth of a graph from every node

I have a problem where I am to find the minimum possible depth of a graph, which implies that I have to find the maximum depth from each node and return the least of them all. Obviously a simple DFS from each node will do the trick, but when things get crazy with extremely large input, DFS becomes inefficient (time limit). I tried keeping the distance of each leaf to the node being explored in memory, but that didn't help much.
How do I efficiently find the minimum depth of a very large graph? It is worthy of note that the graph in question has no cycle.
To find the graph centre/center of an undirected tree graph you could:
1. Do a DFS to find a list of all leaf nodes, O(n)
2. Remove all these leaf nodes from the graph and note during the deletion which new nodes become leaf nodes
3. Repeat step 2 until the graph is completely deleted
The node/nodes deleted in the last stage of the algorithm will be the graph centres of your tree.
Each node is deleted once, so this whole process can be done in O(n).
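The leaf-peeling steps above can be sketched as follows. This is a minimal version under assumptions: the tree is given as a dict mapping each node to the set of its neighbors, `tree_centers` is a hypothetical name, and instead of deleting until nothing is left the loop stops once at most two nodes remain, which yields the same nodes as the last deletion stage.

```python
from typing import Dict, Hashable, List, Set


def tree_centers(adj: Dict[Hashable, Set[Hashable]]) -> List[Hashable]:
    """Return the one or two centers of an unweighted tree by peeling leaves."""
    if len(adj) <= 2:
        return list(adj)
    degree = {node: len(neighbors) for node, neighbors in adj.items()}
    leaves = [node for node, d in degree.items() if d == 1]
    remaining = len(adj)
    while remaining > 2:
        remaining -= len(leaves)
        next_leaves = []
        for leaf in leaves:
            degree[leaf] = 0  # mark the leaf as removed
            for neighbor in adj[leaf]:
                if degree[neighbor] > 0:
                    degree[neighbor] -= 1
                    if degree[neighbor] == 1:  # neighbor just became a leaf
                        next_leaves.append(neighbor)
        leaves = next_leaves
    return leaves
```

Every node enters `leaves` at most once and every edge is looked at a constant number of times, which matches the O(n) bound stated above.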
What you seem to be looking for is the diameter / 2. You could compute the diameter of a tree as below and call it as findDiameter(n, null), for an arbitrary node n of the tree.
public findDiameter(node n, node from) returns <depth, diameter>
    // apply recursively on all children
    foreach child in (neighbors(n) minus from) do
        <de, di> = findDiameter(child, n)
    // depth of this node is 1 more than max depth of children
    depth = 1 + max(de)
    // max diameter either uses this node, then it is 1 + the 2 largest depths
    // or it does not use this node, then it's the max diameter of the children
    diameter = max(max(di), 1 + max(de) + oneButLargest(de))
    return <depth, diameter>
All you need to do is in the loop over the neighbors keep track of the largest diameter and the 2 largest depths.
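A possible Python rendering of that idea, tracking the largest diameter and the two largest depths inside the loop. The adjacency-dict input and the name `find_diameter` are assumptions of this sketch. As in the pseudocode, both depth and diameter are counted in nodes, so a single node has depth 1 and diameter 1, and the diameter in edges is one less.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple


def find_diameter(adj: Dict[Hashable, Iterable[Hashable]],
                  node: Hashable,
                  parent: Optional[Hashable] = None) -> Tuple[int, int]:
    """Return (depth, diameter) of the part of the tree below node, in nodes."""
    best, second = 0, 0  # the two largest child depths seen so far
    diameter = 0         # the largest diameter found inside any child subtree
    for child in adj[node]:
        if child == parent:
            continue
        child_depth, child_diameter = find_diameter(adj, child, node)
        diameter = max(diameter, child_diameter)
        if child_depth > best:
            best, second = child_depth, best
        elif child_depth > second:
            second = child_depth
    depth = 1 + best
    # either the longest path avoids this node, or it joins the two deepest children
    diameter = max(diameter, 1 + best + second)
    return depth, diameter
```

The recursion visits each node once; for very deep trees an explicit stack would be needed to stay within Python's recursion limit.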

Time analysis of a Binary Search Tree in-order traversal algorithm

Below is an iterative algorithm to traverse a Binary Search Tree in in-order fashion (first left child , then the parent , finally right child) without using a Stack :
(Idea : the whole idea is to find the left-most child of a tree and find the successor of the node at hand each time and print its value , until there's no more node left.)
void In-Order-Traverse(Node root){
    Node current = Min-Tree(root); // start at the left-most node
    while (current != null){
        print-on-screen(current.key);
        current = Successor(current);
    }
    return;
}
Node Min-Tree(Node root){ // find the leftmost child
    Node current = root;
    while (current.leftChild != null)
        current = current.leftChild;
    return current;
}
Node Successor(Node root){
    if (root.rightChild != null) // if root has a right child, find the leftmost child of the right sub-tree
        return Min-Tree(root.rightChild);
    else{
        Node current = root;
        while (current.parent != null && current.parent.leftChild != current)
            current = current.parent;
        return current.parent;
    }
}
It's been claimed that the time complexity of this algorithm is Theta(n), assuming there are n nodes in the BST, which is for sure correct. However, I cannot convince myself, as I guess some of the nodes are traversed more than a constant number of times, depending on the number of nodes in their sub-trees, and summing up all these visits wouldn't result in a time complexity of Theta(n).
Any idea or intuition on how to prove it?
It is easier to reason with edges rather than nodes. Let us reason based on the code of Successor function.
Case 1 (then branch)
For all nodes with a right child, we will visit the right subtree once ("right-turn" edge), then always visit the left subtree ("left-turn" edges) with Min-Tree function. We can prove that such traversal will create a path whose edges are unique - the edges will not be repeated in any traversal made from any other node with a right child, since the traversal ensures that you never visit any "right-turn" edge of other nodes on the tree. (Proof by construction).
Case 2 (else branch)
For all nodes without a right child (else branch), we will visit the ancestors by following "right-turn" edges until you have to make a "left-turn" edge or encounter the root of the binary tree. Again, the edges in the path generated are unique - will never be repeated in any other traversal made from any other node without a right child. This is because:
Except for the starting node and the node reached by following the "left-turn" edge, all other nodes in between have a right child (which means those are excluded from the else branch). The starting node of course does not have a right child.
Each node has a unique parent (only the root node does not have parent), and the path to parent is either "left-turn" or "right-turn" (the node is a left child or a right child). Given any node (ignoring the right child condition), there is only one path that creates the pattern: many "right-turn" then a "left-turn".
Since the nodes in between have a right child, there is no way for an edge to appear in 2 traversal starting at different nodes. (Since we are currently considering nodes without a right child).
(The proof here is quite hand-waving, but I think it can be formally proven by contradiction).
Since the edges are unique, the total number of edges traversed in case 1 only (or case 2 only) will be O(n) (since the number of edges in a tree is equal to the number of vertices - 1). Therefore, after summing the 2 cases up, In-Order Traversal will be O(n).
Note that I only know each edge is visited at most once - I don't know whether all edges are visited or not from the proof, but the number of edges is bounded by the number of vertices, which is just right.
We can easily see that it is also Omega(n) (each node is visited once), so we can conclude that it is Theta(n).
The given program runs in Θ(N) time. Θ(N) doesn't mean that each node is visited exactly once. Remember there is a constant factor. So Θ(N) could actually be limited by 5 N or 10 N or even a 1000 N. So as such it doesn't give you an exact count on the number of times a node is visited.
The Time complexity of in-order iterative traversal of Binary Search Tree can be analyzed as follows,
Consider a Tree with N nodes,
Let the execution time be denoted by the complexity function T(N).
Let the left sub tree and right sub tree contain X and N-X-1 nodes respectively,
Then the time complexity T(N) = T(X) + T(N-X-1) + c,
Now consider the two extreme cases of a BST,
CASE 1: A BST which is perfectly balanced, i.e. both the sub trees have equal number of nodes. For example consider the BST shown below,
       10
      /  \
     5    14
    / \   / \
   1   6 11  16
For such a Tree the complexity function is,
T(N) = 2 T(⌊N/2⌋) + c
Master Theorem gives us a complexity of Θ(N) in this case.
CASE 2: A fully unbalanced BST, i.e. either the left sub tree or right sub tree is empty. Therefore X = 0. For example consider the BST shown below,
         10
        /
       9
      /
     8
    /
   7
Now T(N) = T(0) + T(N-1) + c,
T(N) = T(N-1) + c
T(N) = T(N-2) + c + c
T(N) = T(N-3) + c + c + c
.
.
.
T(N) = T(0) + N c
Since T(0) = K, where K is a constant,
T(N) = K + N c
Therefore T(N) = Θ(N).
Thus the complexity is Θ(N) for all the cases.
We focus on edges instead of nodes.
(To get a better intuition, look at this picture: http://i.stack.imgur.com/WlK5O.png)
We claim that in this algorithm every edge is visited at most twice (actually, it's visited exactly twice):
the first time when it's traversed downward and the second time when it's traversed upward.
To visit an edge more than twice, we would have to traverse that edge downward again: down, up, down, ...
We prove that it's not possible to have a second downward visit of an edge.
Let's assume that we traverse an edge (u, v) downward for the second time. This means that one of the ancestors of u has a successor which is a descendant of u.
This is not possible:
We know that when we are traversing an edge upward, we are looking for a left-turn edge to find a successor. So while u is on the left side of that successor, the successor of this successor is on its right side; by moving to the right side of a successor (to find its successor), reaching u again, and therefore edge (u, v) again, is impossible. (To find a successor we move either right or up, but never left.)
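The edge argument can be checked empirically with a small sketch. The `Node` class, the `insert` helper and `in_order_with_edge_count` are hypothetical names written for this illustration; the traversal follows the Min-Tree/Successor scheme from the question and counts every move along an edge. Under the claim above, the count should never exceed 2(n-1) for a tree with n nodes.

```python
from typing import List, Optional, Tuple


class Node:
    def __init__(self, key: int):
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None
        self.parent: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Plain BST insertion that also maintains parent pointers."""
    new = Node(key)
    if root is None:
        return new
    current = root
    while True:
        if key < current.key:
            if current.left is None:
                current.left, new.parent = new, current
                return root
            current = current.left
        else:
            if current.right is None:
                current.right, new.parent = new, current
                return root
            current = current.right


def in_order_with_edge_count(root: Node) -> Tuple[List[int], int]:
    """Stackless in-order walk; also returns the number of edge traversals."""
    steps = 0
    current: Optional[Node] = root
    while current.left is not None:          # Min-Tree(root)
        current = current.left
        steps += 1
    keys = []
    while current is not None:
        keys.append(current.key)
        if current.right is not None:        # case 1: one right turn, then left turns
            current = current.right
            steps += 1
            while current.left is not None:
                current = current.left
                steps += 1
        else:                                # case 2: climb until arriving from a left child
            while current.parent is not None and current.parent.left is not current:
                current = current.parent
                steps += 1
            if current.parent is not None:   # the final left-turn edge
                steps += 1
            current = current.parent
    return keys, steps
```

For the seven keys 4, 2, 6, 1, 3, 5, 7 inserted in that order, the walk makes 12 edge moves, which is 2 * (7 - 1).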

Finding the heaviest length-constrained path in a weighted Binary Tree

UPDATE
I worked out an algorithm that I think runs in O(n*k) running time. Below is the pseudo-code:
routine heaviestKPath( T, k )
    // create 2D matrix with n rows and k columns with each element = -∞
    // we make it size k+1 because the 0th column must be all 0s for a later
    // function to work properly and simplicity in our algorithm
    matrix = new array[ T.getVertexCount() ][ k + 1 ] (-∞);
    // set all elements in the first column of this matrix = 0
    matrix[ n ][ 0 ] = 0;
    // fill our matrix by traversing the tree
    traverseToFillMatrix( T.root, k );
    // consider a path that would arc over a node
    globalMaxWeight = -∞;
    findArcs( T.root, k );
    return globalMaxWeight
end routine
// node = the current node; k = the path length; node.lc = node’s left child;
// node.rc = node’s right child; node.idx = node’s index (row) in the matrix;
// node.lc.wt/node.rc.wt = weight of the edge to left/right child;
routine traverseToFillMatrix( node, k )
    if (node == null) return;
    traverseToFillMatrix(node.lc, k ); // recurse left
    traverseToFillMatrix(node.rc, k ); // recurse right
    // in the case that a left/right child doesn’t exist, or both,
    // let’s assume the code is smart enough to handle these cases
    matrix[ node.idx ][ 1 ] = max( node.lc.wt, node.rc.wt );
    for i = 2 to k {
        // max returns the heavier of the 2 paths
        matrix[node.idx][i] = max( matrix[node.lc.idx][i-1] + node.lc.wt,
                                   matrix[node.rc.idx][i-1] + node.rc.wt);
    }
end routine
// node = the current node, k = the path length
routine findArcs( node, k )
    if (node == null) return;
    nodeMax = matrix[node.idx][k];
    longPath = path[node.idx][k];
    i = 1;
    j = k-1;
    while ( i+j == k AND i < k ) {
        left = node.lc.wt + matrix[node.lc.idx][i-1];
        right = node.rc.wt + matrix[node.rc.idx][j-1];
        if ( left + right > nodeMax ) {
            nodeMax = left + right;
        }
        i++; j--;
    }
    // if this node’s max weight is larger than the global max weight, update
    if ( globalMaxWeight < nodeMax ) {
        globalMaxWeight = nodeMax;
    }
    findArcs( node.lc, k ); // recurse left
    findArcs( node.rc, k ); // recurse right
end routine
Let me know what you think. Feedback is welcome.
I think I have come up with two naive algorithms that find the heaviest length-constrained path in a weighted Binary Tree. Firstly, the description of the problem is as follows: given an n-vertex Binary Tree with weighted edges and some value k, find the heaviest path of length k.
For both algorithms, I'll need a reference to all vertices so I'll just do a simple traversal of the Tree to have a reference to all vertices, with each vertex having a reference to its left, right, and parent nodes in the tree.
Algorithm 1
For this algorithm, I'm basically planning on running DFS from each node in the Tree, with consideration to the fixed path length. In addition, since the path I'm looking for has the potential of going from left subtree to root to right subtree, I will have to consider 3 choices at each node. But this will result in a O(n*3^k) algorithm and I don't like that.
Algorithm 2
I'm essentially thinking about using a modified version of Dijkstra's Algorithm in order to consider a fixed path length. Since I'm looking for heaviest and Dijkstra's Algorithm finds the lightest, I'm planning on negating all edge weights before starting the traversal. Actually... this doesn't make sense since I'd have to run Dijkstra's on each node and that doesn't seem much more efficient than the above algorithm.
So I guess my main questions are several. Firstly, do the algorithms I've described above solve the problem at hand? I'm not totally certain the Dijkstra's version will work as Dijkstra's is meant for positive edge values.
Now, I am sure there exist more clever/efficient algorithms for this... what is a better algorithm? I've read about "Using spine decompositions to efficiently solve the length-constrained heaviest path problem for trees" but that is really complicated and I don't understand it at all. Are there other algorithms that tackle this problem, maybe not as efficiently as spine decomposition but easier to understand?
You could use a DFS downwards from each node that stops after k edges to search for paths, but notice that this will do 2^k work at each node for a total of O(n*2^k) work, since the number of paths doubles at each level you go down from the starting node.
As DasBoot says in a comment, there is no advantage to using Dijkstra's algorithm here since its cleverness amounts to choosing the shortest (or longest) way to get between 2 points when multiple routes are possible. With a tree there is always exactly 1 way.
I have a dynamic programming algorithm in mind that will require O(nk) time. Here are some hints:
If you choose some leaf vertex to be the root r and direct all other vertices downwards, away from the root, notice that every path in this directed tree has a highest node -- that is, a unique node that is nearest to r.
You can calculate the heaviest length-k path overall by going through each node v and calculating the heaviest length-k path whose highest node is v, finally taking the maximum over all nodes.
A length-k path whose highest node is v must have a length-i path descending towards one child and a length-(k-i) path descending towards the other.
That should be enough to get you thinking in the right direction; let me know if you need further help.
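For readers who want to see the hints put together, here is one way the O(nk) dynamic program could look. It is a sketch under assumptions, not the answerer's own code: the `Node` class with `left_wt`/`right_wt` edge weights and the name `heaviest_k_path` are made up for illustration, and the tree is rooted at whatever root it is given rather than at a leaf (the highest-node argument works for any root). For every node it builds, per child, the heaviest downward path of exactly j edges, then combines i edges on one side with k - i on the other.

```python
from typing import List, Optional

NEG_INF = float("-inf")


class Node:
    def __init__(self, left=None, left_wt=0.0, right=None, right_wt=0.0):
        self.left, self.left_wt = left, left_wt      # left child and weight of the edge to it
        self.right, self.right_wt = right, right_wt  # right child and weight of the edge to it


def heaviest_k_path(root: Optional[Node], k: int) -> float:
    """Weight of the heaviest path with exactly k edges, or -inf if none exists."""
    best = NEG_INF

    def arm(child: Optional[Node], weight: float, down_child: List[float]) -> List[float]:
        # result[j]: heaviest path of exactly j edges that starts at the parent
        # and descends through this child; result[0] is the empty path
        result = [0.0] + [NEG_INF] * k
        if child is not None:
            for j in range(1, k + 1):
                result[j] = weight + down_child[j - 1]
        return result

    def visit(node: Optional[Node]) -> List[float]:
        nonlocal best
        if node is None:
            return [NEG_INF] * (k + 1)
        left = arm(node.left, node.left_wt, visit(node.left))
        right = arm(node.right, node.right_wt, visit(node.right))
        for i in range(k + 1):  # i edges go left, k - i edges go right
            best = max(best, left[i] + right[k - i])
        # heaviest downward path of each length starting at this node
        return [max(l, r) for l, r in zip(left, right)]

    visit(root)
    return best
```

Each node does O(k) work, giving O(nk) overall. Paths that do not exist are represented by negative infinity, so a result of `float("-inf")` means the tree has no path with k edges.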
Here's my solution. Feedback is welcome.
Let's treat the binary tree as a directed graph, with edges going from parent to children. Let's define two concepts for each vertex v:
a) an arc: which is a directed path, that is, it starts from vertex v, and all vertices in the path are descendants of the starting vertex v.
b) a child-path: which is a directed or non-directed path containing v, that is, it could start anywhere, end anywhere, and go from child of v to v, and then, say to its other child. The set of arcs is a subset of the set of child-paths.
We also define a function HeaviestArc(v,j), which gives, for a vertex v, the heaviest arc, on the left or right side, of length j, starting at v. We also define LeftHeaviest(v,j), and RightHeaviest(v,j) as the heaviest left and right arcs of length j respectively.
Given this, we can define the following recurrences for each vertex v, based on its children:
LeftHeaviest(v,j) = weight(LeftEdge(v)) + HeaviestArc(LeftChild(v),j-1);
RightHeaviest(v,j) = weight(RightEdge(v)) + HeaviestArc(RightChild(v),j-1);
HeaviestArc(v,j) = max(LeftHeaviest(v,j),RightHeaviest(v,j));
Here j goes from 1 to k, and HeaviestArc(v,0) = LeftHeaviest(v,0) = RightHeaviest(v,0) = 0 for all v. For leaf nodes, HeaviestArc(v,0) = 0, and HeaviestArc(v,j)=-inf for all other j (I need to think about corner cases more thoroughly).
And then HeaviestChildPath(v), the heaviest child-path containing v, can be calculated as:
HeaviestChildPath(v) = max{ for j = 0 to k: LeftHeaviest(v,j) + RightHeaviest(v,k-j) }
The heaviest path should be the heaviest of all child paths.
The estimated runtime of the algorithm should be order O(kn).
def traverse(node, running_weight, level):
    global max_weight
    if node is None:
        return
    if level == 0:
        if max_weight < running_weight:
            max_weight = running_weight
        return
    traverse(node.left, running_weight + node.weight, level - 1)
    traverse(node.right, running_weight + node.weight, level - 1)
    traverse(node.parent, running_weight + node.weight, level - 1)
max_weight = 0
for node in tree:
    traverse(node, 0, N)

binary tree data structures

Can anybody give me a proof that the number of nodes in a strictly binary tree is 2n-1, where n is the number of leaf nodes?
Proof by induction.
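Base case is when you have one leaf: the tree is a single node and 2*1-1 = 1. Suppose it is true for k leaves. Then you should prove it for k+1. Take a deepest leaf, its parent and the parent's other child, which is also a leaf (by definition of a strictly binary tree, and because no leaf lies deeper). Removing these two leaves turns the parent into a leaf, so the remaining tree has k leaves and you can use the induction hypothesis: it has 2k-1 nodes. So the actual number of nodes is (2k-1) + 2 = 2k+1 == 2*(k+1)-1.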
just go with the basics, assuming there are x nodes in total, then we have n nodes with degree 1(leaves), 1 with degree 2(the root) and x-n-1 with degree 3(the inner nodes)
as a tree with x nodes will have x-1 edges. so summing
n + 3*(x-n-1) + 2 = 2(x-1) (equating the total degrees)
solving for x we get x = 2n-1
I'm guessing that what you really want is something like a proof that the depth is log2(N), where N is the number of nodes. In this case, the answer is fairly simple: for any given depth D, the number of nodes is 2^D.
Edit: in response to edited question: the same fact pretty much applies. Since the number of nodes at any depth is 2^D, the number of nodes further up the tree is 2^(D-1) + 2^(D-2) + ... + 2^0 = 2^D - 1. Therefore, the total number of nodes in a balanced binary tree is 2^D + 2^D - 1. If you set n = 2^D, you've gone the full circle back to the original equation.
I think you are trying to work out a proof for: N = 2L - 1 where L is the number
of leaf nodes and N is the total number of nodes in a binary tree.
For this formula to hold you need to put a few restrictions on how the binary
tree is constructed. Each node is either a leaf, which means it has no children, or
it is an internal node. Internal nodes have 3
possible configurations:
    2 leaf nodes
    1 leaf and 1 internal node
    2 internal nodes
All three configurations imply that an internal node connects to two other nodes. This explicitly
rules out the situation where a node connects to a single child as in:
      o
     /
    o
Informal Proof
Start with a minimal tree of 1 leaf: L = 1, N = 1. Substitute into N = 2L - 1 and see that
the formula holds true (1 = 1, so far so good).
Now add another minimal chunk to the tree. To do that you need to add another two nodes, and the
tree looks like:
      o
     / \
    o   o
Notice that you must add nodes in pairs to satisfy the restriction stated earlier.
Adding a pair of nodes always adds
one leaf (two new leaf nodes, but you lose one as it becomes an internal node). Node growth
progresses as the series: 1, 3, 5, 7, 9... but leaf growth is: 1, 2, 3, 4, 5... That is why the formula
N = 2L - 1 holds for this type of tree.
You might use mathematical induction to construct a formal proof, but this works fine for me.
Proof by mathematical induction:
The statement that there are (2n-1) of nodes in a strictly binary tree with n leaf nodes is true for n=1. { tree with only one node i.e root node }
let us assume that the statement is true for tree with n-1 leaf nodes. Thus the tree has 2(n-1)-1 = 2n-3 nodes
to form a tree with n leaf nodes we need to add 2 child nodes to any of the leaf nodes in the above tree. Thus the total number of nodes = 2n-3+2 = 2n-1.
hence, proved
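The induction step can be mirrored by a small experiment: start from a single leaf and repeatedly give some leaf two children, then count. This is only an empirical sanity check under assumptions, not a proof; `grow_strict_binary_tree` and its dict-of-children representation are hypothetical names chosen for the sketch.

```python
import random
from typing import Tuple


def grow_strict_binary_tree(splits: int, seed: int = 0) -> Tuple[int, int]:
    """Start from one leaf and give a random leaf two children, `splits` times.

    Returns (total_nodes, leaf_count), both counted from the final structure.
    """
    rng = random.Random(seed)
    children = {0: []}  # node id -> list of child ids (empty list means leaf)
    leaves = [0]
    for _ in range(splits):
        leaf = leaves.pop(rng.randrange(len(leaves)))  # this leaf becomes internal
        left, right = len(children), len(children) + 1
        children[leaf] = [left, right]
        children[left], children[right] = [], []
        leaves += [left, right]
    leaf_count = sum(1 for kids in children.values() if not kids)
    return len(children), leaf_count
```

Each split adds two nodes and one net leaf, so after s splits the function returns 2s + 1 nodes and s + 1 leaves, which satisfies nodes = 2 * leaves - 1.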
To prove: A strictly binary tree with n leaves contains 2n-1 nodes.
Show P(1): A strictly binary tree with 1 leaf contains 2(1)-1 = 1 node.
Show P(2): A strictly binary tree with 2 leaves contains 2(2)-1 = 3 nodes.
Show P(3): A strictly binary tree with 3 leaves contains 2(3)-1 = 5 nodes.
Assume P(K): A strictly binary tree with K leaves contains 2K-1 nodes.
Prove P(K+1): A strictly binary tree with K+1 leaves contains 2(K+1)-1 nodes.
2(K+1)-1 = 2K+2-1
= 2K+1
= 2K-1 +2*
* This result indicates that, for each leaf that is added, another node must be added to the father of the leaf , in order for it to continue to be a strictly binary tree. So, for every additional leaf, a total of two nodes must be added, as expected.
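#include <iostream>
using namespace std;

int main()
{
    int N = 1024; // insert here the value of N (number of leaves, must be a power of two)
    int sum = 0; // the number of total nodes
    int currFactor = 1; // the number of nodes in the current level
    while (currFactor <= N) // there are log2(N) + 1 levels, the last one holds the N leaves
    {
        sum += currFactor;
        currFactor *= 2; // each level has twice as many nodes as the level above
    }
    if (sum == 2*N - 1)
    {
        cout << "wow that the number of nodes is 2*N-1";
    }
    return 0;
}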

Resources