Algorithm to find the longest path in binary tree? - algorithm

Question:
You are given a rooted binary tree (each node has at most two children).
For a simple path p between two nodes in the tree, let mp be the node on the path that is highest
(closest to the root). Define the weight of a path w(p) = Σ_u∈p d(u, mp), where d denotes the distance
(number of edges on the path between two nodes). That is, every node on the path is weighted by the
distance to the highest node on the path.
The question asks for an algorithm that finds the maximum weight among all simple paths in the tree. I'm not sure I interpreted it correctly, but can I just find the longest path from mp to the farthest node? I haven't figured out which algorithm is appropriate for this question, but I think recursion is one way to do it. Again, I don't understand the question very well, so it would help if someone could "translate" it for me and guide me to the solution.

Let's assume we know mp. Then the highest-weight path starts in the left subtree of mp and ends in the right subtree (or vice versa); a path that entered the same subtree twice would not be simple. To find the two endpoints, we go as deep as possible into each subtree, because every extra level adds that node's depth to the weight. Therefore, we can compute the weight of this path directly from the heights of the two subtrees, counted in nodes so that an empty subtree has height 0 (this is the closed form of the arithmetic series 1 + 2 + ... + h):
max_weight = height_left * (height_left + 1) / 2 + height_right * (height_right + 1) / 2
To find the maximum weight path across the entire tree (without prescribing mp), simply check this value for all nodes. I.e., take a recursive algorithm that calculates the height for each subtree. When you have the two subtree heights for a node, calculate the maximum weight. Of all these weights, take the maximum. This requires time linear in the number of nodes.
And to answer your question: No, it is not necessarily the longest path in the tree. The path can have one branch that goes very deep but a very shallow branch on the other side. This is because adding one level deeper to the path does not just increase the weight by 1 but by the depth of that node.
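The recursion described in this answer could be sketched in Python roughly as follows. The names Node and max_path_weight are made up for illustration, and heights are counted in nodes (empty subtree = 0), which is the convention the formula needs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Illustrative node type; only the two child links matter here.
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def max_path_weight(root: Optional[Node]) -> int:
    """Return the maximum w(p) over all simple paths in the tree."""
    best = 0

    def height(node: Optional[Node]) -> int:
        # Height counted in nodes: empty subtree -> 0, single node -> 1.
        nonlocal best
        if node is None:
            return 0
        hl = height(node.left)
        hr = height(node.right)
        # Treat `node` as mp: the two arms contribute 1 + 2 + ... + hl
        # and 1 + 2 + ... + hr respectively.
        best = max(best, hl * (hl + 1) // 2 + hr * (hr + 1) // 2)
        return 1 + max(hl, hr)

    height(root)
    return best
```

For a chain of four nodes this gives 0 + 1 + 2 + 3 = 6. The sketch is recursive, so a very deep tree may need an explicit stack instead.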

This problem is the diameter of a binary tree. In this case, a node at a lower level has a greater weight because it is farther from the root. Therefore, finding the longest path amounts to finding the diameter of the binary tree. We can use a brute-force algorithm that traverses all leaf-to-leaf paths and takes the longest one as the diameter.
Method: naive approach
Find the height of left and right subtree, and then find the left and right diameter. Return the Maximum(Diameter of left subtree, Diameter of right subtree, Longest path between two nodes which passes through the root.)
Time complexity: the diameter is computed recursively, and at every node the height of its subtrees is recomputed from scratch by walking the tree from top to bottom, so the total is O(N^2).
Improve:
Notice that at every node we call a separate function to find the height in order to compute the diameter. We can improve this by computing the height and the diameter in the same traversal. Every node returns two pieces of information in the same pass: the height of that node and the diameter of the subtree rooted at that node. The running time is O(N).
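A minimal sketch of the single-pass version, assuming the diameter is measured in edges and heights in nodes (TreeNode and height_and_diameter are hypothetical names, not from the answer):

```python
from typing import Optional, Tuple


class TreeNode:
    # Illustrative node type with two optional children.
    def __init__(self, left: "Optional[TreeNode]" = None,
                 right: "Optional[TreeNode]" = None):
        self.left = left
        self.right = right


def height_and_diameter(node: Optional[TreeNode]) -> Tuple[int, int]:
    """Return (height in nodes, diameter in edges) of the subtree at `node`."""
    if node is None:
        return 0, 0
    lh, ld = height_and_diameter(node.left)
    rh, rd = height_and_diameter(node.right)
    # The longest path through `node` uses lh edges on the left
    # and rh edges on the right.
    through = lh + rh
    return 1 + max(lh, rh), max(ld, rd, through)
```

Each node is visited once and does constant work, which is where the O(N) bound comes from.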

Related

Find the edge which is not a part of any possible diameter of a tree

I'm curious if there is a quick way to find if an edge exists which is not a part of any possible diameter of a n-ary tree. For example in the following tree, A-B edge will not be a part of any diameter.
I tried by listing down all the possible diameters, but that takes a lot of time and I'm certain that there is a faster way.
Let's begin with a simpler question: how would we find any diameter of the tree? One way to do this would be to pick some node and to root the tree at that node. A diameter of the graph then could be found in one of two ways:
The diameter might purely be contained within one of those subtrees.
The diameter might be found by taking the roots of two subtrees, computing the longest path starting at each of the subtree roots, and then joining them together through the overall tree root.
So imagine that we recursively visit each subtree and obtain, from each, both the longest path starting at the root of that subtree (we could store this implicitly by having each node store its height) and the length of the longest path purely within that subtree (possibly stored implicitly by tagging each subtree with the length of the longest path within that tree). Once we have this information, we can find some diameter as follows: the diameter is either
(1) the longest path purely within one of those subtrees, or
(2) formed by joining two of the longest paths starting at the roots of the subtrees through the root.
All this information can be computed in time O(n) with O(n) auxiliary storage, and so if we just need to determine what the diameter is, we can do so fairly quickly.
Now, let's modify this to actually find all the edges that might get used. We can do this by starting at the root node. Consider the lengths of the paths obtained via routes (1) and (2). If route (1) produces a strictly longer path than route (2), we can recursively descend into each subtree containing a path of that length and run the same process to identify the edges that could potentially be used. If route (2) produces a strictly longer path than route (1), we'd then mark the edge from the root to each of its children whose subtree has the longest path starting at its root as being used, and if there's exactly one such subtree we'd then also mark the edge to each subtree tied for the second-longest path as being used. We'd then recursively descend into those subtrees, always taking paths down subtrees containing one of the many possible longest paths.
This second propagation step takes time O(n) because each node is visited exactly once and the work done is proportional to the number of children. Overall, this is an O(n)-time algorithm that uses O(n) space.
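For comparison, here is a sketch of a different O(n) route to the same result, not the marking procedure above. It assumes the tree is given as an adjacency dict, and for every edge it computes the longest path forced through that edge (the "rerooting" trick); an edge is on some diameter exactly when that length equals the diameter. The function name and input format are assumptions made for this example.

```python
import heapq
from typing import Dict, List, Tuple


def edges_off_every_diameter(adj: Dict[int, List[int]]) -> List[Tuple[int, int]]:
    """Return the edges (parent, child) that lie on no diameter of the tree."""
    if len(adj) < 2:
        return []
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    for v in order:                      # BFS order: parents before children
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                order.append(w)

    down = {v: 0 for v in order}         # longest downward path from v, in edges
    for v in reversed(order):
        if parent[v] is not None:
            down[parent[v]] = max(down[parent[v]], down[v] + 1)

    up = {root: 0}                       # longest path from v that leaves via its parent
    for p in order:
        kids = [c for c in adj[p] if c != parent[p]]
        # Two largest child arms of p (padded with zeros).
        top1, top2 = (heapq.nlargest(2, (down[c] + 1 for c in kids)) + [0, 0])[:2]
        for c in kids:
            sibling_best = top2 if down[c] + 1 == top1 else top1
            up[c] = 1 + max(up[p], sibling_best)

    # The longest path through edge (parent[v], v) has down[v] + up[v] edges.
    diameter = max(down[v] + up[v] for v in order[1:])
    return [(parent[v], v) for v in order[1:] if down[v] + up[v] < diameter]
```

On the path 0-1-2-3-4 with an extra leaf 5 attached to node 2, only the edge (2, 5) is reported, since every longest path runs along the spine.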

relation between degrees of vertices and edge removal

I'm looking for help proving the following:
given an undirected tree with n vertices, each of degree <= 3,
(1) prove that there exists an edge whose removal leaves two trees, each with at most 2*n/3 vertices.
(2) suggest a linear algorithm that finds such an edge in the above given tree
Choose an arbitrary root. Do a post-order traversal to compute the size of each subtree. Descend from the root, always moving to a child whose subtree is at least as large as its siblings', until you find a subtree of size at least (n-1)/3 and less than 2(n-1)/3 + 1. The degree bound guarantees the descent cannot skip over this range: the root's largest child has size at least (n-1)/3, and below the root every node has at most two children, so a step down from a subtree of size s reaches one of size at least (s-1)/2. Sever its parent edge.
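A sketch of that descent, assuming the tree is an adjacency dict with at least two vertices and maximum degree 3 (third_splitting_edge is a made-up name). A reversed BFS order stands in for the post-order traversal, since it also processes children before parents.

```python
from typing import Dict, List, Tuple


def third_splitting_edge(adj: Dict[int, List[int]]) -> Tuple[int, int]:
    """Return an edge (parent, child) found by the largest-child descent."""
    n = len(adj)
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    for v in order:                      # BFS order: parents before children
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                order.append(w)

    size = {v: 1 for v in order}
    for v in reversed(order):            # children are processed before parents
        if parent[v] is not None:
            size[parent[v]] += size[v]

    v = root
    while True:
        # Move to the child with the largest subtree.
        c = max((w for w in adj[v] if w != parent[v]), key=size.__getitem__)
        if 3 * size[c] <= 2 * n:         # integer form of size[c] < 2(n-1)/3 + 1
            return v, c
        v = c
```

On a path of 7 vertices rooted at one end, the descent stops at the edge (2, 3), which splits the path into parts of 3 and 4 vertices.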

Dijkstra looped tree

Does anyone know this?
A looped tree is a weighted, directed graph built from a binary tree
by adding an edge from every leaf back to the root. Every edge has a
non-negative weight.
How much time would Dijkstra’s algorithm require to compute the shortest path between two vertices u and v in a looped tree with n
nodes?
Describe and analyze a faster algorithm.
How much time would Dijkstra’s algorithm require to compute the
shortest path between two vertices u and v in a looped tree with n
nodes?
It will take O(V log V) time (worst-case analysis).
Note that there is a single simple path from u to v for each pair of nodes (u, v). If this path contains a very heavy edge, Dijkstra's algorithm keeps postponing that edge and only discovers the correct route after it has explored most of the other vertices in the looped tree, which makes the complexity O(V log V) (note that E is in O(V) for this graph).
Describe and analyze a faster algorithm.
Since there is a single simple path, you just need to find it.
It can be easily done by finding the lowest common ancestor in the tree (without loops), and then finding a route to this ancestor from u.
Complexity of this algorithm is O(h) - where h is the height of the graph.
I think the answer by amit is wrong.
Regarding "Describe and analyze a faster algorithm":
you can't find the cheapest route from vertex u to this ancestor in O(h), so that algorithm is not O(h), for two reasons. First, if internal nodes only have parent-to-child directed edges, we need to search downward from u to find the cheapest route to the common ancestor (or the root), and I am not aware of an algorithm that can do that. Second, if there are both parent->child and child->parent edges, then the path from the source vertex to the lowest common ancestor can go through any of the 3 adjacent vertices of an internal node, or the 1 adjacent vertex (the root) of a leaf node, so we can't do it in O(h).
Based on my understanding of the problem, child->parent edges are not part of the definition of a looped tree. Therefore, the only way is to go down the tree and come back in at the top, and from the root to the target there is a single simple path. We thus reduce the problem to finding the cheapest route from u to the root, which reduces the complexity.
Further, if the target is a direct descendant of the source, we will stop while searching for the cheapest route to the root. If the source is the root, the problem is trivial, since the route is the single simple path from the root down to the target.
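Under this last reading (edges point only from parent to child, plus one edge from every leaf back to the root), the reduction could be sketched like this. The input format (a children dict with weighted edges and a loop dict for the leaf-to-root weights) and the function name are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def looped_tree_distance(children: Dict[str, List[Tuple[str, float]]],
                         loop: Dict[str, float],
                         root: str, u: str, v: str) -> float:
    """Cheapest u -> v cost in a looped tree with non-negative weights."""
    # parent_edge[c] = (parent of c, weight of the edge parent -> c)
    parent_edge = {c: (p, w) for p, kids in children.items() for c, w in kids}

    # Walk up from v: cost_from[a] is the cost of the unique tree path a -> v
    # for every ancestor a of v (including v itself and the root).
    cost_from = {v: 0}
    node, cost = v, 0
    while node != root:
        node, w = parent_edge[node]
        cost += w
        cost_from[node] = cost

    if u in cost_from:                   # v is u itself or a descendant of u
        return cost_from[u]

    def cheapest_exit(n: str) -> float:
        # Cheapest way to go from n down to some leaf and jump to the root.
        kids = children.get(n, [])
        if not kids:
            return loop[n]
        return min(w + cheapest_exit(c) for c, w in kids)

    return cheapest_exit(u) + cost_from[root]
```

The search below u touches each node of u's subtree once and the walk up from v takes O(h), so the total is O(n) in the worst case with no priority queue.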

Split a tree into equal parts by deleting an edge

I am looking for an algorithm to split a tree with N nodes (where the maximum degree of each node is 3) by removing one edge from it, so that the two resulting trees each have as close as possible to N/2 nodes. How do I find the edge that is "the most centered"?
The tree comes as an input from a previous stage of the algorithm and is input as a graph - so it's not balanced nor is it clear which node is the root.
My idea is to find the longest path in the tree and then select the edge in the middle of the longest path. Does it work?
Optimally, I am looking for a solution that can ensure that neither of the trees has more than 2N / 3 nodes.
Thanks for your answers.
I don't believe that your initial algorithm works for the reason I mentioned in the comments. However, I think that you can solve this in O(n) time and space using a modified DFS.
Begin by walking the graph to count how many total nodes there are; call this n. Now, choose an arbitrary node and root the tree at it. We will now recursively explore the tree starting from the root and will compute for each subtree how many nodes are in each subtree. This can be done using a simple recursion:
If the current node is null, return 0.
Otherwise:
For each child, compute the number of nodes in the subtree rooted at that child.
Return 1 + the total number of nodes in all child subtrees
At this point, we know for each edge what split we will get by removing that edge, since if the subtree below that edge has k nodes in it, the split will be (k, n - k). You can thus find the best cut to make by iterating across all nodes and looking for the one that balances (k, n - k) most evenly.
Counting the nodes takes O(n) time, and running the recursion visits each node and edge at most O(1) times, so that takes O(n) time as well. Finding the best cut takes an additional O(n) time, for a net runtime of O(n). Since we need to store the subtree node counts, we need O(n) memory as well.
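A possible sketch of this procedure, assuming the tree arrives as an adjacency dict with at least two nodes (most_balanced_edge is a hypothetical name). It replaces the recursion with a BFS order processed in reverse, which computes the same subtree counts without hitting the recursion limit on deep trees.

```python
from typing import Dict, List, Tuple


def most_balanced_edge(adj: Dict[int, List[int]]) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return ((parent, child), (k, n - k)) for the most even single-edge cut."""
    n = len(adj)
    root = next(iter(adj))               # any node works as the root
    parent = {root: None}
    order = [root]
    for v in order:                      # BFS order: parents before children
        for w in adj[v]:
            if w != parent[v]:
                parent[w] = v
                order.append(w)

    size = {v: 1 for v in order}
    for v in reversed(order):            # add each subtree count into its parent
        if parent[v] is not None:
            size[parent[v]] += size[v]

    # Cutting the edge above v splits the tree into (size[v], n - size[v]).
    best = min(order[1:], key=lambda v: abs(n - 2 * size[v]))
    return (parent[best], best), (size[best], n - size[best])
```

On the path 0-1-2-3 this picks the middle edge (1, 2) with a (2, 2) split.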
Hope this helps!
If you look at my answer to "Divide-And-Conquer Algorithm for Trees", you'll see that I find a node that partitions the tree into two nearly equal-size trees (a bottom-up algorithm); you then just need to choose one of the edges of this node to do what you want.
Your current approach does not work. Assume you have a complete binary tree, and attach a path of length 3*log n to one of its leaves (call it the bad leaf). The longest path then runs from one of the other leaves to the end of the path attached to the bad leaf, and its middle edge lies within the attached path (in fact, past the bad leaf). If you partition on this edge, you get one part of size O(log n) and another part of size O(n).

Binary tree visit: get from one leaf to another leaf

Problem: I have a binary tree, all leaves are numbered (from left to right, starting from 0) and no connection exists between them.
I want an algorithm that, given two indices (of 2 distinct leaves), visits the tree starting from the greater leaf (the one with the higher index) and gets to the lower one.
The internal nodes of the tree do not contain any useful information.
I should choose the path based only on the leaves' indices. The path starts from a leaf and terminates on a leaf, and of course I can access a leaf if I know its index (through an array of pointers).
The tree is static, no insertion or deletion of nodes is allowed.
I have developed an algorithm to do it but it really sucks... any ideas?
One option would be to find the least common ancestor of the two nodes, along with the sequence of nodes you should take from each node to get to that ancestor. Here's a sketch of the algorithm:
Starting from each node, walk back up to that node's parent until you reach the root. Count the number of nodes on the path from each node to the root. Let the height of the first node be h1 and the height of the second node be h2.
Let h = min(h1, h2). This is the height of the higher of the two nodes.
Starting from each node, keep following the node's parent pointer until both nodes are at height h. Record the nodes you followed during this step. At this point, both nodes are at the same height.
Until you find a common node, keep marching upwards from each node to its parent. Eventually you will hit their common ancestor. At this point, follow the path from the first node up to this ancestor, then down the path from the ancestor down to the second node.
In the worst case, this takes O(h) time and O(h) space, where h is the height of the tree. For a balanced binary tree this is O(lg n) time and space, which is quite good.
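The steps above might look like this in Python, assuming every node can reach its parent through a dict named parent (the root maps to None). The dict and the function name path_between are illustrative, since the question does not say how parent links are stored.

```python
from typing import Dict, Hashable, List, Optional


def path_between(a: Hashable, b: Hashable,
                 parent: Dict[Hashable, Optional[Hashable]]) -> List[Hashable]:
    """Return the nodes on the path from a up to the common ancestor and down to b."""
    def depth(x: Hashable) -> int:
        d = 0
        while parent[x] is not None:
            x = parent[x]
            d += 1
        return d

    da, db = depth(a), depth(b)
    up_a, up_b = [a], [b]                # nodes visited while climbing from each side
    while da > db:                       # lift the deeper node first
        a = parent[a]
        up_a.append(a)
        da -= 1
    while db > da:
        b = parent[b]
        up_b.append(b)
        db -= 1
    while a != b:                        # climb in lockstep until the paths meet
        a = parent[a]
        up_a.append(a)
        b = parent[b]
        up_b.append(b)
    # Both lists end at the common ancestor; keep it once, then walk down to b.
    return up_a + up_b[-2::-1]
```

To go from the higher-indexed leaf to the lower one, pass the higher-indexed leaf as a. The number of edges on the route is the length of the returned list minus one.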
If you're interested in a Much More Hardcore version of this algorithm, consider looking into Tarjan's Least Common Ancestors algorithm, which with linear preprocessing time, can be used to find the least common ancestor much more rapidly than this.
Hope this helps!
Distance between any two nodes can be calculated with the help of lowest common ancestor:
Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2*Dist(root, lca)
where lca is lowest common ancestor.
see this for more help about this algorithm and see this video for learning how to calculate lca.
