the center of a tree with weighted edges - algorithm

i am trying to give a solution to my university assignement..given a connected tree T=(V,E). every edge e has a specific positive cost c..d(v,w) is the distance between node v and w..I'm asked to give the pseudocode of an algorithm that finds the center of such a tree(the node that minimizes the maximum distance to every other node)..
My solution consists first of all in finding the first two taller branches of the tree..then the center will be in the taller branch in a distance of H/2 from the root(H is the difference between the heights of the two taller branches)..the pseudocode is:
Algorithm solution(Node root, int height, List path)
root: the root of the tree
height : the height measured for every branch. Initially height=0
path : the path from the root to a leaf. Initially path={root}
Result : the center of the tree
if root==null than
return "error message"
endif
/*a list that will contain an element <h,path> for every
leaf of the tree. h is the distanze of the leaf from the root
and path is the path*/
List L = empty
if isLeaf(root) than
L = L union {<height,path>}
endif
foreach child c of root do
solution(c,height+d(root,c),path UNION {c})
endfor
/*for every leaf in the tree I have stored in L an element containing
the distance from the root and the relative path. Now I'm going to pick
the two most taller branches of the tree*/
Array array = sort(L)
<h1,path1> = array[0]//corresponding to the tallest branch
<h2,path2> = array[1]//corresponding to the next tallest branch
H = h1 - h2;
/*The center will be the node c in path1 with d(root,c)=H/2. If such a
node does not exist we can choose the node with te distance from the root
closer to H/2 */
int accumulator = 0
for each element a in path1 do
if d(root,a)>H/2 than
return MIN([d(root,a)-H/2],[H/2-d(root,a.parent)])
endif
end for
end Algorithm
is this a correct solution??is there an alternative and more efficient one??
Thank you...

Your idea is correct. You can arbitrarily pick any vertex to be a root of the tree, and then traverse the tree in 'post-order'. Since the weights are always positive, you can always pick the two longest 'branches' and update the answer in O(1) for each node.
Just remember that you are looking for the 'global' longest path (i.e. diameter of the graph), rather than the 'local' longest paths that go through the roots of the subtrees.
You can find more information if you search for "(weighted) Jordan Center (in a tree)". The optimal algorithm is O(N) for trees, so asymptotically your solution is optimal since you only use a single DFS which is O(|V| + |E|) == O(|V|) for a tree.

Related

Leetcode 543: Diameter of Binary tree top down or bottom-up?

I'm working on diameter of a binary tree and I've copied over the question.
Given the root of a binary tree, return the length of the diameter of the tree.
The diameter of a binary tree is the length of the longest path between any two nodes in a tree. This path may or may not pass through the root. The length of a path between two nodes is represented by the number of edges between them.
The official solution is below.
def diameterOfBinaryTree(self, root: Optional[TreeNode]) -> int:
max_diameter = 0
def getDepth(root):
nonlocal max_diameter
if not root:
return 0
left = getDepth(root.left)
right = getDepth(root.right)
diameter = right + left
max_diameter = max(max_diameter, diameter)
return max(left, right) + 1
getDepth(root)
return max_diameter
The solution has the time complexity as O(N) b/c each node in the tree is only processed once. Is this a bottom-up algorithm b/c the leaf nodes are processed before their parent nodes, and they return information (their longest path, essentially) to their callers? If the algorithm was instead top-down where the diameter of parent nodes are calculated before the diameter of their child nodes are calculated, would the algorithm be O(N^2) since there's repeated diameter calculations?

Time analysis of a Binary Search Tree in-order traversal algorithm

Below is an iterative algorithm to traverse a Binary Search Tree in in-order fashion (first left child , then the parent , finally right child) without using a Stack :
(Idea : the whole idea is to find the left-most child of a tree and find the successor of the node at hand each time and print its value , until there's no more node left.)
void In-Order-Traverse(Node root){
Min-Tree(root); //finding left-most child
Node current = root;
while (current != null){
print-on-screen(current.key);
current = Successor(current);
}
return;
}
Node Min-Tree(Node root){ // find the leftmost child
Node current = root;
while (current.leftChild != null)
current = current.leftChild;
return current;
}
Node Successor(Node root){
if (root.rightChild != null) // if root has a right child ,find the leftmost child of the right sub-tree
return Min-Tree(root.rightChild);
else{
current = root;
while (current.parent != null && current.parent.leftChild != current)
current = current.parent;
return current.parrent;
}
}
It's been claimed that the time complexity of this algorithm is Theta(n) assuming there are n nodes in the BST , which is for sure correct . However I cannot convince myself as I guess some of the nodes are traversed more than constant number of times which depends on the number of nodes in their sub-trees and summing up all these number of visits wouldn't result time complexity of Theta(n)
Any idea or intuition on how to prove it ?
It is easier to reason with edges rather than nodes. Let us reason based on the code of Successor function.
Case 1 (then branch)
For all nodes with a right child, we will visit the right subtree once ("right-turn" edge), then always visit the left subtree ("left-turn" edges) with Min-Tree function. We can prove that such traversal will create a path whose edges are unique - the edges will not be repeated in any traversal made from any other node with a right child, since the traversal ensures that you never visit any "right-turn" edge of other nodes on the tree. (Proof by construction).
Case 2 (else branch)
For all nodes without a right child (else branch), we will visit the ancestors by following "right-turn" edges until you have to make a "left-turn" edge or encounter the root of the binary tree. Again, the edges in the path generated are unique - will never be repeated in any other traversal made from any other node without a right child. This is because:
Except for the starting node and the node reached by following "left-turn" edge, all other nodes in between has a right child (which means those are excluded from else branch). The starting node of course does not have a right child.
Each node has a unique parent (only the root node does not have parent), and the path to parent is either "left-turn" or "right-turn" (the node is a left child or a right child). Given any node (ignoring the right child condition), there is only one path that creates the pattern: many "right-turn" then a "left-turn".
Since the nodes in between have a right child, there is no way for an edge to appear in 2 traversal starting at different nodes. (Since we are currently considering nodes without a right child).
(The proof here is quite hand-waving, but I think it can be formally proven by contradiction).
Since the edges are unique, the total number of edges traversed in case 1 only (or case 2 only) will be O(n) (since the number of edges in a tree is equal to the number of vertices - 1). Therefore, after summing the 2 cases up, In-Order Traversal will be O(n).
Note that I only know each edge is visited at most once - I don't know whether all edges are visited or not from the proof, but the number of edges is bounded by the number of vertices, which is just right.
We can easily see that it is also Omega(n) (each node is visited once), so we can conclude that it is Theta(n).
The given program runs in Θ(N) time. Θ(N) doesn't mean that each node is visited exactly once. Remember there is a constant factor. So Θ(N) could actually be limited by 5 N or 10 N or even a 1000 N. So as such it doesn't give you an exact count on the number of times a node is visited.
The Time complexity of in-order iterative traversal of Binary Search Tree can be analyzed as follows,
Consider a Tree with N nodes,
Let the execution time be denoted by the complexity function T(N).
Let the left sub tree and right sub tree contain X and N-X-1 nodes respectively,
Then the time complexity T(N) = T(X) + T(N-X-1) + c,
Now consider the two extreme cases of a BST,
CASE 1: A BST which is perfectly balanced, i.e. both the sub trees have equal number of nodes. For example consider the BST shown below,
10
/ \
5 14
/ \ / \
1 6 11 16
For such a Tree the complexity function is,
T(N) = 2 T(⌊N/2⌋) + c
Master Theorem gives us a complexity of Θ(N) in this case.
CASE 2: A fully unbalanced BST, i.e. either the left sub tree or right sub tree is empty. There for X = 0. For example consider the BST shown below,
10
/
9
/
8
/
7
Now T(N) = T(0) + T(N-1) + c,
T(N) = T(N-1) + c
T(N) = T(N-2) + c + c
T(N) = T(N-3) + c + c + c
.
.
.
T(N) = T(0) + N c
Since T(N) = K, where K is a constant,
T(N) = K + N c
There for T(N) = Θ(N).
Thus the complexity is Θ(N) for all the cases.
We focus on edges instead of nodes.
( to have a better intuition look at this picture : http://i.stack.imgur.com/WlK5O.png)
We claim that in this algorithm every edge is visited at most twice, (actually it's visited exactly twice);
First time when it's traversed downward and and the second time when it's traversed upward.
To visit an edge more than twice , we have to traverse that edge it downward again : down , up , down , ....
We prove that it's not possible to have a second downward visit of an edge.
Let's assume that we traverse an edge (u , v) downward for the second time , this means that one of the ancestors of u has a successor which is a decedent of u.
This is not possible :
We know that when we are traversing an edge upward , we are looking for a left-turn edge to find a successor , so while u is on the left side of the the successor, successor of this successor is on the right side of it , by moving to the right side of a successor (to find its successor) reaching u again and therefore edge (u,v) again is impossible. (to find a successor we either move to the right or to the up but not to the left)

How to find if a graph is a tree and its center

Is there an algorithm (or a sequence of algorithms) to find, given a generic graph structure G=(V,E) with no notion of parent node, leaf node and child node but only neighboordhood relations:
1) If G it is a tree or not (is it sufficient to check |V| = |E|+1?)
2) If the graph is actually a tree, the leaves and the center of it? (i.e the node of the graph which minimizes the tree depth)
Thanks
If the "center" of the tree is defined as "the node of the graph which minimizes the tree depth", there's an easier way to find it than finding the diameter.
d[] = degrees of all nodes
que = { leaves, i.e i that d[i]==1}
while len(que) > 1:
i=que.pop_front
d[i]--
for j in neighbors[i]:
if d[j] > 0:
d[j]--
if d[j] == 1 :
que.push_back(j)
and the last one left in que is the center.
you can prove this by thinking about the diameter path.
to simpify , we assume the length of the diameter path is odd, so that the middle node of the path is unique, let's call that node M,
we can see that:
M will not be pushed to the back of que until every node else on
diameter path has been pushed into que
if there's another node N
that is pushed after M has already been pushed into que, then N must
be on a longer path than the diameter path. Therefore N can't exist. M must be the last
node pushed (and left) in que
For (1), all you have to do is verify |V| = |E| + 1 and that the graph is fully connected.
For (2), you need to find a maximal diameter then pick a node in the middle of the diameter path. I vaguely remember that there's an easy way to do this for trees.
You start with an arbitrary node a, then find a node at maximal distance from a, call it b. Then you search from b and find a node at maximal distance from b, call it c. The path from b to c is a maximal diameter.
There are other ways to do it that might be more convenient for you, like this one. Check Google too.
No, it is not enough - a tree is a CONNECTED graph with n-1 edges. There could be n-1 edges in a not connected graph - and it won't be a tree.
You can run a BFS to find if the graph is connected and then count the number of edges, that will give you enough information if the graph is a tree
The leaves are the nodes v with degree of the nodes denoted by d(v) given by the equation d(v) = 1 (which have only one connected vertex to each)
(1) The answer assumes non-directed graphs
(2) In here, n denotes the number of vertices.

Dynamic Programming Help: Binary Tree Cost Edge

Given a binary tree with n leaves and a set of C colors. Each leaf node of the tree is given a unique color from the set C. Thus no leaf nodes have the same color. The internal nodes of the tree are uncolored. Every pair of colors in the set C has a cost associated with it. So if a tree edge connects two nodes of colors A and B, the edge cost is the cost of the pair (A, B). Our aim is to give colors to the internal nodes of the tree, minimizing the total edge cost of the tree.
I have working on this problem for hours now, and haven't really come up with a working solution. Any hints would be appreciated.
I am going to solve the problem with pseudocode, because I tried writing explanation and it was completely not understandable even for me. Hopefully the code will do the trick. The complexity of my solution is not very good - memory an druntime in O(C^2 * N).
I will need couple of arrays I will be using in dynamic approach to your task:
dp [N][C][C] -> dp[i][j][k] the maximum price you can get from a tree rooted at node i, if you paint it in color j and its parent is colored in color k
maxPrice[N][C] -> maxPrice[i][j] the maximum price you can get from a tree rooted in node i if its parent is colored in color j
color[leaf] -> the color of the leaf leaf
price[C][C] -> price[i][j] the price you get if you have pair of neighbouring nodes with colors i and j
chosenColor[N][C] -> chosenColor[i][j] the color one should choose for node i to obtain maxPrice[i][j]
Lets assume the nodes are ordered using topological sorting, i.e we will be processing first leaves. Topological sorting is very easy to do in a tree. Let the sorting have given a list of inner nodes inner_nodes
for leaf in leaves:
for i in 0..MAX_C, j in 0..MAX_C
dp[leaf][i][j] = (i != color[leaf]) ? 0 : price[i][j]
for i in 0..MAX_C,
maxPrice[leaf][i] = price[color[leaf]][i]
chosenColor[leaf][i] = color[leaf]
for node in inner_nodes
for i in 0..MAX_C, j in 0..MAX_C
dp[node][i][j] = (i != root) ? price[i][j] : 0
for descendant in node.descendants
dp[node][i][j] += maxPrice[descendant][i]
for i in 0...MAX_C
for j in 0...MAX_C
if maxPrice[node][i] < dp[node][j][i]
maxPrice[node][i] = dp[node][j][i]
chosenColor[node][i] = j
for node in inner_nodes (reversed)
color[node] = (node == root) ? chosenColor[node][0] : chosenColor[node][color[parent[node]]
As a starting point, you can use a greedy solution which gives you an upper bound on the total cost:
while the root is not colored
pick an uncolored node having colored descendants only
choose the color that minimizes the total cost to its descendants

Finding the heaviest length-constrained path in a weighted Binary Tree

UPDATE
I worked out an algorithm that I think runs in O(n*k) running time. Below is the pseudo-code:
routine heaviestKPath( T, k )
// create 2D matrix with n rows and k columns with each element = -∞
// we make it size k+1 because the 0th column must be all 0s for a later
// function to work properly and simplicity in our algorithm
matrix = new array[ T.getVertexCount() ][ k + 1 ] (-∞);
// set all elements in the first column of this matrix = 0
matrix[ n ][ 0 ] = 0;
// fill our matrix by traversing the tree
traverseToFillMatrix( T.root, k );
// consider a path that would arc over a node
globalMaxWeight = -∞;
findArcs( T.root, k );
return globalMaxWeight
end routine
// node = the current node; k = the path length; node.lc = node’s left child;
// node.rc = node’s right child; node.idx = node’s index (row) in the matrix;
// node.lc.wt/node.rc.wt = weight of the edge to left/right child;
routine traverseToFillMatrix( node, k )
if (node == null) return;
traverseToFillMatrix(node.lc, k ); // recurse left
traverseToFillMatrix(node.rc, k ); // recurse right
// in the case that a left/right child doesn’t exist, or both,
// let’s assume the code is smart enough to handle these cases
matrix[ node.idx ][ 1 ] = max( node.lc.wt, node.rc.wt );
for i = 2 to k {
// max returns the heavier of the 2 paths
matrix[node.idx][i] = max( matrix[node.lc.idx][i-1] + node.lc.wt,
matrix[node.rc.idx][i-1] + node.rc.wt);
}
end routine
// node = the current node, k = the path length
routine findArcs( node, k )
if (node == null) return;
nodeMax = matrix[node.idx][k];
longPath = path[node.idx][k];
i = 1;
j = k-1;
while ( i+j == k AND i < k ) {
left = node.lc.wt + matrix[node.lc.idx][i-1];
right = node.rc.wt + matrix[node.rc.idx][j-1];
if ( left + right > nodeMax ) {
nodeMax = left + right;
}
i++; j--;
}
// if this node’s max weight is larger than the global max weight, update
if ( globalMaxWeight < nodeMax ) {
globalMaxWeight = nodeMax;
}
findArcs( node.lc, k ); // recurse left
findArcs( node.rc, k ); // recurse right
end routine
Let me know what you think. Feedback is welcome.
I think have come up with two naive algorithms that find the heaviest length-constrained path in a weighted Binary Tree. Firstly, the description of the algorithm is as follows: given an n-vertex Binary Tree with weighted edges and some value k, find the heaviest path of length k.
For both algorithms, I'll need a reference to all vertices so I'll just do a simple traversal of the Tree to have a reference to all vertices, with each vertex having a reference to its left, right, and parent nodes in the tree.
Algorithm 1
For this algorithm, I'm basically planning on running DFS from each node in the Tree, with consideration to the fixed path length. In addition, since the path I'm looking for has the potential of going from left subtree to root to right subtree, I will have to consider 3 choices at each node. But this will result in a O(n*3^k) algorithm and I don't like that.
Algorithm 2
I'm essentially thinking about using a modified version of Dijkstra's Algorithm in order to consider a fixed path length. Since I'm looking for heaviest and Dijkstra's Algorithm finds the lightest, I'm planning on negating all edge weights before starting the traversal. Actually... this doesn't make sense since I'd have to run Dijkstra's on each node and that doesn't seem very efficient much better than the above algorithm.
So I guess my main questions are several. Firstly, do the algorithms I've described above solve the problem at hand? I'm not totally certain the Dijkstra's version will work as Dijkstra's is meant for positive edge values.
Now, I am sure there exist more clever/efficient algorithms for this... what is a better algorithm? I've read about "Using spine decompositions to efficiently solve the length-constrained heaviest path problem for trees" but that is really complicated and I don't understand it at all. Are there other algorithms that tackle this problem, maybe not as efficiently as spine decomposition but easier to understand?
You could use a DFS downwards from each node that stops after k edges to search for paths, but notice that this will do 2^k work at each node for a total of O(n*2^k) work, since the number of paths doubles at each level you go down from the starting node.
As DasBoot says in a comment, there is no advantage to using Dijkstra's algorithm here since it's cleverness amounts to choosing the shortest (or longest) way to get between 2 points when multiple routes are possible. With a tree there is always exactly 1 way.
I have a dynamic programming algorithm in mind that will require O(nk) time. Here are some hints:
If you choose some leaf vertex to be the root r and direct all other vertices downwards, away from the root, notice that every path in this directed tree has a highest node -- that is, a unique node that is nearest to r.
You can calculate the heaviest length-k path overall by going through each node v and calculating the heaviest length-k path whose highest node is v, finally taking the maximum over all nodes.
A length-k path whose highest node is v must have a length-i path descending towards one child and a length-(k-i) path descending towards the other.
That should be enough to get you thinking in the right direction; let me know if you need further help.
Here's my solution. Feedback is welcome.
Lets treat the binary tree as a directed graph, with edges going from parent to children. Lets define two concepts for each vertex v:
a) an arc: which is a directed path, that is, it starts from vertex v, and all vertices in the path are children of the starting vertex v.
b) a child-path: which is a directed or non-directed path containing v, that is, it could start anywhere, end anywhere, and go from child of v to v, and then, say to its other child. The set of arcs is a subset of the set of child-paths.
We also define a function HeaviestArc(v,j), which gives, for a vertex j, the heaviest arc, on the left or right side, of length j, starting at v. We also define LeftHeaviest(v,j), and RightHeaviest(v,j) as the heaviest left and right arcs of length j respectively.
Given this, we can define the following recurrences for each vertex v, based on its children:
LeftHeaviest(v,j) = weight(LeftEdge(v)) + HeaviestArc(LeftChild(v)),j-1);
RightHeaviest(v,j) = weight(RightEdge(v)) + HeaviestArc(RightChild(v)),j-1);
HeaviestArc(v,j) = max(LeftHeaviest(v,j),RightHeaviest(v,j));
Here j here goes from 1 to k, and HeaviestArc(v,0)=LeftHeaviest(v,0),RightHeaviest(v,0)=0 for all. For leaf nodes, HeaviestArc(v,0) = 0, and HeaviestArc(v,j)=-inf for all other j (I need to think about corner cases more thoroughly).
And then HeaviestChildPath(v), the heaviest child-path containing v, can be calculated as:
HeaviestChildPath(v) = max{ for j = 0 to k LeftHeaviest(j) + RightHeaviest(k-j)}
The heaviest path should be the heaviest of all child paths.
The estimated runtime of the algorithm should be order O(kn).
def traverse(node, running_weight, level):
if level == 0:
if max_weight < running_weight:
max_weight = running_weight
return
traverse(node->left,running_weight+node.weight,level-1)
traverse(node->right,running_weight+node.weight,level-1)
traverse(node->parent,running_weight+node.weight,level-1)
max_weight = 0
for node in tree:
traverse(node,0,N)

Resources