how to merge tree nodes into one node - algorithm

Hopefully, my question is not duplicated.
I would like to know if there exists such algorithm which merges some nodes in a tree to a new tree so the node in the new tree consists of some nodes in the old tree?
In order to explain my idea, I drew a graph to explain the question.
Input: A original tree.
Output: A new tree. There are following conditions with which the new tree must be satisfied:
The number of nodes in the new tree should be a fixed number k.
Each node in the new tree must consists of nodes in the original tree. For example, the node A in the second graph contains node 1,3, and 4 of the first graph. Node D in the secod graph contains nodes 9,12, and 13 in the first graph.
if one node of the original tree is contained in a node of the new tree, it cannot appear in another node of the new tree.
The nodes in the new tree are not necessarily have to be a subtree of the original tree. For example, node C in the second graph contains 6,7,and 10 of the first graph, It is not a subtree of the original graph. Because both node 6 and node 7 in the original graph connect to the nodes in the dotted area of A in the original tree, So they could be grouped in the node C of the second graph.
Currently, I just want the original tree can be converted to a new tree that has a K number of nodes and meets above conditions. For a given tree, there are many solutions. For example, graph 3 and graph 4 illustrate another solution for the original tree. It also has 4 nodes.

You need an addition condition or desired property of your output, otherwise it is quite trivial:
Starting from leaf nodes, copy K - 1 nodes to the output: B = {11}, C = {12}, D = {13}
Group all other nodes into a single K-th node: A = {1,2,3,4,5,6,7,8,9,10}

Related

find the source of reconvergent fanouts in a DAG

if we have graph like:
the rencovergent nodes are nodes 10 and 11.
by reconvergent nodes i mean the parents of that node come from a common source node.
in the given graph the parents of node 10 are nodes 7, 8 and they both have node 6 as source node(source of reconvergence)
and for node 11 the source node would be node 4.
my question is if we know which nodes are reconvergent nodes in the DAG how could we find the source nodes of their parents?
for my graph structure i use adjacency list representation available at geekforgeeks
i've tried this:
if(Node is reconvergent_node)
for each parents of Node
Do DFS(0, parents[i]) // depth first search from primary source till we reach the parent node
if the paths from 0 to parents[i] (which extracted from DFS) have common node mark that node as source node
is this algorithm correct? although i have not get the right results yet...
if it's correct i thinks it's limited to reconvergent nodes with max of two parents.
what if we have reconvergent_node which has five parents such that parents[1,2] have common source and parents[3,4,5] have different source node?
what should i do in this case?
any suggestion are welcome.
thank you

ET-tree in Henzinger and King (1995)

I'm having trouble understanding the following from the beginning of section 2.5 of Henzinger and King (FOCS 1995):
We encode an arbitrary tree T with n vertices using a sequence of 2n - 1 symbols, which is generated as follows: Root the tree at an arbitrary vertex.
Then traverse T in depth-first search order traversing each edge twice (once in each direction) and visiting every degree-d vertex d times, except for the root which is visited d + 1 times. Each time any vertex u is encountered, we call this an occurrence of the vertex. Let ET(T) be the sequence of node occurrences representing an arbitrary tree T.
For each spanning tree T(B) of a block B of H_i each occurrence of ET(T(B)) is stored in a node of a balanced binary search tree, called the ET(T(B))-tree. For each vertex u in T(B), we arbitrarily choose one occurrence to be the active occurrence of u.
With the active occurrence of each vertex v, we keep the (unordered) list of nontree edges in B which are incident to u, stored as a balanced binary tree. Each node in the ET-tree contains the number of nontree edges stored in its subtree.
Using this data structure for each level we can sample an edge of T_1 in time O(logn).
My questions follow some explanation of some terms.
From section 2.2,
A block is a maximal set of nodes that are biconnected.
H_i is a subgraph of G, whose dynamic biconnectivity (which is much of the topic of the paper) we wish to maintain. (More specifically, the edge set of G has been partitioned into l = log(|E(G)|) sets and each set induces a graph H_i for 1 <= i <= l.)
Here's my understanding. Pick some T(B) of H_i. The key of each active node in the corresponding ET tree is the number of "nontree edges in B [that] are incident to u". The value of each node is the appropriate list of edges.
Questions:
To finalise the ET tree, we have each node store the sum of the keys of all nodes in its subtree (including the node itself)?
The list for each node is not a balanced tree, right?
If only the active nodes matter, why create an ET-tree in the first place?
To sample an edge, we generate a number between 1 and |E(B))|. If the number is between 1 and |V(B)| - 1 then we're choosing some tree edge. Otherwise, we traverse the balanced tree and then select the appropriate edge from a list?

Find the maximum weight node in a tree if each node is the sum of the weights all the nodes under it.

For exa, this is the tree.
10
12 -1
5 1 1 -2
2 3 10 -9
How to find the node with maximum value?
Given the problem as stated, you need to traverse the entire tree. See proof below.
Traversing the entire tree should be a fairly trivial process.
Proof that we need to traverse the entire tree:
Assume we're able to identify which side of a tree the maximum is on without traversing the entire tree.
Given any tree with the maximum node on the left. Call this maximum x.
Pick one of the leaf nodes on the right. Add 2 children to it: x+1 and -x-1.
Since x+1-x-1 = 0, adding these won't change the sum at the leaf we added it to, thus nor the sums at any other nodes in the tree.
Since this can be added to any leaf in the tree, and it doesn't affect the sums, we'd need to traverse the entire tree to find out if this occurs anywhere.
Thus our assumption that we can identify which side of a tree the maximum is on without traversing the entire tree is incorrect.
Thus we need to traverse the entire tree.
In the general case, you need to traverse the entire tree. If the values in the tree are not constrained (e.g. all non-negative, but in your example there are negative values), then the value in a node tells you nothing about the individual values below it.

how to find all edge-disjoint equi-cost paths in an undirected graph between a pair of nodes?

Given an undirected graph G = (V, E), the edges in G have non-negative weights.
[1] how to find all the edge-disjoint and equal-cost paths between two nodes s and t ?
[2] how to find all the edge-disjoint paths between two nodes s and t ?
[3] how to find all the vertex-disjoint and equal-cost paths between two nodes s and t ?
[4] how to find all the vertex-disjoint paths between two nodes s and t ?
Any approximation algorithms instead ?
Build a tree where each node is a representation of a path through your unidirectional graph.
I know, a tree is a graph too, but in this answer I will use the following terms with this meanings:
vertex: is a vertex in your unidirectional graph. It is NOT a node in my tree.
edge: is an edge in your unidirectional graph. It is NOT part of my tree.
node: a vertex in my tree, not in your graph.
root: the sole node on top of my tree that has no parent.
leaf: any node in my tree that has no children.
I will not talk about edges in my tree, so there is no word for tree-edges. Every edge in this answer is part of your graph.
Algorithm to build the tree
The root of the tree is the representation of a path that contains only of the vertex s and contains no edge. Let its weight be 0.
Do for every node in that tree:
Take the end-vertex of the path represented by that node (I call it the actual node) and find all edges that lead away from that end-vertex.
If: there are no edges that lead away from this vertex, you reached a dead end. This path never will lead to vertex t. So mark this node as a leaf and give it an infinite weight.
Else:
For each of those edges:
add a child-node to the actual node. Let it be a copy of the actual node. Attach the edge to the end of path and then attach the edges end-vertex to the path too. (so each path follows the pattern vertex-edge-vertex-edge-vertex-...)
Now traverse up in the tree, back to the root and check if any of the predecessors has an end-vertex that is identic with the just added end-vertex.
If you have a match, the newly generated node is the representation of a path that contains a loop. Mark this node as a leaf and give it an infinite weight.
Else If there is no loop, just add the newly added edges weight to the nodes weight.
Now test, if the end-vertex of the newly generated node is the vertex t.
If it really is, mark this node as a leaf, since the path represented by this node is a valid path from s to t.
This algorithm always comes to an end in finite time. At the end you have 3 types of leafs in your tree:
nodes that represent dead ends with an infinite weight
nodes that represent loops, also with an infinite weight
nodes that represent a valid path from s to t, with its weight beeing the sum of all edges weights that are part of this path.
Each path represented by a leaf has its individual sequence of edges (but might contain the same sequence of vertexes), so the leafs with finite weights represent the complete set of edge-disjoint pathes from s to t. This is the solution of exercise [2].
Now do for all leafs with finite weight:
Write its weight and its path into a list. Sort the list by the weights. Now paths with identic weight are neighbours in the list, and you can find and extract all groups of equal-cost paths in this sorted list. This is the solution of exercise [1].
Now, do for each entry in this list:
add to each path in this list the list of its vertexes. After you have done this, you have a table with this columns:
weight
path
1st vertex (is always s)
2nd vertex
3rd vertex
...
Sort this table lexigraphic by the vertexes and after all vertexes by the weight (sort by 1st vertex, 2nd vertex, 3rd vertex ,... ,weight)
If one row in this sorted table has the same sequence of vertexes as the row before, then delete this row.
This is the list of all vertex-disjoint paths between two nodes s and t, and so it is the solution of exercise [4].
And in this list you find all equal-cost paths as neighbours, so you can easily extract all groups of vertex-disjoint and equal-cost paths between two nodes s and t from that list, so here you have the solution of exercise [3].

IOI 2003 : how to calculate the node that has the minimum balance in a tree?

here is the Balancing Act problem that demands to find the node that has the minimum balance in a tree. Balance is defined as :
Deleting any node
from the tree yields a forest : a collection of one or more trees. Define the balance of a node to be the size of the largest tree in the forest T created by deleting that node from T
For the sample tree like :
2 6 1 2 1 4 4 5 3 7 3 1
Explanation is :
Deleting node 4 yields two trees whose member nodes are {5} and {1,2,3,6,7}. The
larger of these two trees has five nodes, thus the balance of node 4 is five. Deleting node
1 yields a forest of three trees of equal size: {2,6}, {3,7}, and {4,5}. Each of these trees
has two nodes, so the balance of node 1 is two.
What kind of algorithm can you offer to this problem?
Thanks
I am going to assume that you have had a looong look at this problem: reading the solution does not help, you only get better at solving these problems by solving them yourself.
So one thing to observe is, the input is a tree. That means that each edge joins 2 smaller trees together. Removing an edge yields 2 disconnected trees (a forest of 2 trees).
So, if you calculate the size of the tree on one side of the edge, and then on the other, you should be able to look at a node's edges and ask "What is the size of the tree on the other side of this edge?"
You can calculate the sizes of trees using dynamic programming - your recurrence state is "What edge am I on? What side of the edge am I on?" and it calculates the size of the tree "hung" at that node. That is the crux of the problem.
Having that data, it is sufficient to iterate through all the nodes, look at their edges and ask "What is the size of the tree on the other side of this edge?" From there, you just pick the minimum.
Hope that helps.
You basically want to check 3 things for every node:
The size of its left subtree.
The size of its right subtree.
The size of the rest of the tree. (size of tree - left - right)
You can use this algorithm and expand it to any kind of tree (different number of subnodes).
Go over the tree in an in-order sequence.
Do this recursively:
Every time you just before you back up from a node to the "father" node, you need to add 1+size of node's total sub trees, to the "father" node.
Then store a value, let's call it maxTree, in the node that holds the maximum between all its subtrees, and the (sum of all subtrees)-(size of tree).
This way you can calculate all the subtree sizes in O(N).
While traversing the tree, you can hold a variable that hold the minimum value found so far.

Resources