How to find if a graph is a tree and its center - algorithm

Is there an algorithm (or a sequence of algorithms) to find, given a generic graph structure G=(V,E) with no notion of parent node, leaf node and child node but only neighboordhood relations:
1) If G it is a tree or not (is it sufficient to check |V| = |E|+1?)
2) If the graph is actually a tree, the leaves and the center of it? (i.e the node of the graph which minimizes the tree depth)
Thanks

If the "center" of the tree is defined as "the node of the graph which minimizes the tree depth", there's an easier way to find it than finding the diameter.
d[] = degrees of all nodes
que = { leaves, i.e i that d[i]==1}
while len(que) > 1:
i=que.pop_front
d[i]--
for j in neighbors[i]:
if d[j] > 0:
d[j]--
if d[j] == 1 :
que.push_back(j)
and the last one left in que is the center.
you can prove this by thinking about the diameter path.
to simpify , we assume the length of the diameter path is odd, so that the middle node of the path is unique, let's call that node M,
we can see that:
M will not be pushed to the back of que until every node else on
diameter path has been pushed into que
if there's another node N
that is pushed after M has already been pushed into que, then N must
be on a longer path than the diameter path. Therefore N can't exist. M must be the last
node pushed (and left) in que

For (1), all you have to do is verify |V| = |E| + 1 and that the graph is fully connected.
For (2), you need to find a maximal diameter then pick a node in the middle of the diameter path. I vaguely remember that there's an easy way to do this for trees.
You start with an arbitrary node a, then find a node at maximal distance from a, call it b. Then you search from b and find a node at maximal distance from b, call it c. The path from b to c is a maximal diameter.
There are other ways to do it that might be more convenient for you, like this one. Check Google too.

No, it is not enough - a tree is a CONNECTED graph with n-1 edges. There could be n-1 edges in a not connected graph - and it won't be a tree.
You can run a BFS to find if the graph is connected and then count the number of edges, that will give you enough information if the graph is a tree
The leaves are the nodes v with degree of the nodes denoted by d(v) given by the equation d(v) = 1 (which have only one connected vertex to each)
(1) The answer assumes non-directed graphs
(2) In here, n denotes the number of vertices.

Related

Linear-time algorithm for number of distinct paths from each vertex in a directed acyclic graph

I am working on the following past paper question for an algorithms module:
Let G = (V, E) be a simple directed acyclic graph (DAG).
For a pair of vertices v, u in V, we say v is reachable from u if there is a (directed) path from u to v in G.
(We assume that every vertex is reachable from itself.)
For any vertex v in V, let R(v) be the reachability number of vertex v, which is the number of vertices u in V that are reachable from v.
Design an algorithm which, for a given DAG, G = (V, E), computes the values of R(v) for all vertices v in V.
Provide the analysis of your algorithm (i.e., correctness and running time
analysis).
(Optimally, one should try to design an algorithm running in
O(n + m) time.)
So, far I have the following thoughts:
The following algorithm for finding a topological sort of a DAG might be useful:
TopologicalSort(G)
1. Run DFS on G and compute a DFS-numbering, N // A DFS-numbering is a numbering (starting from 1) of the vertices of G, representing the point at which the DFS-call on a given vertex v finishes.
2. Let the topological sort be the function a(v) = n - N[v] + 1 // n is the number of nodes in G and N[v] is the DFS-number of v.
My second thought is that dynamic programming might be a useful approach, too.
However, I am currently not sure how to combine these two ideas into a solution.
I would appreciate any hints!
EDIT: Unfortunately the approach below is not correct in general. It may count multiple times the nodes that can be reached via multiple paths.
The ideas below are valid if the DAG is a polytree, since this guarantees that there is at most one path between any two nodes.
You can use the following steps:
find all nodes with 0 in-degree (i.e. no incoming edges).
This can be done in O(n + m), e.g. by looping through all edges
and marking those nodes that are the end of any edge. The nodes with 0
in-degree are those which have not been marked.
Start a DFS from each node with 0 in-degree.
After the DFS call for a node ends, we want to have computed for that
node the information of its reachability.
In order to achieve this, we need to add the reachability of the
successors of this node. Some of these values might have already been
computed (if the successor was already visited by DFS), therefore this
is a dynamic programming solution.
The following pseudocode describes the DFS code:
function DFS(node) {
visited[node] = true;
reachability[node] = 1;
for each successor of node {
if (!visited[successor]) {
DFS(successor);
}
reachability[node] += reachability[successor];
}
}
After calling this for all nodes with 0 in-degree, the reachability
array will contain the reachability for all nodes in the graph.
The overall complexity is O(n + m).
I'd suggest using a Breadth First Search approach.
For every node, add all the nodes that are connected to the queue. In addition to that, maintain a separate array for calculating the reachability.
For example, if a A->B, then
1.) Mark A as traversed
2.) B is added to the queue
3.) arr[B]+=1
This way, we can get R(v) for all vertices in O(|V| + |E|) time through arr[].

How to update MST from the old MST if one edge is deleted

I am studying algorithms, and I have seen an exercise like this
I can overcome this problem with exponential time but. I don't know how to prove this linear time O(E+V)
I will appreciate any help.
Let G be the graph where the minimum spanning tree T is embedded; let A and B be the two trees remaining after (u,v) is removed from T.
Premise P: Select minimum weight edge (x,y) from G - (u,v) that reconnects A and B. Then T' = A + B + (x,y) is a MST of G - (u,v).
Proof of P: It's obvious that T' is a tree. Suppose it were not minimum. Then there would be a MST - call it M - of smaller weight. And either M contains (x,y), or it doesn't.
If M contains (x,y), then it must have the form A' + B' + (x,y) where A' and B' are minimum weight trees that span the same vertices as A and B. These can't have weight smaller than A and B, otherwise T would not have been an MST. So M is not smaller than T' after all, a contradiction; M can't exist.
If M does not contain (x,y), then there is some other path P from x to y in M. One or more edges of P pass from a vertex in A to another in B. Call such an edge c. Now, c has weight at least that of (x,y), else we would have picked it instead of (x,y) to form T'. Note P+(x,y) is a cycle. Consequently, M - c + (x,y) is also a spanning tree. If c were of greater weight than (x,y) then this new tree would have smaller weight than M. This contradicts the assumption that M is a MST. Again M can't exist.
Since in either case, M can't exist, T' must be a MST. QED
Algorithm
Traverse A and color all its vertices Red. Similarly label B's vertices Blue. Now traverse the edge list of G - (u,v) to find a minimum weight edge connecting a Red vertex with a Blue. The new MST is this edge plus A and B.
When you remove one of the edges then the MST breaks into two parts, lets call them a and b, so what you can do is iterate over all vertices from the part a and look for all adjacent edges, if any of the edges forms a link between the part a and part b you have found the new MST.
Pseudocode :
for(all vertices in part a){
u = current vertex;
for(all adjacent edges of u){
v = adjacent vertex of u for the current edge
if(u and v belong to different part of the MST) found new MST;
}
}
Complexity is O(V + E)
Note : You can keep a simple array to check if vertex is in part a of the MST or part b.
Also note that in order to get the O(V + E) complexity, you need to have an adjacency list representation of the graph.
Let's say you have graph G' after removing the edge. G' consists have two connected components.
Let each node in the graph have a componentID. Set the componentID for all the nodes based on which component they belong to. This can be done with a simple BFS for example on G'. This is an O(V) operation as G' only has V nodes and V-2 edges.
Once all the nodes have been flagged, iterate over all unused edges and find the one with the least weight that connects the two components (componentIDs of the two nodes will be different). This is an O(E) operation.
Thus the total runtime is O(V+E).

Minimum Spanning tree different from another

Assume we are given
an undirected graph g where every node i,1 <= i < n is connected to all j,i < j <=n
and a source s.
We want to find the total costs (defined as the sum of all edges' weights) of the cheapest minimum spanning tree that differs from the minimum distance tree of s (i.e. from the MST obtained by running prim/dijkstra on s) by at least one edge.
What would be the best way to tackle this? Because currently, I can only think of some kind of fixed-point iteration
run dijkstra on (g,s) to obtain reference graph r that we need to differ from
costs := sum(edge_weights_of(r))
change := 0
for each vertex u in r, run a bfs and note for each reached vertex v the longest edge on the path from u to v.
iterate through all edges e = (a,b) in g: and find e'=(a',b') that is NOT in r and minimizes newchange := weight(e') - weight(longest_edge(a',b'))
if(first_time_here OR newchange < 0) then change += newchange
if(newchange < 0) goto 4
result := costs + change
That seems to waste a lot of time... It relies on the fact that adding an edge to a spanning tree creates a cycle from which we can remove the longest edge.
I also thought about using Kruskal to get an overall minimum spanning tree and only using the above algorithm to replace a single edge when the trees from both, prim and kruskal, happen to be the same, but that doesn't seem to work as the result would be highly dependent on the edges selected during a run of kruskal.
Any suggestions/hints?
You can do it using Prim`s algorithm
Prim's algorithm:
let T be a single vertex x
while (T has fewer than n vertices)
{
1.find the smallest edge connecting T to G-T
2.add it to T
}
Now lets modify it.
Let you have one minimum spanning tree. Say Tree(E,V)
Using this algorithm
Prim's algorithm (Modified):
let T be a single vertex
let isOther = false
while (T has fewer than n vertices)
{
1.find the smallest edge (say e) connecting T to G-T
2.If more than one edge is found, {
check which one you have in E(Tree)
choose one different from this
add it to T
set isOther = true
}
else if one vertex is found {
add it to T
If E(Tree) doesn`t contain this edge, set isOther = true
Else don`t touch isOther ( keep value ).
}
}
If isOther = true, it means you have found another tree different from Tree(E,V) and it is T,
Else graph have single minimum spanning tree

Algorithm - Finding the number of pairs with diameter distance in a tree?

I have a non-rooted bidirectional unweighted non-binary tree. I know how to find the diameter of the tree, the greatest distance between any pair of points in the tree, but I'm interested in finding the number of pairs with that max distance. Is there an algorithm to find the number of pairs with diameter distance in better than O(V^2) time, where V is the number of nodes?
Thank you!
Yes, there's a linear-time algorithm that operates bottom-up and resembles the algorithm for just finding the diameter. Here's the signature in Java-ish pseudocode; I'll leave the algorithm itself as an exercise.
class Node {
Collection<Node> children;
}
class Result {
int height; // height of the tree
int num_deep_nodes; // number of nodes whose depth equals the height
int diameter; // length of the longest path inside the tree
int num_long_paths; // number of pairs of nodes at distance |diameter|
}
Result computeNumberOfLongPaths(Node root); // recursive
Yes there is an algorithm with O(V+E) time.It is simply a modified version of finding the diameter.
As we know we can find the diameter using two calls of BFS by first making first call on any node and then remembering the last node discovered u and running a second call BFS(u),and remembering the last node discovered ,say v.The distance between u and v gives us the diameter.
Coming to number of pairs with that max distance.
1.Before invoking the first BFS,initialize an array distance of length |V| and distance[s]=0.s is the starting vertex for first BFS call on any node.
2.In the BFS,modify the while loop as:
while(Q is not empty)
{
e=deque(Q);
for all vertices w adjacent to e
{
if(w is not visited)
{
enque(w)
mark w as visited
distance[w]=distance[e]+1
parent[w]=e
}
}
}
3.Like I said,remembering the last node visited,say u is that node. Now counting the number of vertices that are at the same level as vertex u. mark is an array of length n,which has all its value initialized to 0,0 implies that vertex not counted initially.
n1=0
for i = 1 to number of vertices
{
if(distance[i]==distance[u]&&mark[i]==0)
{
n1++
mark[i]=1/*vertex counted*/
}
}
n1 gives the number of vertices,that are at the same level as vertex u,now for all vertices that have mark[i] = 1 ,are marked and they will not be counted again.
4.Similarly before performing second BFS on u,initialize another array distance2 of length |V| and distance2[u]=0.
5.Run BFS(u) and again get the last node discovered say v
6.Repeat 3rd step,this time on distance2 array and taking a different variable say n2=0 and the condition being
if(distance2[i]==distance2[v]&&mark[i]==0)
n2++
else if(distance2[i]==distance2[v]&&mark[i]==1)
set_common=1
7.set_common is a global variable that is set when there are a set of vertices such that between any two vertices the path is that of a diameter and the first bfs did not mark all those vertices but did mark at least one of those that is why mark[i]==1.
Suppose that first bfs did mark all such vertices in first call then n2 would be = 0 and set_common would not be set and there is no need also.But this situation is same as above
In any case the number of pairs giving diameter are:=
(n+n2)combination2 - X=(n1+n2)!/((2!)((n1+n2-2)!)) - X
I will elaborate on what X is.Else the number of pairs are = n1*n2,which is the case when 2 disjoint set of vertices are giving the diameter
So the Condition used is
if(n2==0||set_common==1)
number_of_pairs=(n1+n2)C2-X
else n1*n2
Now talking about X.It can occur that the vertices that are marked may have common parent.In that case we must not count there combinations.So before using the above condition it is advised to run the following algorithm
X=0/*Initialize*/
for(i = 1 to number of vertices)
{
s = 0,p = -1
if(mark[i]==0)
continue
else
{
s++
if(p==-1)
p=parent[i]
while((i+1)<=number_of_vertices&& p==parent[i+1])
{s++;i++}
}
if(s>1)
X=X+sC2
}
Proof of correctness
It is very easy.Since BFS traverses a tree level by level,n1 will give you the number of vertices at the level of u and n2 gives you the number of vertices at the level of v and since the distance between u and v = diameter.Therefore, distance between any vertex on level of u and any vertex on level of v will be equal to diameter.
The time taken is 2(|V|) + 2*time_of_DFS=O(V+E).

Find two paths in a graph that are in distance of at least D(constant)

Instance of the problem:
Undirected and unweighted graph G=(V,E).
two source nodes a and b, two destination nodes c and d and a constant D(complete positive number).(we can assume that lambda(c,d),lambda(a,b)>D, when lambda(x,y) is the shortest path between x and y in G).
we have two peoples standing on the nodes a and b.
Definition:scheduler set-
A scheduler set is a set of orders such that in each step only one of the peoples make a move from his node v to one of v neighbors, when the starting position of them is in the nodes a,b and the ending position is in the nodes c,d.A "scheduler set" is missing-disorders if in each step the distance between the two peoples is > D.
I need to find an algorithm that decides whether there is a "missing-disorders scheduler set" or not.
any suggestions?
One simple solution would be to first solve all-pairs shortest paths using n breadth-first searches from every node in O(n * (n + m)).
Then create the graph of valid node pairs (x,y) with lambda(x, y) > D, with edges indicating the possible moves. There is an edge {(v,w), (x,y)} if v = x and there is an edge {w, y} in the original graph or if w = y and there is an edge {v, x} in the original graph. This new graph has O(n^2) nodes and O(nm) edges.
Now you just need to check whether (c, d) is reachable from (a, b) in the new graph. This can be achieved using DFS or BFS.
The total runtime be O(n * (n + m)).

Resources