how to find root of a directed acyclic graph - algorithm

I need a method to find the root of a directed acyclic graph. I am using a boolean adjacency matrix to represent the graph in Java, so please suggest an approach. The graph is unweighted.

Just find the nodes whose indegree is 0. For the algorithm below to work, we assume that none of the nodes in the graph is isolated.
int[] indegree = new int[n]; // Java zero-initializes the array
for (int i = 0; i < n; ++i) {
    for (int j = 0; j < n; ++j) {
        if (graph[i][j]) { // assuming an edge from i to j
            indegree[j]++;
        }
    }
}
for (int i = 0; i < n; ++i) {
    if (indegree[i] == 0) roots.add(i); // roots is any collection, e.g. a List<Integer>
}

You are looking for nodes with no in-edges. If the adjacency matrix is encoded so that entry (i,j) contains a 1 if and only if there is an edge from i to j, then for node K to be a root, there must be no edges of the form i->K, and therefore no 1's in entries of the form (i, K). So you are looking for columns K that contain only zeros. Each such column corresponds to a root.
In pseudocode:
roots = {}
for k in 1 to N
    for i in 1 to N
        if adjacencies[i, k] > 0
            continue with next k value
    add k to roots
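The column scan above can be sketched in Python as follows. This is only an illustration: the function name `find_roots` and the sample matrix are made up for the example, and the matrix is assumed to be a square list of lists where `adj[i][k]` is truthy exactly when there is an edge from i to k.

```python
def find_roots(adj):
    """Return the indices of all columns of adj that contain no truthy entry."""
    n = len(adj)
    roots = []
    for k in range(n):
        # Column k lists the edges i -> k; a root has none of them.
        if not any(adj[i][k] for i in range(n)):
            roots.append(k)
    return roots


# Hypothetical 4-node DAG with edges 0 -> 1, 0 -> 2 and 3 -> 2.
example = [
    [False, True, True, False],
    [False, False, False, False],
    [False, False, False, False],
    [False, False, True, False],
]
print(find_roots(example))  # nodes 0 and 3 have no incoming edges
```

With a matrix representation every entry has to be inspected once, so the scan costs O(N^2) regardless of how many edges exist.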

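If the incoming edges of each vertex can be listed directly (for example, from predecessor lists), it can be done in linear time. It is basically a DFS over the graph with all the edges reversed. With only an adjacency matrix, finding the parents of a vertex means scanning a whole column, so the search costs O(V^2) overall.
Pick any vertex in the given graph G.
Check whether the vertex has in-degree equal to 0. If it does, we have found a vertex that is a root of the graph.
If not, mark the current vertex v as visited and repeat the same process over all the unvisited parents of v.
Starting from a single vertex, this finds every root that can reach it. To collect all roots of the DAG, repeat the search from each vertex that is still unvisited.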

Related

How to find the sum weight of all edges reachable from some vertex?

Consider a directed graph with no cycles. I need to find for each u the total weight of edges reachable from u (by reachable we mean there's a path from u to some v).
What I thought about is running a topological sort and then processing the nodes from last to first (possible by reversing the direction of the edges), evaluating f[v] = f[u] + w(u,v) along the way.
But there's a problem: for this graph, we will count f[d] twice. How can I overcome this?
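You can use either BFS or DFS to achieve this.
total = 0
dfs(node):
    if visited[node] == 1:
        return
    visited[node] = 1
    for all u connected to node:
        total += weight[node][u]
        dfs(u)
Note that the visited check happens after total += weight[node][u], so every edge leaving a reachable node is counted exactly once, even when its endpoint was already visited. Reset total and visited before starting from each new vertex.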
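As a rough sketch of the traversal above, here is an iterative Python version that uses an explicit stack instead of recursion. The graph layout (a dict mapping each node to a dict of `{neighbor: weight}`), the function name, and the sample graph are assumptions made for the example.

```python
def reachable_edge_weight(graph, start):
    """Sum the weights of all edges reachable from start.

    graph maps each node to a dict {neighbor: weight} (an assumed layout).
    """
    visited = {start}
    stack = [start]
    total = 0
    while stack:
        node = stack.pop()
        for neighbor, weight in graph.get(node, {}).items():
            # Every node is expanded once, so each outgoing edge is added once.
            total += weight
            if neighbor not in visited:
                visited.add(neighbor)
                stack.append(neighbor)
    return total


# Hypothetical diamond: a->b (1), a->c (2), b->d (3), c->d (4), d->e (5).
diamond = {"a": {"b": 1, "c": 2}, "b": {"d": 3}, "c": {"d": 4}, "d": {"e": 5}, "e": {}}
totals = {u: reachable_edge_weight(diamond, u) for u in diamond}
print(totals)
```

Each call costs O(V + E), so computing the value for every start vertex this way costs O(V * (V + E)) in total. The edge d->e is counted once for a even though d is reachable through both b and c.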
You can use a bottom-up approach. First calculate the outdegree of each vertex. The vertices with outdegree 0 have F[u] = 0. Add all such vertices to a queue Q.
You also need to store the transpose of the graph; call it T.
while (!Q.empty()) {
    u = Q.front();
    Q.pop();
    for all edges E originating from u in T {
        // E is (u,v) in T, i.e. the edge (v,u) with weight w in the original graph
        F[v] += F[u] + w;
        // now remove u from the graph
        outdegree[v]--;
        if (outdegree[v] == 0)
            Q.push(v);
    }
}
Note that this sums along every path, so an edge that can be reached from v through several different paths is added once per path.

How to update MST from the old MST if one edge is deleted

I am studying algorithms, and I have seen an exercise like this.
I can solve this problem in exponential time, but I don't know how to do it in linear time, O(E+V).
I would appreciate any help.
Let G be the graph where the minimum spanning tree T is embedded; let A and B be the two trees remaining after (u,v) is removed from T.
Premise P: Select a minimum weight edge (x,y) from G - (u,v) that reconnects A and B. Then T' = A + B + (x,y) is an MST of G - (u,v).
Proof of P: It's obvious that T' is a tree. Suppose it were not minimum. Then there would be an MST - call it M - of smaller weight. And either M contains (x,y), or it doesn't.
If M contains (x,y), then it must have the form A' + B' + (x,y) where A' and B' are minimum weight trees that span the same vertices as A and B. These can't have weight smaller than A and B, otherwise T would not have been an MST. So M is not smaller than T' after all, a contradiction; M can't exist.
If M does not contain (x,y), then there is some other path Q from x to y in M. One or more edges of Q pass from a vertex in A to another in B. Call such an edge c. Now, c has weight at least that of (x,y), else we would have picked it instead of (x,y) to form T'. Note Q+(x,y) is a cycle. Consequently, M - c + (x,y) is also a spanning tree. If c were of greater weight than (x,y) then this new tree would have smaller weight than M. This contradicts the assumption that M is an MST. Again M can't exist.
Since in either case M can't exist, T' must be an MST. QED
Algorithm
Traverse A and color all its vertices Red. Similarly label B's vertices Blue. Now traverse the edge list of G - (u,v) to find a minimum weight edge connecting a Red vertex with a Blue. The new MST is this edge plus A and B.
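A minimal Python sketch of this coloring approach follows. The function name `repair_mst`, the `(u, v, weight)` tuple layout, and the sample graph are assumptions for illustration; the sketch also assumes the removed edge appears in the edge lists in exactly the same tuple form.

```python
from collections import deque


def repair_mst(n, edges, tree_edges, removed):
    """Return the MST edge list after `removed` is deleted from the graph.

    n: number of vertices, labeled 0..n-1
    edges: all graph edges as (u, v, weight) tuples
    tree_edges: the edges of the old MST, in the same tuple form
    removed: the deleted edge; it must be one of tree_edges
    Returns None if no edge reconnects the two halves.
    """
    remaining = [e for e in tree_edges if e != removed]
    adj = [[] for _ in range(n)]
    for u, v, _ in remaining:
        adj[u].append(v)
        adj[v].append(u)

    # Color one side of the cut "red" with a BFS from an endpoint of the removed edge.
    red = [False] * n
    red[removed[0]] = True
    queue = deque([removed[0]])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if not red[v]:
                red[v] = True
                queue.append(v)

    # Scan the edge list once for the lightest edge joining a red and a blue vertex.
    best = None
    for e in edges:
        u, v, w = e
        if e != removed and red[u] != red[v] and (best is None or w < best[2]):
            best = e
    return None if best is None else remaining + [best]


# Hypothetical graph; the old MST is the path 0-1-2-3.
graph_edges = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (0, 3, 4), (0, 2, 5)]
old_mst = [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
new_mst = repair_mst(4, graph_edges, old_mst, (1, 2, 2))
print(new_mst)
```

The BFS touches only the V-2 remaining tree edges and the scan touches each graph edge once, which is where the O(V + E) bound comes from.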
When you remove one of the edges, the MST breaks into two parts; let's call them a and b. Iterate over all vertices in part a and look at all their adjacent edges (other than the removed one). Every edge that links part a to part b is a candidate; the candidate with the minimum weight, together with a and b, gives the new MST.
Pseudocode:
best = none
for (all vertices in part a) {
    u = current vertex;
    for (all adjacent edges of u) {
        v = adjacent vertex of u for the current edge
        if (u and v belong to different parts of the MST and the edge is lighter than best)
            best = current edge;
    }
}
Complexity is O(V + E).
Note: you can keep a simple array to check whether a vertex is in part a or part b of the MST.
Also note that in order to get the O(V + E) complexity, you need an adjacency list representation of the graph.
Let's say G' is what remains of the MST after removing the edge. G' consists of two connected components.
Let each node in the graph have a componentID. Set the componentID for all the nodes based on which component they belong to. This can be done with a simple BFS on G', for example. This is an O(V) operation, as G' has only V nodes and V-2 edges.
Once all the nodes have been flagged, iterate over all unused edges and find the one with the least weight that connects the two components (the componentIDs of its two nodes will be different). This is an O(E) operation.
Thus the total runtime is O(V+E).

Algorithm - Finding the number of pairs with diameter distance in a tree?

I have a non-rooted bidirectional unweighted non-binary tree. I know how to find the diameter of the tree, the greatest distance between any pair of points in the tree, but I'm interested in finding the number of pairs with that max distance. Is there an algorithm to find the number of pairs with diameter distance in better than O(V^2) time, where V is the number of nodes?
Thank you!
Yes, there's a linear-time algorithm that operates bottom-up and resembles the algorithm for just finding the diameter. Here's the signature in Java-ish pseudocode; I'll leave the algorithm itself as an exercise.
class Node {
    Collection<Node> children;
}
class Result {
    int height;          // height of the tree
    int num_deep_nodes;  // number of nodes whose depth equals the height
    int diameter;        // length of the longest path inside the tree
    int num_long_paths;  // number of pairs of nodes at distance |diameter|
}
Result computeNumberOfLongPaths(Node root); // recursive
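Since the answer leaves the algorithm as an exercise, here is one possible way to fill it in, written as a Python sketch rather than the answerer's own solution. It assumes the tree is given as adjacency lists over nodes 0..n-1, roots it at node 0, and processes nodes bottom-up without recursion. The arrays `height` and `deep` play the role of `height` and `num_deep_nodes` in the signature above; the function name is hypothetical.

```python
def count_diameter_pairs(adj):
    """Return (diameter, number of unordered node pairs at that distance).

    adj is an adjacency list of an unrooted tree with nodes 0..n-1.
    """
    n = len(adj)
    parent = [-1] * n
    seen = [False] * n
    seen[0] = True
    order = []          # preorder: every node appears before its descendants
    stack = [0]
    while stack:
        v = stack.pop()
        order.append(v)
        for u in adj[v]:
            if not seen[u]:
                seen[u] = True
                parent[u] = v
                stack.append(u)

    height = [0] * n    # longest downward path from each node, in edges
    deep = [1] * n      # number of deepest nodes below (or at) each node
    diameter, pairs = 0, 0
    for v in reversed(order):      # children are finished before their parent
        p = parent[v]
        if p == -1:
            continue
        # Longest paths joining v's subtree with what p has collected so far.
        length = height[p] + height[v] + 1
        count = deep[p] * deep[v]
        if length > diameter:
            diameter, pairs = length, count
        elif length == diameter:
            pairs += count
        # Merge v's subtree into p's summary.
        if height[v] + 1 > height[p]:
            height[p], deep[p] = height[v] + 1, deep[v]
        elif height[v] + 1 == height[p]:
            deep[p] += deep[v]
    return diameter, pairs


# Hypothetical star with center 0 and three leaves: every leaf pair is at distance 2.
print(count_diameter_pairs([[1, 2, 3], [0], [0], [0]]))
```

Each pair of nodes is considered at their lowest common ancestor, and each edge is handled once, so the whole computation is linear in the number of nodes.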
Yes, there is an algorithm with O(V+E) time. It is simply a modified version of finding the diameter.
As we know, we can find the diameter using two calls of BFS: make the first call on any node and remember the last node discovered, u; then run a second call, BFS(u), and remember the last node discovered, say v. The distance between u and v gives us the diameter.
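The two-BFS diameter computation described here can be sketched in Python like this. The helper names and the adjacency-list layout (nodes 0..n-1) are assumptions for the example, and the sketch covers only the diameter itself, not the pair counting that follows.

```python
from collections import deque


def bfs_farthest(adj, start):
    """Return (a farthest node from start, its distance) in an unweighted tree."""
    dist = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        node = queue.popleft()
        last = node  # BFS dequeues nodes in nondecreasing distance order
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return last, dist[last]


def tree_diameter(adj):
    """Diameter in edges: BFS from any node, then BFS again from the node found."""
    u, _ = bfs_farthest(adj, 0)
    _, diameter = bfs_farthest(adj, u)
    return diameter


# Hypothetical path 0-1-2-3.
print(tree_diameter([[1], [0, 2], [1, 3], [2]]))
```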
Now for the number of pairs with that max distance.
1. Before invoking the first BFS, initialize an array distance of length |V| and set distance[s] = 0, where s is the starting vertex of the first BFS call.
2. In the BFS, modify the while loop as:
while (Q is not empty)
{
    e = dequeue(Q);
    for all vertices w adjacent to e
    {
        if (w is not visited)
        {
            enqueue(w)
            mark w as visited
            distance[w] = distance[e] + 1
            parent[w] = e
        }
    }
}
3. As I said, remember the last node visited; say u is that node. Now count the number of vertices that are at the same level as vertex u. mark is an array of length n with all its values initialized to 0; 0 means the vertex has not been counted yet.
n1 = 0
for i = 1 to number of vertices
{
    if (distance[i] == distance[u] && mark[i] == 0)
    {
        n1++
        mark[i] = 1 /* vertex counted */
    }
}
n1 gives the number of vertices that are at the same level as vertex u. All vertices that have mark[i] = 1 are marked and will not be counted again.
4. Similarly, before performing the second BFS on u, initialize another array distance2 of length |V| and set distance2[u] = 0.
5. Run BFS(u) and again get the last node discovered, say v.
6. Repeat the 3rd step, this time on the distance2 array and with a different counter, say n2 = 0, the condition being
if (distance2[i] == distance2[v] && mark[i] == 0)
    n2++
else if (distance2[i] == distance2[v] && mark[i] == 1)
    set_common = 1
7. set_common is a global variable that is set when there is a set of vertices such that the path between any two of them is a diameter, and the first BFS did not mark all of those vertices but did mark at least one of them; that is why mark[i] == 1.
Suppose the first BFS did mark all such vertices in the first call. Then n2 would be 0 and set_common would not be set, and there is no need to set it either. But this situation is the same as above.
In either of these cases the number of pairs giving the diameter is
(n1+n2)C2 - X = (n1+n2)! / (2! (n1+n2-2)!) - X
I will elaborate on what X is. Otherwise the number of pairs is n1*n2, which is the case when two disjoint sets of vertices give the diameter.
So the condition used is
if (n2 == 0 || set_common == 1)
    number_of_pairs = (n1+n2)C2 - X
else
    number_of_pairs = n1*n2
Now about X. It can happen that marked vertices have a common parent. In that case we must not count their combinations. So before using the above condition, run the following algorithm:
X = 0 /* initialize */
for (i = 1 to number of vertices)
{
    s = 0, p = -1
    if (mark[i] == 0)
        continue
    else
    {
        s++
        if (p == -1)
            p = parent[i]
        while ((i+1) <= number_of_vertices && p == parent[i+1])
        { s++; i++ }
    }
    if (s > 1)
        X = X + sC2
}
Proof of correctness
It is straightforward. Since BFS traverses a tree level by level, n1 gives the number of vertices at the level of u and n2 gives the number of vertices at the level of v. Since the distance between u and v equals the diameter, the distance between any vertex on the level of u and any vertex on the level of v is equal to the diameter.
The time taken is 2|V| + 2*time_of_BFS = O(V+E).

Graph algorithm to calculate node degree

I'm trying to implement the topological-sort algorithm for a DAG. (http://en.wikipedia.org/wiki/Topological_sorting)
The first step of this simple algorithm is finding the nodes with zero in-degree, and I cannot find any way to do this without a quadratic algorithm.
My graph implementation is a simple adjacency list, and the basic process is to loop through every node and, for every node, go through each adjacency list, so the complexity will be O(|V| * |V|).
The complexity of topological sort is O(|V| + |E|), so I think there must be a way to calculate the degree of all nodes in linear time.
You can maintain the indegree of all vertices while removing nodes from the graph, and maintain a linked list of zero-indegree nodes:
indeg[x] = indegree of node x (compute this by going through the adjacency lists)
zero = [ x in nodes | indeg[x] == 0 ]
result = []
while zero != []:
    x = zero.pop()
    result.push(x)
    for y in adj(x):
        indeg[y]--
        if indeg[y] == 0:
            zero.push(y)
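The pseudocode above is Kahn's algorithm. A runnable Python sketch might look like the following, assuming the graph is a dict in which every node appears as a key mapped to a list of its successors; the function name and the cycle check at the end are additions for the example.

```python
from collections import deque


def topological_sort(adj):
    """Kahn's algorithm. adj maps every node to a list of its successors."""
    # One pass over all adjacency lists gives every in-degree in O(V + E).
    indeg = {x: 0 for x in adj}
    for x in adj:
        for y in adj[x]:
            indeg[y] += 1

    zero = deque(x for x in adj if indeg[x] == 0)
    result = []
    while zero:
        x = zero.popleft()
        result.append(x)
        for y in adj[x]:
            indeg[y] -= 1
            if indeg[y] == 0:
                zero.append(y)

    if len(result) != len(adj):
        raise ValueError("graph has a cycle, so no topological order exists")
    return result


# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d.
print(topological_sort({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}))
```

The in-degree pass visits each adjacency list once, so it costs O(|V| + |E|) rather than O(|V| * |V|).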
That said, topological sort using DFS is conceptually much simpler, IMHO:
result = []
visited = {}
dfs(x):
    if x in visited: return
    visited.insert(x)
    for y in adj(x):
        dfs(y)
    result.push(x)
for x in V: dfs(x)
reverse(result)
You can achieve it in O(|V| + |E|). Follow the steps below:
Create two lists, inDegree and outDegree, which maintain the counts of incoming and outgoing edges for each node; initialize them to 0.
Now traverse the given adjacency list: for each edge (u,v) in graph g, increment the outdegree count of u and the indegree count of v.
You can traverse the adjacency list in O(|V| + |E|), and you will then have the indegree and outdegree of every node.
The complexity that you mentioned for visiting adjacent nodes, O(n^2), is not quite correct: if you think carefully, you will notice that this is more like a BFS. You visit each node and each edge only once, so the complexity is O(m+n), where n is the number of nodes and m is the edge count.
You can also use DFS for topological sorting. You won't need an additional pass to calculate in-degree after processing each node.
http://www.geeksforgeeks.org/topological-sorting/

determine whether an undirected graph is a tree

I have written an algorithm to determine "whether an undirected graph is a tree"
Assumptions : graph G is represented as adjacency list, where we already know the number of vertices which is n
Is_graph_a_tree(G,1,n) /* using BFS */
{
    Q = {1} // is a queue
    An array M[1:n], such that for all i, M[i] = 0 /* to mark visited vertices */
    M[1] = 1
    edgecount = 0 // to determine the number of edges visited
    While ((Q is not empty) and (edgecount <= n-1))
    {
        i = dequeue(Q)
        for each edge (i,j) with M[j] == 0 and edgecount <= n-1
        {
            M[j] = 1
            Q = Q U {j}
            edgecount++
        }
    }
    If (edgecount != n-1)
        Print “G is not a tree”
    Else
    {
        If there exists i such that M[i] == 0
            Print “G is not a tree”
        Else
            Print “G is tree”
    }
}
Is it right?
Is the time complexity of this algorithm O(n)?
I think the counting of edges is not correct. You should also count edges (i,j) for which M[j]=1 but j is not the parent of i (so you would also need to keep the parent of each node).
It may be better to count the edges at the end, by summing the sizes of the adjacency lists and dividing by 2.
You want to do a depth-first search. A DFS of an undirected graph produces only tree edges and back edges. So you can just copy the DFS algorithm: if you find a back edge, the graph is not a tree. You also need to check that the search reached every vertex, since a tree must be connected.
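Combining the suggestions above, one way to sketch the check in Python is to count the edges from the adjacency list sizes and then verify connectivity with a single traversal. This assumes a simple graph (no self-loops or parallel edges) with nodes 0..n-1; the function name and the choice to treat an empty graph as "not a tree" are conventions picked for the example.

```python
def is_tree(n, adj):
    """Return True if the undirected graph (nodes 0..n-1, adjacency lists) is a tree."""
    if n == 0:
        return False  # convention chosen for this sketch

    # Each undirected edge appears in two adjacency lists.
    edge_count = sum(len(neighbors) for neighbors in adj) // 2
    if edge_count != n - 1:
        return False

    # With exactly n-1 edges, the graph is a tree if and only if it is connected.
    seen = [False] * n
    seen[0] = True
    stack = [0]
    reached = 1
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if not seen[u]:
                seen[u] = True
                reached += 1
                stack.append(u)
    return reached == n


# Hypothetical examples: a path on 3 nodes, and a triangle plus an isolated node.
print(is_tree(3, [[1], [0, 2], [1]]))
print(is_tree(4, [[1, 2], [0, 2], [0, 1], []]))
```

The second example has exactly n-1 = 3 edges but is disconnected, which is why the edge count alone is not enough.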
