Counting incoming edges in a directed acyclic graph - algorithm

Given a directed acyclic graph (DAG), is there an algorithm that runs in linear time that counts the in-degree of each vertex (and the source of that edge), given that we know a root node (a node that from it you can reach every other vertex)?

Let's assume your nodes are numbered from 1 to n. There's a simple solution: Create an array D of size n, with values initialized to 0. Then walk through all edges (v, w) and increment D[w] by one. In the end D[w] will be the in-degree of vertex w.

This may be the answer to your problem:
public List<Integer> findIndegree(int n, List<List<Integer>> edges)
{
List<Integer>as = new ArrayList<>();
int[]in_deg = new int[n];
for(int u=0;u<n;u++)
{
for(int x:edges.get(u))
in_deg[x]++;
}}
Here the array in_deg is storing the degree of the variables.

Related

Write a program to solve the topological sorting problem by using Depth First Search (DFS) algorithm

I want to use topological sorting using Depth First Search (DFS) for the given problem (the directed graph attached below).
click here to see the image.
Could you please help by writing the appropriate code for the given problem using any programming language?
To recap: an ordering of the vertices of a graph is a topological sorting if all edges in the graph lead from higher-numbered vertices to lower-numbered vertices (or the other way around).
This means, that if, having a vertex V, you'd be able to somehow visit all vertices accessible from V and give them numbers lower than the number you give to V, then all edges going out of V will lead to vertices numbered lower than V, so they will conform to the definition of a topological sorting.
It is now our goal to achieve the above for all vertices.
Recall that a DFS could be useful here, because a DFS visits all vertices accessible from a given vertex.
We must run the DFS for all vertices, but since the DFS is a recursive algorithm, it will run itself for all vertices accessible from other vertices for which the DFS is run. Thus, we must run the DFS for all vertices, numbering the vertices in the order in which we exit from the DFS function, because at the end of the DFS function we know that all vertices accessible from a given vertex have been visited.
Here's some sample code in C++:
#include <vector> // we'll need vector for an adjacency list graph representation
const maxN = 1'000'000 // maximum number of vertices
vector<int> graph[maxN]; // the adjacency list representation of the graph
bool visited[maxN]; // filled with the value false by default
int postOrder[maxN]; // array with the so-called post-order ordering of the vertices
int counter = 1;
void DFS(int v) {
visited[v] = true; // mark vertex as visited
for (int neighbour : graph[v]) { // for each neighbour of v
if (!visited[neighbour]) { // if the neighbour wasn't already visited
DFS(neighbour); // run the DFS for that neighbour
}
}
// Finally, after all the vertices accessible from v have been numbered,
// give v the next number, and increase the counter.
postOrder[v] = counter;
counter++;
}
void sortTopologically() {
// For each unvisited vertex, run the DFS starting in that vertex.
for (int v = 0; v < maxN; v++) {
if (!visited[v]) {
DFS(v);
}
}
}
This is called a post-order ordering and is a very common way to topologically sort a directed graph. You can then easily check if the topological sorting exists by verifying that all edges lead from a higher-numbered vertex to a lower-numbered vertex.
Good luck with your university course!

Algorithm - Finding the number of pairs with diameter distance in a tree?

I have a non-rooted bidirectional unweighted non-binary tree. I know how to find the diameter of the tree, the greatest distance between any pair of points in the tree, but I'm interested in finding the number of pairs with that max distance. Is there an algorithm to find the number of pairs with diameter distance in better than O(V^2) time, where V is the number of nodes?
Thank you!
Yes, there's a linear-time algorithm that operates bottom-up and resembles the algorithm for just finding the diameter. Here's the signature in Java-ish pseudocode; I'll leave the algorithm itself as an exercise.
class Node {
Collection<Node> children;
}
class Result {
int height; // height of the tree
int num_deep_nodes; // number of nodes whose depth equals the height
int diameter; // length of the longest path inside the tree
int num_long_paths; // number of pairs of nodes at distance |diameter|
}
Result computeNumberOfLongPaths(Node root); // recursive
Yes there is an algorithm with O(V+E) time.It is simply a modified version of finding the diameter.
As we know we can find the diameter using two calls of BFS by first making first call on any node and then remembering the last node discovered u and running a second call BFS(u),and remembering the last node discovered ,say v.The distance between u and v gives us the diameter.
Coming to number of pairs with that max distance.
1.Before invoking the first BFS,initialize an array distance of length |V| and distance[s]=0.s is the starting vertex for first BFS call on any node.
2.In the BFS,modify the while loop as:
while(Q is not empty)
{
e=deque(Q);
for all vertices w adjacent to e
{
if(w is not visited)
{
enque(w)
mark w as visited
distance[w]=distance[e]+1
parent[w]=e
}
}
}
3.Like I said,remembering the last node visited,say u is that node. Now counting the number of vertices that are at the same level as vertex u. mark is an array of length n,which has all its value initialized to 0,0 implies that vertex not counted initially.
n1=0
for i = 1 to number of vertices
{
if(distance[i]==distance[u]&&mark[i]==0)
{
n1++
mark[i]=1/*vertex counted*/
}
}
n1 gives the number of vertices,that are at the same level as vertex u,now for all vertices that have mark[i] = 1 ,are marked and they will not be counted again.
4.Similarly before performing second BFS on u,initialize another array distance2 of length |V| and distance2[u]=0.
5.Run BFS(u) and again get the last node discovered say v
6.Repeat 3rd step,this time on distance2 array and taking a different variable say n2=0 and the condition being
if(distance2[i]==distance2[v]&&mark[i]==0)
n2++
else if(distance2[i]==distance2[v]&&mark[i]==1)
set_common=1
7.set_common is a global variable that is set when there are a set of vertices such that between any two vertices the path is that of a diameter and the first bfs did not mark all those vertices but did mark at least one of those that is why mark[i]==1.
Suppose that first bfs did mark all such vertices in first call then n2 would be = 0 and set_common would not be set and there is no need also.But this situation is same as above
In any case the number of pairs giving diameter are:=
(n+n2)combination2 - X=(n1+n2)!/((2!)((n1+n2-2)!)) - X
I will elaborate on what X is.Else the number of pairs are = n1*n2,which is the case when 2 disjoint set of vertices are giving the diameter
So the Condition used is
if(n2==0||set_common==1)
number_of_pairs=(n1+n2)C2-X
else n1*n2
Now talking about X.It can occur that the vertices that are marked may have common parent.In that case we must not count there combinations.So before using the above condition it is advised to run the following algorithm
X=0/*Initialize*/
for(i = 1 to number of vertices)
{
s = 0,p = -1
if(mark[i]==0)
continue
else
{
s++
if(p==-1)
p=parent[i]
while((i+1)<=number_of_vertices&& p==parent[i+1])
{s++;i++}
}
if(s>1)
X=X+sC2
}
Proof of correctness
It is very easy.Since BFS traverses a tree level by level,n1 will give you the number of vertices at the level of u and n2 gives you the number of vertices at the level of v and since the distance between u and v = diameter.Therefore, distance between any vertex on level of u and any vertex on level of v will be equal to diameter.
The time taken is 2(|V|) + 2*time_of_DFS=O(V+E).

Shortest path in a complement graph algorithm

I had a test today (Data Structures course), and one of the questions was the following:
Given an undirected, non-weighted graph G=(V,E), you need to write an algorithm that for a given node s, returns the shortest path from s to all the nodes v' in the complement graph.
A Complement Graph G'=(E',V') contains an edge between any to nodes in G that don't share an edge, and only those.
The algorithm needs to run in O(V+E) (of the original graph).
I asked 50 different students, and not even one of them solved it correctly.
any Ideas?
Thanks a lot,
Barak.
The course staff have published the official answers to the test.
The answer is:
"The algorithm is based on a BFS with a few adaptations.
For each node in the graph we will add 2 fields - next and prev. Using these two fields we can maintain two Doubly-Linked lists of nodes: L1,L2.
At the beginning of every iteration of the algorithm, L1 has all the while nodes in the graph, and L2 is empty.
The BFS code (without the initialization) is:
At the ending of the loop at lines 3-5, L1 contains all the white nodes that aren't adjacent to u in G, or in other words, all the white nodes that are adjacent to u in the complement graph.
Therefore the runtime of the algorithm equals to the runtime of the original BFS on the complement graph.
The time is O(V+E) because lines 4-5 are executed at most 2E times, and lines 7-9 are executed at most V times (Every node can get out of L1 only once)."
Note: this is the original solution translated from Hebrew.
I Hope you find it helpful, and thank you all for helping me out,
Barak.
I would like to propose a different approach.
Initialization:-
Create a list of undiscovered edges. Let's call it undiscovered and initialize it with all nodes.
Then we will run a modified version of BFS
Create a Queue(Q) and add start node to it
Main algo
while undiscovered.size()>0 && Queue not Empty
curr_node = DEQUEUE(Queue)
create a list of all edges in the complement graph(Lets call it
complement_edges). This can be created by looping through all the
nodes in undiscovered and checking whether it is connected to
curr_node.
Then loop through each node in complement_edges perform 3
operation
Update distance if optimal
remove this node from undiscovered
ENQUEUE(Queue, this node)
Some things to note here,
If the initial graph is sparse, then the undiscovered will become empty very fast.
During implementation, use hashing to store edges in graph, this will make step 2 fast.
Heres the sample code:-
HashSet<Integer> adjList[]; // graph stored as adjancency list
public int[] calc_distance(int start){
HashSet<Integer> undiscovered = new HashSet<>();
for(int i=1;i<=N;i++){
undiscovered.add(i);
}
int[] dist = new int[N+1];
Arrays.fill(dist, Integer.MAX_VALUE/4);
Queue<Integer> q = new LinkedList<>();
q.add(start);
dist[start] = 0;
while(!q.isEmpty() && undiscovered.size()>0){
int curr = q.poll();
LinkedList<Integer> complement_edges = new LinkedList<>();
for(int child : undiscovered){
if(!adjList[curr].contains(child)){
// curr and child is connected in complement
complement_edges.add(child);
}
}
for(int child : complement_edges){
if(dist[child]>(dist[curr]+1)){
dist[child] = dist[curr]+1;
}
// remove complement_edges from undiscovered
undiscovered.remove(child);
q.add(child);
}
}
return dist;
}
}

how to find root of a directed acyclic graph

I need a method to find root of a directed acyclic graph.I am using boolean adjancency matix to represent graph in java.so please suggest.Also graph is unweighted graph
Just find the node where indegree is 0. For below algorithm to work we assume that none of nodes in graph are isolated.
int indegree[N]={0};
for(i=0;i<n;++i){
for(j=0;j<n;++j){
if(graph[i][j]==1){ //assuming edge from i to j
indegree[j]++;
}
}
}
for(int i=0;i<n;++i){
if(indegree[i]==0) add i to roots;
}
You are looking for nodes with no in-edges. If the adjacency matrix is encoded so that entry (i,j) contains a 1 if and only if there is an edge from i to j, then for node K to be a root, there must be no edges of the form i->K, therefore no 1's in entries of the form (i, K). So you are looking for columns K with all zeros. Each such column is a root.
In pseudocode,
roots = {}
for k in 1 to N
for i in 1 to N
if adjacencies[i, k] > 0
continue with next k value
add k to roots
It can be done in linear time. It is basically doing DFS over the graph with all the edges reversed.
Pick up any vertex in the given graph G
Check if the vertex has in-degree equal to 0. If it does we have found a vertex which is root of the graph.
If not then, mark the current vertex v as visited and repeat the same process over all the unvisited parents of v.
This will fetch all the required vertices with in-degree equal to zero or roots of a DAG.

determine whether an undirected graph is a tree

I have written an algorithm to determine "whether an undirected graph is a tree"
Assumptions : graph G is represented as adjacency list, where we already know the number of vertices which is n
Is_graph_a_tree(G,1,n) /* using BFS */
{
-->Q={1} //is a Queue
-->An array M[1:n], such that for all i, M[i]=0 /* to mark visited vertices*/
-->M[1]=1
-->edgecount=0 // to determine the number of edges visited
-->While( (Q is not empty) and (edgecount<=n-1) )
{
-->i=dequeue(Q)
-->for each edge (i,j) and M[j] =0 and edgecount<=n-1
{
-->M[j]=1
-->Q=Q U {j}
-->edgecount++
}
}
If(edgecount != n-1)
--> print “G is not a tree”
Else
{
-->If there exists i such that M[i]==0
Print “ G is not a tree”
Else
Print “G is tree”
}
}
Is it right??
Is the time complexity of this algorithm Big0h(n)??
I think the counting of edges is not correct. You should also count edges (i,j) for witch M[j]=1 but j is not the parent of i (so you would also need to keep the parent of each node).
Maybe is better to count the edges at the end, by summing the sizes of the adjacency lists and dividing by 2.
You want to do a Depth First Search. An undirected graph has only back edges and tree edges. So you can just copy the DFS algorithm and if you find a back edge then it's not a tree.

Resources