Count paths with Topological Sort - algorithm

I have a DAG and I need to count all the paths since any node to another node, I've researched a little bit and I found that it could be done with some Topological Order, but so far the solutions are incomplete or wrong.
So how is the correct way to do it?.
Thanks.

As this is a DAG you can topologically sort the nodes in O(V+E) time. Let's assume the source vertex is S. Then from S start traversing the nodes in depth first fashion. When we're processing node U , let's assume there's an edge U->V then V is of course not yet visited (why? because it's an directed acyclic graph) So you can reach from S to V via node U in d[U] ways where d[U] is the number of paths from S to U.
So number of paths from S to any node V, d[V] = d[x1]+d[x2]+d[x3]+ . . . +d[xy], where there are edge like x1->V, x2->V, . . . xy->V
This algorithm will take O(V+E) to topologically sort the graph and then for calculating number of paths at most O(V*E ). You can further reduce its run time of calculating number of path to O(V+E) using adjacency list instead of adjacency matrix and this is the most efficient solution so far.

You can use recursion to count all of the paths in a tree/DAG. Here is the pseudocode:
function numPaths(node1, node2):
// base case, one path from node to itself
if (node1 == node2): return 1
totalPaths = 0
for edge in node1.edges:
nextNode = edge.destinationNode
totalPaths += numPaths(nextNode, node2)
return totalPaths
Edit:
A good dynamic approach to this problem is the Floyd-Warshall algorithm.

Assume G(V,E)
Let d[i][j] = the number of all the paths from i to j
Then d[i][j]= sigma d[next][j] for all (i,next) in E
It seems too slow? Okay. Just memorise it(some guys call it dynamic programming). Like this
memset(d,-1,sizeof(d))// set all of elements of array d to -1 at the very beginning
saya(int i,int j)
{
if (d[i][j]!=-1) return d[i][j];//d[i][j] has been calculated
if (i==j) return d[i][j]=1;//trivival cases
d[i][j]=0;
for e in i.edges
d[i][j]+=saya(e.next,j);
return d[i][j];
}
Now saya(i,j) will return the number of all the paths from i to j.

Related

Finding the longest path In an unweighted graph

i'm having a really tough time with this issue.
If I have a graphh, directed or undirected, unweighted and no cycles. How do I find the longest path?
Many algorithms I have seen rely on the graph being weighted, and reverse the weights and use bellman ford.
I ahve also seen dynamic programming solutions, but here people were simply looking for any path, I'm looking for one from s-t.
Ive been trying to break down the graph into subproblems, where I add one node a a time and give it a value of the parent that it is coming from plus one, but I just canoot get the logic right
can anyone provide an algorithm, exponential time would do, pseudopolynomial would be fantastic?
If the graph can be directed or undirected lets reduce the problem only to directed graphs. If it's undirected then you should make edges from v -> w and from w -> v. You can use modified DFS to measure the longest path starting from the node you pass to this function.
Then run this function to every node and get the maximum value.
Pseudocode for DFS:
DFS(Node v):
if (v.visited?) return 0;
v.visited = true;
longestPath = 0;
foreach(Node n : v.neighbours):
longestPath = max(longestPath, DFS(n) + 1)
return longestPath
Pseudocode for the problem:
longestPath(Node[] verticies):
longestPath = 0
foreach(Node v : vertices):
foreach(Node w: vertices):
w.visited = false;
longestPath = max(longestPath, DFS(v))
return longestPath;
It works in O(n^2) but I think this solution is straight forward.
Just to let you know, there is a solution which works O(n).

Linear-time algorithm for number of distinct paths from each vertex in a directed acyclic graph

I am working on the following past paper question for an algorithms module:
Let G = (V, E) be a simple directed acyclic graph (DAG).
For a pair of vertices v, u in V, we say v is reachable from u if there is a (directed) path from u to v in G.
(We assume that every vertex is reachable from itself.)
For any vertex v in V, let R(v) be the reachability number of vertex v, which is the number of vertices u in V that are reachable from v.
Design an algorithm which, for a given DAG, G = (V, E), computes the values of R(v) for all vertices v in V.
Provide the analysis of your algorithm (i.e., correctness and running time
analysis).
(Optimally, one should try to design an algorithm running in
O(n + m) time.)
So, far I have the following thoughts:
The following algorithm for finding a topological sort of a DAG might be useful:
TopologicalSort(G)
1. Run DFS on G and compute a DFS-numbering, N // A DFS-numbering is a numbering (starting from 1) of the vertices of G, representing the point at which the DFS-call on a given vertex v finishes.
2. Let the topological sort be the function a(v) = n - N[v] + 1 // n is the number of nodes in G and N[v] is the DFS-number of v.
My second thought is that dynamic programming might be a useful approach, too.
However, I am currently not sure how to combine these two ideas into a solution.
I would appreciate any hints!
EDIT: Unfortunately the approach below is not correct in general. It may count multiple times the nodes that can be reached via multiple paths.
The ideas below are valid if the DAG is a polytree, since this guarantees that there is at most one path between any two nodes.
You can use the following steps:
find all nodes with 0 in-degree (i.e. no incoming edges).
This can be done in O(n + m), e.g. by looping through all edges
and marking those nodes that are the end of any edge. The nodes with 0
in-degree are those which have not been marked.
Start a DFS from each node with 0 in-degree.
After the DFS call for a node ends, we want to have computed for that
node the information of its reachability.
In order to achieve this, we need to add the reachability of the
successors of this node. Some of these values might have already been
computed (if the successor was already visited by DFS), therefore this
is a dynamic programming solution.
The following pseudocode describes the DFS code:
function DFS(node) {
visited[node] = true;
reachability[node] = 1;
for each successor of node {
if (!visited[successor]) {
DFS(successor);
}
reachability[node] += reachability[successor];
}
}
After calling this for all nodes with 0 in-degree, the reachability
array will contain the reachability for all nodes in the graph.
The overall complexity is O(n + m).
I'd suggest using a Breadth First Search approach.
For every node, add all the nodes that are connected to the queue. In addition to that, maintain a separate array for calculating the reachability.
For example, if a A->B, then
1.) Mark A as traversed
2.) B is added to the queue
3.) arr[B]+=1
This way, we can get R(v) for all vertices in O(|V| + |E|) time through arr[].

Finding the number of all the shortest paths between two nodes in directed unweighted graph

I need help to finding the number of all the shortest paths between two nodes in an directed unweighted graph.
I am able to find one of the shortest paths using BFS algorithm, but i dont know how to find all the shortest paths.
Any idea of the algorithm / pseudocode I could use?
Thanks!!
You can do it by remembering how many paths are leading to each node, and when discovering a new node - summarize that number.
For simplicity, let's assume you have regular BFS algorithm, that whenever you use an edge (u,v), calls visit(u,v,k), where:
u - the source node of the edge
v - the target node of the edge
k - the distance from the original source to u
In addition to this, assume you have a mapping d:(vertex,distance)->#paths.
This is basically a map (or 2D matrix) that it's key is a pair of vertex and an integer - distance, and its value is the number of shortest paths leading from the source, to that vertex, with distance k.
It is easy to see that for each vertex v:
d[v,k] = sum { d[u,k-1] | for all edges (u,v) }
d[source,0] = 0
And now, you can easily find the number of shortest paths of length k leading to each node.
Optimization:
You can see that "number of shortest paths of length k" is redundant, you actually need only one value of k for each vertex. This requires some book-keeping, but saves you some space.
Good luck!
The first idea that come to my mind is the next:
Let's name start vertex as s and end vertex as e.
We can store two arrays: D and Q. D[i] is a length of the shortest path from s to i and Q[i] is a number of shortest paths between s and i.
How can we recalculate these arrays?
First of all, lets set D[s] = 0 and Q[s] = 1. Then we can use well-known bfs:
while queue with vertexes is not empty
get v from queue
set v as visited
for all u, there's an edge from v to u
if u is not visited yet
D[u] = D[v] + 1
Q[u] = Q[v]
push u into the queue
else if D[v] + 1 == D[u]
Q[u] += Q[v]
The answer is Q[e].
Modify your breadth first search to keep going until it starts finding longer paths, rather than stopping and returning just the first one.

Graph algorithm to calculate node degree

I'm trying to implement the topological-sort algorithm for a DAG. (http://en.wikipedia.org/wiki/Topological_sorting)
First step of this simple algorithm is finding nodes with zero degree, and I cannot find any way to do this without a quadratic algorithm.
My graph implementation is a simple adjacency list and the basic process is to loop through every node and for every node go through each adjacency list so the complexity will be O(|V| * |V|).
The complexity of topological-sort is O(|V| + |E|) so i think there must be a way to calculate the degree for all nodes in a linear way.
You can maintain the indegree of all vertices while removing nodes from the graph and maintain a linked list of zero indegree nodes:
indeg[x] = indegree of node x (compute this by going through the adjacency lists)
zero = [ x in nodes | indeg[x] = 0 ]
result = []
while zero != []:
x = zero.pop()
result.push(x)
for y in adj(x):
indeg[y]--
if indeg[y] = 0:
zero.push(y)
That said, topological sort using DFS is conceptionally much simpler, IMHO:
result = []
visited = {}
dfs(x):
if x in visited: return
visited.insert(x)
for y in adj(x):
dfs(y)
result.push(x)
for x in V: dfs(x)
reverse(result)
You can achieve it in o(|v|+|e|). Follow below given steps:
Create two lists inDegree, outDegree which maintain count for in coming and out going edges for each node, initialize it to 0.
Now traverse through given adjacency list, for edge (u,v) in graph g, increase count of outdegree for u, and increment count of indegree for v.
You can traverse through adjacency list in o(v +e) , and will have indegree and outdegree for each u in o(|v|+|e|).
The Complexity that you mentioned for visiting adjacency nodes is not quite correct (O(n2)), because if you think carefully, you will notice that this is more like a BFS search. So, you visit each node and each edge only once. Therefore, the complexity is O(m+n). Where, n is the number of nodes and m is the edge count.
You can also use DFS for topological sorting. You won't need additional pass to calculate in-degree after processing each node.
http://www.geeksforgeeks.org/topological-sorting/

Number of paths between two nodes in a DAG

I want to find number of paths between two nodes in a DAG. O(V^2) and O(V+E) are acceptable.
O(V+E) reminds me to somehow use BFS or DFS but I don't know how.
Can somebody help?
Do a topological sort of the DAG, then scan the vertices from the target backwards to the source. For each vertex v, keep a count of the number of paths from v to the target. When you get to the source, the value of that count is the answer. That is O(V+E).
The number of distinct paths from node u to v is the sum of distinct paths from nodes x to v, where x is a direct descendant of u.
Store the number of paths to target node v for each node (temporary set to 0), go from v (here the value is 1) using opposite orientation and recompute this value for each node (sum the value of all descendants) until you reach u.
If you process the nodes in topological order (again opposite orientation) you are guaranteed that all direct descendants are already computed when you visit given node.
Hope it helps.
This question has been asked elsewhere on SO, but nowhere has the simpler solution of using DFS + DP been mentioned; all solutions seem to use topological sorting. The simpler solution goes like this (paths from s to t):
Add a field to the vertex representation to hold an integer count. Initially, set vertex t’s count to 1 and other vertices’ count to 0. Start running DFS with s as the start vertex. When t is discovered, it should be immediately marked as finished (BLACK), without further processing starting from it. Subsequently, each time DFS finishes a vertex v, set v’s count to the sum of the counts of all vertices adjacent to v. When DFS finishes vertex s, stop and return the count computed for s. The time complexity of this solution is O(V+E).
Pseudo-code:
simple_path (s, t)
if (s == t)
return 1
else if (path_count != NULL)
return path_count
else
path_count = 0
for each node w ϵ adj[s]
do path_count = path_count + simple_path(w, t)
end
return path_count
end

Resources