Best way to find if path exists in a unidirectional directed graph - algorithm

I have a graph with a huge number of nodes, with one start node (all of its edges point outward) and one end node (all of its edges point toward it). It is a directed, unweighted graph. How can I optimize the search in this kind of graph to find out whether a path exists between two nodes? I know BFS provides a solution. Is there any way to optimize the search (for example, by adding some additional information), as I will be searching the graph frequently?
EDIT: To add more information about the graph: it has one start node with multiple out-edges and one end node with multiple in-edges. In between, there are millions of connected nodes. It is an unweighted DAG, and there are no heuristics involved. I just need to check isConnected(node a, node b).

Considering your graph is acyclic, here is a way to do it:
Do a DFS on the graph, starting from the source vertex (the one with only outgoing edges).
For each edge (u,v) in the graph, set connected[u][v] = true.
Keep the vertices on the current DFS path in a stack array, and for each vertex v visited, go through the earlier nodes in the stack and set
connected[u][v] = true, where u is an earlier node.
If the graph is not acyclic, first calculate the SCCs using Kosaraju's or Tarjan's algorithm, reduce the graph to an acyclic one, and set connected[u][v] = true for each pair in an SCC.
Pseudocode for the modified DFS routine:
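bool connected[n][n] = {false};
bool visited[n] = {false};
int stack[n];
for each source vertex v do :
    DFS(v,stack,0);
void DFS(int u,int stack[n],int depth) {
    // note: an already-visited vertex returns immediately, so pairs that are
    // connected only through a second path into it are not recorded here
    if(!visited[u]) {
        visited[u] = true;
        // every vertex on the current DFS path reaches u
        for(int i=0;i<depth;i++) {
            connected[stack[i]][u] = true;
        }
        stack[depth] = u;
        for each edge(u,v) {
            connected[u][v] = true;
            DFS(v,stack,depth+1);
        }
    }
}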
Space Complexity : O(V^2)
Time Complexity : O(V^2)
Note:-
If the number of queries is small, run a DFS for each query individually and cache the results, since the precomputation above will take more time than that.
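As a rough sketch of the precomputation idea for a DAG, the table can also be filled in post-order, so that each vertex's row is the union of its successors' rows. This is a swapped-in variant, not the answer's exact routine: it also records pairs that are connected only through a vertex reached by more than one path. The names Reachability and isConnected are illustrative, vertices are assumed to be numbered 0..n-1, and the input is assumed to be acyclic. The table needs V^2 bits, so with millions of nodes it may not fit in memory, and the recursion may need to be made iterative for very deep graphs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reachability table for a DAG: row u is a bitset of all vertices reachable from u.
struct Reachability {
    int n;
    int words;                        // number of 64-bit words per row
    std::vector<std::uint64_t> bits;  // n rows of `words` words each

    explicit Reachability(const std::vector<std::vector<int>>& adj)
        : n(static_cast<int>(adj.size())),
          words((n + 63) / 64),
          bits(static_cast<std::size_t>(n) * words, 0) {
        std::vector<char> done(n, 0);
        for (int s = 0; s < n; ++s) dfs(s, adj, done);
    }

    // True if there is a non-empty path from a to b.
    bool isConnected(int a, int b) const {
        assert(a >= 0 && a < n && b >= 0 && b < n);
        return (bits[static_cast<std::size_t>(a) * words + b / 64] >> (b % 64)) & 1u;
    }

private:
    // Post-order DFS. Because the graph is acyclic, any successor that is
    // already marked done has a complete row by the time it is merged.
    void dfs(int u, const std::vector<std::vector<int>>& adj, std::vector<char>& done) {
        if (done[u]) return;
        done[u] = 1;
        for (int v : adj[u]) {
            dfs(v, adj, done);
            std::uint64_t* row = &bits[static_cast<std::size_t>(u) * words];
            const std::uint64_t* sub = &bits[static_cast<std::size_t>(v) * words];
            row[v / 64] |= std::uint64_t{1} << (v % 64);       // direct edge u -> v
            for (int w = 0; w < words; ++w) row[w] |= sub[w];  // everything v reaches
        }
    }
};
```

After construction, each isConnected(a, b) query is a single bit lookup, which is the trade the answer describes: a large one-time cost in exchange for constant-time queries.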

Related

Find the shortest cycle in a positive weighted directed graph passing through only specific nodes (not the other nodes)

Consider a weighted directed graph with V vertices and E edges. I am looking for an algorithm that finds the shortest cycle that passes only through a certain set of nodes S (it must pass through all nodes in S) and not through any other nodes. The cycle starts and ends at a node w in set S.
Is it possible to delete the nodes in the set of V - S and also delete their corresponding connected edges, and then apply an algorithm (for finding the shortest cycle) to this graph, including only S nodes and their corresponding edges?
I emphasize that we only consider the nodes in set S, not the other nodes.
I am not sure if the below link is relevant to my question. The link asks for the shortest cycle that must pass through the blue nodes, but the cycle may pass through the black ones (I am not sure about this).
Finding shortest circuit in a graph that visits X nodes at least once
Yes, the way your problem is stated, the approach you are considering is correct.
A graph from which you remove all vertices that don't belong to a set S is called an induced subgraph. Every path or cycle in the original graph that only uses vertices from S can be found in the induced subgraph, too. Therefore, finding the shortest cycle in the induced subgraph is equivalent to finding the shortest such cycle in the original graph.
If your problem requires finding the shortest cycle that uses all nodes in S, then you're solving the travelling salesman problem, which is known to be NP-hard, which means there is no known (and likely no existing) polynomial algorithm. That said, it is a well-studied problem: you can choose from both exact algorithms (if the set is small enough) and heuristics/approximations for larger instances.
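For a small set S, one exact option is the Held-Karp dynamic program run on the induced subgraph. The following is only a sketch under stated assumptions: the function name shortestCycleThroughAll and the adjacency-matrix input are hypothetical, a missing edge is encoded as INF, and the cycle is assumed to visit each node of S exactly once. Since a cycle through all of S can be started anywhere, it is anchored at S[0] rather than at a specific node w.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Sentinel for "no edge" / "no cycle" (an assumption of this sketch).
const long long INF = std::numeric_limits<long long>::max() / 4;

// w[i][j] is the weight of edge i -> j in the full graph, or INF if absent.
// Only edges between vertices of S are ever looked at, which is exactly the
// induced subgraph. Running time is O(2^k * k^2) for k = |S|.
long long shortestCycleThroughAll(const std::vector<std::vector<long long>>& w,
                                  const std::vector<int>& S) {
    const int k = static_cast<int>(S.size());
    assert(k >= 2 && k <= 16);
    // dp[mask][j]: cheapest path that starts at S[0], visits exactly the
    // vertices whose bits are set in mask, and ends at S[j].
    std::vector<std::vector<long long>> dp(1u << k, std::vector<long long>(k, INF));
    dp[1][0] = 0;
    for (unsigned mask = 1; mask < (1u << k); ++mask) {
        if (!(mask & 1u)) continue;              // every path starts at S[0]
        for (int j = 0; j < k; ++j) {
            if (!(mask & (1u << j)) || dp[mask][j] >= INF) continue;
            for (int t = 0; t < k; ++t) {
                if (mask & (1u << t)) continue;  // S[t] already visited
                const long long edge = w[S[j]][S[t]];
                if (edge >= INF) continue;
                const unsigned next = mask | (1u << t);
                dp[next][t] = std::min(dp[next][t], dp[mask][j] + edge);
            }
        }
    }
    // Close the cycle by returning to S[0].
    long long best = INF;
    const unsigned full = (1u << k) - 1;
    for (int j = 1; j < k; ++j) {
        if (dp[full][j] < INF && w[S[j]][S[0]] < INF)
            best = std::min(best, dp[full][j] + w[S[j]][S[0]]);
    }
    return best;  // INF if no such cycle exists
}
```

If nodes of S may be revisited, a common adjustment is to first replace the edge weights by shortest-path distances within the induced subgraph and then run the same program on those distances.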
The first step is to detect the cycles that are present in your graph, if any.
This can be done by modifying a depth first search (DFS) as follows:
- run DFS
- IF a node is reached for the second time
  - IF path exists from node reached again to current DFS node
    - the path is a cycle
Now you can filter the detected cycles by your criteria (visits the nodes in S, shortest, etc.).
Here is the C++ code for a DFS that detects and records cycles:
std::vector<std::vector<vertex_t>>
cGraph::dfs_cycle_finder(const std::string &start)
{
    std::vector<std::vector<vertex_t>> ret;
    // track visited vertices
    std::vector<bool> visited(vVertex.size(), false);
    // vertices waiting to be processed
    std::stack<vertex_t> wait;
    // start at the beginning
    wait.push(vVertex[index(start)]);
    // continue until no more vertices need processing
    while (!wait.empty())
    {
        vertex_t v = wait.top();
        wait.pop();
        int vi = index(v);
        if (!visited[vi])
        {
            visited[vi] = true;
            for (vertex_t w : adjacentOut(v))
            {
                if (!visited[index(w)])
                {
                    wait.push(w);
                }
                else
                {
                    // previously visited node, check for ancestor
                    auto cycle = path( w, v );
                    if( cycle.size() > 0 ) {
                        // found a cycle
                        cycle.push_back( w );
                        ret.push_back(cycle);
                    }
                }
            }
        }
    }
    return ret;
}
The complete application for this is at https://github.com/JamesBremner/graphCycler
Example output:
node a linked to b
node b linked to c
node c linked to d
node d linked to a
cycle: a b c d a

How to minimize the vertex of the graph by substituting cycles?

How can I minimize the number of vertices of a directed graph by removing circuits? Are there any algorithms that can be adapted here?
There already is a question about removing the cycles in graphs, but I am particularly asking about MINIMIZING THE NUMBER OF VERTICES by removing the cycles in graphs.
Supposing the solution to your problem is simply to turn each cycle into a single node, sure, you can do that easily.
When you execute Breadth-First Search (BFS) or Depth-First Search (DFS), you will find cycles (i.e. if you mark the path you step along, then once you reach an already marked node, you have found a cycle). Hence, you can easily find cycles by storing the predecessor of each node you visit: if you are in node u and you go to node v, you store p[v] = u, so if you find some node w already on the path in the adjacency list of v, you can walk back, parent by parent, until you find w, and you have all the nodes in that cycle.
I cannot guarantee any property of completeness for this algorithm, so if you can freely preprocess your graph, you can run DFS passes on it until the graph is no longer changed by them; otherwise, run it a certain number n of times that you find is efficient.
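#include <iostream>
#include <vector>
using namespace std;

// Node is assumed to look like this, with id equal to its index in the vector
struct Node { int id; vector<int> adjs; };

// state: 0 = unvisited, 1 = on the current DFS path, 2 = finished
void Visit(const vector<Node>& nodes, int u, vector<int>& state, vector<int>& p){
    state[u] = 1;
    for(int v : nodes[u].adjs){
        if(state[v] == 1){
            //v is an ancestor of u, so v -> ... -> u -> v is a cycle
            //call some method to reduce the graph
            cout<<v<<" belongs to the cycle"<<endl;
            for(int w = u; w != v; w = p[w]){
                cout<<w<<" belongs to the cycle"<<endl;
            }
        }
        else if(state[v] == 0){
            p[v] = u; //remember the predecessor only on first discovery
            Visit(nodes, v, state, p);
        }
    }
    state[u] = 2;
}

void FindCycles(const vector<Node>& nodes){
    vector<int> p(nodes.size(), -1);
    vector<int> state(nodes.size(), 0); //set all to unvisited
    for(size_t i = 0; i < nodes.size(); i++){
        if(state[i] == 0) Visit(nodes, (int)i, state, p);
    }
}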
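If "turning cycles into a single node" is taken to mean merging every group of mutually reachable vertices, a single pass of Tarjan's strongly connected components algorithm does this without repeated DFS runs. This is a different technique from the predecessor walk above, sketched here under the assumption that the graph is given as adjacency lists over vertices 0..n-1. The names Condensation, Tarjan and condense are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Result of merging each strongly connected component into one node.
struct Condensation {
    std::vector<int> compOf;            // compOf[v] = merged node that replaces v
    std::vector<std::vector<int>> adj;  // edges between merged nodes, deduplicated
};

// Tarjan's algorithm, recursive form (deep graphs may need an explicit stack).
struct Tarjan {
    const std::vector<std::vector<int>>& g;
    std::vector<int> index, low, comp, stack;
    std::vector<char> onStack;
    int counter = 0, comps = 0;

    explicit Tarjan(const std::vector<std::vector<int>>& graph)
        : g(graph), index(graph.size(), -1), low(graph.size(), 0),
          comp(graph.size(), -1), onStack(graph.size(), 0) {}

    void visit(int u) {
        index[u] = low[u] = counter++;
        stack.push_back(u);
        onStack[u] = 1;
        for (int v : g[u]) {
            if (index[v] == -1) {           // tree edge
                visit(v);
                low[u] = std::min(low[u], low[v]);
            } else if (onStack[v]) {        // edge back into the current component
                low[u] = std::min(low[u], index[v]);
            }
        }
        if (low[u] == index[u]) {           // u is the root of a component
            int v;
            do {
                v = stack.back();
                stack.pop_back();
                onStack[v] = 0;
                comp[v] = comps;
            } while (v != u);
            ++comps;
        }
    }
};

Condensation condense(const std::vector<std::vector<int>>& g) {
    const int n = static_cast<int>(g.size());
    for (const auto& out : g)
        for (int v : out) assert(v >= 0 && v < n);

    Tarjan t(g);
    for (int v = 0; v < n; ++v)
        if (t.index[v] == -1) t.visit(v);

    Condensation c;
    c.compOf = t.comp;
    c.adj.resize(t.comps);
    for (int u = 0; u < n; ++u)
        for (int v : g[u])
            if (t.comp[u] != t.comp[v]) c.adj[t.comp[u]].push_back(t.comp[v]);
    for (auto& out : c.adj) {               // drop parallel edges
        std::sort(out.begin(), out.end());
        out.erase(std::unique(out.begin(), out.end()), out.end());
    }
    return c;
}
```

The resulting graph is acyclic, and no further vertices can be merged by collapsing cycles, because any remaining cycle would have placed its vertices in the same component.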

DFS after remove some edge

I have a graph with one source vertex and a list of edges; in each iteration, one edge from the list is removed from the graph.
For each vertex, I have to print the number of iterations after which it loses its connection to the source vertex, that is, when there is no longer a path between the vertex and the source.
My idea is to run a DFS from the source vertex in each iteration and increment the value of every vertex that still has a connection to the source vertex, that is, a path between the vertex and the source vertex.
I'm sure there is a better idea than running DFS from the source vertex in each iteration, but I don't know how to solve the problem in a better, faster way.
Since you have the whole edge list in advance, you can process it backwards, connecting the graph instead of disconnecting it.
In pseudo-code:
GIVEN:
edges = list of edges
outputMap = new empty map from vertex to iteration number
S = source vertex
//first remove all the edges in the list
for (int i=0;i<edges.size();i++) {
    removeEdge(edges[i]);
}
//find vertices that are never disconnected
//use DFS or BFS
foreach vertex reachable from S
{
    outputMap[vertex] = -1;
}
//walk through the edges backward, reconnecting
//the graph
for (int i=edges.size()-1; i>=0; i--)
{
    Vertex v1 = edges[i].v1;
    Vertex v2 = edges[i].v2;
    Vertex newlyConnected = null;
    //this is for an undirected graph
    //for a directed graph, you only test one way
    //is a new vertex being connected to the source?
    if (outputMap.containsKey(v1) && !outputMap.containsKey(v2))
        newlyConnected = v2;
    else if (outputMap.containsKey(v2) && !outputMap.containsKey(v1))
        newlyConnected = v1;
    if (newlyConnected != null)
    {
        //BFS or DFS again, skipping vertices already in outputMap
        foreach vertex reachable from newlyConnected
        {
            //edges[i] is removed in iteration i+1 (counting from 1),
            //and that removal is what disconnects these vertices
            outputMap[vertex] = i+1;
        }
    }
    addEdge(v1,v2);
}
//generate output
foreach entry in outputMap
{
    if (entry.value >=0)
    {
        print("vertex "+entry.key+" disconnects in iteration "+entry.value);
    }
}
This algorithm achieves linear time, since each vertex is only involved in a single BFS or DFS, before it gets connected to the source.
It helps to reverse time, so that we're thinking about adding edges one by one and determining when connectivity to the source is achieved. Your idea of performing a traversal after each step is a good one. To get the total cost down to linear, you need the following optimization and an amortized analysis. The optimization is that you save the set of visited vertices from traversal to traversal and treat the set as one "supervertex", deleting intra-set edges as they are traversed. The cost of each traversal is proportional to the number of edges thus deleted, hence the amortized linear running time.
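The reverse-time idea can be sketched in C++ roughly as follows. This is only an illustration under assumptions that are not in the question: the graph is undirected, vertices are numbered 0..n-1, the edges that are never removed are passed separately as kept, and iterations are counted from 1. The function name disconnectIteration is hypothetical.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Returns, for each vertex, the iteration whose edge removal cuts it off from
// `source`; -1 if it is still connected after the last removal; 0 if it was
// never connected to the source at all.
std::vector<int> disconnectIteration(int n, int source,
                                     const std::vector<std::pair<int, int>>& kept,
                                     const std::vector<std::pair<int, int>>& removed) {
    assert(source >= 0 && source < n);
    std::vector<std::vector<int>> adj(n);
    auto addEdge = [&adj](int a, int b) {
        adj[a].push_back(b);
        adj[b].push_back(a);
    };
    for (const auto& e : kept) addEdge(e.first, e.second);

    std::vector<int> result(n, 0);
    std::vector<char> connected(n, 0);

    // BFS from `start` over vertices not yet connected to the source.
    // Each vertex enters the queue at most once over the whole run.
    auto flood = [&](int start, int value) {
        std::queue<int> q;
        connected[start] = 1;
        result[start] = value;
        q.push(start);
        while (!q.empty()) {
            const int u = q.front();
            q.pop();
            for (int v : adj[u]) {
                if (connected[v]) continue;
                connected[v] = 1;
                result[v] = value;
                q.push(v);
            }
        }
    };

    flood(source, -1);  // the graph after all removals
    for (int i = static_cast<int>(removed.size()) - 1; i >= 0; --i) {
        const int a = removed[i].first;
        const int b = removed[i].second;
        addEdge(a, b);
        // removed[i] is deleted in iteration i + 1, so whatever it reconnects
        // here is exactly what that iteration disconnects.
        if (connected[a] && !connected[b]) flood(b, i + 1);
        else if (connected[b] && !connected[a]) flood(a, i + 1);
    }
    return result;
}
```

Because the connected marks persist between floods, every vertex is expanded once and every adjacency list is scanned once, which gives the amortized linear bound both answers describe.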

Finding a cycle in a directed graph using BFS or DFS

I tried looking around the Internet but I'm a little stuck at the moment with regards to modifying the BFS or DFS algorithm in order to be able to find a cycle in a directed graph. If the graph were not directed, the DFS algorithm would solve this using back edges, but this method fails when looking at directed graphs.
Can anyone point me in the right direction?
Thanks for your time.
Keep track of the vertices currently in the recursion stack of the DFS traversal function. If you reach a vertex that is already in the recursion stack, then there is a cycle in the graph.
Create an array recStack[] and mark in it every vertex that is currently on the recursion stack. If you encounter a vertex that is already marked in recStack[], there exists a cycle, and you can print it by passing that vertex again to a modified DFS function for printing.
bool isGraphCyclic(int v, bool visited[], bool *recStack)
{
    if(visited[v] == false)
    {
        // Mark the current node as visited and part of recursion stack
        visited[v] = true;
        recStack[v] = true;
        // Recur for all the vertices adjacent to this vertex
        list<int>::iterator i;
        for(i = adj[v].begin(); i != adj[v].end(); ++i)
        {
            if ( !visited[*i] && isGraphCyclic(*i, visited, recStack) )
                return true;
            else if (recStack[*i])
                return true;
        }
    }
    recStack[v] = false; // remove the vertex from recursion stack
    return false;
}
DFS algorithm classifies graph edges into three categories *:
Forward edges
Cross edges
Back edges
If your graph has a back edge, it has a cycle. When you run a DFS algorithm and see a back edge, examining the portion of the path from the vertex to which the back edge leads up to the current node gives you the set of nodes on the cycle to which the back edge belongs.
* Sometimes, tree edges are treated as a separate category from forward edges, which is insignificant for the purposes of this discussion.
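The back-edge test can be sketched with the usual three-colour DFS, where a gray vertex is one that is still on the recursion stack. This is a minimal illustration that assumes adjacency lists over vertices 0..n-1; the names Color, dfs and findCycle are not from the answers above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

enum Color { White, Gray, Black };  // unvisited, on the recursion stack, finished

// Returns true and fills `cycle` as soon as a back edge is found.
bool dfs(int u, const std::vector<std::vector<int>>& adj,
         std::vector<Color>& color, std::vector<int>& parent,
         std::vector<int>& cycle) {
    color[u] = Gray;
    for (int v : adj[u]) {
        if (color[v] == Gray) {
            // Back edge u -> v: walk the tree path from u up to v.
            for (int w = u; w != v; w = parent[w]) cycle.push_back(w);
            cycle.push_back(v);
            std::reverse(cycle.begin(), cycle.end());
            return true;
        }
        // Black vertices are reached by forward or cross edges: no cycle there.
        if (color[v] == White) {
            parent[v] = u;
            if (dfs(v, adj, color, parent, cycle)) return true;
        }
    }
    color[u] = Black;
    return false;
}

// Returns the vertices of one directed cycle in path order (the first vertex
// is not repeated at the end), or an empty vector if the graph is acyclic.
std::vector<int> findCycle(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    for (const auto& out : adj)
        for (int v : out) assert(v >= 0 && v < n);
    std::vector<Color> color(n, White);
    std::vector<int> parent(n, -1);
    std::vector<int> cycle;
    for (int s = 0; s < n; ++s) {
        if (color[s] == White && dfs(s, adj, color, parent, cycle)) return cycle;
    }
    return {};
}
```

Distinguishing gray from black is what makes this work on directed graphs: an edge into a finished vertex is a forward or cross edge and does not close a cycle.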
