Does anyone know where I can find an implementation of an algorithm to convert an activity node graph (aka activity-on-node graph) to an event node graph (aka activiy-on-arrow graph)?
If you don't know what I am talking about, take a look here: http://www.brpreiss.com/books/opus7/html/page581.html
Please provide a working algo in your answer.
Thanks in advance.
The easiest thing to do is replace every node v in your original graph with two nodes, v_in and v_out, connected by a single edge with weight equal to the original vertex weight. Then replace all the original edges (u, v) with edges from u_out to v_in of zero weight.
That link states how you do it:
The activity-node graph is a
vertex-weighted graph. However, the
algorithms presented in the preceding
sections all require edge-weighted
graphs. Therefore, we must convert the
vertex-weighted graph into its
edge-weighted dual. In the dual graph
the edges represent the activities,
and the vertices represent the
commencement and termination of
activities. For this reason, the dual
graph is called an event-node graph.
Although I suppose it leaves out some important details. The way they suggest to convert from an activity node to event node graph is to convert every activity node into an event node edge and to add a dummy edge for activities that take multiple inputs.
Another way to construct the event node graph is to replace every activity node with an edge and two nodes, e.g., A->B->C becomes A->A'->B->B'->C-C'. Then, remove every node that has only one input and zero or one output and replace them with an edge of zero cost, as those event nodes don't actually do anything.
foreach node in graph
if count of incoming_arrows != 1
{
Create new node
Assign incoming arrows from old node to new node
Create arrow from new node to old node
Assign cost 0 to new node
}
endif
end foreach
foreach arrow in graph
Assign cost of destination node to cost of arrow
/* if you want ...preceded by "node name:" to get F:5 */
end foreach
Rename the nodes
The data structures needed are something like
struct node
node_name string
node_cost int
struct arrow
arrow_form_node node
arrow_to_node node
arrow_cost int
Related
I have a cyclic directed graph and I was wondering if there is any algorithm (preferably an optimum one) to make a list of common descendants between any two nodes? Something almost opposite of what Lowest Common Ancestor (LCA) does.
As user1990169 suggests, you can compute the set of vertices reachable from each of the starting vertices using DFS and then return the intersection.
If you're planning to do this repeatedly on the same graph, then it might be worthwhile first to compute and contract the strong components to supervertices representing a set of vertices. As a side effect, you can get a topological order on supervertices. This allows a data-parallel algorithm to compute reachability from multiple starting vertices at the same time. Initialize all vertex labels to {}. For each start vertex v, set the label to {v}. Now, sweep all vertices w in topological order, updating the label of w's out-neighbors x by setting it to the union of x's label and w's label. Use bitsets for a compact, efficient representation of the sets. The downside is that we cannot prune as with single reachability computations.
I would recommend using a DFS (depth first search).
For each input node
Create a collection to store reachable nodes
Perform a DFS to find reachable nodes
When a node is reached
If it's already stored stop searching that path // Prevent cycles
Else store it and continue
Find the intersection between all collections of nodes
Note: You could easily use BFS (breadth first search) instead with the same logic if you wanted.
When you implement this keep in mind there will be a few special cases you can look for to further optimize your search such as:
If an input node doesn't have any vertices then there are no common nodes
If one input node (A) reaches another input node (B), then A can reach everything B can. This means the algorithm wouldn't have to be ran on B.
etc.
Why not just reverse the direction of the edge and use LCA?
I have to ensure that a graph in our application is a DAG with a unique source and a unique sink.
Specifically I have to ensure that for a given start node and end node (both of which are known at the outset) that every node in the graph lies on a path from the start node to the end node.
I already have an implementation of Tarjan's algorithm which I use to identify cycles, and a a topological sorting algorithm that I can run once Tarjan's algorithm reports the graph is a DAG.
What is the most efficient way to make sure the graph meets this criterion?
If your graph is represented by an adjacency matrix, then node x is a source node if the xth column of the matrix is 0 and is a sink node if row x of the matrix is 0. You could run two quick passes over the matrix to count the number of rows and columns that are 0s to determine how many sources and sinks exist and what they are. This takes time O(n2) and is probably the fastest way to check this.
If your graph is represented by an adjacency list, you can find all sink nodes in time O(n) by checking whether any node has no outgoing edges. You can find all sinks by maintaining for each node a boolean value indicating whether it has any incoming edges, which is initially false. You can then iterate across all the edges in the list in time O(n + m) marking all nodes with incoming edges. The nodes that weren't marked as having incoming edges are then sources. This process takes time O(m + n) and has such little overhead that it's probably one of the fastest approaches.
Hope this helps!
A simple breadth or depth-first search should satisfy this. Firstly you can keep a set of nodes that comprise sink nodes that you've seen. Secondly, you can keep a set of nodes that you've discovered using BFS/DFS. Then the graph will then be connected iff there is a single connected component. Assuming you're using some kind of adjacency list style representation for your graph, the algorithm would look something like this:
create an empty queue
create an empty set to store seen vertices
create an empty set for sink nodes
add source node to queue
while queue is not empty
get next vertex from queue, add vertex to seen vertices
if num of adjacent nodes == 0
add sink nodes to sink node set
else
for each node in adjacent nodes
if node is not in seen vertices
add node to queue
return size of sink nodes == 1 && size of seen vertices == total number in graph
This will be linear in the number of vertices and edges in the graph.
Note that as long as you know a source vertex to start from, this will also ensure the property of a single source: any other vertex that is a source won't be discovered by the BFS/DFS, and thus the size of the seen vertices won't be the total number in the graph.
If your algorithm takes as input a DAG, which is weakly connected, assume that there are only one node s whose in-degree is zero and only one node t whose out-degree is zero, while all other nodes have positive in-degree and out-degree, then s can reach all other nodes, and all other nodes can reach t. By contradiction, assume that there is a node v that s cannot reach. Since no nodes can reach s, that is, v cannot reach s too. As such, v and s are disconnected, which contradicts the assumption. On the other hand, if the DAG is not weakly connected, it definitely does not satisfy the requirements that you want. In sum, you can first compute the weakly connected component of the DAG simply using BFS/DFS, meanwhile remembering the number of nodes whose in-degree or out-degree is zero. The complexity of this algorithm is O(|E|).
First, do a DFS on the graph, starting from the source node, which, as you say, is known in advance. If you encounter a back edge[1], then you have a cycle and you can exit with an error. During this traversal you can identify if there are nodes, not reachable from the source[2] and take the appropriate action.
Once you have determined the graph is a DAG, you can ensure that every node lies on a path from the source to the sink by another DFS, starting from the source, as follows:
bool have_path(source, sink) {
if source == sink {
source.flag = true
return true
}
// traverse all successor nodes of `source`
for dst in succ(source) {
if not dst.flag and not have_path(dst, sink)
return false // exit as soon as we find a node with no path to `sink`
}
source.flag = true;
return true
}
The procedure have_path sets a boolean flag in each node, for which there exists some path from that node to the sink. In the same time the procedure traverses only nodes reachable from the source and it traverses all nodes reachable from the source. If the procedure returns true, then all the nodes, reachable from the source lie on a path to the sink. Unreachable nodes were already handled in the first phase.
[1] an edge, linking a node with a greater DFS number to an node with a lesser DFS number, that is not already completely processed, i.e. is still on the DFS stack
[2] e.g. they would not have an assigned DFS number
I'm looking at a graph problem where I'm given a source node and need to find all other nodes up to a fixed distance away, where each edge between nodes has a uniform cost. So I've implemented a breadth first search using the standard FIFO queue technique, but stopping the BFS at a fixed distance is causing me problems.
If I were using DFS, I could pass in the current depth with each recursive call, but I can't do that here. I cannot modify the nodes of the graph to keep an extra parameter (distance) either. Any suggestions or references?
Just use two queues and bounce back and forth between them. Every time you switch from one to the other, increment your depth count by one.
To elaborate...
Maintain an "active" queue and an "inactive" queue.
Pop a node from the active queue. Add its neighbors to the inactive queue. Repeat until the active queue is empty. Then swap the queues.
This maintains the invariant that if the distance to all nodes in the active queue is d, the distance to all nodes in the inactive queue is d+1. Easy enough to keep track of and stop when you want.
You can pass the depth to the value you put in the queue. You can also keep a separate array to store the depth you reached each node.
Encapsulate the vertices you pass together with their distance from the source of the BFS.
Another possibility would be to just mark the vertices in the queue; usually frameworks for graphs allow you to assign weights to elements of the graph, which is a mechanism you could use for your purpose.
One last possibility would be to insert a marking vertex that isn't actually in the graph into the queue after the frontier of one level of the BFS has been completely processed so you know when a new level of BFS depth starts. That would make your queue look something like v u w x y MARKER s t j l k with all of these being vertices of the graph, except for MARKER.
I am working on a graph library that requires to determine whether two nodes are connected or not and if connected what is the degree of separation between them
i.e number of nodes needed to travel to reach the target node from the source node.
Since its an non-weighted graph, a bfs gives the shortest path. But how to keep the track of number of nodes discovered before reaching the target node.
A simple counter which increments on discovering a new node will give a wrong answer as it may include nodes which are not even in the path.
Another way would be to treat this as a weighted graph of uniform weighted edges and using Djkastra's shortest path algorithm.
But I want to manage it with bfs only.
How to do it ?
During the BFS, have each node store a pointer to its predecessor node (the node in the graph along whose edge the node was first discovered). Then, once you've run BFS, you can repeatedly follow this pointer from the destination node to the source node. If you count up how many steps this takes, you will have the distance from the destination to the source node.
Alternatively, if you need to repeatedly determine the distances between nodes, you might want to use the Floyd-Warshall all-pairs shortest paths algorithm, which if precomputed would let you immediately read off the distances between any pair of nodes.
Hope this helps!
I don't see why a simple counter wouldn't work. In this case, breadth-first search would definitely give you the shortest path. So what you want to do is attach a property to every node called 'count'. Now when you encounter a node that you have not visited yet, you populate the 'count' property with whatever the current count is and move on.
If later on, you come back to the node, you should know by the populated count property that it has already been visited.
EDIT: To expand a bit on my answer here, you'll have to maintain a variable that'll track the degree of separation from your starting node as you navigate the graph. For every new set of children that you load into the queue, make sure that you increment the value in that variable.
If all you want to know is the distance (possibly to cut off the search if the distance is too large), and all edges have the same weight (i.e. 1):
Pseudocode:
Let Token := a new node object which is different from every node in the graph.
Let Distance := 0
Let Queue := an empty queue of nodes
Push Start node and Token onto Queue
(Breadth-first-search):
While Queue is not empty:
If the head of Queue is Target node:
return Distance
If the head of Queue is Token:
Increment Distance
Push Token onto back of the Queue
If the head of Queue has not yet been seen:
Mark the head of the Queue as seen
Push all neighbours of the head of the Queue onto the back of Queue
Pop the head of Queue
(Did not find target)
I have a cyclic directed graph. Starting at the leaves, I wish to propagate data attached to each node downstream to all nodes that are reachable from that node. In particular, I need to keep pushing data around any cycles that are reached until the cycles stabilise.
I'm completely sure that this is a stock graph traversal problem. However, I'm having a fair bit of difficulty trying to find a suitable algorithm --- I think I'm missing a few crucial search keywords.
Before I attempt to write my own half-assed O(n^3) algorithm, can anyone point me at a proper solution? And what is this particular problem called?
Since the graph is cyclic (i.e. can contain cycles), I would first break it down into strongly connected components. A strongly connected component of a directed graph is a subgraph where each node is reachable from every other node in the same subgraph. This would yield a set of subgraphs. Notice that a strongly connected component of more than one node is effectively a cycle.
Now, in each component, any information in one node will eventually end up in every other node of the graph (since they are all reachable). Thus for each subgraph we can simply take all the data from all the nodes in it and make every node have the same set of data. No need to keep going through the cycles. Also, at the end of this step, all nodes in the same component contains exactly the same data.
The next step would be to collapse each strongly connected component into a single node. As the nodes within the same component all have the same data, and are therefore basically the same, this operation does not really change the graph. The newly created "super node" will inherit all the edges going out or coming into the component's nodes from nodes outside the component.
Since we have collapsed all strongly connected components, there will be no cycles in the resultant graph (why? because had there been a cycle formed by the resultant nodes, they would all have been placed in the same component in the first place). The resultant graph is now a Directed Acyclic Graph. There are no cycles, and a simple depth first traversal from all nodes with indegree=0 (i.e. nodes that have no incoming edges), propagating data from each node to its adjacent nodes (i.e. its "children"), should get the job done.