How can we detect if a directed graph is cyclic? I thought using breadth first search, but I'm not sure. Any ideas?
What you really need, I believe, is a topological sorting algorithm like the one described here:
http://en.wikipedia.org/wiki/Topological_sorting
If the directed graph has a cycle then the algorithm will fail.
The comments/replies that I've seen so far seem to be missing the fact that in a directed graph there may be more than one way to get from node X to node Y without there being any (directed) cycles in the graph.
Usually depth-first search is used instead. I don't know if BFS is applicable easily.
In DFS, a spanning tree is built in order of visiting. If a the ancestor of a node in the tree is visited (i.e. a back-edge is created), then we detect a cycle.
See http://www.cs.nyu.edu/courses/summer04/G22.1170-001/6a-Graphs-More.pdf for a more detailed explanation.
Use DFS to search if any path is cyclic
class Node<T> { T value; List<Node<T>> adjacent; }
class Graph<T>{
List<Node<T>> nodes;
public boolean isCyclicRec()
{
for (Node<T> node : nodes)
{
Set<Node<T>> initPath = new HashSet<>();
if (isCyclicRec(node, initPath))
{
return true;
}
}
return false;
}
private boolean isCyclicRec(Node<T> currNode, Set<Node<T>> path)
{
if (path.contains(currNode))
{
return true;
}
else
{
path.add(currNode);
for (Node<T> node : currNode.adjacent)
{
if (isCyclicRec(node, path))
{
return true;
}
else
{
path.remove(node);
}
}
}
return false;
}
approach:1
how about a level no assignment to detect a cycle. eg: consider the graph below. A->(B,C) B->D D->(E,F) E,F->(G) E->D As you perform a DFS start assigning a level no to the node you visit (root A=0). level no of node = parent+1. So A=0, B=1, D=2, F=3, G=4 then, recursion reaches D, so E=3. Dont mark level for G (G already a level no assigned which is grater than E) Now E also has an edge to D. So levelization would say D should get a level no of 4. But D already has a "lower level" assigned to it of 2. Thus any time you attempt to assign a level number to a node while doing DFS that already has a lower level number set to it, you know the directed graph has a cycle..
approach2:
use 3 colors. white, gray, black. color only white nodes, white nodes to gray as you go down the DFS, color gray nodes to black when recursion unfolds (all children are processed). if not all children yet processed and you hit a gray node thats a cycle.
eg: all white to begin in above direct graph.
color A, B, D, F,G are colored white-gray. G is leaf so all children processed color it gray to black. recursion unfolds to F(all children processed) color it black. now you reach D, D has unprocessed children, so color E gray, G already colored black so dont go further down. E also has edge to D, so while still processing D (D still gray), you find an edge back to D(a gray node), a cycle is detected.
Testing for Topological sort over the given graph will lead you to the solution. If the algorithm for topsort, i.e the edges should always be directed in one way fails, then it means that the graph contains cycles.
Another simple solution would be a mark-and-sweep approach. Basically, for each node in tree you flag it as "visited" and then move on to it's children. If you ever see a node with the "visted" flag set, you know there's a cycle.
If modifying the graph to include a "visited" bit isn't possible, a set of node pointers can be used instead. To flag a node as visited, you place a pointer to it in the set. If the pointer is already in the set, there's a cycle.
Related
I am learning graph traversal from The Algorithm Design Manual by Steven S. Skiena. In his book, he has provided the code for traversing the graph using dfs. Below is the code.
dfs(graph *g, int v)
{
edgenode *p;
int y;
if (finished) return;
discovered[v] = TRUE;
time = time + 1;
entry_time[v] = time;
process_vertex_early(v);
p = g->edges[v];
while (p != NULL) {
/* temporary pointer */
/* successor vertex */
/* allow for search termination */
y = p->y;
if (discovered[y] == FALSE) {
parent[y] = v;
process_edge(v,y);
dfs(g,y);
}
else if ((!processed[y]) || (g->directed))
process_edge(v,y);
}
if (finished) return;
p = p->next;
}
process_vertex_late(v);
time = time + 1;
exit_time[v] = time;
processed[v] = TRUE;
}
In a undirected graph, it looks like below code is processing the edge twice (calling the method process_edge(v,y). One while traversing the vertex v and another at processing the vertex y) . So I have added the condition parent[v]!=y in else if ((!processed[y]) || (g->directed)). It processes the edge only once. However, I am not sure how to modify this code to work with the parallel edge and self-loop edge. The code should process the parallel edge and self-loop.
Short Answer:
Substitute your (parent[v]!=y) for (!processed[y]) instead of adding it to the condition.
Detailed Answer:
In my opinion there is a mistake in the implementation written in the book, which you discovered and fixed (except for parallel edges. More on that below). The implementation is supposed to be correct for both directed and undeirected graphs, with the distinction between them recorded in the g->directed boolean property.
In the book, just before the implementation the author writes:
The other important property of a depth-first search is that it partitions the
edges of an undirected graph into exactly two classes: tree edges and back edges. The
tree edges discover new vertices, and are those encoded in the parent relation. Back
edges are those whose other endpoint is an ancestor of the vertex being expanded,
so they point back into the tree.
So the condition (!processed[y]) is supposed to handle undirected graphs (as the condition (g->directed) is to handle directed graphs) by allowing the algorithm to process the edges that are back-edges and preventing it from re-process those that are tree edges (in the opposite direction). As you noticed, though, the tree-edges are treated as back-edges when read through the child with this condition so you should just replace this condition with your suggested (parent[v]!=y).
The condition (!processed[y]) will ALWAYS be true for an undirected graph when the algorithm reads it as long as there are no parallel edges (further details why this is true - *). If there are parallel edges - those parallel edges that are read after the first "copy" of them will yield false and the edge will not be processed, when it should be. Your suggested condition, however, will distinguish between tree-edges and the rest (back-edges, parallel edges and self-loops) and allow the algorithm to process only those that are not tree-edges in the opposite direction.
To refer to self-edges, they should be fine both with the new and old conditions: they are edges with y==v. Getting to them, y is discovered (because v is discovered before going through its edges), not processed (v is processed only as the last line - after going through its edges) and it is not v's parent (v is not its own parent).
*Going through v's edges, the algorithm reads this condition for y that has been discovered (so it doesn't go into the first conditional block). As quoted above (in the book there is a semi-proof for that as well which I will include at the end of this footnote), p is either a tree-edge or a back-edge. As y is discovered, it cannot be a tree-edge from v to y. It can be a back edge to an ancestor which means the call is in a recursion call that started processing this ancestor at some point, and so the ancestor's call has yet to reach the final line, marking it as processed (so it is still marked as not processed) and it can be a tree-edge from y to v, in which case the same situation holds - and y is still marked as not processed.
The semi-proof for every edge being a tree-edge or a back-edge:
Why can’t an edge go to a brother or cousin node instead of an ancestor?
All nodes reachable from a given vertex v are expanded before we finish with the
traversal from v, so such topologies are impossible for undirected graphs.
You are correct.
Quoting the book's (2nd edition) errata:
(*) Page 171, line -2 -- The dfs code has a bug, where each tree edge
is processed twice in undirected graphs. The test needs to be
strengthed to be:
else if (((!processed[y]) && (parent[v]!=y)) || (g->directed))
As for cycles - see here
(Note: I thought about asking this on https://cstheory.stackexchange.com/, but decided my question is not theoretical enough -- it's about an algorithm. If there is a better Stack Exchange community for this post, I'm happy to listen!)
I'm using the terminology "starting node" to mean a node with no links into it, and "terminal node" to mean a node with no links out of it. So the following graph has starting nodes A and B and terminal nodes F and G:
I want to draw it with the following rules:
at least one starting node has a depth of 0.
links always point from top to bottom
nodes are packed vertically as closely as possible
Using those rules, depth of for each node is shown for the graph above. Can someone suggest an algorithm to compute the depth of each node that runs in less than O(n^2) time?
update:
I tweaked the graph to show that the DAG may contain starting and terminal nodes at different depths. (This was a case that I didn't consider in my original buggy answer.) I also switched terminology from "x coordinate" to "depth" in order to emphasize that this is about "graphing" and not "graphics".
Your x coordinate of a node corresponds to the longest way from any node without incomming edges to this node in question. For a DAG it can be calculated in O(N):
given DAG G:
calculate incomming_degree[v] for every v in G
initialize queue q={v with incomming_degree[v]==0}, x[v]=0 for every v in q
while(q not empty):
v=q.pop() #retreive and delete first element
for(w in neighbors of v):
incomming_degree[w]--
if(incomming_degree[w]==0): #no further way to w exists, evaluate
q.offer(w)
x[w]=x[v]+1
x stores the desired information.
Here's one solution which is essentially a two-pass depth-first tree walk. The first pass (traverseA) traces the DAG from the starting nodes (A and B in the O.P.'s example) until encountering terminal nodes (F and G in the example). It them marks them with the maximum depth as traced through the graph.
The second pass (traverseB) starts at the terminal nodes and traces back towards the starting nodes, marking each node along the way with the node's current value OR the previous node's value minus one, whichever is smaller if the node hasn't been visited yet:
function labelDAG() {
nodes.forEach(function(node) { node.depth = -1; }); // initialize
// find and mark terminal nodes
startingNodes().forEach(function(node) { traverseA(node, 0); });
// walk backwards from the terminal nodes
terminalNodes().forEach(function(node) { traverseB(node); });
dumpGraph();
};
function traverseA(node, depth) {
var targets = targetsOf(node);
if (targets.length === 0) {
// we're at a leaf (terminal) node -- set depth
node.depth = Math.max(node.depth, depth);
} else {
// traverse each subtree with depth = depth+1
targets.forEach(function(target) {
traverseA(target, depth+1);
});
};
};
// walk backwards from a terminal node, setting each source node's depth value
// along the way.
function traverseB(node) {
sourcesOf(node).forEach(function(source) {
if ((source.depth === -1) || (source.depth > node.x - 1)) {
// source has not yet been visited, or we found a longer path
// between terminal node and source node.
source.depth = node.depth - 1;
}
traverseB(source);
});
};
I want to write pseudo-code for visiting a state machine using DFS. State machine can be considered as a directed graph. Following algorithm from Cormen book uses a DFS algorithm to visit a graph.
DFS-VISIT(G, u) //G= graph, u=root vertex
u.color = GRAY
for each v from G.Adjacents(u) //edge (u, v)
if v.color == WHITE
DFS-VISIT(G, v)
State machine however can have multiple edges between the two vertices. And above algorithm stores edges in an adjacency list. I have implemented the algorithm in Java with following classes,
class Node{
String name;
....
ArrayList<Transition> transitions;
void addTransition(Transition tr);
}
class Transition
{
String src;
String dest;
}
With the above given information, I have built a state machine with node and transition objects. I want to modify the above algorithm where I don't have a graph object G. I just have access to root object.
Can I modify the above algorithm like below?
DFS-VISIT(root) //root node
root.color = GRAY
for each t from root.getTransitions() //t=transition
v = t.getDestination() //v=destination of t
if v.color == WHITE
DFS-VISIT(v)
The algorithm is independent of implementation. Actually, it's the other way around. The question you should be asking is this: "do your proprietary implementation has all the exact properties that are required by the algorithm?"
The algorithm has very few strict demands. You need a set of nodes and a set of edges, where each edge connects 2 nodes. That's the generic definition of graph. How you acquire, store and process these sets is irrelevant to the algorithm. For the algorithm step all you need is access to a given node from the set and access to a set of its neighbors. What you've presented seems fine (of course, for the next step you'll need to progress to the next node after root).
I am having difficulties finding a way to properly classify the edges while a breadth-first search on a directed graph.
During a breadth-first or depth-first search, you can classify the edges met with 4 classes:
TREE
BACK
CROSS
FORWARD
Skiena [1] gives an implementation. If you move along an edge from v1 to v2, here is a way to return the class during a DFS in java, for reference. The parents map returns the parent vertex for the current search, and the timeOf() method, the time at which the vertex has been discovered.
if ( v1.equals( parents.get( v2 ) ) ) { return EdgeClass.TREE; }
if ( discovered.contains( v2 ) && !processed.contains( v2 ) ) { return EdgeClass.BACK; }
if ( processed.contains( v2 ) )
{
if ( timeOf( v1 ) < timeOf( v2 ) )
{
return EdgeClass.FORWARD;
}
else
{
return EdgeClass.CROSS;
}
}
return EdgeClass.UNCLASSIFIED;
My problem is that I cannot get it right for a Breadth first search on a directed graph. For instance:
The following graph - that is a loop - is ok:
A -> B
A -> C
B -> C
BFSing from A, B will be discovered, then C. The edges eAB and eAC are TREE edges, and when eBC is crossed last, B and C are processed and discovered, and this edge is properly classified as CROSS.
But a plain loop does not work:
A -> B
B -> C
C -> A
When the edge eCA is crossed last, A is processed and discovered. So this edge is incorrectly labeled as CROSS, whether it should be a BACK edge.
There is indeed no difference in the way the two cases are treated, even if the two edges have different classes.
How do you implement a proper edge classification for a BFS on a directed graph?
[1] http://www.algorist.com/
EDIT
Here an implementation derived from #redtuna answer.
I just added a check not to fetch the parent of the root.
I have JUnits tests that show it works for directed and undirected graphs, in the case of a loop, a straight line, a fork, a standard example, a single edge, etc....
#Override
public EdgeClass edgeClass( final V from, final V to )
{
if ( !discovered.contains( to ) ) { return EdgeClass.TREE; }
int toDepth = depths.get( to );
int fromDepth = depths.get( from );
V b = to;
while ( toDepth > 0 && fromDepth < toDepth )
{
b = parents.get( b );
toDepth = depths.get( b );
}
V a = from;
while ( fromDepth > 0 && toDepth < fromDepth )
{
a = parents.get( a );
fromDepth = depths.get( a );
}
if ( a.equals( b ) )
{
return EdgeClass.BACK;
}
else
{
return EdgeClass.CROSS;
}
}
How do you implement a proper edge classification for a BFS on a
directed graph?
As you already established, seeing a node for the first time creates a tree edge. The problem with BFS instead of DFS, as David Eisenstat said before me, is that back edges cannot be distinguished from cross ones just based on traversal order.
Instead, you need to do a bit of extra work to distinguish them. The key, as you'll see, is to use the definition of a cross edge.
The simplest (but memory-intensive) way is to associate every node with the set of its predecessors. This can be done trivially when you visit nodes. When finding a non-tree edge between nodes a and b, consider their predecessor sets. If one is a proper subset of the other, then you have a back edge. Otherwise, it's a cross edge. This comes directly from the definition of a cross edge: it's an edge between nodes where neither is the ancestor nor the descendant of the other on the tree.
A better way is to associate only a "depth" number with each node instead of a set. Again, this is readily done as you visit nodes. Now when you find a non-tree edge between a and b, start from the deeper of the two nodes and follow the tree edges backwards until you go back to the same depth as the other. So for example suppose a was deeper. Then you repeatedly compute a=parent(a) until depth(a)=depth(b).
If at this point a=b then you can classify the edge as a back edge because, as per the definition, one of the nodes is an ancestor of the other on the tree. Otherwise you can classify it as a cross edge because we know that neither node is an ancestor or descendant of the other.
pseudocode:
foreach edge(a,b) in BFS order:
if !b.known then:
b.known = true
b.depth = a.depth+1
edge type is TREE
continue to next edge
while (b.depth > a.depth): b=parent(b)
while (a.depth > b.depth): a=parent(a)
if a==b then:
edge type is BACK
else:
edge type is CROSS
The key property of DFS here is that, given two nodes u and v, the interval [u.discovered, u.processed] is a subinterval of [v.discovered, v.processed] if and only if u is a descendant of v. The times in BFS do not have this property; you have to do something else, e.g., compute the intervals via DFS on the tree that BFS produced. Then the classification pseudocode is 1. check for membership in the tree (tree edge) 2. check for head's interval contains tail's (back edge) 3. check for tail's interval contains head's (forward edge) 4. otherwise, declare a cross edge.
Instead of timeof(), you need an other vertex property, which contains the distance from the root. Let name that distance.
You have to processing a v vertex in the following way:
for (v0 in v.neighbours) {
if (!v0.discovered) {
v0.discovered = true;
v0.parent = v;
v0.distance = v.distance + 1;
}
}
v.processed = true;
After you processed a vertex a v vertex, you can run the following algorithm for every edge (from v1 to v2) of the v:
if (!v1.discovered) return EdgeClass.BACK;
else if (!v2.discovered) return EdgeClass.FORWARD;
else if (v1.distance == v2.distance) return EdgeClass.CROSS;
else if (v1.distance > v2.distance) return EdgeClass.BACK;
else {
if (v2.parent == v1) return EdgeClass.TREE;
else return EdgeClass.FORWARD;
}
While solving the problems on Techgig.com, I got struck with one one of the problem. The problem is like this:
A company organizes two trips for their employees in a year. They want
to know whether all the employees can be sent on the trip or not. The
condition is like, no employee can go on both the trips. Also to
determine which employee can go together the constraint is that the
employees who have worked together in past won't be in the same group.
Examples of the problems:
Suppose the work history is given as follows: {(1,2),(2,3),(3,4)};
then it is possible to accommodate all the four employees in two trips
(one trip consisting of employees 1& 3 and other having employees 2 &
4). Neither of the two employees in the same trip have worked together
in past. Suppose the work history is given as {(1,2),(1,3),(2,3)} then
there is no way possible to have two trips satisfying the company rule
and accommodating all the employees.
Can anyone tell me how to proceed on this problem?
I am using this code for DFS and coloring the vertices.
static boolean DFS(int rootNode) {
Stack<Integer> s = new Stack<Integer>();
s.push(rootNode);
state[rootNode] = true;
color[rootNode] = 1;
while (!s.isEmpty()) {
int u = s.peek();
for (int child = 0; child < numofemployees; child++) {
if (adjmatrix[u][child] == 1) {
if (!state[child]) {
state[child] = true;
s.push(child);
color[child] = color[u] == 1 ? 2 : 1;
break;
} else {
s.pop();
if (color[u] == color[child])
return false;
}
}
}
}
return true;
}
This problem is functionally equivalent to testing if an undirected graph is bipartite. A bipartite graph is a graph for which all of the nodes can be distributed among two sets, and within each set, no node is adjacent to another node.
To solve the problem, take the following steps.
Using the adjacency pairs, construct an undirected graph. This is pretty straightforward: each number represents a node, and for each pair you are given, form a connection between those nodes.
Test the newly generated graph for bipartiteness. This can be achieved in linear time, as described here.
If the graph is bipartite and you've generated the two node sets, the answer to the problem is yes, and each node set, along with its nodes (employees), correspond to one of the two trips.
Excerpt on how to test for bipartiteness:
It is possible to test whether a graph is bipartite, and to return
either a two-coloring (if it is bipartite) or an odd cycle (if it is
not) in linear time, using depth-first search. The main idea is to
assign to each vertex the color that differs from the color of its
parent in the depth-first search tree, assigning colors in a preorder
traversal of the depth-first-search tree. This will necessarily
provide a two-coloring of the spanning tree consisting of the edges
connecting vertices to their parents, but it may not properly color
some of the non-tree edges. In a depth-first search tree, one of the
two endpoints of every non-tree edge is an ancestor of the other
endpoint, and when the depth first search discovers an edge of this
type it should check that these two vertices have different colors. If
they do not, then the path in the tree from ancestor to descendant,
together with the miscolored edge, form an odd cycle, which is
returned from the algorithm together with the result that the graph is
not bipartite. However, if the algorithm terminates without detecting
an odd cycle of this type, then every edge must be properly colored,
and the algorithm returns the coloring together with the result that
the graph is bipartite.
I even used a recursive solution but this one is also passing the same number of cases. Am I leaving any special case handling ?
Below is the recursive solution of the problem:
static void dfs(int v, int curr) {
state[v] = true;
color[v] = curr;
for (int i = 0; i < numofemployees; i++) {
if (adjmatrix[v][i] == 1) {
if (color[i] == curr) {
bipartite = false;
return;
}
if (!state[i])
dfs(i, curr == 1 ? 2 : 1);
}
}
}
I am calling this function from main() as dfs(0,1) where 0 is the starting vertex and 1 is one of the color