Hamiltonian cycle in O(N) [closed] - algorithm

I've been training with codility.com questions. There is a problem, Eta 2011, which asks you to find the number of unique Hamiltonian paths.
You can read the whole problem here
In summary: we have an undirected graph where each inner node is connected to exactly 3 other nodes, while each outer node is connected to 1 inner node. We then draw a path that passes through all outer nodes, after which every node (inner and outer) is connected to exactly 3 nodes.
The problem asks for a solution in O(N)!
The solutions I can find solve the problem in O(2^N) or worse. There are also heuristic solutions, but they are not exact.
Using the knowledge that each node in the graph is connected to exactly three other nodes, is it possible to solve the Hamiltonian path problem in O(N)?
Due to copyright I believe I'm not authorized to copy/paste the whole problem, but a link is provided in the first paragraph.
cheers
Moataz

From wikipedia:
... They remain NP-complete even for undirected planar graphs of
maximum degree three
So, unless you have more information on the graph structure, planar graphs of maximum degree three are a subset of the possible inputs for this problem. Thus, if you could solve this problem in polynomial time, you could also solve the Hamiltonian path problem for all planar graphs of maximum degree three in polynomial time, and you could conclude P = NP.

The graph is basically a tree with root node having 3 children and all other non-leaf nodes having 2 children. The leaves are connected from left to right.
You can think of each sub-tree as having two endpoint leaf nodes (say start and end).
Now, given a subtree rooted at node n: if the Hamiltonian route does not use the edge between n and its parent, then it must contain a path from start to end that covers all vertices of the subtree (in effect, a Hamiltonian route within the subtree rooted at n).
Now consider the root of the tree. Suppose we take the edges to x and y, with x to the left and y to the right.
Now we have to take the path from the root to the end point of the subtree at x, and to the start point of the subtree at y.
(A figure helps.)
The rest of the path is completed by connecting the start to the end of each subtree that needs a path through itself.
This gives a recursive algorithm, and can be computed in O(n) time I believe.
Insane expectation of 30 minutes.

Hamiltonian cycle is polynomial for certain subsets of graphs, e.g. co-comparability graphs.
If your input graph is one of these graphs, you can solve the problem in polynomial time. Note that I am not stating that Hamiltonian cycle is not NP-complete; all I am saying is that it is polynomial for certain classes of graphs.
Thus, if your input graph is a co-comparability graph, then you have a polynomial solution.

First, consider a tree with the root and all the non-leaf nodes having two children.
The leaves are also connected from left to right, but the first leaf is not connected
to the last. How many paths are there from the leftmost to the rightmost leaf?
The answer is only one, and it's not hard to prove.
Now take the tree from the input. Pick any node and remove one of its edges.
You're left with two trees of the same structure as the one I mentioned at the beginning.
Using those two you can build exactly one Hamiltonian path.
Now the node you picked had three edges you could remove, hence there are three ways to
make a Hamiltonian path in total :).
So the code boils down to checking for consistency and writing 3 in case everything is OK.
30 minutes is not that insane really.
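To make that concrete, here is a minimal illustrative sketch (not the codility solution itself). It assumes the final graph arrives as a dict `adj` mapping each vertex to the set of its three neighbours; it only sanity-checks the degree-3 and connectivity constraints described above and then reports 3, as argued above. The function name and input format are made up for illustration.
from collections import deque

def count_hamiltonian_paths(adj):
    """Hypothetical sketch: validate the degree-3 structure described above
    and return 3, per the argument that removing any one of the three edges
    at a fixed node yields exactly one Hamiltonian path."""
    if not adj:
        return 0
    vertices = list(adj)
    # Every vertex of the final graph must have exactly three neighbours.
    if any(len(adj[v]) != 3 for v in vertices):
        return 0
    # The graph must be connected: BFS from an arbitrary vertex.
    seen = {vertices[0]}
    queue = deque(seen)
    while queue:
        for w in adj[queue.popleft()]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    if len(seen) != len(vertices):
        return 0
    # If the structural constraints hold, the argument above gives exactly 3.
    return 3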

Related

Algorithm to cover all edges given starting node

We are working on an algorithm for a game I am developing with a friend, and we got stuck. Currently we have a cyclic undirected graph, and we are trying to find the quickest path from a starting node S that covers every edge. We are not looking for a tour, and edges may be repeated.
Any ideas on an algorithm or approximation? I'm sure this problem is NP-hard, but I don't believe it's TSP.
Route Inspection
This is known as the route inspection problem and it does have a polynomial solution.
The basic idea (see the link for more details) is that it is easy to solve for an Eulerian path (where we visit every edge once), but an Eulerian path is only possible for certain graphs.
In particular, a graph has to be connected and have either 0 or 2 vertices of odd degree.
However, it is possible to generalise this for other graphs by adding additional edges in the cheapest way that will produce a graph that does have an Eulerian path. (Note that we have added more edges so we may travel multiple times over edges in the original graph.)
Choosing the cheapest set of additional edges is a minimum-weight perfect matching problem on the odd-degree vertices, which can be solved in O(n^3).
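For small graphs the idea can be sketched directly. The following is an illustrative brute-force version of the classic closed-tour variant (not the O(n^3) matching algorithm): it computes all-pairs shortest paths with Floyd-Warshall and then tries every way of pairing up the odd-degree vertices. The function name and edge-list input format are made up for illustration.
from itertools import combinations

def chinese_postman_cost(n, edges):
    """Hypothetical small-graph sketch of the route inspection idea above:
    total edge weight plus the cheapest pairing of odd-degree vertices,
    each pair joined along a shortest path. Brute-force matching, so only
    suitable for a handful of odd-degree vertices.
    `edges` is a list of (u, v, weight) for an undirected graph on 0..n-1."""
    INF = float("inf")
    dist = [[INF] * n for _ in range(n)]
    degree = [0] * n
    total = 0
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        total += w
        degree[u] += 1
        degree[v] += 1
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Floyd-Warshall all-pairs shortest paths.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    odd = [v for v in range(n) if degree[v] % 2 == 1]

    def best_matching(nodes):
        # Cheapest way to pair up the remaining odd-degree vertices.
        if not nodes:
            return 0
        first, rest = nodes[0], nodes[1:]
        return min(dist[first][other] +
                   best_matching([x for x in rest if x != other])
                   for other in rest)

    return total + best_matching(odd)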
P.S.
Coincidentally, I wrote a simple demo earlier today (link to game) for a planar max-cut problem. The solution to this turns out to be based on exactly the same route inspection problem :)
EDIT
I just spotted from the comments that in your particular case your graph may be a tree.
If so, then I believe the answer is much simpler as you just need to do a DFS over the tree making sure to visit the shallowest subtree first.
For example, suppose you have a tree with edges S->A and S->A->B. S has two subtrees, and you should visit A first because it is shallower.
The total edges visited will equal the number of edges visited in a full DFS, minus the depth of the last leaf visited, which is why to minimise the total edges you want to maximise the depth of the last leaf, and hence visit the shallowest subtree first.
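A minimal sketch of that tree case, assuming the tree is given as a `children` dict rooted at `root` (these names are made up for illustration): the DFS visits shallower subtrees first, and the number of edges walked is twice the number of tree edges minus the height of the tree, since the final, deepest descent is never retraced.
def min_edges_to_cover_tree(children, root):
    """Hypothetical sketch of the tree case described above: DFS that visits
    the shallowest subtrees first, so the deepest path is walked last and
    never retraced. `children` maps a node to a list of its child nodes."""
    def height(node):
        kids = children.get(node, [])
        return 0 if not kids else 1 + max(height(k) for k in kids)

    order = []
    def dfs(node):
        order.append(node)
        # Visit shallower subtrees first; the deepest one is left for last.
        for kid in sorted(children.get(node, []), key=height):
            dfs(kid)
    dfs(root)

    total_edges = sum(len(v) for v in children.values())
    # Every edge is walked twice except the ones on the final (deepest) path.
    return 2 * total_edges - height(root), order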
This is somewhat like the Eulerian Path. The main distinction is that there may be dead-ends and you may be able to modify the algorithm to suit your needs. Pruning dead-ends is one option or you may be able to reduce the graph into a number of connected components.
DFS will work here. However, you must have a good evaluation function to prune branches early; otherwise you cannot solve this problem fast. You can refer to my discussion and implementation in Java here: http://www.capacode.com/?p=650
Detail of my evaluation function
My first try: if the length of the current path plus the distance from U to G is not shorter than the minimum length found so far (stored in the minLength variable), we will not visit U next, because it cannot lead to a shorter path.
Actually, the above evaluation function is not efficient, because it only helps once most of the cities have already been visited. We need a more precise lower bound on the length needed to reach G with all cities visited.
Assume s is the length from S to U. To go from U to G while visiting all remaining cities, the length is at least s' = s + ∑ minDistance(K), where K ranges over the unvisited cities other than U, and minDistance(K) is the minimum distance from K to an unvisited city. Basically, for each unvisited city we assume we can reach it via its shortest edge. Note that those shortest edges may not compose a valid path. Then we do not visit U if s' ≥ minLength.
With that evaluation function, my program can handle the problem with 20 cities within 1 second. I also added another optimization to improve the performance further: before running the search, I use a greedy algorithm to get a good initial value for minLength. Specifically, from each city we visit the nearest unvisited city next. The reason is that with a smaller minLength we can prune more.
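Below is a hedged Python sketch of the same branch-and-bound idea (the linked Java implementation is the authoritative version). `dist`, `start` and `goal` are assumed inputs, `lower_bound` plays the role of the evaluation function described above, and `best` would ideally be seeded with the greedy nearest-neighbour route first.
def shortest_route_all_cities(dist, start, goal):
    """Hypothetical sketch of the branch-and-bound DFS described above.
    `dist[i][j]` is the distance between cities i and j; we look for the
    shortest route from `start` to `goal` visiting every city exactly once,
    pruning a branch as soon as its optimistic lower bound cannot beat the
    best route found so far."""
    n = len(dist)
    best = [float("inf")]  # seeding with a greedy route would prune earlier

    def lower_bound(length_so_far, visited):
        # Optimistic estimate: every remaining city is reached via its
        # cheapest edge to another not-yet-visited city.
        bound = length_so_far
        for k in range(n):
            if k not in visited:
                bound += min((dist[k][j] for j in range(n)
                              if j != k and j not in visited), default=0)
        return bound

    def dfs(city, length_so_far, visited):
        if len(visited) == n:
            if city == goal:
                best[0] = min(best[0], length_so_far)
            return
        for nxt in range(n):
            if nxt in visited:
                continue
            if nxt == goal and len(visited) < n - 1:
                continue  # the goal must be the last city visited
            step = length_so_far + dist[city][nxt]
            if lower_bound(step, visited | {nxt}) < best[0]:
                dfs(nxt, step, visited | {nxt})

    dfs(start, 0, frozenset({start}))
    return best[0]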

How to detect if the given graph has a cycle containing all of its nodes? Does the suggested algorithm have any flaws?

I have a connected, undirected graph with N nodes and 2N-3 edges. You can think of the graph as being built on top of an initial graph with 3 nodes and 3 edges: every node added to the graph gets 2 connections to nodes already in the graph. When all nodes have been added (N-3 nodes in total), the final graph is constructed.
Originally I'm asked: what is the maximum number of nodes in this graph that can be visited exactly once (except for the initial node), i.e. what is the maximum number of nodes contained in the largest Hamiltonian path of the given graph? (Okay, "largest Hamiltonian path" is not a valid phrase, but given the question's nature I need the maximum number of nodes that are visited once, with the trip ending at the initial node. I think of it as a subgraph which is Hamiltonian and contains the maximum number of nodes, hence the largest possible Hamiltonian path.)
Since I'm not asked to find a path, I should first check whether a Hamiltonian path exists for the given number of nodes. I know that planar graphs and cycle graphs (Cn) are Hamiltonian graphs (I also know Ore's theorem for Hamiltonian graphs, but the graph I will be working on will most likely not be dense, which makes Ore's theorem pretty much useless in my case). Therefore I need an algorithm for checking whether the graph is a cycle graph, i.e. whether there exists a cycle which contains all the nodes of the given graph.
Since DFS is used for detecting cycles, I thought some minor manipulation of DFS could help me detect what I am looking for: keeping track of explored nodes and finally checking whether the last node visited has a connection to the initial node. Unfortunately I could not make that approach work.
Another approach I tried was excluding a node and then trying to reach one of its adjacent nodes starting from its other adjacent node. That algorithm may not give correct results, depending on the chosen adjacent nodes.
I'm pretty much stuck here. Can you help me think of another algorithm to tell me if the graph is a cycle graph?
Edit
I realized, with the help of the comment (thank you for it, n.m.):
A cycle graph consists of a single cycle and has N edges and N vertices. If there exist a cycle which contains all the nodes of the given graph, that's a Hamiltonian cycle. – n.m.
that I am actually searching for a Hamiltonian path, which I did not intend to do :)
On second thought, I think checking the Hamiltonian property of the graph while building it will be more efficient, which is also what I'm looking for: time efficiency.
After some thinking, I thought whatever the number of nodes will be, the graph seems to be Hamiltonian due to node addition criteria. The problem is I can't be sure and I can't prove it. Does adding nodes in that fashion, i.e. adding new nodes with 2 edges which connect the added node to the existing nodes, alter the Hamiltonian property of the graph? If it doesn't alter the Hamiltonian property, how so? If it does alter, again, how so? Thanks.
EDIT #2
I, again, realized that building the graph the way I described might alter the Hamiltonian property. Consider an input given as follows:
1 3
2 3
1 5
1 3
This input says that the 4th node is connected to nodes 1 and 3, the 5th to nodes 2 and 3, and so on.
The 4th and 7th nodes are connected to the same pair of nodes, which lowers the maximum number of nodes that can be visited exactly once by 1. If I detect these collisions (NOT counting an input such as "3 3", which is an example you suggested, since the problem states that each newly added node is connected to 2 other nodes) and lower the maximum starting from N, I believe I can get the right result.
See, I do not choose the connections, they are given to me and I have to find the max. number of nodes.
I think counting the duplicate connections while building the graph and subtracting their number from N will give the right result. Can you confirm this, or is there a flaw in this algorithm?
What we have in this problem is a connected, non-directed graph with N nodes and 2N-3 edges. Consider the graph given below,
  A
 / \
B _ C
 ( )
  D
The graph does not have a Hamiltonian cycle, yet it is constructed conforming to your rules of adding nodes, so searching for a Hamiltonian cycle may not give you the solution. Moreover, even if it did, Hamiltonian cycle detection is an NP-complete problem with O(2^N) complexity, so the approach may not be ideal.
What I suggest is to use a modified version of Floyd's Cycle Finding algorithm (Also called the Tortoise and Hare Algorithm).
The modified algorithm is,
1. Initialize a List CYC_LIST to ∅.
2. Add the root node to the list CYC_LIST and set it as unvisited.
3. Call the function Floyd() twice with the unvisited node in the list CYC_LIST for each of the two edges. Mark the node as visited.
4. Add all the previously unvisited vertices traversed by the Tortoise pointer to the list CYC_LIST.
5. Repeat steps 3 and 4 until no more unvisited nodes remains in the list.
6. If the list CYC_LIST contains N nodes, then the Graph contains a Cycle involving all the nodes.
The algorithm calls Floyd's cycle-finding algorithm a maximum of 2N times. Floyd's cycle-finding algorithm takes linear time (O(N)), so the complexity of the modified algorithm is O(N^2), which is much better than the exponential time taken by the Hamiltonian cycle based approach.
One possible problem with this approach is that it will detect closed paths along with cycles unless stricter checking criteria are implemented.
Reply to Edit #2
Consider the Graph given below,
   A------------\
  / \            \
 B _ C            \
 |\ /|             \
 | D |              F
  \ /              /
   \ /            /
    E------------/
According to your algorithm this graph does not have a cycle containing all the nodes.
But there is a cycle in this graph containing all the nodes.
A-B-D-C-E-F-A
So I think there is some flaw in your approach. But if your algorithm is correct, it is far better than mine, since mine takes O(n^2) time whereas yours takes just O(n).
To add some clarification to this thread: finding a Hamiltonian Cycle is NP-complete, which implies that finding a longest cycle is also NP-complete because if we can find a longest cycle in any graph, we can find the Hamiltonian cycle of the subgraph induced by the vertices that lie on that cycle. (See also for example this paper regarding the longest cycle problem)
We can't use Dirac's criterion here: Dirac only tells us minimum degree >= n/2 -> Hamiltonian cycle, which is the implication in the opposite direction of what we would need. The other way around is definitely wrong: take a cycle over n vertices; every vertex in it has exactly degree 2, no matter the size of the cycle, yet it has (is) an HC. What you can tell from Dirac is that no Hamiltonian cycle -> minimum degree < n/2, which is of no use here since we don't know whether our graph has an HC or not, so we can't use the implication (nevertheless, every graph we construct according to what the OP described will have a vertex of degree 2, namely the last vertex added to the graph, so for arbitrary n we have minimum degree 2).
The problem is that you can construct both graphs of arbitrary size that have an HC and graphs of arbitrary size that do not have an HC. For the first part: if the original triangle is A,B,C and the vertices added are numbered 1 to k, then connect the 1st added vertex to A and C and the k+1-th vertex to A and the k-th vertex for all k >= 1. The cycle is A,B,C,1,2,...,k,A. For the second part, connect both vertices 1 and 2 to A and B; that graph does not have an HC.
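The two constructions are easy to check by brute force on small instances. The sketch below is illustrative only: it builds both graphs from the triangle A, B, C using the construction rule and tests them for a Hamiltonian cycle by trying all vertex orderings (fine for a handful of vertices).
from itertools import permutations

def has_hamiltonian_cycle(adj):
    """Brute-force HC check, fine for the tiny graphs built below."""
    nodes = list(adj)
    first, rest = nodes[0], nodes[1:]
    for perm in permutations(rest):
        cycle = [first, *perm, first]
        if all(cycle[i + 1] in adj[cycle[i]] for i in range(len(cycle) - 1)):
            return True
    return False

def add_vertex(adj, new, a, b):
    """Add `new` with edges to the existing vertices `a` and `b`."""
    adj[new] = {a, b}
    adj[a].add(new)
    adj[b].add(new)

# Start from the triangle A, B, C in both cases.
with_hc = {'A': {'B', 'C'}, 'B': {'A', 'C'}, 'C': {'A', 'B'}}
without_hc = {'A': {'B', 'C'}, 'B': {'A', 'C'}, 'C': {'A', 'B'}}

# Chain construction: vertex 1 joins A and C, vertex k+1 joins A and k.
add_vertex(with_hc, 1, 'A', 'C')
for k in range(2, 6):
    add_vertex(with_hc, k, 'A', k - 1)

# Attaching both 1 and 2 to A and B destroys the Hamiltonian cycle.
add_vertex(without_hc, 1, 'A', 'B')
add_vertex(without_hc, 2, 'A', 'B')

print(has_hamiltonian_cycle(with_hc))     # True: A,B,C,1,2,...,k,A
print(has_hamiltonian_cycle(without_hc))  # False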
What is also important to note is that the property of having an HC can change from one vertex to the other during construction. You can both create and destroy the HC property when you add a vertex, so you would have to check for it every time you add a vertex. A simple example: take the graph after the 1st vertex was added, and add a second vertex along with edges to the same two vertices of the triangle that the 1st vertex was connected to. This constructs from a graph with an HC a graph without an HC. The other way around: add now a 3rd vertex and connect it to 1 and 2; this builds from a graph without an HC a graph with an HC.
Storing the last known HC during construction doesn't really help you because it may change completely. You could have an HC after the 20th vertex was added, then not have one for k in [21,2000], and have one again for the 2001st vertex added. Most likely the HC you had on 23 vertices will not help you a lot.
If you want to figure out how to solve this problem efficiently, you'll have to find criteria that work for all your graphs that can be checked for efficiently. Otherwise, your problem doesn't appear to me to be simpler than the Hamiltonian Cycle problem is in the general case, so you might be able to adjust one of the algorithms used for that problem to your variant of it.
Below I have added three extra nodes (3,4,5) in the original graph and it does seem like I can keep adding new nodes indefinitely while keeping the property of Hamiltonian cycle. For the below graph the cycle would be 0-1-3-5-4-2-0
  1---3---5
 / \ / \ /
0---2---4
As there were no extra restrictions on how you can add a new node with two edges, I think that by construction you can keep building graphs that retain the Hamiltonian cycle property.

better algorithm to check if directed graph is dag or not [duplicate]

Is there an efficient algorithm for detecting cycles within a directed graph?
I have a directed graph representing a schedule of jobs that need to be executed, a job being a node and a dependency being an edge. I need to detect the error case of a cycle within this graph leading to cyclic dependencies.
Tarjan's strongly connected components algorithm has O(|E| + |V|) time complexity.
For other algorithms, see Strongly connected components on Wikipedia.
Given that this is a schedule of jobs, I suspect that at some point you are going to sort them into a proposed order of execution.
If that's the case, then a topological sort implementation may in any case detect cycles. UNIX tsort certainly does. I think it is likely that it is therefore more efficient to detect cycles at the same time as tsorting, rather than in a separate step.
So the question might become, "how do I most efficiently tsort", rather than "how do I most efficiently detect loops". To which the answer is probably "use a library", but failing that the following Wikipedia article:
http://en.wikipedia.org/wiki/Topological_sorting
has the pseudo-code for one algorithm, and a brief description of another from Tarjan. Both have O(|V| + |E|) time complexity.
According to Lemma 22.11 of Cormen et al., Introduction to Algorithms (CLRS):
A directed graph G is acyclic if and only if a depth-first search of G yields no back edges.
This has been mentioned in several answers; here I'll also provide a code example based on chapter 22 of CLRS. The example graph is illustrated below.
CLRS' pseudo-code for depth-first search reads:
In the example in CLRS Figure 22.4, the graph consists of two DFS trees: one consisting of nodes u, v, x, and y, and the other of nodes w and z. Each tree contains one back edge: one from x to v and another from z to z (a self-loop).
The key realization is that a back edge is encountered when, in the DFS-VISIT function, while iterating over the neighbors v of u, a node is encountered with the GRAY color.
The following Python code is an adaptation of CLRS' pseudocode with an if clause added which detects cycles:
import collections

class Graph(object):
    def __init__(self, edges):
        self.edges = edges
        self.adj = Graph._build_adjacency_list(edges)

    @staticmethod
    def _build_adjacency_list(edges):
        adj = collections.defaultdict(list)
        for edge in edges:
            adj[edge[0]].append(edge[1])
            adj[edge[1]]  # side effect only: make sure the vertex has an entry
        return adj

def dfs(G):
    discovered = set()
    finished = set()
    for u in G.adj:
        if u not in discovered and u not in finished:
            discovered, finished = dfs_visit(G, u, discovered, finished)

def dfs_visit(G, u, discovered, finished):
    discovered.add(u)
    for v in G.adj[u]:
        # Detect cycles
        if v in discovered:
            print(f"Cycle detected: found a back edge from {u} to {v}.")
            break
        # Recurse into DFS tree
        if v not in finished:
            dfs_visit(G, v, discovered, finished)
    discovered.remove(u)
    finished.add(u)
    return discovered, finished

if __name__ == "__main__":
    G = Graph([
        ('u', 'v'),
        ('u', 'x'),
        ('v', 'y'),
        ('w', 'y'),
        ('w', 'z'),
        ('x', 'v'),
        ('y', 'x'),
        ('z', 'z')])
    dfs(G)
Note that in this example, the time in CLRS' pseudocode is not captured because we're only interested in detecting cycles. There is also some boilerplate code for building the adjacency list representation of a graph from a list of edges.
When this script is executed, it prints the following output:
Cycle detected: found a back edge from x to v.
Cycle detected: found a back edge from z to z.
These are exactly the back edges in the example in CLRS Figure 22.4.
The simplest way to do it is to do a depth first traversal (DFT) of the graph.
If the graph has n vertices, this is an O(n) time complexity algorithm. Since you will possibly have to start a DFT from each vertex, the total complexity becomes O(n^2).
You have to maintain a stack containing all vertices in the current depth first traversal, with its first element being the root node. If you come across an element which is already in the stack during the DFT, then you have a cycle.
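For reference, here is a compact sketch of that idea (recursive rather than with an explicit stack): a set tracks the vertices on the current DFS path, and an edge into that set is a back edge. Note that if the visited set is shared across all starting vertices, the whole scan is O(V + E) rather than O(n^2), as a later answer also points out. The `adj` dict input format is assumed for illustration.
def has_cycle(adj):
    """Hypothetical sketch of the stack-based check described above: DFS from
    every unvisited vertex and flag a cycle whenever an edge leads back to a
    vertex still on the current path. `adj` maps a vertex to its successors."""
    visited = set()
    on_stack = set()

    def dfs(u):
        visited.add(u)
        on_stack.add(u)
        for v in adj.get(u, []):
            if v in on_stack:        # back edge: v is on the current path
                return True
            if v not in visited and dfs(v):
                return True
        on_stack.remove(u)
        return False

    return any(dfs(u) for u in adj if u not in visited)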
In my opinion, the most understandable algorithm for detecting a cycle in a directed graph is the graph-coloring algorithm.
Basically, the graph coloring algorithm walks the graph in a DFS manner (Depth First Search, which means that it explores a path completely before exploring another path). When it finds a back edge, it marks the graph as containing a loop.
For an in depth explanation of the graph coloring algorithm, please read this article: http://www.geeksforgeeks.org/detect-cycle-direct-graph-using-colors/
Also, I provide an implementation of graph coloring in JavaScript https://github.com/dexcodeinc/graph_algorithm.js/blob/master/graph_algorithm.js
Start with a DFS: a cycle exists if and only if a back edge is discovered during DFS. This follows from the white-path theorem.
If you can't add a "visited" property to the nodes, use a set (or map) and just add all visited nodes to the set unless they are already in the set. Use a unique key or the address of the objects as the "key".
This also gives you the information about the "root" node of the cyclic dependency which will come in handy when a user has to fix the problem.
Another solution is to try to find the next dependency to execute. For this, you must have some stack where you can remember where you are now and what you need to do next. Check if a dependency is already on this stack before you execute it. If it is, you've found a cycle.
While this might seem to have a complexity of O(N*M) you must remember that the stack has a very limited depth (so N is small) and that M becomes smaller with each dependency that you can check off as "executed" plus you can stop the search when you found a leaf (so you never have to check every node -> M will be small, too).
In MetaMake, I created the graph as a list of lists and then deleted every node as I executed them which naturally cut down the search volume. I never actually had to run an independent check, it all happened automatically during normal execution.
If you need a "test only" mode, just add a "dry-run" flag which disables the execution of the actual jobs.
There is no algorithm which can list all the cycles in a directed graph in polynomial time, because the output itself can be exponential: if the directed graph has n nodes and every pair of nodes is connected in both directions (a complete digraph), then every subset of two or more of these n nodes induces a cycle, and there are exponentially many such subsets. So no polynomial-time enumeration algorithm exists.
If you do have an efficient algorithm that can count the directed cycles in a graph, you can first find the strongly connected components and then apply your algorithm to each component separately, since cycles only exist within components and never between them.
I implemented this in SML (imperative style). Here is the outline: find all the nodes that have an in-degree or out-degree of 0. Such nodes cannot be part of a cycle, so remove them, and then remove all of their incoming or outgoing edges.
Recursively apply this process to the resulting graph. If at the end you are not left with any node or edge, the graph does not have any cycles; otherwise it does.
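A small Python sketch of that pruning process, under the assumption that the graph is given as a node list and an edge list (the names are illustrative):
def is_acyclic_by_pruning(nodes, edges):
    """Hypothetical sketch of the pruning idea above: repeatedly strip
    vertices whose in-degree or out-degree is 0 (they cannot lie on a cycle);
    the directed graph is acyclic exactly when nothing is left.
    `edges` is a list of (u, v) pairs."""
    nodes = set(nodes)
    edges = set(edges)
    while True:
        indeg = {n: 0 for n in nodes}
        outdeg = {n: 0 for n in nodes}
        for u, v in edges:
            outdeg[u] += 1
            indeg[v] += 1
        removable = {n for n in nodes if indeg[n] == 0 or outdeg[n] == 0}
        if not removable:
            break
        nodes -= removable
        edges = {(u, v) for u, v in edges
                 if u not in removable and v not in removable}
    return not nodes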
https://mathoverflow.net/questions/16393/finding-a-cycle-of-fixed-length — I like this solution the best, especially for length 4 :)
Also, phys wizard says you have to do O(V^2); I believe we need only O(V)/O(V+E).
If the graph is connected then a DFS will visit all nodes. If the graph has several connected subgraphs, then each time we run a DFS on a vertex of a subgraph we find all of its connected vertices and don't have to consider them in the next run of the DFS. Therefore running a separate DFS from every vertex is unnecessary.
The way I do it is to do a Topological Sort, counting the number of vertices visited. If that number is less than the total number of vertices in the DAG, you have a cycle.
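That is essentially Kahn's algorithm. A short sketch, assuming the graph arrives as an adjacency dict (the name and input format are illustrative):
from collections import deque

def has_cycle_kahn(adj):
    """Hypothetical sketch of the approach above: repeatedly remove vertices
    with in-degree 0 and count how many come out; if the count falls short of
    the number of vertices, the leftovers must contain a cycle. `adj` maps a
    vertex to its successors; vertices with no outgoing edges may be omitted."""
    indegree = {}
    for u, succs in adj.items():
        indegree.setdefault(u, 0)
        for v in succs:
            indegree[v] = indegree.get(v, 0) + 1
    queue = deque(u for u, d in indegree.items() if d == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in adj.get(u, ()):
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    return visited < len(indegree)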
As you said, you have a set of jobs that need to be executed in a certain order. A topological sort gives you the required order for scheduling the jobs (for dependency problems, this works if the graph is a directed acyclic graph). Run a DFS and maintain a list, adding each node to the beginning of the list as it finishes; if you encounter a node that has already been visited and is still on the current DFS path, you have found a cycle in the given graph.
If DFS finds an edge that points to an already-visited vertex, you have a cycle there.
If an undirected graph satisfies the property
|E| > |V| - 1
then the graph contains at least one cycle. (For directed graphs this edge count alone is not sufficient.)

Difference between Dijkstra and Bellman-Ford algorithms [closed]

I am writing a thesis about shortest path algorithms, and I don't understand one thing.
I have made a visualisation of Dijkstra's algorithm.
1) Is it correct, or am I doing something wrong?
2) What would the Bellman-Ford algorithm look like? As far as I have looked for the difference, I found: "Bellman-Ford: the basic idea is very similar to Dijkstra's, but instead of selecting the shortest-distance neighbour edges, it selects all the neighbour edges." But Dijkstra's also checks all vertices and all edges, doesn't it?
Dijkstra's algorithm assumes that the cost of paths is monotonically increasing. That, plus the ordered search (using the priority queue), means that when you first reach a node, you have arrived via the shortest path.
This is not true with negative weights. If you use Dijkstra's algorithm with negative weights, you may find that a later path is better than an earlier one (because a negative weight improved the path at a later step).
So in Bellman-Ford, when you arrive at a node you test whether the new path is shorter; in contrast, with Dijkstra's you can cull nodes once they have been settled.
In some (most) cases Dijkstra's will not explore all complete paths. For example, if G linked only back to C, then any path through G would have higher cost than any path through C. Bellman-Ford would still consider all paths through G to F (Dijkstra's would never look at those because they have higher cost than going through C); if it did not do this, it could not guarantee finding negative loops.
Here's an example: the above never calculates the path AGEF. E has already been marked as visited by the time you arrive from G.
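A minimal Bellman-Ford sketch illustrating the contrast (the edge-list input format is assumed for illustration; it is not tied to the questioner's visualisation): every edge is re-tested |V| - 1 times, and one extra pass detects negative cycles.
def bellman_ford(edges, n, source):
    """Hypothetical sketch: instead of settling the closest unvisited vertex,
    relax every edge |V|-1 times, re-testing whether a newly found path is
    shorter. `edges` is a list of (u, v, weight) triples over vertices 0..n-1."""
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:   # a later path may still improve v
                dist[v] = dist[u] + w
    # One extra pass: any further improvement means a negative cycle.
    has_negative_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, has_negative_cycle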
I am also thinking the same
Dijkstra's algorithm solves the single-source shortest-path problem when all edges have non-negative weights. It is a greedy algorithm, similar to Prim's algorithm. The algorithm starts at the source vertex s and grows a tree T that ultimately spans all vertices reachable from s. Vertices are added to T in order of distance: first s, then the vertex closest to s, then the next closest, and so on.
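For comparison, a minimal Dijkstra sketch using a priority queue, matching the description above (the adjacency-dict input format is assumed for illustration):
import heapq

def dijkstra(adj, source):
    """Hypothetical sketch: grow the tree of settled vertices in order of
    distance from the source. `adj` maps each vertex to a list of
    (neighbour, weight) pairs with non-negative weights."""
    dist = {source: 0}
    settled = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)              # u is now reached via a shortest path
        for v, w in adj.get(u, ()):
            if v not in settled and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist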


Resources