How to store visited states in iterative deepening / depth limited search?

Update: Search for the first solution.
For a normal depth-first search it is simple: just use a hash set.
bool DFS (currentState) =
{
    if (myHashSet.Contains(currentState))
    {
        return false; // already visited
    }
    else
    {
        myHashSet.Add(currentState);
    }
    if (IsSolution(currentState)) return true;
    else
    {
        for (var nextState in GetNextStates(currentState))
            if (DFS(nextState)) return true;
    }
    return false;
}
However, when the search becomes depth-limited, I cannot simply do this:
bool DFS (currentState, maxDepth) =
{
    if (maxDepth == 0) return false;
    if (myHashSet.Contains(currentState))
    {
        return false; // already visited
    }
    else
    {
        myHashSet.Add(currentState);
    }
    if (IsSolution(currentState)) return true;
    else
    {
        for (var nextState in GetNextStates(currentState))
            if (DFS(nextState, maxDepth - 1)) return true;
    }
    return false;
}
This is because the search is then no longer complete (in the sense of always being able to find a solution within maxDepth if there is one).
How should I fix it? Would the fix add more space complexity to the algorithm?
Or does it simply not require memoizing the states at all?
Update:
For example, suppose the decision tree is the following:
A - B - C - D - E - A
                |
                F - G (Goal)
The search starts from state A, and G is a goal state. Clearly there is a solution at depth 3 (A - E - F - G).
However, using my implementation with a depth limit of 4, suppose the search happens to go in the direction
A(0) -> B(1) -> C(2) -> D(3) -> E(4) -> F(5). It exceeds the depth limit at F and backtracks to A. But E is now marked as visited, so the search ignores the direction A - E - F - G.

I had the same problem as you; here is my thread: "Iterative deepening in common lisp".
Basically, to solve this problem using hash tables, you can't just check whether a node was visited before; you also have to consider the depth at which it was previously visited. If the node you're about to examine contains a state that was not seen before, or that was seen before but at a greater depth, then you should still consider it, since it may lead to a shallower solution. That is what iterative deepening is supposed to do: it returns the same solution that BFS would return, which is the shallowest one. So in the hash table you can use the state as the key and the depth as the value. You will need to keep updating the depth value in the hash table whenever you find a shallower node, though.
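As a rough illustration of the state-to-depth table described above, here is a minimal Python sketch. The function names, the `neighbors` and `is_goal` callbacks and the `GRAPH` dictionary are assumptions made for this example (the graph is the A..G example from the question); it is not the code from the linked thread.

```python
def depth_limited_search(state, limit, neighbors, is_goal, best_depth, depth=0):
    """Depth-limited DFS that skips a state only if it was already
    reached at the same or a shallower depth in this iteration."""
    if depth > limit:
        return False
    seen_at = best_depth.get(state)
    if seen_at is not None and seen_at <= depth:
        return False  # an earlier visit was at least as shallow
    best_depth[state] = depth  # first visit, or a shallower one
    if is_goal(state):
        return True
    return any(
        depth_limited_search(nxt, limit, neighbors, is_goal, best_depth, depth + 1)
        for nxt in neighbors(state)
    )


def iterative_deepening(start, max_limit, neighbors, is_goal):
    """Return the smallest depth limit at which a goal is found, or None."""
    for limit in range(max_limit + 1):
        best_depth = {}  # fresh table for every depth limit
        if depth_limited_search(start, limit, neighbors, is_goal, best_depth):
            return limit
    return None


# The example graph from the question: a cycle A-B-C-D-E-A with E-F-G attached.
GRAPH = {
    "A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"],
    "E": ["D", "A", "F"], "F": ["E", "G"], "G": ["F"],
}
print(iterative_deepening("A", 6, GRAPH.__getitem__, lambda s: s == "G"))  # 3
```

Even though B is tried before E, the search still finds G at depth 3, because E is re-expanded when it is reached again at a shallower depth.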
An alternative solution for cycle checking would be to walk back along the path from the current node up to the root: if the node you're about to examine already appears on the path, then it leads to a cycle. This approach is more generic and can be used with any search strategy. It is slower than the hash table approach, though, since each check takes O(d) time, where d is the depth, but the memory usage is greatly reduced.

In each step of IDFS you are actually searching for a shortest path, so you can't simply use a hash set. A hash set helps only when you are searching for the existence of a path of unlimited length.
In this case, you should probably use a hash map that stores the minimum number of steps needed to reach each state, and prune a branch only if the map value can't be improved. The time complexity may change accordingly.
But in fact, IDFS is used in place of BFS when space is limited. Since hashing the states may take almost as much space as BFS, you usually can't afford to store all the states in IDFS.
The IDFS pseudocode on Wikipedia does not use a hash either: http://en.wikipedia.org/wiki/Iterative_deepening_depth-first_search
So let's drop the hash and trade time for space!
Update
It is worthwhile to store the states that are on the current DFS stack, so that the search path never runs into a trivial cycle. The pseudocode implementing this would be:
bool DFS (currentState, maxDepth) =
{
    if (maxDepth == 0) return false;
    if (myHashSet.Contains(currentState))
    {
        return false; // already on the current path
    }
    else
    {
        myHashSet.Add(currentState);
    }
    if (IsSolution(currentState)) return true;
    else
    {
        for (var nextState in GetNextStates(currentState))
            if (DFS(nextState, maxDepth - 1)) return true;
    }
    myHashSet.Remove(currentState); // the state is popped from the stack
    return false;
}

The solution you show is perfectly fine and works for DFSID (depth-first search with iterative deepening). Just do not forget to clear myHashSet before increasing the depth.

Related

Find if path exists in a matrix using DFS?

The problem: given an m x n matrix of 1s and 0s, where 1 is an obstacle and 0 is an allowed vertex, find whether a path exists from the top left to the bottom right of the matrix using DFS. You can move up, down, left or right.
Notice that it doesn't ask for the shortest path. This problem has surprisingly tripped me up. I can do it quite easily with BFS, but the DFS aspect is confusing. What confuses me further is that DFS is supposedly faster in the best case, as it doesn't explore all possible paths like BFS does.
But if we are doing DFS, won't we need backtracking, since this is an undirected graph and we can't do DFS directly? From my understanding this blows up the time complexity to O(4^n), which is significantly slower than BFS.
public boolean pathExists(int[][] matrix){
boolean [][] used = new boolean[matrix.length][matrix[0].length];
}
public boolean pathExistsHelper(int [][] matrix, int vertexRow, int vertexCol, boolean [][] used){
if(outOfbounds(...) return false;
if(used[vertexRow][vertexCol]) return false;
used[vertexRow][vertexCol] = true;
for(each direction) if(pathExists...) return true
used[vertexRow][vertexCol] = false; // backtrack
return false;
}
So you can see what I mean. I was told DFS is equally fast, but how is it possible to do DFS on an undirected graph without backtracking into exponential complexity? Thanks
You do not need to unmark vertices - because your goal is not to enumerate all possible non-self-intersecting paths, of which there are indeed many. This reduces the run-time complexity to O(n), where n is the number of vertices; the same complexity as in BFS, but without needing to store as many vertices in memory, since you do not need a queue.
This kind of DFS is very common to find connected components in graphs, and is also referred to as "flood fill".
A simpler code could look like this:
public boolean pathExists(int[][] matrix) {
boolean exitFound = pathExists(matrix, 0, 0);
// replace 2s by 0s to undo changes to matrix here
// ...
return exitFound;
}
public boolean pathExists(int[][] m, int row, int col) {
// base cases
if (row < 0 || col < 0 || row >= m.length || col >= m[0].length)
return false; // out-of-bounds
if (m[row][col] != 0)
return false; // avoid visiting walls or re-visiting
if (row == m.length-1 && col == m[0].length-1)
return true; // success!
// mark & prepare for recursion
m[row][col] = 2; // never visit again; replace 2s by 0s to undo changes
return pathExists(m, row+1, col) ||
pathExists(m, row-1, col) ||
pathExists(m, row, col+1) ||
pathExists(m, row, col-1);
}
In general, DFS cannot be applied to an undirected graph without additions, because it cannot handle cycles. So you must at least add cycle detection to avoid running around a cycle forever.
In your case you need to avoid leaving a matrix position in the same direction as you already did earlier in your path (this would lead to a cycle, and you would never find a solution).
About the speed of DFS and BFS:
In the worst case they both need to search the whole state tree of the problem, and both visit every node just once, so they are equally fast (in the worst case).

Non recursive DFS algorithm for simple paths between two points

I am looking for a non-recursive depth-first search algorithm to find all simple paths between two points in an undirected graph (cycles are possible).
I checked many posts, and all of them showed a recursive algorithm.
It seems no one is interested in a non-recursive version.
A recursive version looks like this:
void dfs(Graph G, int v, int t)
{
path.push(v);
onPath[v] = true;
if (v == t)
{
print(path);
}
else
{
for (int w : G.adj(v))
{
if (!onPath[w])
dfs(G, w, t);
}
}
path.pop();
onPath[v] = false;
}
So I tried a non-recursive version, but when I checked it, it computed the wrong result:
void dfs(node start,node end)
{
stack m_stack=new stack();
m_stack.push(start);
while(!m_stack.empty)
{
var current= m_stack.pop();
path.push(current);
if (current == end)
{
print(path);
}
else
{
for ( node in adj(current))
{
if (!path.contain(node))
m_stack.push(node);
}
}
path.pop();
}
the test graph is:
(a,b),(b,a),
(b,c),(c,b),
(b,d),(d,b),
(c,f),(f,c),
(d,f),(f,d),
(f,h),(h,f).
it is undirected, that is why there are (a,b) and (b,a).
If the start and end nodes are 'a' and 'h', then there should be two simple paths:
a,b,c,f,h
a,b,d,f,h.
But that algorithm could not find both.
It displayed this output:
a,b,d,f,h,
a,b,d.
The stack is left at the start of the second path, and that is the problem.
Please point out my mistake in changing it to a non-recursive version.
Your help will be appreciated!
I think DFS is a pretty complicated algorithm, especially in its iterative form. The most important insight for the iterative version is that the recursive version keeps not only the current node on the stack, but also the current neighbour. With this in mind, the iterative version in C++ could look like this:
//graph[i][j] stores the j-th neighbour of the node i
void dfs(size_t start, size_t end, const vector<vector<size_t> > &graph)
{
//initialize:
//remember the node (first) and the index of the next neighbour (second)
typedef pair<size_t, size_t> State;
stack<State> to_do_stack;
vector<size_t> path; //remembering the way
vector<bool> visited(graph.size(), false); //caching visited - no need for searching in the path-vector
//start in start!
to_do_stack.push(make_pair(start, 0));
visited[start]=true;
path.push_back(start);
while(!to_do_stack.empty())
{
State &current = to_do_stack.top();//current stays on the stack for the time being...
if (current.first == end || current.second == graph[current.first].size())//goal reached or done with neighbours?
{
if (current.first == end)
print(path);//found a way!
//backtrack:
visited[current.first]=false;//no longer considered visited
path.pop_back();//go a step back
to_do_stack.pop();//no need to explore further neighbours
}
else{//normal case: explore neighbours
size_t next=graph[current.first][current.second];
current.second++;//update the next neighbour in the stack!
if(!visited[next]){
//putting the neighbour on the todo-list
to_do_stack.push(make_pair(next, 0));
visited[next]=true;
path.push_back(next);
}
}
}
}
No warranty that it is bug-free, but I hope you get the gist, and at least it finds both paths in your example.
The path computation is all wrong. You pop the last node before you process its neighbors. Your code should output just the last node.
The simplest fix is to trust the compiler to optimize the recursive solution sufficiently that it won't matter. You can help by not passing large objects between calls and by avoiding allocating/deallocating many objects per call.
The easy fix is to store the entire path in the stack (instead of just the last node).
A harder fix is to have two types of entries on the stack: insert and remove. When you reach an insert entry for node x, you add x to the path, then push a remove entry for x onto the stack, followed by an insert entry for every neighbour y of x. When you hit a remove entry for x, you pop the last value (x) from the path. This better simulates the dynamics of the recursive solution.
A better fix is to just do breadth-first-search since that's easier to implement in an iterative fashion.
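The "harder fix" above, with insert and remove entries on the stack, could be sketched in Python roughly as follows. The names `all_simple_paths`, `"enter"` and `"exit"` are illustrative assumptions, and the adjacency dictionary is the test graph from the question.

```python
def all_simple_paths(adj, start, end):
    """Iterative DFS listing every simple path from start to end."""
    paths = []
    path = []          # the current path, as in the recursive version
    on_path = set()    # fast membership test for the current path
    stack = [("enter", start)]
    while stack:
        action, node = stack.pop()
        if action == "exit":
            # Mirrors path.pop() / onPath[v] = false after the recursive call.
            path.pop()
            on_path.discard(node)
            continue
        if node in on_path:
            continue  # would revisit a node of the current path
        path.append(node)
        on_path.add(node)
        stack.append(("exit", node))  # undo marker, processed after the subtree
        if node == end:
            paths.append(list(path))
            continue
        # Reversed so neighbours are explored in their listed order.
        for nxt in reversed(adj[node]):
            stack.append(("enter", nxt))
    return paths


ADJ = {
    "a": ["b"], "b": ["a", "c", "d"], "c": ["b", "f"],
    "d": ["b", "f"], "f": ["c", "d", "h"], "h": ["f"],
}
for p in all_simple_paths(ADJ, "a", "h"):
    print(",".join(p))
```

For the test graph this prints a,b,c,f,h and then a,b,d,f,h, the two simple paths expected in the question.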

Creating path array using IDDFS

My IDDFS algorithm finds the shortest path in my graph using an adjacency matrix.
It shows how deep the solution is (I understand this to be the number of points connected together from the starting point to the end).
I would like to get these points in an array.
For example:
Let's say that the solution is found at depth 5, so I would like to have an array with the points: {0,2,3,4,6}.
Depth 3: array {1,2,3}.
Here is the algorithm in C++:
(I'm not sure whether the algorithm "knows" if points that were already visited are visited again while searching; I'm almost a beginner with graphs.)
int DLS(int node, int goal, int depth,int adj[9][9])
{
int i,x;
if ( depth >= 0 )
{
if ( node == goal )
return node;
for(i=0;i<nodes;i++)
{
if(adj[node][i] == 1)
{
child = i;
x = DLS(child, goal, depth-1,adj);
if(x == goal)
return goal;
}
}
}
return 0;
}
int IDDFS(int root,int goal,int adj[9][9])
{
depth = 0;
solution = 0;
while(solution <= 0 && depth < nodes)
{
solution = DLS(root, goal, depth,adj);
depth++;
}
if(depth == nodes)
return inf;
return depth-1;
}
int main()
{
int i,u,v,source,goal;
int adj[9][9] = {{0,1,0,1,0,0,0,0,0},
{1,0,1,0,1,0,0,0,0},
{0,1,0,0,0,1,0,0,0},
{1,0,0,0,1,0,1,0,0},
{0,1,0,1,0,1,0,1,0},
{0,0,1,0,1,0,0,0,1},
{0,0,0,1,0,0,0,1,0},
{0,0,0,0,1,0,1,0,1},
{0,0,0,0,0,1,0,1,0}
};
nodes=9;
edges=12;
source=0;
goal=8;
depth = IDDFS(source,goal,adj);
if(depth == inf)printf("No solution Exist\n");
else printf("Solution Found in depth limit ( %d ).\n",depth);
system("PAUSE");
return 0;
}
The reason I'm using IDDFS instead of another path-finding algorithm is that I want to fix the depth to a specified number in order to search for paths of an exact length (but I'm not sure whether that will work).
If someone could suggest another algorithm for finding a path of a specified length using an adjacency matrix, please let me know.
The idea for retrieving the actual path from a pathfinding algorithm is to use a map V->V in which the key is a vertex and the value is the vertex from which the key was discovered (the source will either not be a key, or be a key with a null value, since it was not discovered from any vertex).
The pathfinding algorithm modifies this map while it runs. When it is done, you get your path by reading from the map iteratively, starting from the target and going all the way up to the source, which gives the path in reverse order.
In DFS, you insert the (key, value) pair each time you discover a new vertex (which is the key). Note that if the vertex is already a key in the map, you should skip this branch.
Once you finish exploring a certain path and "close" a vertex, you need to take it out of the map. However, sometimes you can optimize the algorithm and skip this part (it will make the branching factor smaller).
Since IDDFS is actually doing DFS iteratively, you can follow the same logic: each time you start a new DFS iteration with a higher depth, clear the old map and start a new one from scratch.
Other pathfinding algorithms are BFS, A* and Dijkstra's algorithm. Note that the last two also work for weighted graphs. All of these can be terminated when you reach a certain depth, just as DFS is terminated when it reaches a certain depth in IDDFS.
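A minimal Python sketch of the parent-map idea applied to IDDFS on an adjacency matrix might look like this. The function names `dls_path` and `iddfs_path` are assumptions for illustration; the matrix is the one from the question.

```python
def dls_path(adj, node, goal, depth, parent):
    """Depth-limited DFS; parent maps each vertex on the current path
    to the vertex it was discovered from."""
    if node == goal:
        return True
    if depth == 0:
        return False
    for child, connected in enumerate(adj[node]):
        if connected and child not in parent:
            parent[child] = node
            if dls_path(adj, child, goal, depth - 1, parent):
                return True
            del parent[child]  # "close" the vertex: take it out of the map
    return False


def iddfs_path(adj, source, goal):
    """Return a shortest path as a list of vertices, or None."""
    for depth in range(len(adj)):
        parent = {source: None}  # fresh map for every depth limit
        if dls_path(adj, source, goal, depth, parent):
            path = []
            v = goal
            while v is not None:  # walk back from the goal to the source
                path.append(v)
                v = parent[v]
            return path[::-1]
    return None


ADJ = [
    [0, 1, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 1, 0, 1][0:0] + [0, 0, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 1, 0],
]
print(iddfs_path(ADJ, 0, 8))  # [0, 1, 2, 5, 8]
```

Removing a vertex from the map when it is closed means only the current path is remembered, so a vertex can be reached again along a different route within the same depth limit.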

Evaluating expression trees

Skiena's book on Algorithm contains the following question:
1) Evaluate expression given as binary tree in O(n) time, given n nodes.
2) Evaluate expression given as DAG in O(n+m) time, given n nodes and m edges in DAG.
I could think of a way for the first question:
evaluate(node) {
if(node has children){
left_val = evaluate(node->left);
right_val = evaluate(node->right);
// find operation symbol for node and use it as
// val = left_val operation right_val
return val;
}
else {
return node_value;
}
}
Since we visit each node once, it will take O(n) time.
Since the book has no solutions, can anyone please tell if this is correct ?
Also can anyone suggest a solution for second question.
Thanks.
First way looks fine to me.
For the DAG, if you can modify the tree to add cached values to each node, you can use the same algorithm with a small tweak to not recurse if an operator node has a cached value. This should be O(n+m) time (at most one arithmetic operation per node and at most one pointer lookup per edge). Explicitly:
evaluate(node) {
if (node has value) {
return node->value;
} else {
left = evaluate(node->left);
right = evaluate(node->right);
// find operation symbol for node and use it as
// val = left operation right
node->value = val;
return val;
}
}
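For a concrete picture of the cached evaluation on a DAG, here is a small Python sketch. The `Node` class, the `OPS` table and the external `cache` dictionary are assumptions made for the example (the answer stores the cached value on the node itself; a dictionary keyed by node identity plays the same role without modifying the nodes).

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


class Node:
    """Leaf when op is None (uses value); otherwise an operator node."""
    def __init__(self, op=None, left=None, right=None, value=None):
        self.op, self.left, self.right, self.value = op, left, right, value


def evaluate(node, cache=None):
    if cache is None:
        cache = {}
    if node.op is None:
        return node.value
    key = id(node)  # identity, so a shared subexpression is computed once
    if key not in cache:
        cache[key] = OPS[node.op](evaluate(node.left, cache),
                                  evaluate(node.right, cache))
    return cache[key]


shared = Node("+", Node(value=2), Node(value=3))   # (2 + 3), referenced twice
root = Node("*", shared, shared)                   # (2 + 3) * (2 + 3)
print(evaluate(root))  # 25
```

Each operator node is computed at most once and each edge is followed at most once before hitting the cache, which is where the O(n+m) bound comes from.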

Finding all cycles in a directed graph

How can I find (iterate over) ALL the cycles in a directed graph from/to a given node?
For example, I want something like this:
A->B->A
A->B->C->A
but not:
B->C->B
I found this page in my search and since cycles are not same as strongly connected components, I kept on searching and finally, I found an efficient algorithm which lists all (elementary) cycles of a directed graph. It is from Donald B. Johnson and the paper can be found in the following link:
http://www.cs.tufts.edu/comp/150GA/homeworks/hw1/Johnson%2075.PDF
A java implementation can be found in:
http://normalisiert.de/code/java/elementaryCycles.zip
A Mathematica demonstration of Johnson's algorithm can be found here, implementation can be downloaded from the right ("Download author code").
Note: Actually, there are many algorithms for this problem. Some of them are listed in this article:
http://dx.doi.org/10.1137/0205007
According to the article, Johnson's algorithm is the fastest one.
Depth first search with backtracking should work here.
Keep an array of boolean values to keep track of whether you visited a node before. If you run out of new nodes to go to (without hitting a node you have already been), then just backtrack and try a different branch.
The DFS is easy to implement if you have an adjacency list to represent the graph. For example adj[A] = {B,C} indicates that B and C are the children of A.
For example, pseudo-code below. "start" is the node you start from.
dfs(adj,node,visited):
if (visited[node]):
if (node == start):
"found a path"
return;
visited[node]=YES;
for child in adj[node]:
dfs(adj,child,visited)
visited[node]=NO;
Call the above function with the start node:
visited = {}
dfs(adj,start,visited)
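A runnable Python version of this backtracking idea, restricted to cycles that start and end at a given node, might look as follows. The function name `cycles_through` and the example graph are assumptions for illustration.

```python
def cycles_through(adj, start):
    """List every simple cycle that starts and ends at `start`."""
    cycles = []
    path = [start]
    on_path = {start}

    def dfs(node):
        for child in adj.get(node, ()):
            if child == start:
                cycles.append(path + [start])  # closed a cycle back to start
            elif child not in on_path:
                path.append(child)
                on_path.add(child)
                dfs(child)
                on_path.remove(child)  # backtrack
                path.pop()

    dfs(start)
    return cycles


ADJ = {"A": ["B"], "B": ["A", "C"], "C": ["A", "B"]}
for cycle in cycles_through(ADJ, "A"):
    print("->".join(cycle))
```

On this graph it reports A->B->A and A->B->C->A, but not B->C->B, which matches the example in the question. The running time can be exponential in the worst case, since every simple path from the start node is explored.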
The simplest choice I found to solve this problem was the Python library networkx.
It implements Johnson's algorithm, mentioned in the best answer to this question, and makes it quite simple to run.
In short you need the following:
import networkx as nx
import matplotlib.pyplot as plt
# Create Directed Graph
G=nx.DiGraph()
# Add a list of nodes:
G.add_nodes_from(["a","b","c","d","e"])
# Add a list of edges:
G.add_edges_from([("a","b"),("b","c"), ("c","a"), ("b","d"), ("d","e"), ("e","a")])
# Return a list of cycles, each described as a list of nodes
list(nx.simple_cycles(G))
Answer: [['a', 'b', 'd', 'e'], ['a', 'b', 'c']]
First of all - you do not really want to try find literally all cycles because if there is 1 then there is an infinite number of those. For example A-B-A, A-B-A-B-A etc. Or it may be possible to join together 2 cycles into an 8-like cycle etc., etc... The meaningful approach is to look for all so called simple cycles - those that do not cross themselves except in the start/end point. Then if you wish you can generate combinations of simple cycles.
One of the baseline algorithms for finding all simple cycles in a directed graph is this: Do a depth-first traversal of all simple paths (those that do not cross themselves) in the graph. Every time when the current node has a successor on the stack a simple cycle is discovered. It consists of the elements on the stack starting with the identified successor and ending with the top of the stack. Depth first traversal of all simple paths is similar to depth first search but you do not mark/record visited nodes other than those currently on the stack as stop points.
The brute force algorithm above is terribly inefficient and in addition to that generates multiple copies of the cycles. It is however the starting point of multiple practical algorithms which apply various enhancements in order to improve performance and avoid cycle duplication. I was surprised to find out some time ago that these algorithms are not readily available in textbooks and on the web. So I did some research and implemented 4 such algorithms and 1 algorithm for cycles in undirected graphs in an open source Java library here : http://code.google.com/p/niographs/ .
BTW, since I mentioned undirected graphs : The algorithm for those is different. Build a spanning tree and then every edge which is not part of the tree forms a simple cycle together with some edges in the tree. The cycles found this way form a so called cycle base. All simple cycles can then be found by combining 2 or more distinct base cycles. For more details see e.g. this : http://dspace.mit.edu/bitstream/handle/1721.1/68106/FTL_R_1982_07.pdf .
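The spanning-tree construction for undirected graphs can be sketched in Python as follows. This is only an illustration of building a cycle base, under the assumption of a simple graph given as an adjacency dictionary; the name `fundamental_cycles` is made up for the example, and combining base cycles into all simple cycles is not shown.

```python
from collections import deque


def fundamental_cycles(adj):
    """Return one cycle per non-tree edge of a BFS spanning forest."""
    parent, depth = {}, {}
    tree_edges = set()
    for root in adj:
        if root in parent:
            continue
        parent[root], depth[root] = None, 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent:
                    parent[v], depth[v] = u, depth[u] + 1
                    tree_edges.add(frozenset((u, v)))
                    queue.append(v)
    cycles, done = [], set()
    for u in adj:
        for v in adj[u]:
            edge = frozenset((u, v))
            if u == v or edge in tree_edges or edge in done:
                continue
            done.add(edge)
            # Climb from both endpoints until they meet at a common ancestor.
            left, right = [u], [v]
            a, b = u, v
            while a != b:
                if depth[a] >= depth[b]:
                    a = parent[a]
                    left.append(a)
                else:
                    b = parent[b]
                    right.append(b)
            # left ends at the common ancestor; append right without it, reversed.
            cycles.append(left + right[-2::-1])
    return cycles


TRIANGLE_WITH_TAIL = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
print(fundamental_cycles(TRIANGLE_WITH_TAIL))  # [[2, 1, 3]]
```

A connected graph with n vertices and m edges yields m - n + 1 base cycles this way, one for each edge left out of the spanning tree.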
The DFS-based variants with back edges will indeed find cycles, but in many cases they will NOT be minimal cycles. In general DFS tells you that there is a cycle, but it is not good enough to actually find the cycles. For example, imagine 5 different cycles sharing two edges. There is no simple way to identify the cycles using just DFS (including backtracking variants).
Johnson's algorithm indeed gives all unique simple cycles and has good time and space complexity.
But if you just want to find MINIMAL cycles (meaning that there may be more than one cycle going through any vertex, and we are interested in finding the minimal ones) AND your graph is not very large, you can try the simple method below.
It is VERY simple but rather slow compared to Johnson's.
So, one of the absolutely easiest ways to find MINIMAL cycles is to use Floyd's algorithm to find minimal paths between all the vertices using an adjacency matrix.
This algorithm is nowhere near as optimal as Johnson's, but it is so simple and its inner loop is so tight that for smaller graphs (<=50-100 nodes) it absolutely makes sense to use it.
Time complexity is O(n^3), space complexity O(n^2) if you use parent tracking and O(1) if you don't.
First of all let's find the answer to the question if there is a cycle.
The algorithm is dead-simple. Below is snippet in Scala.
val NO_EDGE = Integer.MAX_VALUE / 2
def shortestPath(weights: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
weights(i)(j) = throughK
}
}
}
Originally this algorithm operates on weighted-edge graph to find all shortest paths between all pairs of nodes (hence the weights argument). For it to work correctly you need to provide 1 if there is a directed edge between the nodes or NO_EDGE otherwise.
After the algorithm executes, you can check the main diagonal: if a value there is less than NO_EDGE, then that node participates in a cycle whose length equals the value. Every other node on that cycle will have the same value on the main diagonal, unless it also lies on a shorter cycle.
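The diagonal check can be illustrated with a short Python sketch. The function name `shortest_cycle_lengths` and the use of `float("inf")` for NO_EDGE are assumptions for this example; the Scala snippet uses a large integer instead.

```python
NO_EDGE = float("inf")


def shortest_cycle_lengths(n, edges):
    """Floyd-Warshall with unit weights; the diagonal is left at NO_EDGE,
    so dist[i][i] ends up as the length of the shortest cycle through i."""
    dist = [[NO_EDGE] * n for _ in range(n)]
    for u, v in edges:
        dist[u][v] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return [dist[i][i] for i in range(n)]


# 0 -> 1 -> 2 -> 0 is a cycle; node 3 hangs off node 2 and is on no cycle.
print(shortest_cycle_lengths(4, [(0, 1), (1, 2), (2, 0), (2, 3)]))
# [3, 3, 3, inf]
```

Note that the diagonal must not be initialised to 0 as in the textbook all-pairs version, otherwise every node would appear to sit on a cycle of length 0.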
To reconstruct the cycle itself we need to use slightly modified version of algorithm with parent tracking.
def shortestPath(weights: Array[Array[Int]], parents: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
parents(i)(j) = parents(i)(k)
weights(i)(j) = throughK
}
}
}
The parents matrix should initially contain the destination vertex index in an edge cell if there is an edge between the vertices, and -1 otherwise (this is what the test program below does).
After the function returns, parents(i)(j) holds the next vertex after i on the shortest path from i to j.
And then it's easy to recover actual cycles.
All in all we have the following program to find all minimal cycles
import scala.collection.mutable
val NO_EDGE = Integer.MAX_VALUE / 2;
def shortestPathWithParentTracking(
weights: Array[Array[Int]],
parents: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
parents(i)(j) = parents(i)(k)
weights(i)(j) = throughK
}
}
}
def recoverCycles(
cycleNodes: Seq[Int],
parents: Array[Array[Int]]): Set[Seq[Int]] = {
val res = new mutable.HashSet[Seq[Int]]()
for (node <- cycleNodes) {
var cycle = new mutable.ArrayBuffer[Int]()
cycle += node
var other = parents(node)(node)
do {
cycle += other
other = parents(other)(node)
} while(other != node)
res += cycle.sorted
}
res.toSet
}
and a small main method just to test the result
def main(args: Array[String]): Unit = {
val n = 3
val weights = Array(Array(NO_EDGE, 1, NO_EDGE), Array(NO_EDGE, NO_EDGE, 1), Array(1, NO_EDGE, NO_EDGE))
val parents = Array(Array(-1, 1, -1), Array(-1, -1, 2), Array(0, -1, -1))
shortestPathWithParentTracking(weights, parents)
val cycleNodes = weights.indices.filter(i => weights(i)(i) < NO_EDGE)
val cycles: Set[Seq[Int]] = recoverCycles(cycleNodes, parents)
println("The following minimal cycle found:")
cycles.foreach(c => println(c.mkString))
println(s"Total: ${cycles.size} cycle found")
}
and the output is
The following minimal cycle found:
012
Total: 1 cycle found
To clarify:
Strongly Connected Components will find all subgraphs that have at least one cycle in them, not all possible cycles in the graph. e.g. if you take all strongly connected components and collapse/group/merge each one of them into one node (i.e. a node per component), you'll get a tree with no cycles (a DAG actually). Each component (which is basically a subgraph with at least one cycle in it) can contain many more possible cycles internally, so SCC will NOT find all possible cycles, it will find all possible groups that have at least one cycle, and if you group them, then the graph will not have cycles.
To find all simple cycles in a graph, as others mentioned, Johnson's algorithm is a candidate.
I was given this as an interview question once; I suspect this has happened to you and you are coming here for help. Break the problem into three questions and it becomes easier.
1) How do you determine the next valid route?
2) How do you determine if a point has been used?
3) How do you avoid crossing over the same point again?
Problem 1)
Use the iterator pattern to provide a way of iterating route results. A good place to put the logic to get the next route is probably the "moveNext" of your iterator. To find a valid route, it depends on your data structure. For me it was a sql table full of valid route possibilities so I had to build a query to get the valid destinations given a source.
Problem 2)
Push each node into a collection as you find it. This means that you can see whether you are "doubling back" over a point very easily by interrogating the collection you are building on the fly.
Problem 3)
If at any point you see you are doubling back, you can pop things off the collection and "back up". Then from that point try to "move forward" again.
Hack: if you are using SQL Server 2008, there are some new "hierarchy" features you can use to quickly solve this if you structure your data in a tree.
In the case of undirected graph, a paper recently published (Optimal listing of cycles and st-paths in undirected graphs) offers an asymptotically optimal solution. You can read it here http://arxiv.org/abs/1205.2766 or here http://dl.acm.org/citation.cfm?id=2627951
I know it doesn't answer your question, but since the title of your question doesn't mention direction, it might still be useful for Google search
Start at node X and check all child nodes (parent and child nodes are equivalent if undirected). Mark those child nodes as being children of X. From any such child node A, mark its children as being children of A and X', where X' means being 2 steps away from X. If you later hit X and mark it as being a child of X'', that means X is in a 3-node cycle. Backtracking to its parent is easy (as is, the algorithm has no support for this, so you'd find whichever parent has X').
Note: If graph is undirected or has any bidirectional edges, this algorithm gets more complicated, assuming you don't want to traverse the same edge twice for a cycle.
If what you want is to find all elementary circuits in a graph, you can use the EC algorithm by James C. Tiernan, published in a paper in 1970.
The original EC algorithm, as I managed to implement it in PHP, is shown below (I hope there are no mistakes). It can find loops too, if there are any. The circuits in this implementation (which tries to clone the original) are the non-zero elements. Zero here stands for non-existence (null as we know it).
Below that follows another implementation that makes the algorithm more independent: the nodes can start from anywhere, even from negative numbers, e.g. -4, -3, -2, etc.
In both cases it is required that the nodes are sequential.
You might need to study the original paper: James C. Tiernan, Elementary Circuit Algorithm
<?php
echo "<pre><br><br>";
$G = array(
1=>array(1,2,3),
2=>array(1,2,3),
3=>array(1,2,3)
);
define('N',key(array_slice($G, -1, 1, true)));
$P = array(1=>0,2=>0,3=>0,4=>0,5=>0);
$H = array(1=>$P, 2=>$P, 3=>$P, 4=>$P, 5=>$P );
$k = 1;
$P[$k] = key($G);
$Circ = array();
#[Path Extension]
EC2_Path_Extension:
foreach($G[$P[$k]] as $j => $child ){
if( $child>$P[1] and in_array($child, $P)===false and in_array($child, $H[$P[$k]])===false ){
$k++;
$P[$k] = $child;
goto EC2_Path_Extension;
} }
#[EC3 Circuit Confirmation]
if( in_array($P[1], $G[$P[$k]])===true ){//if PATH[1] is not child of PATH[current] then don't have a cycle
$Circ[] = $P;
}
#[EC4 Vertex Closure]
if($k===1){
goto EC5_Advance_Initial_Vertex;
}
//since it is not considered again, it is safe to delete it
for( $m=1; $m<=N; $m++){//H[P[k-1], m] <- P[k], at the first m where H[P[k-1], m] = 0
if( $H[$P[$k-1]][$m]===0 ){
$H[$P[$k-1]][$m]=$P[$k];
break(1);
}
}
for( $m=1; $m<=N; $m++ ){//H[P[k], m] <- O, m = 1, 2, . . . , N
$H[$P[$k]][$m]=0;
}
$P[$k]=0;
$k--;
goto EC2_Path_Extension;
#[EC5 Advance Initial Vertex]
EC5_Advance_Initial_Vertex:
if($P[1] === N){
goto EC6_Terminate;
}
$P[1]++;
$k=1;
$H=array(
1=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
2=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
3=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
4=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
5=>array(1=>0,2=>0,3=>0,4=>0,5=>0)
);
goto EC2_Path_Extension;
#[EC6 Terminate]
EC6_Terminate:
print_r($Circ);
?>
Then this is the other implementation, more independent of the graph, without goto and without array values; instead it uses array keys: the path, the graph and the circuits are stored as array keys (use array values if you like, just change the required lines). The example graph starts from -4 to show its independence.
<?php
$G = array(
-4=>array(-4=>true,-3=>true,-2=>true),
-3=>array(-4=>true,-3=>true,-2=>true),
-2=>array(-4=>true,-3=>true,-2=>true)
);
$C = array();
EC($G,$C);
echo "<pre>";
print_r($C);
function EC($G, &$C){
$CNST_not_closed = false; // this flag indicates no closure
$CNST_closed = true; // this flag indicates closure
// define the state where there is no closures for some node
$tmp_first_node = key($G); // first node = first key
$tmp_last_node = $tmp_first_node-1+count($G); // last node = last key
$CNST_closure_reset = array();
for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
$CNST_closure_reset[$k] = $CNST_not_closed;
}
// define the state where there is no closure for all nodes
for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
$H[$k] = $CNST_closure_reset; // Key in the closure arrays represent nodes
}
unset($tmp_first_node);
unset($tmp_last_node);
# Start algorithm
foreach($G as $init_node => $children){#[Jump to initial node set]
#[Initial Node Set]
$P = array(); // declare at startup, remove the old $init_node from path on loop
$P[$init_node]=true; // the first key in P is always the new initial node
$k=$init_node; // update the current node
// On loop H[old_init_node] is not cleared cause is never checked again
do{#Path 1,3,7,4 jump here to extend father 7
do{#Path from 1,3,8,5 became 2,4,8,5,6 jump here to extend child 6
$new_expansion = false;
foreach( $G[$k] as $child => $foo ){#Consider each child of 7 or 6
if( $child>$init_node and isset($P[$child])===false and $H[$k][$child]===$CNST_not_closed ){
$P[$child]=true; // add this child to the path
$k = $child; // update the current node
$new_expansion=true;// set the flag for expanding the child of k
break(1); // we are done, one child at a time
} } }while(($new_expansion===true));// Do while a new child has been added to the path
# If the first node is child of the last we have a circuit
if( isset($G[$k][$init_node])===true ){
$C[] = $P; // Leaving this out of closure will catch loops to
}
# Closure
if($k>$init_node){ //if k>init_node then alwaya count(P)>1, so proceed to closure
$new_expansion=true; // $new_expansion is never true, set true to expand father of k
unset($P[$k]); // remove k from path
end($P); $k_father = key($P); // get father of k
$H[$k_father][$k]=$CNST_closed; // mark k as closed
$H[$k] = $CNST_closure_reset; // reset k closure
$k = $k_father; // update k
} } while($new_expansion===true);//if we don't wnter the if block m has the old k$k_father_old = $k;
// Advance Initial Vertex Context
}//foreach initial
}//function
?>
I have analyzed and documented the EC algorithm, but unfortunately the documentation is in Greek.
There are two steps (algorithms) involved in finding all cycles in a directed graph.
The first step is to use Tarjan's algorithm to find the set of strongly connected components.
Start from any arbitrary vertex.
Run a DFS from that vertex. For each node x, keep two numbers, dfs_index[x] and dfs_lowval[x].
dfs_index[x] stores when that node was visited, while dfs_lowval[x] = min(dfs_lowval[k]), where
k ranges over all children of x that are not the direct parent of x in the DFS spanning tree.
All nodes with the same dfs_lowval[x] are in the same strongly connected component.
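As a rough illustration of this first step, here is a minimal C++ sketch of Tarjan's algorithm. It follows the usual stack-based formulation, in which a component is closed when dfs_lowval[x] == dfs_index[x], rather than grouping nodes by equal dfs_lowval directly. The function name tarjanScc, the adjacency-list representation, and the 0..n-1 vertex numbering are assumptions made for the example, not part of the answer above.

```cpp
#include <algorithm>
#include <functional>
#include <stack>
#include <vector>

// Sketch of Tarjan's strongly connected components algorithm.
// adj[v] lists the successors of v; vertices are assumed to be 0..n-1.
// Returns comp, where comp[v] is the id of the component containing v.
std::vector<int> tarjanScc(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> dfs_index(n, -1), dfs_lowval(n, 0), comp(n, -1);
    std::vector<bool> on_stack(n, false);
    std::stack<int> st;
    int counter = 0, comp_count = 0;

    std::function<void(int)> visit = [&](int x) {
        dfs_index[x] = dfs_lowval[x] = counter++;
        st.push(x);
        on_stack[x] = true;
        for (int k : adj[x]) {
            if (dfs_index[k] == -1) {            // tree edge: recurse first
                visit(k);
                dfs_lowval[x] = std::min(dfs_lowval[x], dfs_lowval[k]);
            } else if (on_stack[k]) {            // edge back into the open component
                dfs_lowval[x] = std::min(dfs_lowval[x], dfs_index[k]);
            }
        }
        if (dfs_lowval[x] == dfs_index[x]) {     // x is the root of a component
            int w;
            do {
                w = st.top();
                st.pop();
                on_stack[w] = false;
                comp[w] = comp_count;
            } while (w != x);
            ++comp_count;
        }
    };

    for (int v = 0; v < n; ++v)
        if (dfs_index[v] == -1) visit(v);
    return comp;
}
```

Two vertices u and v can lie on a common cycle only if comp[u] == comp[v], so the second step can be run on each component separately.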
The second step is to find cycles (paths) within the connected components. My suggestion is to use a modified version of Hierholzer's algorithm.
The idea is:
Choose any starting vertex v, and follow a trail of edges from that vertex until you return to v.
It is not possible to get stuck at any vertex other than v, because the even degree of all vertices ensures that, when the trail enters another vertex w there must be an unused edge leaving w. The tour formed in this way is a closed tour, but may not cover all the vertices and edges of the initial graph.
As long as there exists a vertex v that belongs to the current tour but that has adjacent edges not part of the tour, start another trail from v, following unused edges until you return to v, and join the tour formed in this way to the previous tour.
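The trail-and-splice idea can be sketched in C++ as follows. This is the plain, unmodified Hierholzer's algorithm for a directed graph, not the modified version suggested above, and it assumes that every vertex has equal in-degree and out-degree and that all edges are reachable from the start vertex. The name eulerianCircuit and the adjacency-list input are hypothetical choices for the example.

```cpp
#include <algorithm>
#include <vector>

// Sketch of Hierholzer's algorithm on a directed graph.
// adj[v] lists the successors of v (taken by value, because edges are consumed).
// Assumes in-degree == out-degree everywhere and all edges reachable from start.
// Returns the closed tour as a vertex sequence that begins and ends at start.
std::vector<int> eulerianCircuit(std::vector<std::vector<int>> adj, int start) {
    std::vector<int> trail{start};  // current trail, used as a stack
    std::vector<int> tour;          // finished tour, built in reverse
    while (!trail.empty()) {
        int v = trail.back();
        if (!adj[v].empty()) {
            // follow an unused edge leaving v
            int w = adj[v].back();
            adj[v].pop_back();
            trail.push_back(w);
        } else {
            // no unused edges at v: it is finished, so emit it and back up;
            // sub-tours found on the way back are spliced in automatically
            tour.push_back(v);
            trail.pop_back();
        }
    }
    std::reverse(tour.begin(), tour.end());
    return tour;
}
```

For the graph with edges 0->1, 1->0, 0->2, 2->0, the result is a single closed tour through all four edges that starts and ends at 0; the two simple cycles it contains are the sub-tours that were joined at vertex 0.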
Here is the link to a Java implementation with a test case:
http://stones333.blogspot.com/2013/12/find-cycles-in-directed-graph-dag.html
I stumbled over the following algorithm which seems to be more efficient than Johnson's algorithm (at least for larger graphs). I am however not sure about its performance compared to Tarjan's algorithm.
Additionally, I only checked it out for triangles so far. If interested, please see "Arboricity and Subgraph Listing Algorithms" by Norishige Chiba and Takao Nishizeki (http://dx.doi.org/10.1137/0214017)
DFS from the start node s, keep track of the DFS path during traversal, and record the path if you find an edge from node v in the path to s. (v,s) is a back-edge in the DFS tree and thus indicates a cycle containing s.
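A minimal C++ sketch of this idea is shown below. It collects every simple cycle through s by extending the current DFS path and recording it whenever an edge leads back to s. The names extendPath and cyclesThrough, and the adjacency-list input with vertices 0..n-1, are assumptions made for the example.

```cpp
#include <vector>

// Extends the current path from v; records the path whenever an edge (v, s) is found.
void extendPath(const std::vector<std::vector<int>>& adj, int s, int v,
                std::vector<bool>& on_path, std::vector<int>& path,
                std::vector<std::vector<int>>& out) {
    on_path[v] = true;
    path.push_back(v);
    for (int w : adj[v]) {
        if (w == s) {
            out.push_back(path);              // edge (v, s) closes a cycle through s
        } else if (!on_path[w]) {
            extendPath(adj, s, w, on_path, path, out);
        }
    }
    path.pop_back();                          // backtrack so v can appear on other paths
    on_path[v] = false;
}

// Returns all simple cycles through s, each as the vertex sequence starting at s.
std::vector<std::vector<int>> cyclesThrough(const std::vector<std::vector<int>>& adj, int s) {
    std::vector<bool> on_path(adj.size(), false);
    std::vector<int> path;
    std::vector<std::vector<int>> out;
    extendPath(adj, s, s, on_path, path, out);
    return out;
}
```

Because the search backtracks instead of marking vertices as permanently visited, it can take exponential time on dense graphs; that is inherent in listing all cycles rather than a flaw of this particular sketch.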
Regarding your question about the Permutation Cycle, read more here:
https://www.codechef.com/problems/PCYCLE
You can try this code (enter the size, then the numbers of the permutation):
#include <cstdio>

int main()
{
    int n;
    scanf("%d", &n);
    int num[1001];             // the permutation, 1-indexed (n <= 1000 is assumed)
    int visited[1001] = {0};
    int vindex[3001];          // per cycle: its elements, the repeated start, and a 0 terminator
    for (int i = 1; i <= n; i++)
        scanf("%d", &num[i]);
    int t_visited = 0;
    int cycles = 0;
    int start = 0, index;
    while (t_visited < n)
    {
        // start a new cycle at the first element that has not been visited yet
        for (int i = 1; i <= n; i++)
        {
            if (visited[i] == 0)
            {
                vindex[start] = i;
                visited[i] = 1;
                t_visited++;
                index = start;
                break;
            }
        }
        // follow the permutation until it returns to the start of the cycle
        while (true)
        {
            index++;
            vindex[index] = num[vindex[index - 1]];
            if (vindex[index] == vindex[start])
                break;
            visited[vindex[index]] = 1;
            t_visited++;
        }
        vindex[++index] = 0;   // 0 marks the end of this cycle
        start = index + 1;
        cycles++;
    }
    printf("%d\n", cycles);
    for (int i = 0; i < (n + 2 * cycles); i++)
    {
        if (vindex[i] == 0)
            printf("\n");
        else
            printf("%d ", vindex[i]);
    }
}
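A C++ DFS version of the pseudocode in the second answer:
#include <iostream>
#include <vector>
using namespace std;

vector<vector<int>> G; // adjacency list; assumed to be filled in elsewhere

// Prints every cycle through start; visited marks the vertices on the current path.
void findCircleUnit(int start, int v, bool* visited, vector<int>& path) {
    if (visited[v]) {
        if (v == start) {
            for (auto c : path)
                cout << c << " ";
            cout << endl;
        }
        return;
    }
    visited[v] = true;
    path.push_back(v);
    for (auto i : G[v])
        findCircleUnit(start, i, visited, path);
    visited[v] = false;
    path.pop_back();
}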
http://www.me.utexas.edu/~bard/IP/Handouts/cycles.pdf
The CXXGraph library provides a set of algorithms and functions to detect cycles.
For a full algorithm explanation visit the wiki.

Resources