Dyanmic Shortest Path - algorithm

Your friends are planning an expedition to a small town deep in the Canadian north
next winter break. They’ve researched all the travel options and have drawn up a directed
graph whose nodes represent intermediat destinations and edges represent the reoads betweeen
them.
In the course of this, they’ve also learned that extreme weather causes roads in this part of
the world to become quite slow in the winter and may cause large travel delays. They’ve
found an excellent travel Web site that can accurately predict how fast they’ll be able to
travel along the roads; however, the speed of travel depends on the time of the year. More
precisely, the Web site answers queries of the following form: given an edge e = (u, v)
connecting two sites u and v, and given a proposed starting time t from location u, the
site will return a value fe(t), the predicted arrival time at v. The web site guarantees that
1
fe(t) > t for every edge e and every time t (you can’t travel backward in time), and that
fe(t) is a monotone increasing function of t (that is, you do not arrive earlier by starting
later). Other than that, the functions fe may be arbitrary. For example, in areas where the
travel time does not vary with the season, we would have fe(t) = t + e, wheree is the
time needed to travel from the beginning to the end of the edge e.
Your friends want to use the Web site to determine the fastest way to travel through the
directed graph from their starting point to their intended destination. (You should assume
that they start at time 0 and that all predictions made by the Web site are completely
correct.) Give a polynomial-time algorithm to do this, where we treat a single query to
the Web site (based on a specific edge e and a time t) as taking a single computational step.
def updatepath(node):
randomvalue = random.randint(0,3)
print(node,"to other node:",randomvalue)
for i in range(0,n):
distance[node][i] = distance[node][i] + randomvalue
def minDistance(dist,flag_array,n):
min_value = math.inf
for i in range(0,n):
if dist[i] < min_value and flag_array[i] == False:
min_value = dist[i]
min_index = i
return min_index
def shortest_path(graph, src,n):
dist = [math.inf] * n
flag_array = [False] * n
dist[src] = 0
for cout in range(n):
#find the node index that have min cost
u = minDistance(dist,flag_array,n)
flag_array[u] = True
updatepath(u)
for i in range(n):
if graph[u][i] > 0 and flag_array[i]==False and dist[i] > dist[u] + graph[u][i]:
dist[i] = dist[u] + graph[u][i]
path[i] = u
return dist
I applied Dijkstra algorithm but it is not correct ? What would i change in my algorithm to work it for dynamic changing edge.

Well, Key points are that function is monotonically increasing. There is an algorithm which exploits this property and it is called A*.
Accumulated cost: Your prof wants you to use two distances one is accumulated cost(this is simple the cost from previous added to the cost/time needed to move to the next node).
Heuristic cost: This is some predicted cost.
Disjkstra approach would not work because you are working with heuristic cost/predicted and accumulated cost.
Monotonically increasing means h(A) <= h(A) + f(A..B).It simply says that if you move from node A to node B then the cost should not be less than the previous node (in this case A) and this is heuristic + accumulated. If this property holds then the first path which A* chooses is always the path to goal and it never needs to backtrack.
Note: The power of this algorithm is totally base on how you predict value.
If you underestimate the value that will be corrected with accumulated value but if you overestimate the value it will chose wrong path.
Algorithm:
Create a Min Priority queue.
insert initial city in q.
while(!pq.isEmpty() && !Goalfound)
Node min = pq.delMin() //this should return you a cities to which your
distance(heuristic+accumulated is minial).
put all succesors of min in pq // all cities which you can reach, you
can better make a list of visited
cities s that queue will be
efficient by not placing same
element twice.
Keep doing this and at the end you will either reach goal or your queue will be empty
Extra
Here i implemented a 8-puzzle-solve using A*, it can give you an idea about how costs are defined and ho it works.
`
private void solve(MinPQ<Node> pq, HashSet<Node> closedList) {
while(!(pq.min().getBoad().isGoal(pq.min().getBoad()))){
Node e = pq.delMin();
closedList.add(e);
for(Board boards: e.getBoad().neighbors()){
Node nextNode = new Node(boards,e,e.getMoves()+1);
if(!equalToPreviousNode(nextNode,e.getPreviousNode()))
pq.insert(nextNode);
}
}
Node collection = pq.delMin();
while(!(collection.getPreviousNode() == null)){
this.getB().add(collection.getBoad());
collection =collection.getPreviousNode();
}
this.getB().add(collection.getBoad());
System.out.println(pq.size());
}
A link to full code is here.

Related

Shortest Path for n entities

I'm trying to solve a problem that is about minimizing the distance traveled by a group of n entities who have to go trough a group of x points in a given order.
The n entities all start in the same position (1,1) and then I'm given x points that are in a queue and have to be "answered" in the correct order. However, I want the distance to be minimal.
My approach to the algorithm so far was to order the entities in increasing order of their distance to the x that is next in line. Then, I'd check, from the ones that are closer to the ones that are the furthest away, if this distance to the next in line is bigger than the distance to the one that comes afterwards to minimize the distance. The closest one to not fulfill this condition went to answer. If all were closer to the x that came afterwards, I'd reorder them in increasing order of distance to the one that came afterwards and send the furthest away from this to answer the x. Since this is a test problem I'm doing as practice for a competition I know what the result should be for my test case and it seems I'm doing this wrong.
How should I implement such an algorithm that guarantees that the distance is minimal?
The algorithm you've described sounds like a greedy search algorithm. Greedy algorithms are not guaranteed to find optimal solutions except under specific conditions that don't seem to hold here.
This looks like a candidate for a dynamic programming formulation. Alternatively, you can use heuristic search such as the A* search algorithm. I'd go with the latter if I was in the competition. See the link for a description of how the algorithm works, how to implement it, and how you might apply it to your problem.
Although since the points need to be visited in order, there is a bound on the number of possible arrangements, I couldn't think of a more efficient way than the following formulation. Let f(ns, i) represent the optimal arrangement up to the ith point, where ns is the list of the last chosen point for each entity that has at least one point. Then we have at most two choices: either start a new entity if we haven't run out, or try the current point as the next visit for each entity.
Python recursion:
import math
def d(p1, p2):
return math.sqrt(math.pow(p1[0] - p2[0], 2) + math.pow(p1[1] - p2[1], 2))
def f(ps, n):
def g(ns, i):
if i == len(ps):
return 0
# start a new entity if we haven't run out
best = d((0,0), ps[i]) + g(ns[:] + [i], i + 1) if len(ns) < n else float('inf')
# try the current point as the next visit for each entity
for entity_idx, point_idx in enumerate(ns):
_ns = ns[:]
_ns[entity_idx] = i
best = min(best, d(ps[point_idx], ps[i]) + g(_ns, i + 1))
return best
return d((0,0), ps[0]) + g([0], 1)
Output:
"""
p5
p3
p1
(0,0) p4 p6
p2
"""
points = [(0,1), (0,-1), (0,2), (2,0), (0,3), (3,0)]
print f(points, 3) # 7.0

Bidirectional spanning tree

I came across this question from interviewstreet.com
Machines have once again attacked the kingdom of Xions. The kingdom
of Xions has N cities and N-1 bidirectional roads. The road network is
such that there is a unique path between any pair of cities.
Morpheus has the news that K Machines are planning to destroy the
whole kingdom. These Machines are initially living in K different
cities of the kingdom and anytime from now they can plan and launch an
attack. So he has asked Neo to destroy some of the roads to disrupt
the connection among Machines i.e after destroying those roads there
should not be any path between any two Machines.
Since the attack can be at any time from now, Neo has to do this task
as fast as possible. Each road in the kingdom takes certain time to
get destroyed and they can be destroyed only one at a time.
You need to write a program that tells Neo the minimum amount of time
he will require to disrupt the connection among machines.
Sample Input First line of the input contains two, space-separated
integers, N and K. Cities are numbered 0 to N-1. Then follow N-1
lines, each containing three, space-separated integers, x y z, which
means there is a bidirectional road connecting city x and city y, and
to destroy this road it takes z units of time. Then follow K lines
each containing an integer. Ith integer is the id of city in which ith
Machine is currently located.
Output Format Print in a single line the minimum time required to
disrupt the connection among Machines.
Sample Input
5 3
2 1 8
1 0 5
2 4 5
1 3 4
2
4
0
Sample Output
10
Explanation Neo can destroy the road connecting city 2 and city 4 of
weight 5 , and the road connecting city 0 and city 1 of weight 5. As
only one road can be destroyed at a time, the total minimum time taken
is 10 units of time. After destroying these roads none of the Machines
can reach other Machine via any path.
Constraints
2 <= N <= 100,000
2 <= K <= N
1 <= time to destroy a road <= 1000,000
Can someone give idea how to approach the solution.
The kingdom has N cities, N-1 edges and it's fully connected, therefore our kingdom is tree (in graph theory). At this picture you can see tree representation of your input graph in which Machines are represented by red vertices.
By the way you should consider all paths from the root vertex to all leaf nodes. So in every path you would have several red nodes and during removing edges you should take in account only neighboring red nodes. For example in path 0-10 there are two meaningfull pairs - (0,3) and (3,10). And you must remove exactly one node (not less, not more) from each path which connected vertices in pairs.
I hope this advice was helpful.
All the three answers will lead to correct solution but you can not achieve the solution within the time limit provided by interviewstreet.com. You have to think of some simple approach to solve this problem successfully.
HINT: start from the node where machine is present.
As said by others, a connected graph with N vertices and N-1 edges is a tree.
This kind of problem asks for a greedy solution; I'd go for a modification of Kruskal's algorithm:
Start with a set of N components - 1 for every node (city). Keep track of which components contain a machine-occupied city.
Take 1 edge (road) at a time, order by descending weight (starting with roads most costly to destroy). For this edge (which necessarily connects two components - the graph is a tree):
if both neigboring components contain a machine-occupied city, this road must be destroyed, mark it as such
otherwise, merge the neigboring components into one. If one of them contained a machine-occupied city, so does the merged component.
When you're done with all edges, return the sum of costs for the destroyed roads.
Complexity will be the same as Kruskal's algorithm, that is, almost linear for well chosen data structure and sorting method.
pjotr has a correct answer (though not quite asymptotically optimal), but this statement
This kind of problem asks for a greedy solution
really requires proof, as in the real world (as distinguished from competitive programming), there are several problems of this “kind” for which the greedy solution is not optimal (e.g., this very same problem in general graphs, which is called multiterminal cut and is NP-hard). In this case, proof consists of verifying the matroid axioms. Let a set of edges A &subseteq; E be independent if the graph (V, E &setminus; A) has exactly |A| + 1 connected components containing at least one machine.
Independence of the empty set. Trivial.
Hereditary property. Let A be an independent set. Every edge e &in; A joins two connected components of the graph (V, E &setminus; A), and every connected component contains at least one machine. In putting e back in the graph, the number of connected components containing at least one machine decreases by 1, so A &setminus; {e} is also independent.
Augmentation property. Let A and B be independent sets with |A| < |B|. Since (V, E &setminus; B) has more connected components than (V, E &setminus; A), there exists by the pigeonhole principle a pair of machines u, v such that u and v are disconnected by B but not by A. Since there is exactly one path from u to v, B contains at least one edge e on this path, and A cannot contain e. The removal of A ∪ {e} induces one more connected component containing at least one machine than A, so A ∪ {e} is independent, as required.
Start performing a DFS from either of the machine nodes. Also, keep track of the edge with min weight encountered so far. As soon as you find the next node which also contains a machine, delete the min edge recorded so far. Start DFS from this new node now.
Repeat until you have found all nodes where the machines exists.
Should be of the O(N) that way !!
I write some code, and pasted all the tests.
#include <iostream>
#include<algorithm>
using namespace std;
class Line {
public:
Line(){
begin=0;end=0; weight=0;
}
int begin;int end;int weight;
bool operator<(const Line& _l)const {
return weight>_l.weight;
}
};
class Point{
public:
Point(){
pre=0;machine=false;
}
int pre;
bool machine;
};
void DP_Matrix();
void outputLines(Line* lines,Point* points,int N);
int main() {
DP_Matrix();
system("pause");
return 0;
}
int FMSFind(Point* trees,int x){
int r=x;
while(trees[r].pre!=r)
r=trees[r].pre;
int i=x;int j;
while(i!=r) {
j=trees[i].pre;
trees[i].pre=r;
i=j;
}
return r;
}
void DP_Matrix(){
int N,K,machine_index;scanf("%d%d",&N,&K);
Line* lines=new Line[100000];
Point* points=new Point[100000];
N--;
for(int i=0;i<N;i++) {
scanf("%d%d%d",&lines[i].begin,&lines[i].end,&lines[i].weight);
points[i].pre=i;
}
points[N].pre=N;
for(int i=0;i<K;i++) {
scanf("%d",&machine_index);
points[machine_index].machine=true;
}
long long finalRes=0;
for(int i=0;i<N;i++) {
int bP=FMSFind(points,lines[i].begin);
int eP=FMSFind(points,lines[i].end);
if(points[bP].machine&&points[eP].machine){
finalRes+=lines[i].weight;
}
else{
points[bP].pre=eP;
points[eP].machine=points[bP].machine||points[eP].machine;
points[bP].machine=points[eP].machine;
}
}
cout<<finalRes<<endl;
delete[] lines;
delete[] points;
}
void outputLines(Line* lines,Point* points,int N){
printf("\nLines:\n");
for(int i=0;i<N;i++){
printf("%d\t%d\t%d\n",lines[i].begin,lines[i].end,lines[i].weight);
}
printf("\nPoints:\n");
for(int i=0;i<=N;i++){
printf("%d\t%d\t%d\n",i,points[i].machine,points[i].pre);
}
}

what is the best algorithm to traverse a graph with negative nodes and looping nodes

I have a really difficult problem to solve and Im just wondering what what algorithm can be used to find the quickest route. The undirected graph consist of positive and negative adjustments, these adjustments effect a bot or thing which navigate the maze. The problem I have is mazes which contain loops that can be + or -. An example might help:-
node A gives 10 points to the object
node B takes 15 from the object
node C gives 20 points to the object
route=""
the starting node is A, and the ending node is C
given the graph structure as:-
a(+10)-----b(-15)-----c+20
node() means the node loops to itself - and + are the adjustments
nodes with no loops are c+20, so node c has a positive adjustment of 20 but has no loops
if the bot or object has 10 points in its resource then the best path would be :-
a > b > c the object would have 25 points when it arrives at c
route="a,b,c"
this is quite easy to implement, the next challenge is knowing how to backtrack to a good node, lets assume that at each node you can find out any of its neighbour's nodes and their adjustment level. here is the next example:-
if the bot started with only 5 points then the best path would be
a > a > b > c the bot would have 25 points when arriving at c
route="a,a,b,c"
this was a very simple graph, but when you have lots of more nodes it becomes very difficult for the bot to know whether to loop at a good node or go from one good node to another, while keeping track of a possible route.
such a route would be a backtrack queue.
A harder example would result in lots of going back and forth
bot has 10 points
a(+10)-----b(-5)-----c-30
a > b > a > b > a > b > a > b > a > b > c having 5 pts left.
another way the bot could do it is:-
a > a > a > b > c
this is a more efficient way, but how the heck you can program this is partly my question.
does anyone know of a good algorithm to solve this, ive already looked into Bellman-fords and Dijkstra but these only give a simple path not a looping one.
could it be recursive in some way or some form of heuristics?
referring to your analogy:-
I think I get what you mean, a bit of pseudo would be clearer, so far route()
q.add(v)
best=v
hash visited(v,true)
while(q is not empty)
q.remove(v)
for each u of v in G
if u not visited before
visited(u,true)
best=u=>v.dist
else
best=v=>u.dist
This is a straightforward dynamic programming problem.
Suppose that for a given length of path, for each node, you want to know the best cost ending at that node, and where that route came from. (The data for that length can be stored in a hash, the route in a linked list.)
Suppose we have this data for n steps. Then for the n+1st we start with a clean slate, and then take each answer for the n'th, move it one node forward, and if we land on a node we don't have data for, or else that we're better than the best found, then we update the data for that node with our improved score, and add the route (just this node linking back to the previous linked list).
Once we have this for the number of steps you want, find the node with the best existing route, and then you have your score and your route as a linked list.
========
Here is actual code implementing the algorithm:
class Graph:
def __init__(self, nodes=[]):
self.nodes = {}
for node in nodes:
self.insert(node)
def insert(self, node):
self.nodes[ node.name ] = node
def connect(self, name1, name2):
node1 = self.nodes[ name1 ]
node2 = self.nodes[ name2 ]
node1.neighbors.add(node2)
node2.neighbors.add(node1)
def node(self, name):
return self.nodes[ name ]
class GraphNode:
def __init__(self, name, score, neighbors=[]):
self.name = name
self.score = score
self.neighbors = set(neighbors)
def __repr__(self):
return self.name
def find_path (start_node, start_score, end_node):
prev_solution = {start_node: [start_score + start_node.score, None]}
room_to_grow = True
while end_node not in prev_solution:
if not room_to_grow:
# No point looping endlessly...
return None
room_to_grow = False
solution = {}
for node, info in prev_solution.iteritems():
score, prev_path = info
for neighbor in node.neighbors:
new_score = score + neighbor.score
if neighbor not in prev_solution:
room_to_grow = True
if 0 < new_score and (neighbor not in solution or solution[neighbor][0] < new_score):
solution[neighbor] = [new_score, [node, prev_path]]
prev_solution = solution
path = prev_solution[end_node][1]
answer = [end_node]
while path is not None:
answer.append(path[0])
path = path[1]
answer.reverse()
return answer
And here is a sample of how to use it:
graph = Graph([GraphNode('A', 10), GraphNode('B', -5), GraphNode('C', -30)])
graph.connect('A', 'A')
graph.connect('A', 'B')
graph.connect('B', 'B')
graph.connect('B', 'B')
graph.connect('B', 'C')
graph.connect('C', 'C')
print find_path(graph.node('A'), 10, graph.node('C'))
Note that I explicitly connected each node to itself. Depending on your problem you might want to make that automatic.
(Note, there is one possible infinite loop left. Suppose that the starting node has a score of 0 and there is no way off of it. In that case we'll loop forever. It would take work to add a check for this case.)
I'm a little confused by your description, it seems like you are just looking for shortest path algorithms. In which case google is your friend.
In the example you've given you have -ve adjustments which should really be +ve costs in the usual parlance of graph traversal. I.e. you want to find a path with the lowest cost so you want more +ve adjustments.
If your graph has loops that are beneficial to traverse (i.e. decrease cost or increase points through adjustments) then the best path is undefined because going through the loop one more time will improve your score.
Here's some psuedocode
steps = []
steps[0] = [None*graph.#nodes]
step = 1
while True:
steps[step] = [None*graph.#nodes]
for node in graph:
for node2 in graph:
steps[step][node2.index] = max(steps[step-1][node.index]+node2.cost, steps[step][node2.index])
if steps[step][lastnode] >= 0:
break;

Algorithm to establish ordering amongst a set of items

I have a set of students (referred to as items in the title for generality). Amongst these students, some have a reputation for being rambunctious. We are told about a set of hate relationships of the form 'i hates j'. 'i hates j' does not imply 'j hates i'. We are supposed to arrange the students in rows (front most row numbered 1) in a way such that if 'i hates j' then i should be put in a row that is strictly lesser numbered than that of j (in other words: in some row that is in front of j's row) so that i doesn't throw anything at j (Turning back is not allowed). What would be an efficient algorithm to find the minimum number of rows needed (each row need not have the same number of students)?
We will make the following assumptions:
1) If we model this as a directed graph, there are no cycles in the graph. The most basic cycle would be: if 'i hates j' is true, 'j hates i' is false. Because otherwise, I think the ordering would become impossible.
2) Every student in the group is at least hated by one other student OR at least hates one other student. Of course, there would be students who are both hated by some and who in turn hate other students. This means that there are no stray students who don't form part of the graph.
Update: I have already thought of constructing a directed graph with i --> j if 'i hates j and doing topological sorting. However, since the general topological sort would suit better if I had to line all the students in a single line. Since there is a variation of the rows here, I am trying to figure out how to factor in the change into topological sort so it gives me what I want.
When you answer, please state the complexity of your solution. If anybody is giving code and you don't mind the language, then I'd prefer Java but of course any other language is just as fine.
JFYI This is not for any kind of homework (I am not a student btw :)).
It sounds to me that you need to investigate topological sorting.
This problem is basically another way to put the longest path in a directed graph problem. The number of rows is actually number of nodes in path (number of edges + 1).
Assuming the graph is acyclic, the solution is topological sort.
Acyclic is a bit stronger the your assumption 1. Not only A -> B and B -> A is invalid. Also A -> B, B -> C, C -> A and any cycle of any length.
HINT: the question is how many rows are needed, not which student in which row. The answer to the question is the length of the longest path.
It's from a project management theory (or scheduling theory, I don't know the exact term). There the task is about sorting jobs (vertex is a job, arc is a job order relationship).
Obviously we have some connected oriented graph without loops. There is an arc from vertex a to vertex b if and only if a hates b. Let's assume there is a source (without incoming arcs) and destination (without outgoing arcs) vertex. If that is not the case, just add imaginary ones. Now we want to find length of a longest path from source to destination (it will be number of rows - 1, but mind the imaginary verteces).
We will define vertex rank (r[v]) as number of arcs in a longest path between source and this vertex v. Obviously we want to know r[destination]. Algorithm for finding rank:
0) r_0[v] := 0 for all verteces v
repeat
t) r_t[end(j)] := max( r_{t-1}[end(j)], r_{t-1}[start(j)] + 1 ) for all arcs j
until for all arcs j r_{t+1}[end(j)] = r_t[end(j)] // i.e. no changes on this iteration
On each step at least one vertex increases its rank. Therefore in this form complexity is O(n^3).
By the way, this algorithm also gives you student distribution among rows. Just group students by their respective ranks.
Edit: Another code with the same idea. Possibly it is better understandable.
# Python
# V is a list of vertex indices, let it be something like V = range(N)
# source has index 0, destination has index N-1
# E is a list of edges, i.e. tuples of the form (start vertex, end vertex)
R = [0] * len(V)
do:
changes = False
for e in E:
if R[e[1]] < R[e[0]] + 1:
changes = True
R[e[1]] = R[e[0]] + 1
while changes
# The answer is derived from value of R[N-1]
Of course this is the simplest implementation. It can be optimized, and time estimate can be better.
Edit2: obvious optimization - update only verteces adjacent to those that were updated on the previous step. I.e. introduce a queue with verteces whose rank was updated. Also for edge storing one should use adjacency lists. With such optimization complexity would be O(N^2). Indeed, each vertex may appear in the queue at most rank times. But vertex rank never exceeds N - number of verteces. Therefore total number of algorithm steps will not exceed O(N^2).
Essentailly the important thing in assumption #1 is that there must not be any cycles in this graph. If there are any cycles you can't solve this problem.
I would start by seating all of the students that do not hate any other students in the back row. Then you can seat the students who hate these students in the next row and etc.
The number of rows is the length of the longest path in the directed graph, plus one. As a limit case, if there is no hate relationship everyone can fit on the same row.
To allocate the rows, put everyone who is not hated by anyone else on the row one. These are the "roots" of your graph. Everyone else is put on row N + 1 if N is the length of the longest path from any of the roots to that person (this path is of length one at least).
A simple O(N^3) algorithm is the following:
S = set of students
for s in S: s.row = -1 # initialize row field
rownum = 0 # start from first row below
flag = true # when to finish
while (flag):
rownum = rownum + 1 # proceed to next row
flag = false
for s in S:
if (s.row != -1) continue # already allocated
ok = true
foreach q in S:
# Check if there is student q who will sit
# on this or later row who hates s
if ((q.row == -1 or q.row = rownum)
and s hated by q) ok = false; break
if (ok): # can put s here
s.row = rownum
flag = true
Simple answer = 1 row.
Put all students in the same row.
Actually that might not solve the question as stated - lesser row, rather than equal row...
Put all students in row 1
For each hate relation, put the not-hating student in a row behind the hating student
Iterate till you have no activity, or iterate Num(relation) times.
But I'm sure there are better algorithms - look at acyclic graphs.
Construct a relationship graph where i hates j will have a directed edge from i to j. So end result is a directed graph. It should be a DAG otherwise no solutions as it's not possible to resolve circular hate relations ship.
Now simply do a DFS search and during the post node callbacks, means the once the DFS of all the children are done and before returning from the DFS call to this node, simply check the row number of all the children and assign the row number of this node as row max row of the child + 1. Incase if there is some one who doesn't hate anyone basically node with no adjacency list simply assign him row 0.
Once all the nodes are processed reverse the row numbers. This should be easy as this is just about finding the max and assigning the row numbers as max-already assigned row numbers.
Here is the sample code.
postNodeCb( graph g, int node )
{
if ( /* No adj list */ )
row[ node ] = 0;
else
row[ node ] = max( row number of all children ) + 1;
}
main()
{
.
.
for ( int i = 0; i < NUM_VER; i++ )
if ( !visited[ i ] )
graphTraverseDfs( g, i );`enter code here`
.
.
}

Finding all cycles in a directed graph

How can I find (iterate over) ALL the cycles in a directed graph from/to a given node?
For example, I want something like this:
A->B->A
A->B->C->A
but not:
B->C->B
I found this page in my search and since cycles are not same as strongly connected components, I kept on searching and finally, I found an efficient algorithm which lists all (elementary) cycles of a directed graph. It is from Donald B. Johnson and the paper can be found in the following link:
http://www.cs.tufts.edu/comp/150GA/homeworks/hw1/Johnson%2075.PDF
A java implementation can be found in:
http://normalisiert.de/code/java/elementaryCycles.zip
A Mathematica demonstration of Johnson's algorithm can be found here, implementation can be downloaded from the right ("Download author code").
Note: Actually, there are many algorithms for this problem. Some of them are listed in this article:
http://dx.doi.org/10.1137/0205007
According to the article, Johnson's algorithm is the fastest one.
Depth first search with backtracking should work here.
Keep an array of boolean values to keep track of whether you visited a node before. If you run out of new nodes to go to (without hitting a node you have already been), then just backtrack and try a different branch.
The DFS is easy to implement if you have an adjacency list to represent the graph. For example adj[A] = {B,C} indicates that B and C are the children of A.
For example, pseudo-code below. "start" is the node you start from.
dfs(adj,node,visited):
if (visited[node]):
if (node == start):
"found a path"
return;
visited[node]=YES;
for child in adj[node]:
dfs(adj,child,visited)
visited[node]=NO;
Call the above function with the start node:
visited = {}
dfs(adj,start,visited)
The simplest choice I found to solve this problem was using the python lib called networkx.
It implements the Johnson's algorithm mentioned in the best answer of this question but it makes quite simple to execute.
In short you need the following:
import networkx as nx
import matplotlib.pyplot as plt
# Create Directed Graph
G=nx.DiGraph()
# Add a list of nodes:
G.add_nodes_from(["a","b","c","d","e"])
# Add a list of edges:
G.add_edges_from([("a","b"),("b","c"), ("c","a"), ("b","d"), ("d","e"), ("e","a")])
#Return a list of cycles described as a list o nodes
list(nx.simple_cycles(G))
Answer: [['a', 'b', 'd', 'e'], ['a', 'b', 'c']]
First of all - you do not really want to try find literally all cycles because if there is 1 then there is an infinite number of those. For example A-B-A, A-B-A-B-A etc. Or it may be possible to join together 2 cycles into an 8-like cycle etc., etc... The meaningful approach is to look for all so called simple cycles - those that do not cross themselves except in the start/end point. Then if you wish you can generate combinations of simple cycles.
One of the baseline algorithms for finding all simple cycles in a directed graph is this: Do a depth-first traversal of all simple paths (those that do not cross themselves) in the graph. Every time when the current node has a successor on the stack a simple cycle is discovered. It consists of the elements on the stack starting with the identified successor and ending with the top of the stack. Depth first traversal of all simple paths is similar to depth first search but you do not mark/record visited nodes other than those currently on the stack as stop points.
The brute force algorithm above is terribly inefficient and in addition to that generates multiple copies of the cycles. It is however the starting point of multiple practical algorithms which apply various enhancements in order to improve performance and avoid cycle duplication. I was surprised to find out some time ago that these algorithms are not readily available in textbooks and on the web. So I did some research and implemented 4 such algorithms and 1 algorithm for cycles in undirected graphs in an open source Java library here : http://code.google.com/p/niographs/ .
BTW, since I mentioned undirected graphs : The algorithm for those is different. Build a spanning tree and then every edge which is not part of the tree forms a simple cycle together with some edges in the tree. The cycles found this way form a so called cycle base. All simple cycles can then be found by combining 2 or more distinct base cycles. For more details see e.g. this : http://dspace.mit.edu/bitstream/handle/1721.1/68106/FTL_R_1982_07.pdf .
The DFS-based variants with back edges will find cycles indeed, but in many cases it will NOT be minimal cycles. In general DFS gives you the flag that there is a cycle but it is not good enough to actually find cycles. For example, imagine 5 different cycles sharing two edges. There is no simple way to identify cycles using just DFS (including backtracking variants).
Johnson's algorithm is indeed gives all unique simple cycles and has good time and space complexity.
But if you want to just find MINIMAL cycles (meaning that there may be more then one cycle going through any vertex and we are interested in finding minimal ones) AND your graph is not very large, you can try to use the simple method below.
It is VERY simple but rather slow compared to Johnson's.
So, one of the absolutely easiest way to find MINIMAL cycles is to use Floyd's algorithm to find minimal paths between all the vertices using adjacency matrix.
This algorithm is nowhere near as optimal as Johnson's, but it is so simple and its inner loop is so tight that for smaller graphs (<=50-100 nodes) it absolutely makes sense to use it.
Time complexity is O(n^3), space complexity O(n^2) if you use parent tracking and O(1) if you don't.
First of all let's find the answer to the question if there is a cycle.
The algorithm is dead-simple. Below is snippet in Scala.
val NO_EDGE = Integer.MAX_VALUE / 2
def shortestPath(weights: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
weights(i)(j) = throughK
}
}
}
Originally this algorithm operates on weighted-edge graph to find all shortest paths between all pairs of nodes (hence the weights argument). For it to work correctly you need to provide 1 if there is a directed edge between the nodes or NO_EDGE otherwise.
After algorithm executes, you can check the main diagonal, if there are values less then NO_EDGE than this node participates in a cycle of length equal to the value. Every other node of the same cycle will have the same value (on the main diagonal).
To reconstruct the cycle itself we need to use slightly modified version of algorithm with parent tracking.
def shortestPath(weights: Array[Array[Int]], parents: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
parents(i)(j) = k
weights(i)(j) = throughK
}
}
}
Parents matrix initially should contain source vertex index in an edge cell if there is an edge between the vertices and -1 otherwise.
After function returns, for each edge you will have reference to the parent node in the shortest path tree.
And then it's easy to recover actual cycles.
All in all we have the following program to find all minimal cycles
val NO_EDGE = Integer.MAX_VALUE / 2;
def shortestPathWithParentTracking(
weights: Array[Array[Int]],
parents: Array[Array[Int]]) = {
for (k <- weights.indices;
i <- weights.indices;
j <- weights.indices) {
val throughK = weights(i)(k) + weights(k)(j)
if (throughK < weights(i)(j)) {
parents(i)(j) = parents(i)(k)
weights(i)(j) = throughK
}
}
}
def recoverCycles(
cycleNodes: Seq[Int],
parents: Array[Array[Int]]): Set[Seq[Int]] = {
val res = new mutable.HashSet[Seq[Int]]()
for (node <- cycleNodes) {
var cycle = new mutable.ArrayBuffer[Int]()
cycle += node
var other = parents(node)(node)
do {
cycle += other
other = parents(other)(node)
} while(other != node)
res += cycle.sorted
}
res.toSet
}
and a small main method just to test the result
def main(args: Array[String]): Unit = {
val n = 3
val weights = Array(Array(NO_EDGE, 1, NO_EDGE), Array(NO_EDGE, NO_EDGE, 1), Array(1, NO_EDGE, NO_EDGE))
val parents = Array(Array(-1, 1, -1), Array(-1, -1, 2), Array(0, -1, -1))
shortestPathWithParentTracking(weights, parents)
val cycleNodes = parents.indices.filter(i => parents(i)(i) < NO_EDGE)
val cycles: Set[Seq[Int]] = recoverCycles(cycleNodes, parents)
println("The following minimal cycle found:")
cycles.foreach(c => println(c.mkString))
println(s"Total: ${cycles.size} cycle found")
}
and the output is
The following minimal cycle found:
012
Total: 1 cycle found
To clarify:
Strongly Connected Components will find all subgraphs that have at least one cycle in them, not all possible cycles in the graph. e.g. if you take all strongly connected components and collapse/group/merge each one of them into one node (i.e. a node per component), you'll get a tree with no cycles (a DAG actually). Each component (which is basically a subgraph with at least one cycle in it) can contain many more possible cycles internally, so SCC will NOT find all possible cycles, it will find all possible groups that have at least one cycle, and if you group them, then the graph will not have cycles.
to find all simple cycles in a graph, as others mentioned, Johnson's algorithm is a candidate.
I was given this as an interview question once, I suspect this has happened to you and you are coming here for help. Break the problem into three questions and it becomes easier.
how do you determine the next valid
route
how do you determine if a point has
been used
how do you avoid crossing over the
same point again
Problem 1)
Use the iterator pattern to provide a way of iterating route results. A good place to put the logic to get the next route is probably the "moveNext" of your iterator. To find a valid route, it depends on your data structure. For me it was a sql table full of valid route possibilities so I had to build a query to get the valid destinations given a source.
Problem 2)
Push each node as you find them into a collection as you get them, this means that you can see if you are "doubling back" over a point very easily by interrogating the collection you are building on the fly.
Problem 3)
If at any point you see you are doubling back, you can pop things off the collection and "back up". Then from that point try to "move forward" again.
Hack: if you are using Sql Server 2008 there is are some new "hierarchy" things you can use to quickly solve this if you structure your data in a tree.
In the case of undirected graph, a paper recently published (Optimal listing of cycles and st-paths in undirected graphs) offers an asymptotically optimal solution. You can read it here http://arxiv.org/abs/1205.2766 or here http://dl.acm.org/citation.cfm?id=2627951
I know it doesn't answer your question, but since the title of your question doesn't mention direction, it might still be useful for Google search
Start at node X and check for all child nodes (parent and child nodes are equivalent if undirected). Mark those child nodes as being children of X. From any such child node A, mark it's children of being children of A, X', where X' is marked as being 2 steps away.). If you later hit X and mark it as being a child of X'', that means X is in a 3 node cycle. Backtracking to it's parent is easy (as-is, the algorithm has no support for this so you'd find whichever parent has X').
Note: If graph is undirected or has any bidirectional edges, this algorithm gets more complicated, assuming you don't want to traverse the same edge twice for a cycle.
If what you want is to find all elementary circuits in a graph you can use the EC algorithm, by JAMES C. TIERNAN, found on a paper since 1970.
The very original EC algorithm as I managed to implement it in php (hope there are no mistakes is shown below). It can find loops too if there are any. The circuits in this implementation (that tries to clone the original) are the non zero elements. Zero here stands for non-existence (null as we know it).
Apart from that below follows an other implementation that gives the algorithm more independece, this means the nodes can start from anywhere even from negative numbers, e.g -4,-3,-2,.. etc.
In both cases it is required that the nodes are sequential.
You might need to study the original paper, James C. Tiernan Elementary Circuit Algorithm
<?php
echo "<pre><br><br>";
$G = array(
1=>array(1,2,3),
2=>array(1,2,3),
3=>array(1,2,3)
);
define('N',key(array_slice($G, -1, 1, true)));
$P = array(1=>0,2=>0,3=>0,4=>0,5=>0);
$H = array(1=>$P, 2=>$P, 3=>$P, 4=>$P, 5=>$P );
$k = 1;
$P[$k] = key($G);
$Circ = array();
#[Path Extension]
EC2_Path_Extension:
foreach($G[$P[$k]] as $j => $child ){
if( $child>$P[1] and in_array($child, $P)===false and in_array($child, $H[$P[$k]])===false ){
$k++;
$P[$k] = $child;
goto EC2_Path_Extension;
} }
#[EC3 Circuit Confirmation]
if( in_array($P[1], $G[$P[$k]])===true ){//if PATH[1] is not child of PATH[current] then don't have a cycle
$Circ[] = $P;
}
#[EC4 Vertex Closure]
if($k===1){
goto EC5_Advance_Initial_Vertex;
}
//afou den ksana theoreitai einai asfales na svisoume
for( $m=1; $m<=N; $m++){//H[P[k], m] <- O, m = 1, 2, . . . , N
if( $H[$P[$k-1]][$m]===0 ){
$H[$P[$k-1]][$m]=$P[$k];
break(1);
}
}
for( $m=1; $m<=N; $m++ ){//H[P[k], m] <- O, m = 1, 2, . . . , N
$H[$P[$k]][$m]=0;
}
$P[$k]=0;
$k--;
goto EC2_Path_Extension;
#[EC5 Advance Initial Vertex]
EC5_Advance_Initial_Vertex:
if($P[1] === N){
goto EC6_Terminate;
}
$P[1]++;
$k=1;
$H=array(
1=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
2=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
3=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
4=>array(1=>0,2=>0,3=>0,4=>0,5=>0),
5=>array(1=>0,2=>0,3=>0,4=>0,5=>0)
);
goto EC2_Path_Extension;
#[EC5 Advance Initial Vertex]
EC6_Terminate:
print_r($Circ);
?>
then this is the other implementation, more independent of the graph, without goto and without array values, instead it uses array keys, the path, the graph and circuits are stored as array keys (use array values if you like, just change the required lines). The example graph start from -4 to show its independence.
<?php
$G = array(
-4=>array(-4=>true,-3=>true,-2=>true),
-3=>array(-4=>true,-3=>true,-2=>true),
-2=>array(-4=>true,-3=>true,-2=>true)
);
$C = array();
EC($G,$C);
echo "<pre>";
print_r($C);
function EC($G, &$C){
$CNST_not_closed = false; // this flag indicates no closure
$CNST_closed = true; // this flag indicates closure
// define the state where there is no closures for some node
$tmp_first_node = key($G); // first node = first key
$tmp_last_node = $tmp_first_node-1+count($G); // last node = last key
$CNST_closure_reset = array();
for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
$CNST_closure_reset[$k] = $CNST_not_closed;
}
// define the state where there is no closure for all nodes
for($k=$tmp_first_node; $k<=$tmp_last_node; $k++){
$H[$k] = $CNST_closure_reset; // Key in the closure arrays represent nodes
}
unset($tmp_first_node);
unset($tmp_last_node);
# Start algorithm
foreach($G as $init_node => $children){#[Jump to initial node set]
#[Initial Node Set]
$P = array(); // declare at starup, remove the old $init_node from path on loop
$P[$init_node]=true; // the first key in P is always the new initial node
$k=$init_node; // update the current node
// On loop H[old_init_node] is not cleared cause is never checked again
do{#Path 1,3,7,4 jump here to extend father 7
do{#Path from 1,3,8,5 became 2,4,8,5,6 jump here to extend child 6
$new_expansion = false;
foreach( $G[$k] as $child => $foo ){#Consider each child of 7 or 6
if( $child>$init_node and isset($P[$child])===false and $H[$k][$child]===$CNST_not_closed ){
$P[$child]=true; // add this child to the path
$k = $child; // update the current node
$new_expansion=true;// set the flag for expanding the child of k
break(1); // we are done, one child at a time
} } }while(($new_expansion===true));// Do while a new child has been added to the path
# If the first node is child of the last we have a circuit
if( isset($G[$k][$init_node])===true ){
$C[] = $P; // Leaving this out of closure will catch loops to
}
# Closure
if($k>$init_node){ //if k>init_node then alwaya count(P)>1, so proceed to closure
$new_expansion=true; // $new_expansion is never true, set true to expand father of k
unset($P[$k]); // remove k from path
end($P); $k_father = key($P); // get father of k
$H[$k_father][$k]=$CNST_closed; // mark k as closed
$H[$k] = $CNST_closure_reset; // reset k closure
$k = $k_father; // update k
} } while($new_expansion===true);//if we don't wnter the if block m has the old k$k_father_old = $k;
// Advance Initial Vertex Context
}//foreach initial
}//function
?>
I have analized and documented the EC but unfortunately the documentation is in Greek.
There are two steps (algorithms) involved in finding all cycles in a DAG.
The first step is to use Tarjan's algorithm to find the set of strongly connected components.
Start from any arbitrary vertex.
DFS from that vertex. For each node x, keep two numbers, dfs_index[x] and dfs_lowval[x].
dfs_index[x] stores when that node is visited, while dfs_lowval[x] = min(dfs_low[k]) where
k is all the children of x that is not the directly parent of x in the dfs-spanning tree.
All nodes with the same dfs_lowval[x] are in the same strongly connected component.
The second step is to find cycles (paths) within the connected components. My suggestion is to use a modified version of Hierholzer's algorithm.
The idea is:
Choose any starting vertex v, and follow a trail of edges from that vertex until you return to v.
It is not possible to get stuck at any vertex other than v, because the even degree of all vertices ensures that, when the trail enters another vertex w there must be an unused edge leaving w. The tour formed in this way is a closed tour, but may not cover all the vertices and edges of the initial graph.
As long as there exists a vertex v that belongs to the current tour but that has adjacent edges not part of the tour, start another trail from v, following unused edges until you return to v, and join the tour formed in this way to the previous tour.
Here is the link to a Java implementation with a test case:
http://stones333.blogspot.com/2013/12/find-cycles-in-directed-graph-dag.html
I stumbled over the following algorithm which seems to be more efficient than Johnson's algorithm (at least for larger graphs). I am however not sure about its performance compared to Tarjan's algorithm.
Additionally, I only checked it out for triangles so far. If interested, please see "Arboricity and Subgraph Listing Algorithms" by Norishige Chiba and Takao Nishizeki (http://dx.doi.org/10.1137/0214017)
DFS from the start node s, keep track of the DFS path during traversal, and record the path if you find an edge from node v in the path to s. (v,s) is a back-edge in the DFS tree and thus indicates a cycle containing s.
Regarding your question about the Permutation Cycle, read more here:
https://www.codechef.com/problems/PCYCLE
You can try this code (enter the size and the digits number):
# include<cstdio>
using namespace std;
int main()
{
int n;
scanf("%d",&n);
int num[1000];
int visited[1000]={0};
int vindex[2000];
for(int i=1;i<=n;i++)
scanf("%d",&num[i]);
int t_visited=0;
int cycles=0;
int start=0, index;
while(t_visited < n)
{
for(int i=1;i<=n;i++)
{
if(visited[i]==0)
{
vindex[start]=i;
visited[i]=1;
t_visited++;
index=start;
break;
}
}
while(true)
{
index++;
vindex[index]=num[vindex[index-1]];
if(vindex[index]==vindex[start])
break;
visited[vindex[index]]=1;
t_visited++;
}
vindex[++index]=0;
start=index+1;
cycles++;
}
printf("%d\n",cycles,vindex[0]);
for(int i=0;i<(n+2*cycles);i++)
{
if(vindex[i]==0)
printf("\n");
else
printf("%d ",vindex[i]);
}
}
DFS c++ version for the pseudo-code in second floor's answer:
void findCircleUnit(int start, int v, bool* visited, vector<int>& path) {
if(visited[v]) {
if(v == start) {
for(auto c : path)
cout << c << " ";
cout << endl;
return;
}
else
return;
}
visited[v] = true;
path.push_back(v);
for(auto i : G[v])
findCircleUnit(start, i, visited, path);
visited[v] = false;
path.pop_back();
}
http://www.me.utexas.edu/~bard/IP/Handouts/cycles.pdf
The CXXGraph library give a set of algorithms and functions to detect cycles.
For a full algorithm explanation visit the wiki.

Resources