Find all paths between two graph nodes - algorithm

I am working on an implementation of Dijkstra's Algorithm to retrieve the shortest path between interconnected nodes on a network of routes. I have the implementation working. It returns all the shortest paths to all the nodes when I pass the start node into the algorithm.
My question:
How does one go about retrieving all possible paths from Node A to, say, Node G or even all possible paths from Node A and back to Node A?

Finding all possible paths is a hard problem, since there can be an exponential number of simple paths; even just finding the longest simple path is NP-hard.
One possible way to find all paths [or all paths up to a certain length] from s to t is BFS without keeping a visited set, or, for the weighted version, uniform-cost search.
Note also that in any graph which has cycles [i.e. is not a DAG] there may be an infinite number of paths between s and t.
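To make that concrete, here is a small sketch (my illustration, not part of the answer above) of the BFS idea in Java: the queue holds whole partial paths rather than single nodes, there is no global visited set, and a maximum path length keeps the search finite on cyclic graphs. The adjacency-map representation and all names are assumptions for the example.
import java.util.*;

public class AllPathsBfs {
    /** Enumerate every simple path from s to t with at most maxLen edges (BFS over partial paths). */
    static List<List<String>> allPaths(Map<String, List<String>> adj, String s, String t, int maxLen) {
        List<List<String>> result = new ArrayList<>();
        Deque<List<String>> queue = new ArrayDeque<>();
        queue.add(new ArrayList<>(List.of(s)));
        while (!queue.isEmpty()) {
            List<String> path = queue.poll();
            String last = path.get(path.size() - 1);
            if (last.equals(t)) {
                result.add(path);
                continue;
            }
            if (path.size() - 1 >= maxLen) continue;          // the length bound keeps cyclic graphs finite
            for (String next : adj.getOrDefault(last, List.of())) {
                if (path.contains(next)) continue;            // keep paths simple; drop this check to allow repeats
                List<String> extended = new ArrayList<>(path);
                extended.add(next);
                queue.add(extended);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> adj = Map.of(
                "A", List.of("B", "C"),
                "B", List.of("C", "A"),
                "C", List.of("A"));
        System.out.println(allPaths(adj, "A", "C", 5));       // [[A, C], [A, B, C]]
    }
}
Dropping the path.contains(next) check gives the unrestricted version described above, in which case the length bound is what guarantees termination.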

I've implemented a version where it basically finds all possible paths from one node to the other, but it doesn't allow any 'cycles' (the graph I'm using is cyclic), so no node appears twice within the same path. If the graph were acyclic, then I suppose you could say it finds all the possible paths between the two nodes. It seems to be working just fine, and for my graph size of ~150 it runs almost instantly on my machine, though I'm sure the running time is something like exponential, so it'll start to get slow quickly as the graph gets bigger.
Here is some Java code that demonstrates what I'd implemented. I'm sure there must be more efficient or elegant ways to do it as well.
import java.util.*;

Stack<Object> connectionPath = new Stack<>();
List<Stack<Object>> connectionPaths = new ArrayList<>();

// Before calling the method below, push the start node (the 'node' argument) onto connectionPath
void findAllPaths(Object node, Object targetNode) {
    for (Object nextNode : nextNodes(node)) {
        if (nextNode.equals(targetNode)) {
            // Found a path: record a copy of the current path (start node up to, but not including, the target)
            Stack<Object> temp = new Stack<>();
            for (Object node1 : connectionPath)
                temp.add(node1);
            connectionPaths.add(temp);
        } else if (!connectionPath.contains(nextNode)) {
            // Only extend the path with nodes not already on it, so no node appears twice
            connectionPath.push(nextNode);
            findAllPaths(nextNode, targetNode);
            connectionPath.pop();
        }
    }
}
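For completeness, a hypothetical call site (nextNodes(node) stands for whatever returns a node's neighbours in your graph; startNode and targetNode are placeholders):
connectionPath.push(startNode);          // seed the path with the start node, as noted in the comment above
findAllPaths(startNode, targetNode);
for (Stack<Object> path : connectionPaths)
    System.out.println(path);            // each stack is one simple path from startNode up to (but not including) targetNode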

I'm going to give you a (somewhat small, but I hope comprehensible) version of a proof that you cannot do this in a feasible amount of time.
What I'm going to prove is that the time complexity of enumerating all simple paths between two selected, distinct nodes (say, s and t) in an arbitrary graph G is not polynomial. Notice that, as we only care about the number of paths between these nodes, the edge costs are unimportant.
Of course, if the graph has some well-chosen properties, this can be easy. I'm considering the general case, though.
Suppose that we have a polynomial algorithm that lists all simple paths between s and t.
If G is connected, the list is nonempty. If G is not and s and t are in different components, it's really easy to list all paths between them, because there are none! If they are in the same component, we can pretend that the whole graph consists only of that component. So let's assume G is indeed connected.
The number of listed paths must then be polynomial, otherwise the algorithm couldn't return them all in polynomial time. Since it enumerates all of them, the longest simple path from s to t must be in the list, and a simple pass over the list will point out which one it is.
Now check whether that longest path visits every vertex of G. If it does, G has a Hamiltonian path from s to t; if it does not, then no simple s-t path visits every vertex, so G has no Hamiltonian path from s to t. Either way, we have just decided the Hamiltonian Path problem (with fixed endpoints) using a polynomial procedure, and that is a well known NP-hard problem.
We can then conclude that this polynomial algorithm we thought we had is very unlikely to exist, unless P = NP.

The following functions (modified BFS with a recursive path-finding function between two nodes) will do the job for an acyclic graph:
from collections import defaultdict

# modified BFS
def find_all_parents(G, s):
    Q = [s]
    parents = defaultdict(set)
    while len(Q) != 0:
        v = Q[0]
        Q.pop(0)
        for w in G.get(v, []):
            parents[w].add(v)
            Q.append(w)
    return parents

# recursive path-finding function (assumes that there exists a path in G from a to b)
def find_all_paths(parents, a, b):
    return [a] if a == b else [y + b for x in list(parents[b]) for y in find_all_paths(parents, a, x)]
For example, with the following graph (DAG) G given by
G = {'A':['B','C'], 'B':['D'], 'C':['D', 'F'], 'D':['E', 'F'], 'E':['F']}
if we want to find all paths between the nodes 'A' and 'F' (using the above-defined functions as find_all_paths(find_all_parents(G, 'A'), 'A', 'F')), it will return the following paths (in some order, since the parent sets are unordered):
['ACF', 'ACDF', 'ABDF', 'ACDEF', 'ABDEF']

Here is an algorithm that finds and prints all paths from s to t using a modification of DFS. Dynamic programming can also be used to find the count of all possible paths; the pseudo code for the counting version looks like this:
AllPaths(G(V,E), s, t)
    TopologicallySort(G(V,E))   //suppose 's' ends up at index i0 and 't' at index i1
    C[0...n-1]                  //array of integers, C[i] = number of paths from 's' to the vertex at position i
    for i <- 0 to n-1
        if i < i0
            C[i] <- 0           //vertices before 's' in the topological order cannot be reached from 's'
        else if i == i0
            C[i] <- 1           //the empty path from 's' to itself
        else
            C[i] <- 0
            for each j such that (j,i) is an edge   //sum over the predecessors of i, all already computed
                C[i] <- C[i] + C[j]
    return C[i1]
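As a concrete illustration of that pseudo code, here is a small self-contained Java sketch (my own, with an assumed adjacency-map input format) that counts the s-t paths in a DAG; the topological order is produced with Kahn's algorithm, and the DP pushes each vertex's count to its successors, which is equivalent to summing over predecessors.
import java.util.*;

public class DagPathCount {
    /** Count the number of distinct paths from s to t in a DAG given as an adjacency map. */
    static long countPaths(Map<Integer, List<Integer>> adj, int s, int t) {
        // Kahn's algorithm: compute in-degrees, then repeatedly remove sources.
        Map<Integer, Integer> indeg = new HashMap<>();
        for (Integer v : adj.keySet()) indeg.putIfAbsent(v, 0);
        for (List<Integer> outs : adj.values())
            for (Integer w : outs) indeg.merge(w, 1, Integer::sum);
        Deque<Integer> queue = new ArrayDeque<>();
        for (Map.Entry<Integer, Integer> e : indeg.entrySet())
            if (e.getValue() == 0) queue.add(e.getKey());
        List<Integer> topo = new ArrayList<>();
        while (!queue.isEmpty()) {
            int v = queue.poll();
            topo.add(v);
            for (int w : adj.getOrDefault(v, List.of()))
                if (indeg.merge(w, -1, Integer::sum) == 0) queue.add(w);
        }
        // DP in topological order: C[s] = 1, and C[w] += C[v] for every edge v -> w.
        Map<Integer, Long> count = new HashMap<>();
        count.put(s, 1L);
        for (int v : topo) {
            long c = count.getOrDefault(v, 0L);
            if (c == 0) continue;
            for (int w : adj.getOrDefault(v, List.of()))
                count.merge(w, c, Long::sum);
        }
        return count.getOrDefault(t, 0L);
    }

    public static void main(String[] args) {
        // Same DAG as the Python example above, with A..F mapped to 0..5.
        Map<Integer, List<Integer>> adj = Map.of(
                0, List.of(1, 2), 1, List.of(3), 2, List.of(3, 5),
                3, List.of(4, 5), 4, List.of(5), 5, List.of());
        System.out.println(countPaths(adj, 0, 5)); // 5 paths from A to F
    }
}
Run on that DAG it prints 5, matching the five paths listed above.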

If you actually care about ordering your paths from shortest to longest, then it is far better to use a modified A* or Dijkstra's algorithm. With a slight modification the algorithm will return as many of the possible paths as you want, in order of shortest path first. So if what you really want are all possible paths ordered from shortest to longest, then this is the way to go.
If you want an A*-based implementation capable of returning all paths ordered from the shortest to the longest, the following will accomplish that. It has several advantages. First off, it is efficient at sorting from shortest to longest. It also computes each additional path only when needed, so if you stop early because you don't need every single path, you save some processing time. It reuses data from previous paths each time it calculates the next path, which makes it more efficient. Finally, if you find the path you wanted, you can abort early and save some computation time. Overall this should be the most efficient approach if you care about sorting by path length.
import java.util.*;
public class AstarSearch {
private final Map<Integer, Set<Neighbor>> adjacency;
private final int destination;
private final NavigableSet<Step> pending = new TreeSet<>();
public AstarSearch(Map<Integer, Set<Neighbor>> adjacency, int source, int destination) {
this.adjacency = adjacency;
this.destination = destination;
this.pending.add(new Step(source, null, 0));
}
public List<Integer> nextShortestPath() {
Step current = this.pending.pollFirst();
while( current != null) {
if( current.getId() == this.destination )
return current.generatePath();
for (Neighbor neighbor : this.adjacency.get(current.id)) {
if(!current.seen(neighbor.getId())) {
final Step nextStep = new Step(neighbor.getId(), current, current.cost + neighbor.cost + predictCost(neighbor.id, this.destination));
this.pending.add(nextStep);
}
}
current = this.pending.pollFirst();
}
return null;
}
protected int predictCost(int source, int destination) {
return 0; //Behaves identical to Dijkstra's algorithm, override to make it A*
}
private static class Step implements Comparable<Step> {
final int id;
final Step parent;
final int cost;
public Step(int id, Step parent, int cost) {
this.id = id;
this.parent = parent;
this.cost = cost;
}
public int getId() {
return id;
}
public Step getParent() {
return parent;
}
public int getCost() {
return cost;
}
public boolean seen(int node) {
if(this.id == node)
return true;
else if(parent == null)
return false;
else
return this.parent.seen(node);
}
public List<Integer> generatePath() {
final List<Integer> path;
if(this.parent != null)
path = this.parent.generatePath();
else
path = new ArrayList<>();
path.add(this.id);
return path;
}
@Override
public int compareTo(Step step) {
if(step == null)
return 1;
if( this.cost != step.cost)
return Integer.compare(this.cost, step.cost);
if( this.id != step.id )
return Integer.compare(this.id, step.id);
if( this.parent != null )
return this.parent.compareTo(step.parent);
if(step.parent == null)
return 0;
return -1;
}
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Step step = (Step) o;
return id == step.id &&
cost == step.cost &&
Objects.equals(parent, step.parent);
}
@Override
public int hashCode() {
return Objects.hash(id, parent, cost);
}
}
/*******************************************************
* Everything below here just sets up your adjacency *
* It will just be helpful for you to be able to test *
* It isnt part of the actual A* search algorithm *
********************************************************/
private static class Neighbor {
final int id;
final int cost;
public Neighbor(int id, int cost) {
this.id = id;
this.cost = cost;
}
public int getId() {
return id;
}
public int getCost() {
return cost;
}
}
public static void main(String[] args) {
final Map<Integer, Set<Neighbor>> adjacency = createAdjacency();
final AstarSearch search = new AstarSearch(adjacency, 1, 4);
System.out.println("printing all paths from shortest to longest...");
List<Integer> path = search.nextShortestPath();
while(path != null) {
System.out.println(path);
path = search.nextShortestPath();
}
}
private static Map<Integer, Set<Neighbor>> createAdjacency() {
final Map<Integer, Set<Neighbor>> adjacency = new HashMap<>();
//This sets up the adjacencies. In this case all adjacencies have a cost of 1, but they dont need to.
addAdjacency(adjacency, 1,2,1,5,1); //{1 | 2,5}
addAdjacency(adjacency, 2,1,1,3,1,4,1,5,1); //{2 | 1,3,4,5}
addAdjacency(adjacency, 3,2,1,5,1); //{3 | 2,5}
addAdjacency(adjacency, 4,2,1); //{4 | 2}
addAdjacency(adjacency, 5,1,1,2,1,3,1); //{5 | 1,2,3}
return Collections.unmodifiableMap(adjacency);
}
private static void addAdjacency(Map<Integer, Set<Neighbor>> adjacency, int source, Integer... dests) {
if( dests.length % 2 != 0)
throw new IllegalArgumentException("dests must have an equal number of arguments, each pair is the id and cost for that traversal");
final Set<Neighbor> destinations = new HashSet<>();
for(int i = 0; i < dests.length; i+=2)
destinations.add(new Neighbor(dests[i], dests[i+1]));
adjacency.put(source, Collections.unmodifiableSet(destinations));
}
}
The output from the above code is the following:
[1, 2, 4]
[1, 5, 2, 4]
[1, 5, 3, 2, 4]
Notice that each time you call nextShortestPath() it generates the next shortest path for you on demand. It only calculates the extra steps needed and doesn't traverse any old paths twice. Moreover, if you decide you don't need all the paths and end execution early, you've saved yourself considerable computation time. You only compute up to the number of paths you need and no more.
Finally, it should be noted that the A* and Dijkstra algorithms do have some minor limitations, though I don't think they would affect you. Namely, they will not work correctly on a graph that has negative weights.
Here is a link to JDoodle where you can run the code yourself in the browser and see it working. You can also change around the graph to show it works on other graphs as well: http://jdoodle.com/a/ukx

find_paths[s, t, d, k]
This question is now a bit old... but I'll throw my hat into the ring.
I personally find an algorithm of the form find_paths[s, t, d, k] useful, where:
s is the starting node
t is the target node
d is the maximum depth to search
k is the number of paths to find
Using your programming language's form of infinity for d and k will give you all paths§.
§ obviously if you are using a directed graph and you want all undirected paths between s and t you will have to run this both ways:
find_paths[s, t, d, k] <join> find_paths[t, s, d, k]
Helper Function
I personally like recursion, although it can be difficult at times. Anyway, first let's define our helper function:
def find_paths_recursion(graph, current, goal, current_depth, max_depth, num_paths, current_path, paths_found):
    current_path.append(current)
    if current_depth > max_depth:
        current_path.pop()
        return
    if current == goal:
        if len(paths_found) < num_paths:
            paths_found.append(copy(current_path))
        current_path.pop()
        return
    else:
        for successor in graph[current]:
            find_paths_recursion(graph, successor, goal, current_depth + 1, max_depth, num_paths, current_path, paths_found)
        current_path.pop()
Main Function
With that out of the way, the core function is trivial:
def find_paths[s, t, d, k]:
    paths_found = []  # PASSING THIS BY REFERENCE
    # 'graph' is assumed to be available in scope (see the notes below)
    find_paths_recursion(graph, s, t, 0, d, k, [], paths_found)
    return paths_found
First, let's note a few things:
the above pseudo-code is a mash-up of languages - but most strongly resembling python (since I was just coding in it). A strict copy-paste will not work.
[] is an uninitialized list, replace this with the equivalent for your programming language of choice
paths_found is passed by reference. It is clear that the recursion function doesn't return anything. Handle this appropriately.
here graph is assuming some form of hashed structure. There are a plethora of ways to implement a graph. Either way, graph[vertex] gets you a list of adjacent vertices in a directed graph - adjust accordingly.
this assumes you have pre-processed to remove "buckles" (self-loops), cycles and multi-edges
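If it helps, here is a compact runnable Java sketch of the same find_paths idea (my own names and data layout, not the pseudo-code above). As noted, it assumes the graph has no self-loops or cycles unless d is finite, since otherwise the depth bound is the only thing keeping the search finite.
import java.util.*;

public class FindPaths {
    /** Collect up to k paths from current to goal with at most maxDepth edges, by depth-limited DFS. */
    static void findPaths(Map<String, List<String>> graph, String current, String goal,
                          int depth, int maxDepth, int k,
                          Deque<String> path, List<List<String>> found) {
        if (found.size() >= k || depth > maxDepth) return;   // stop once k paths are found or depth is exceeded
        path.addLast(current);
        if (current.equals(goal)) {
            found.add(new ArrayList<>(path));                // record a copy of the current path
        } else {
            for (String next : graph.getOrDefault(current, List.of()))
                findPaths(graph, next, goal, depth + 1, maxDepth, k, path, found);
        }
        path.removeLast();
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
                "A", List.of("B", "C"), "B", List.of("D"),
                "C", List.of("D"), "D", List.of());
        List<List<String>> found = new ArrayList<>();
        findPaths(graph, "A", "D", 0, Integer.MAX_VALUE, Integer.MAX_VALUE, new ArrayDeque<>(), found);
        System.out.println(found); // [[A, B, D], [A, C, D]]
    }
}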

You usually don't want to, because there is an exponential number of them in nontrivial graphs; if you really want to get all (simple) paths, or all (simple) cycles, you just find one (by walking the graph), then backtrack to another.

I think what you want is some form of the Ford–Fulkerson algorithm, which is based on BFS. It's used to calculate the max flow of a network by repeatedly finding augmenting paths between two nodes.
http://en.wikipedia.org/wiki/Ford%E2%80%93Fulkerson_algorithm

There's a nice article which may answer your question (the only difference is that it prints the paths instead of collecting them).
Please note that you can experiment with the C++/Python samples in the online IDE.
http://www.geeksforgeeks.org/find-paths-given-source-destination/

I suppose you want to find 'simple' paths (a path is simple if no node appears in it more than once, except maybe the 1st and the last one).
Since the problem is NP-hard, you might want to do a variant of depth-first search.
Basically, generate all possible paths from A and check whether they end up in G.

Related

Is the bottleneck shortest path a collection of shortest edges or longest edges from s to t?

I just found out about the bottleneck shortest path problem, and I'm really confused about what we need to find there. Do we need to find the maximum-cost edge on the path from s to t consisting of short edges, or the minimum-cost edge on the path consisting of long edges? The latter makes more sense to me, and it's described under the picture on the right on Wikipedia: https://en.wikipedia.org/wiki/Widest_path_problem
But when I looked at this link : http://homes.cs.washington.edu/~anderson/iucee/Slides_421_06/Lecture08_09_10.pdf
the algorithm described there doesn't seem to find the path consisting of long edges, I even implemented it and tested it on this graph:
Here's the algorithm that I took from that link above:
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.Deque;
import java.util.PriorityQueue;
public class BSP {
private double[] d;
private DirectedWeightedEdge[] edgeTo;
private PriorityQueue<Integer> pq;
private boolean[] visited;
private class VertexComparer implements Comparator<Integer> {
public int compare(Integer one, Integer two) {
if (d[one] < d[two]) return -1;
return 1;
}
}
public BSP(WeightedDigraph G, int source) {
d = new double[G.V()];
edgeTo = new DirectedWeightedEdge[G.V()];
visited = new boolean[G.V()];
for (int i = 0; i < G.V(); ++i) d[i] = Double.POSITIVE_INFINITY;
d[source] = Double.NEGATIVE_INFINITY;
pq = new PriorityQueue<>(new VertexComparer());
pq.add(source);
while (!pq.isEmpty()) {
int top = pq.poll();
if (!visited[top])
relax(G, top);
}
}
private void relax(WeightedDigraph G, int v) {
visited[v] = true;
for (DirectedWeightedEdge e : G.adj(v)) {
if (d[e.to()] > Math.max(d[e.from()], e.weight())) {
d[e.to()] = Math.max(d[e.from()], e.weight());
edgeTo[e.to()] = e;
pq.add(e.to());
}
}
}
public boolean hasPathTo(int v) {
return edgeTo[v] != null;
}
public Iterable<Integer> pathTo(int v) {
assert hasPathTo(v);
int w = v;
DirectedWeightedEdge e = edgeTo[w];
Deque<Integer> stack = new ArrayDeque<>();
for (; edgeTo[w] != null; e = edgeTo[e.from()]) {
stack.push(w);
w = e.from();
}
stack.push(w);
return stack;
}
}
Why doesn't that algorithm also find the path consisting of long edges, from 0 to 3, for example : {0,4}, {4,3} and conclude that the answer is 4? I can't seem to understand why it finds the path 0->1->2->3 instead. Or is the problem somehow different for directed graphs?
If the algorithm described on that link is wrong, please let me know what the right algorithm is. I just can't seem to understand why 2 sources give different information.
As with many similar problems, the bottleneck problem is symmetrical. In fact, you can talk of two different problems:
Find a path that has its shortest edge as long as possible
Find a path that has its longest edge as short as possible
The algorithm for both versions is the same, except that you reverse all weight relations (change max-heap to min-heap or vice-versa, change the sign of comparison in relax(), etc.) Even the Wikipedia link you gave states:
A closely related problem, the minimax path problem, asks for the path
that minimizes the maximum weight of any of its edges. <...> Any algorithm
for the widest path problem can be transformed into an algorithm for
the minimax path problem, or vice versa, by reversing the sense of all
the weight comparisons performed by the algorithm, or equivalently by
replacing every edge weight by its negation.
Obviously, both versions can be called the bottleneck problem, and you just came across lecture notes that talk about the second version. As the algorithms are the same, not much confusion will arise; you just need to be explicit about which version you are talking about.
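To make that symmetry concrete, here is a hedged sketch of the only line that really differs between the two versions, written in the same d[]/edge style as the code in the question:
// Minimax ("bottleneck" as in the lecture notes): minimize the largest edge on the path.
// Use a min-heap keyed on d[], initialize d[source] = -infinity and d[v] = +infinity for the rest.
if (d[e.to()] > Math.max(d[e.from()], e.weight())) {
    d[e.to()] = Math.max(d[e.from()], e.weight());
}

// Widest path (as in the Wikipedia article): maximize the smallest edge on the path.
// Use a max-heap keyed on d[], initialize d[source] = +infinity and d[v] = -infinity for the rest.
if (d[e.to()] < Math.min(d[e.from()], e.weight())) {
    d[e.to()] = Math.min(d[e.from()], e.weight());
}
The code in the question implements the first (minimax) version, which is consistent with it returning 0->1->2->3 rather than the widest path through {0,4},{4,3}.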

Algorithm to find lowest common ancestor in directed acyclic graph?

Imagine a directed acyclic graph as follows, where:
"A" is the root (there is always exactly one root)
each node knows its parent(s)
the node names are arbitrary - nothing can be inferred from them
we know from another source that the nodes were added to the tree in the order A to G (e.g. they are commits in a version control system)
What algorithm could I use to determine the lowest common ancestor (LCA) of two arbitrary nodes, for example, the common ancestor of:
B and E is B
D and F is B
Note:
There is not necessarily a single path to a given node from the root (e.g. "G" has two paths), so you can't simply traverse paths from root to the two nodes and look for the last equal element
I've found LCA algorithms for trees, especially binary trees, but they do not apply here because a node can have multiple parents (i.e. this is not a tree)
Den Roman's link (Archived version) seems promising, but it seemed a little bit complicated to me, so I tried another approach. Here is a simple algorithm I used:
Let say you want to compute LCA(x,y) with x and y two nodes.
Each node must have a value color and count, resp. initialized to white and 0.
Color all ancestors of x as blue (can be done using BFS)
Color all blue ancestors of y as red (BFS again)
For each red node in the graph, increment its parents' count by one
Each red node having a count value set to 0 is a solution.
There can be more than one solution, depending on your graph. For instance, consider this graph:
LCA(4,5) possible solutions are 1 and 2.
Note that it still works if you want to find the LCA of 3 nodes or more: you just need to add a different color for each of them.
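Here is a rough Java sketch of that coloring procedure (my own illustration, not the answerer's code). It assumes each node exposes its parents via a child-to-parents map, and it treats a node as an ancestor of itself, so that LCA(B, E) = B as in the question; the example graph is made up.
import java.util.*;

public class DagLca {
    /** All lowest common ancestors of x and y in a DAG given by a child -> parents map. */
    static Set<String> lca(Map<String, List<String>> parents, String x, String y) {
        Set<String> blue = ancestors(parents, x);                 // color all ancestors of x blue
        Set<String> red = new HashSet<>(ancestors(parents, y));   // ...then keep the blue ancestors of y as red
        red.retainAll(blue);                                      // red = common ancestors
        Map<String, Integer> count = new HashMap<>();
        for (String r : red)                                      // a red node with a red node below it cannot be lowest,
            for (String p : parents.getOrDefault(r, List.of()))   // so bump the count of every red parent of a red node
                if (red.contains(p)) count.merge(p, 1, Integer::sum);
        Set<String> result = new HashSet<>();
        for (String r : red)
            if (count.getOrDefault(r, 0) == 0) result.add(r);     // count 0 = no red node below it: a solution
        return result;
    }

    /** BFS "upwards" over parent links; includes the start node itself. */
    static Set<String> ancestors(Map<String, List<String>> parents, String start) {
        Set<String> seen = new HashSet<>(Set.of(start));
        Deque<String> queue = new ArrayDeque<>(List.of(start));
        while (!queue.isEmpty())
            for (String p : parents.getOrDefault(queue.poll(), List.of()))
                if (seen.add(p)) queue.add(p);
        return seen;
    }

    public static void main(String[] args) {
        // Small made-up DAG: A is the root, G has two parents (E and F).
        Map<String, List<String>> parents = Map.of(
                "A", List.of(), "B", List.of("A"), "C", List.of("A"),
                "D", List.of("B"), "E", List.of("B"), "F", List.of("C"),
                "G", List.of("E", "F"));
        System.out.println(lca(parents, "D", "E")); // [B]
        System.out.println(lca(parents, "D", "F")); // [A]
    }
}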
I was looking for a solution to the same problem and I found a solution in the following paper:
http://dx.doi.org/10.1016/j.ipl.2010.02.014
In short, you are not looking for the lowest common ancestor, but for the lowest SINGLE common ancestor, which they define in this paper.
I know it's an old question with a pretty good discussion already, but since I had a similar problem to solve, I came across JGraphT's Lowest Common Ancestor algorithms and thought this might be of help:
NativeLcaFinder
TarjanLowestCommonAncestor
Just some wild thinking. What about using both input nodes as roots and doing two BFS traversals simultaneously, step by step? At the step where their BLACK sets (recording visited nodes) overlap, the algorithm stops, and the overlapping nodes are their LCA(s). In this way, any other common ancestors would have longer distances than what we have already discovered.
Assume that you want to find the ancestors of x and y in a graph.
Maintain an array of vectors, parents (storing the parents of each node).
First do a BFS (keep storing the parents of each vertex) and find all the ancestors of x (find the parents of x and, using parents, find all the ancestors of x) and store them in a vector. Also store the depth of each ancestor in the vector.
Find the ancestors of y using the same method and store them in another vector. Now you have two vectors storing the ancestors of x and y respectively, along with their depths.
The LCA would be the common ancestor with the greatest depth. Depth is defined as the longest distance from the root (the vertex with in-degree 0). Now we can sort the vectors in decreasing order of depth and find the LCA. Using this method we can even find multiple LCAs (if there are any).
This link (Archived version) describes how it is done in Mercurial - the basic idea is to find all parents for the specified nodes, group them per distance from the root, then do a search on those groups.
If the graph has cycles then 'ancestor' is loosely defined. Perhaps you mean the ancestor on the tree output of a DFS or BFS? Or perhaps by 'ancestor' you mean the node in the digraph that minimizes the number of hops from E and B?
If you're not worried about complexity, then you could compute an A* (or Dijkstra's shortest path) from every node to both E and B. For the nodes that can reach both E and B, you can find the node that minimizes PathLengthToE + PathLengthToB.
EDIT:
Now that you've clarified a few things, I think I understand what you're looking for.
If you can only go "up" the tree, then I suggest you perform a BFS from E and also a BFS from B. Every node in your graph will have two variables associated with it: hops from B and hops from E. Let both B and E have copies of the list of graph nodes. B's list is sorted by hops from B while E's list is sorted by hops from E.
For each element in B's list, attempt to find it in E's list. Place matches in a third list, sorted by hops from B + hops from E. After you've exhausted B's list, your third sorted list should contain the LCA at its head. This allows for one solution, multiple solutions(arbitrarily chosen among by their BFS ordering for B), or no solution.
I also need exactly the same thing: to find the LCA in a DAG (directed acyclic graph). The LCA problem is related to RMQ (the Range Minimum Query problem).
It is possible to reduce LCA to RMQ and find the desired LCA of two arbitrary nodes in a directed acyclic graph.
I found THIS TUTORIAL detailed and good. I am also planning to implement this.
I am proposing an O(|V| + |E|) time complexity solution, and I think this approach is correct; otherwise please correct me.
Given a directed acyclic graph, we need to find the LCA of two vertices v and w.
Step 1: Find the shortest distance of all vertices from the root vertex using BFS (http://en.wikipedia.org/wiki/Breadth-first_search), with time complexity O(|V| + |E|), and also find the parent of each vertex.
Step 2: Find the common ancestors of both vertices by following parents until we reach the root vertex. Time complexity: 2|V|.
Step 3: The LCA will be the common ancestor which has the maximum shortest distance.
So, this is an O(|V| + |E|) time complexity algorithm.
Please correct me if I am wrong; any other suggestions are welcome.
package FB;
import java.util.*;
public class commomAnsectorForGraph {
public static void main(String[] args){
commomAnsectorForGraph com = new commomAnsectorForGraph();
graphNode g = new graphNode('g');
graphNode d = new graphNode('d');
graphNode f = new graphNode('f');
graphNode c = new graphNode('c');
graphNode e = new graphNode('e');
graphNode a = new graphNode('a');
graphNode b = new graphNode('b');
List<graphNode> gc = new ArrayList<>();
gc.add(d);
gc.add(f);
g.children = gc;
List<graphNode> dc = new ArrayList<>();
dc.add(c);
d.children = dc;
List<graphNode> cc = new ArrayList<>();
cc.add(b);
c.children = cc;
List<graphNode> bc = new ArrayList<>();
bc.add(a);
b.children = bc;
List<graphNode> fc = new ArrayList<>();
fc.add(e);
f.children = fc;
List<graphNode> ec = new ArrayList<>();
ec.add(b);
e.children = ec;
List<graphNode> ac = new ArrayList<>();
a.children = ac;
graphNode gn = com.findAncestor(g, c, d);
System.out.println(gn.value);
}
public graphNode findAncestor(graphNode root, graphNode a, graphNode b){
if(root == null) return null;
if(root.value == a.value || root.value == b.value) return root;
List<graphNode> list = root.children;
int count = 0;
List<graphNode> temp = new ArrayList<>();
for(graphNode node : list){
graphNode res = findAncestor(node, a, b);
temp.add(res);
if(res != null) {
count++;
}
}
if(count == 2) return root;
for(graphNode t : temp){
if(t != null) return t;
}
return null;
}
}
class graphNode{
char value;
graphNode parent;
List<graphNode> children;
public graphNode(char value){
this.value = value;
}
}
Everyone, please try this one in Java:
static String recentCommonAncestor(String[] commitHashes, String[][] ancestors, String strID, String strID1)
{
HashSet<String> setOfAncestorsLower = new HashSet<String>();
HashSet<String> setOfAncestorsUpper = new HashSet<String>();
String[] arrPair= {strID, strID1};
Arrays.sort(arrPair);
Comparator<String> comp = new Comparator<String>(){
@Override
public int compare(String s1, String s2) {
return s2.compareTo(s1);
}};
int indexUpper = Arrays.binarySearch(commitHashes, arrPair[0], comp);
int indexLower = Arrays.binarySearch(commitHashes, arrPair[1], comp);
setOfAncestorsLower.addAll(Arrays.asList(ancestors[indexLower]));
setOfAncestorsUpper.addAll(Arrays.asList(ancestors[indexUpper]));
HashSet<String>[] sets = new HashSet[] {setOfAncestorsLower, setOfAncestorsUpper};
for (int i = indexLower + 1; i < commitHashes.length; i++)
{
for (int j = 0; j < 2; j++)
{
if (sets[j].contains(commitHashes[i]))
{
if (i > indexUpper)
if(sets[1 - j].contains(commitHashes[i]))
return commitHashes[i];
sets[j].addAll(Arrays.asList(ancestors[i]));
}
}
}
return null;
}
The idea is very simple. We assume that commitHashes is ordered in descending order.
We find the lower and upper indexes of the two strings (the hashes, that is).
Clearly (given the descending order) the common ancestor can only appear after the upper index (the lower value among the hashes).
Then we start enumerating the commit hashes and build chains of descendant-parent relations. For this purpose we keep two hash sets, initialised with the parents of the lower and of the upper commit hash: setOfAncestorsLower and setOfAncestorsUpper. If the next commit hash belongs to either chain (hash set),
then, if the current index is past the index of the upper hash and the hash is also contained in the other set (chain), we return the current hash as the result. If not, we add its parents (ancestors[i]) to the hash set that traces the set of ancestors in which the current element was found. That is basically all.

Find all the paths forming simple cycles on an undirected graph

I am having some trouble writing an algorithm that returns all the paths forming simple cycles on an undirected graph.
I am considering at first all cycles starting from a vertex A, which would be, for the graph below
A,B,E,G,F
A,B,E,D,F
A,B,C,D,F
A,B,C,D,E,G,F
Additional cycles would be
B,C,D,E
F,D,E,G
but these could be found, for example, by calling the same algorithm again but starting from B and from D, respectively.
The graph is shown below -
My current approach is to build all the possible paths from A by visiting all the neighbors of A, and then the neighbors of the neighbors, and so on, while following these rules:
each time that more than one neighbor exist, a fork is found and a new path from A is created and explored.
if any of the created paths visits the original vertex, that path is a cycle.
if any of the created paths visits the same vertex twice (different from A) the path is discarded.
continue until all possible paths have been explored.
I am currently having problems trying to avoid the same cycle being found more than once, and I am trying to solve this by checking whether the new neighbor is already part of another existing path, so that the two paths combined (if independent) build up a cycle.
My question is: am I following the correct/better/simpler logic to solve this problem?
I would appreciate your comments.
Based on the answer of @eminsenay to another question, I used the elementaryCycles library developed by Frank Meyer, from web_at_normalisiert_dot_de, which implements Johnson's algorithm.
However, since this library is for directed graphs, I added some routines to:
build the adjacency matrix from a JGraphT undirected graph (needed by Meyer's lib)
filter the results to avoid cycles of length 2
delete repeated cycles, since Meyer's lib is for directed graphs, and each undirected cycle is two directed cycles (one on each direction).
The code is
package test;
import java.util.*;
import org.jgraph.graph.DefaultEdge;
import org.jgrapht.UndirectedGraph;
import org.jgrapht.graph.SimpleGraph;
public class GraphHandling<V> {
private UndirectedGraph<V,DefaultEdge> graph;
private List<V> vertexList;
private boolean adjMatrix[][];
public GraphHandling() {
this.graph = new SimpleGraph<V, DefaultEdge>(DefaultEdge.class);
this.vertexList = new ArrayList<V>();
}
public void addVertex(V vertex) {
this.graph.addVertex(vertex);
this.vertexList.add(vertex);
}
public void addEdge(V vertex1, V vertex2) {
this.graph.addEdge(vertex1, vertex2);
}
public UndirectedGraph<V, DefaultEdge> getGraph() {
return graph;
}
public List<List<V>> getAllCycles() {
this.buildAdjancyMatrix();
@SuppressWarnings("unchecked")
V[] vertexArray = (V[]) this.vertexList.toArray();
ElementaryCyclesSearch ecs = new ElementaryCyclesSearch(this.adjMatrix, vertexArray);
@SuppressWarnings("unchecked")
List<List<V>> cycles0 = ecs.getElementaryCycles();
// remove cycles of size 2
Iterator<List<V>> listIt = cycles0.iterator();
while(listIt.hasNext()) {
List<V> cycle = listIt.next();
if(cycle.size() == 2) {
listIt.remove();
}
}
// remove repeated cycles (two cycles are repeated if they have the same vertices, no matter the order)
List<List<V>> cycles1 = removeRepeatedLists(cycles0);
for(List<V> cycle : cycles1) {
System.out.println(cycle);
}
return cycles1;
}
private void buildAdjancyMatrix() {
Set<DefaultEdge> edges = this.graph.edgeSet();
Integer nVertex = this.vertexList.size();
this.adjMatrix = new boolean[nVertex][nVertex];
for(DefaultEdge edge : edges) {
V v1 = this.graph.getEdgeSource(edge);
V v2 = this.graph.getEdgeTarget(edge);
int i = this.vertexList.indexOf(v1);
int j = this.vertexList.indexOf(v2);
this.adjMatrix[i][j] = true;
this.adjMatrix[j][i] = true;
}
}
/* Here repeated lists are those with the same elements, no matter the order,
* and it is assumed that there are no repeated elements on any of the lists*/
private List<List<V>> removeRepeatedLists(List<List<V>> listOfLists) {
List<List<V>> inputListOfLists = new ArrayList<List<V>>(listOfLists);
List<List<V>> outputListOfLists = new ArrayList<List<V>>();
while(!inputListOfLists.isEmpty()) {
// get the first element
List<V> thisList = inputListOfLists.get(0);
// remove it
inputListOfLists.remove(0);
outputListOfLists.add(thisList);
// look for duplicates
Integer nEl = thisList.size();
Iterator<List<V>> listIt = inputListOfLists.iterator();
while(listIt.hasNext()) {
List<V> remainingList = listIt.next();
if(remainingList.size() == nEl) {
if(remainingList.containsAll(thisList)) {
listIt.remove();
}
}
}
}
return outputListOfLists;
}
}
I'm answering this on the basis that you want to find chordless cycles, but it can be modified to find cycles with chords.
This problem reduces to finding all (inclusion) minimal paths between two vertices s and t.
For all triplets, (v,s,t):
Either v,s,t form a triangle, in which case, output it and continue to next triplet.
Otherwise, remove v and its neighbors except s and t, and enumerate all s-t paths.
Finding all s-t-paths can be done by dynamic programming.

How to determine if a given directed graph is a tree

The input to the program is the set of edges in the graph. For e.g. consider the following simple directed graph:
a -> b -> c
The set of edges for this graph is
{ (b, c), (a, b) }
So given a directed graph as a set of edges, how do you determine if the directed graph is a tree? If it is a tree, what is the root node of the tree?
First off, I'm looking at how you will represent this graph: adjacency list, adjacency matrix, or anything else? How will you utilize the representation you have chosen to efficiently answer the above questions?
Edit 1:
Some people are mentioning using DFS for cycle detection, but the problem is which node to start the DFS from. Since it is a directed graph we cannot start the DFS from a random node; for example, if I started a DFS from vertex 'c' it wouldn't proceed further, since there is no back edge to go to any other node. The follow-up question here should be: how do you determine what the root of this tree is?
Here's a fairly direct method. It can be done with either an adjacency matrix or an edge list.
Find the set, R, of nodes that do not appear as the destination of any edge. If R does not have exactly one member, the graph is not a tree.
If R does have exactly one member, r, it is the only possible root.
Mark r.
Starting from r, recursively mark all nodes that can be reached by following edges from source to destination. If any node is already marked, there is a cycle and the graph is not a tree. (This step is the same as a previously posted answer).
If any node is not marked at the end of step 3, the graph is not a tree.
If none of those steps find that the graph is not a tree, the graph is a tree with r as root.
It's difficult to know what is going to be efficient without some information about the numbers of nodes and edges.
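A hedged Java sketch of those steps on an edge-list input (all names, and the String node type, are my assumptions):
import java.util.*;

public class DirectedTreeCheck {
    /** Returns the root if the edge list forms a rooted tree over the given nodes, otherwise null. */
    static String treeRoot(Set<String> nodes, List<String[]> edges) {
        // Step 1: the only possible root is the unique node that never appears as a destination.
        Set<String> candidates = new HashSet<>(nodes);
        Map<String, List<String>> children = new HashMap<>();
        for (String[] e : edges) {
            candidates.remove(e[1]);
            children.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
        }
        if (candidates.size() != 1) return null;
        String root = candidates.iterator().next();
        // Steps 2-3: mark everything reachable from the root; reaching a marked node means a cycle or a second parent.
        Set<String> marked = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>(List.of(root));
        marked.add(root);
        while (!stack.isEmpty())
            for (String c : children.getOrDefault(stack.pop(), List.of()))
                if (marked.add(c)) stack.push(c); else return null;
        // Step 4: every node must have been marked.
        return marked.equals(nodes) ? root : null;
    }

    public static void main(String[] args) {
        // The example from the question: a -> b -> c given as the edge set {(b, c), (a, b)}.
        Set<String> nodes = Set.of("a", "b", "c");
        List<String[]> edges = List.of(new String[]{"b", "c"}, new String[]{"a", "b"});
        System.out.println(treeRoot(nodes, edges)); // a
    }
}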
There are 3 properties to check if a graph is a tree:
(1) The number of edges in the graph is exactly one less than the number of vertices |E| = |V| - 1
(2) There are no cycles
(3) The graph is connected
I think this example algorithm can work in the cases of a directed graph:
# given a graph and a starting vertex, check if the graph is a tree
def checkTree(G, v):
    # |E| = |V| - 1
    if len(edges) != len(vertices) - 1:
        return False
    for u in vertices:
        visited[u] = False
    hasCycle = explore_and_check_cycles(G, v)
    # a tree is acyclic
    if hasCycle:
        return False
    for u in vertices:
        if not visited[u]:  # the graph isn't connected
            return False
    # otherwise it passes all properties of a tree
    return True

# given a graph and a vertex, explore all vertices reachable from that vertex
# and return True if an edge into an already visited vertex is found
# (in a directed graph that means a cycle or a second parent; either way, not a tree)
def explore_and_check_cycles(G, v):
    visited[v] = True
    for (x, u) in edges:
        if x == v:
            if visited[u]:
                return True
            if explore_and_check_cycles(G, u):
                return True
    return False
Sources:
Algorithms by S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani
http://www.cs.berkeley.edu/~vazirani/algorithms.html
start from the root, "mark" it and then go to all children and repeat recursively. if you reach a child that is already marked it means that it is not a tree...
Note: By no means the most efficient way, but conceptually useful. Sometimes you want efficiency, sometimes you want an alternate viewpoint for pedagogic reasons. This is most certainly the latter.
Algorithm: Starting with an adjacency matrix A of size n, take the matrix power A**n. If every entry of the result is zero, the graph has no directed cycles (A is nilpotent). Combined with showing that it is connected and has |V| - 1 edges, it must be a tree. See Nilpotent matrix for more information.
To find the root node, we assume that you've shown the graph is a connected tree. Let k be the number of times you have to raise the power A**k before the matrix becomes all zero. Take the transpose to the (k-1) power, A.T ** (k-1). The only non-zero entries lie in the root's column, since only the root has paths of length k-1 leaving it.
Analysis: A rough worst-case analysis shows that it is bounded above by O(n^4): O(n^3) for each matrix multiplication, done at most n times. You can do better by diagonalizing the matrix, which should bring it down to O(n^3). Considering that this problem can be done in O(n) time and O(1) space, this is only a useful exercise in logic and understanding of the problem.
For a directed graph, the underlying undirected graph will be a tree if it is acyclic and fully connected. On top of that, the directed graph is a rooted tree if each vertex has in-degree 1, except for one root vertex with in-degree 0.
If the adjacency-list representation also tracks the in-degree of each vertex, then we can apply the above rule easily. Otherwise, we should apply a tweaked DFS to check that the underlying undirected graph has no loops and that |E| = |V| - 1.
Following is the code I have written for this. Feel free to suggest optimisations.
import java.util.*;
import java.lang.*;
import java.io.*;
class Graph
{
private static int V;
private static int adj[][];
static void initializeGraph(int n)
{
V=n+1;
adj = new int[V][V];
for(int i=0;i<V;i++)
{
for(int j=0;j<V ;j++)
adj[i][j]= 0;
}
}
static int isTree(int edges[][],int n)
{
initializeGraph(n);
for(int i=0;i< edges.length;i++)
{
addDirectedEdge(edges[i][0],edges[i][1]);
}
int root = findRoot();
if(root == -1)
return -1;
boolean visited[] = new boolean[V];
boolean isTree = isTree(root, visited, -1);
boolean isConnected = isConnected(visited);
System.out.println("isTree= "+ isTree + " isConnected= "+ isConnected);
if(isTree && isConnected)
return root;
else
return -1;
}
static boolean isTree(int node, boolean visited[], int parent)
{
// System.out.println("node =" +node +" parent" +parent);
visited[node] = true;int j;
for(j =1;j<V;j++)
{
// System.out.println("node =" +node + " j=" +j+ "parent" + parent);
if(adj[node][j]==1)
{
if(visited[j])
{
// System.out.println("returning false for j="+ j);
return false;
}
else
{ //visit all adjacent vertices
boolean child = isTree(j, visited, node);
if(!child)
{
// System.out.println("returning false for j="+ j + " node=" +node);
return false;
}
}
}
}
if(j==V)
return true;
else
return false;
}
static int findRoot()
{
int root =-1, j=0,i;
int count =0;
for(j=1;j<V ;j++)
{
count=0;
for(i=1 ;i<V;i++)
{
if(adj[i][j]==1)
count++;
}
// System.out.println("j"+j +" count="+count);
if(count==0)
{
// System.out.println(j);
return j;
}
}
return -1;
}
static void addDirectedEdge(int s, int d)
{
// System.out.println("s="+ s+"d="+d);
adj[s][d]=1;
}
static boolean isConnected(boolean visited[])
{
for(int i=1; i<V;i++)
{
if(!visited[i])
return false;
}
return true;
}
public static void main (String[] args) throws java.lang.Exception
{
int edges[][]= {{2,3},{2,4},{3,1},{3,5},{3,7},{4,6}, {2,8}, {8,9}};
int n=9;
int root = isTree(edges,n);
System.out.println("root is:" + root);
int edges2[][]= {{2,3},{2,4},{3,1},{3,5},{3,7},{4,6}, {2,8}, {6,3}};
int n2=8;
root = isTree(edges2,n2);
System.out.println("root is:" + root);
}
}

Enumerate all paths in a weighted graph from A to B where path length is between C1 and C2

Given two points A and B in a weighted graph, find all paths from A to B where the length of the path is between C1 and C2.
Ideally, each vertex should only be visited once, although this is not a hard requirement. I suppose I could use a heuristic to sort the results of the algorithm to weed out "silly" paths (e.g. a path that just visits the same two nodes over and over again)
I can think of simple brute force algorithms, but are there any more sophisticed algorithms that will make this more efficient? I can imagine as the graph grows this could become expensive.
In the application I am developing, A & B are actually the same point (i.e. the path must return to the start), if that makes any difference.
Note that this is an engineering problem, not a computer science problem, so I can use an algorithm that is fast but not necessarily 100% accurate. i.e. it is ok if it returns most of the possible paths, or if most of the paths returned are within the given length range.
[UPDATE]
This is what I have so far. I have this working on a small graph (30 nodes with around 100 edges). The time required is < 100ms
I am using a directed graph.
I do a depth first search of all possible paths.
At each new node
For each edge leaving the node
Reject the edge if the path we have already contains this edge (in other words, never go down the same edge in the same direction twice)
Reject the edge if it leads back to the node we just came from (in other words, never double back. This removes a lot of 'silly' paths)
Reject the edge if (minimum distance from the end node of the edge to the target node B + the distance travelled so far) > Maximum path length (C2)
If the end node of the edge is our target node B:
If the path fits within the length criteria, add it to the list of suitable paths.
Otherwise reject the edge (in other words, we only ever visit the target node B at the end of the path. It won't be an intermediate point on a path)
Otherwise, add the edge to our path and recurse into its target node.
I use Dijkstra to precompute the minimum distance of all nodes to the target node.
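For illustration, here is a sketch of that pruned DFS in Java (my names, not your code). It assumes minDistToB has already been precomputed, e.g. with Dijkstra on the reversed graph as you describe, and it applies your rejection rules plus the "B only at the end" rule; -1 is used as a sentinel for "no previous node".
import java.util.*;

public class RangePathSearch {
    // Weighted directed edge; all names here are illustrative, not from the question.
    record Edge(int from, int to, double w) {}

    /** Depth-first enumeration of paths from 'current' to b with total length in [c1, c2]. */
    static void search(Map<Integer, List<Edge>> adj, Map<Integer, Double> minDistToB,
                       int current, int cameFrom, int b, double soFar, double c1, double c2,
                       Set<Edge> usedEdges, Deque<Integer> path, List<List<Integer>> found) {
        for (Edge e : adj.getOrDefault(current, List.of())) {
            if (usedEdges.contains(e)) continue;                 // never traverse the same directed edge twice
            if (e.to() == cameFrom) continue;                    // never double straight back
            double len = soFar + e.w();
            double lowerBound = minDistToB.getOrDefault(e.to(), Double.POSITIVE_INFINITY);
            if (len + lowerBound > c2) continue;                 // cannot possibly finish within c2: prune
            if (e.to() == b) {                                   // b may only appear as the final node
                if (len >= c1 && len <= c2) {
                    List<Integer> p = new ArrayList<>(path);
                    p.add(b);
                    found.add(p);
                }
                continue;
            }
            usedEdges.add(e);
            path.addLast(e.to());
            search(adj, minDistToB, e.to(), current, b, len, c1, c2, usedEdges, path, found);
            path.removeLast();
            usedEdges.remove(e);
        }
    }

    public static void main(String[] args) {
        // Toy graph: 1->2 (1), 1->3 (2), 2->4 (1), 3->4 (2); target node is 4.
        Map<Integer, List<Edge>> adj = Map.of(
                1, List.of(new Edge(1, 2, 1), new Edge(1, 3, 2)),
                2, List.of(new Edge(2, 4, 1)),
                3, List.of(new Edge(3, 4, 2)),
                4, List.of());
        Map<Integer, Double> minDistToB = Map.of(1, 2.0, 2, 1.0, 3, 2.0, 4, 0.0); // hand-computed for this toy graph
        List<List<Integer>> found = new ArrayList<>();
        search(adj, minDistToB, 1, -1, 4, 0, 0, 3, new HashSet<>(), new ArrayDeque<>(List.of(1)), found);
        System.out.println(found); // [[1, 2, 4]]  (1-3-4 has length 4, so the pruning rejects it)
    }
}
The caller seeds path with the start node A and calls search as in main; since in your application A and B coincide, the same call enumerates the cycles through that point.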
I wrote some Java code to test the DFS approach I suggested: the code does not check for paths in range, but prints all paths. It should be simple to modify it to keep only those in range. I also ran some simple tests. It seems to give correct results with 10 vertices and 50 edges or so, though I did not find time for thorough testing. I also ran it for 100 vertices and 1000 edges. It doesn't run out of memory and keeps printing new paths until I kill it, of which there are a lot. This is not surprising for randomly generated dense graphs, but may not be the case for real-world graphs, for example those where vertex degrees follow a power law (especially with narrow weight ranges). Also, if you are just interested in how path lengths are distributed in a range, you can stop once you have generated a certain number of paths.
The program outputs the following:
a) the adjacency list of a randomly generated graph.
b) Set of all paths it has found till now.
public class AllPaths {
int numOfVertices;
int[] status;
AllPaths(int numOfVertices){
this.numOfVertices = numOfVertices;
status = new int[numOfVertices+1];
}
HashMap<Integer,ArrayList<Integer>>adjList = new HashMap<Integer,ArrayList<Integer>>();
class FoundSubpath{
int pathWeight=0;
int[] vertices;
}
// For each vertex, a list of all subpaths of length less than UB found so far.
HashMap<Integer,ArrayList<FoundSubpath>>allSubpathsFromGivenVertex = new HashMap<Integer,ArrayList<FoundSubpath>>();
public void printInputGraph(){
System.out.println("Random Graph Adjacency List:");
for(int i=1;i<=numOfVertices;i++){
ArrayList<Integer>toVtcs = adjList.get(new Integer(i));
System.out.print(i+ " ");
if(toVtcs==null){
continue;
}
for(int j=0;j<toVtcs.size();j++){
System.out.print(toVtcs.get(j)+ " ");
}
System.out.println(" ");
}
}
public void randomlyGenerateGraph(int numOfTrials){
Random rnd = new Random();
for(int i=1;i < numOfTrials;i++){
Integer fromVtx = new Integer(rnd.nextInt(numOfVertices)+1);
Integer toVtx = new Integer(rnd.nextInt(numOfVertices)+1);
if(fromVtx.equals(toVtx)){
continue;
}
ArrayList<Integer>toVtcs = adjList.get(fromVtx);
boolean alreadyAdded = false;
if(toVtcs==null){
toVtcs = new ArrayList<Integer>();
}else{
for(int j=0;j<toVtcs.size();j++){
if(toVtcs.get(j).equals(toVtx)){
alreadyAdded = true;
break;
}
}
}
if(!alreadyAdded){
toVtcs.add(toVtx);
adjList.put(fromVtx, toVtcs);
}
}
}
public void addAllViableSubpathsToMap(ArrayList<Integer>VerticesTillNowInPath){
FoundSubpath foundSpObj;
ArrayList<FoundSubpath>foundPathsList;
for(int i=0;i<VerticesTillNowInPath.size()-1;i++){
Integer startVtx = VerticesTillNowInPath.get(i);
if(allSubpathsFromGivenVertex.containsKey(startVtx)){
foundPathsList = allSubpathsFromGivenVertex.get(startVtx);
}else{
foundPathsList = new ArrayList<FoundSubpath>();
}
foundSpObj = new FoundSubpath();
foundSpObj.vertices = new int[VerticesTillNowInPath.size()-i-1];
int cntr = 0;
for(int j=i+1;j<VerticesTillNowInPath.size();j++){
foundSpObj.vertices[cntr++] = VerticesTillNowInPath.get(j);
}
foundPathsList.add(foundSpObj);
allSubpathsFromGivenVertex.put(startVtx,foundPathsList);
}
}
public void printViablePaths(Integer v,ArrayList<Integer>VerticesTillNowInPath){
ArrayList<FoundSubpath>foundPathsList;
foundPathsList = allSubpathsFromGivenVertex.get(v);
if(foundPathsList==null){
return;
}
for(int j=0;j<foundPathsList.size();j++){
for(int i=0;i<VerticesTillNowInPath.size();i++){
System.out.print(VerticesTillNowInPath.get(i)+ " ");
}
FoundSubpath fpObj = foundPathsList.get(j) ;
for(int k=0;k<fpObj.vertices.length;k++){
System.out.print(fpObj.vertices[k]+" ");
}
System.out.println("");
}
}
boolean DfsModified(Integer v,ArrayList<Integer>VerticesTillNowInPath,Integer source,Integer dest){
if(v.equals(dest)){
addAllViableSubpathsToMap(VerticesTillNowInPath);
status[v] = 2;
return true;
}
// If vertex v is already explored till destination, just print all subpaths that meet criteria, using hashmap.
if(status[v] == 1 || status[v] == 2){
printViablePaths(v,VerticesTillNowInPath);
}
// Vertex in current path. Return to avoid cycle.
if(status[v]==1){
return false;
}
if(status[v]==2){
return true;
}
status[v] = 1;
boolean completed = true;
ArrayList<Integer>toVtcs = adjList.get(v);
if(toVtcs==null){
status[v] = 2;
return true;
}
for(int i=0;i<toVtcs.size();i++){
Integer vDest = toVtcs.get(i);
VerticesTillNowInPath.add(vDest);
boolean explorationComplete = DfsModified(vDest,VerticesTillNowInPath,source,dest);
if(explorationComplete==false){
completed = false;
}
VerticesTillNowInPath.remove(VerticesTillNowInPath.size()-1);
}
if(completed){
status[v] = 2;
}else{
status[v] = 0;
}
return completed;
}
}
public class AllPathsCaller {
public static void main(String[] args){
int numOfVertices = 20;
/* This is the number of attempts made to create an edge. The edge is usually created, but may not be (e.g. if an edge already exists between the randomly attempted source and destination). */
int numOfEdges = 200;
int src = 1;
int dest = 10;
AllPaths allPaths = new AllPaths(numOfVertices);
allPaths.randomlyGenerateGraph(numOfEdges);
allPaths.printInputGraph();
ArrayList<Integer>VerticesTillNowInPath = new ArrayList<Integer>();
VerticesTillNowInPath.add(new Integer(src));
System.out.println("List of Paths");
allPaths.DfsModified(new Integer(src),VerticesTillNowInPath,new Integer(src),new Integer(dest));
System.out.println("done");
}
}
I think you are on the right track with BFS. I came up with some rough vaguely java-like pseudo-code for a proposed solution using BFS. The idea is to store subpaths found during previous traversals, and their lengths, for reuse. I'll try to improve the code sometime today when I find the time, but hopefully it gives a clue as to where I am going with this. The complexity, I am guessing, should be order O(E).
Further comments:
This seems like a reasonable approach, though I am not sure I understand completely. I've constructed a simple example to make sure I do. Lets consider a simple graph with all edges weighted 1, and adjacency list representation as follows:
A->B,C
B->C
C->D,F
F->D
Say we wanted to find all paths from A to F, not just those in range, and destination vertices from a source vertex are explored in alphabetic order. Then the algorithm would work as follows:
First starting with B:
ABCDF
ABCF
Then starting with C:
ACDF
ACF
Is that correct?
A simple improvement in that case would be to store, for each vertex visited, the paths found after the first visit to that node. For example, once you visit C from B, you find that there are two paths to F from C: CF and CDF. You can save this information, and on the next visit to C you can just append CF and CDF to the path you have found so far, without exploring further.
To find edges in range, you can use the conditions you already described for paths generated as above.
A further thought: maybe you do not need to run Dijkstra's to find shortest paths at all. A subpath's length is found the first time you traverse that subpath. So, in this example, the lengths of CDF and CF are found the first time you visit C via B. This information can be used for pruning the next time C is visited, directly via A. This length will be more accurate than the one found by Dijkstra's, as it is the exact value, not a lower bound.
Further comments:
The algorithm can probably be improved with some thought. For example, each time the relaxation step is executed in Dijkstra's algorithm (steps 16-19 in the wikipedia description), the rejected older/newer subpath can be remembered using some data structure, if the older path is a plausible candidate (less than upper bound). In the end, it should be possible to reconstruct all the rejected paths, and keep the ones in range.
This algorithm should be O(V^2).
I think visiting each vertex only once may be too optimistic: algorithms such as Dijkstra's shortest path have complexity O(V^2) for finding a single path, the shortest path. Finding all paths (including the shortest path) is a harder problem, so it should have complexity at least O(V^2).
My first thought on approaching the problem is a variation of Dijkstra's shortest path algorithm. Applying this algorithm once would give you the length of the shortest path. This gives you a lower bound on the path length between the two vertices. Removing one edge at a time from this shortest path and recalculating the shortest path should give you slightly longer paths.
In turn, edges can be removed from these slightly longer paths to generate more paths, and so on. You can stop once you have a sufficient number of paths, or if the paths you generate are over your upper bound.
This is my first guess. I am a newbie to stackoverflow: any feedback is welcome.
