Boost Graph Library A* for consistent heuristic

I've read that, when applying A* to a problem, if your heuristic is consistent then you can further optimize the A* search. The Boost Graph Library offers two versions of the A* algorithm: astar_search and astar_search_tree. The documentation isn't very clear on the distinction between the two; does one of these perform the optimized search which assumes a consistent heuristic?

I got the answer I was looking for by consulting the Boost mailing list. I'll reproduce the answer here for posterity:
The distinction is whether there should be an effort made to avoid visiting the same vertex multiple times. For a graph where vertices can often be reached using many paths, you should prefer astar_search to avoid extra work from revisiting that vertex and its successors. If it is unlikely that the same vertex will appear on multiple paths, or checking vertices is inexpensive enough that it is not worthwhile to avoid repeated work, use astar_search_tree, which does not remember which vertices it has visited previously. The disadvantage of trying to find repeated vertices is that it requires a growing amount of memory to store the lookup table of which vertices have been seen before, and it takes time to search and update this table. Both versions of the algorithm require admissible heuristics to work correctly.
Link to the thread

The difference between the _tree and the non-_tree versions of the Boost graph algorithms is that the _tree version assumes your graph is actually a tree, i.e., it has no cycles and each node has only one incoming edge.
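To make the trade-off concrete, here is a minimal sketch of both behaviors in Go. This is not Boost's C++ API; the graph, heuristic, and the useClosedSet flag are all illustrative. The only difference between the two modes is whether a table of settled vertices is kept.

```go
// Sketch contrasting A* with a closed set (astar_search-like) versus
// without one (astar_search_tree-like). Illustrative only.
package main

import (
	"container/heap"
	"fmt"
)

type edge struct {
	to     int
	weight float64
}

type item struct {
	node     int
	priority float64 // g + h
	g        float64
}

type pq []item

func (p pq) Len() int            { return len(p) }
func (p pq) Less(i, j int) bool  { return p[i].priority < p[j].priority }
func (p pq) Swap(i, j int)       { p[i], p[j] = p[j], p[i] }
func (p *pq) Push(x interface{}) { *p = append(*p, x.(item)) }
func (p *pq) Pop() interface{} {
	old := *p
	x := old[len(old)-1]
	*p = old[:len(old)-1]
	return x
}

// astar returns the cost of the shortest path from start to goal.
// If useClosedSet is true, settled vertices are never expanded again
// (valid when the heuristic is consistent); if false, vertices may be
// re-expanded, trading repeated work for not keeping a visited table.
func astar(adj [][]edge, h func(int) float64, start, goal int, useClosedSet bool) float64 {
	open := &pq{{node: start, priority: h(start), g: 0}}
	closed := make(map[int]bool)
	for open.Len() > 0 {
		cur := heap.Pop(open).(item)
		if cur.node == goal {
			return cur.g
		}
		if useClosedSet {
			if closed[cur.node] {
				continue
			}
			closed[cur.node] = true
		}
		for _, e := range adj[cur.node] {
			g := cur.g + e.weight
			heap.Push(open, item{node: e.to, priority: g + h(e.to), g: g})
		}
	}
	return -1 // goal unreachable
}

func main() {
	// Tiny example graph: 0 -> 1 -> 3 and 0 -> 2 -> 3.
	adj := [][]edge{
		{{1, 1}, {2, 4}},
		{{3, 5}},
		{{3, 1}},
		{},
	}
	h := func(int) float64 { return 0 } // trivially consistent heuristic
	fmt.Println(astar(adj, h, 0, 3, true))  // 5
	fmt.Println(astar(adj, h, 0, 3, false)) // 5, possibly with repeated expansions
}
```

With a consistent heuristic, a vertex's first expansion is already optimal, so the closed-set variant never repeats work; that is the optimization the question asks about.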

Related

Find all cycles in a directed graph in Golang

I'm trying to generate all the cycles contained in a directed graph (or at least a few of them) using Golang.
I currently have two structs :
Node : { ID (string), resolved (bool), edges ([]Edge) }
Edge : { ID (string), start (Node), end (Node), weight (Float64)}
The cycle weight is not an issue (for the moment).
I've found some answers regarding how to detect cycles, or find the shortest path, etc., but I didn't find an algorithm that quite helps me.
How shall I proceed? (any suggestion is welcome)
There are two parts to the question.
Regarding algorithms to detect all cycles in a graph, take a look at this related question (since this is not go-specific), there are useful explanations and pseudo-code that you can use to implement your solution.
Finding all cycles in a directed graph
As for specific Go code, there are several libraries out there that work with graphs; you can take a look at their documentation and source code (they might even provide functionality that you can use out of the box to solve your problem).
For example: https://godoc.org/github.com/twmb/algoimpl/go/graph
I would suggest starting with defining what a cycle is - for example let's suppose it is a traversal through the graph that starts and ends in the same node.
To enumerate all cycles with this definition, you'll need to consider all paths starting from all nodes, and check if any of those paths go back to their start point.
However, observe that this definition can actually count each cyclical subgraph many times: a cycle can be entered at any node along its path, so is that one cycle or several? And things get even more complicated if the paths of several cycles intersect: the number of cyclical paths increases drastically, and it's not very clear which cycles are "the same".
I hope it's easy to see that a brute-force approach is intractable for anything but very small and simple graphs, and that something more modest, say finding minimal cycles, or even just identifying cyclic subgraphs, may be enough for your purposes.
As already mentioned by @eugenioy, this has been asked before, and you can probably narrow down your question by looking at the answers in that thread.
So, depending on what you mean by "all" and what you mean by "cycles", you can probably find an algorithm that defines cycles the same way you are interested in, and then ask a more focused question if you're having trouble translating it to Go, which I don't think your question is really about at the moment.
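To illustrate the brute-force baseline discussed above, here is a minimal sketch in Go, using a plain adjacency map rather than your Node/Edge structs. It walks every path from every start node and records paths that return to their start. Note that it reports each cycle once per rotation, which is exactly the counting subtlety mentioned earlier, and its running time is exponential; Johnson's algorithm, covered in the linked thread, is the standard remedy for larger graphs.

```go
package main

import "fmt"

// findCycles enumerates cycles by walking every path from every start node
// and recording paths that come back to the start. Brute force: fine for
// small graphs, intractable for large ones.
func findCycles(adj map[string][]string) [][]string {
	var cycles [][]string
	var path []string
	onPath := make(map[string]bool)

	var dfs func(start, cur string)
	dfs = func(start, cur string) {
		for _, next := range adj[cur] {
			if next == start {
				// Path closed back to its start: record a copy.
				cycle := append(append([]string{}, path...), start)
				cycles = append(cycles, cycle)
				continue
			}
			if onPath[next] {
				continue // already on the current path
			}
			onPath[next] = true
			path = append(path, next)
			dfs(start, next)
			path = path[:len(path)-1]
			onPath[next] = false
		}
	}

	for start := range adj {
		onPath[start] = true
		path = []string{start}
		dfs(start, start)
		onPath[start] = false
	}
	return cycles
}

func main() {
	adj := map[string][]string{
		"a": {"b"},
		"b": {"c"},
		"c": {"a", "b"},
	}
	for _, c := range findCycles(adj) {
		fmt.Println(c) // e.g. [a b c a], [b c b], and their rotations
	}
}
```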

What data structure to use for digraph paths?

I'm trying to represent a transitive relation (in a database) and having a hard time working out the best data structure.
Basically, the data structure is a series of pairs A → B such that if A → B and B → C, then implicitly A → C. It's important to me to be able to identify which entries are original input and which entries exist implicitly. Asking if A → C is equivalent to me having a digraph and asking if there exists a path from A to C in that digraph.
I could just represent the original entries, but if I do that, then it takes a lot of time to determine whether two items are related, since I need to search for all possible paths and this is rather slow.
Alternatively, I can store the original edges as well as a listing of all paths. This makes adding a new edge easy: when I add A → B, I can just take the Cartesian product of the paths ending in A and the paths starting in B and put them together. This has a significant space overhead of O(n²) in the worst case, but has the nice property that lookups, by far the most common operation, will be constant time. The issue is deleting, where I cannot think of anything other than recalculating all paths that may or may not run through the deleted edge, and this can be really nasty.
Does anyone have any better ideas?
Technical notes: the digraph may be cyclic, but the relation is reflexive so I don't need to represent the reflexivity or store anything about it.
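For reference, here is a minimal Go sketch of the insertion scheme I described, except storing reachable pairs (the transitive closure) rather than explicit path lists; all names are illustrative. It keeps original and implicit entries distinguishable and gives constant-time lookups; deletion is deliberately omitted, since that is exactly the part I can't solve.

```go
package main

import "fmt"

// closure tracks original edges and the full reachability relation, updating
// the relation on every insert by joining everything that reaches the new
// edge's tail with everything reachable from its head.
type closure struct {
	original map[[2]string]bool // user-inserted edges
	reach    map[[2]string]bool // original + implicit pairs
}

func newClosure() *closure {
	return &closure{
		original: make(map[[2]string]bool),
		reach:    make(map[[2]string]bool),
	}
}

func (c *closure) add(a, b string) {
	c.original[[2]string{a, b}] = true
	// Collect everything that reaches a (plus a itself) and everything
	// reachable from b (plus b itself), then relate all pairs.
	froms := []string{a}
	tos := []string{b}
	for p := range c.reach {
		if p[1] == a {
			froms = append(froms, p[0])
		}
		if p[0] == b {
			tos = append(tos, p[1])
		}
	}
	for _, f := range froms {
		for _, t := range tos {
			c.reach[[2]string{f, t}] = true
		}
	}
}

// related answers "does f reach t?" in constant time (reflexivity is implicit).
func (c *closure) related(f, t string) bool {
	return f == t || c.reach[[2]string{f, t}]
}

// isOriginal distinguishes user input from implicit pairs.
func (c *closure) isOriginal(f, t string) bool {
	return c.original[[2]string{f, t}]
}

func main() {
	c := newClosure()
	c.add("A", "B")
	c.add("B", "C")
	fmt.Println(c.related("A", "C"), c.isOriginal("A", "C")) // true false
}
```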
This is called the Reachability problem.
It would seem that you want an efficient online algorithm, which is an open problem, and an area of much research.
See my similar question on cs.SE: An incrementally-condensed transitive-reduction of a DAG, with efficient reachability queries, where I reference several related questions across Stack Exchange:
Related:
What is the fastest deterministic algorithm for dynamic digraph reachability with no edge deletion?
What is the fastest deterministic algorithm for incremental DAG reachability?
Does an algorithm exist to efficiently maintain connectedness information for a DAG in presence of inserts/deletes?
Is there an online-algorithm to keep track of components in a changing undirected graph?
Dynamic shortest path data structure for DAG
Note that even though some algorithms might be for a DAG only, if they support condensation (that is, collapsing strongly connected components into one node, since they are considered equal, i.e. they reach each other), it is equivalent: after condensation, you can query the graph for the representative node in place of any of the condensed nodes (because the condensed nodes were all reachable from each other, and thus related to the rest of the graph in exactly the same way).
My conclusion is that, as of yet, there does not seem to be an efficient way to do this (on the order of O(log n) queries for a dynamic graph, with output-sensitive update times on the condensed graph). For less efficient ways, see the related links above.
The closest practical algorithm I found was here (source), which is an interesting read. I am not sure how easy or practical it would be to adapt that data structure, or any data structure in any paper you will find, to a database.
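To make the condensation step concrete, here is a minimal Tarjan SCC sketch in Go (illustrative only; the adjacency-list representation and names are mine). It maps every node to a representative of its strongly connected component, after which any DAG-only reachability structure can be run on the representatives.

```go
package main

import "fmt"

// condense maps each node to the representative of its strongly connected
// component using Tarjan's algorithm. Reachability on a cyclic graph can
// then be answered on the DAG of representatives.
func condense(adj [][]int) []int {
	n := len(adj)
	index := make([]int, n)
	low := make([]int, n)
	rep := make([]int, n)
	onStack := make([]bool, n)
	for i := range index {
		index[i] = -1
		rep[i] = -1
	}
	var stack []int
	counter := 0

	var strongconnect func(v int)
	strongconnect = func(v int) {
		index[v] = counter
		low[v] = counter
		counter++
		stack = append(stack, v)
		onStack[v] = true
		for _, w := range adj[v] {
			if index[w] == -1 {
				strongconnect(w)
				if low[w] < low[v] {
					low[v] = low[w]
				}
			} else if onStack[w] && index[w] < low[v] {
				low[v] = index[w]
			}
		}
		if low[v] == index[v] {
			// v roots an SCC: pop the component, using v as representative.
			for {
				w := stack[len(stack)-1]
				stack = stack[:len(stack)-1]
				onStack[w] = false
				rep[w] = v
				if w == v {
					break
				}
			}
		}
	}

	for v := 0; v < n; v++ {
		if index[v] == -1 {
			strongconnect(v)
		}
	}
	return rep
}

func main() {
	// 0 <-> 1 form a cycle; 2 hangs off it.
	adj := [][]int{{1}, {0, 2}, {}}
	fmt.Println(condense(adj)) // [0 0 2]
}
```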
PS. Consider asking CS-related questions on cs.stackexchange.com in the future.

How to compute single source shortest paths with OSRM?

I've been playing around with the OSRM routing library recently. It seems to be highly efficient at solving the shortest path problem. However, I didn't see how to compute single source shortest paths with it. More precisely, given a fixed starting point, compute the shortest distances to all locations that can be reached within a given distance limit (e.g., reachable within 30 minutes).
OSRM uses contraction hierarchies internally. From my understanding, this technique is way superior to Dijkstra's algorithm when it comes to computing the distance between two locations in real world data. However, for my problem, Dijkstra's algorithm seems to fit better, doesn't it?
Does OSRM provide an API to compute single source shortest path problems (with a limit on the distance)? Are there other free routing libraries that are better suited for this type of problem? Preferably one with good support for OpenStreetMap data.
OSRM uses contraction hierarchies (CH), which is what makes it so fast for one-to-one routing. To make CH work you need an adapted bidirectional algorithm (A*, Dijkstra, ...), so the single-source case is more difficult. But a one-to-many algorithm is relatively simple if you know up front which destinations you want.
Also have a look at the paper "Fast Detour Computation for Ride Sharing", or here if you want a solution for a "non goal-directed, bidirectional search" which uses lookup tables.
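As the question suggests, plain Dijkstra does fit the single-source-with-limit problem well. Here is a minimal sketch in Go on an abstract weighted graph (not OSRM's API; the graph and names are illustrative): the search simply stops once the cheapest queued entry exceeds the limit, since nothing beyond it can qualify.

```go
package main

import (
	"container/heap"
	"fmt"
)

// qitem doubles as a queue entry (node, distance from source) and an
// adjacency entry (target node, edge weight).
type qitem struct {
	node int
	dist float64
}

type queue []qitem

func (q queue) Len() int            { return len(q) }
func (q queue) Less(i, j int) bool  { return q[i].dist < q[j].dist }
func (q queue) Swap(i, j int)       { q[i], q[j] = q[j], q[i] }
func (q *queue) Push(x interface{}) { *q = append(*q, x.(qitem)) }
func (q *queue) Pop() interface{} {
	old := *q
	x := old[len(old)-1]
	*q = old[:len(old)-1]
	return x
}

// boundedDijkstra returns the shortest distance from source to every node
// reachable within limit (e.g. 30 minutes of travel time).
func boundedDijkstra(adj [][]qitem, source int, limit float64) map[int]float64 {
	dist := map[int]float64{source: 0}
	q := &queue{{source, 0}}
	for q.Len() > 0 {
		cur := heap.Pop(q).(qitem)
		if cur.dist > limit {
			break // every remaining entry is at least this far away
		}
		if cur.dist > dist[cur.node] {
			continue // stale entry
		}
		for _, e := range adj[cur.node] {
			d := cur.dist + e.dist
			if old, seen := dist[e.node]; d <= limit && (!seen || d < old) {
				dist[e.node] = d
				heap.Push(q, qitem{e.node, d})
			}
		}
	}
	return dist
}

func main() {
	// Edge weights as minutes: 0 -(10)-> 1 -(15)-> 2 -(20)-> 3.
	adj := [][]qitem{{{1, 10}}, {{2, 15}}, {{3, 20}}, {}}
	fmt.Println(boundedDijkstra(adj, 0, 30)) // map[0:0 1:10 2:25]
}
```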
other free routing libraries?
I would suggest my Java GraphHopper project ;)
but there are of course more: http://wiki.openstreetmap.org/wiki/Routing

Random spanning trees of bipartite graphs

I'm working on making some code using metaheuristics for finding good solutions to the Fixed Charge Transportation Problem (FCTP).
The problem I'm having is to generate a starting solution, based on finding a spanning tree for the underlying bipartite graph.
I want it to be a random spanning tree, so that I can run the procedure on the same problem multiple times, possibly getting different solutions.
I'm having some difficulties doing this. The approach I've gone for so far is to make a random permutation of the arcs, then iterate through this list, sequentially putting each arc into the basis if it won't create a cycle.
I need to find a fast method to check if including an arc will create a cycle. I don't want to "brute force" it, since this approach could take a large amount of time, given big problem instances.
Given that A is an array with a random permutation of the arcs, how would you go about making a selection procedure?
I've been working on this for a couple of hours now, and nothing I've tried has worked, most of it being nonsensical when it came to application...
Kruskal's algorithm is used for finding the minimum spanning tree. Fast cycle detection is not actually part of Kruskal's algorithm; the algorithm will work with a data structure that is able to find cycles fast, as well as with a slow naive attempt (however, the complexity will be different).
However, Kruskal's algorithm is on the right track here, since it usually uses a so-called union-find or disjoint-set data structure for fast detection of cycles. This is the part of the Kruskal's algorithm page on Wikipedia that you will need for your algorithm. It is also linked from Wikipedia: http://en.wikipedia.org/wiki/Disjoint-set_data_structure
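Here is a minimal Go sketch of the procedure described above: shuffle the arcs, then keep each arc whose endpoints lie in different components, which is Kruskal's algorithm with random order in place of sorted weights. (Illustrative only; the FCTP basis bookkeeping is omitted.)

```go
package main

import (
	"fmt"
	"math/rand"
)

// Disjoint-set (union-find) with path halving; find is near-constant
// amortized time, making the cycle check per arc effectively O(1).
type dsu []int

func newDSU(n int) dsu {
	d := make(dsu, n)
	for i := range d {
		d[i] = i
	}
	return d
}

func (d dsu) find(x int) int {
	for d[x] != x {
		d[x] = d[d[x]] // path halving
		x = d[x]
	}
	return x
}

type arc struct{ u, v int }

// randomSpanningTree shuffles the arcs and keeps each arc whose endpoints
// are in different components; keeping such an arc cannot create a cycle.
func randomSpanningTree(n int, arcs []arc) []arc {
	rand.Shuffle(len(arcs), func(i, j int) { arcs[i], arcs[j] = arcs[j], arcs[i] })
	d := newDSU(n)
	var tree []arc
	for _, a := range arcs {
		ru, rv := d.find(a.u), d.find(a.v)
		if ru != rv {
			d[ru] = rv // union the two components
			tree = append(tree, a)
		}
	}
	return tree
}

func main() {
	// Bipartite example: sources {0,1}, sinks {2,3}, all four arcs present.
	arcs := []arc{{0, 2}, {0, 3}, {1, 2}, {1, 3}}
	fmt.Println(randomSpanningTree(4, arcs)) // three arcs forming a random tree
}
```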
I found Kruskal's algorithm after long hours of research. I only needed to randomize the order in which I investigated the nodes of the graph.

Graph Isomorphism

Is there an algorithm or heuristics for graph isomorphism?
Corollary: A graph can be represented in different drawings.
What's the best approach to find different drawings of a graph?
It is a hell of a problem.
In general, the basic idea is to simplify the graph into a canonical form, and then perform comparison of canonical forms. Spanning trees are generated with this objective, but spanning trees are not unique, so you need to have a canonical way to create them.
After you have canonical forms, you can perform the isomorphism comparison (relatively) easily, but that's just the start, since non-isomorphic graphs can have the same spanning tree. (E.g., think about a spanning tree T and a single addition of an edge to it to create T'. These two graphs are not isomorphic, but they have the same spanning tree.)
Other techniques involve comparing descriptors (e.g. number of nodes, number of edges), which can produce false positives in general.
I suggest you start with the wiki page about the graph isomorphism problem. I also have a book to suggest: "Graph Theory and Its Applications". It's a tome, but worth every page.
As for your corollary, every possible spatial distribution of a given graph's vertices is isomorphic to it. So two isomorphic graphs have the same topology and they are, in the end, the same graph, from the topological point of view. Another matter is, for example, to find those isomorphic structures enjoying particular properties (e.g. with non-crossing edges, if they exist), and that depends on the properties you want.
One of the best algorithms out there for finding graph isomorphisms is VF2.
I've written a high-level overview of VF2 as applied to chemistry - where it is used extensively. The post touches on the differences between VF2 and Ullmann. There is also a from-scratch implementation of VF2 written in Java that might be helpful.
A very similar problem - graph automorphism - can be solved by saucy, which is available in source code. This finds all symmetries of a graph. If you have two graphs, join them into one and any isomorphism can be discovered as an automorphism of the join.
Disclaimer: I am one of co-authors of saucy.
There are algorithms to do this; however, I have not had cause to seriously investigate them as of yet. I believe Donald Knuth is either writing or has written on this subject in his The Art of Computer Programming series during his second pass at (re)writing it.
As for a simple way to do something that might work in practice on small graphs, I would recommend counting degrees, and then, for each vertex, also noting the set of degrees of its adjacent vertices. This gives you a set of potential vertex isomorphisms for each point. Then just try all of those (via brute force, but choosing the vertices in increasing order of the size of their potential-isomorphism sets) from this restricted set. Intuitively, most graph isomorphisms can be practically computed this way, though clearly there are degenerate cases that might take a long time.
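Here is a minimal Go sketch of that pruning idea (the representation and names are mine): each vertex gets a signature made of its degree plus the sorted degrees of its neighbors, and only vertices with equal signatures are candidates for each other. A backtracking search over the candidate sets, smallest first, would complete the test; that part is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// signature returns, for each vertex, its degree together with the sorted
// degrees of its neighbors. Two vertices can only correspond under an
// isomorphism if their signatures are equal, which prunes the brute force.
func signature(adj [][]int) []string {
	deg := make([]int, len(adj))
	for v, ns := range adj {
		deg[v] = len(ns)
	}
	sigs := make([]string, len(adj))
	for v, ns := range adj {
		nd := make([]int, len(ns))
		for i, w := range ns {
			nd[i] = deg[w]
		}
		sort.Ints(nd)
		sigs[v] = fmt.Sprint(deg[v], nd)
	}
	return sigs
}

// candidates lists, for each vertex of g, the vertices of h it could map to.
func candidates(g, h [][]int) [][]int {
	sg, sh := signature(g), signature(h)
	out := make([][]int, len(g))
	for v := range g {
		for w := range h {
			if sg[v] == sh[w] {
				out[v] = append(out[v], w)
			}
		}
	}
	return out
}

func main() {
	g := [][]int{{1}, {0, 2}, {1}} // path 0-1-2
	h := [][]int{{1, 2}, {0}, {0}} // the same path relabeled: 1-0-2
	fmt.Println(candidates(g, h))  // [[1 2] [0] [1 2]]
}
```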
I recently came across the following paper : http://arxiv.org/abs/0711.2010
This paper proposes "A Polynomial Time Algorithm for Graph Isomorphism"
My project - Griso - at sf.net: http://sourceforge.net/projects/griso/ with this description:
Griso is a graph isomorphism testing utility written in C++. It is based on my own polynomial-time algorithm (which is the salt of the project). See Griso's sample input/output on the http://funkybee.narod.ru/graphs.htm page.
nauty and Traces
nauty and Traces are programs for computing automorphism groups of graphs and digraphs. They can also produce a canonical label. They are written in a portable subset of C, and run on a considerable number of different systems.
The AutGroupGraph command in the GRAPE package of GAP.
bliss: another symmetry and canonical labeling program.
conauto: a graph isomorphism package.
As for heuristics: I've been fantasizing about a modified Ullmann's algorithm, where you don't only use breadth-first search but mix it with depth-first search, in the sense that first you use breadth-first search intensively, then you set a limit for breadth analysis and go deeper after checking a few neighbours, and you lower the breadth at each step by some amount. This is practically how I find my way on a map: first locate myself with breadth-first search, then search the route with depth-first search, roughly, and this is the best evolution my brain has ever invented. :) In the long term, some intelligence may be added to increase the breadth-first neighbour count at critical vertices, for example where there are a large number of neighbouring vertices with the same edge count. Like checking your actual route sometimes with the car (without a GPS).
I've found out that the algorithm belongs to the category of k-dimensional Weisfeiler-Lehman algorithms, and it fails on regular graphs. For more, see here:
http://dabacon.org/pontiff/?p=4148
Original post follows:
I've worked on the problem to find isomorphic graphs in a database of graphs (containing chemical compositions).
In brief, the algorithm creates a hash of a graph using the power iteration method. There might be false-positive hash collisions, but the probability of that is exceedingly small (I didn't have any such collisions with tens of thousands of graphs).
The way the algorithm works is this:
Do N (where N is the radius of the graph) iterations. On each iteration and for each node:
Sort the hashes (from the previous step) of the node's neighbors
Hash the concatenated sorted hashes
Replace node's hash with newly computed hash
On the first step, a node's hash is affected by the direct neighbors of it. On the second step, a node's hash is affected by the neighborhood 2-hops away from it. On the Nth step a node's hash will be affected by the neighborhood N-hops around it. So you only need to continue running the Powerhash for N = graph_radius steps. In the end, the graph center node's hash will have been affected by the whole graph.
To produce the final hash, sort the final step's node hashes and concatenate them together. After that, you can compare the final hashes to find if two graphs are isomorphic. If you have labels, then add them (on the first step) in the internal hashes that you calculate for each node.
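Here is a compact Go sketch of the steps just described, illustrative only: it uses SHA-256 as a stand-in hash (the actual implementation, linked below, uses its own power-iteration hashing in Python), and the uniform seed stands in for node labels.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// graphHash implements the iteration described above: on each round, every
// node's hash becomes the hash of its neighbors' sorted hashes from the
// previous round; finally, the sorted node hashes are concatenated and
// hashed once more. Labels, if any, would seed the initial hashes.
func graphHash(adj [][]int, rounds int) [32]byte {
	h := make([][32]byte, len(adj))
	for v := range h {
		h[v] = sha256.Sum256([]byte("node")) // identical seed; use labels here if available
	}
	for r := 0; r < rounds; r++ {
		next := make([][32]byte, len(adj))
		for v, ns := range adj {
			parts := make([]string, len(ns))
			for i, w := range ns {
				parts[i] = string(h[w][:])
			}
			sort.Strings(parts) // sorting makes the result neighbor-order-independent
			var buf []byte
			for _, p := range parts {
				buf = append(buf, p...)
			}
			next[v] = sha256.Sum256(buf)
		}
		h = next
	}
	// Final hash: sort all node hashes and hash the concatenation.
	all := make([]string, len(h))
	for v := range h {
		all[v] = string(h[v][:])
	}
	sort.Strings(all)
	var buf []byte
	for _, s := range all {
		buf = append(buf, s...)
	}
	return sha256.Sum256(buf)
}

func main() {
	// Two renderings of the same triangle hash identically.
	g := [][]int{{1, 2}, {0, 2}, {0, 1}}
	h := [][]int{{2, 1}, {2, 0}, {1, 0}}
	fmt.Printf("%x\n%x\n", graphHash(g, 3), graphHash(h, 3))
}
```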
There is more background here:
https://plus.google.com/114866592715069940152/posts/fmBFhjhQcZF
You can find the source code of it here:
https://github.com/madgik/madis/blob/master/src/functions/aggregate/graph.py
