Suppose you have a directed graph G, with V vertices and E edges. There are L special edges that contain a medal, and the objective is to collect N medals in the least time. Note that vertices and edges (except special edges) can be visited any number of times. You are also given a starting vertex, but no end vertex.
I have looked at similar problems, namely this: find shortest path in a graph that compulsorily visits certain Edges while others are not compulsory to visit. Unfortunately, L is around 600, and N is around 100. I have also considered some modified version of Dijkstra's algorithm, but that only allows vertices to be visited once. Is there some solution that can run this in a reasonable amount of time?
Tracking the references mcdowella suggested, we end up at this paper (direct link to PDF, may require a Wiley online subscription). The described problem is the Rural Postman Problem and is indeed NP-hard; the paper mentions TSP (NP-hard), the Chinese Postman Problem (polytime), and the Rural Postman Problem (NP-hard). They reduce from Hamilton Circuit to RPP, and the reduction is pretty much what I suggested in a comment: split every node into two, connect them with an edge, assign suitable weights, make those edges the ones you want to visit.
They mention that the difference between CPP (where you have to visit all edges) and RPP is similar to the difference between finding an MST, where you have to find a minimum-weight tree that spans all nodes, and a Steiner tree, where you have to find a minimum-weight tree that spans a subset of the nodes.
There is something called the Chinese Postman problem, which Wikipedia calls https://en.wikipedia.org/wiki/Route_inspection_problem. This covers the case when you want to visit all edges, but if you at the end of the Wikipedia account you will see "Minimize the "Rural Postman Problem": solve the problem with some edges not required." which gives you at least a reference and perhaps a search term.
Related
Working on an algorithm for a game I am developing with a friend and we got stuck. Currently, we have a cyclic undirected graph, and we are trying to find the quickest path from starting node S that covers every edge. We are not looking for a tour and there can be repeated edges.
Any ideas on an algorithm or approximation? I'm sure this problem is NP-hard, but I don't believe it's TSP.
Route Inspection
This is known as the route inspection problem and it does have a polynomial solution.
The basic idea (see the link for more details) is that it is easy to solve for an Eulerian path (where we visit every edge once), but an Eulerian path is only possible for certain graphs.
In particular, a graph has to be connected and have either 0 or 2 vertices of odd degree.
However, it is possible to generalise this for other graphs by adding additional edges in the cheapest way that will produce a graph that does have an Eulerian path. (Note that we have added more edges so we may travel multiple times over edges in the original graph.)
The way of choosing the best way to add additional edges is a maximal matching problem that can be solved in O(n^3).
P.S.
Concidentally I wrote a simple demo earlier today (link to game) for a planar max-cut problem. The solution to this turns out to be based on exactly the same route inspection problem :)
EDIT
I just spotted from the comments that in your particular case your graph may be a tree.
If so, then I believe the answer is much simpler as you just need to do a DFS over the tree making sure to visit the shallowest subtree first.
For example, suppose you have a tree with edges S->A and S->A->B. S has two subtrees, and you should visit A first because it is shallower.
The total edges visited will equal the number of edges visited in a full DFS, minus the depth of the last leaf visited, which is why to minimise the total edges you want to maximise the depth of the last leaf, and hence visit the shallowest subtree first.
This is somewhat like the Eulerian Path. The main distinction is that there may be dead-ends and you may be able to modify the algorithm to suit your needs. Pruning dead-ends is one option or you may be able to reduce the graph into a number of connected components.
DFS will work here. However you must have a good evaluation function to prun the branch early. Otherwise you can not solve this problem fast. You can refer to my discussion and implementation in Java here http://www.capacode.com/?p=650
Detail of my evaluation function
My first try is if the length of the current path plus the distance from U to G is not shorter than the minimum length (stored in minLength variable) we found, we will not visit U next because it can not lead a shorter path.
Actually, the above evaluation function is not efficient because it only works when we already visit most of the cities. We need to compute more precise the minimum length to reach G with all cities visited.
Assume s is the length from S to U, from U to visit G and pass all cities, the length is at least sā = s + ā minDistance(K) where K is an unvisited city and different from U; minDistance(K) is the minimum distance from K to an unvisited state. Basically, for each unvisited state, we assume that we can reach that city with the shortest edge. Note that those shortest edges may not compose a valid path. Then, we will not visit U if sā ā„ minLength.
With that evaluation function, my program can handle the problem with 20 cities within 1 second. I also add another optimization to improve the performance more. Before running the program, I use greedy algorithm to get a good value for minLength. Specifically, for each city, we will visit the nearest city next. The reason is when we have a smaller minLength, we can prun more.
Given a directed graph, I need to find the minimum set of vertices from which all other vertices can be reached.
So the result of the function should be the smallest number of vertices, from which all other vertices can be reached by following the directed edges.
The largest result possible would be if there were no edges, so all nodes would be returned.
If there are cycles in the graph, for each cycle, one node is selected. It does not matter which one, but it should be consistent if the algorithm is run again.
I am not sure that there is an existing algorithm for this? If so does it have a name? I have tried doing my research and the closest thing seems to be finding a mother vertex
If it is that algorithm, could the actual algorithm be elaborated as the answer given in that link is kind of vague.
Given I have to implement this in javascript, the preference would be a .js library or javascript example code.
From my understanding, this is just finding the strongly connected components in a graph. Kosaraju's algorithm is one of the neatest approaches to do this. It uses two depth first searches as against some later algorithms that use just one, but I like it the most for its simple concept.
Edit: Just to expand on that, the minimum set of vertices is found as was suggested in the comments to this post :
1. Find the strongly connected components of the graph - reduce each component to a single vertex.
2. The remaining graph is a DAG (or set of DAGs if there were disconnected components), the root(s) of which form the required set of vertices.
[EDIT #2: As Jason Orendorff mentions in a comment, finding the feedback vertex set is overkill and will produce a vertex set larger than necessary in general. kyun's answer is (or will be, when he/she adds in the important info in the comments) the right way to do it.]
[EDIT: I had the two steps round the wrong way... Now we should guarantee minimality.]
Call all of the vertices with in-degree zero Z. No vertex in Z can be reached by any other vertex, so it must be included in the final set.
Using a depth-first (or breadth-first) traversal, trace out all the vertices reachable from each vertex in Z and delete them -- these are the vertices already "covered" by Z.
The graph now consists purely of directed cycles. Find a feedback vertex set F which gives you a smallest-possible set of vertices whose removal would break every cycle in the graph. Unfortunately as that Wikipedia link shows, this problem is NP-hard for directed graphs.
The set of vertices you're looking for is Z+F.
I have directed graph with lot of cycles, probably strongly connected, and I need to get a minimal cycle from it. I mean I need to get cycle, which is the shortest cycle in graph, and every edge is covered at least once.
I have been searching for some algorithm or some theoretical background, but only thing I have found is Chinese postman algorithm. But this solution is not for directed graph.
Can anybody help me? Thanks
Edit>> All edges of that graph have the same cost - for instance 1
Take a look at this paper - Directed Chinese Postman Problem. That is the correct problem classification though (assuming there are no more restrictions).
If you're just reading into theory, take a good read at this page, which is from the Algorithms Design Manual.
Key quote (the second half for the directed version):
The optimal postman tour can be constructed by adding the appropriate edges to the graph G so as to make it Eulerian. Specifically, we find the shortest path between each pair of odd-degree vertices in G. Adding a path between two odd-degree vertices in G turns both of them to even-degree, thus moving us closer to an Eulerian graph. Finding the best set of shortest paths to add to G reduces to identifying a minimum-weight perfect matching in a graph on the odd-degree vertices, where the weight of edge (i,j) is the length of the shortest path from i to j. For directed graphs, this can be solved using bipartite matching, where the vertices are partitioned depending on whether they have more ingoing or outgoing edges. Once the graph is Eulerian, the actual cycle can be extracted in linear time using the procedure described above.
I doubt that it's optimal, but you could do a queue based search assuming the graph is guaranteed to have a cycle. Each queue entry would contain a list of nodes representing paths. When you take an element off the queue, add all possible next steps to the queue, ensuring you are not re-visiting nodes. If the last node is the same as the first node, you've found the minimum cycle.
what you are looking for is called "Eulerian path". You can google it to find enough info, basics are here
And about algorithm, there is an algorithm called Fleury's algorithm, google for it or take a look here
I think it might be worth while just simply writing which vertices are odd and then find which combo of them will lead to the least amount of extra time (if the weights are for times or distances) then the total length will be every edge weight plus the extra. For example, if the odd order vertices are A,B,C,D try AB&CD then AC&BD and so on. (I'm not sure if this is a specifically named method, it just worked for me).
edit: just realised this mostly only works for undirected graphs.
The special case in which the network consists entirely of directed edges can be solved in polynomial time. I think the original paper is Matching, Euler tours and the Chinese postman (1973) - a clear description of the algorithm for the directed graph problem begins on page 115 (page 28 of the pdf):
When all of the edges of a connected graph are directed and the graph
is symmetric, there is a particularly simple and attractive algorithm for
specifying an Euler tour...
The algorithm to find an Euler tour in a directed, symmetric, connected graph G is to first find a spanning arborescence of G. Then, at
any node n, except the root r of the arborescence, specify any order for
the edges directed away from n so long as the edge of the arborescence
is last in the ordering. For the root r, specify any order at all for the
edges directed away from r.
This algorithm was used by van Aardenne-Ehrenfest and de Bruin to
enumerate all Euler tours in a certain directed graph [ 1 ].
I have a digraph which is strongly connected (i.e. there is a path from i to j and j to i for each pair of nodes (i, j) in the graph G). I wish to find a strongly connected graph out of this graph such that the sum of all edges is the least.
To put it differently, I need to get rid of edges in such a way that after removing them, the graph will still be strongly connected and of least cost for the sum of edges.
I think it's an NP hard problem. I'm looking for an optimal solution, not approximation, for a small set of data like 20 nodes.
Edit
A more general description: Given a grap G(V,E) find a graph G'(V,E') such that if there exists a path from v1 to v2 in G than there also exists a path between v1 and v2 in G' and sum of each ei in E' is the least possible. so its similar to finding a minimum equivalent graph, only here we want to minimize the sum of edge weights rather than sum of edges.
Edit:
My approach so far:
I thought of solving it using TSP with multiple visits, but it is not correct. My goal here is to cover each city but using a minimum cost path. So, it's more like the cover set problem, I guess, but I'm not exactly sure. I'm required to cover each and every city using paths whose total cost is minimum, so visiting already visited paths multiple times does not add to the cost.
Your problem is known as minimum spanning strong sub(di)graph (MSSS) or, more generally, minimum cost spanning sub(di)graph and is NP-hard indeed. See also another book: page 501 and page 480.
I'd start with removing all edges that don't satisfy the triangle inequality - you can remove edge a -> c if going a -> b -> c is cheaper. This reminds me of TSP, but don't know if that leads anywhere.
My previous answer was to use the Chu-Liu/Edmonds algorithm which solves Arborescence problem; as Kazoom and ShreevatsaR pointed out, this doesn't help.
I would try this in a dynamic programming kind of way.
0- put the graph into a list
1- make a list of new subgraphs of each graph in the previous list, where you remove one different edge for each of the new subgraphs
2- remove duplicates from the new list
3- remove all graphs from the new list that are not strongly connected
4- compare the best graph from the new list with the current best, if better, set new current best
5- if the new list is empty, the current best is the solution, otherwise, recurse/loop/goto 1
In Lisp, it could perhaps look like this:
(defun best-subgraph (digraphs &optional (current-best (best digraphs)))
(let* ((new-list (remove-if-not #'strongly-connected
(remove-duplicates (list-subgraphs-1 digraphs)
:test #'digraph-equal)))
(this-best (best (cons current-best new-list))))
(if (null new-list)
this-best
(best-subgraph new-list this-best))))
The definitions of strongly-connected, list-subgraphs-1, digraph-equal, best, and better are left as an exercise for the reader.
This problem is equivalent to the problem described here: http://www.facebook.com/careers/puzzles.php?puzzle_id=1
Few ideas that helped me to solve the famous facebull puzzle:
Preprocessing step:
Pruning: remove all edges a-b if there are cheaper or having the same cost path, for example: a-c-b.
Graph decomposition: you can solve subproblems if the graph has articulation points
Merge vertexes into one virtual vertex if there are only one outgoing edge.
Calculation step:
Get approximate solution using the directed TSP with repeated visits. Use Floyd Warshall and then solve Assignment problem O(N^3) using hungarian method. If we got once cycle - it's directed TSP solution, if not - use branch and bound TSP. After that we have upper bound value - the cycle of the minimum cost.
Exact solution - branch and bound approach. We remove the vertexes from the shortest cycle and try build strongly connected graph with less cost, than upper bound.
That's all folks. If you want to test your solutions - try it here: http://codercharts.com/puzzle/caribbean-salesman
Sounds like you want to use the Dijkstra algorithm
Seems like what you basically want is an optimal solution for traveling-salesman where it is permitted for nodes to be visited more than once.
Edit:
Hmm. Could you solve this by essentially iterating over each node i and then doing a minimum spanning tree of all the edges pointing to that node i, unioned with another minimum spanning tree of all the edges pointing away from that node?
A 2-approximation to the minimal strongly connected subgraph is obtained by taking a union of a minimal in-branching and minimal out-branching, both rooted at the same (but arbitrary) vertex.
An out-branching, also known as arborescence, is a directed tree rooted at a single vertex spanning all vertexes. An in-branching is the same with reverse edges. These can be found by Edmonds' algorithm in time O(VE), and there are speedups to O(E log(V)) (see the wiki page). There is even an open source implementation.
The original reference for the 2-approximation result is the paper by JaJa and Frederickson, but the paper is not freely accessible.
There is even a 3/2 approximation by Adrian Vetta (PDF), but more complicated than the above.
I have a undirected graph with about 100 nodes and about 200 edges. One node is labelled 'start', one is 'end', and there's about a dozen labelled 'mustpass'.
I need to find the shortest path through this graph that starts at 'start', ends at 'end', and passes through all of the 'mustpass' nodes (in any order).
( http://3e.org/local/maize-graph.png / http://3e.org/local/maize-graph.dot.txt is the graph in question - it represents a corn maze in Lancaster, PA)
Everyone else comparing this to the Travelling Salesman Problem probably hasn't read your question carefully. In TSP, the objective is to find the shortest cycle that visits all the vertices (a Hamiltonian cycle) -- it corresponds to having every node labelled 'mustpass'.
In your case, given that you have only about a dozen labelled 'mustpass', and given that 12! is rather small (479001600), you can simply try all permutations of only the 'mustpass' nodes, and look at the shortest path from 'start' to 'end' that visits the 'mustpass' nodes in that order -- it will simply be the concatenation of the shortest paths between every two consecutive nodes in that list.
In other words, first find the shortest distance between each pair of vertices (you can use Dijkstra's algorithm or others, but with those small numbers (100 nodes), even the simplest-to-code Floyd-Warshall algorithm will run in time). Then, once you have this in a table, try all permutations of your 'mustpass' nodes, and the rest.
Something like this:
//Precomputation: Find all pairs shortest paths, e.g. using Floyd-Warshall
n = number of nodes
for i=1 to n: for j=1 to n: d[i][j]=INF
for k=1 to n:
for i=1 to n:
for j=1 to n:
d[i][j] = min(d[i][j], d[i][k] + d[k][j])
//That *really* gives the shortest distance between every pair of nodes! :-)
//Now try all permutations
shortest = INF
for each permutation a[1],a[2],...a[k] of the 'mustpass' nodes:
shortest = min(shortest, d['start'][a[1]]+d[a[1]][a[2]]+...+d[a[k]]['end'])
print shortest
(Of course that's not real code, and if you want the actual path you'll have to keep track of which permutation gives the shortest distance, and also what the all-pairs shortest paths are, but you get the idea.)
It will run in at most a few seconds on any reasonable language :)
[If you have n nodes and k 'mustpass' nodes, its running time is O(n3) for the Floyd-Warshall part, and O(k!n) for the all permutations part, and 100^3+(12!)(100) is practically peanuts unless you have some really restrictive constraints.]
run Djikstra's Algorithm to find the shortest paths between all of the critical nodes (start, end, and must-pass), then a depth-first traversal should tell you the shortest path through the resulting subgraph that touches all of the nodes start ... mustpasses ... end
This is two problems... Steven Lowe pointed this out, but didn't give enough respect to the second half of the problem.
You should first discover the shortest paths between all of your critical nodes (start, end, mustpass). Once these paths are discovered, you can construct a simplified graph, where each edge in the new graph is a path from one critical node to another in the original graph. There are many pathfinding algorithms that you can use to find the shortest path here.
Once you have this new graph, though, you have exactly the Traveling Salesperson problem (well, almost... No need to return to your starting point). Any of the posts concerning this, mentioned above, will apply.
Actually, the problem you posted is similar to the traveling salesman, but I think closer to a simple pathfinding problem. Rather than needing to visit each and every node, you simply need to visit a particular set of nodes in the shortest time (distance) possible.
The reason for this is that, unlike the traveling salesman problem, a corn maze will not allow you to travel directly from any one point to any other point on the map without needing to pass through other nodes to get there.
I would actually recommend A* pathfinding as a technique to consider. You set this up by deciding which nodes have access to which other nodes directly, and what the "cost" of each hop from a particular node is. In this case, it looks like each "hop" could be of equal cost, since your nodes seem relatively closely spaced. A* can use this information to find the lowest cost path between any two points. Since you need to get from point A to point B and visit about 12 inbetween, even a brute force approach using pathfinding wouldn't hurt at all.
Just an alternative to consider. It does look remarkably like the traveling salesman problem, and those are good papers to read up on, but look closer and you'll see that its only overcomplicating things. ^_^ This coming from the mind of a video game programmer who's dealt with these kinds of things before.
This is not a TSP problem and not NP-hard because the original question does not require that must-pass nodes are visited only once. This makes the answer much, much simpler to just brute-force after compiling a list of shortest paths between all must-pass nodes via Dijkstra's algorithm. There may be a better way to go but a simple one would be to simply work a binary tree backwards. Imagine a list of nodes [start,a,b,c,end]. Sum the simple distances [start->a->b->c->end] this is your new target distance to beat. Now try [start->a->c->b->end] and if that's better set that as the target (and remember that it came from that pattern of nodes). Work backwards over the permutations:
[start->a->b->c->end]
[start->a->c->b->end]
[start->b->a->c->end]
[start->b->c->a->end]
[start->c->a->b->end]
[start->c->b->a->end]
One of those will be shortest.
(where are the 'visited multiple times' nodes, if any? They're just hidden in the shortest-path initialization step. The shortest path between a and b may contain c or even the end point. You don't need to care)
Andrew Top has the right idea:
1) Djikstra's Algorithm
2) Some TSP heuristic.
I recommend the Lin-Kernighan heuristic: it's one of the best known for any NP Complete problem. The only other thing to remember is that after you expanded out the graph again after step 2, you may have loops in your expanded path, so you should go around short-circuiting those (look at the degree of vertices along your path).
I'm actually not sure how good this solution will be relative to the optimum. There are probably some pathological cases to do with short circuiting. After all, this problem looks a LOT like Steiner Tree: http://en.wikipedia.org/wiki/Steiner_tree and you definitely can't approximate Steiner Tree by just contracting your graph and running Kruskal's for example.
Considering the amount of nodes and edges is relatively finite, you can probably calculate every possible path and take the shortest one.
Generally this known as the travelling salesman problem, and has a non-deterministic polynomial runtime, no matter what the algorithm you use.
http://en.wikipedia.org/wiki/Traveling_salesman_problem
The question talks about must-pass in ANY order. I have been trying to search for a solution about the defined order of must-pass nodes. I found my answer but since no question on StackOverflow had a similar question I'm posting here to let maximum people benefit from it.
If the order or must-pass is defined then you could run dijkstra's algorithm multiple times. For instance let's assume you have to start from s pass through k1, k2 and k3 (in respective order) and stop at e. Then what you could do is run dijkstra's algorithm between each consecutive pair of nodes. The cost and path would be given by:
dijkstras(s, k1) + dijkstras(k1, k2) + dijkstras(k2, k3) + dijkstras(k3, 3)
How about using brute force on the dozen 'must visit' nodes. You can cover all the possible combinations of 12 nodes easily enough, and this leaves you with an optimal circuit you can follow to cover them.
Now your problem is simplified to one of finding optimal routes from the start node to the circuit, which you then follow around until you've covered them, and then find the route from that to the end.
Final path is composed of :
start -> path to circuit* -> circuit of must visit nodes -> path to end* -> end
You find the paths I marked with * like this
Do an A* search from the start node to every point on the circuit
for each of these do an A* search from the next and previous node on the circuit to the end (because you can follow the circuit round in either direction)
What you end up with is a lot of search paths, and you can choose the one with the lowest cost.
There's lots of room for optimization by caching the searches, but I think this will generate good solutions.
It doesn't go anywhere near looking for an optimal solution though, because that could involve leaving the must visit circuit within the search.
One thing that is not mentioned anywhere, is whether it is ok for the same vertex to be visited more than once in the path. Most of the answers here assume that it's ok to visit the same edge multiple times, but my take given the question (a path should not visit the same vertex more than once!) is that it is not ok to visit the same vertex twice.
So a brute force approach would still apply, but you'd have to remove vertices already used when you attempt to calculate each subset of the path.