efficient algorithm to find the minimum cost path

efficient algorithm to find the minimum cost path - algorithm

There is a given set of cities lets say.. A,B,C,D,E,F,G. The problem is to find the minimum cost path that covers the cities A,B,C,F. It is essential that the path covers the cities A,B,C,F. The path can, (but does not have to) go through any of the other given cities (D,E,G). Repeating a path is allowed. The path should start and end at A.
How should i go about tackling a problem along similar lines?

That's a variant of Travelling Salesman Problem (TSP) in disguise.
You can see that, if you mark every city as "needed to be covered" (I'll call those "interesting" henceforth). The variant of TSP, where you are allowed to visit a node more than once, is still NP-complete.
So, knowing that the complexity of every exact solution to your problem would be exponential in the number of interesting cities, you can procede as follows:
First, precompute the shortest paths between interesting cities. This can be done with Dijkstra's algorithm run from every interesting city or Floyd-Warshall algorithm. Then either try every permutation of the order of visiting interesting cities; or use some existing TSP solver or heuristic algorithm.
So the simplest implementation goes like this:
Apply Floyd-Warshall to the city graph. It's much simpler to implement than Dijkstra's. I've found a nice PDF with their comparison. It gets you the matrix with all the shortest paths for AB, AC, AF, BC, BF and CF. If you need to get the actual path as in sequence of cities, look at Path reconstruction section in Wikipedia.
Try every permutation of the order of visiting interesting cities except A (i.e., only B, C and F). If you use C++, Python or Ruby, they have permutation function in the standard library. For other languages you may want to use a third-party library or search Stack Overflow for an algorithm.
Find the permutation with the lowest total cost of a path. E.g., for permutation C-F-B, the total cost is AC+CF+FB+BA. You already have all those values from Floyd-Warshall, so you can simply sum them.
If you have V total cities and N interesting cities, the runtime of this implementation will be about O(V3 + N!·N)

Related

Is there a more efficient algorithm for determining the existence of k internally vertex-disjoint paths?

I’m trying to find an algorithm that given two non-adjacent vertices s and t, and a k > 0, it determines only the existence of k internally vertex-disjoint paths (the vertex-disjoint paths themselves do not need to meet any specific requirements, we only care about existence).
I’ve found a couple possible candidates, namely Suurballe’s algorithm and this academic paper describing another algorithm that is similar.
The only thing I’ve found that directly addresses my question is this academic paper, which proves that there exists a polynomial time algorithm for determining that there are at least k vertex-disjoint paths for all k.
I’ll be using this algorithm to determine that the the k-vertex-connectivity of a graph is at least k, as described in this paper.
For the algorithm, the graph can be treated as undirected since for every edge going from vertices s to t, there also exists an edge going from t to s.
Can you please help me? Thank you.

Is there an efficient algorithm for finding or approximating the shortest walk of a graph which must visit some subset of vertices of the graph?

The title is a mouth-full, but put simply, I have a large, undirected, incomplete graph, and I need to visit some subset of vertices in the (approximately) shortest time possible. Note that this isn't TSP, as I don't need to visit all vertices.
The naive approach would be to simply brute-force the solution by trying every possible walk which includes the required vertices, using A*, for example, to calculate the walks between required vertices. However, this is O(n!) where n is the number of required vertices. This is unfeasible for me, as n > 40 in my average case, and n ≈ 80 in my worst case.
Is there a more efficient algorithm for this, perhaps one that approximates the solution?
This question is similar to the question here, but differs in the fact that my graph is larger than the one in the linked question. There are several other similar questions, but none that I've seen exactly solve my specific problem.

If you allow visiting the same nodes several times, find the shortest path between each pair of mandatory vertices. Then you solve the TSP between the mandatory vertices, using the above shortest path costs. If you disallow multiple visits, the problem is much worse.
I am afraid you cannot escape the TSP.

dijkstra algorithm, run only one time for shortest paths of some nodes (not two, not the whole graph)

So, dijkstra algorithm is (the best one) used to search for the shortest path of a weighted(without negative) and connected graph. Dijkstra algorithm can be used to find the shortest path of two points/vertices. AND it can be used to find the shortest path of all the vertices.
questions:
is my understanding correct?
Can it also be used to find the shortest paths of some pair of vertices ? for example, the graph has A, B, C, D, E, F, G, H, I, J, K. and we are only interested in the shortest paths of A,B ; C,K. it that possible we only turn the algorithm only one time to find out two paths?

You will need to run two Dijkstras. One starting from A and one from C.
What you could do is to run it from {A, C} (a set Dijkstra) until you have found paths to B and K. But that is no guarantee that the resulting paths are actually from A to B and C to K, it could as well be C, B and C, K. Actually, all combinations of {A, C}, B and {A, C}, K are then possible.
the best one
Not at all. It is a good concept and heavily used for many other similar algorithms. There are many variants, like A*, Arc-Flags, and others. But raw Dijkstra is super slow since it equally searches in all directions.
Imagine a query where you have modeled the whole world. Your destination is 1 hour away. Then Dijkstra will find shortest paths to all nodes that can be reached in 1 hour. So it will also consider a short flight to your neighboring country, even if it's the totally wrong direction. The algorithm A* is a simple modification of Dijkstra that tries to improve on that by introducing a heuristic function that is able to make (hopefully) good guesses about shortest path distances. By that your Dijkstra gets a sense of direction and tries to first prioritize a search into the direction of the destination.
A simple heuristic is as-the-crows-fly. Note that this heuristic does not perform well on road networks and especially bad on transit networks (you often need to drive 10 mins into the wrong direction to get on a highway that lets you arrive earlier in the end, or you need to first drive to some big city to get a good fast train). Other heuristics involve computing landmarks, they yield pretty good results but need a lot of pre-computation and space (usually not a problem).

First of all Dijkstra is not the best algorithm and many heuristics outperform it in most practical implementations.
we are only interested in the shortest paths of A,B ; C,K
Looks like you could take a look at A* algorithm which eliminates the nodes that will not lead you to your final destination with least cost. But it requires a good estimation (and hence the word heuristic) of the distances of various nodes to the destination.
Coming to your main question, the answer is both yes and no with some overheads. We all know that whenever a node gets removed from the min-heap its done for the algorithm.
So, as soon as your node gets removed from the min-heap, just terminate it, but that does not mean that it did find shortest distance only for the given pair. All the nodes that got removed from the min-heap before destination node are also the ones for which it has found the shortest path.
Also its possible that your destination node turns out to be the last node that is removed from the min-heap,which basically means that you have computed shortest paths from source to all nodes.

Is the result of Bellman-Ford "all pairs" or "from one node" shortest paths? / Is there an all-pairs Bellman-Ford version?

I am recently learning about graph algorithms and at my university we were taught, that the result of Bellman-Ford is a table of distances from all nodes to all other nodes (all-pairs shortest paths). However I did not understand how this is achieved by the algorithm and tried to understand it by watching YouTube videos and looking up definitions in Wikipedia and so forth...
Now here comes the problem:
I could not find resources that described the algorithm in a way that the result would be the all pairs shortest paths table, but only "from one node to all other nodes".
Can the Bellman-Ford algorithm be tweaked to achieve the all pairs shortest paths table or is my university-lecturer completely wrong about this? (He did explain some algorithm that delivered all pairs shortest paths and he called it Bellman-Ford, however I think this can not be Bellman Ford)
EDIT: I absolutely understand the Bellman-Ford algorithm for the Problem "shortest path from one node to all other nodes".
I also understand most of the algorithm that was taught at my university for "all pairs shortest paths".
I am just very confused since the algorithm at my university was also called "Bellman-Ford".
If you speak German: Here is a video where the university lecturer talks about his "Bellman-Ford" (which I think is not actually Bellman-Ford):
https://www.youtube.com/watch?v=3_zqU5GWo4w&t=715s

Bellman Ford is algorithm for finding shortest path from given start node to any other node in the graph.
Using Bellman Ford we can generate all pairs shortest paths if we run the bellman ford algorithm from each node and then get the shortest paths to all others, but the worse case time complexity of this algorithm will be O(V * V * E) and if we have complete graph this complexity will be O (V^4), where V is the number of vertexes (nodes) in the graph and E is the number of edges in the graph.
There is better algorithm for finding all pairs shortest paths which works in O(V^3) time complexity. That is the Floyd Warshall algorithm.
Here you can read more about it: https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm

The aim of the algorithm is to find the shortest path from the starting point to the ending point.
To do that, it finds the shortest distance from the all points to every other point and then selects the path that leads to the solution and also adds up to the shortest.
To begin with, it starts with the starting point (A). Sets every point's cost to infinity.
Now it sees all the possible directions from A. And A's initial cost is set to zero.
Imagine it needs to travel to only B. One there might be a straight path that connects B to A and its cost is say, 10.
But there is also a path via C. From A to C it takes say cost 5 and from C to B it takes cost only 2. This means that there are two possible paths from A to B. One that has cost 10 and the other 5+2 i.e. 7 . So it shall update the cost of reaching B from A to 7 not 10 and hence the path shall be selected.
You can imagine this same situation but with many more points. It shall search from starting point to reach the end point traversing all the possible paths and updating/not updating the cost as and when needed. In the end it shall look for all the paths and select the one that has the smallest cost.
Now here comes the problem: a I could not find resources that
described the algorithm in a way that the result would be the all
pairs shortest paths table, but only "from one node to all other
nodes".
To understand that, imagine we have to reach A to D.
The individual cost of moving from one point to another is listed below
A to B :15
A to C :10
B to C :3
B to D :5
C to D :15
Initially set all points to infinity except A to zero.
First,
A->B : Cost=15(Update B's cost to 15)
A->C : Cost=10(Update C's cost to 10)
B->C : Cost=18(B's cost plus B->C alone cost, so do not update as C's cost as is already smaller than this)
C->B : Cost=13(C's cost plus C->B alone cost, update B's cost to 13 as this is smaller than 15)
B->D : Cost=18(B's new cost plus B->D cost alone, update D's cost as this smaller than infinity)
C->D : Cost=25(C's cost plus C->D cost alone, do not update D)
So the path that the algorithm chooses is the one that lead to D with the cost of 18 which comes out to be the smallest cost!
B
/ | \
A | D
\ | /
C
A->C->B->D Cost:18
Now you may read this link for better understanding. And things should be pretty clear.

I asked in our university forum and got the following answer:
Bellman-Ford is originally "from one node". The invariant (idea under the hood of the algorithm) however does not change when applying the original Bellman-Ford algorithm to every node of the Graph.
Complexity of the original Bellman-Ford is O(V^3) and if started from every Node it would be O(V^4). However there is a trick that one can use because the findings during the algorithm resemble matrix multiplications of the input matrix (containing direct path lengths) with itself. Because this is a mathematical ring one can cheat and simply calculate matrix^2, matrix^4, matrix^8 and so on (This is the part I did not completely understand though) and one can achieve O(V^3 * log V).
He called this algorithm Bellman-Ford as well, because the invariant/ idea behind the algorithm is still the same.
German answer in our public university forum

Find the shortest path in a graph which visits certain nodes

I have a undirected graph with about 100 nodes and about 200 edges. One node is labelled 'start', one is 'end', and there's about a dozen labelled 'mustpass'.
I need to find the shortest path through this graph that starts at 'start', ends at 'end', and passes through all of the 'mustpass' nodes (in any order).
( http://3e.org/local/maize-graph.png / http://3e.org/local/maize-graph.dot.txt is the graph in question - it represents a corn maze in Lancaster, PA)

Everyone else comparing this to the Travelling Salesman Problem probably hasn't read your question carefully. In TSP, the objective is to find the shortest cycle that visits all the vertices (a Hamiltonian cycle) -- it corresponds to having every node labelled 'mustpass'.
In your case, given that you have only about a dozen labelled 'mustpass', and given that 12! is rather small (479001600), you can simply try all permutations of only the 'mustpass' nodes, and look at the shortest path from 'start' to 'end' that visits the 'mustpass' nodes in that order -- it will simply be the concatenation of the shortest paths between every two consecutive nodes in that list.
In other words, first find the shortest distance between each pair of vertices (you can use Dijkstra's algorithm or others, but with those small numbers (100 nodes), even the simplest-to-code Floyd-Warshall algorithm will run in time). Then, once you have this in a table, try all permutations of your 'mustpass' nodes, and the rest.
Something like this:
//Precomputation: Find all pairs shortest paths, e.g. using Floyd-Warshall
n = number of nodes
for i=1 to n: for j=1 to n: d[i][j]=INF
for k=1 to n:
for i=1 to n:
for j=1 to n:
d[i][j] = min(d[i][j], d[i][k] + d[k][j])
//That *really* gives the shortest distance between every pair of nodes! :-)
//Now try all permutations
shortest = INF
for each permutation a[1],a[2],...a[k] of the 'mustpass' nodes:
shortest = min(shortest, d['start'][a[1]]+d[a[1]][a[2]]+...+d[a[k]]['end'])
print shortest
(Of course that's not real code, and if you want the actual path you'll have to keep track of which permutation gives the shortest distance, and also what the all-pairs shortest paths are, but you get the idea.)
It will run in at most a few seconds on any reasonable language :)
[If you have n nodes and k 'mustpass' nodes, its running time is O(n3) for the Floyd-Warshall part, and O(k!n) for the all permutations part, and 100^3+(12!)(100) is practically peanuts unless you have some really restrictive constraints.]

run Djikstra's Algorithm to find the shortest paths between all of the critical nodes (start, end, and must-pass), then a depth-first traversal should tell you the shortest path through the resulting subgraph that touches all of the nodes start ... mustpasses ... end

This is two problems... Steven Lowe pointed this out, but didn't give enough respect to the second half of the problem.
You should first discover the shortest paths between all of your critical nodes (start, end, mustpass). Once these paths are discovered, you can construct a simplified graph, where each edge in the new graph is a path from one critical node to another in the original graph. There are many pathfinding algorithms that you can use to find the shortest path here.
Once you have this new graph, though, you have exactly the Traveling Salesperson problem (well, almost... No need to return to your starting point). Any of the posts concerning this, mentioned above, will apply.

Actually, the problem you posted is similar to the traveling salesman, but I think closer to a simple pathfinding problem. Rather than needing to visit each and every node, you simply need to visit a particular set of nodes in the shortest time (distance) possible.
The reason for this is that, unlike the traveling salesman problem, a corn maze will not allow you to travel directly from any one point to any other point on the map without needing to pass through other nodes to get there.
I would actually recommend A* pathfinding as a technique to consider. You set this up by deciding which nodes have access to which other nodes directly, and what the "cost" of each hop from a particular node is. In this case, it looks like each "hop" could be of equal cost, since your nodes seem relatively closely spaced. A* can use this information to find the lowest cost path between any two points. Since you need to get from point A to point B and visit about 12 inbetween, even a brute force approach using pathfinding wouldn't hurt at all.
Just an alternative to consider. It does look remarkably like the traveling salesman problem, and those are good papers to read up on, but look closer and you'll see that its only overcomplicating things. ^_^ This coming from the mind of a video game programmer who's dealt with these kinds of things before.

This is not a TSP problem and not NP-hard because the original question does not require that must-pass nodes are visited only once. This makes the answer much, much simpler to just brute-force after compiling a list of shortest paths between all must-pass nodes via Dijkstra's algorithm. There may be a better way to go but a simple one would be to simply work a binary tree backwards. Imagine a list of nodes [start,a,b,c,end]. Sum the simple distances [start->a->b->c->end] this is your new target distance to beat. Now try [start->a->c->b->end] and if that's better set that as the target (and remember that it came from that pattern of nodes). Work backwards over the permutations:
[start->a->b->c->end]
[start->a->c->b->end]
[start->b->a->c->end]
[start->b->c->a->end]
[start->c->a->b->end]
[start->c->b->a->end]
One of those will be shortest.
(where are the 'visited multiple times' nodes, if any? They're just hidden in the shortest-path initialization step. The shortest path between a and b may contain c or even the end point. You don't need to care)

Andrew Top has the right idea:
1) Djikstra's Algorithm
2) Some TSP heuristic.
I recommend the Lin-Kernighan heuristic: it's one of the best known for any NP Complete problem. The only other thing to remember is that after you expanded out the graph again after step 2, you may have loops in your expanded path, so you should go around short-circuiting those (look at the degree of vertices along your path).
I'm actually not sure how good this solution will be relative to the optimum. There are probably some pathological cases to do with short circuiting. After all, this problem looks a LOT like Steiner Tree: http://en.wikipedia.org/wiki/Steiner_tree and you definitely can't approximate Steiner Tree by just contracting your graph and running Kruskal's for example.

Considering the amount of nodes and edges is relatively finite, you can probably calculate every possible path and take the shortest one.
Generally this known as the travelling salesman problem, and has a non-deterministic polynomial runtime, no matter what the algorithm you use.
http://en.wikipedia.org/wiki/Traveling_salesman_problem

The question talks about must-pass in ANY order. I have been trying to search for a solution about the defined order of must-pass nodes. I found my answer but since no question on StackOverflow had a similar question I'm posting here to let maximum people benefit from it.
If the order or must-pass is defined then you could run dijkstra's algorithm multiple times. For instance let's assume you have to start from s pass through k1, k2 and k3 (in respective order) and stop at e. Then what you could do is run dijkstra's algorithm between each consecutive pair of nodes. The cost and path would be given by:
dijkstras(s, k1) + dijkstras(k1, k2) + dijkstras(k2, k3) + dijkstras(k3, 3)

How about using brute force on the dozen 'must visit' nodes. You can cover all the possible combinations of 12 nodes easily enough, and this leaves you with an optimal circuit you can follow to cover them.
Now your problem is simplified to one of finding optimal routes from the start node to the circuit, which you then follow around until you've covered them, and then find the route from that to the end.
Final path is composed of :
start -> path to circuit* -> circuit of must visit nodes -> path to end* -> end
You find the paths I marked with * like this
Do an A* search from the start node to every point on the circuit
for each of these do an A* search from the next and previous node on the circuit to the end (because you can follow the circuit round in either direction)
What you end up with is a lot of search paths, and you can choose the one with the lowest cost.
There's lots of room for optimization by caching the searches, but I think this will generate good solutions.
It doesn't go anywhere near looking for an optimal solution though, because that could involve leaving the must visit circuit within the search.

One thing that is not mentioned anywhere, is whether it is ok for the same vertex to be visited more than once in the path. Most of the answers here assume that it's ok to visit the same edge multiple times, but my take given the question (a path should not visit the same vertex more than once!) is that it is not ok to visit the same vertex twice.
So a brute force approach would still apply, but you'd have to remove vertices already used when you attempt to calculate each subset of the path.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio