I'm looking for a solution to the following problem, related to shortest path.
Given a directed Graph G = (V,E), source s, targets t1, t2, ..., tk and the costs for traveling edge {i,j} is cij. Now I want to know the shortest paths from s to t1, ..., tk. BUT, if a vertext vi (not source or targets) is used then we have an additional cost of C. Note that is two paths use the same vertext vi, the costs C is only paid once.
If you are looking for the shortest path, and each path is penaltised if using c then:
Create a modified weightning function:
w'(u,v) = w(u,v) + C if v == c
w'(u,v) = w(u,v) otherwise
It is easy to see that when running dijkstra's algorithm or Bellman Ford, with w' any path that uses c is penaltized by exactly C, since if c appears in the path - it appears exactly once, so C is added to the total weight [note that c cannot appear more then once in a shortest path], and of course there is no penalty if c is not used in this path.
EDIT: I am not sure I understood correctly, if what #SaeedAmiri is mentioning is correct, and if you want to give the penalty only once [and minimize the total sum of paths to t1,...,tk] Then you should use a different solution - with a similar idea:
create a weightning function w' such that:
w'(u,v) = w(u,v) + C + epsilon if v == c
w'(u,v) = w(u,v) otherwise
Note that it is important epsilon is a small size that can be achieved only on w', and is the smallest possible size.
Run dijkstra or BF on the graph with w, let's denote the weights as
W1
Run dijkstra or BF in the graph with w' let's denote the weights as W2
If W1[ti] == W2[ti] for each ti ∈ { t1, ..., tk } - then you don't need c in the shortest paths, and the total result is SUM(W1[ti])
Otherwise - the result is min { SUM(W1[ti]) + C , SUM(W2[ti])`
The idea behind step 4 is you got two possibilities:
You can get to all of t1, ... , tk without using c, and it will be cheaper then using a path with it, so you return the sum of W2.
Or, if ignoring c - will only be more expansive - thus you use it freely [and return the sum of W1], and add the penalty only once.
The standard O(n^3) dynamic programming solution...
http://en.wikipedia.org/wiki/Dijkstras_algorithm
...still works, with only a minor adjustment:
Just calculate the adjacency matrix of direct costs and then iterate through considering shortcuts, but when calculating the cost of a shortcut add on vi.
You haven't told us exactly what the problem is yet, but there's an unambiguous fragment that admits an objective-preserving reduction from set cover.
For an arbitrary instance of set cover, set up a graph with a source, vertices for each set, and vertices for each element. The terminals t1, ..., tk are the element-vertices. Each set-vertex has edges of weight zero to the source and to the vertices corresponding to its elements. On this graph, we have to buy a set cover in order to reach the terminals from the source, and every set cover suffices.
Unless there's something you can tell us about your instances, the approximation ratios for polynomial-time algorithms won't be any better than Theta(log n), so I suggest integer programming.
Related
Can somebody tell me why Dijkstra's algorithm for single source shortest path assumes that the edges must be non-negative.
I am talking about only edges not the negative weight cycles.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) - the algorithm found the shortest path to it, and will never have to develop this node again - it assumes the path developed to this path is the shortest.
But with negative weights - it might not be true. For example:
A
/ \
/ \
/ \
5 2
/ \
B--(-10)-->C
V={A,B,C} ; E = {(A,C,2), (A,B,5), (B,C,-10)}
Dijkstra from A will first develop C, and will later fail to find A->B->C
EDIT a bit deeper explanation:
Note that this is important, because in each relaxation step, the algorithm assumes the "cost" to the "closed" nodes is indeed minimal, and thus the node that will next be selected is also minimal.
The idea of it is: If we have a vertex in open such that its cost is minimal - by adding any positive number to any vertex - the minimality will never change.
Without the constraint on positive numbers - the above assumption is not true.
Since we do "know" each vertex which was "closed" is minimal - we can safely do the relaxation step - without "looking back". If we do need to "look back" - Bellman-Ford offers a recursive-like (DP) solution of doing so.
Consider the graph shown below with the source as Vertex A. First try running Dijkstra’s algorithm yourself on it.
When I refer to Dijkstra’s algorithm in my explanation I will be talking about the Dijkstra's Algorithm as implemented below,
So starting out the values (the distance from the source to the vertex) initially assigned to each vertex are,
We first extract the vertex in Q = [A,B,C] which has smallest value, i.e. A, after which Q = [B, C]. Note A has a directed edge to B and C, also both of them are in Q, therefore we update both of those values,
Now we extract C as (2<5), now Q = [B]. Note that C is connected to nothing, so line16 loop doesn't run.
Finally we extract B, after which . Note B has a directed edge to C but C isn't present in Q therefore we again don't enter the for loop in line16,
So we end up with the distances as
Note how this is wrong as the shortest distance from A to C is 5 + -10 = -5, when you go .
So for this graph Dijkstra's Algorithm wrongly computes the distance from A to C.
This happens because Dijkstra's Algorithm does not try to find a shorter path to vertices which are already extracted from Q.
What the line16 loop is doing is taking the vertex u and saying "hey looks like we can go to v from source via u, is that (alt or alternative) distance any better than the current dist[v] we got? If so lets update dist[v]"
Note that in line16 they check all neighbors v (i.e. a directed edge exists from u to v), of u which are still in Q. In line14 they remove visited notes from Q. So if x is a visited neighbour of u, the path is not even considered as a possible shorter way from source to v.
In our example above, C was a visited neighbour of B, thus the path was not considered, leaving the current shortest path unchanged.
This is actually useful if the edge weights are all positive numbers, because then we wouldn't waste our time considering paths that can't be shorter.
So I say that when running this algorithm if x is extracted from Q before y, then its not possible to find a path - which is shorter. Let me explain this with an example,
As y has just been extracted and x had been extracted before itself, then dist[y] > dist[x] because otherwise y would have been extracted before x. (line 13 min distance first)
And as we already assumed that the edge weights are positive, i.e. length(x,y)>0. So the alternative distance (alt) via y is always sure to be greater, i.e. dist[y] + length(x,y)> dist[x]. So the value of dist[x] would not have been updated even if y was considered as a path to x, thus we conclude that it makes sense to only consider neighbors of y which are still in Q (note comment in line16)
But this thing hinges on our assumption of positive edge length, if length(u,v)<0 then depending on how negative that edge is we might replace the dist[x] after the comparison in line18.
So any dist[x] calculation we make will be incorrect if x is removed before all vertices v - such that x is a neighbour of v with negative edge connecting them - is removed.
Because each of those v vertices is the second last vertex on a potential "better" path from source to x, which is discarded by Dijkstra’s algorithm.
So in the example I gave above, the mistake was because C was removed before B was removed. While that C was a neighbour of B with a negative edge!
Just to clarify, B and C are A's neighbours. B has a single neighbour C and C has no neighbours. length(a,b) is the edge length between the vertices a and b.
Dijkstra's algorithm assumes paths can only become 'heavier', so that if you have a path from A to B with a weight of 3, and a path from A to C with a weight of 3, there's no way you can add an edge and get from A to B through C with a weight of less than 3.
This assumption makes the algorithm faster than algorithms that have to take negative weights into account.
Correctness of Dijkstra's algorithm:
We have 2 sets of vertices at any step of the algorithm. Set A consists of the vertices to which we have computed the shortest paths. Set B consists of the remaining vertices.
Inductive Hypothesis: At each step we will assume that all previous iterations are correct.
Inductive Step: When we add a vertex V to the set A and set the distance to be dist[V], we must prove that this distance is optimal. If this is not optimal then there must be some other path to the vertex V that is of shorter length.
Suppose this some other path goes through some vertex X.
Now, since dist[V] <= dist[X] , therefore any other path to V will be atleast dist[V] length, unless the graph has negative edge lengths.
Thus for dijkstra's algorithm to work, the edge weights must be non negative.
Dijkstra's Algorithm assumes that all edges are positive weighted and this assumption helps the algorithm run faster ( O(E*log(V) ) than others which take into account the possibility of negative edges (e.g bellman ford's algorithm with complexity of O(V^3)).
This algorithm wont give the correct result in the following case (with a -ve edge) where A is the source vertex:
Here, the shortest distance to vertex D from source A should have been 6. But according to Dijkstra's method the shortest distance will be 7 which is incorrect.
Also, Dijkstra's Algorithm may sometimes give correct solution even if there are negative edges. Following is an example of such a case:
However, It will never detect a negative cycle and always produce a result which will always be incorrect if a negative weight cycle is reachable from the source, as in such a case there exists no shortest path in the graph from the source vertex.
Try Dijkstra's algorithm on the following graph, assuming A is the source node and D is the destination, to see what is happening:
Note that you have to follow strictly the algorithm definition and you should not follow your intuition (which tells you the upper path is shorter).
The main insight here is that the algorithm only looks at all directly connected edges and it takes the smallest of these edge. The algorithm does not look ahead. You can modify this behavior , but then it is not the Dijkstra algorithm anymore.
You can use dijkstra's algorithm with negative edges not including negative cycle, but you must allow a vertex can be visited multiple times and that version will lose it's fast time complexity.
In that case practically I've seen it's better to use SPFA algorithm which have normal queue and can handle negative edges.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) -it assumes that any node originating from it will lead to greater distance so, the algorithm found the shortest path to it, and will never have to develop this node again, but this doesn't hold true in case of negative weights.
The other answers so far demonstrate pretty well why Dijkstra's algorithm cannot handle negative weights on paths.
But the question itself is maybe based on a wrong understanding of the weight of paths. If negative weights on paths would be allowed in pathfinding algorithms in general, then you would get permanent loops that would not stop.
Consider this:
A <- 5 -> B <- (-1) -> C <- 5 -> D
What is the optimal path between A and D?
Any pathfinding algorithm would have to continuously loop between B and C because doing so would reduce the weight of the total path. So allowing negative weights for a connection would render any pathfindig algorithm moot, maybe except if you limit each connection to be used only once.
So, to explain this in more detail, consider the following paths and weights:
Path | Total weight
ABCD | 9
ABCBCD | 7
ABCBCBCD | 5
ABCBCBCBCD | 3
ABCBCBCBCBCD | 1
ABCBCBCBCBCBCD | -1
...
So, what's the perfect path? Any time the algorithm adds a BC step, it reduces the total weight by 2.
So the optimal path is A (BC) D with the BC part being looped forever.
Since Dijkstra's goal is to find the optimal path (not just any path), it, by definition, cannot work with negative weights, since it cannot find the optimal path.
Dijkstra will actually not loop, since it keeps a list of nodes that it has visited. But it will not find a perfect path, but instead just any path.
Adding few points to the explanation, on top of the previous answers, for the following simple example,
Dijktra's algorithm being greedy, it first finds the minimum distance vertex C from the source vertex A greedily and assigns the distance d[C] (from vertex A) to the weight of the edge AC.
The underlying assumption is that since C was picked first, there is no other vertex V in the graph s.t. w(AV) < w(AC), otherwise V would have been picked instead of C, by the algorithm.
Since by above logic, w(AC) <= w(AV), for all vertex V different from the vertices A and C. Now, clearly any other path P that starts from A and ends in C, going through V , i.e., the path P = A -> V -> ... -> C, will be longer in length (>= 2) and total cost of the path P will be sum of the edges on it, i.e., cost(P) >= w(AV) >= w(AC), assuming all edges on P have non-negative weights, so that
C can be safely removed from the queue Q, since d[C] can never get smaller / relaxed further under this assumption.
Obviously, the above assumption does not hold when some.edge on P is negative, in a which case d[C] may decrease further, but the algorithm can't take care of this scenario, since by that time it has removed C from the queue Q.
In Unweighted graph
Dijkstra can even work without set or priority queue, even if you just use STACK the algorithm will work but with Stack its time of execution will increase
Dijkstra don't repeat a node once its processed becoz it always tooks the minimum route , which means if you come to that node via any other path it will certainly have greater distance
For ex -
(0)
/
6 5
/
(2) (1)
\ /
4 7
\ /
(9)
here once you get to node 1 via 0 (as its minimum out of 5 and 6)so now there is no way you can get a minimum value for reaching 1
because all other path will add value to 5 and not decrease it
more over with Negative weights it will fall into infinite loop
In Unweighted graph
Dijkstra Algo will fall into loop if it has negative weight
In Directed graph
Dijkstra Algo will give RIGHT ANSWER except in case of Negative Cycle
Who says Dijkstra never visit a node more than once are 500% wrong
also who says Dijkstra can't work with negative weight are wrong
The below picture is the representation from the question which was asked to me during samsung interview. I had to write the program to find the minimum distance between I and M. There was an additional constraint that we can change one of the edges. For example, The edge FM can be moved to join edge L and M and the edge value will still be 4.
If you notice, the distance between I and M via I-> E -> F -> G -> M is 20. However, if we change one of the edges such that L to M edge value is 4 now. We have to move edge FM to join L and M now. By this method, the distance between I and M is 20.
An arbitrary edge u, v can be changed to u, t or t,v. It can not be changed to x,y. So one of the vertices in the edge has to be same.
Please find the picture below to illustrate the scenario -
So my problem is that I had to write the program for this. To find the minimum distance between two vertices, I thought of using Djikstra's algorithm. However , I as not sure how to take care of the additional constraint where I had the option of changing one of the vertices. If I could get some help to solve this, I would really appreciate it.
If we move an edge (A, B), the new end should be either the start S or the target T vertex (otherwise, the answer is not optimal).
Let's assume that we move the edge (A, B) and the new end is T (the case when it's S is handled similarly). We need to know the shortest path from S to A that doesn't use this edge (once we know it, we can update the answer with the S->A->T path).
Let's compute the shortest path from S to all other vertices using Dijkstra's algorithm.
Let's fix a vertex A and compute the two minimums of dist[B] + weight(A, B) for all B adjacent to A. Let's iterate over all edges adjacent to A. Let the current edge be (A, B). If dist[B] + weight(A, B) is equal to the first minimum, let d be the second minimum. Otherwise, let d be the first minimum. We need to update the answer with d + weight(A, B) (it means that (A, B) becomes (A, T) now).
This solution is linear in the size of the graph (not counting the Dijkstra's algorithm run time).
To avoid code duplication, we can handle the case when the edge is redirected to S by swapping S and T and running the same algorithm (the final answer is the minimum of the results of these two runs).
In the graph you've shown, the shortest path I see is I -> E -> F -> M with a length of 13.
Moving the edge F -> M so that it connects L -> M just makes things worse. The new shortest path is I -> E-> F -> L -> M with a length of 18.
The obvious answer is to move edge F -> M so that it connects I directly to M, giving a length of 4.
In other words, find the shortest edge that's connected to I or M and use it to connect I directly to M.
For future reference, it's highly unlikely that you'll be asked to implement Djikstra's algorithm from memory in an interview. So you need to look for something simpler.
I'm quite surprised I couldn't find anything on this anywhere, it seems to be a problem that should be quite well known:
Consider the Euclidean shortest path problem, in two dimensions. Given a set of obstacle polygons P and two points a and b, we want to find the shortest path from a to b not intersecting the (interior of) any p in P.
To solve this, one can create the visibility graph for this problem, the graph whose nodes are the vertices of the elements of P, and where two nodes are connected if the straight line between them does not intersect any element of P. The edge weight for any such edge is simply the Euclidean distance between such two points. To solve this, one can then determine the shortest path from a to b in the graph, let's say with A*.
However, this is not a good approach. Creating the visibility graph in advance requires checking if any two vertices from any two polygons are connected, a check that has higher complexity than determining the distance between two nodes. So working with a modified version of A* that "does everything what it can before checking if two nodes are actually connected" actually speeds up the problem.
Still, A* and all other shortest path problems always start with an explicitly given graph for which adjacent vertices can be traversed cheaply. So my question is, is there a good (optimal?) algorithm for finding a shortest path between two nodes a and b in an "implicit graph" that minimizes checking if two nodes are connected?
Edit:
To clarify what I mean, this is an example of what I'm looking for:
Let V be a set, a, b elements of V. Suppose w: V x V -> D is a weighing function (to some linearly ordered set D) and c: V x V -> {true, false} returns true iff two elements of V are considered to be connected. Then the following algorithm finds the shortest path from a to b in V, i.e., returns a list [x_i | i < n] such that x_0 = a, x_{n-1} = b, and c(x_i, x_{i+1}) = true for all i < n - 1.
Let (V, E) be the complete graph with vertex set V.
do
Compute shortest path from a to b in (V, E) and put it in P = [p_0, ..., p_{n-1}]
if P = empty (there is no shortest path), return NoShortestPath
Let all_good = true
for i = 0 ... n - 2 do
if c(p_i, p_{i+1}) == false, remove edge (p_i, p_{i+1}) from E, set all_good = false and exit for loop
while all_good = false
For computing the shortest paths in the loop, one could use A* if an appropriate heuristic exists. Obviously this algorithm produces a shortest path from a to b.
Also, I suppose this algorithm is somehow optimal in calling c as rarely as possible. For its found shortest path, it must have ruled out all shorter paths that the function w would have allowed for.
But surely there is a better way?
Edit 2:
So I found a solution that works relatively well for what I'm trying to do: Using A*, when relaxing a node, instead of going through the neighbors and adding them to / updating them in the priority queue, I put all vertices into the priority queue, marked as hypothetical, together with hypothetical f and g values and the hypothetical parent. Then, when picking the next element from the priority queue, I check if the node's connection to its parent is actually given. If so, the node is progressed as normal, if not, it is discarded.
This greatly reduces the number of connectivity checks and improves performance for me a lot. But I'm sure there's still a more elegant way, in particular one where the "hypothetical new path" doesn't just extend by length one (parents are always actual, not hypothetical).
A* or Dijkstra's algorithm do not need an explicit graph to work, they actually only need:
source vertex (s)
A function next:V->2^V such that next(v)={u | there is an edge from v to u }
A function isGoal:V->{0,1} such that isGoal(v) = 1 iff v is a target node.
A weight function w:E->R such that w(u,v)= cost to move from u to v
And, of course, in addition A* is going to need a heuristic function h:V->R such that h(v) is the cost approximation.
With these functions, you can generate only the portion of the graph that is needed to find shortest path, on the fly.
In fact, A* algorithm is often used on infinite graphs (or huge graphs that do not fit in any existing storage) in artificial inteliigence problems using this approach.
The idea is, you only look on edges in A* from a given node (all (u,v) in E for some given u). You don't need the entire edges set E in order to do it, you can just use your next(u) function instead.
I'm trying to figure out something like this.
In a graph, there are 2 start nodes and 2 destination nodes. Find the optimal paths from the 1st start to the 1st destination, and the other start to the other destination, such that if two entities were to travel along these paths, they would never be at the same node.
My first thought (although I really have no idea) would be to use any shortest path algorithm, let's say Dijkstra's. Run the algorithm once for the first entity, and store the node chosen for every step in an array. Run the algorithm a second time for the second entity, and if the chosen node for a step is the same as it was for the first entity at that array index, then they would collide, so choose the next best node instead. There must be a better way to do this.
Could use some suggestions. Thanks!
I think you might consider a dynamic programming approach. At iteration 0, let s_0 = {(origin_0, origin_1)}. For iteration k+1, let s_k+1 = {(x,y) | x != y, exists an (prev_x, prev_y) in s_k s.t. e(prev_x, x) in E and e(prev_y, y) in E}. This should proceed forward with |s| < V^2 for every s. So if the best case distance is d, you should be able to do this in d*V^2 time. Good luck on doing better!
Update: The above solution actually runs in d * E^2, as per the comments below. Note that it will converge within d = V^2 iterations, so the total time is (VE)^2. But more importantly, this algorithm is actually the same as just running Bellman Ford on the product graph G' = (V', E') where V' = {(x,y) | x <- V, y <- V, x != y} and two nodes u = (x,y), v = (x',y') in V' share an edge if there is an edge e(x,x') and e(y,y') in the original edge set E. But now that we've defined our algorithm as Bellman Ford on the product graph, we might as well run Dijkstra's Algorithm! Note that the order of G' (number of vertices) is V^2 - V = O(V^2), and the number of edges in G' is O(E^2). Thus, running Dijkstra's using a Fibbonacci heap will take us at most O(E^2 + V^2 log V), which for anything other than the sparsest of graphs will essentially be O(E^2)! That's a major improvement. If you actually want to run this, you can use a graph library to build the product graph, and then just call shortest paths from (x0, y0) to (xT, yT).
The memory cost of this algorithm is O(E^2) because that's what it takes to explicitly form the product graph. You really don't need to do that though - Dijkstra's only needs to keep the vertices min cost in memory, and you could generate the edges on the fly to keep that down to O(V^2) memory. The code will be a lot uglier though, as you may have to roll Dijkstra's yourself. Also, if you're operating on a really big graph that you couldn't possible precompute the product of, you might consider running iterative deepening (http://en.wikipedia.org/wiki/Iterative_deepening_depth-first_search) on the product graph. If this problem is actually important for a real world use case of yours, feel free to ask more.
Here's what should be a quicker way. As above, run the two queries in parallel, but store two costs for each arc, one for each query. Initially both costs are the same for all arcs. When a query step reaches a node, set the costs for the other query, for all incoming arcs of that node, to infinity.
Can somebody tell me why Dijkstra's algorithm for single source shortest path assumes that the edges must be non-negative.
I am talking about only edges not the negative weight cycles.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) - the algorithm found the shortest path to it, and will never have to develop this node again - it assumes the path developed to this path is the shortest.
But with negative weights - it might not be true. For example:
A
/ \
/ \
/ \
5 2
/ \
B--(-10)-->C
V={A,B,C} ; E = {(A,C,2), (A,B,5), (B,C,-10)}
Dijkstra from A will first develop C, and will later fail to find A->B->C
EDIT a bit deeper explanation:
Note that this is important, because in each relaxation step, the algorithm assumes the "cost" to the "closed" nodes is indeed minimal, and thus the node that will next be selected is also minimal.
The idea of it is: If we have a vertex in open such that its cost is minimal - by adding any positive number to any vertex - the minimality will never change.
Without the constraint on positive numbers - the above assumption is not true.
Since we do "know" each vertex which was "closed" is minimal - we can safely do the relaxation step - without "looking back". If we do need to "look back" - Bellman-Ford offers a recursive-like (DP) solution of doing so.
Consider the graph shown below with the source as Vertex A. First try running Dijkstra’s algorithm yourself on it.
When I refer to Dijkstra’s algorithm in my explanation I will be talking about the Dijkstra's Algorithm as implemented below,
So starting out the values (the distance from the source to the vertex) initially assigned to each vertex are,
We first extract the vertex in Q = [A,B,C] which has smallest value, i.e. A, after which Q = [B, C]. Note A has a directed edge to B and C, also both of them are in Q, therefore we update both of those values,
Now we extract C as (2<5), now Q = [B]. Note that C is connected to nothing, so line16 loop doesn't run.
Finally we extract B, after which . Note B has a directed edge to C but C isn't present in Q therefore we again don't enter the for loop in line16,
So we end up with the distances as
Note how this is wrong as the shortest distance from A to C is 5 + -10 = -5, when you go .
So for this graph Dijkstra's Algorithm wrongly computes the distance from A to C.
This happens because Dijkstra's Algorithm does not try to find a shorter path to vertices which are already extracted from Q.
What the line16 loop is doing is taking the vertex u and saying "hey looks like we can go to v from source via u, is that (alt or alternative) distance any better than the current dist[v] we got? If so lets update dist[v]"
Note that in line16 they check all neighbors v (i.e. a directed edge exists from u to v), of u which are still in Q. In line14 they remove visited notes from Q. So if x is a visited neighbour of u, the path is not even considered as a possible shorter way from source to v.
In our example above, C was a visited neighbour of B, thus the path was not considered, leaving the current shortest path unchanged.
This is actually useful if the edge weights are all positive numbers, because then we wouldn't waste our time considering paths that can't be shorter.
So I say that when running this algorithm if x is extracted from Q before y, then its not possible to find a path - which is shorter. Let me explain this with an example,
As y has just been extracted and x had been extracted before itself, then dist[y] > dist[x] because otherwise y would have been extracted before x. (line 13 min distance first)
And as we already assumed that the edge weights are positive, i.e. length(x,y)>0. So the alternative distance (alt) via y is always sure to be greater, i.e. dist[y] + length(x,y)> dist[x]. So the value of dist[x] would not have been updated even if y was considered as a path to x, thus we conclude that it makes sense to only consider neighbors of y which are still in Q (note comment in line16)
But this thing hinges on our assumption of positive edge length, if length(u,v)<0 then depending on how negative that edge is we might replace the dist[x] after the comparison in line18.
So any dist[x] calculation we make will be incorrect if x is removed before all vertices v - such that x is a neighbour of v with negative edge connecting them - is removed.
Because each of those v vertices is the second last vertex on a potential "better" path from source to x, which is discarded by Dijkstra’s algorithm.
So in the example I gave above, the mistake was because C was removed before B was removed. While that C was a neighbour of B with a negative edge!
Just to clarify, B and C are A's neighbours. B has a single neighbour C and C has no neighbours. length(a,b) is the edge length between the vertices a and b.
Dijkstra's algorithm assumes paths can only become 'heavier', so that if you have a path from A to B with a weight of 3, and a path from A to C with a weight of 3, there's no way you can add an edge and get from A to B through C with a weight of less than 3.
This assumption makes the algorithm faster than algorithms that have to take negative weights into account.
Correctness of Dijkstra's algorithm:
We have 2 sets of vertices at any step of the algorithm. Set A consists of the vertices to which we have computed the shortest paths. Set B consists of the remaining vertices.
Inductive Hypothesis: At each step we will assume that all previous iterations are correct.
Inductive Step: When we add a vertex V to the set A and set the distance to be dist[V], we must prove that this distance is optimal. If this is not optimal then there must be some other path to the vertex V that is of shorter length.
Suppose this some other path goes through some vertex X.
Now, since dist[V] <= dist[X] , therefore any other path to V will be atleast dist[V] length, unless the graph has negative edge lengths.
Thus for dijkstra's algorithm to work, the edge weights must be non negative.
Dijkstra's Algorithm assumes that all edges are positive weighted and this assumption helps the algorithm run faster ( O(E*log(V) ) than others which take into account the possibility of negative edges (e.g bellman ford's algorithm with complexity of O(V^3)).
This algorithm wont give the correct result in the following case (with a -ve edge) where A is the source vertex:
Here, the shortest distance to vertex D from source A should have been 6. But according to Dijkstra's method the shortest distance will be 7 which is incorrect.
Also, Dijkstra's Algorithm may sometimes give correct solution even if there are negative edges. Following is an example of such a case:
However, It will never detect a negative cycle and always produce a result which will always be incorrect if a negative weight cycle is reachable from the source, as in such a case there exists no shortest path in the graph from the source vertex.
Try Dijkstra's algorithm on the following graph, assuming A is the source node and D is the destination, to see what is happening:
Note that you have to follow strictly the algorithm definition and you should not follow your intuition (which tells you the upper path is shorter).
The main insight here is that the algorithm only looks at all directly connected edges and it takes the smallest of these edge. The algorithm does not look ahead. You can modify this behavior , but then it is not the Dijkstra algorithm anymore.
You can use dijkstra's algorithm with negative edges not including negative cycle, but you must allow a vertex can be visited multiple times and that version will lose it's fast time complexity.
In that case practically I've seen it's better to use SPFA algorithm which have normal queue and can handle negative edges.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) -it assumes that any node originating from it will lead to greater distance so, the algorithm found the shortest path to it, and will never have to develop this node again, but this doesn't hold true in case of negative weights.
The other answers so far demonstrate pretty well why Dijkstra's algorithm cannot handle negative weights on paths.
But the question itself is maybe based on a wrong understanding of the weight of paths. If negative weights on paths would be allowed in pathfinding algorithms in general, then you would get permanent loops that would not stop.
Consider this:
A <- 5 -> B <- (-1) -> C <- 5 -> D
What is the optimal path between A and D?
Any pathfinding algorithm would have to continuously loop between B and C because doing so would reduce the weight of the total path. So allowing negative weights for a connection would render any pathfindig algorithm moot, maybe except if you limit each connection to be used only once.
So, to explain this in more detail, consider the following paths and weights:
Path | Total weight
ABCD | 9
ABCBCD | 7
ABCBCBCD | 5
ABCBCBCBCD | 3
ABCBCBCBCBCD | 1
ABCBCBCBCBCBCD | -1
...
So, what's the perfect path? Any time the algorithm adds a BC step, it reduces the total weight by 2.
So the optimal path is A (BC) D with the BC part being looped forever.
Since Dijkstra's goal is to find the optimal path (not just any path), it, by definition, cannot work with negative weights, since it cannot find the optimal path.
Dijkstra will actually not loop, since it keeps a list of nodes that it has visited. But it will not find a perfect path, but instead just any path.
Adding few points to the explanation, on top of the previous answers, for the following simple example,
Dijktra's algorithm being greedy, it first finds the minimum distance vertex C from the source vertex A greedily and assigns the distance d[C] (from vertex A) to the weight of the edge AC.
The underlying assumption is that since C was picked first, there is no other vertex V in the graph s.t. w(AV) < w(AC), otherwise V would have been picked instead of C, by the algorithm.
Since by above logic, w(AC) <= w(AV), for all vertex V different from the vertices A and C. Now, clearly any other path P that starts from A and ends in C, going through V , i.e., the path P = A -> V -> ... -> C, will be longer in length (>= 2) and total cost of the path P will be sum of the edges on it, i.e., cost(P) >= w(AV) >= w(AC), assuming all edges on P have non-negative weights, so that
C can be safely removed from the queue Q, since d[C] can never get smaller / relaxed further under this assumption.
Obviously, the above assumption does not hold when some.edge on P is negative, in a which case d[C] may decrease further, but the algorithm can't take care of this scenario, since by that time it has removed C from the queue Q.
In Unweighted graph
Dijkstra can even work without set or priority queue, even if you just use STACK the algorithm will work but with Stack its time of execution will increase
Dijkstra don't repeat a node once its processed becoz it always tooks the minimum route , which means if you come to that node via any other path it will certainly have greater distance
For ex -
(0)
/
6 5
/
(2) (1)
\ /
4 7
\ /
(9)
here once you get to node 1 via 0 (as its minimum out of 5 and 6)so now there is no way you can get a minimum value for reaching 1
because all other path will add value to 5 and not decrease it
more over with Negative weights it will fall into infinite loop
In Unweighted graph
Dijkstra Algo will fall into loop if it has negative weight
In Directed graph
Dijkstra Algo will give RIGHT ANSWER except in case of Negative Cycle
Who says Dijkstra never visit a node more than once are 500% wrong
also who says Dijkstra can't work with negative weight are wrong