Can somebody tell me why Dijkstra's algorithm for single source shortest path assumes that the edges must be non-negative.
I am talking about only edges not the negative weight cycles.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and removed from the open set), the algorithm has found the shortest path to it and will never have to develop this node again - it assumes the path developed to this node is the shortest.
But with negative weights - it might not be true. For example:
      A
     / \
    /   \
   /     \
  5       2
 /         \
B--(-10)-->C
V={A,B,C} ; E = {(A,C,2), (A,B,5), (B,C,-10)}
Dijkstra from A will first develop C (since 2 < 5), and will later fail to find the shorter path A->B->C.
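To make this concrete, here is a minimal sketch of the textbook algorithm in Python (heap-based; the dict-of-lists graph encoding is my own illustration), run on exactly this example:

```python
import heapq

def dijkstra(graph, source):
    # graph: dict mapping vertex -> list of (neighbor, weight) pairs
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    pq = [(0, source)]
    closed = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in closed:
            continue
        closed.add(u)          # u is now "closed": dist[u] is treated as final
        for v, w in graph[u]:
            if v not in closed and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

# The graph from the example: E = {(A,C,2), (A,B,5), (B,C,-10)}
graph = {"A": [("C", 2), ("B", 5)], "B": [("C", -10)], "C": []}
print(dijkstra(graph, "A"))   # {'A': 0, 'C': 2, 'B': 5} -- but C should be -5
```

C is closed with distance 2 before B is ever developed, so the relaxation B->C with weight -10 is never applied.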
EDIT a bit deeper explanation:
Note that this is important, because in each relaxation step, the algorithm assumes the "cost" to the "closed" nodes is indeed minimal, and thus the node that will next be selected is also minimal.
The idea is: if we have a vertex in the open set whose cost is minimal, then adding any positive number to any other vertex's cost will never change that minimality.
Without the constraint on positive numbers - the above assumption is not true.
Since we do "know" each vertex which was "closed" is minimal - we can safely do the relaxation step - without "looking back". If we do need to "look back" - Bellman-Ford offers a recursive-like (DP) solution of doing so.
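Bellman-Ford's "looking back" can be sketched like this (a minimal, unoptimized version; the edge-list encoding is my own illustration). On the same example graph it does recover the A -> B -> C path:

```python
def bellman_ford(vertices, edges, source):
    # edges: list of (u, v, weight) triples
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    # Relax every edge |V|-1 times; "looking back" is allowed,
    # so a distance that was settled too early can still be corrected.
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

edges = [("A", "C", 2), ("A", "B", 5), ("B", "C", -10)]
print(bellman_ford({"A", "B", "C"}, edges, "A"))  # {'A': 0, 'B': 5, 'C': -5}
```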
Consider the graph shown below with the source as Vertex A. First try running Dijkstra’s algorithm yourself on it.
When I refer to Dijkstra's algorithm in my explanation, I will be talking about the algorithm as implemented below.

So starting out, the values (the distance from the source to the vertex) initially assigned to each vertex are dist[A] = 0, dist[B] = ∞ and dist[C] = ∞.

We first extract the vertex in Q = [A, B, C] which has the smallest value, i.e. A, after which Q = [B, C]. Note that A has a directed edge to both B and C, and both of them are in Q, therefore we update both of those values: dist[B] = 5 and dist[C] = 2.

Now we extract C (as 2 < 5), so Q = [B]. Note that C is connected to nothing, so the line16 loop doesn't run.

Finally we extract B, after which Q = []. Note that B has a directed edge to C, but C isn't present in Q, therefore we again don't enter the for loop in line16.

So we end up with the distances dist[A] = 0, dist[B] = 5, dist[C] = 2.

Note how this is wrong, as the shortest distance from A to C is 5 + (-10) = -5, when you go A -> B -> C.
So for this graph Dijkstra's Algorithm wrongly computes the distance from A to C.
This happens because Dijkstra's Algorithm does not try to find a shorter path to vertices which are already extracted from Q.
What the line16 loop is doing is taking the vertex u and saying: "hey, looks like we can go to v from the source via u; is that (alt, or alternative) distance any better than the current dist[v] we've got? If so, let's update dist[v]".

Note that in line16 they check all neighbors v of u (i.e. a directed edge exists from u to v) which are still in Q. In line14 they remove visited nodes from Q. So if x is a visited neighbour of u, the path source -> u -> x is not even considered as a possible shorter way from the source to x.

In our example above, C was a visited neighbour of B, thus the path A -> B -> C was not considered, leaving the current shortest path A -> C unchanged.
This is actually useful if the edge weights are all positive numbers, because then we wouldn't waste our time considering paths that can't be shorter.
So I say that when running this algorithm, if x is extracted from Q before y, then it is not possible to find a shorter path to x via y. Let me explain this with an example:

As y has just been extracted and x had been extracted before it, dist[y] >= dist[x], because otherwise y would have been extracted before x (line 13: minimum distance first).

And as we already assumed that the edge weights are positive, i.e. length(x,y) > 0, the alternative distance (alt) via y is always sure to be greater, i.e. dist[y] + length(x,y) > dist[x]. So the value of dist[x] would not have been updated even if y were considered as a path to x; thus we conclude that it makes sense to only consider the neighbors of y which are still in Q (note the comment in line16).

But this hinges on our assumption of positive edge lengths: if length(u,v) < 0 then, depending on how negative that edge is, we might replace dist[x] after the comparison in line18.

So any dist[x] calculation we make will be incorrect if x is removed from Q before every vertex v such that x is a neighbour of v with a negative edge connecting them.
Because each of those v vertices is the second last vertex on a potential "better" path from source to x, which is discarded by Dijkstra’s algorithm.
So in the example I gave above, the mistake was that C was removed before B, even though C was a neighbour of B with a negative edge!
Just to clarify, B and C are A's neighbours. B has a single neighbour C and C has no neighbours. length(a,b) is the edge length between the vertices a and b.
Dijkstra's algorithm assumes paths can only become 'heavier', so that if you have a path from A to B with a weight of 3, and a path from A to C with a weight of 3, there's no way you can add an edge and get from A to B through C with a weight of less than 3.
This assumption makes the algorithm faster than algorithms that have to take negative weights into account.
Correctness of Dijkstra's algorithm:
We have 2 sets of vertices at any step of the algorithm. Set A consists of the vertices to which we have computed the shortest paths. Set B consists of the remaining vertices.
Inductive Hypothesis: At each step we will assume that all previous iterations are correct.
Inductive Step: When we add a vertex V to the set A and set the distance to be dist[V], we must prove that this distance is optimal. If this is not optimal then there must be some other path to the vertex V that is of shorter length.
Suppose this other path goes through some vertex X.

Now, since dist[V] <= dist[X], any other path to V will be at least dist[V] in length, unless the graph has negative edge lengths.

Thus, for Dijkstra's algorithm to work, the edge weights must be non-negative.

Dijkstra's Algorithm assumes that all edges are non-negatively weighted, and this assumption helps the algorithm run faster (O(E*log(V))) than others which take into account the possibility of negative edges (e.g. Bellman-Ford's algorithm, with complexity O(V*E)).
This algorithm won't give the correct result in the following case (with a negative edge), where A is the source vertex:
Here, the shortest distance to vertex D from source A should have been 6. But according to Dijkstra's method the shortest distance will be 7 which is incorrect.
Also, Dijkstra's Algorithm may sometimes give correct solution even if there are negative edges. Following is an example of such a case:
However, it will never detect a negative cycle, and it will always produce a result; that result will always be incorrect if a negative-weight cycle is reachable from the source, since in such a case there exists no shortest path in the graph from the source vertex.
Try Dijkstra's algorithm on the following graph, assuming A is the source node and D is the destination, to see what is happening:
Note that you have to follow the algorithm definition strictly, and you should not follow your intuition (which tells you the upper path is shorter).

The main insight here is that the algorithm only looks at all directly connected edges and takes the smallest of these edges. The algorithm does not look ahead. You can modify this behavior, but then it is not Dijkstra's algorithm anymore.

You can use Dijkstra's algorithm with negative edges (as long as there is no negative cycle), but you must allow a vertex to be visited multiple times, and that version loses its fast time complexity.

In that case I've seen that, practically, it's better to use the SPFA algorithm, which uses a normal queue and can handle negative edges.
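For reference, a minimal sketch of SPFA (essentially Bellman-Ford driven by a queue; the graph encoding is my own illustration). It assumes no negative cycle is reachable from the source:

```python
from collections import deque

def spfa(graph, source):
    # graph: dict vertex -> list of (neighbor, weight) pairs;
    # assumes no negative cycle reachable from source.
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    queue = deque([source])
    in_queue = {source}
    while queue:
        u = queue.popleft()
        in_queue.discard(u)
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                # A vertex may be re-queued many times: that is the
                # price paid for handling negative edges.
                if v not in in_queue:
                    queue.append(v)
                    in_queue.add(v)
    return dist

graph = {"A": [("C", 2), ("B", 5)], "B": [("C", -10)], "C": []}
print(spfa(graph, "A"))  # {'A': 0, 'B': 5, 'C': -5}
```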
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set), the algorithm assumes that any path extending from it can only lead to a greater distance, so it has found the shortest path to that vertex and will never have to develop this node again. This doesn't hold true in the case of negative weights.
The other answers so far demonstrate pretty well why Dijkstra's algorithm cannot handle negative weights on paths.
But the question itself is maybe based on a wrong understanding of the weight of paths. If negative edge weights were allowed in pathfinding algorithms in general, then you could get permanent loops that would never stop.
Consider this:
A <- 5 -> B <- (-1) -> C <- 5 -> D
What is the optimal path between A and D?
Any pathfinding algorithm would have to continuously loop between B and C, because doing so would reduce the weight of the total path. So allowing negative weights for a connection would render any pathfinding algorithm moot, except maybe if you limit each connection to be used only once.
So, to explain this in more detail, consider the following paths and weights:
Path | Total weight
ABCD | 9
ABCBCD | 7
ABCBCBCD | 5
ABCBCBCBCD | 3
ABCBCBCBCBCD | 1
ABCBCBCBCBCBCD | -1
...
So, what's the perfect path? Any time the algorithm adds a BC step, it reduces the total weight by 2.
So the optimal path is A (BC) D with the BC part being looped forever.
Since Dijkstra's goal is to find the optimal path (not just any path), it, by definition, cannot work with negative weights, since it cannot find the optimal path.
Dijkstra will actually not loop, since it keeps a list of nodes that it has visited. But it will not find a perfect path; instead it finds just some path.
Adding a few points to the explanation, on top of the previous answers, for the following simple example:

Dijkstra's algorithm, being greedy, first finds the minimum-distance vertex C from the source vertex A, and assigns the distance d[C] (from vertex A) to the weight of the edge AC.
The underlying assumption is that since C was picked first, there is no other vertex V in the graph s.t. w(AV) < w(AC), otherwise V would have been picked instead of C, by the algorithm.
Since, by the above logic, w(AC) <= w(AV) for every vertex V other than A and C, any other path P that starts from A and ends in C going through V, i.e., the path P = A -> V -> ... -> C, will be longer in length (>= 2 edges), and the total cost of P will be the sum of the edges on it, i.e., cost(P) >= w(AV) >= w(AC), assuming all edges on P have non-negative weights. So C can be safely removed from the queue Q, since d[C] can never be relaxed further under this assumption.

Obviously, the above assumption does not hold when some edge on P is negative, in which case d[C] may decrease further; but the algorithm can't take care of this scenario, since by then it has already removed C from the queue Q.
In an unweighted graph

Dijkstra can even work without a set or priority queue: even if you just use a stack, the algorithm will work, but with a stack its execution time will increase.

Dijkstra doesn't repeat a node once it is processed, because it always takes the minimum route, which means that if you come to that node via any other path, that path will certainly have a greater distance.
For ex -
     (0)
    /   \
   6     5
  /       \
(2)       (1)
  \       /
   4     7
    \   /
     (9)
Here, once you reach node 1 via node 0 (as 5 is the minimum out of 5 and 6), there is no way you can get a smaller value for reaching 1, because every other path will add to 5, not decrease it.

Moreover, with negative weights it will fall into an infinite loop.
In an undirected graph

Dijkstra will fall into a loop if the graph has a negative weight (a negative undirected edge is itself a negative cycle).

In a directed graph

Dijkstra (in the variant that allows revisiting nodes) will give the RIGHT ANSWER, except in the case of a negative cycle.

Whoever says Dijkstra never visits a node more than once is 500% wrong; likewise, whoever says Dijkstra can't work with negative weights is wrong.
I want to know the distance between all pairs of nodes (e.g. with Dijkstra; specifically, I'm using networkx).

Then, when an edge is added to the graph, I want to update the distances without recomputing everything from scratch.
How can this be done?
Thanks
It is possible without recomputing all shortest paths, but it is still pretty costly: O(n^2).
So lets assume you have a distance matrix M with size n*n where each entry M_{i,j} contains the distance from node i to node j. M is assumed to be precalculated by some algorithm.
Now if a new edge e_{i,j} is added to the graph between node i and node j with cost w_{i,j}, you check if w_{i,j} < M_{i,j}. If not, then nothing has to be changed. But if it holds, then the shortest paths in the graph might improve.
Then for each node pair k, l you check if the path going through the new edge is shorter than the previously calculated one. This can be done by evaluating the following.
M_{k,l} > min (M_{k,i} + w_{i,j} + M_{j,l} , M_{k,j} + w_{j,i} + M_{i,l})
If this holds, then you can replace M_{k,l} by min(M_{k,i} + w_{i,j} + M_{j,l}, M_{k,j} + w_{j,i} + M_{i,l}).
The above works for directed graphs but can be adapted to undirected graphs as well.
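The update above can be sketched roughly as follows for a directed graph, with M as a plain list-of-lists distance matrix (the function name and encoding are my own illustration; for an undirected edge you would also try the j -> i orientation, as in the min(...) expression above):

```python
def update_all_pairs(M, i, j, w):
    # M[k][l] holds the current shortest distance from node k to node l.
    # A new directed edge i -> j with weight w has just been added.
    n = len(M)
    if w >= M[i][j]:
        return M  # the new edge cannot improve any shortest path
    for k in range(n):
        for l in range(n):
            # Best path from k to l that routes through the new edge.
            via = M[k][i] + w + M[j][l]
            if via < M[k][l]:
                M[k][l] = via
    return M

INF = float("inf")
M = [[0, 4, INF],
     [INF, 0, INF],
     [INF, INF, 0]]
update_all_pairs(M, 1, 2, 1)   # add edge 1 -> 2 with weight 1
print(M[0][2], M[1][2])        # 5 1  (path 0 -> 1 -> 2 costs 4 + 1)
```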
Edit 1
I firmly believe that Omega(n^2) is also a lower bound for this problem. Assume you have two disconnected areas of the graph, both containing n/2 vertices. Then, if you add a new edge connecting these areas, you will have to update n/2 * n/2 shortest paths, resulting in at least Omega(n^2) runtime.
Edit 2
A second idea would try to exploit the above equation but run through the graph first in order to find all pairs of vertices that have to be updated first. The sketch of the idea is as follows:
Start a Dijkstra from node i. Whenever you reach a vertex k, check whether M_{k,i} + w_{i,j} < M_{k,j}; if yes, then add k to the set U of vertices that have to be updated. If not, then you can stop exploring further paths through k, since no vertex "beyond" k will use e_{i,j} for its shortest path.
Then do the same for node j. And then perform the update of M for all pairs of vertices in U according to the ideas above.
Let us say that we have an undirected graph, where each edge has a real number value. Let us define the "sum" of a cycle as the sum of the values of each edge in that cycle.
Is there a reasonably fast way to check if there exists a cycle within the graph containing a certain edge E where the sum is greater than / smaller than 0? Right now my (extremely crude and horribly inefficient) solution is to check every cycle the edge is in.
The algorithm does not need to find an exact cycle, it only needs to check for existence of such a cycle.
Assuming you allow only simple cycles: no, there is no efficient algorithm to do so, as it would let us solve the Hamiltonian Path problem efficiently. (In other words, this problem is NP-hard.)
Reduction:
(We will use a variant of your problem where we find if there is such simple cycle with weight greater/equals zero).
Given a graph G=(V,E), build a new graph:
G' = (V',E')
V' = V U {s,t}
E' = E U { (s,v) : v in V } U { (v,t) : v in V } U { (s,t) }
And add weights to the graph:
w(s,t) = -|V'| + 1
w(u,v) = 1 for every other edge, i.e. every edge except (s,t)
Intuitively, we add a "source" and "target" nodes, connect them to all other nodes, and make the two nodes connected with negative weight of all paths.
The reduction is sending (G', (s,t)) to the new algorithm.
Now, if the original graph has hamiltonian path v1->v2->...->vn, then the new graph has a cycle s->v1->v2->...->vn->t->s, which sums to 0, and is a simple cycle.
If there is a simple cycle in G' that uses (s,t) and sums to a number greater than or equal to 0, then it means the total weight of all the other edges used (everything except (s,t)) is at least |V'|-1.
From the construction, that means there are exactly |V'| nodes in this cycle, which is the entire graph, so we know the cycle is: s->t->v1->v2->...->vn->s, and since this is simple, v1,v2,...,vn are all the nodes in the original V, which means there is a Hamiltonian Path v1->v2->...->vn.
Conclusion: We have shown a polynomial-time reduction from Hamiltonian Path to your problem, and since HP is NP-Hard, this problem is too.
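A small sketch of the construction in Python (the labels "s" and "t" for the two added nodes are my own, and the input vertex names are assumed not to clash with them):

```python
def build_reduction(vertices, edges):
    # G' adds a source s and target t, connects them to every original
    # vertex with weight-1 edges, gives every original edge weight 1,
    # and adds the single negative edge (s,t) of weight -|V'| + 1.
    V2 = set(vertices) | {"s", "t"}
    w = {(u, v): 1 for u, v in edges}      # original edges: weight 1
    for v in vertices:
        w[("s", v)] = 1                    # s -- v
        w[(v, "t")] = 1                    # v -- t
    w[("s", "t")] = -len(V2) + 1           # the one negative edge
    return V2, w

# A Hamiltonian path a -> b in the original graph yields the simple
# cycle s -> a -> b -> t -> s in G', which sums to exactly 0:
V2, w = build_reduction({"a", "b"}, [("a", "b")])
print(w[("s", "a")] + w[("a", "b")] + w[("b", "t")] + w[("s", "t")])  # 0
```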
We have to find a route from a source to a sink in a graph in which the difference between the maximum-cost edge and the minimum-cost edge is minimized.

I tried using a recursive solution, but it failed in the presence of cycles, and a modified Dijkstra also failed.

Is there an algorithm where I will not have to find all routes and then take the minimum?

Sort the edges by weight (in non-decreasing order), then for each edge do the following: add the subsequent edges (the ones with greater or equal weight, in non-decreasing order) until the source and sink are connected, then update your answer with the difference between the last edge added and the starting edge, like this:
ans = INFINITE
for each Edge e1 in E (sorted by weight in non-decreasing order)
    clear the temporal graph
    for each Edge e2 in E
        if e2.weight >= e1.weight
            add e2 to the temporal graph
        if sink and source are connected in the temporal graph
            ans = min(ans, e2.weight - e1.weight)
            break
print ans
If you use the UNION-FIND structure to add edges and check connectivity between source and sink, you should get an overall running time of O(edges^2).
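A sketch of this approach in Python, with a minimal union-find using path halving (the function and variable names are my own illustration; each outer iteration rebuilds the structure, giving roughly O(E^2) overall, ignoring the near-constant union-find factor):

```python
def min_weight_spread(n, edge_list, source, sink):
    # edge_list: (u, v, weight) triples on vertices 0..n-1, undirected.
    # Returns the minimum (max edge - min edge) over source-sink paths.
    edge_list = sorted(edge_list, key=lambda e: e[2])
    best = float("inf")
    for i in range(len(edge_list)):
        parent = list(range(n))            # fresh union-find per start edge

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        for u, v, w in edge_list[i:]:      # add edges of weight >= edge_list[i]
            parent[find(u)] = find(v)
            if find(source) == find(sink):
                best = min(best, w - edge_list[i][2])
                break                      # adding heavier edges only worsens it
    return best

edges = [(0, 1, 1), (1, 3, 10), (0, 2, 4), (2, 3, 6)]
print(min_weight_spread(4, edges, 0, 3))   # 2: path 0 -> 2 -> 3 uses weights 4 and 6
```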
OK, so you have a graph, and you need to find a path from one node (source) to another (sink) such that max edge weight on the path minus min edge weight on the path is minimized. You don't say whether your graph is directed, can have negative edge weights, or has cycles, so let's assume a "yes" answer to all these questions.
When computing your path "score" (maximum difference between edge weights), we observe these are similar to path distances: you could have a path from A to B that scores higher (undesirable) or lower (desirable). If we treat path scores like path weights we observe that as we build a path by adding new edges, the path score (=weight) can only increase: given a path A->B->C where weight(A->B)=1 and weight(B->C)=5, yielding a path score of 4, if I add edge C->D to the path A->B->C, the path score can only increase or stay the same: the difference between minimum edge weight and maximum edge weight will not be lower than 4.
The conclusion from all of this is that we can explore the graph looking for best paths as if we are looking for optimal path in a graph with no negative edges. However, there could be (and likely to be, given the connectivity described) cycles. This means that Dijkstra's algorithm, properly implemented, will have optimal performance relative to this graph's topology given what we know today.
No solution without full graph exploration
One may be misled into thinking that we can make locally good decisions about which edge should belong to the optimal path without exploring the whole graph. The following subgraph illustrates the problem with this:
Say you need a path from A to F, and you are at node B. Given everything you know, you'd choose a path to C, as it minimizes the path score. What you don't know (yet) is that the next edge on that path will cause the score of this path to increase substantially. If you knew that, you would have chosen the edge B->D as the next element of an optimal path.
I was looking through "Fundamentals of Computer Algorithms" book for multi stage graph problem.
It says:
Algorithm Graph(G, k, n, p)
{
    cost[n] = 0;
    for j = n-1 to 1 step -1 do
    {
        Let r be a vertex such that <j,r> is an edge of G
            and c[j,r] + cost[r] is minimum
        cost[j] = c[j,r] + cost[r];
    }
}
The author says that the complexity is O(|V| + |E|). Where the |V| is the number of vertices and |E| is the number of edges.
I know the for loop runs over all the vertices, and the line inside it has to select the nearest edge. I couldn't understand the logic behind this complexity.
To further your understanding, look at Dijkstra's algorithm on an arbitrary digraph: each edge is only considered once there as well. The running time is O(|E| + |V| lg |V|).

Because a multistage graph is partitioned into sets, you can find the shortest path set by set: the costs from the vertices in set X to the target node are already known when you process set X-1. You also know that vertices in the same set don't have edges between each other. In short, you know the order in which to process them, and you don't have to consider all possible vertices each time, as in Dijkstra.
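The pseudocode above can be sketched in Python like this, assuming the vertices are numbered 1..n so that every edge goes from a lower-numbered to a higher-numbered vertex (true for a multistage graph in topological order); the adjacency encoding is my own illustration:

```python
def multistage_shortest(n, adj):
    # adj[j] = list of (r, c) pairs for edges j -> r with cost c.
    INF = float("inf")
    cost = [INF] * (n + 1)
    cost[n] = 0                     # the sink costs nothing
    for j in range(n - 1, 0, -1):   # each edge is examined exactly once,
        for r, c in adj[j]:         # which gives the O(|V| + |E|) bound
            cost[j] = min(cost[j], c + cost[r])
    return cost[1]                  # shortest distance from source 1 to sink n

adj = {1: [(2, 1), (3, 2)], 2: [(4, 5)], 3: [(4, 1)], 4: []}
print(multistage_shortest(4, adj))  # 3  (path 1 -> 3 -> 4)
```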