All paths of length k from a given vertice - algorithm

Let's say i have a directed graph G(V,E) with edge cost(real number) ∈ (0,1).For given i,I need to find all the couples of vertices (i,j) starting from i that "match".Two vertices (i,j) match if there is a directed path from i to j with length exactly k(k is a given number that is relatively small and could be considered as constant)with cost >=C(C is a given number).Cost of a path is calculated as the product of it's edges.For example if a path starting from i and ending in j of lenght 2 consists edges e1 and e2 then CostOfpath=cost(e1)*cost(e2).
This has to be done in O(E+V*k).So what i thought is modifying the DFS algorithm updating the distances from given starting vertice i until they reach the length of k.If they don't then we can't have a match.However i am having a hard time finding what exactly i can modify in the DFS.Any ideas?

When you need to consider paths with a fixed number of edges in it, dynamic programming often comes to help (while other approaches often fail).
Let's denote dp[v][j] the maximal cost of the path from vertex i (fixed) to vertex v that has exactly j edges.
For a starting values, you can set values for j==1: dp[v][1] is the cost of edge from i to v (or 0 if no such edge exists). Or if you think on it it will be obvious that you can set values for j==0, not j==1: dp[i][0] is 1, while dp[v][0] can be set to zero for v!=i.
Now, if you have values for some j, it is easy to calculate values for j+1:
dp[v][j+1] = max( dp[v'][j] * cost((v', v)) )
This is very similar to Ford-Bellman's algorithm, only that the latter does not need to track the number of edges and thus can use one-dimensional array.
This gives you the solution in O((E+V)*k). Not exactly what you have requested, but I doubt that there exists solution in O(E+V*k).
(In the solution above, I assume that the constant C is positive, and so a zero cost path is equivalent to the path being absolutely absent. If you need, you can specifically account for the C==0 case.)

Related

Why priority-queue based Dijkstra shortest-path algorithm cannot work for negative-weights graph? [duplicate]

Can somebody tell me why Dijkstra's algorithm for single source shortest path assumes that the edges must be non-negative.
I am talking about only edges not the negative weight cycles.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) - the algorithm found the shortest path to it, and will never have to develop this node again - it assumes the path developed to this path is the shortest.
But with negative weights - it might not be true. For example:
A
/ \
/ \
/ \
5 2
/ \
B--(-10)-->C
V={A,B,C} ; E = {(A,C,2), (A,B,5), (B,C,-10)}
Dijkstra from A will first develop C, and will later fail to find A->B->C
EDIT a bit deeper explanation:
Note that this is important, because in each relaxation step, the algorithm assumes the "cost" to the "closed" nodes is indeed minimal, and thus the node that will next be selected is also minimal.
The idea of it is: If we have a vertex in open such that its cost is minimal - by adding any positive number to any vertex - the minimality will never change.
Without the constraint on positive numbers - the above assumption is not true.
Since we do "know" each vertex which was "closed" is minimal - we can safely do the relaxation step - without "looking back". If we do need to "look back" - Bellman-Ford offers a recursive-like (DP) solution of doing so.
Consider the graph shown below with the source as Vertex A. First try running Dijkstra’s algorithm yourself on it.
When I refer to Dijkstra’s algorithm in my explanation I will be talking about the Dijkstra's Algorithm as implemented below,
So starting out the values (the distance from the source to the vertex) initially assigned to each vertex are,
We first extract the vertex in Q = [A,B,C] which has smallest value, i.e. A, after which Q = [B, C]. Note A has a directed edge to B and C, also both of them are in Q, therefore we update both of those values,
Now we extract C as (2<5), now Q = [B]. Note that C is connected to nothing, so line16 loop doesn't run.
Finally we extract B, after which . Note B has a directed edge to C but C isn't present in Q therefore we again don't enter the for loop in line16,
So we end up with the distances as
Note how this is wrong as the shortest distance from A to C is 5 + -10 = -5, when you go .
So for this graph Dijkstra's Algorithm wrongly computes the distance from A to C.
This happens because Dijkstra's Algorithm does not try to find a shorter path to vertices which are already extracted from Q.
What the line16 loop is doing is taking the vertex u and saying "hey looks like we can go to v from source via u, is that (alt or alternative) distance any better than the current dist[v] we got? If so lets update dist[v]"
Note that in line16 they check all neighbors v (i.e. a directed edge exists from u to v), of u which are still in Q. In line14 they remove visited notes from Q. So if x is a visited neighbour of u, the path is not even considered as a possible shorter way from source to v.
In our example above, C was a visited neighbour of B, thus the path was not considered, leaving the current shortest path unchanged.
This is actually useful if the edge weights are all positive numbers, because then we wouldn't waste our time considering paths that can't be shorter.
So I say that when running this algorithm if x is extracted from Q before y, then its not possible to find a path - which is shorter. Let me explain this with an example,
As y has just been extracted and x had been extracted before itself, then dist[y] > dist[x] because otherwise y would have been extracted before x. (line 13 min distance first)
And as we already assumed that the edge weights are positive, i.e. length(x,y)>0. So the alternative distance (alt) via y is always sure to be greater, i.e. dist[y] + length(x,y)> dist[x]. So the value of dist[x] would not have been updated even if y was considered as a path to x, thus we conclude that it makes sense to only consider neighbors of y which are still in Q (note comment in line16)
But this thing hinges on our assumption of positive edge length, if length(u,v)<0 then depending on how negative that edge is we might replace the dist[x] after the comparison in line18.
So any dist[x] calculation we make will be incorrect if x is removed before all vertices v - such that x is a neighbour of v with negative edge connecting them - is removed.
Because each of those v vertices is the second last vertex on a potential "better" path from source to x, which is discarded by Dijkstra’s algorithm.
So in the example I gave above, the mistake was because C was removed before B was removed. While that C was a neighbour of B with a negative edge!
Just to clarify, B and C are A's neighbours. B has a single neighbour C and C has no neighbours. length(a,b) is the edge length between the vertices a and b.
Dijkstra's algorithm assumes paths can only become 'heavier', so that if you have a path from A to B with a weight of 3, and a path from A to C with a weight of 3, there's no way you can add an edge and get from A to B through C with a weight of less than 3.
This assumption makes the algorithm faster than algorithms that have to take negative weights into account.
Correctness of Dijkstra's algorithm:
We have 2 sets of vertices at any step of the algorithm. Set A consists of the vertices to which we have computed the shortest paths. Set B consists of the remaining vertices.
Inductive Hypothesis: At each step we will assume that all previous iterations are correct.
Inductive Step: When we add a vertex V to the set A and set the distance to be dist[V], we must prove that this distance is optimal. If this is not optimal then there must be some other path to the vertex V that is of shorter length.
Suppose this some other path goes through some vertex X.
Now, since dist[V] <= dist[X] , therefore any other path to V will be atleast dist[V] length, unless the graph has negative edge lengths.
Thus for dijkstra's algorithm to work, the edge weights must be non negative.
Dijkstra's Algorithm assumes that all edges are positive weighted and this assumption helps the algorithm run faster ( O(E*log(V) ) than others which take into account the possibility of negative edges (e.g bellman ford's algorithm with complexity of O(V^3)).
This algorithm wont give the correct result in the following case (with a -ve edge) where A is the source vertex:
Here, the shortest distance to vertex D from source A should have been 6. But according to Dijkstra's method the shortest distance will be 7 which is incorrect.
Also, Dijkstra's Algorithm may sometimes give correct solution even if there are negative edges. Following is an example of such a case:
However, It will never detect a negative cycle and always produce a result which will always be incorrect if a negative weight cycle is reachable from the source, as in such a case there exists no shortest path in the graph from the source vertex.
Try Dijkstra's algorithm on the following graph, assuming A is the source node and D is the destination, to see what is happening:
Note that you have to follow strictly the algorithm definition and you should not follow your intuition (which tells you the upper path is shorter).
The main insight here is that the algorithm only looks at all directly connected edges and it takes the smallest of these edge. The algorithm does not look ahead. You can modify this behavior , but then it is not the Dijkstra algorithm anymore.
You can use dijkstra's algorithm with negative edges not including negative cycle, but you must allow a vertex can be visited multiple times and that version will lose it's fast time complexity.
In that case practically I've seen it's better to use SPFA algorithm which have normal queue and can handle negative edges.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) -it assumes that any node originating from it will lead to greater distance so, the algorithm found the shortest path to it, and will never have to develop this node again, but this doesn't hold true in case of negative weights.
The other answers so far demonstrate pretty well why Dijkstra's algorithm cannot handle negative weights on paths.
But the question itself is maybe based on a wrong understanding of the weight of paths. If negative weights on paths would be allowed in pathfinding algorithms in general, then you would get permanent loops that would not stop.
Consider this:
A <- 5 -> B <- (-1) -> C <- 5 -> D
What is the optimal path between A and D?
Any pathfinding algorithm would have to continuously loop between B and C because doing so would reduce the weight of the total path. So allowing negative weights for a connection would render any pathfindig algorithm moot, maybe except if you limit each connection to be used only once.
So, to explain this in more detail, consider the following paths and weights:
Path | Total weight
ABCD | 9
ABCBCD | 7
ABCBCBCD | 5
ABCBCBCBCD | 3
ABCBCBCBCBCD | 1
ABCBCBCBCBCBCD | -1
...
So, what's the perfect path? Any time the algorithm adds a BC step, it reduces the total weight by 2.
So the optimal path is A (BC) D with the BC part being looped forever.
Since Dijkstra's goal is to find the optimal path (not just any path), it, by definition, cannot work with negative weights, since it cannot find the optimal path.
Dijkstra will actually not loop, since it keeps a list of nodes that it has visited. But it will not find a perfect path, but instead just any path.
Adding few points to the explanation, on top of the previous answers, for the following simple example,
Dijktra's algorithm being greedy, it first finds the minimum distance vertex C from the source vertex A greedily and assigns the distance d[C] (from vertex A) to the weight of the edge AC.
The underlying assumption is that since C was picked first, there is no other vertex V in the graph s.t. w(AV) < w(AC), otherwise V would have been picked instead of C, by the algorithm.
Since by above logic, w(AC) <= w(AV), for all vertex V different from the vertices A and C. Now, clearly any other path P that starts from A and ends in C, going through V , i.e., the path P = A -> V -> ... -> C, will be longer in length (>= 2) and total cost of the path P will be sum of the edges on it, i.e., cost(P) >= w(AV) >= w(AC), assuming all edges on P have non-negative weights, so that
C can be safely removed from the queue Q, since d[C] can never get smaller / relaxed further under this assumption.
Obviously, the above assumption does not hold when some.edge on P is negative, in a which case d[C] may decrease further, but the algorithm can't take care of this scenario, since by that time it has removed C from the queue Q.
In Unweighted graph
Dijkstra can even work without set or priority queue, even if you just use STACK the algorithm will work but with Stack its time of execution will increase
Dijkstra don't repeat a node once its processed becoz it always tooks the minimum route , which means if you come to that node via any other path it will certainly have greater distance
For ex -
(0)
/
6 5
/
(2) (1)
\ /
4 7
\ /
(9)
here once you get to node 1 via 0 (as its minimum out of 5 and 6)so now there is no way you can get a minimum value for reaching 1
because all other path will add value to 5 and not decrease it
more over with Negative weights it will fall into infinite loop
In Unweighted graph
Dijkstra Algo will fall into loop if it has negative weight
In Directed graph
Dijkstra Algo will give RIGHT ANSWER except in case of Negative Cycle
Who says Dijkstra never visit a node more than once are 500% wrong
also who says Dijkstra can't work with negative weight are wrong

least cost path, destination unknown

Question
How would one going about finding a least cost path when the destination is unknown, but the number of edges traversed is a fixed value? Is there a specific name for this problem, or for an algorithm to solve it?
Note that maybe the term "walk" is more appropriate than "path", I'm not sure.
Explanation
Say you have a weighted graph, and you start at vertex V1. The goal is to find a path of length N (where N is the number of edges traversed, can cross the same edge multiple times, can revisit vertices) that has the smallest cost. This process would need to be repeated for all possible starting vertices.
As an additional heuristic, consider a turn-based game where there are rooms connected by corridors. Each corridor has a cost associated with it, and your final score is lowered by an amount equal to each cost 'paid'. It takes 1 turn to traverse a corridor, and the game lasts 10 turns. You can stay in a room (self-loop), but staying put has a cost associated with it too. If you know the cost of all corridors (and for staying put in each room; i.e., you know the weighted graph), what is the optimal (highest-scoring) path to take for a 10-turn (or N-turn) game? You can revisit rooms and corridors.
Possible Approach (likely to fail)
I was originally thinking of using Dijkstra's algorithm to find least cost path between all pairs of vertices, and then for each starting vertex subset the LCP's of length N. However, I realized that this might not give the LCP of length N for a given starting vertex. For example, Dijkstra's LCP between V1 and V2 might have length < N, and Dijkstra's might have excluded an unnecessary but low-cost edge, which, if included, would have made the path length equal N.
It's an interesting fact that if A is the adjacency matrix and you compute Ak using addition and min in place of the usual multiply and sum used in normal matrix multiplication, then Ak[i,j] is the length of the shortest path from node i to node j with exactly k edges. Now the trick is to use repeated squaring so that Ak needs only log k matrix multiply ops.
If you need the path in addition to the minimum length, you must track where the result of each min operation came from.
For your purposes, you want the location of the min of each row of the result matrix and corresponding path.
This is a good algorithm if the graph is dense. If it's sparse, then doing one bread-first search per node to depth k will be faster.

Find Path of a Specific Weight in a Weighted DAG

Given a DAG where are Edges have a Positive Edge Weight. Given a Value N.
Algorithm to calculate a simple (no cycles or node repetitions) Path with the Total weight N?
I am aware of the Algorithm where we have to find a Path of Given Path Length (number of Edges) but somewhat confused about for the Given Path Weight?
Can Dijkstra be modified for this case? Or anything else?
This is NP-complete, so don't expect any reasonably fast (polynomial-time) algorithm. Here's a reduction from the NP-complete Subset Sum problem, where we are given a multiset of n integers X = {x_1, x_2, ..., x_n} and a number k, and asked if there is any submultiset of the n numbers that sum to exactly k:
Create a graph G with n+1 vertices v_1, v_2, ..., v_{n+1}. For each vertex v_i, add edges to every higher-numbered vertex v_j, and give all these edges weight x_i. This graph has O(n^2) edges and can be constructed in O(n^2) time. Clearly it contains no cycles.
Suppose the answer to the Subset Sum problem is YES: That is, there exists a submultiset Y of X such that the numbers in Y total to exactly k. Actually, let Y = {y_1, y_2, ..., y_m} consist of the m <= n indices 1 <= i <= n of the selected elements of X. Then there is a corresponding path in the graph G with exactly the same weight -- namely the path that starts at v_{y_1}, takes the edge to v_{y_2} (which is of weight x_{y_1}), then takes the edge to v_{y_3}, and so on, finally arriving at v_{y_m} and taking a final edge (which is of weight x_{y_m}) to the terminal vertex v_{n+1}.
In the other direction, suppose that there is a simple path in G of total weight exactly k. Since the path is simple, each vertex appears at most once. Thus each edge in the path leaves a unique vertex. For each vertex v_i in the path except the last, add x_i to a set of chosen numbers: these numbers correspond to the edge weights in the path, so clearly they sum to exactly k, implying that the solution to the Subset Sum problem is also YES. (Notice that the position of the final vertex in the path doesn't matter, since we only care about the vertex that it leaves, and all edges leaving a vertex have the same weight.)
A YES answer to either problem implies a YES answer to the other problem, so a NO answer to either problem implies a NO answer to the other problem. Thus the answer to any Subset Sum problem instance can be found by first constructing the specified instance of your problem in polynomial time, and then using any algorithm for your problem to solve that instance -- so if an algorithm exists that can solve any instance of your problem in polynomial time, the NP-hard Subset Sum problem can also be solved in polynomial time.

Path from s to e in a weighted DAG graph with limitations

Consider a directed graph with n nodes and m edges. Each edge is weighted. There is a start node s and an end node e. We want to find the path from s to e that has the maximum number of nodes such that:
the total distance is less than some constant d
starting from s, each node in the path is closer than the previous one to the node e. (as in, when you traverse the path you are getting closer to your destination e. in terms of the edge weight of the remaining path.)
We can assume there are no cycles in the graph. There are no negative weights. Does an efficient algorithm already exist for this problem? Is there a name for this problem?
Whatever you end up doing, do a BFS/DFS starting from s first to see if e can even be reached; this only takes you O(n+m) so it won't add to the complexity of the problem (since you need to look at all vertices and edges anyway). Also, delete all edges with weight 0 before you do anything else since those never fulfill your second criterion.
EDIT: I figured out an algorithm; it's polynomial, depending on the size of your graphs it may still not be sufficiently efficient though. See the edit further down.
Now for some complexity. The first thing to think about here is an upper bound on how many paths we can actually have, so depending on the choice of d and the weights of the edges, we also have an upper bound on the complexity of any potential algorithm.
How many edges can there be in a DAG? The answer is n(n-1)/2, which is a tight bound: take n vertices, order them from 1 to n; for two vertices i and j, add an edge i->j to the graph iff i<j. This sums to a total of n(n-1)/2, since this way, for every pair of vertices, there is exactly one directed edge between them, meaning we have as many edges in the graph as we would have in a complete undirected graph over n vertices.
How many paths can there be from one vertex to another in the graph described above? The answer is 2n-2. Proof by induction:
Take the graph over 2 vertices as described above; there is 1 = 20 = 22-2 path from vertex 1 to vertex 2: (1->2).
Induction step: assuming there are 2n-2 paths from the vertex with number 1 of an n vertex graph as described above to the vertex with number n, increment the number of each vertex and add a new vertex 1 along with the required n edges. It has its own edge to the vertex now labeled n+1. Additionally, it has 2i-2 paths to that vertex for every i in [2;n] (it has all the paths the other vertices have to the vertex n+1 collectively, each "prefixed" with the edge 1->i). This gives us 1 + Σnk=2 (2k-2) = 1 + Σn-2k=0 (2k-2) = 1 + (2n-1 - 1) = 2n-1 = 2(n+1)-2.
So we see that there are DAGs that have 2n-2 distinct paths between some pairs of their vertices; this is a bit of a bleak outlook, since depending on weights and your choice of d, you may have to consider them all. This in itself doesn't mean we can't choose some form of optimum (which is what you're looking for) efficiently though.
EDIT: Ok so here goes what I would do:
Delete all edges with weight 0 (and smaller, but you ruled that out), since they can never fulfill your second criterion.
Do a topological sort of the graph; in the following, let's only consider the part of the topological sorting of the graph from s to e, let's call that the integer interval [s;e]. Delete everything from the graph that isn't strictly in that interval, meaning all vertices outside of it along with the incident edges. During the topSort, you'll also be able to see whether there is a
path from s to e, so you'll know whether there are any paths s-...->e. Complexity of this part is O(n+m).
Now the actual algorithm:
traverse the vertices of [s;e] in the order imposed by the topological
sorting
for every vertex v, store a two-dimensional array of information; let's call it
prev[][] since it's gonna store information about the predecessors
of a node on the paths leading towards it
in prev[i][j], store how long the total path of length (counted in
vertices) i is as a sum of the edge weights, if j is the predecessor of the
current vertex on that path. For example, pres+1[1][s] would have
the weight of the edge s->s+1 in it, while all other entries in pres+1
would be 0/undefined.
when calculating the array for a new vertex v, all we have to do is check
its incoming edges and iterate over the arrays for the start vertices of those
edges. For example, let's say vertex v has an incoming edge from vertex w,
having weight c. Consider what the entry prev[i][w] should be.
We have an edge w->v, so we need to set prev[i][w] in v to
min(prew[i-1][k] for all k, but ignore entries with 0) + c (notice the subscript of the array!); we effectively take the cost of a
path of length i - 1 that leads to w, and add the cost of the edge w->v.
Why the minimum? The vertex w can have many predecessors for paths of length
i - 1; however, we want to stay below a cost limit, which greedy minimization
at each vertex will do for us. We will need to do this for all i in [1;s-v].
While calculating the array for a vertex, do not set entries that would give you
a path with cost above d; since all edges have positive weights, we can only get
more costly paths with each edge, so just ignore those.
Once you reached e and finished calculating pree, you're done with this
part of the algorithm.
Iterate over pree, starting with pree[e-s]; since we have no cycles, all
paths are simple paths and therefore the longest path from s to e can have e-s edges. Find the largest
i such that pree[i] has a non-zero (meaning it is defined) entry; if non exists, there is no path fitting your criteria. You can reconstruct
any existing path using the arrays of the other vertices.
Now that gives you a space complexity of O(n^3) and a time complexity of O(n²m) - the arrays have O(n²) entries, we have to iterate over O(m) arrays, one array for each edge - but I think it's very obvious where the wasteful use of data structures here can be optimized using hashing structures and other things than arrays. Or you could just use a one-dimensional array and only store the current minimum instead of recomputing it every time (you'll have to encapsulate the sum of edge weights of the path together with the predecessor vertex though since you need to know the predecessor to reconstruct the path), which would change the size of the arrays from n² to n since you now only need one entry per number-of-nodes-on-path-to-vertex, bringing down the space complexity of the algorithm to O(n²) and the time complexity to O(nm). You can also try and do some form of topological sort that gets rid of the vertices from which you can't reach e, because those can be safely ignored as well.

Why doesn't Dijkstra's algorithm work for negative weight edges?

Can somebody tell me why Dijkstra's algorithm for single source shortest path assumes that the edges must be non-negative.
I am talking about only edges not the negative weight cycles.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) - the algorithm found the shortest path to it, and will never have to develop this node again - it assumes the path developed to this path is the shortest.
But with negative weights - it might not be true. For example:
A
/ \
/ \
/ \
5 2
/ \
B--(-10)-->C
V={A,B,C} ; E = {(A,C,2), (A,B,5), (B,C,-10)}
Dijkstra from A will first develop C, and will later fail to find A->B->C
EDIT a bit deeper explanation:
Note that this is important, because in each relaxation step, the algorithm assumes the "cost" to the "closed" nodes is indeed minimal, and thus the node that will next be selected is also minimal.
The idea of it is: If we have a vertex in open such that its cost is minimal - by adding any positive number to any vertex - the minimality will never change.
Without the constraint on positive numbers - the above assumption is not true.
Since we do "know" each vertex which was "closed" is minimal - we can safely do the relaxation step - without "looking back". If we do need to "look back" - Bellman-Ford offers a recursive-like (DP) solution of doing so.
Consider the graph shown below with the source as Vertex A. First try running Dijkstra’s algorithm yourself on it.
When I refer to Dijkstra’s algorithm in my explanation I will be talking about the Dijkstra's Algorithm as implemented below,
So starting out the values (the distance from the source to the vertex) initially assigned to each vertex are,
We first extract the vertex in Q = [A,B,C] which has smallest value, i.e. A, after which Q = [B, C]. Note A has a directed edge to B and C, also both of them are in Q, therefore we update both of those values,
Now we extract C as (2<5), now Q = [B]. Note that C is connected to nothing, so line16 loop doesn't run.
Finally we extract B, after which . Note B has a directed edge to C but C isn't present in Q therefore we again don't enter the for loop in line16,
So we end up with the distances as
Note how this is wrong as the shortest distance from A to C is 5 + -10 = -5, when you go .
So for this graph Dijkstra's Algorithm wrongly computes the distance from A to C.
This happens because Dijkstra's Algorithm does not try to find a shorter path to vertices which are already extracted from Q.
What the line16 loop is doing is taking the vertex u and saying "hey looks like we can go to v from source via u, is that (alt or alternative) distance any better than the current dist[v] we got? If so lets update dist[v]"
Note that in line16 they check all neighbors v (i.e. a directed edge exists from u to v), of u which are still in Q. In line14 they remove visited notes from Q. So if x is a visited neighbour of u, the path is not even considered as a possible shorter way from source to v.
In our example above, C was a visited neighbour of B, thus the path was not considered, leaving the current shortest path unchanged.
This is actually useful if the edge weights are all positive numbers, because then we wouldn't waste our time considering paths that can't be shorter.
So I say that when running this algorithm if x is extracted from Q before y, then its not possible to find a path - which is shorter. Let me explain this with an example,
As y has just been extracted and x had been extracted before itself, then dist[y] > dist[x] because otherwise y would have been extracted before x. (line 13 min distance first)
And as we already assumed that the edge weights are positive, i.e. length(x,y)>0. So the alternative distance (alt) via y is always sure to be greater, i.e. dist[y] + length(x,y)> dist[x]. So the value of dist[x] would not have been updated even if y was considered as a path to x, thus we conclude that it makes sense to only consider neighbors of y which are still in Q (note comment in line16)
But this thing hinges on our assumption of positive edge length, if length(u,v)<0 then depending on how negative that edge is we might replace the dist[x] after the comparison in line18.
So any dist[x] calculation we make will be incorrect if x is removed before all vertices v - such that x is a neighbour of v with negative edge connecting them - is removed.
Because each of those v vertices is the second last vertex on a potential "better" path from source to x, which is discarded by Dijkstra’s algorithm.
So in the example I gave above, the mistake was because C was removed before B was removed. While that C was a neighbour of B with a negative edge!
Just to clarify, B and C are A's neighbours. B has a single neighbour C and C has no neighbours. length(a,b) is the edge length between the vertices a and b.
Dijkstra's algorithm assumes paths can only become 'heavier', so that if you have a path from A to B with a weight of 3, and a path from A to C with a weight of 3, there's no way you can add an edge and get from A to B through C with a weight of less than 3.
This assumption makes the algorithm faster than algorithms that have to take negative weights into account.
Correctness of Dijkstra's algorithm:
We have 2 sets of vertices at any step of the algorithm. Set A consists of the vertices to which we have computed the shortest paths. Set B consists of the remaining vertices.
Inductive Hypothesis: At each step we will assume that all previous iterations are correct.
Inductive Step: When we add a vertex V to the set A and set the distance to be dist[V], we must prove that this distance is optimal. If this is not optimal then there must be some other path to the vertex V that is of shorter length.
Suppose this some other path goes through some vertex X.
Now, since dist[V] <= dist[X] , therefore any other path to V will be atleast dist[V] length, unless the graph has negative edge lengths.
Thus for dijkstra's algorithm to work, the edge weights must be non negative.
Dijkstra's Algorithm assumes that all edges are positive weighted and this assumption helps the algorithm run faster ( O(E*log(V) ) than others which take into account the possibility of negative edges (e.g bellman ford's algorithm with complexity of O(V^3)).
This algorithm wont give the correct result in the following case (with a -ve edge) where A is the source vertex:
Here, the shortest distance to vertex D from source A should have been 6. But according to Dijkstra's method the shortest distance will be 7 which is incorrect.
Also, Dijkstra's Algorithm may sometimes give correct solution even if there are negative edges. Following is an example of such a case:
However, It will never detect a negative cycle and always produce a result which will always be incorrect if a negative weight cycle is reachable from the source, as in such a case there exists no shortest path in the graph from the source vertex.
Try Dijkstra's algorithm on the following graph, assuming A is the source node and D is the destination, to see what is happening:
Note that you have to follow strictly the algorithm definition and you should not follow your intuition (which tells you the upper path is shorter).
The main insight here is that the algorithm only looks at all directly connected edges and it takes the smallest of these edge. The algorithm does not look ahead. You can modify this behavior , but then it is not the Dijkstra algorithm anymore.
You can use dijkstra's algorithm with negative edges not including negative cycle, but you must allow a vertex can be visited multiple times and that version will lose it's fast time complexity.
In that case practically I've seen it's better to use SPFA algorithm which have normal queue and can handle negative edges.
Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) -it assumes that any node originating from it will lead to greater distance so, the algorithm found the shortest path to it, and will never have to develop this node again, but this doesn't hold true in case of negative weights.
The other answers so far demonstrate pretty well why Dijkstra's algorithm cannot handle negative weights on paths.
But the question itself is maybe based on a wrong understanding of the weight of paths. If negative weights on paths would be allowed in pathfinding algorithms in general, then you would get permanent loops that would not stop.
Consider this:
A <- 5 -> B <- (-1) -> C <- 5 -> D
What is the optimal path between A and D?
Any pathfinding algorithm would have to continuously loop between B and C because doing so would reduce the weight of the total path. So allowing negative weights for a connection would render any pathfindig algorithm moot, maybe except if you limit each connection to be used only once.
So, to explain this in more detail, consider the following paths and weights:
Path | Total weight
ABCD | 9
ABCBCD | 7
ABCBCBCD | 5
ABCBCBCBCD | 3
ABCBCBCBCBCD | 1
ABCBCBCBCBCBCD | -1
...
So, what's the perfect path? Any time the algorithm adds a BC step, it reduces the total weight by 2.
So the optimal path is A (BC) D with the BC part being looped forever.
Since Dijkstra's goal is to find the optimal path (not just any path), it, by definition, cannot work with negative weights, since it cannot find the optimal path.
Dijkstra will actually not loop, since it keeps a list of nodes that it has visited. But it will not find a perfect path, but instead just any path.
Adding few points to the explanation, on top of the previous answers, for the following simple example,
Dijktra's algorithm being greedy, it first finds the minimum distance vertex C from the source vertex A greedily and assigns the distance d[C] (from vertex A) to the weight of the edge AC.
The underlying assumption is that since C was picked first, there is no other vertex V in the graph s.t. w(AV) < w(AC), otherwise V would have been picked instead of C, by the algorithm.
Since by above logic, w(AC) <= w(AV), for all vertex V different from the vertices A and C. Now, clearly any other path P that starts from A and ends in C, going through V , i.e., the path P = A -> V -> ... -> C, will be longer in length (>= 2) and total cost of the path P will be sum of the edges on it, i.e., cost(P) >= w(AV) >= w(AC), assuming all edges on P have non-negative weights, so that
C can be safely removed from the queue Q, since d[C] can never get smaller / relaxed further under this assumption.
Obviously, the above assumption does not hold when some.edge on P is negative, in a which case d[C] may decrease further, but the algorithm can't take care of this scenario, since by that time it has removed C from the queue Q.
In Unweighted graph
Dijkstra can even work without set or priority queue, even if you just use STACK the algorithm will work but with Stack its time of execution will increase
Dijkstra don't repeat a node once its processed becoz it always tooks the minimum route , which means if you come to that node via any other path it will certainly have greater distance
For ex -
(0)
/
6 5
/
(2) (1)
\ /
4 7
\ /
(9)
here once you get to node 1 via 0 (as its minimum out of 5 and 6)so now there is no way you can get a minimum value for reaching 1
because all other path will add value to 5 and not decrease it
more over with Negative weights it will fall into infinite loop
In Unweighted graph
Dijkstra Algo will fall into loop if it has negative weight
In Directed graph
Dijkstra Algo will give RIGHT ANSWER except in case of Negative Cycle
Who says Dijkstra never visit a node more than once are 500% wrong
also who says Dijkstra can't work with negative weight are wrong

Resources