Recompute distances after node removal - algorithm

I have a graph with computed distances from the "start" node. Now I'd like to remove one node (except the starting one) and recompute distances (ideally without running Shortest Path First on the whole graph).
I don't know how to google such an algorithm and my attempts seem to be quite complicated (especially when compared to adding a new node).

One way to implement Dijkstra's algorithm is to maintain a set of nodes whose distance from the start needs to be updated. When a node's distance is effectively updated, that node is removed from the set, but all of its neighbours are added to the set. When a node's update has no effect, i.e., when that node's distance doesn't change, the node is removed from the set and no node is added. The halting condition for the algorithm is "no node needs to be updated".
When you remove a node from the graph, all of its neighbours need to be updated to reflect the removal.
So you can simply "relaunch" Dijkstra's algorithm on your graph, with the initial distances from the start node which you already have, and with the set of nodes to be updated initialised with the neighbours of the node that was removed. The updates will naturally propagate to any node that will eventually need to be updated.
Note: If your graph is directed, only the nodes with an incoming edge from the removed node need to be added to the set of nodes to be updated.
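The update-set formulation described above can be sketched in Python (a minimal sketch, assuming the graph is a dict mapping each node to a dict of neighbours and edge weights; all names are illustrative):

```python
def worklist_shortest_paths(graph, start):
    """Worklist variant: keep a set of nodes whose distance may need
    updating; an effective update re-queues the node's neighbours.
    Halts when no node needs to be updated."""
    dist = {v: float("inf") for v in graph}
    dist[start] = 0
    pending = {start}
    while pending:                      # "no node needs to be updated"
        u = pending.pop()
        for v, w in graph[u].items():
            if dist[u] + w < dist[v]:   # effective update
                dist[v] = dist[u] + w
                pending.add(v)          # v's neighbours may now improve
    return dist
```

Seeding `pending` with the neighbours of a removed node, instead of with the start node, gives the relaunch described in the answer.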

Kudos to @Stef, whose answer I am now extending.
Removing a node that is part of a set of minimal paths may require recalculating the distances to all of their end-nodes. Using the pseudocode for Dijkstra's algorithm as found here:
 1  function Dijkstra(Graph, source):
 2
 3      create vertex set Q
 4
 5      for each vertex v in Graph:
 6          dist[v] ← INFINITY
 7          prev[v] ← UNDEFINED
 8          add v to Q
 9
10      dist[source] ← 0
11
12      while Q is not empty:
13          u ← vertex in Q with min dist[u]
14
15          remove u from Q
16
17          for each neighbor v of u:    // only v that are still in Q
18              alt ← dist[u] + length(u, v)
19              if alt < dist[v]:
20                  dist[v] ← alt
21                  prev[v] ← u
22
23      return dist[], prev[]
the prev array contains, for each node, its predecessor in the shortest-path tree (the tree of all shortest paths; not to be confused with a minimum spanning tree). So, to remove node r, assuming you still have the dist and prev arrays from before the removal, you could change it to:
function DijkstraRemove(Graph, dist, prev, removed):
    create vertex set Q
    for each vertex v in Graph:
        u ← v
        while prev[u] != UNDEFINED:
            if prev[u] == removed:
                add v to Q
                dist[v] ← INFINITY    // INFINITY (not UNDEFINED), so line 19 can relax it again
                prev[v] ← UNDEFINED
                break
            else:
                u ← prev[u]
    // continue with line 12 above
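Putting the two answers together, here is a hedged Python sketch (the graph representation, function name, and the strategy of re-seeding the heap from the still-valid neighbours are my own choices, not spelled out in either answer): invalidate every node whose prev chain passes through the removed node, then resume a standard Dijkstra loop.

```python
import heapq

def recompute_after_removal(graph, dist, prev, removed):
    # 1. Collect every node whose shortest-path (prev) chain passes
    #    through `removed`; those distances are no longer valid.
    invalid = {removed}
    for v in graph:
        chain, u = [], v
        while u is not None and u not in invalid:
            chain.append(u)
            u = prev.get(u)
        if u in invalid:              # the chain runs into an invalidated node
            invalid.update(chain)
    # 2. Reset the invalidated nodes and delete `removed` from the graph.
    for v in invalid:
        dist[v] = float("inf")
        prev[v] = None
    graph = {u: {v: w for v, w in nbrs.items() if v != removed}
             for u, nbrs in graph.items() if u != removed}
    invalid.discard(removed)
    dist.pop(removed, None)
    prev.pop(removed, None)
    # 3. Seed the heap by relaxing edges from still-settled nodes
    #    into the invalidated region.
    heap = []
    for u, nbrs in graph.items():
        if u in invalid:
            continue
        for v, w in nbrs.items():
            if v in invalid and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    # 4. Resume the standard Dijkstra main loop.
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, prev
```

Only the invalidated region and its frontier are reprocessed; settled nodes outside it are never touched again.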

Related

Can Dijkstra's Algorithm work on a graph with weights of 0?

If there exists a weighted graph G, and all weights are 0, does Dijkstra's algorithm still find the shortest path? If so, why?
As per my understanding of the algorithm, Dijkstra's algorithm will run like a normal BFS if there are no edge weights, but I would appreciate some clarification.
Explanation
Dijkstra itself has no problem with 0-weight edges, per definition of the algorithm. It only gets problematic with negative weights.
In every round, Dijkstra settles a node. If you could later find a negatively weighted edge, it might lead to a shorter path to that settled node. The node would then need to be unsettled, which Dijkstra's algorithm does not allow (and allowing it would break the complexity of the algorithm). This becomes clear if you take a look at the actual algorithm and some illustration.
The behavior of Dijkstra on such an all-zero graph is the same as if all edges had some identical positive value, like 1 (except for the resulting shortest-path lengths). Dijkstra will simply visit all nodes, in no particular order. Basically, like an ordinary breadth-first search.
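To illustrate, a small Python sketch (a textbook heap-based Dijkstra; the graph and all names are made up): on an all-zero graph, every reachable node simply ends up at distance 0.

```python
import heapq

def dijkstra(graph, source):
    """Textbook Dijkstra; graph maps node -> {neighbour: weight}."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)                 # u is settled; its distance is final
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# All-zero weights: every reachable node ends up at distance 0,
# visited in breadth-first fashion.
zero = {'a': {'b': 0, 'c': 0}, 'b': {'d': 0}, 'c': {}, 'd': {}}
```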
Details
Take a look at the algorithm description from Wikipedia:
 1  function Dijkstra(Graph, source):
 2
 3      create vertex set Q
 4
 5      for each vertex v in Graph:              // Initialization
 6          dist[v] ← INFINITY                   // Unknown distance from source to v
 7          prev[v] ← UNDEFINED                  // Previous node in optimal path from source
 8          add v to Q                           // All nodes initially in Q (unvisited nodes)
 9
10      dist[source] ← 0                         // Distance from source to source
11
12      while Q is not empty:
13          u ← vertex in Q with min dist[u]     // Node with the least distance
14                                               // will be selected first
15          remove u from Q
16
17          for each neighbor v of u:            // where v is still in Q.
18              alt ← dist[u] + length(u, v)
19              if alt < dist[v]:                // A shorter path to v has been found
20                  dist[v] ← alt
21                  prev[v] ← u
22
23      return dist[], prev[]
The problem with negative values lies in lines 15 and 17. When you remove node u from Q, you settle it. That is, you declare that the shortest path to this node is now known. But that means you won't consider u again in line 17 as a neighbor of some other node (since it's no longer contained in Q).
With negative values it could happen that you later find a shorter path (due to negative weights) to that node. You would need to consider u again in the algorithm and redo all the computation that depended on the previous shortest path to u. So you would need to add u, and every other node removed from Q that had u on its shortest path, back to Q.
In particular, you would need to consider all edges that could lead to your destination, since you never know where some nasty -1_000_000-weighted edge hides.
The following example illustrates the problem:
Dijkstra will declare the red path to be the shortest path from A to C, with length 0. However, there is a shorter path: it is marked blue and has a length of 99 - 300 + 1 = -200.
With negative weights you could even create a more dangerous scenario, negative cycles. That is a cycle in the graph with a negative total weight. You then need a way to stop moving along the cycle all the time, endlessly dropping your current weight.
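Since the figure is not reproduced here, the following Python sketch uses a hypothetical graph consistent with the numbers above (a direct A-C edge of weight 0 as the "red" path, and a detour A→B→D→C of weight 99 - 300 + 1 = -200 as the "blue" one) to show classical Dijkstra settling C too early:

```python
import heapq

def dijkstra(graph, source):
    """Classical Dijkstra: a settled node is never reconsidered."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap, settled = [(0, source)], set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        for v, w in graph[u].items():
            # settled nodes are excluded from relaxation
            if v not in settled and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# Hypothetical reconstruction of the missing figure: "red" path A->C
# of weight 0, "blue" path A->B->D->C of weight 99 - 300 + 1 = -200.
g = {'A': {'C': 0, 'B': 99}, 'B': {'D': -300}, 'C': {}, 'D': {'C': 1}}
```

Here `dijkstra(g, 'A')['C']` is 0: C is settled before the -300 edge is even seen, although the true shortest distance is -200.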
Notes
In an undirected graph edges with weight 0 can be eliminated and the nodes be merged. A shortest path between them will always have length 0. If the whole graph only has 0 weights, then the graph could just be merged to one node. The result to every shortest path query is simply 0.
The same holds for directed graphs if you have such an edge in both directions. If not, you can't apply that optimization, as you would change the reachability of nodes.

Is there a true single-pair shortest path algorithm?

I came across this term today "single-pair shortest path problem". I was wondering if a single-pair shortest path algorithm exists for weighted graphs. My reasoning might be flawed, but I imagine that if you want to find the shortest path between A and Z, you absolutely have to know the shortest path from A to B, C, D, ... Y.
If you do not know the latter you can not be sure that your path is in fact the shortest one. Thus for me any shortest path algorithm has to compute the shortest path from A to every other vertex in the graph, in order to get the shortest path from A to Z.
Is this correct?
PS: If yes, any research paper properly proving this?
For graphs with non-negative edge weights, Dijkstra's algorithm itself solves the single-pair problem.
A quote from Wikipedia:
The algorithm exists in many variants; Dijkstra's original variant found the shortest path between two nodes, but a more common variant fixes a single node as the "source" node and finds shortest paths from the source to all other nodes in the graph, producing a shortest-path tree.
Consider the following pseudocode from Wikipedia:
 1  function Dijkstra(Graph, source):
 2
 3      create vertex set Q
 4
 5      for each vertex v in Graph:              // Initialization
 6          dist[v] ← INFINITY                   // Unknown distance from source to v
 7          prev[v] ← UNDEFINED                  // Previous node in optimal path from source
 8          add v to Q                           // All nodes initially in Q (unvisited nodes)
 9
10      dist[source] ← 0                         // Distance from source to source
11
12      while Q is not empty:
13          u ← vertex in Q with min dist[u]     // Node with the least distance will be selected first
14          remove u from Q
15
16          for each neighbor v of u:            // where v is still in Q.
17              alt ← dist[u] + length(u, v)
18              if alt < dist[v]:                // A shorter path to v has been found
19                  dist[v] ← alt
20                  prev[v] ← u
21
22      return dist[], prev[]
With each new iteration of the while loop (12), the first step is to pick the vertex u with the shortest distance from the remaining set Q (13); that vertex is then removed from Q (14), signifying that the shortest distance to u has now been found. If u is your destination, you can halt without considering further edges.
Note that when halting early, not all edges have been examined, and the shortest paths to all vertices have not necessarily been found.
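The early halt described above can be sketched in Python (illustrative names; a heap-based variant rather than the remove-min-from-set formulation of the pseudocode):

```python
import heapq

def dijkstra_single_pair(graph, source, target):
    """Dijkstra with early exit: once `target` is popped (settled),
    its distance is final, so we stop without settling the rest."""
    dist = {source: 0}
    heap = [(0, source)]
    settled = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        if u == target:
            return d                    # shortest distance found; halt here
        settled.add(u)
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")                 # target unreachable
```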
Quoting CLRS, 3rd Edition, Chapter 24:
Single-pair shortest-path problem: Find a shortest path from u to v for given vertices u and v. If we solve the single-source problem with source vertex u, we solve this problem also. Moreover, all known algorithms for this problem have the same worst-case asymptotic running time as the best single-source algorithms.

Dijkstra with negative edges. Don't understand the examples, they work according to CLRS pseudocode

EDIT 2: It seems this isn't from CLRS (I assumed it was because it followed the same format of CLRS code that was given to us in this Algos and DS course).
Still, in this course we were given this code as being "Dijkstra's Algorithm".
I read Why doesn't Dijkstra's algorithm work for negative weight edges? and Negative weights using Dijkstra's Algorithm (second one is specific to the OP's algorithm I think).
Looking at the Pseudocode from CLRS ("Intro to Algorithms"), I don't understand why Dijkstra wouldn't work on those examples of graphs with negative edges.
In the pseudocode (below), we Insert nodes back onto the heap if the new distance to them is shorter than the previous distance to them, so it seems to me that the distances would eventually be updated to the correct distances.
For example:
The claim here is that (A,C) will be set to 1 and never updated to the correct distance -2.
But the pseudocode from CLRS says that we first put C and B on the Heap with distances 1 and 2 respectively; then we pop C, see no outgoing edges; then we pop B, look at the edge (B,C), see that Dist[C] > Dist[B] + w(B,C), update Dist[C] to -2, put C back on the heap, see no outgoing edges and we're done.
So it worked fine.
Same for the example in the first answer to this question: Negative weights using Dijkstra's Algorithm
The author of the answer claims that the distance to C will not be updated to -200, but according to this pseudocode that's not true, since we would put B back on the heap and then compute the correct shortest distance to C.
(pseudocode from CLRS)
Dijkstra(G(V, E, ω), s ∈ V)
    for v in V do
        dist[v] ← ∞
        prev[v] ← nil
    end for
    dist[s] ← 0
    H ← {(s, 0)}
    while H ≠ ∅ do
        v ← DeleteMin(H)
        for (v, w) ∈ E do
            if dist[w] > dist[v] + ω(v, w) then
                dist[w] ← dist[v] + ω(v, w)
                prev[w] ← v
                Insert((w, dist[w]), H)
            end if
        end for
    end while
EDIT: I understand that we assume that once a node is popped off the heap, the shortest distance has been found; but still, it seems (according to CLRS) that we do put nodes back on the heap if the distance is shorter than previously computed, so when the algorithm is done running we should get the correct shortest distance regardless.
That implementation is technically not Dijkstra's algorithm, which is described by Dijkstra here (could not find any better link): the set A he talks about are the nodes for which the minimum path is known. So once you add a node to this set, it's fixed. You know the minimum path to it, and it no longer participates in the rest of the algorithm. It also talks about transferring nodes, so they cannot be in two sets at once.
This is in line with Wikipedia's pseudocode:
 1  function Dijkstra(Graph, source):
 2
 3      create vertex set Q
 4
 5      for each vertex v in Graph:              // Initialization
 6          dist[v] ← INFINITY                   // Unknown distance from source to v
 7          prev[v] ← UNDEFINED                  // Previous node in optimal path from source
 8          add v to Q                           // All nodes initially in Q (unvisited nodes)
 9
10      dist[source] ← 0                         // Distance from source to source
11
12      while Q is not empty:
13          u ← vertex in Q with min dist[u]     // Node with the least distance will be selected first
14          remove u from Q
15
16          for each neighbor v of u:            // where v is still in Q.
17              alt ← dist[u] + length(u, v)
18              if alt < dist[v]:                // A shorter path to v has been found
19                  dist[v] ← alt
20                  prev[v] ← u
21
22      return dist[], prev[]
And its heap pseudocode as well.
However, note that Wikipedia also states, at the time of this answer:
Instead of filling the priority queue with all nodes in the initialization phase, it is also possible to initialize it to contain only source; then, inside the if alt < dist[v] block, the node must be inserted if not already in the queue (instead of performing a decrease_priority operation).[3]:198
Doing this would still lead to reinserting a node in some cases with negative valued edges, such as the example graph given in the accepted answer to the second linked question.
So it seems that some authors make this confusion. In this case, they should clearly state that either this implementation works with negative edges or that it's not a proper Dijkstra's implementation.
I guess the original paper might be interpreted as a bit vague. Nowhere in it does Dijkstra make any mention of negative or positive edges, nor does he make it clear beyond any alternative interpretation that a node cannot be updated once in the A set. I don't know if he himself further clarified things in any subsequent works or speeches, or if the rest is just a matter of interpretation by others.
So from my point of view, you could argue that it's also a valid Dijkstra's.
As to why you might implement it this way: because it will likely be no slower in practice if we only have positive edges, and because it is quicker to write without having to perform additional checks or not-so-standard heap operations.
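For comparison, here is a Python sketch of the reinsertion variant from the question, run on the question's example graph (the weights w(A,C) = 1 and w(A,B) = 2 are given in the walk-through; w(B,C) = -4 is inferred from it, since dist[B] = 2 and the final dist[C] = -2):

```python
import heapq

def dijkstra_with_reinsertion(graph, source):
    """The variant discussed above: a node is pushed back on the heap
    whenever its distance improves, with no notion of 'settled'.
    It handles some negative-edge graphs, but it is not classical
    Dijkstra and can degrade badly in the worst case."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                    # stale heap entry
        for v, w in graph[u].items():
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                heapq.heappush(heap, (dist[v], v))  # reinserted
    return dist

# Graph inferred from the question's walk-through.
g = {'A': {'C': 1, 'B': 2}, 'B': {'C': -4}, 'C': {}}
```

On this graph it does reach dist[C] = -2, matching the question's trace; the point of the answer stands, though: that behaviour is a property of the reinsertion variant, not of classical Dijkstra.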

Dijkstra's algorithm: Do I iterate through the neighbors or the children?

I'm looking at Dijkstra's algorithm in pseudocode on Wikipedia
 1  function Dijkstra(Graph, source):
 2
 3      create vertex set Q
 4
 5      for each vertex v in Graph:              // Initialization
 6          dist[v] ← INFINITY                   // Unknown distance from source to v
 7          prev[v] ← UNDEFINED                  // Previous node in optimal path from source
 8          add v to Q                           // All nodes initially in Q (unvisited nodes)
 9
10      dist[source] ← 0                         // Distance from source to source
11
12      while Q is not empty:
13          u ← vertex in Q with min dist[u]     // Source node will be selected first
14          remove u from Q
15
16          for each neighbor v of u:            // where v is still in Q.
17              alt ← dist[u] + length(u, v)
18              if alt < dist[v]:                // A shorter path to v has been found
19                  dist[v] ← alt
20                  prev[v] ← u
21
22      return dist[], prev[]
https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
and the part that's confusing me is line 16. It says for each neighbor but shouldn't that be for each child (i.e. for each neighbor where neighbor != parent). Otherwise I don't see the point of setting the parent in line 20.
The previous node is set on line 20:
prev[v] ← u
This can only happen if line 14 is executed:
remove u from Q
So, for any v, prev[v] cannot be in Q - it was previously removed, and it will never return to Q (within the loop starting at line 12, items are no longer added to Q). This is the same as saying that for any u, prev[u] cannot be in Q - aside from the name of the variable, it says the same thing.
In the question you say that about line 16:
it says for each neighbor
But, if you look at the pseudocode, it actually says
for each neighbor v of u: // where v is still in Q.
So, prev[u] will not be iterated over - it's not in Q.
For what it's worth, I think the pseudocode is a bit sloppy and confusing: "// where v is still in Q" should not be a comment. It doesn't clarify or explain the rest of the code - it alters the meaning, and should be part of the code. Perhaps that confused you.
Ultimately, Dijkstra's algorithm computes something called a shortest-path tree, a tree structure rooted at some starting node where the paths in the tree give the shortest paths to each node in the graph. The logic you're seeing that sets the parent of each node is the part of Dijkstra's algorithm that builds the tree one node at a time.
Although Dijkstra's algorithm builds the shortest-path tree, it doesn't walk over it. Rather, it works by processing the nodes of the original graph in a particular order, constantly updating the candidate distances of nodes adjacent to processed nodes. This means that in the pseudocode, the logic that says "loop over all the adjacent nodes" is correct, because it means "loop over all the adjacent nodes in the original graph." It wouldn't work to iterate over the child nodes in the generated shortest-path tree, because that tree hasn't been completely assembled at that point in the algorithm.
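Once the algorithm has finished, recovering an actual shortest path from the prev array is just a walk up the tree (a minimal sketch; the sample prev table below is made up):

```python
def shortest_path(prev, target):
    """Walk the prev[] links from target back to the root of the
    shortest-path tree produced by Dijkstra, then reverse."""
    path = []
    u = target
    while u is not None:                # the root has prev == None
        path.append(u)
        u = prev[u]
    return path[::-1]

# prev as Dijkstra might produce it for a small graph rooted at 'a'
prev = {'a': None, 'b': 'a', 'c': 'b', 'd': 'b'}
```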

Do we really need the "visited or not" info of a vertex in Dijkstra's algorithm?

In Dijkstra's algorithm on Wikipedia (an older version, since corrected by me), the implementation with a priority queue checks whether a vertex has been visited before checking for a shorter path.
Is that really necessary, or even correct?
function Dijkstra(Graph, source):
    dist[source] ← 0                       // Initialization
    for each vertex v in Graph:
        if v ≠ source
            dist[v] ← infinity             // Unknown distance from source to v
            prev[v] ← undefined            // Predecessor of v
        end if
        Q.add_with_priority(v, dist[v])
    end for
    while Q is not empty:                  // The main loop
        u ← Q.extract_min()                // Remove and return best vertex
        mark u as scanned
        for each neighbor v of u:
            if v is not yet scanned:       // **is this needed?**
                alt ← dist[u] + length(u, v)
                if alt < dist[v]
                    dist[v] ← alt
                    prev[v] ← u
                    Q.decrease_priority(v, alt)
                end if
            end if
        end for
    end while
    return prev[]
I think checking whether v has already been scanned prevents any chance of v's path being updated later, should it ever need to be.
Update:
I've edited the page and the current Dijkstra's algorithm wiki page is now correct.
The flag is not needed. Just look at this section of pseudocode:
if v is not yet scanned:
    alt ← dist[u] + length(u, v)
    if alt < dist[v]
        dist[v] ← alt
        prev[v] ← u
        Q.decrease_priority(v, alt)
    end if
end if
If you check the different parts:
alt seems to be a local variable that is only used as a temporary here, so writing to it doesn't make a difference anywhere else.
If v was already removed from the queue, the path to it was at most as long as the path to u; otherwise u would have been extracted first.
If v was visited, then dist[v] <= dist[u] <= alt. In that case the comparison alt < dist[v] is false and the rest of the code is skipped anyway.
Just to explain a bit more, think about the role of the priority queue. It contains the nodes, ordered by the shortest known path to them. When extracting a node from the queue, all nodes before it were at most as far away as that node, and all nodes after it will be at least as far away. Since all nodes that were closer have already been processed, any new path discovered to one of these nodes will be via a node that is at least as far away. This means that there can't be any shorter route to an extracted node coming from a node that is still in the queue, so the mere check alt < dist[v] already excludes the nodes that were scanned.
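The argument can be checked with a small Python sketch carrying no scanned flag at all (the d > dist[u] test only discards outdated heap entries; it is the alt < dist[v] comparison alone that keeps settled nodes settled, assuming non-negative weights):

```python
import heapq

def dijkstra_no_flag(graph, source):
    """Dijkstra without any 'scanned' flag: already-settled nodes are
    rejected by the alt < dist[v] test alone."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                    # outdated queue entry, not a flag
        for v, w in graph[u].items():
            alt = d + w
            if alt < dist[v]:           # false for every settled v
                dist[v] = alt
                heapq.heappush(heap, (alt, v))
    return dist
```

Note the sample graph in the test below has a back edge to the source; it is harmlessly rejected by alt < dist[v], so no looping back occurs.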
It is required to check whether V has already been scanned.
In the first cycle, we find the vertex (V1) with the minimum cost among the neighbour vertexes of the source vertex (S1).
In the second cycle, we should not move back to the source vertex (S1) from vertex (V1).
Moving back to the source vertex (S1) happens when length(S1, V1) is the minimum among the neighbour costs of vertex (V1).
Code that does not check whether V has been scanned will result in a loop in certain cases.
