Consider this graph
If we consider A to be the source node and C to be the destination, Dijkstra’s algorithm will first move to D as it is the shorter path and then begin looking for nodes connecting to D.
Now consider the same graph but without edges from D to B and from D to E.
When the current node is A (the initial state), the algorithm finds that, of B and D, D is the closer node, so it moves to D. However, now that there are no edges leaving D and A is already explored, wouldn't the algorithm get stuck at D? Wouldn't it be better to move to B instead of D while at A?
How does the algorithm handle this situation?
Thanks
The algorithm is not physically moving in the graph. It's enqueuing nodes that it sees into a priority queue. This allows it to "jump" around arbitrarily, to whichever node has the next shortest path.
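A minimal sketch of that idea, assuming a small hypothetical graph (the original figure is not available, so the edge weights below are made up): D is the closest node to A but has no outgoing edges. The function name `dijkstra_order` is illustrative only. It records the order in which nodes are settled, which shows the algorithm going from D straight to B without any edge between them.

```python
import heapq

def dijkstra_order(graph, source):
    """Return (order in which nodes are settled, best known distances)."""
    dist = {source: 0}
    settled = []
    heap = [(0, source)]  # priority queue of (distance, node)
    while heap:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue  # stale queue entry for an already settled node
        settled.append(u)
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return settled, dist

# Hypothetical weights: D is nearer to A than B, but D is a dead end.
graph = {"A": [("D", 1), ("B", 2)], "B": [("C", 3)], "D": []}
settled, dist = dijkstra_order(graph, "A")
print(settled)  # D is settled second, then the queue hands back B
print(dist["C"])
```

After D is settled nothing new is enqueued, so the next pop simply returns B, which was enqueued while A was being processed.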
I have a type of directed acyclic graph, with some constraints.
There is only one "entry" vertex
There can be multiple leaf vertices
Once a path splits, anything under that path cannot reach into the other path (this will become clearer with some examples below)
There can be any number of "split" vertices. They can be nested.
A "split" vertex can split into any number of paths. The examples below only show 2 paths for each, but it could be more.
My challenge is the following: for each "split" vertex (any vertex that has at least 2 outgoing edges), find the vertices where its paths reconnect - if such a vertex exists. The solution should be as efficient as possible.
Example A:
example a
In this example, vertex A is a "split" vertex, and its "reconnect vertex" is F.
Example B:
example b
Here, there are two split vertices: A and E. For both of them vertex G is the reconnect vertex.
Example C:
example c
Now there are three split vertices: A, D and E. The corresponding reconnect vertices are:
A -> K
D -> K
E -> J
Example D:
example d
Here we have three split vertices again: A, D and E. But this time, vertex E doesn't have a reconnect vertex because one of the paths terminates early.
Sounds like what you want is:
Connect each vertex with out-degree 0 to a single terminal vertex
Construct the dominator tree of the edge-reversed graph. The linked wikipedia article points to a couple algorithms for doing this.
The "reconnect vertex" for a split vertex is its immediate dominator in the edge-reversed graph, i.e., its parent in that dominator tree. This is called its "postdominator" in your original graph. If it's the terminal vertex that you added, then it doesn't have a reconnect vertex in your original graph.
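For the acyclic case, the steps above can be sketched as follows. This is only one possible way to build the dominator tree of the reversed graph, not necessarily one of the algorithms the article points to: on a DAG, the immediate postdominator of a vertex is the lowest common ancestor, in the postdominator tree built so far, of its successors. The names `reconnect_vertices` and `EXIT` are hypothetical, and the example graphs are made up because the original figures are not available.

```python
from graphlib import TopologicalSorter  # Python 3.9+

EXIT = "<exit>"  # hypothetical virtual terminal; assumed not to clash with a real vertex

def reconnect_vertices(graph):
    """graph: dict vertex -> list of successors (must be a DAG).
    Returns {split vertex: reconnect vertex, or None if there is none}."""
    vertices = set(graph)
    for targets in graph.values():
        vertices.update(targets)
    # Step 1: every vertex with out-degree 0 is connected to the terminal.
    succ = {v: list(graph.get(v, [])) or [EXIT] for v in vertices}
    # TopologicalSorter expects node -> predecessors; feeding it successors
    # yields an order in which each vertex comes after all of its successors.
    order = TopologicalSorter(succ).static_order()
    ipdom, depth = {EXIT: None}, {EXIT: 0}

    def lca(a, b):
        # Walk the deeper vertex up the postdominator tree until both meet.
        while a != b:
            if depth[a] < depth[b]:
                a, b = b, a
            a = ipdom[a]
        return a

    for v in order:
        if v == EXIT:
            continue
        p = succ[v][0]
        for w in succ[v][1:]:
            p = lca(p, w)
        ipdom[v] = p
        depth[v] = depth[p] + 1
    return {v: (None if ipdom[v] == EXIT else ipdom[v])
            for v in vertices if len(succ[v]) >= 2}

# Hypothetical graph with a nested split: A and B both split.
nested = {"A": ["B", "C"], "B": ["D", "E"], "D": ["F"], "E": ["F"],
          "F": ["G"], "C": ["G"]}
print(reconnect_vertices(nested))
# Hypothetical graph where one branch terminates early.
early = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
print(reconnect_vertices(early))
```

The walk-up LCA keeps the sketch short; a production version would likely use one of the standard dominator algorithms instead.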
This is the problem of identifying post-dominators in compilers and program analysis. This is often used in the context of calculating control dependences in control flow graphs. "Advanced Compiler Design and Implementation" is a good reference on these topics.
If the graph does not have cycles, then the solution (a) suggested by #matt-timmermans will work.
If the graph has cycles, then solution (a) can report spurious post-dominators. In such cases, a network-flow based approach works better. The algorithm to calculate non-termination sensitive control dependence in this paper uses this approach. The basic idea is:
at every split node, inject a unique token into the graph along each outgoing edge, and
propagate the tokens through the graph subject to this constraint: if node n is reachable from split node m, then tokens arriving at node m pass through node n only if all tokens of node m have arrived at node n.
At the end, node n post-dominates node m if all tokens of node m have arrived at node n.
I was given an exercise to write software that calculates a shortest path. There is a fixed start node A and a fixed end node Z; these nodes are always the same. Between these nodes there is an unspecified number of "intermediate" nodes. My question: is there an algorithm, or a modification of e.g. Dijkstra's algorithm, that calculates the best order in which to visit the intermediate nodes on the way from A to Z? For example, is it better to go A B C D Z or A D B C Z? I use JGraphT to build the graph. Thanks in advance.
Consider the following graph G, and suppose that during one execution of DFS on G the edges are classified as tree edges (t), back edges (b), forward edges (f) and cross edges (c), as shown in the following graph. For each node of the graph, find its discovery time and its finishing time.
In other words, for each node v of the graph, find the values d[v] and f[v] that the DFS algorithm associates with this node.
Notice that there is only one possible assignment of the values d[v] and f[v].
Could you give me a hint how we can find the initial node in order to start applying the Depth first search algorithm?
Look at node a: what could DFS do in node a? It could go either to b or to e. We see that it chose b, because a->b is a tree edge and a->e is a forward edge (check the definition of tree/forward edge). In b the only choice was to visit f. In f DFS could go to a, e or g. We can assume that it tried to visit a (f->a is marked as a back edge, so everything is correct so far), then it visited e and then tried to visit b. However, we now have a problem with edge f->g. It is marked as a cross edge, which means that DFS had already visited g before. Otherwise, this edge would have been marked as a tree edge. So we know that a was not the initial node. We need to try other options. What about c? Again, all of the edges coming out of c are marked as cross, not tree, so c was not the initial node.
What about d? If DFS started in d, it could go from d to g, and that is what happened because d->g is marked as a tree edge. There were no nodes to visit from g, so it backtracked to d and visited h. From h it tried to visit g, but g had already been visited, so h->g is marked as cross, which is correct. Great, so d was the initial node for this DFS execution. After visiting the part of the graph that contains d, g and h, DFS could restart either from a or from c, but we already know that it did not restart from c because of those cross edges. So it restarted from a, and after visiting b, f and e it restarted from c.
Tree edges should form a forest. A node at which the DFS could have started is a node that has no incoming tree edges.
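To check a candidate start order, one can run DFS with timestamps and compare the resulting edge classification against the given labels. Below is a minimal sketch; the adjacency lists are an assumption reconstructed from the edges mentioned above (the original figure is not available, and the two edges leaving c are invented), and the names `dfs_classify`, `disc` and `fin` are illustrative.

```python
def dfs_classify(graph, roots):
    """Run DFS from the given roots in order; return discovery times,
    finishing times and the kind of every edge."""
    disc, fin, kind = {}, {}, {}
    clock = 0

    def visit(u):
        nonlocal clock
        clock += 1
        disc[u] = clock
        for v in graph[u]:
            if v not in disc:
                kind[(u, v)] = "tree"
                visit(v)
            elif v not in fin:
                kind[(u, v)] = "back"      # v is still on the recursion stack
            elif disc[u] < disc[v]:
                kind[(u, v)] = "forward"   # v is a finished descendant of u
            else:
                kind[(u, v)] = "cross"     # v finished before u was discovered
        clock += 1
        fin[u] = clock

    for r in roots:
        if r not in disc:
            visit(r)
    return disc, fin, kind

# Assumed graph; neighbours are listed in the order DFS is taken to try them.
graph = {
    "a": ["b", "e"], "b": ["f"], "f": ["a", "e", "g"], "e": ["b"],
    "d": ["g", "h"], "g": [], "h": ["g"],
    "c": ["d", "g"],  # invented edges, only to give c some cross edges
}
disc, fin, kind = dfs_classify(graph, roots=["d", "a", "c"])
for v in sorted(graph):
    print(v, disc[v], fin[v])
```

With the start order d, a, c this reproduces the labels discussed above: f->g and h->g come out as cross edges and a->e as a forward edge.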
I know the answer to this particular question is O(V + E) and for a Graph like a tree, it makes sense because each Vertex is being explored once only.
However let's say there is a cycle in the graph.
For example, let's take up an undirected graph with four vertices A-B-C-D.
A is connected to both B and C, and both B and C are connected to D. So there are four edges in total: A-B, A-C, B-D and C-D, each usable in both directions.
Let's do DFS(A).
It will explore B first and B's neighbor D and D's neighbor C. After that C will not have any edges so it will come back to D and B and then A.
Then A will traverse its second edge and try to explore C and since it is already explored it will not do anything and DFS will end.
But over here Vertex "C" has been traversed twice, not once. Clearly worst case time complexity can be directly proportional to V.
Any ideas?
If you do not maintain a visited set that you use to avoid revisiting already visited nodes, DFS is not O(V+E). In fact, it is not a complete algorithm: it might not find a path even if one exists, because it can get stuck in an infinite loop.
Note that for infinite graphs, if you are looking for a path from s to t, even with maintaining a visited set, it is not guaranteed to complete, since you might get stuck in an infinite branch.
If you are interested in keeping DFS's advantage of efficient space consumption, while still being complete - you might use iterative deepening DFS, but it will not trivially solve the problem if you are looking to discover the whole graph, and not a path to a specific node.
EDIT: DFS pseudo code with visited set.
DFS(v,visited):
    for each u such that (v,u) is an edge:
        if (u is not in visited):
            visited.add(u)
            DFS(u,visited)
It is easy to see that you invoke the recursion on a vertex if and only if it is not yet visited, thus the answer is indeed linear in the number of vertices and edges.
You can visit each vertex and edge of the graph a constant number of times and still be O(V+E). An alternative way of looking at it is that the cost is charged to the edge, not to the vertex.
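The counting argument can be checked on the four-vertex graph from the question. The sketch below is a runnable version of the pseudocode with two counters added; the name `dfs_counts` and the counters are illustrative additions, not part of the original answer.

```python
def dfs_counts(adj, start):
    """DFS with a visited set; returns (recursive calls, edge examinations)."""
    visited = {start}
    calls = 0
    edge_checks = 0

    def dfs(v):
        nonlocal calls, edge_checks
        calls += 1
        for u in adj[v]:
            edge_checks += 1          # work charged to the edge (v, u)
            if u not in visited:
                visited.add(u)
                dfs(u)

    dfs(start)
    return calls, edge_checks

# Undirected graph from the question: A-B, A-C, B-D, C-D.
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
calls, edge_checks = dfs_counts(adj, "A")
print(calls, edge_checks)
```

Each vertex gets exactly one recursive call, and each undirected edge is examined once from each endpoint. Looking at C a second time from A is one of those edge examinations, not a second exploration of C.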
What if the only negative edge costs are coming from the initial node? Will the algorithm still work?
I feel like yes, because I can't think of a counter-example, but I'm having trouble proving it. Is there a counter-example?
Negative edges are a problem for Dijkstra's because there's no guarantee that the edge you pick produces the shortest path if there is an edge you can pick later that is largely negatively weighted. But if the only negative edges are coming out of the initial node, I don't see the problem.
I'm not looking for an algorithm. I'm looking for some insight into Dijkstra's algorithm.
I'm talking about a directed graph, if that makes a difference.
The trouble with having a negative-cost edge is that you can go back and forth along it as many times as you like.
If you impose a rule that an edge may not be used more than once, you still have a problem. Dijkstra's algorithm involves marking a node as "visited" when its distance from the initial node is considered known once and for all. This happens before all of the edges have been examined: once the shortest path from the initial node to node X has been found, all other paths from the initial node are already longer than that, and nothing that is discovered later can make those paths shorter. But if there are negative-cost edges somewhere, then a later discovery can make a path shorter, so it may be that a shorter path exists which Dijkstra will not discover.
If only the edges that connect to the initial node may have negative costs, then you still have a problem, because the shortest path might involve revisiting the initial node to take advantage of the negative costs, something Dijkstra cannot do.
If you impose another rule that a node may not be visited more than once, then Dijkstra's algorithm works. Notice that in Dijkstra's algorithm, the initial node is given an initial distance of zero. If you give it some other initial distance, the algorithm will still find the shortest path-- but all of the distances will be off by that same amount. (If you want the real distance at the end, you must subtract the value you put in.)
So take your graph, call it A, and find the smallest cost of any edge connected to the initial node; call it k (which will be negative in this case).
Make a new graph B which you get by subtracting k from the cost of each edge connected to the initial node. Note that all of these costs are now non-negative. So Dijkstra works on B. Also note that the shortest path in B is also the shortest path in A.
Assign the initial node of B the distance k, then run Dijkstra (this will give the same path as running with an initial distance of zero). Compare this to running Dijkstra naively on A: once you leave the initial node, everything is the same in the two graphs. The distances are the same, the decisions are the same, and the two will produce the same path. And in the case of A the distance will be correct, since it started at zero.
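The shifting argument can be tried out on a small example. This is a sketch under assumptions: the graph, its weights and the `start_dist` parameter are made up for illustration, the graph is directed, and no edge leads back into the initial node S (so S is never revisited).

```python
import heapq

def dijkstra(graph, source, start_dist=0):
    """Textbook Dijkstra: a node's distance is final once it is popped."""
    dist = {source: start_dist}
    done = set()
    heap = [(start_dist, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph.get(u, []):
            if v not in done and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Graph A: the only negative edges leave the initial node S.
graph_a = {"S": [("X", -3), ("Y", -1)], "X": [("Y", 1), ("T", 5)], "Y": [("T", 2)]}
k = min(w for _, w in graph_a["S"])  # smallest cost on an edge out of S
# Graph B: subtract k from every edge out of S, making all costs non-negative.
graph_b = dict(graph_a, S=[(v, w - k) for v, w in graph_a["S"]])

dist_a = dijkstra(graph_a, "S")                # naive run on A, starting at 0
dist_b = dijkstra(graph_b, "S", start_dist=k)  # run on B, starting at k
print(dist_a)
print(dist_b)
```

Apart from the entry for S itself, the two runs report the same distances, which is what the argument above predicts.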
Counter-example:
Graph G = (V, E), with vertices V = {A, B}, edges E = {(A, B), (B, A)} and weight function w(A, B) = -2, w(B, A) = +1.
There's a negative weight cycle, hence minimum distances are undefined (even using A as initial node).
Dijkstra's algorithm doesn't produce the correct answer for a graph with negative edge weights (even if the graph doesn't have any negative weight cycle). For example, it computes an incorrect shortest path value from A to C for the following graph with source vertex A:
A -> B : 6
A -> C : 5
B -> D : 2
B -> E : 1
D -> E : -5
E -> C : -2
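The edge list above can be checked with a short script. The sketch assumes the textbook form of Dijkstra's algorithm, in which a node is never updated after it has been settled, and uses Bellman-Ford as a reference that tolerates negative edges. The function names are illustrative.

```python
import heapq

EDGES = [("A", "B", 6), ("A", "C", 5), ("B", "D", 2),
         ("B", "E", 1), ("D", "E", -5), ("E", "C", -2)]
NODES = ["A", "B", "C", "D", "E"]

def dijkstra(edges, source):
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
    dist, done, heap = {source: 0}, set(), [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)  # distance of u is treated as final from here on
        for v, w in adj.get(u, []):
            if v not in done and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def bellman_ford(edges, nodes, source):
    dist = {n: float("inf") for n in nodes}
    dist[source] = 0
    for _ in range(len(nodes) - 1):  # relax every edge |V| - 1 times
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist

d_dijkstra = dijkstra(EDGES, "A")
d_reference = bellman_ford(EDGES, NODES, "A")
print(d_dijkstra["C"], d_reference["C"])
```

Dijkstra settles C at distance 5 through the direct edge before it ever sees D -> E, while the true shortest path A -> B -> D -> E -> C costs 6 + 2 - 5 - 2 = 1.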