Can't we find Shortest Path by DFS(Modified DFS) in an unweighted Graph? and if not then Why? - algorithm

It is said that DFS can't be used to find the shortest path in the unweighted graph. I have read multiple post and blogs but not get satisfied as a little modification in DFS can make it possible.
I think if we use Modified DFS in this way, then we can find the shortest distances from the source.
Initialise a array of distances from root with infinity and distance of root from itself as 0.
While traversing, we keep track of no. of edges. On moving forward increment no. of edges and while back track decrement no. of edges. And each time check if(dist(v) > dist(u) + 1 ) then dist(v) = dist(u) + 1.
In this way we can find the shortest distances from the root using DFS. And in this way, we can find it in O(V+E) instead of O(ElogV) by Dijkstra.
If I am wrong at some point. Please tell me.

Yes, if the DFS algorithm is modified in the way you mentioned, it can be used to find the shortest paths from a root in an unweighted graph. The problem is that in modifying the algorithm you have fundamentally changed what it is.
It may seem like I am exaggerating as the change looks minor superficially but it changes it more than you might think.
Consider a graph with n nodes numbered 1 to n. Let there be an edge between each k and k + 1. Also, let 1 be connected to every node.
Since DFS can pick adjacent neighbors in any order, let's also assume that this algorithm always picks them in increasing numerical order.
Now try running algorithm in your head or your computer with root 1.
First the algorithm will reach n in n-1 steps using edges between 1-2, 2-3 and so on. Then after backtracking, the algorithm moves on to the second neighbor of 1, namely 3. This time there will be n-2 steps.
The same process will repeat until the algorithm finally sees 1-n.
The algorithm will need O(n ^ 2) rather than O(n) steps to finish. Remember that V = n & E = 2 * n - 3. So it is not O(V + E).
Actually, the algorithm you have described will always finish in O(V^2) on unweighted graphs. I will leave the proof of this claim as an exercise for the reader.
O(V^2) is not that bad. Especially if a graph is dense. But since BFS already provides an answer in O(V + E), nobody uses DFS for shortest distance calculation.

In an unweighted graph, you can use a breadth-first search (not DFS) to find shortest paths in O(E) time.
In fact, if all edges have the same weight, then Dijkstra's algorithm and breadth-first search are pretty much equivalent -- reduceKey() is never called, and the priority queue can be replaced with a FIFO queue, since newly added vertices never have smaller weight than previously-added ones.
Your modification to DFS does not work, because once you visit a vertex, you will not examine its children again, even if its weight changes. You will get the wrong answer for this graph if you follow S->A before S->B
S---->A---->C---->D---->E---->T
\ /
------->B-----/

The way Depth First Search on graphs is defined, it only visits each node once. When it encounters a node that was visited before, it backtracks.
So assume you have a triangle with nodes A, B, C and you want to find the shortest path from A to B. One possibility of a DFS traversal is A -> C -> B and you are done. This however is not the shortest path.

Related

Weighted Directed Graph best method for shortest path

For a question I was doing I'm confused about why the answer would be a BFS and not Dijkstra's algorithm.
The question was : There is a weighted digraph G=(V,E) with n nodes and m edges. Each node has a weight of 1 or 2. The question was to figure out which algorithm to use to find the shortest path in G from a given vetex u to a given vertex v. The options were:
a) O(n+m) time using a modified BFS
b) O(n+m) time using a modified DFS
c) O(mlogn) time using Dijkstra's Algorithm
d) O(n^3) time using modified Floyd-Warshall algorithm
The answer is a) O(n+m) time using a modified BFS,
I know that when comparing BFS to DFS, BFS is better for shorter paths. I also know Dijkstra's algorithm is similar to a BFS and if I'm not mistaken Dijkstra's algorithm is better for weighted graphs like in this case. I'm assuming BFS is better because it says modified BFS but what would modified exactly mean or is there another reason BFS would be better.
Since all paths are limited to either a distance of 1 or 2, for every edge of length 2 from nodes a to b you can just create a new node c with an edge from a to c of length 1 and an edge from c to b of length 1, and then this becomes a graph with only edges of weight 1 which can be BFS'd normally to find shortest path from u to v. Since you only add O(m) new nodes and O(m) new edges, this keeps the BFS's time complexity of O(n+m).
Another possibility is to, at each layer of BFS, store another list of nodes that are attained by edges with a weight of 2 from the current layer, and consider them at the same time as nodes attained two layers later. This approach is a bit more finicky though.

Find all cyclic paths in a directed graph

The title is self explanatory. Here's a solution that I found in the internet that can help do this. Here's the link
I don't understand why not visiting a vertex having weight below the given threshold will solve the problem.
Additionally, I have no idea how to solve this using/not using this.
Let's restrict this to simple cycles - those which contain no subcycles. For each node in the graph, begin a depth-first search for that node. record each branch of the recursion tree which results in a match. While searching, never cross over nodes already traversed in the branch.
Consider the complete directed graph on n vertices. There are n(n-1) arcs and n! simple cycles of length n. The algorithm above isn't much worse than this at all. Simply constructing a new copy of the answer would take nearly as much time as running the above algorithm to do it, in the worst case at least.
If you want to find cycles in a directed (or even undirected) graph there is an intuitive way to do it:
For each edge (u, v) in the graph
1. Temporarily ignore the edge (u, v) in step 2
2. Run an algorithm to find all paths from v to u (using a backtrackig algorithm)
3. Output the computed paths in step 2 along with the edge (u, v) as cycles in the graph
Note that you will get duplicate cycles this way since a cycle of length k will be found k times.
You can play with this idea to find cycles with specific properties, as well. For example, if you are aiming to find the shortest weighted cycle in the graph instead of finding all cycles. You can use a Dijkstra in step 2, and take the minimum over all the cycles you find. If you wanna finding the cycle with the least number of edges you can use a BFS in step 2.
If you are more struggling with finding all paths in a graph, this question might help you. Although it's for a slightly different problem.
Counting/finding paths with backtracking

graph - How to find Minimum Directed Cycle (minimum total weight)?

Here is an excise:
Let G be a weighted directed graph with n vertices and m edges, where all edges have positive weight. A directed cycle is a directed path that starts and ends at the same vertex and contains at least one edge. Give an O(n^3) algorithm to find a directed cycle in G of minimum total weight. Partial credit will be given for an O((n^2)*m) algorithm.
Here is my algorithm.
I do a DFS. Each time when I find a back edge, I know I've got a directed cycle.
Then I will temporarily go backwards along the parent array (until I travel through all vertices in the cycle) and calculate the total weights.
Then I compare the total weight of this cycle with min. min always takes the minimum total weights. After the DFS finishes, our minimum directed cycle is also found.
Ok, then about the time complexity.
To be honest, I don't know the time complexity of my algorithm.
For DFS, the traversal takes O(m+n) (if m is the number of edges, and n is the number of vertices). For each vertex, it might point back to one of its ancestors and thus forms a cycle. When a cycle is found, it takes O(n) to summarise the total weights.
So I think the total time is O(m+n*n). But obviously it is wrong, as stated in the excise the optimal time is O(n^3) and the normal time is O(m*n^2).
Can anyone help me with:
Is my algorithm correct?
What is the time complexity if my algorithm is correct?
Is there any better algorithm for this problem?
You can use Floyd-Warshall algorithm here.
The Floyd-Warshall algorithm finds shortest path between all pairs of vertices.
The algorithm is then very simple, go over all pairs (u,v), and find the pair that minimized dist(u,v)+dist(v,u), since this pair indicates on a cycle from u to u with weight dist(u,v)+dist(v,u). If the graph also allows self-loops (an edge (u,u)) , you will also need to check them alone, because those cycles (and only them) were not checked by the algorithm.
pseudo code:
run Floyd Warshall on the graph
min <- infinity
vertex <- None
for each pair of vertices u,v
if (dist(u,v) + dist(v,u) < min):
min <- dist(u,v) + dist(v,u)
pair <- (u,v)
return path(u,v) + path(v,u)
path(u,v) + path(v,u) is actually the path found from u to v and then from v to u, which is a cycle.
The algorithm run time is O(n^3), since floyd-warshall is the bottle neck, since the loop takes O(n^2) time.
I think correctness in here is trivial, but let me know if you disagree with me and I'll try to explain it better.
Is my algorithm correct?
No. Let me give a counter example. Imagine you start DFS from u, there are two paths p1 and p2 from u to v and 1 path p3 from v back to u, p1 is shorter than p2.
Assume you start by taking the p2 path to v, and walk back to u by path p3. One cycle found but apparently it's not minimum. Then you continue exploring u by taking the p1 path, but since v is fully explored, the DFS ends without finding the minimum cycle.
"For each vertex, it might point back to one of its ancestors and thus forms a cycle"
I think it might point back to any of its ancestors which means N
Also, how are u going to mark vertexes when you came out of its dfs, you may come there again from other vertex and its going to be another cycle. So this is not (n+m) dfs anymore.
So ur algo is incomplete
same here
3.
During one dfs, I think the vertex should be either unseen, or check, and for checked u can store the minimum weight for the path to the starting vertex. So if on some other stage u find an edge to that vertex u don't have to search for this path any more.
This dfs will find the minimum directed cycle containing first vertex. and it's O(n^2) (O(n+m) if u store the graph as list)
So if to do it from any other vertex its gonna be O(n^3) (O(n*(n+m))
Sorry, for my english and I'm not good at terminology
I did a similar kind of thing but i did not use any visited array for dfs (which was needed for my algorithm to work correctly) and hence i realised that my algorithm was of exponential complexity.
Since, you are finding all cycles it is not possible to find all cycles in less than exponential time since there can be 2^(e-v+1) cycles.

Negative weights using Dijkstra's Algorithm

I am trying to understand why Dijkstra's algorithm will not work with negative weights. Reading an example on Shortest Paths, I am trying to figure out the following scenario:
2
A-------B
\ /
3 \ / -2
\ /
C
From the website:
Assuming the edges are all directed from left to right, If we start
with A, Dijkstra's algorithm will choose the edge (A,x) minimizing
d(A,A)+length(edge), namely (A,B). It then sets d(A,B)=2 and chooses
another edge (y,C) minimizing d(A,y)+d(y,C); the only choice is (A,C)
and it sets d(A,C)=3. But it never finds the shortest path from A to
B, via C, with total length 1.
I can not understand why using the following implementation of Dijkstra, d[B] will not be updated to 1 (When the algorithm reaches vertex C, it will run a relax on B, see that the d[B] equals to 2, and therefore update its value to 1).
Dijkstra(G, w, s) {
Initialize-Single-Source(G, s)
S ← Ø
Q ← V[G]//priority queue by d[v]
while Q ≠ Ø do
u ← Extract-Min(Q)
S ← S U {u}
for each vertex v in Adj[u] do
Relax(u, v)
}
Initialize-Single-Source(G, s) {
for each vertex v  V(G)
d[v] ← ∞
π[v] ← NIL
d[s] ← 0
}
Relax(u, v) {
//update only if we found a strictly shortest path
if d[v] > d[u] + w(u,v)
d[v] ← d[u] + w(u,v)
π[v] ← u
Update(Q, v)
}
Thanks,
Meir
The algorithm you have suggested will indeed find the shortest path in this graph, but not all graphs in general. For example, consider this graph:
Let's trace through the execution of your algorithm.
First, you set d(A) to 0 and the other distances to ∞.
You then expand out node A, setting d(B) to 1, d(C) to 0, and d(D) to 99.
Next, you expand out C, with no net changes.
You then expand out B, which has no effect.
Finally, you expand D, which changes d(B) to -201.
Notice that at the end of this, though, that d(C) is still 0, even though the shortest path to C has length -200. This means that your algorithm doesn't compute the correct distances to all the nodes. Moreover, even if you were to store back pointers saying how to get from each node to the start node A, you'd end taking the wrong path back from C to A.
The reason for this is that Dijkstra's algorithm (and your algorithm) are greedy algorithms that assume that once they've computed the distance to some node, the distance found must be the optimal distance. In other words, the algorithm doesn't allow itself to take the distance of a node it has expanded and change what that distance is. In the case of negative edges, your algorithm, and Dijkstra's algorithm, can be "surprised" by seeing a negative-cost edge that would indeed decrease the cost of the best path from the starting node to some other node.
Note, that Dijkstra works even for negative weights, if the Graph has no negative cycles, i.e. cycles whose summed up weight is less than zero.
Of course one might ask, why in the example made by templatetypedef Dijkstra fails even though there are no negative cycles, infact not even cycles. That is because he is using another stop criterion, that holds the algorithm as soon as the target node is reached (or all nodes have been settled once, he did not specify that exactly). In a graph without negative weights this works fine.
If one is using the alternative stop criterion, which stops the algorithm when the priority-queue (heap) runs empty (this stop criterion was also used in the question), then dijkstra will find the correct distance even for graphs with negative weights but without negative cycles.
However, in this case, the asymptotic time bound of dijkstra for graphs without negative cycles is lost. This is because a previously settled node can be reinserted into the heap when a better distance is found due to negative weights. This property is called label correcting.
TL;DR: The answer depends on your implementation. For the pseudo code you posted, it works with negative weights.
Variants of Dijkstra's Algorithm
The key is there are 3 kinds of implementation of Dijkstra's algorithm, but all the answers under this question ignore the differences among these variants.
Using a nested for-loop to relax vertices. This is the easiest way to implement Dijkstra's algorithm. The time complexity is O(V^2).
Priority-queue/heap based implementation + NO re-entrance allowed, where re-entrance means a relaxed vertex can be pushed into the priority-queue again to be relaxed again later.
Priority-queue/heap based implementation + re-entrance allowed.
Version 1 & 2 will fail on graphs with negative weights (if you get the correct answer in such cases, it is just a coincidence), but version 3 still works.
The pseudo code posted under the original problem is the version 3 above, so it works with negative weights.
Here is a good reference from Algorithm (4th edition), which says (and contains the java implementation of version 2 & 3 I mentioned above):
Q. Does Dijkstra's algorithm work with negative weights?
A. Yes and no. There are two shortest paths algorithms known as Dijkstra's algorithm, depending on whether a vertex can be enqueued on the priority queue more than once. When the weights are nonnegative, the two versions coincide (as no vertex will be enqueued more than once). The version implemented in DijkstraSP.java (which allows a vertex to be enqueued more than once) is correct in the presence of negative edge weights (but no negative cycles) but its running time is exponential in the worst case. (We note that DijkstraSP.java throws an exception if the edge-weighted digraph has an edge with a negative weight, so that a programmer is not surprised by this exponential behavior.) If we modify DijkstraSP.java so that a vertex cannot be enqueued more than once (e.g., using a marked[] array to mark those vertices that have been relaxed), then the algorithm is guaranteed to run in E log V time but it may yield incorrect results when there are edges with negative weights.
For more implementation details and the connection of version 3 with Bellman-Ford algorithm, please see this answer from zhihu. It is also my answer (but in Chinese). Currently I don't have time to translate it into English. I really appreciate it if someone could do this and edit this answer on stackoverflow.
you did not use S anywhere in your algorithm (besides modifying it). the idea of dijkstra is once a vertex is on S, it will not be modified ever again. in this case, once B is inside S, you will not reach it again via C.
this fact ensures the complexity of O(E+VlogV) [otherwise, you will repeat edges more then once, and vertices more then once]
in other words, the algorithm you posted, might not be in O(E+VlogV), as promised by dijkstra's algorithm.
Since Dijkstra is a Greedy approach, once a vertice is marked as visited for this loop, it would never be reevaluated again even if there's another path with less cost to reach it later on. And such issue could only happen when negative edges exist in the graph.
A greedy algorithm, as the name suggests, always makes the choice that seems to be the best at that moment. Assume that you have an objective function that needs to be optimized (either maximized or minimized) at a given point. A Greedy algorithm makes greedy choices at each step to ensure that the objective function is optimized. The Greedy algorithm has only one shot to compute the optimal solution so that it never goes back and reverses the decision.
Consider what happens if you go back and forth between B and C...voila
(relevant only if the graph is not directed)
Edited:
I believe the problem has to do with the fact that the path with AC* can only be better than AB with the existence of negative weight edges, so it doesn't matter where you go after AC, with the assumption of non-negative weight edges it is impossible to find a path better than AB once you chose to reach B after going AC.
"2) Can we use Dijksra’s algorithm for shortest paths for graphs with negative weights – one idea can be, calculate the minimum weight value, add a positive value (equal to absolute value of minimum weight value) to all weights and run the Dijksra’s algorithm for the modified graph. Will this algorithm work?"
This absolutely doesn't work unless all shortest paths have same length. For example given a shortest path of length two edges, and after adding absolute value to each edge, then the total path cost is increased by 2 * |max negative weight|. On the other hand another path of length three edges, so the path cost is increased by 3 * |max negative weight|. Hence, all distinct paths are increased by different amounts.
You can use dijkstra's algorithm with negative edges not including negative cycle, but you must allow a vertex can be visited multiple times and that version will lose it's fast time complexity.
In that case practically I've seen it's better to use SPFA algorithm which have normal queue and can handle negative edges.
I will be just combining all of the comments to give a better understanding of this problem.
There can be two ways of using Dijkstra's algorithms :
Marking the nodes that have already found the minimum distance from the source (faster algorithm since we won't be revisiting nodes whose shortest path have been found already)
Not marking the nodes that have already found the minimum distance from the source (a bit slower than the above)
Now the question arises, what if we don't mark the nodes so that we can find shortest path including those containing negative weights ?
The answer is simple. Consider a case when you only have negative weights in the graph:
)
Now, if you start from the node 0 (Source), you will have steps as (here I'm not marking the nodes):
0->0 as 0, 0->1 as inf , 0->2 as inf in the beginning
0->1 as -1
0->2 as -5
0->0 as -8 (since we are not relaxing nodes)
0->1 as -9 .. and so on
This loop will go on forever, therefore Dijkstra's algorithm fails to find the minimum distance in case of negative weights (considering all the cases).
That's why Bellman Ford Algo is used to find the shortest path in case of negative weights, as it will stop the loop in case of negative cycle.

How to compute the critical path of a directional acyclic graph?

What is the best (regarding performance) way to compute the critical path of a directional acyclic graph when the nodes of the graph have weight?
For example, if I have the following structure:
Node A (weight 3)
/ \
Node B (weight 4) Node D (weight 7)
/ \
Node E (weight 2) Node F (weight 3)
The critical path should be A->B->F (total weight: 10)
I would solve this with dynamic programming. To find the maximum cost from S to T:
Topologically sort the nodes of the graph as S = x_0, x_1, ..., x_n = T. (Ignore any nodes that can reach S or be reached from T.)
The maximum cost from S to S is the weight of S.
Assuming you've computed the maximum cost from S to x_i for all i < k, the maximum cost from S to x_k is the cost of x_k plus the maximum cost to any node with an edge to x_k.
I have no clue about "critical paths", but I assume you mean this.
Finding the longest path in an acyclic graph with weights is only possible by traversing the whole tree and then comparing the lengths, as you never really know how the rest of the tree is weighted. You can find more about tree traversal at Wikipedia. I suggest, you go with pre-order traversal, as it's easy and straight forward to implement.
If you're going to query often, you may also wish to augment the edges between the nodes with information about the weight of their subtrees at insertion. This is relatively cheap, while repeated traversal can be extremely expensive.
But there's nothing to really save you from a full traversal if you don't do it. The order doesn't really matter, as long as you do a traversal and never go the same path twice.
There's a paper that purports to have an algorithm for this: "Critical path in an activity network with time constraints". Sadly, I couldn't find a link to a free copy. Short of that, I can only second the idea of modifying http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm or http://en.wikipedia.org/wiki/A*
UPDATE: I apologize for the crappy formatting—the server-side markdown engine is apparently broken.
My first answer so please excuse for any non-standard thing by the culture of stackoverflow.
I think the solution is simple. Just negate the weights and run the classic shortest path for DAG (modified for weights of vertices of course). It should run fairly fast. (Time complexity of O(V+E) maybe)
I think it should work as when you will negate the weights, the biggest one will become smallest, second biggest will be second smallest and so on as if a > b then -a < -b. Then running DAG should suffice as it will find the solution for the smallest path of the negated one and thus finding longest path for the original one
Try the A* method.
A* Search Algorithm
At the end, to deal with the leaves, just make all of them lead on to a final point, to set as the goal.

Resources