Finding length of shortest cycle in undirected graph - algorithm

I tried the following:
1) DFS, keeping track of the level of each vertex in my DFS tree
2) Each time a back edge (x, y) is seen, I calculate cycle length = level[x] - level[y] + 1, and save it if it is smaller than the shortest found so far
Can someone give a counterexample for which this approach is wrong?
What would be a better way to find the shortest cycle in an undirected graph?
Thanks.

Why DFS won't work
You cannot use DFS to find a shortest cycle. We can easily create a counterexample where DFS finds only the longest cycle. Let's have a look at the following graph:
As you can see, we have nine nodes. If we start at the leftmost node A, the following DFS levels are possible:
We have two back edges while iterating:
(B, A), therefore we found a cycle of length 8
(D, A), therefore we found a cycle of length 8
However, the shortest cycle has length 5. It's shown in blue in the next picture, whereas one of the previously found cycles is shown in red:
You didn't see the blue cycle because your DFS path doesn't contain it.
Dasgupta et al. also mention this behaviour in their book:
But it also means that DFS can end up taking a long and convoluted route to a vertex that is actually very close by.
Why BFS won't work
Well, that's not entirely true: one can use BFS (see the next subsection), but you cannot use your level formula. Take the following graph:
No fancy picture for this graph yet.
Every "o" is a node.
        o---o
        |   |
+-------o---o-------+
|                   |
o----o----o----o----o
Let's see what levels are possible in BFS. If I start at the node in the middle, I get the following levels:
        5~~~5        (~~~ are back edges)
        |   |
+-------4~~~4-------+
|                   |
3----2----1----2----3
And if I start at the left node, I get the following levels:
        3~~~4
        |   |
+-------2---3-------+
|                   |
1----2----3----4~~~~4
Therefore, you cannot use your level formula.
Solution
Although not efficient, using an all-pairs shortest path algorithm and checking the distance (i, i) for every node is a valid solution.

I think this is what you are looking for : https://web.archive.org/web/20170829175217/http://webcourse.cs.technion.ac.il/234247/Winter2003-2004/ho/WCFiles/Girth.pdf
You run a BFS from each node, so the overall complexity is O(V*E).
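For reference, here is a minimal sketch of that idea (my own illustration, not code taken from the linked handout): run a BFS from every start vertex, and whenever a non-tree edge (u, v) is found, the two tree paths plus that edge close a cycle of length dist[u] + dist[v] + 1. A single candidate can overestimate when the two tree paths share vertices, but the minimum over all start vertices is the girth.

```python
from collections import deque

def girth(adj):
    """Length of the shortest cycle in an undirected graph given as an
    adjacency list {vertex: iterable of neighbours}; None if acyclic.
    One BFS per start vertex -> O(V*E) overall."""
    best = None
    for s in adj:
        dist, parent = {s: 0}, {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:              # tree edge
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    q.append(v)
                elif v != parent[u]:           # non-tree edge closes a cycle
                    cand = dist[u] + dist[v] + 1
                    if best is None or cand < best:
                        best = cand
    return best

# Tiny check: a 3-cycle (1-2-3) sharing an edge with a 4-cycle (1-3-4-5).
example = {1: [2, 3, 5], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [1, 4]}
print(girth(example))  # -> 3
```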

Let's say we have a graph with the following edges:
1<->4,
4<->2,
4<->3,
2<->3,
3<->1
Then the cycle 1, 4, 2, 3, 1 could be traversed before 1, 4, 3, 1, and since we are doing DFS, no node will be visited twice. So if 1, 4, 2, 3, 1 is traversed first, there is no chance that 1, 4, 3, 1 or 4, 2, 3, 4 will be traversed at all. So with DFS it can NOT be assured that we will always find the shortest cycle.
Possible improvement: a BFS tree should work fine, as it goes level by level, and in a BFS tree the distance from the root to any node is fixed no matter in which order the nodes are picked. Runtime: O(V+E) per BFS, while a modified Floyd-Warshall algorithm would run in O(V^3) in the worst case.

Related

Can't we find the shortest path by DFS (modified DFS) in an unweighted graph? And if not, then why?

It is said that DFS can't be used to find the shortest path in an unweighted graph. I have read multiple posts and blogs but am not satisfied, because I think a little modification to DFS can make it possible.
I think that if we use modified DFS in this way, then we can find the shortest distances from the source.
Initialise an array of distances from the root with infinity, and the distance of the root from itself as 0.
While traversing, we keep track of the number of edges used. On moving forward, increment the number of edges; while backtracking, decrement it. Each time we cross an edge (u, v), check: if dist(v) > dist(u) + 1, then set dist(v) = dist(u) + 1.
In this way we can find the shortest distances from the root using DFS, and we can find them in O(V+E) instead of O(E log V) with Dijkstra.
If I am wrong at some point, please tell me.
Yes, if the DFS algorithm is modified in the way you mentioned, it can be used to find the shortest paths from a root in an unweighted graph. The problem is that in modifying the algorithm you have fundamentally changed what it is.
It may seem like I am exaggerating as the change looks minor superficially but it changes it more than you might think.
Consider a graph with n nodes numbered 1 to n. Let there be an edge between each k and k + 1. Also, let 1 be connected to every node.
Since DFS can pick adjacent neighbors in any order, let's also assume that this algorithm always picks them in increasing numerical order.
Now try running the algorithm in your head or on your computer with root 1.
First the algorithm will reach n in n-1 steps using edges between 1-2, 2-3 and so on. Then after backtracking, the algorithm moves on to the second neighbor of 1, namely 3. This time there will be n-2 steps.
The same process will repeat until the algorithm finally sees 1-n.
The algorithm will need O(n^2) rather than O(n) steps to finish. Remember that V = n and E = 2n - 3, so it is not O(V + E).
Actually, the algorithm you have described will always finish in O(V^2) on unweighted graphs. I will leave the proof of this claim as an exercise for the reader.
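To make the quadratic behaviour concrete, here is a small simulation I wrote (assuming the interpretation used in this answer, in which the DFS descends into a neighbour whenever it improves that neighbour's distance); it builds the fan graph described above and counts distance updates:

```python
import sys
sys.setrecursionlimit(100000)

def build_fan(n):
    """Nodes 1..n, an edge k-(k+1) for every k, plus an edge 1-k for every k."""
    adj = {k: set() for k in range(1, n + 1)}
    for k in range(1, n):
        adj[k].add(k + 1)
        adj[k + 1].add(k)
    for k in range(2, n + 1):
        adj[1].add(k)
        adj[k].add(1)
    return {k: sorted(vs) for k, vs in adj.items()}   # increasing numerical order

def modified_dfs(adj, root):
    """DFS that re-descends whenever it finds a shorter distance."""
    dist = {v: float('inf') for v in adj}
    dist[root] = 0
    updates = 0
    def dfs(u):
        nonlocal updates
        for v in adj[u]:
            if dist[v] > dist[u] + 1:
                dist[v] = dist[u] + 1
                updates += 1
                dfs(v)
    dfs(root)
    return dist, updates

n = 200
dist, updates = modified_dfs(build_fan(n), 1)
print(updates)             # n*(n-1)/2 = 19900, i.e. quadratic, not O(V + E)
print(max(dist.values()))  # 1 -- the distances themselves come out correct
```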
O(V^2) is not that bad, especially if the graph is dense. But since BFS already provides an answer in O(V + E), nobody uses DFS for shortest-distance calculation.
In an unweighted graph, you can use a breadth-first search (not DFS) to find shortest paths in O(E) time.
In fact, if all edges have the same weight, then Dijkstra's algorithm and breadth-first search are pretty much equivalent -- reduceKey() is never called, and the priority queue can be replaced with a FIFO queue, since newly added vertices never have smaller weight than previously-added ones.
Your modification to DFS does not work, because once you visit a vertex, you will not examine its children again, even if its distance changes. You will get the wrong answer for the following graph if you follow S->A before S->B:
S---->A---->C---->D---->E---->T
 \                     /
  ------>B------------/
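For contrast, here is a minimal BFS sketch (my own illustration) that returns the correct distance to T on this graph no matter which neighbour of S is explored first:

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest edge-count distances from source in an unweighted graph.
    The FIFO queue plays the role of Dijkstra's priority queue: vertices
    leave it in order of non-decreasing distance."""
    dist = {source: 0}
    q = deque([source])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# The graph above: S->A->C->D->E->T plus the shortcut S->B->E.
graph = {'S': ['A', 'B'], 'A': ['C'], 'B': ['E'], 'C': ['D'],
         'D': ['E'], 'E': ['T'], 'T': []}
print(bfs_distances(graph, 'S')['T'])  # 3, via S->B->E->T
```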
The way Depth First Search on graphs is defined, it only visits each node once. When it encounters a node that was visited before, it backtracks.
So assume you have a triangle with nodes A, B, C and you want to find the shortest path from A to B. One possible DFS traversal is A -> C -> B, and you are done. This, however, is not the shortest path.

Dijkstra's algorithm on directed acyclic graph with negative edges

Will Dijkstra's algorithm work on a graph with negative edges if it is acyclic (a DAG)? I think it would, because since there are no cycles there cannot be a negative loop. Is there any other reason why this algorithm would fail?
Thanks [midterm tomorrow]
Consider the graph (directed 1 -> 2, 2-> 4, 4 -> 3, 1 -> 3, 3 -> 5):
1---(2)---3--(2)--5
|         |
(3)      (2)
|         |
2--(-10)--4
The minimum path is 1 - 2 - 4 - 3 - 5, with cost -3. However, Dijkstra will set d[3] = 2, d[2] = 3 in the first step, then extract node 3 from its priority queue and set d[5] = 4. Since node 3 was extracted from the priority queue, and Dijkstra does not push a given node to its priority queue more than once, it will never end up in it again, so the algorithm won't work.
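To make that concrete, here is a small sketch I wrote of the textbook no-reinsertion Dijkstra run on exactly this graph; it settles node 3 with d[3] = 2 before the -10 edge is ever relaxed, and therefore reports d[5] = 4 instead of -3:

```python
import heapq

def dijkstra(adj, source):
    """Textbook Dijkstra: every vertex is settled exactly once and is
    never pushed or updated again afterwards."""
    dist = {source: 0}
    settled = set()
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:
            continue
        settled.add(u)
        for v, w in adj[u]:
            if v not in settled and d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

# Directed edges from the figure: 1->2 (3), 2->4 (-10), 4->3 (2), 1->3 (2), 3->5 (2)
adj = {1: [(2, 3), (3, 2)], 2: [(4, -10)], 3: [(5, 2)], 4: [(3, 2)], 5: []}
print(dijkstra(adj, 1)[5])  # 4, although the true shortest distance is -3
```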
Dijkstra's algorithm does not work with negative edges, period. The absence of a cycle changes nothing. Bellman-Ford is the one that can detect negative cost cycles and works with negative edges. Dijkstra will not work if you can have negative edges.
If you change Dijkstra's algorithm such that it can push a node to the priority queue more than once, then the algorithm will work with negative cost edges. But it is debatable if the new algorithm is still Dijkstra's: I would say you get Bellman-Ford that way, which is often implemented exactly like that (well, usually a FIFO queue is used and not a priority queue).
I think Dijkstra's algorithm will work on a DAG only if there are no negative weights, because Dijkstra's algorithm can't give the right answer for graphs with negative-weight edges in general (though sometimes it happens to, depending on the particular graph).
A pure implementation of Dijkstra's algorithm will fail whenever there is a negative edge weight. The following variant will still work for the given problem scenario.
Every time an edge u -> v is relaxed, push the pair (new, shorter distance to v from the source; v) into the queue. This can leave more than one copy of the same vertex in the queue, with different distances from the source.
Continue to update the distances until the queue is empty.
The above variant works even if negative edges are present, but not if there is a negative-weight cycle. A DAG is acyclic, so we don't have to worry about negative cycles.
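Here is a sketch of that variant (my own illustration, run on the graph from the answer above): drop the "settled" set and simply push a vertex again whenever its distance improves. It returns the correct -3, at the price of possibly re-processing vertices, so Dijkstra's usual running-time guarantee no longer applies.

```python
import heapq

def relaxed_dijkstra(adj, source):
    """Dijkstra-like search with re-insertion: whenever a shorter distance
    to v is found, (new distance, v) is pushed again, so a vertex may be
    processed several times. Works with negative edges as long as there is
    no negative-weight cycle."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue                        # stale queue entry, skip it
        for v, w in adj[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

# Same graph as above: 1->2 (3), 2->4 (-10), 4->3 (2), 1->3 (2), 3->5 (2)
adj = {1: [(2, 3), (3, 2)], 2: [(4, -10)], 3: [(5, 2)], 4: [(3, 2)], 5: []}
print(sorted(relaxed_dijkstra(adj, 1).items()))
# [(1, 0), (2, 3), (3, -5), (4, -7), (5, -3)]
```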
There is a more efficient way to calculate shortest-path distances for DAGs, in O(V+E) time, using a topological ordering. More details can be found here

What if I do not use G transpose in calculating Strongly Connected Components?

I am reading Introduction to Algorithms. In 22.5 Strongly Connected Component, the algorithm STRONGLY-CONNECTED-COMPONENT(G) is defined as:
Call DFS(G) to compute finishing times u.f for each vertex u
Compute G transpose
Call DFS(G transpose), but in the main loop of DFS, consider the vertices in order of decreasing u.f(as computed in line 1)
Output the vertices of each tree in the depth-first forest formed in line 3 as a separate strongly connected component
If I change the algorithm to just use G, without calculating G transpose, and also consider the vertices in order of increasing u.f (the reverse order of a topological sort):
Call DFS(G) to compute finishing times u.f for each vertex u
Call DFS(G), but in the main loop of DFS, consider the vertices in order of increasing u.f(as computed in line 1)
Output the vertices of each tree in the depth-first forest formed in line 2
Why is this algorithm wrong?
Your question is actually exercise 22.5-3 in the book. A counterexample to the correctness of your alternative algorithm is given here:
http://sites.math.rutgers.edu/~ajl213/CLRS/Ch22.pdf
Professor Bacon's suggestion doesn't work out. As an example, suppose that our graph is on the three vertices {1, 2, 3} and consists of the edges (2, 1), (2, 3), (3, 2). Then we should end up with {2, 3} and {1} as our SCCs. However, a possible DFS starting at 2 could explore 3 before 1; this would mean that the finish time of 3 is lower than that of 1 and 2. This means that the first DFS of the second pass would start at 3. However, a DFS starting at 3 will be able to reach all other vertices. This means that the algorithm would return that the entire graph is a single SCC, even though this is clearly not the case, since there is neither a path from 1 to 2 nor from 1 to 3.
The vertices in a strongly connected component are, by definition, connected to each other (by a path, not necessarily by a direct edge). If you make the first DFS call on vertex X, you find out which vertices X is connected to (X -> N). To make sure that all those vertices are also connected to X (N -> X), and therefore validate strong connectivity, you need to traverse the edges in the reversed direction. The easiest way to do that is by transposing the graph.
If you look for a proof of the algorithm, I am sure you will find one. It may not be the easiest to understand, but check this source for example:
Correctness of the algorithm for finding strongly connected components
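For concreteness, here is a compact sketch (written by me for illustration, not the book's pseudocode) of the two-pass approach described in the question, run on the three-vertex counterexample quoted above; note that the second pass really does walk the transposed edges:

```python
def strongly_connected_components(vertices, edges):
    """Two-pass SCC (CLRS 22.5 / Kosaraju): DFS on G to get the finishing
    order, then DFS on G transpose in decreasing finishing time."""
    adj = {v: [] for v in vertices}
    radj = {v: [] for v in vertices}      # the transpose
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    order, seen = [], set()
    def dfs1(u):                          # pass 1: record finish order on G
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                dfs1(v)
        order.append(u)                   # appended at finish time

    for v in vertices:
        if v not in seen:
            dfs1(v)

    comp = {}
    def dfs2(u, root):                    # pass 2: collect one SCC on G^T
        comp[u] = root
        for v in radj[u]:
            if v not in comp:
                dfs2(v, root)

    for v in reversed(order):             # decreasing finishing time
        if v not in comp:
            dfs2(v, v)

    sccs = {}
    for v, root in comp.items():
        sccs.setdefault(root, set()).add(v)
    return list(sccs.values())

# Counterexample from the quoted solution: edges (2, 1), (2, 3), (3, 2)
print(strongly_connected_components([1, 2, 3], [(2, 1), (2, 3), (3, 2)]))
# -> [{2, 3}, {1}], not one big component
```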

Modified breadth-first search on a graph with edge weights of 2, 3 or 5

Suppose that we are given a directed graph H = (V, E). For each edge e, the weight of the edge, w(e) is either 2, 3 or 5. Modify the BFS so that it will compute the length of the shortest path from a single source vertex s. Explain why your algorithm is correct and determine its worst-case running time (You may assume that H is represented via an adjacency list).
How would you go about this? What makes these specific edge weights different from arbitrary ones?
You can insert imaginary nodes along the edges. So if there is an edge of length 2 between two nodes, you add an intermediate node and connect it to both endpoints with edges of length 1, then use normal breadth-first search. (You do the same for edges of length 3 and 5, adding 2 and 4 intermediate nodes respectively.) Since you only add O(E) nodes, the complexity stays the same.
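A small sketch of that construction (my own, assuming directed edges with weights only from {2, 3, 5}): every edge of weight w is replaced by a chain through w - 1 fresh dummy nodes, and a plain BFS on the expanded graph then yields the weighted distances.

```python
from collections import deque, defaultdict

def shortest_paths_235(n, edges, source):
    """edges: list of (u, v, w) directed edges with w in {2, 3, 5};
    real vertices are 0..n-1. Subdivide each edge into w unit edges by
    inserting w-1 dummy nodes, then run ordinary BFS."""
    adj = defaultdict(list)
    next_id = n                                  # ids for dummy nodes
    for u, v, w in edges:
        prev = u
        for _ in range(w - 1):                   # w-1 dummies -> w unit edges
            adj[prev].append(next_id)
            prev = next_id
            next_id += 1
        adj[prev].append(v)

    dist = {source: 0}
    q = deque([source])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                q.append(y)
    return {v: dist.get(v) for v in range(n)}    # report only real vertices

# Example: 0->1 (2), 1->2 (3), 0->2 (5); both routes to vertex 2 cost 5.
print(shortest_paths_235(3, [(0, 1, 2), (1, 2, 3), (0, 2, 5)], 0))
# -> {0: 0, 1: 2, 2: 5}
```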

Dijkstra and Negative Edges

I'm having trouble understanding why Dijkstra's algorithm does not work on acyclic directed graphs with negative edges. As I understand it, Dijkstra does a breadth-first traversal of the graph, relaxing when appropriate. For instance, consider the graph:
S->A (3)
S->B (4)
B->A (-2)
Where S is the source node. This is how I imagine it working:
1) Mark A with a distance of 3. Mark B with a distance of 4.
2) Recurse on A. Since A points to no nodes, do nothing.
3) Recurse on B. Since B points to A, check to see if B's distance + B->A is less than the current distance of A. 2 < 3, so mark A with a distance of 2.
Yet apparently this is not how it works, as the book I use gives this very graph to show why negatives DON'T work. I cannot follow the book's explanation. How would Dijkstra work on this graph and why would they not use the method I am imagining?
The problem is that once you have processed a node, you cannot afterwards update its distance, since that would require recursive updates and would throw off the whole thing (read: it would go against the algorithm's assumption that nodes are processed in order of monotonically increasing distance to the source; see the proof of correctness for the algorithm to see where that is required). So once A has been processed, you can't later change its distance, which means you can't have negative edges, since they might give you shorter distances to previously processed nodes. The assumption of monotonically increasing distances is why you mark nodes black once they have been processed, and you disregard black nodes afterwards. So even though in that graph A would have a distance of 2 to S, Dijkstra's algorithm would give you a distance of 3, since it disregards any edges leading towards A after A was processed.
EDIT: Here's what Dijkstra's algorithm would do:
1) Mark A with a distance of 3, put it into the queue of nodes awaiting processing; Mark B with a distance of 4, put it into the queue.
2) Take A out of the queue since it's at the front. Since A points to no nodes, don't update any distances, don't add anything to the queue. Mark A as processed.
3) Take B out of the queue. B points to A, but A is marked as already processed; ignore the edge B->A. Since there are no more outgoing edges from B, we're done.
EDIT 2:
Regarding DAGs, you don't need Dijkstra's algorithm at all. DAGs always have a topological ordering, which can be calculated in O(|V| + |E|). Processing the vertices in topological order, using d(w) = min{d(w), d(v) + c(v, w)} as the rule for updating distances, where d(v) is the distance of vertex v from the source and c(v, w) is the length of edge (v, w), will give you the correct distances, again in O(|V| + |E|). Altogether you have two steps, each requiring O(|V| + |E|), so that is also the total complexity of calculating single-source shortest paths in DAGs with arbitrary edge lengths.
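For illustration, a compact sketch of that DAG approach (mine, not from the answer): topologically sort, then relax each vertex's outgoing edges once in topological order; negative edge weights are no problem because every predecessor of a vertex is finalized before the vertex itself is relaxed. Run on the S->A, S->B, B->A graph from the question (with S=0, A=1, B=2), it gives A the correct distance 2.

```python
from collections import defaultdict, deque

def dag_shortest_paths(n, edges, source):
    """Single-source shortest paths in a DAG in O(V + E).
    edges: list of (v, w, cost) directed edges; vertices are 0..n-1."""
    adj = defaultdict(list)
    indeg = [0] * n
    for v, w, c in edges:
        adj[v].append((w, c))
        indeg[w] += 1

    # Kahn's algorithm for a topological order, O(V + E)
    q = deque(v for v in range(n) if indeg[v] == 0)
    topo = []
    while q:
        v = q.popleft()
        topo.append(v)
        for w, _ in adj[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                q.append(w)

    INF = float('inf')
    d = [INF] * n
    d[source] = 0
    for v in topo:                        # relax in topological order
        if d[v] == INF:
            continue
        for w, c in adj[v]:
            d[w] = min(d[w], d[v] + c)    # d(w) = min{d(w), d(v) + c(v, w)}
    return d

# S->A (3), S->B (4), B->A (-2), with S=0, A=1, B=2:
print(dag_shortest_paths(3, [(0, 1, 3), (0, 2, 4), (2, 1, -2)], 0))  # [0, 2, 4]
```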

Resources