Worst Case Time Complexity of Depth First Search - algorithm

I know the answer to this particular question is O(V + E), and for a graph like a tree it makes sense, because each vertex is explored only once.
However let's say there is a cycle in the graph.
For example, let's take up an undirected graph with four vertices A-B-C-D.
A is connected to both B and C, and both B and C are connected to D. So there are four edges in total: A-B, A-C, B-D, C-D, each traversable in both directions.
Let's do DFS(A).
It will explore B first, then B's neighbor D, then D's neighbor C. After that, C has no unexplored edges, so the search backtracks to D, then B, then A.
Then A traverses its second edge and tries to explore C; since C is already explored, it does nothing and the DFS ends.
But here vertex "C" has been examined twice, not once. So clearly the worst-case time complexity cannot be directly proportional to V alone.
Any ideas?

If you do not maintain a visited set that you use to avoid revisiting already-visited nodes, DFS is not O(V+E). In fact, it is not a complete algorithm -- it might not even find a path when one exists, because it can get stuck in an infinite loop.
Note that for infinite graphs, if you are looking for a path from s to t, even with maintaining a visited set, it is not guaranteed to complete, since you might get stuck in an infinite branch.
If you are interested in keeping DFS's advantage of efficient space consumption while still being complete, you might use iterative deepening DFS. But that will not trivially solve the problem if you want to discover the whole graph rather than find a path to a specific node.
EDIT: DFS pseudo code with a visited set.
DFS(v, visited):
    visited.add(v)
    for each u such that (v,u) is an edge:
        if u is not in visited:
            DFS(u, visited)
It is easy to see that you invoke the recursion on a vertex if and only if it is not yet visited, thus the answer is indeed linear in the number of vertices and edges.
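As a runnable counterpart to the pseudo code, here is a minimal Python sketch (the adjacency-list dict encoding is my own assumption for illustration):

```python
def dfs(v, adj, visited=None):
    """Explores every vertex reachable from v exactly once: O(V + E).
    adj maps each vertex to a list of its neighbors."""
    if visited is None:
        visited = {v}              # the start vertex counts as visited too
    for u in adj[v]:
        if u not in visited:       # each edge is inspected a constant number of times
            visited.add(u)
            dfs(u, adj, visited)
    return visited
```

On the diamond graph from the question (A-B, A-C, B-D, C-D), C is indeed looked at twice, once from A and once from D, but the recursion enters it only once; that is exactly why the bound is O(V + E) rather than O(V).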

You can visit each vertex and edge of the graph a constant number of times and still be O(V+E). An alternative way of looking at it is that the cost is charged to the edge, not to the vertex.

Related

Can't we find Shortest Path by DFS(Modified DFS) in an unweighted Graph? and if not then Why?

It is said that DFS can't be used to find the shortest path in an unweighted graph. I have read multiple posts and blogs but am not satisfied, since a small modification to DFS seems to make it possible.
I think if we use a modified DFS in this way, then we can find the shortest distances from the source:
Initialise an array of distances from the root with infinity, and the distance of the root from itself as 0.
While traversing, we keep track of the number of edges. On moving forward, increment the number of edges; while backtracking, decrement it. And each time, if dist(v) > dist(u) + 1 then set dist(v) = dist(u) + 1.
In this way we can find the shortest distances from the root using DFS, in O(V+E) instead of Dijkstra's O(E log V).
If I am wrong at some point, please tell me.
Yes, if the DFS algorithm is modified in the way you mentioned, it can be used to find the shortest paths from a root in an unweighted graph. The problem is that in modifying the algorithm you have fundamentally changed what it is.
It may seem like I am exaggerating as the change looks minor superficially but it changes it more than you might think.
Consider a graph with n nodes numbered 1 to n. Let there be an edge between each k and k + 1. Also, let 1 be connected to every node.
Since DFS can pick adjacent neighbors in any order, let's also assume that this algorithm always picks them in increasing numerical order.
Now try running the algorithm in your head or on your computer with root 1.
First the algorithm will reach n in n-1 steps using edges between 1-2, 2-3 and so on. Then after backtracking, the algorithm moves on to the second neighbor of 1, namely 3. This time there will be n-2 steps.
The same process will repeat until the algorithm finally sees 1-n.
The algorithm will need O(n^2) rather than O(n) steps to finish. Remember that V = n and E = 2n - 3, so this is not O(V + E).
Actually, the algorithm you have described will always finish in O(V^2) on unweighted graphs. I will leave the proof of this claim as an exercise for the reader.
O(V^2) is not that bad, especially if the graph is dense. But since BFS already provides an answer in O(V + E), nobody uses DFS for shortest-distance calculation.
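To see the quadratic behaviour concretely, here is a small Python experiment (the graph construction and the helper name are mine, not from the answer) that counts relaxation steps of the modified DFS on the chain-plus-star graph described above:

```python
def modified_dfs_steps(n):
    """Counts relaxation steps of the modified DFS on the graph above:
    vertices 1..n, chain edges k-(k+1), and vertex 1 joined to every vertex."""
    adj = {v: [] for v in range(1, n + 1)}
    for k in range(1, n):
        adj[k] += [k + 1]; adj[k + 1] += [k]
    for v in range(3, n + 1):          # 1-2 already exists via the chain
        adj[1] += [v]; adj[v] += [1]
    for v in adj:
        adj[v].sort()                  # visit neighbors in increasing order
    dist = {v: float('inf') for v in adj}
    dist[1] = 0
    steps = 0
    def dfs(u):
        nonlocal steps
        for v in adj[u]:
            if dist[u] + 1 < dist[v]:  # relax and recurse; re-entry is allowed
                steps += 1
                dist[v] = dist[u] + 1
                dfs(v)
    dfs(1)
    return steps
```

For this family the count works out to exactly n(n-1)/2 relaxations, i.e. Theta(n^2), matching the argument above.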
In an unweighted graph, you can use a breadth-first search (not DFS) to find shortest paths in O(V + E) time.
In fact, if all edges have the same weight, then Dijkstra's algorithm and breadth-first search are pretty much equivalent -- decreaseKey() is never called, and the priority queue can be replaced with a FIFO queue, since newly added vertices never have a smaller distance than previously added ones.
Your modification to DFS does not work, because once you visit a vertex, you will not examine its children again, even if its distance later improves. You will get the wrong answer for this graph if you follow S->A before S->B:
S---->A---->C---->D---->E---->T
 \                      /
  ------->B-----------/
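For contrast, here is a minimal BFS sketch (the adjacency-dict input format is my assumption) that does return correct shortest distances in an unweighted graph:

```python
from collections import deque

def bfs_dist(adj, s):
    """Shortest edge-counts from s in an unweighted graph, O(V + E).
    adj maps each vertex to a list of its neighbors."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in dist:      # first discovery happens at the shortest distance
                dist[v] = dist[u] + 1
                q.append(v)
    return dist
```

On the graph above it reports dist(T) = 3 via S->B->E->T, whereas a DFS that never re-examines a visited vertex's children would incorrectly keep the value 5 found along S->A->C->D->E->T.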
The way Depth First Search on graphs is defined, it only visits each node once. When it encounters a node that was visited before, it backtracks.
So assume you have a triangle with nodes A, B, C, and you want to find the shortest path from A to B. One possible DFS traversal is A -> C -> B, and then you are done. This, however, is not the shortest path.

Designing an Algorithm to find the length of a simple cycle in a d-regular graph

I understand the question in general but don't know how to design and analyze the algorithm in the question. I was thinking of applying some sort of graph search algorithm like depth-first / breadth-first search.
UPDATE: This is what I have tried: starting from any node of the graph (call it N), visit each of that node's d neighbors. Then, from the last neighbor of N that we visited (call it L), visit any neighbor of L that is not N?
Others have already hinted on a possible solution in comments, let's elaborate. When d<=1, the solutions are immediate (and depend on your exact definition of cycle), so I'll assume d>1.
One such algorithm would be:
1. Build a path starting at any vertex V. Until the path is d vertices long, don't allow vertices you've already visited.
2. Once the path is d vertices long, keep adding vertices to the path, but now only allow vertices different from the last d vertices of the path.
3. When you add a vertex that's already been used in the path, stop. The resulting cycle is the segment of the path starting and ending at that vertex.
In both (1) and (2), the existence of such a vertex is guaranteed by the fact that G is d-regular. When searching for the vertex to add, we only exclude the last d vertices, namely the last vertex (U) and its d-1 predecessors. U has d neighbors, so at least one of them has to be available.
The algorithm will stop, because of the condition (3) and the fact that G is finite.
It makes sense to prefer already visited vertices in (2), but it doesn't change the worst-case complexity.
This gives us a worst-case complexity of O(n*d), since we may have to visit every vertex once and check all of its edges.
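A direct Python transcription of steps (1)-(3) might look like this (the adjacency-dict input format is my assumption, and ties are broken by whatever order the neighbor lists happen to have):

```python
def find_cycle(adj, d):
    """Finds a simple cycle in a d-regular graph (d >= 2) given as a dict
    vertex -> list of neighbors. Returns the cycle as a vertex list whose
    first and last entries coincide; the cycle has at least d+1 edges."""
    start = next(iter(adj))
    path = [start]
    index = {start: 0}                 # vertex -> its position in the path
    while True:
        u = path[-1]
        if len(path) < d:
            banned = set(path)         # step 1: no repeats at all
        else:
            banned = set(path[-d:])    # step 2: only the last d are banned
        v = next(w for w in adj[u] if w not in banned)  # d-regularity guarantees one
        if v in index:                 # step 3: closes a cycle longer than d edges
            return path[index[v]:] + [v]
        index[v] = len(path)
        path.append(v)
```

Because the repeated vertex is never among the last d vertices of the path, the returned cycle has at least d+1 edges, matching the existence argument above.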

Find all critical edges of an MST

I have this question from Robert Sedgewick's book on algorithms.
Critical edges. An MST edge whose deletion from the graph would cause the
MST weight to increase is called a critical edge. Show how to find all critical edges in a
graph in time proportional to E log E. Note: This question assumes that edge weights
are not necessarily distinct (otherwise all edges in the MST are critical).
Please suggest an algorithm that solves this problem.
One approach I can think of does the job in O(E*V) time.
My approach is to run Kruskal's algorithm.
Whenever we encounter an edge whose insertion into the MST would create a cycle, and that cycle already contains an edge of the same weight, then the edge already inserted is not a critical edge (otherwise, all other MST edges are critical edges).
Is this algorithm correct? How can I extend this algorithm to do the job in O(E log E) time?
The condition you suggest for when an edge is critical is correct I think. But it's not necessary to actually find a cycle and test each of its edges.
The Kruskal algorithm adds edges in increasing weight order, so the sequence of edge additions can be broken into blocks of equal-weight edge additions. Within each equal-weight block, if there is more than one edge that joins the same two components, then all of these edges are non-critical, because any one of the other edges could be chosen instead. (I say they are all non-critical because we are not actually given a specific MST as part of the input -- if we were then this would identify a particular edge to call non-critical. The edge that Kruskal actually chooses is just an artefact of initial edge ordering or how sorting was implemented.)
But this is not quite sufficient: it might be that after adding all edges of weight 4 or less to the MST, we find that there are 3 weight-5 edges, connecting component pairs (1, 2), (2, 3) and (1, 3). Although no component pair is joined by more than 1 of these 3 edges, we only need (any) 2 of them -- using all 3 would create a cycle.
For each equal-weight block, having weight say w, what we actually need to do is (conceptually) create a new graph in which each component of the MST so far (i.e. using edges having weight < w) is a vertex, and there is an edge between 2 vertices whenever there is a weight-w edge between these components. (This may result in multi-edges.) We then run DFS on each component of this graph to find any cycles, and mark every edge belonging to such a cycle as non-critical. DFS takes O(nEdges) time, so the sum of the DFS times for each block (whose sizes sum to E) will be O(E).
Note that Kruskal's algorithm takes time O(E log E), not O(E) as you seem to imply -- although people like Bernard Chazelle have gotten close to linear-time MST construction, TTBOMK no one has got there yet! :)
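To make the block idea concrete, here is a hedged Python sketch (the (w, u, v) edge-tuple format and all helper names are my own assumptions, not Sedgewick's): for each equal-weight block, we contract the components built so far into vertices; the bridges of the resulting multigraph are exactly the critical edges of that block, and everything else in the block is replaceable.

```python
from collections import defaultdict

class DSU:
    """Plain union-find with path halving."""
    def __init__(self, n):
        self.p = list(range(n))
    def find(self, x):
        while self.p[x] != x:
            self.p[x] = self.p[self.p[x]]
            x = self.p[x]
        return x
    def union(self, a, b):
        self.p[self.find(a)] = self.find(b)

def bridges(adj):
    """Edge ids that are bridges in a multigraph given as
    vertex -> list of (neighbor, edge_id) pairs."""
    disc, low, out, timer = {}, {}, [], [0]
    for root in list(adj):
        if root in disc:
            continue
        disc[root] = low[root] = timer[0]; timer[0] += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            v, pe, it = stack[-1]
            step = next(it, None)
            if step is None:
                stack.pop()
                if stack:
                    pv = stack[-1][0]
                    low[pv] = min(low[pv], low[v])
                    if low[v] > disc[pv]:
                        out.append(pe)   # tree edge with no back edge over it
                continue
            u, eid = step
            if eid == pe:                # skip only the arrival edge itself;
                continue                 # parallel copies have distinct ids
            if u in disc:
                low[v] = min(low[v], disc[u])
            else:
                disc[u] = low[u] = timer[0]; timer[0] += 1
                stack.append((u, eid, iter(adj[u])))
    return out

def critical_edges(n, edges):
    """edges: list of (w, u, v). Returns sorted indices of the critical edges."""
    order = sorted(range(len(edges)), key=lambda i: edges[i][0])
    dsu, result, i = DSU(n), [], 0
    while i < len(order):
        j = i
        while j < len(order) and edges[order[j]][0] == edges[order[i]][0]:
            j += 1                       # order[i:j] is one equal-weight block
        adj = defaultdict(list)
        for k in order[i:j]:
            a, b = dsu.find(edges[k][1]), dsu.find(edges[k][2])
            if a != b:                   # same-component edges are in no MST
                adj[a].append((b, k))
                adj[b].append((a, k))
        result.extend(bridges(adj))      # bridges here are exactly the critical edges
        for k in order[i:j]:
            dsu.union(edges[k][1], edges[k][2])
        i = j
    return sorted(result)
```

The sort dominates at O(E log E); each bridge-finding DFS touches only its own block's edges, so the DFS work sums to O(E) over all blocks.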
Yes, your algorithm is correct. We can prove that by comparing the execution of Kruskal's algorithm to a similar execution where the cost of some MST edge e is changed to infinity. Until the first execution considers e, both executions are identical. After e, the first execution has one fewer connected component than the second. This condition persists until an edge e' is considered that, in the second execution, joins the components that e would have. Since edge e is the only difference between the forests constructed so far, it must belong to the cycle created by e'. After e', the executions make identical decisions, and the difference in the forests is that the first execution has e, and the second, e'.
One way to implement this algorithm is using a dynamic tree, a data structure that represents a labelled forest. One configuration of this ADT supports the following methods in logarithmic time.
MakeVertex() - constructs and returns a fresh vertex.
Link(u, c, v) - vertices u and v must not be connected. Creates an unmarked edge from vertex u to vertex v with cost c.
Mark(u, v) - vertices u and v must be endpoints of an edge e. Marks e.
Connected(u, v) - indicates whether vertices u and v are connected.
FindMax(u, v) - vertices u and v must be connected. Returns the endpoints of an unmarked edge on the unique path from u to v with maximum cost, together with that cost. The endpoints of this edge are given in the order that they appear on the path.
I make no claim that this is a good algorithm in practice. Dynamic trees, like Swiss Army knives, are versatile but complicated and often not the best tool for the job. I encourage you to think about how to take advantage of the fact that we can wait until all of the edges are processed to figure out what the critical edges are.

graph - How to find Minimum Directed Cycle (minimum total weight)?

Here is an exercise:
Let G be a weighted directed graph with n vertices and m edges, where all edges have positive weight. A directed cycle is a directed path that starts and ends at the same vertex and contains at least one edge. Give an O(n^3) algorithm to find a directed cycle in G of minimum total weight. Partial credit will be given for an O((n^2)*m) algorithm.
Here is my algorithm.
I do a DFS. Each time when I find a back edge, I know I've got a directed cycle.
Then I will temporarily go backwards along the parent array (until I travel through all vertices in the cycle) and calculate the total weights.
Then I compare the total weight of this cycle with min. min always takes the minimum total weights. After the DFS finishes, our minimum directed cycle is also found.
Ok, then about the time complexity.
To be honest, I don't know the time complexity of my algorithm.
For DFS, the traversal takes O(m+n) (if m is the number of edges, and n is the number of vertices). For each vertex, it might point back to one of its ancestors and thus forms a cycle. When a cycle is found, it takes O(n) to summarise the total weights.
So I think the total time is O(m + n*n). But obviously something is wrong, since the exercise states that the optimal time is O(n^3) and the partial-credit time is O((n^2)*m).
Can anyone help me with:
Is my algorithm correct?
What is the time complexity if my algorithm is correct?
Is there any better algorithm for this problem?
You can use Floyd-Warshall algorithm here.
The Floyd-Warshall algorithm finds shortest path between all pairs of vertices.
The algorithm is then very simple: go over all pairs (u,v), and find the pair that minimizes dist(u,v)+dist(v,u), since this pair indicates a cycle from u back to u with weight dist(u,v)+dist(v,u). If the graph also allows self-loops (an edge (u,u)), you will also need to check them separately, because those cycles (and only those) are not examined by the algorithm.
pseudo code:
run Floyd-Warshall on the graph
min <- infinity
pair <- None
for each pair of vertices (u,v):
    if dist(u,v) + dist(v,u) < min:
        min <- dist(u,v) + dist(v,u)
        pair <- (u,v)
(u,v) <- pair
return path(u,v) + path(v,u)
path(u,v) + path(v,u) is actually the path found from u to v and then from v to u, which is a cycle.
The algorithm's run time is O(n^3): Floyd-Warshall is the bottleneck, since the final loop takes only O(n^2) time.
I think correctness in here is trivial, but let me know if you disagree with me and I'll try to explain it better.
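A compact sketch of the whole approach in Python (the adjacency-matrix input, with float('inf') marking absent edges, is my assumption):

```python
def min_directed_cycle(w):
    """w: n x n matrix, w[u][v] = weight of edge u->v, float('inf') if absent;
    w[u][u] may hold a self-loop weight. Returns the minimum cycle weight."""
    INF = float('inf')
    n = len(w)
    # Force the diagonal to infinity so a zero-length "cycle" is never reported.
    dist = [[w[u][v] if u != v else INF for v in range(n)] for u in range(n)]
    for k in range(n):                       # standard Floyd-Warshall relaxation
        for u in range(n):
            for v in range(n):
                if dist[u][k] + dist[k][v] < dist[u][v]:
                    dist[u][v] = dist[u][k] + dist[k][v]
    best = INF
    for u in range(n):
        for v in range(n):
            if u != v:
                best = min(best, dist[u][v] + dist[v][u])
        best = min(best, w[u][u])            # self-loops must be checked separately
    return best
```

The triple loop is O(n^3) and the final scan O(n^2), matching the stated bound.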
Is my algorithm correct?
No. Let me give a counterexample. Imagine you start DFS from u; there are two paths p1 and p2 from u to v, and one path p3 from v back to u, where p1 is shorter than p2.
Assume you start by taking the p2 path to v and walk back to u along p3. One cycle is found, but apparently it is not the minimum. Then you continue exploring u by taking the p1 path; but since v is fully explored, the DFS ends without finding the minimum cycle.
"For each vertex, it might point back to one of its ancestors and thus forms a cycle"
I think it might point back to any of its ancestors, which means up to N of them.
Also, how are you going to mark vertices when you come out of their DFS? You may reach a vertex again from another vertex, and that forms another cycle, so this is not an O(n+m) DFS any more. So your algorithm is incomplete.
During one DFS, I think each vertex should be either unseen or checked, and for a checked vertex you can store the minimum weight of the path to the starting vertex. Then, if at some later stage you find an edge to that vertex, you don't have to search for that path any more.
This DFS will find the minimum directed cycle containing the first vertex, and it is O(n^2) (O(n+m) if you store the graph as adjacency lists).
Repeating it from every other vertex then gives O(n^3) (O(n*(n+m))).
Sorry for my English; I'm not good at terminology.
I did a similar kind of thing, but I did not use a visited array for the DFS (which was needed for my algorithm to work correctly), and hence I realised that my algorithm had exponential complexity.
Since you are finding all cycles, it is not possible to do this in less than exponential time, as there can be 2^(e-v+1) cycles.

Is there an edge we can delete without disconnecting the graph?

Before I start, yes this is a homework.
I would not have posted here if I hadn't been trying as hard as I could to solve it for the last 14 hours, getting nowhere.
The problem is as follows:
I want to check whether I can delete an edge from a connected undirected graph without disconnecting it or not in O(V) time, not just linear.
What I have reached so far:
A cycle edge can be removed without disconnecting the graph, so I simply check if the graph has a cycle.
I have two methods that could be used. One is DFS followed by checking for back edges; the other is counting the Vs and Es and checking whether |E| = |V| - 1; if so, the graph is a tree and there's no edge we can delete without disconnecting it.
Both of the previous approaches solve the problem, but both need O(|E|+|V|), and the book says there's a faster way (probably a greedy approach).
Can I get any hints, please?
EDIT:
More specifically, this is my question; given a connected graph G=(V,E), can I remove some edge e in E and have the resulting graph still be connected?
Any recursive traversal of the graph, marking nodes as they're visited and short-circuiting to return true if you ever run into a node that is already marked will do the trick. This takes O(|V|) to traverse the entire graph if there is no edge that can be removed, and less time if it stops early to return true.
edit
Yes, a recursive traversal of the entire graph requires O(|V|+|E|) time, but we only traverse the entire graph if there are no cycles -- in which case |E| = |V|-1 and that only takes O(|V|) time. If there is a cycle, we'll find it after traversing at most |V| edges (and visiting at most |V|+1 nodes), which likewise takes O(|V|) time.
Also, obviously when traversing from a node (other than the first), you don't consider the edge you used to get to the node, as that would cause you to immediately see an already visited node.
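An iterative sketch of that traversal (the dict-of-adjacency-lists encoding is my assumption; a simple graph is assumed): it marks vertices as they are discovered and returns True the moment it sees an edge to an already-marked vertex other than the one it arrived by.

```python
def has_removable_edge(adj):
    """adj: dict vertex -> list of neighbors of a connected, simple,
    undirected graph. True iff some edge can be deleted without
    disconnecting the graph, i.e. iff the graph contains a cycle."""
    start = next(iter(adj))
    visited = {start}
    stack = [(start, None)]            # (vertex, vertex we arrived from)
    while stack:
        u, parent = stack.pop()
        skipped_parent = False
        for v in adj[u]:
            if v == parent and not skipped_parent:
                skipped_parent = True  # ignore one copy of the arrival edge
                continue
            if v in visited:
                return True            # non-tree edge => cycle => removable edge
            visited.add(v)
            stack.append((v, u))
    return False                       # tree: |E| = |V|-1, nothing removable
```

Each vertex is pushed at most once, so the traversal performs O(|V|) stack operations and inspects O(|V|) edges before it either exhausts a tree or finds a back edge.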
List all the edges in E; take an edge and mark its two end vertices visited, one by one. If during this traversal we find an edge whose two end vertices have both been visited previously, then we can remove that edge.
We have to take at most |V| edges to see whether this condition is satisfied.
The worst case may go like this: each edge we take visits at least one new vertex; there are |V| vertices, so we may have to take |V| edges before such an edge is found.
The best case may be the one with |V|/2 + 1 edges.
Have you heard of spanning trees? A connected graph with V-1 edges.
We can remove certain edges from a connected graph G (such as those creating a cycle) until we get a connected tree. Notice that the question is not asking you to find a spanning tree.
The question is asking whether you can remove one or more edges from the graph without losing connectivity. Simply count the number of edges and stop as soon as the count grows beyond V-1, because then the graph has scope to remove edges and still remain connected. This can be done in O(V) time if the graph is given as an adjacency list.
From what I'm reading, DFS without repetition is considered O(|V|). So if you take edge e, and let u and v be the two vertices it connects, then run DFS from u while ignoring e: e is not a bridge if v is discovered. Given that DFS without repetition is O(|V|), this would, I guess, be considered O(|V|).
