Two-directional spanning tree in a directed graph in subquadratic time

Here's the problem I'm trying to solve. Given a directed graph G, does it contain a connected subgraph that:
Contains every node from G
Is acyclic
Can be disconnected by removing any one edge
Has a path between every source node and every sink node
Intuitively, the subgraph I'm looking for consists of a downward pointing and an upward pointing tree that share the same root and together span G. I'm calling it the two-directional spanning tree problem, but it may have another name.
The dumb algorithm I've thought of is to cycle through each node in the graph, do a backwards and a forwards DFS starting at that node and then concatenate the search trees. If a two-directional spanning tree exists, I'm pretty sure this will find one on some iteration. However, it runs in O(V(V + E)) time. My intuition is that there should be a faster algorithm. Am I correct?

Warning: This answer is incomplete; the algorithm here returns "unknown" in some cases. I'm posting it only because no one else has proposed any answers in the past few days, and it's still an improvement over the proposal inside your question itself, which is to try each vertex as a candidate root, perform "backward DFS" and "forward DFS" (where the "forward DFS" is not allowed to visit any nodes that were visited by the "backward DFS"), and make sure that all the vertices are discovered by one or the other.
Specifically, there are two ways in which the below answer improves over your proposal:
As you note in your question, your proposal requires O(V·(V + E)) time. The below answer requires only O(V + E) time.
If the digraph contains cycles, your proposal can fail to detect a two-directional spanning tree whose root is an element of such a cycle. Therefore, your proposal can spuriously return "failure". The below answer never spuriously returns "failure", but (as noted above) returns "unknown" in some cases.
To start with . . . if G is acyclic, then we can determine whether a two-directional spanning tree exists in O(V + E) time, and if so construct it in a further O(V + E) time. To see why, observe the following:
If there's a two-directional spanning tree rooted at v, then that means that for every other vertex w, there's a path either from v to w or vice versa.
If we have a topological ordering of the vertices (which we can construct in O(V + E) time), then for any given vertices v and w, there is certainly no path from v to w if w precedes v in the topological ordering, and vice versa.
This means that we can find the vertices (if any) that have paths to or from all other vertices by topologically sorting the digraph and then making two passes:
First, we iterate over the vertices in forward order. We want to find all "backward roots", meaning vertices that are reachable from all of their predecessors; put another way, we want to eliminate any vertex that is not reachable from some predecessor. To do this, keep a collection of already-encountered vertices. As we iterate, we remove any vertices that have edges pointing to the current vertex. Whenever this collection is empty, the current vertex is a "backward root". Either way, we then add the current vertex to the collection before moving on. (It's a bit tricky to do this iteration in strict O(V + E) time, but it's doable. The key insight is that we need to store a list of inbound edges for each vertex, which may require an O(V + E) preprocessing pass if that wasn't initially part of our graph representation.)
Then, we do the reverse, iterating over the vertices in reverse order to find all "forward roots".
A vertex is the root of some two-directional spanning tree if and only if it's both a "forward root" and a "backward root".
Once we've found such a root v, we can construct the tree itself by doing "forward DFS" and "reverse DFS" from that root.
An important special case (whose importance will become clear below) is the case that two consecutive vertices ("consecutive" in the topological ordering, I mean) are both valid roots. In such a case, we can do "reverse DFS" from the first one and "forward DFS" from the second one to build a two-directional spanning tree where every vertex has either indegree ≤ 1 or outdegree ≤ 1.
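The two passes described above can be sketched in Python as follows. This is only an illustration under some assumptions that are not in the original answer: vertices are numbered 0..n-1, the graph is given as a list of (u, v) edge pairs, the topological order comes from Kahn's algorithm, and the function names (topological_order, one_sided_roots, dag_roots) are made up for this sketch.

```python
from collections import deque

def topological_order(n, edges):
    """Kahn's algorithm: return a topological order of vertices 0..n-1."""
    out = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    queue = deque(v for v in range(n) if indeg[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("graph has a cycle")
    return order

def one_sided_roots(order, links_to_earlier):
    """Vertices connected by a path to every vertex that precedes them in `order`.

    links_to_earlier[v] lists the neighbours of v that come earlier in `order`
    (in-neighbours for the forward pass, out-neighbours for the reverse pass).
    """
    pending = set()  # earlier vertices with no edge yet into the scanned prefix
    roots = set()
    for v in order:
        for u in links_to_earlier[v]:
            pending.discard(u)
        if not pending:
            roots.add(v)
        pending.add(v)
    return roots

def dag_roots(n, edges):
    """Roots of two-directional spanning trees of a DAG (hypothetical helper)."""
    order = topological_order(n, edges)
    inn = [[] for _ in range(n)]
    out = [[] for _ in range(n)]
    for u, v in edges:
        out[u].append(v)
        inn[v].append(u)
    backward = one_sided_roots(order, inn)        # reachable from all predecessors
    forward = one_sided_roots(order[::-1], out)   # reaches all successors
    return backward & forward
```

Each edge is looked at a constant number of times and each vertex enters and leaves the pending set at most once, so the passes stay within O(V + E) when set operations are treated as constant time.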
OK, but what if G contains a cycle? Your proposed algorithm can spuriously return "failure" in such cases, but it is at least guaranteed to find a two-directional spanning tree as long as the root itself isn't part of a cycle in G. The preceding section completely ignores that case.
To start addressing that case, note that we can use Tarjan's algorithm, which takes O(V + E) time, to derive a directed acyclic graph G′ of the strongly connected components of G. (If G is already acyclic, then G′ = G.)
To see why G′ is useful, observe the following:
If G has a two-directional spanning tree rooted at a vertex v, then G′ has a two-directional spanning tree rooted at the vertex representing the strongly connected component containing v.
I admit, this claim is not completely obvious. After all, it's possible that G has a two-directional spanning tree that has multiple branches that "pass through" a single strongly connected component, such that the component's overall indegree and outdegree are both > 1. However, in such a case, we can always "fix" the problem in G′ by removing one of the resulting outbound edges (if we're on the "sourceward" side of the root) or inbound edges (if we're on the "sinkward" side).
Given any strongly connected digraph H and any vertex v in H, we can do "forward DFS" or "reverse DFS" out from v to find a tree that spans H and has v as its only source or its only sink, respectively. So if H is one of the strongly connected components of G, and G′ has a two-directional spanning tree where the vertex representing H has either indegree ≤ 1 or outdegree ≤ 1, then we can straightforwardly (and efficiently) construct a subgraph of H that is a subgraph of a corresponding two-directional spanning tree of G, provided the rest of G's strongly connected components cooperate as well.
So the only remaining problem is with the root of the two-directional spanning tree of G′: just because it's the root of a two-directional spanning tree of G′, that doesn't necessarily mean that the corresponding strongly connected component of G contains the root of any two-directional spanning tree of G. For example, consider this graph:
A B
↓ ↓
C ↔ D
↓ ↓
E F
This graph doesn't have any two-directional spanning tree, but the corresponding graph of strongly connected components does (with root corresponding to {C, D}).
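For a graph this small, the claim can be checked by brute force. The sketch below is not part of the algorithm above; it simply tries every subset of |V| - 1 edges. It assumes that "source" and "sink" refer to the sources and sinks of the candidate subgraph (which matches the "two trees sharing a root" picture), and the function names are hypothetical.

```python
from itertools import combinations

def _reachable(start, adj):
    """Set of nodes reachable from `start` using adjacency dict `adj`."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def _is_valid(nodes, subset):
    out = {v: [] for v in nodes}
    und = {v: [] for v in nodes}
    indeg = {v: 0 for v in nodes}
    for u, v in subset:
        out[u].append(v)
        und[u].append(v)
        und[v].append(u)
        indeg[v] += 1
    # |V| - 1 edges and connected (ignoring direction) means the subgraph is
    # an undirected tree: acyclic, and removing any edge disconnects it.
    if len(_reachable(nodes[0], und)) != len(nodes):
        return False
    sources = [v for v in nodes if indeg[v] == 0]
    sinks = {v for v in nodes if not out[v]}
    # Every source must reach every sink inside the subgraph.
    return all(sinks <= _reachable(s, out) for s in sources)

def has_two_directional_spanning_tree(nodes, edges):
    """Exhaustive check; exponential, only meant for tiny examples."""
    nodes = list(nodes)
    return any(_is_valid(nodes, subset)
               for subset in combinations(edges, len(nodes) - 1))

# The example graph: A->C, B->D, C<->D, C->E, D->F.
EXAMPLE_EDGES = [("A", "C"), ("B", "D"), ("C", "D"), ("D", "C"),
                 ("C", "E"), ("D", "F")]
```

Calling has_two_directional_spanning_tree("ABCDEF", EXAMPLE_EDGES) tries all six 5-edge subsets; none of them lets both A and B reach both E and F.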
So, in other words, we have the following algorithm:
Use Tarjan's algorithm to derive a directed acyclic graph G′ of the strongly connected components of G.
Topologically sort G′ and identify all valid roots for two-directional spanning trees over G′.
If there are no such valid roots, return "failure".
If there are any two valid roots that are adjacent in the topological sort, return "success". G must contain an edge vw from some vertex in the one valid root to some vertex in the other valid root; we can obtain the two-directional spanning tree by doing "backward DFS" from v, plus the edge vw, plus doing "forward DFS" from w.
If there is any valid root corresponding to a strongly connected component that's just a single vertex of G, return "success". That single vertex is the root of a two-directional spanning tree of G that we can obtain by doing "backward DFS" plus "forward DFS".
Otherwise . . . return "unknown". We potentially have a much smaller problem in this case: in theory, for each valid root corresponding to some strongly connected component H, we need to examine H to see if it has a two-directional spanning tree that's suitable in terms of its inbound and outbound connections. But even that much-smaller problem still involves potentially massive numbers of possibilities, so exhaustive search seems infeasible.
So, as I mentioned at the outset, this algorithm requires O(V + E) time, and it deterministically returns a tree or "failure" in all cases where your proposal is deterministic plus some cases where your proposal is nondeterministic; but there are still some cases where this algorithm punts. :-/

Related

BFS tree to graph

Suppose we obtain the following BFS tree rooted at node D for an undirected graph with vertices {A,B,C,D,E,F,G,H}.
How to determine whether a particular edge is present or not in the original graph?
This is a multiple choice type question:
Which of the following edges is not present in the original graph?
(F, G)
(B, E)
(A, G)
(E, H)
You cannot know exactly which edges are in the graph, but you can be sure of some that are (namely those in the BFS tree) and of some that are not (as otherwise the BFS tree would have looked different):
Every edge in the BFS tree is also an edge in the graph.
Any edge that would allow a path from the root to some node that is shorter than the path between those two nodes in the BFS tree is not a member of the graph.
Let's look at the following edges:
(F,G)
If that edge were in the graph, then the shortest path from D to F would be D-G-F, having length 2, but in the BFS tree the path from D to F has length 3. This is inconsistent, as BFS always finds a shortest path between the root and any other node in the graph.
(B,E)
This would allow a path from D to E of length 3, which is consistent with the BFS tree. So this could be an edge in the graph, but doesn't have to be.
(A,G)
This would allow a path from D to A or from D to G of length 2, which is consistent with the BFS tree, as the BFS tree offers shorter paths in both cases. So this could be an edge in the graph, but doesn't have to be.
(E,H)
This would allow a path from D to E of length 3, which is consistent with the BFS tree. So this could be an edge in the graph, but doesn't have to be.
Of these four edges, only (F,G) is a clear case: that edge cannot be in the graph.
The edges in the BFS tree are a subset of the edges in the original graph and multiple original graphs might give the same BFS tree, so the answer to your question is:
If the BFS tree has an edge => the original graph has this edge too.
If the BFS tree does not have an edge => the original graph might or might not have this edge.
so it is not always possible to know whether the original graph has the edge or not.
Answering the MCQ question after adding it to the original question:
BFS works level by level, which means all the nodes in level(i) will be processed before any node in level(i+1) is processed.
So all the nodes in L3 should be added to the queue before any node in L4 is added to the queue. If (F,G) existed in the original graph, then node F would appear in L3 as a child of node G instead of in L4... so the answer is the edge (F,G).
In a BFS tree of an undirected graph, the levels of the two endpoints of any graph edge can differ by at most 1.
B is just one level away from E.
The same goes for E and H.
A and G are in the same level, so the difference is 0.
F and G are 2 levels apart, so the edge between them cannot be present in the graph: if it were, BFS would have reached F directly from G and placed it alongside the nodes currently in level 3 (C, B and H) instead of below them.
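The level rule used in these answers is easy to state as code. The original figure is not reproduced here, so the level assignment below is an assumption reconstructed from the discussion (D is the root, A and G sit one level below it, B, C and H two levels below, E and F three levels below); the names ASSUMED_LEVEL and edge_possible are hypothetical.

```python
# Levels counted from the root D = 0; reconstructed from the answers above,
# not taken from the original figure.
ASSUMED_LEVEL = {"D": 0, "A": 1, "G": 1, "B": 2, "C": 2, "H": 2, "E": 3, "F": 3}

def edge_possible(level, u, v):
    """An undirected edge (u, v) is compatible with a BFS tree only if the
    endpoints' levels differ by at most one."""
    return abs(level[u] - level[v]) <= 1

candidates = [("F", "G"), ("B", "E"), ("A", "G"), ("E", "H")]
impossible = [e for e in candidates if not edge_possible(ASSUMED_LEVEL, *e)]
```

Under that assumed level assignment, only (F, G) fails the check; the other three edges may or may not be present in the original graph.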

How to traverse on only a cycle in a graph?

I'm attempting ch23 in CLRS on MSTs, here's a question:
Given a graph G and a minimum spanning tree T , suppose that we decrease the weight of one of the edges not in T . Give an algorithm for finding the minimum spanning tree in the modified graph.
A solution I found was to add this new changed edge in T, then exactly one simple cycle is created in T, traverse this cycle and delete the max-weight edge in this cycle, voila, the new updated MST is found!
My question is, how do I only traverse nodes on this simple-cycle? Since DFS/BFS traversals might go out of the cycle if I, say, start the traversal in T from one endpoint of this newly added edge in T.
One solution I could think of was to find the biconnected components in T after adding the new edge. Only one BCC will be found, which is this newly formed simple-cycle, then I can put in a special condition in my DFS code saying to only traverse edges/nodes in this BCC, and once a back-edge is found, stop the traversal.
Edit: graph G is connected and undirected btw
Your solution is basically good. To make it more formal, you can use Tarjan's bridge-finding algorithm.
This algorithm finds the cut edges (aka bridges) of the graph in linear time. Let E' be the set of cut edges. It is easy to prove that no edge in E' can lie on a cycle. So, in T plus the new edge, the edges in E \ E' must be exactly the cycle.
You can use a hash map or an array to compute the difference between your E and the set of cut edges.
From there, a simple for-loop finds the max-weight edge that you want to remove.
Hope that helps!
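As an alternative to bridge finding, note that the cycle created by the new edge (u, v) is just the unique path between u and v in T plus the edge itself, so it is enough to recover that path. The sketch below uses this different technique, not the bridge-based one described above. It assumes T is given as a list of (a, b, weight) tuples that spans both endpoints, and update_mst is a hypothetical name.

```python
def update_mst(tree_edges, u, v, new_weight):
    """Return the MST edge list after the non-tree edge (u, v) drops to new_weight."""
    adj = {}
    for a, b, w in tree_edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))

    # DFS from u, remembering how each vertex was reached, until v is found.
    parent = {u: None}
    stack = [u]
    while stack:
        x = stack.pop()
        if x == v:
            break
        for y, w in adj.get(x, []):
            if y not in parent:
                parent[y] = (x, w)
                stack.append(y)

    # Walk back from v to u: these are exactly the tree edges on the cycle.
    heaviest = None
    x = v
    while parent[x] is not None:
        p, w = parent[x]
        if heaviest is None or w > heaviest[2]:
            heaviest = (p, x, w)
        x = p

    # Swap only if the heaviest cycle edge is heavier than the updated edge.
    if heaviest is None or heaviest[2] <= new_weight:
        return list(tree_edges)
    a, b, _ = heaviest
    result = [e for e in tree_edges if {e[0], e[1]} != {a, b}]
    result.append((u, v, new_weight))
    return result
```

The whole update is one traversal of T, so it runs in O(V) time, since a tree has V - 1 edges.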

Is there any minimum spanning tree that contains the maximum-weight edge on some cycle?

The original problem is an exercise from Introduction to Algorithms.
23.1-5 Let e be a maximum-weight edge on some cycle of connected graph G=(V, E). Prove that there is a minimum spanning tree of G'=(V, E - {e}) that is also a minimum spanning tree of G. That is, there is a minimum spanning tree of G that does not include e.
My question: I think the stronger proposition, that no minimum spanning tree of G includes e, is true when e is the unique maximum-weight edge on some cycle. Is it?
Update: 2016-10-28 20:21
Added the restriction that e is the unique maximum-weight edge on some cycle.
One test case is when there are nodes labeled 0..n-1 and there are links only between node i and node (i + 1) mod n (that is, a ring). In this case the minimum spanning tree is created by leaving out just one of the links. If e is the unique maximum weight edge it is not in the unique spanning tree, which is all the other links. If there is more than one edge of maximum weight then there are as many different minimum spanning trees as there are edges of maximum weight, each one of them leaving out a different edge of maximum weight and keeping the other ones in.
Consider the case when there is just one edge of maximum weight. Supposing somebody hands you a minimum spanning tree that uses this edge. Delete it from the tree, giving you two disconnected components. Now try adding each of the other edges in the cycle, one at a time. If the edge doesn't connect the two components, delete it again. If any of the edges connect the two components, you have a spanning tree of smaller weight than before, so it can't have been a minimum spanning tree. Can it be the case that none of the edges connect the two components? Adding an edge that doesn't connect the two components doesn't increase the set of nodes reachable from either component, so if no single edge connected the two components, adding all of them at the same time won't. But we know that adding all of these edges adds a path that connects the two nodes connected by the previous maximum weight edge, so one of the edges must connect the components. So our original so-called minimum spanning tree wasn't, and an edge which is of unique maximum weight in a cycle can't be part of a minimum spanning tree.
Your guess is correct:
no minimum spanning tree of G includes e.
First we need to prove:
e is not a light edge crossing any cut of G.
Let C be any cut that e crosses. Since e lies on a cycle, that cycle must cross C at least once more, via some other edge f. Because e is the unique maximum-weight edge on the cycle, w(f) < w(e), so e is not a light edge for C. For any cut that e does not cross, e is trivially not a light edge crossing it.
Then we need to prove:
if e is not a light edge crossing any cut of G, then no minimum spanning tree of G includes e.
This is exactly the contrapositive of 23.1-3.

MST theorem proof

Let G = (V , E) be a weighted undirected connected graph that contains a cycle, and let e be the maximum-weight edge among all edges in the cycle. I need to prove that there exists a minimum spanning tree of G which does NOT include e.
The idea is intuitively clear and I can show it on a cycle, consisting of 3 nodes. But I do not know how to show that formally for any cycle.
Assume that there exists an MST containing e. Removing e from it splits the tree into two parts. In particular, it splits the cycle's nodes into two non-empty parts; call them A and B. Since these nodes form a cycle, there is at least one more cycle edge between the A and B nodes; call it f. Then MST-e+f is a spanning tree whose weight is no greater than that of the MST, because w(f) <= w(e). So it is itself a minimum spanning tree, and it does not include e.
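The exchange step can be written out as a short weight calculation. This is a sketch in which T denotes the spanning tree containing e, f is the other cycle edge joining A and B, and w is the weight function; these symbols are introduced here only for illustration.

```latex
\begin{aligned}
T' &= (T \setminus \{e\}) \cup \{f\}, \\
w(T') &= w(T) - w(e) + w(f) \le w(T)
  \quad \text{since } w(f) \le w(e).
\end{aligned}
```

When w(f) = w(e), the two trees have equal weight, so T' is a minimum spanning tree that avoids e. When w(f) < w(e), T' is strictly lighter, so T could not have been minimum in the first place.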

Finding a minimum-bottle neck spanning tree

Hi so i'm doing some test prep and i need to figure out parts b and c. I know part a is true and i can prove it, but finding the algorithms for part b and c is currently eluding me.
Solve the following for a minimum bottleneck tree where the edge with the maximum cost is referred to as the bottleneck.
(a) Is every minimum-bottleneck spanning tree of G a minimum spanning tree of G? Prove your claim.
(b) For a given cost c, give an O(n+m)-time algorithm to find if the bottleneck cost of a minimum-bottleneck spanning tree of G is not more than c.
(c) Find an algorithm to find a minimum-bottleneck spanning tree of G.
thanks in advance to anyone who can help me out
For (b):
Erase every edge in G that costs more than c, then check whether the remaining graph is still connected.
For (c):
Do a binary search on c, using the algorithm that solved (b) as the dividing condition.
Proof of (b):
Let G' be the graph we get after deleting from G every edge that costs more than c.
Then:
If G' is connected, then there must be a spanning tree T in G'. Since no edge in G' costs more than c, no edge in T costs more than c. Therefore T is a spanning tree of G', and also of G, whose bottleneck is at most c.
If G' is not connected, then there's no spanning tree in G' at all. Since every edge in G - G' costs more than c, and any spanning tree of G must contain at least one edge of G - G', there's no spanning tree of G whose bottleneck is <= c.
And of course, detecting whether a graph is connected costs O(n+m).
Proof of (c):
Say, the algorithm we used in (b) is F(G,c) .
Then we have
If F(G,c) = True for some c, then F(G,c') = True for all c' that have c'>=c
If F(G,c) = False for some c, then F(G,c') = False for all c' that have c'<=c
So we can binary search on c :)
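A minimal Python sketch of both parts, under some assumptions that are not in the answer itself: vertices are numbered 0..n-1 with n >= 1, edges are (u, v, cost) tuples, and the binary search runs over the sorted distinct edge costs, since the answer must be one of them. The names bottleneck_at_most and min_bottleneck are hypothetical.

```python
def bottleneck_at_most(n, edges, c):
    """F(G, c): is G still connected when only edges of cost <= c are kept?"""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        if w <= c:
            adj[u].append(v)
            adj[v].append(u)
    seen = [False] * n
    seen[0] = True
    stack = [0]
    count = 1
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                count += 1
                stack.append(v)
    return count == n

def min_bottleneck(n, edges):
    """Smallest c with F(G, c) true, or None if G is not connected."""
    costs = sorted({w for _, _, w in edges})
    if not costs or not bottleneck_at_most(n, edges, costs[-1]):
        return None
    lo, hi = 0, len(costs) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if bottleneck_at_most(n, edges, costs[mid]):
            hi = mid        # costs[mid] is enough; try something smaller
        else:
            lo = mid + 1    # costs[mid] is too small
    return costs[lo]
```

Each call to bottleneck_at_most is O(n+m), and the search makes O(log m) calls after an O(m log m) sort. Any spanning tree of the graph restricted to edges of cost at most the returned value is then a minimum-bottleneck spanning tree.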
Ans. a) False: not every minimum bottleneck spanning tree of a graph G is a minimum spanning tree of G.
b) To check whether the bottleneck value of a minimum bottleneck spanning tree is at most c, you can run a depth-first search from any vertex of G, following only edges of weight at most c.
Algorithm:
check_atmostvalue(Graph G, int c)
{
    for each vertex v in V[G] do {
        visited[v] = false;
    }
    s = any vertex in V[G];  // arbitrarily chosen start vertex
    DFS(s, c);
    for each vertex v in V[G] do {
        if (visited[v] == false) then
            return false;  // v cannot be reached using only edges of weight <= c
    }
    return true;
}
DFS(v, c)
{
    visited[v] = true;
    for each w adjacent to v do {
        // follow only edges whose weight does not exceed c
        if (visited[w] == false and weight(v, w) <= c) then
            DFS(w, c);
    }
}
This algorithm runs in O(V+E) time in the worst case, since the running time of depth-first search is O(V+E).
This problem can be solved by simply finding the MST of the graph. This is based on the following claim:
An MST is an MBST for a connected graph.
For an MST, take its maximum-weight edge e. Removing e splits the MST into two vertex sets S and T. By the cut property, e must have minimum weight among the edges that connect S and T.
Any MBST must also contain some edge e' that connects S and T, and w(e') must be no less than w(e). So the bottleneck of any MBST is at least w(e), which means the MST must be an MBST.
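To illustrate the claim, here is a short Kruskal sketch that builds an MST and reports its heaviest edge, which by the argument above is also the minimum bottleneck. It assumes vertices numbered 0..n-1 and edges given as (u, v, weight) tuples; kruskal_bottleneck is a hypothetical name, and the union-find is kept deliberately simple (path halving, no union by rank).

```python
def kruskal_bottleneck(n, edges):
    """Return (mst_edges, bottleneck) for a connected undirected graph."""
    parent = list(range(n))

    def find(x):
        # Find the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:            # edge joins two different components
            parent[ru] = rv
            tree.append((u, v, w))
    if len(tree) != n - 1:
        raise ValueError("graph is not connected")
    bottleneck = max((w for _, _, w in tree), default=None)
    return tree, bottleneck
```

Sorting dominates the cost, so this takes O(m log m) time overall.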
However, there is another way to determine the minimum bottleneck. We don't need to compute the MBST. Your question actually implies the monotonicity of the minimum bottleneck. Therefore, we can use binary search combined with the connectivity algorithm to find the minimum bottleneck. I have seen monotonicity used in other cases. I am a bit amazed that a similar technique can be used here!
