BFS tree to graph - data-structures

Suppose we obtain the following BFS tree rooted at node D for an undirected graph with vertices {A,B,C,D,E,F,G,H}.
How to determine whether a particular edge is present or not in the original graph?
This is a multiple choice type question:
Which of the following edges is not present in the original graph?
(F, G)
(B, E)
(A, G)
(E, H)

You cannot know exactly which edges are in the graph, but you can be sure of some that are (namely those in the BST) and of some that are not (as otherwise the BST would have looked differently):
Every edge in the BST is also an edge in the graph
Every edge, that would allow a path from the root to a certain node that is shorter than the shortest path between those two nodes in the BST, is not a member of the graph.
Let's look at the following edges:
(F,G)
If that edge would be in the graph, then the shortest path from D to F would be D-G-F, having length 2, but in the BST the path from D to F has length 3. This is inconsistent, as a BST always finds the shortest path between the root and any other node in the graph.
(B,E)
This would allow a path from D to E of length 3, which is consistent with the BST. So this could be an edge in the graph, but doesn't have to.
(A,G)
This would allow a path from D to A or from D to G of length 2, which is consistent with the BST, as the BST offers shorter paths in both cases. So this could be an edge in the graph, but doesn't have to.
(E,H)
This would allow a path from D to E of length 3, which is consistent with the BST. So this could be an edge in the graph, but doesn't have to.
Of these four edges, only (F,G) is a clear case: that edge cannot be in the graph.

The edges in the BFS tree are a subset of the edges in the original graph and multiple original graphs might give the same BFS tree, so the answer to your question is:
If the BFS tree has an edge => the original graph has this edge too.
If the BFS tree does not have an edge => the original graph might or might not have this edge.
so it is not always possible to know whether the original graph has the edge or not.
Answering the MCQ question after adding it to the original question:
The way BFS works is level by level, that means all the nodes in level(i) will be processed before any node in level(i+1) is processed.
So all the nodes in L3 should be add to the queue before any node in L4 is added to the queue, so if (F,G) exits in the original graph then the node F should show in L3 as a child of node G instead of L4... so the answer is the edge (F,G).

Difference between the level of nodes should be no more than 1, so as to create a BFS tree as its just a matter of levels.
B is just one level away from E.
Same goes with E and H.
A and G are in the same level so the difference is 0.
F and G are 2 levels apart, so they might not be present in the Graph-if the edge between them was present then the BFS tree might not have taken pains to traverse other nodes, that currently are there in level 3(C, B and H), before F.

Related

Concept of Tarjan's bridge-finding algorithm

I was doning a problem of finding a bridge in a undirected connected graph, I looked up wikipedia for Tarjan's algorithm. Here is what it writes
Tarjan's bridge-finding algorithm
The first linear time algorithm for finding the bridges in a graph was described by
Robert Tarjan in 1974. It performs the following steps:
Find a spanning forest of G
Create a rooted forest F from the spanning forest
Traverse the forest F in preorder and number the nodes. Parent nodes in the forest now have lower numbers than child nodes.
For each node v in preorder (denoting each node using its preorder number), do:
Compute the number of forest descendants ND(v) for this node, by adding one to the sum of its children's descendants.
Compute L(v), the lowest preorder label reachable from v by a path for which all but the last edge stays within
the subtree rooted at v. This is the minimum of the set
consisting of the preorder label of v, of the values of
L(w) at child nodes of v and of the preorder
labels of nodes reachable from v by edges that do not
belong to F.
Similarly, compute H(v), the highest preorder label reachable by a path for which all but the last edge stays within the
subtree rooted at v. This is the maximum of the set
consisting of the preorder label of v, of the values of
H(w) at child nodes of v and of the preorder
labels of nodes reachable from v by edges that do not
belong to F.
For each node w with parent node v, if L(w) = w and H(w) < w + ND(w) then the edge
from v to w is a bridge.
I wonder whether I understand the previous steps wrong, since in my opinion, I think that L(w) = w is never gonna happen except at the root. Where in other cases, L(w) should be at least smaller than the father of w.
Source
The English description of L and H is slightly wrong -- they should exclude paths that contain the parent edge, or else it's as if there are parallel edges between each pair of adjacent nodes, hence no bridges. The algorithm for computing L and H correctly iterates over children only.

How to describe an algorithm to decide whether T is a depth-first spanning tree rooted at s?

Let's Suppose we have a connected graph G, a start vertex s, and a spanning tree T of G and G is undirected. How can I describe an algorithm to decide if T is a depth-first spanning tree rooted at s or not?
All DFS trees T for an undirected graph G have the following property:
{u, v} is an edge in G if and only if u is an ancestor of v in T or v is an ancestor of u in T.
To see why, assume without loss of generality that u is visited before v in the DFS. When building the DFS tree node for u, we will either (1) choose to visit node v as a neighbor of u, making node u a parent of node v, or (2) starting at node u we will visit some other neighbor z, and in recursively exploring z we will visit v, in which case u is a parent of z and z is an ancestor of v.
Moreover, we can make a stronger claim: any tree meeting the above criterion is a DFS tree for some DFS tree of G. Here’s how to see this. Start with the root node of T and look at its children. Given any two subtrees of the root, none of the nodes in those subtrees can be adjacent to one another in G, since otherwise by the above property one of those nodes would have to be an ancestor of the other. Therefore, each subtree consists of a set of nodes that are all reachable from one another via paths that only involve the nodes within that subtree. We can then recursively assemble one possible DFS ordering by starting at the root, recursively building DFS trees for the subgraphs represented by the subtrees in any order we’d like, and gluing those DFS orders together.
With this observation in mind, we can check very quickly with a second DFS whether T can be a DFS tree rooted at s, tracking which nodes have been visited as the DFS runs. After all children of a node v have been processed, check whether all the neighbors of v in graph G have been visited. If so, great! If not, it means that some neighbor of v is neither an ancestor nor a descendant, and the tree isn’t a DFS tree. If this process terminated without finding any violations, the process itself traces out a DFS of G using the edges of T, so T is definitely a valid DFS tree.
This algorithm runs in time O(m + n), which is as fast as possible here. After all, if you don’t look at all the nodes or edges of G, you can’t be sure whether the tree is a valid DFS tree because you can’t check the core property listed above.

Two-directional spanning tree in a directed graph in subquadratic time

Here's the problem I'm trying to solve. Given a directed graph G, does it contain a connected subgraph that:
Contains every node from G
Is acyclic
Can be disconnected by removing any one edge
Has a path between every source node and every sink node
Intuitively, the subgraph I'm looking for consists of a downward pointing and an upward pointing tree that share the same root and together span G. I'm calling it the two-directional spanning tree problem, but it may have another name.
The dumb algorithm I've thought of is to cycle through each node in the graph, do a backwards and a forwards DFS starting at that node and then concatenate the search trees. If a two-directional spanning tree exists, I'm pretty sure this will find one on some iteration. However, it runs in O(V(V + E)) time. My intuition is that there should be a faster algorithm. Am I correct?
Warning: This answer is incomplete; the algorithm here returns "unknown" in some cases. I'm posting it only because no one else has proposed any answers in the past few days, and it's still an improvement over the proposal inside your question itself, which is to try each vertex as a candidate root, perform "backward DFS" and "forward DFS" (where the "forward DFS" is not allowed to visit any nodes that were visited by the "backward DFS"), and make sure that the all vertices are discovered by one or the other.
Specifically, there are two ways in which the below answer improves over your proposal:
As you note in your question, your proposal requires O(V·(V + E)) time. The below answer requires only O(V + E) time.
If the digraph contains cycles, your proposal can fail to detect a two-directional spanning tree whose vertex is an element of such a cycle. Therefore, your proposal can spuriously return "failure". The below answer never spuriously returns "failure", but (as noted above) returns "unknown" in some cases.
To start with . . . if G is acyclic, then we can determine whether a two-directional spanning tree exists in O(V + E) time, and if so construct it in a further O(V + E) time. To see why, observe the following:
If there's a two-directional spanning tree rooted at v, then that means that for every other vertex w, there's a path either from v to w or vice versa.
If we have a topological ordering of the vertices (which we can construct in O(V + E) time), then for any given vertices v and w, there is certainly no path from v to w if w precedes v in the topological ordering, and vice versa.
This means that we can find the vertices (if any) that have paths to or from all other vertices by topologically sorting the digraph and then making two passes:
First, we iterate over the vertices in forward order. We want to find all "backward roots", meaning vertices that are reachable from all of their predecessors — or put another way, we want to eliminate any vertex that is not reachable from some predecessor. To do this, keep a collection of already-encountered vertices. As we iterate, we remove any vertices that have edges pointing to the current vertex. Whenever this collection is empty, the current vertex is a "backward root". (It's a bit tricky to do this iteration in strict O(V + E) time, but it's doable. The key insight is that we need to store a list of inbound edges for each vertex, which may require an O(V + E) preprocessing pass if that wasn't initially part of our graph representation.)
Then, we do the reverse, iterating over the vertices in reverse order to find all "forward roots".
A vertex is the root of some two-directional spanning tree if and only if it's both a "forward root" and a "backward root".
Once we've found such a root v, we can construct the tree itself by doing "forward DFS" and "reverse DFS" from that root.
An important special case (whose importance will become clear below) is the case that two consecutive vertices ("consecutive" in the topological ordering, I mean) are both valid roots. In such a case, we can do "reverse DFS" from the first one and "forward DFS" from the second one to build a two-directional spanning tree where every vertex has either indegree ≤ 1 or outdegree ≤ 1.
OK, but what if G contains a cycle? Your proposed algorithm can never return "failure" in such cases, but it is at least guaranteed to find a two-directional spanning tree as long as the root itself isn't part of a cycle in G. The preceding section completely ignores that case.
To start addressing that case, note that we can use Tarjan's algorithm, which takes O(V + E) time, to derive a directed acyclic graph G′ of the strongly connected components of G. (If G is already acyclic, then G′ = G.)
To see why G′ is useful, observe the following:
If G has a two-directional spanning tree rooted at a vertex v, then G′ has a two-directional spanning tree rooted at the vertex representing the strongly connected component containing v.
I admit, this claim is not completely obvious. After all, it's possible that G has a two-directional spanning tree that has multiple branches that "pass through" a single strongly connected component, such that the component's overall indegree and outdegree are both > 1. However, in such a case, we can always "fix" the problem in G′ by removing one of the resulting outbound edges (if we're on the "sourceward" side of the root) or inbound edges (if we're on the "sinkward" side).
Given any strongly connected digraph H and any vertex v in H, we can do "forward DFS" or "reverse DFS" out from v to find a tree that spans H and has v as its only source or its only sink, respectively. So if H is one of the strongly connected components of G, and G′ has a two-directional spanning tree where the vertex representing H has either indegree ≤ 1 or outdegree ≤ 1, then we can straightforwardly (and efficiently) construct a subgraph of H that is a subgraph of a corresponding two-directional spanning tree of G, provided the rest of G's strongly connected components cooperate as well.
So the only remaining problem is with the root of the two-directional spanning tree of G′: just because it's the root of a two-directional spanning tree of G′, that doesn't necessarily mean that the corresponding strongly connected component of G contains the root of any two-directional spanning tree of G. For example, consider this graph:
A B
↓ ↓
C ↔ D
↓ ↓
E F
This graph doesn't have any two-directional spanning tree, but the corresponding graph of strongly connected components does (with root corresponding to {C, D}).
So, in other words, we have the following algorithm:
Use Tarjan's algorithm to derive a directed acyclic graph G′ of the strongly connected components of G.
Topologically sort G′ and identify all valid roots for two-directional spanning trees over G′.
If there are no such valid roots, return "failure".
If there are any two valid roots that are adjacent in the topological sort, return "success". G must contain an edge vw from some vertex in the one valid root to some vertex in the other valid root; we can obtain the two-directional spanning tree by doing "backward DFS" from v, plus the edge vw, plus doing "forward DFS" from w.
If there is any valid root corresponding to a strongly connected component that's just a single vertex of G, return "success". That single vertex is the root of a two-directional spanning tree of G that we can obtain by doing "backward DFS" plus "forward DFS".
Otherwise . . . return "unknown". We potentially have a much smaller problem in this case: in theory, for each valid root corresponding to some strongly connected component H, we need to examine H to see if it has a two-directional spanning tree that's suitable in terms of its inbound and outbound connections. But even that much-smaller problem still involves potentially massive numbers of possibilities, so exhaustive search seems infeasible.
So, as I mentioned at the outset, this algorithm requires O(V + E) time, and it deterministically returns a tree or "failure" in all cases where your proposal is deterministic plus some cases where your proposal is nondeterministic; but there are still some cases where this algorithm punts. :-/

Using a minimum spanning tree algorithm

Suppose I have a weighted non-directed graph G = (V,E). Each vertex has a list of elements.
We start in a vertex root and start looking for all occurances of elements with a value x. We wish to travel the least amount of distance (in terms of edge weight) to uncover all occurances of elements with value x.
The way I think of it, a MST will contain all vertices (and hence all vertices that satisfy our condition). Therefore the algorithm to uncover all occurances can just be done by finding the shortest path from root to all other vertices (this will be done on the MST of course).
Edit :
As Louis pointed out, the MST will not work in all cases if the root is chosen arbitrarily. However, to make things clear, the root is part of the input and therefore there will be one and only one MST possible (given that the edges have distinct weights). This spanning tree will indeed have all minimum-cost paths to all other vertices in the graph starting from the root.
I don't think this will work. Consider the following example:
x
|
3
|
y--3--root
| /
5 3
| /
| /
x
The minimum spanning tree contains all three edges with weight 3, but this is clearly not the optimum solution.
If I understand the problem correctly, you want to find the minimum-weight tree in the graph which includes all vertices labeled x. (That is, the correct answer would have total weight 8, and would be the two edges drawn vertically in this drawing.) But this does not include your arbitrarily selected root at all.
I am pretty confident that the following variation on Prim's algorithm would work. Not sure if it's optimal, though.
Let's say the label we are looking for is called L.
Use an all-pairs shortest path algorithm to compute d(v, w) for all v, w.
Pick some node labeled L; call this the root. (We can be sure that this will be in the result tree, since we are including all nodes labeled L.)
Initialize a priority queue with the root initialized to 0. (The priority queue will consist of vertices labeled L, and their minimum distance from any node in the tree, including vertices not labeled L.)
While the priority queue is nonempty, do the following:
Pick out the top vertex in the queue; call it v, and its distance from the tree d.
For each vertex w on the path from v to the tree, v inclusive, find the nearest L-labeled node x to w, and add x to the priority queue, or update its priority. Add w to the tree.
The answer is no, if I'm understanding correctly. Finding the minimum spanning tree will contain all vertices V, but you only want to find the vertices with value x. Thus, your MST result may have unneeded vertices adding extra path length and therefore be sub-optimal.
An example has been given where the MST M1 from Root differs from an MST M2 containing all x nodes but not containing Root.
Here's an example where Root is in both MST's: Let graph G contain nodes R,S,T,U,V (R=Root), and a clockwise path R-S-T-U-V-R, with edge weights 1,1,3,2,2 going clockwise, and x at R, S, T, U. The first MST, M1, will have subtrees S-T and V-U below R, with cost 6 = 2+4, and cost-3 edge T-U not included in M1. But M2 has subtree S-T-U (only) below R, at cost 5.
Negative. If the idea is to find for every node that contains 'x' a separate path from root to it, and minimize the total cost of the paths, then you can just use simple shortest-path calculation separately for every node starting from the root, and put the paths together.
Some of those shortest paths will not be in the minimum spanning tree, so if this is your goal, the MST solution does not work. MST optimizes the cost of the tree, not the sum of costs of paths from root to the nodes.
If your idea is to find one path that starts from root and traverses through all nodes that contain 'x', then this is the traveling salesman problem and it is an NP-complete optimization problem, i.e. very hard.

Design an algorithm to detect cycle in graph G

What would the following algorithm look like:
a linear-time algorithm which, given an undirected graph G, and a particular edge e in it, determines whether G has a cycle containing e
I have following Idea:
for each v that belongs to V,
if v is a descendant of e and (e,v) has not been traversed then check following:
if we visited e before v and left v before we left e then
the graph contains cycle
I am not sure if this is your homework so I'll just give a little hint - use the properties of breadth-first search tree (with root in any of the two vertices of the edge e), its subtrees which are determined by neighbors of the root and the edges between those subtrees.
Per comingstorm's hint, an undirected edge is itself a cycle. A<->B back and forth as many times as you like.

Resources