I want to get a subgraph of a graph given the vertex to start at. All vertices connected to the starting vertex are considered part of the subgraph that should be returned.
I've already solved this requirement but am curious if there is a more efficient solution. The solution I came up with was to do a DFS of the graph and record every vertex that was encountered in a set, S. Then, I simply took all of the edges from the original graph that were connected to a vertex in S and I built a subgraph from it. The edges in the original graph are stored in a C# Dictionary which I believe is basically a hash.
DFS and BFS alone do not work because if two vertices share the same child, BFS or DFS will not traverse one of those edges. Hence the subgraph in that case would contain all of the correct vertices, but would be missing some edges.
Is there a better solution than the one I've come up with?
I think a BFS traversal is the most efficient algorithm for this.
If you do a BFS and enqueue all neighbors for each node (i.e. traverse all the edges attached to the current node) and only abort traversal when the current node has already been visited, you avoid the problem you described with "same child" / "missed edges".
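A minimal sketch of that idea, assuming an undirected graph stored as a hypothetical adjacency mapping `adj` (vertex to list of neighbours); the function name `connected_subgraph` is made up for illustration. Every edge of the current node is recorded, and only the enqueueing is skipped for already-visited neighbours:

```python
from collections import deque

def connected_subgraph(adj, start):
    """Return (vertices, edges) of the component that contains start.

    adj is an assumed adjacency mapping: vertex -> iterable of neighbours.
    Edges are stored as frozensets so (u, v) and (v, u) count once.
    """
    visited = {start}
    edges = set()
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            # Record the edge even if v was already visited,
            # so the "shared child" edges are not lost.
            edges.add(frozenset((u, v)))
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return visited, edges
```

Each edge is looked at twice (once from each endpoint), so the traversal stays linear in the size of the component.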
A "fast" algorithm that enumerates all induced subgraphs of a given size can be found here:
http://theinf1.informatik.uni-jena.de/~wernicke/motifs-wabi2005.pdf
Does this help?
I have a graph and a starting node. I want to find how many nodes become isolated when I remove each node in the graph, for all nodes, using DFS.
For example, if I start on a fixed node 1 and remove node 2, how many isolated nodes will I have? And if I remove node 3?
I know I can just do a DFS for all nodes (removing a different node each time), but doing so I will have to traverse the graph once for each node. I want to solve it with just one run.
I have been told it has O(|V|*|A|) complexity, where |V| is the number of nodes and |A| is the number of edges.
I've been playing with prenum and postnums, but with no success.
Let N be the number of vertices and M be the number of edges. If you just want a O(NM) solution as you stated, you don't need to go any further than running a DFS for each vertex.
The complexity for each DFS is O(N+M) so the total complexity will be of O(N(N+M)) = O(N²+NM). Usually we have more edges than vertices, so NM grows much faster than N² and we can say that the complexity is of O(NM). Keep in mind, though, that if you physically delete the current vertex at each step your implementation will have a much worse complexity, because physically deleting a vertex means removing entries from a lot of adjacency lists, which is costly no matter how you represent the graph. There is an implementation trick to speed up the process: instead of physically deleting the current vertex before each DFS, just mark the vertex as deleted, and when you are going through the adjacency lists during the DFS just ignore the marked vertex.
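The marking trick can be sketched as follows. This is only an illustration under assumptions: the graph is connected, `adj` is a hypothetical adjacency mapping, and `unreachable_after_removal` is an invented name. The "deleted" vertex is simply skipped during each traversal instead of being removed from the adjacency lists:

```python
def unreachable_after_removal(adj, start):
    """For every vertex r != start, count the vertices that can no longer
    be reached from start when r is treated as deleted.

    adj is an assumed adjacency mapping: vertex -> iterable of neighbours.
    """
    n = len(adj)
    result = {}
    for removed in adj:
        if removed == start:
            continue
        seen = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                # Ignore the marked vertex instead of physically deleting it.
                if v != removed and v not in seen:
                    seen.add(v)
                    stack.append(v)
        # n - 1 vertices remain after the removal; len(seen) are still reachable.
        result[removed] = (n - 1) - len(seen)
    return result
```

One traversal is run per removed vertex, which matches the O(N(N+M)) bound discussed above.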
However, I feel that you can solve this problem in O(N+M) using Tarjan's algorithm for finding articulation points. This algorithm will find every vertex that, when removed from the graph, splits the graph into more than one connected component (these vertices are called articulation points). It's easy to see that there won't be isolated vertices if you remove a vertex that is not an articulation point. However, if you remove an articulation point, you will split the graph into two parts G and G', where G is the connected component of the starting vertex, and G' is the rest of the graph. All vertices from G' are isolated because you can't reach them if you run a DFS from the starting vertex. I think that you can find the size of G' for each vertex deletion efficiently; maybe you can even do this while running Tarjan's. If I find a solution I can edit this answer later.
EDIT: I managed to solve the problem in O(N+M). I will give some hints so you can find the answer by yourself:
1) Every undirected graph can be decomposed into (not disjoint) sets of biconnected components: each biconnected component is a subset of the vertices of the graph where every vertex in this subset will remain connected even if you remove any vertex of the graph.
2) Tarjan's O(N+M) algorithm to find bridges and articulation points can be altered in order to find the biconnected components, finding which vertices belong to each biconnected component, or which biconnected components contain each vertex.
3) If you remove any vertex that is not an articulation point, the answer for this vertex is obviously N-1.
4) If you remove an articulation point, every vertex in the same biconnected component as the starting vertex will still be accessible, but you don't know about the other biconnected components. Don't worry, there is a way to find this efficiently.
5) You can compress every graph G into a graph B of its biconnected components. The compression algorithm is simple: every biconnected component becomes a vertex in B, and you link biconnected components that share some articulation point. We can prove that the resulting graph B is a tree. You must use this tree somehow in order to solve the problem presented in step 4.
Good luck!
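As a starting point only, here is a sketch of the standard low-link DFS for finding articulation points. It does not build the biconnected components or the tree B from the hints. It assumes a simple undirected graph (no parallel edges) given as a hypothetical adjacency mapping `adj`, and it is recursive, so very deep graphs may hit Python's recursion limit:

```python
def articulation_points(adj):
    """Return the set of articulation points of an undirected simple graph.

    adj is an assumed adjacency mapping: vertex -> iterable of neighbours.
    """
    disc = {}      # discovery time of each vertex
    low = {}       # lowest discovery time reachable from the subtree
    points = set()
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                # Back edge to an already discovered vertex.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # The subtree of v cannot climb above u: u separates it.
                if parent is not None and low[v] >= disc[u]:
                    points.add(u)
        # The DFS root is an articulation point iff it has 2+ tree children.
        if parent is None and children > 1:
            points.add(u)

    for s in adj:
        if s not in disc:
            dfs(s, None)
    return points
```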
Can someone tell me how to check whether the edges of a graph form a loop or not?
Any information would be highly helpful.
Many thanks in advance.
The Kruskal algorithm (which you tagged the question with) uses disjoint set data structure initialized with disjoint sets for each vertex. Then, for each edge, two sets that the edge's vertices belong to are merged. If the two vertices are already in the same set, you've found a loop. If you remove the edge every time it happens, you will get a spanning tree. If you sort the edges in order of ascending weight, that would be a minimum spanning tree.
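A small sketch of the disjoint-set check, assuming vertices are numbered 0 to `num_vertices - 1` and `edges` is a list of pairs; the function name `has_cycle` is hypothetical. Union by rank is left out for brevity:

```python
def has_cycle(num_vertices, edges):
    """Return True if the undirected edge list contains a loop.

    Vertices are assumed to be the integers 0 .. num_vertices - 1.
    """
    parent = list(range(num_vertices))  # each vertex starts in its own set

    def find(x):
        # Follow parent pointers to the set representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            # Both endpoints are already in the same set: this edge closes a loop.
            return True
        parent[ru] = rv  # merge the two sets
    return False
```

Skipping the offending edge instead of returning would leave a spanning forest, as described above.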
If you only need to know whether a graph contains loops or not, use something simpler such as DFS - if any node has an adjacent node (other than its parent) which was already visited, you've found a cycle.
Do a complete DFS on the graph. Maintain two boolean variables, 'visited' and 'completed' for each node in the graph. 'visited' to indicate whether the vertex has been visited or not and 'completed' to indicate whether the DFS starting from that particular node has completed or not. If while doing DFS you hit a node which has already been visited but its DFS has not yet completed, then there exists a cycle in the graph.
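The two flags might be sketched like this. Note the assumption: this version treats the graph as directed (in an undirected graph the edge back to the parent would have to be skipped as well). `adj` is a hypothetical adjacency mapping in which every vertex appears as a key:

```python
def has_directed_cycle(adj):
    """Return True if the directed graph contains a cycle.

    adj is an assumed adjacency mapping: vertex -> iterable of successors.
    """
    visited = set()    # DFS has entered the vertex
    completed = set()  # DFS from the vertex has fully finished

    def dfs(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                if dfs(v):
                    return True
            elif v not in completed:
                # v is visited but still in progress: it is on the current path.
                return True
        completed.add(u)
        return False

    return any(u not in visited and dfs(u) for u in adj)
```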
It uses a fast union-find data structure to check that the edge being added does not connect two vertices that are already in the same cluster.
Union-Find Datastructures
Well, I know that a breadth-first-search tree of an undirected graph can't have a back edge. But I'm wondering how it can even have a cross edge? I'm not able to imagine a spanning tree of a graph G constructed out of BFS that contains a cross edge.
The process of building a spanning tree using BFS over an undirected graph would generate the following types of edges:
Tree edges
Cross edges (connecting vertices on different branches)
A simple example: Imagine a triangle (a three-vertex clique) - start a BFS from any node, and you'll reach the other two on the first step. You're left with an edge between them that does not belong to the spanning tree.
What about back edges (connecting an ancestor with a non-immediate descendant)? Well, as you point out, in BFS over an undirected graph you won't have them, since you would have used that edge when first reaching the ancestor.
In fact, you can make a stronger statement - all non-tree edges must be between vertices at the same level or at adjacent levels (you can't use that edge for the tree if the vertex on the other side is a sibling, like in the triangle case, or a sibling of the parent that was not explored yet). Either way, it falls under the definition of a cross edge.
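The level claim can be checked with a short sketch. It assumes an undirected simple graph as a hypothetical adjacency mapping `adj`; the function name `bfs_non_tree_edges` is invented here. It builds the BFS tree, then collects every edge that is not a parent-child link:

```python
from collections import deque

def bfs_non_tree_edges(adj, root):
    """Return (level, non_tree) for a BFS from root.

    level maps each reachable vertex to its BFS depth; non_tree is the set
    of edges (as frozensets) that are not part of the BFS tree.
    """
    level = {root: 0}
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                parent[v] = u
                queue.append(v)
    non_tree = set()
    for u in level:
        for v in adj[u]:
            # An edge is a tree edge only if one endpoint is the other's parent.
            if parent[u] != v and parent[v] != u:
                non_tree.add(frozenset((u, v)))
    return level, non_tree
```

For the triangle example, the only non-tree edge joins the two vertices at level 1.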
I had this same question...and the answer is that there are no cross edges in the BFS, but that the BFS tree itself encodes all the edges that would have been back-edges and forward-edges in the DFS tree as tree edges in the BFS tree, such that the remaining edges which the undirected graph has, but which are still not present in the BFS, are cross edges--and nothing else.
So the Boolean difference of the set of edges in the undirected graph and the edges in the BFS tree are all cross edges.
...As opposed to the DFS, where the set of missing edges may also include "Back Edges," "Forward Edges," and "Cross Edges."
I don't know why it is in the algorithmic parlance to say that both "tree edges and cross edges are in a BFS"
...I think it is just a short hand, and that in a math class, the professor would have written the relationship in set notation and unions (which I can't do on this stack exchange).
This is a question from The Algorithm Design Manual by Steven Skiena (for interview prep):
An articulation vertex of a graph G is a vertex whose deletion disconnects G. Let G be a graph with n vertices and m edges. Give a simple O(n + m) algorithm that finds a deletion order for the n vertices such that no deletion disconnects the graph.
This is what I thought:
Run DFS on the graph and keep updating each node's oldest reachable ancestor (based on which we decide if it's a bridge cut node, parent cut node or root cut node)
If we find a leaf node(vertex) or a node which is not an articulation vertex delete it.
At the end of DFS, we'd be left with all those nodes in graph which were found to be articulation vertices
The graph will remain connected as the articulation vertices are intact. I've tried it on a couple of graphs and it seems to work but it feels too simple for the book.
In 2 steps:
make the graph a DAG using any traversal algorithm
do a topological sort
Each step finishes without going beyond O(m+n).
Assuming the graph is connected, then any random node reaches a subgraph whose spanning tree may be deleted in post-order without breaking the connectedness of the graph. Repeat in this manner until the graph is all gone.
Utilize DFS to track the exit time of each vertex;
Delete vertices in the order of recorded exit time;
If we always delete the leaves of a tree one by one, the rest of the tree remains connected. One particular way of doing this is to assign a pre-order number to each vertex as the graph is traversed using DFS or BFS. Sort the vertices in descending order (based on pre-order numbers). Remove vertices in that order from the graph. Note that the leaves are always deleted first.
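A sketch of the reversed pre-order idea, here using BFS for the numbering. It assumes a connected undirected graph as a hypothetical adjacency mapping `adj`; `safe_deletion_order` is an invented name. Since every vertex is discovered from a vertex that comes earlier in the order, deleting from the end always leaves a connected prefix:

```python
from collections import deque

def safe_deletion_order(adj, start):
    """Return the vertices in an order whose deletions never disconnect
    the remaining graph (the graph is assumed to be connected).

    adj is an assumed adjacency mapping: vertex -> iterable of neighbours.
    """
    order = []          # BFS discovery (pre-order) sequence
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    # Delete in descending pre-order: last discovered vertex first.
    order.reverse()
    return order
```

No explicit sort is needed because the traversal already produces the vertices in pre-order, which keeps the whole procedure at O(n + m).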
I am working on an assignment where one of the problems asks to derive an algorithm to check if a directed graph G=(V,E) is singly connected (there is at most one simple path from u to v for all distinct vertices u,v of V).
Of course you can brute force check it, which is what I'm doing right now, but I want to know if there's a more efficient way. Could anyone point me in the right direction?
There is a better answer for this question. You can do that in O(|V|^2), and with more effort you can do it in linear time.
First you find the strongly connected components of G. In each strong component, you search for these cases:
1) if there is a forward edge in this component, it is not singly connected,
2) if there is a cross edge in this component, it is not singly connected,
3) if there are at least two back edges in tree rooted at vertex u, to proper ancestors of u, then it is not singly connected.
This can be done in O(E). (I think, except for case 3. I couldn't implement it well!)
If none of the cases above occurred, you should check whether there is a cross edge or a forward edge in G^SCC (graph G with strong components replaced by single nodes). Since we don't have back edges, it can be done by repeating DFS on each vertex of this graph in O(|V|^2).
Have you tried DFS?
Run DFS for every vertex in the graph as source
If a visited vertex is encountered again, the graph is not singly connected
repeat for every unvisited vertex.
The graph is singly connected.
Complexity: O(V^2), since each DFS is O(V) when no vertex is repeated.
I don't agree that its complexity will be O(V^2). In DFS we don't call it for every vertex; as seen in the Introduction to Algorithms book, the call is DFS(G). We call DFS for the whole graph, not for any single vertex, unlike BFS. So in this case, in my view, we have to check it by calling DFS once. If a visited vertex is encountered again, the graph is not singly connected (we definitely have to call it for every disconnected component, but that is already included in the code). So the complexity will be O(V+E). As E=V here, the complexity should be O(V).
I thought of this :
1) Run DFS from any vertex, if all vertices are covered in the DFS with no forward edges(there can be no cross as else not all vertices will be covered), then it can be a potential candidate.
2) If a vertex (level j) found in the DFS has a back edge to level i, then no other vertex found after it should have a back edge toward any vertex with level less than j, and every vertex must be able to reach the root (checked with a second DFS).
This does it in linear time if this is correct.
Take a look at the definition of simple path. A cyclic graph can be singly connected. DFS won't work for A->B, B->A, which is singly connected.
The following paper uses strongly connected component to solve this.
https://www.cs.umd.edu/~samir/grant/khuller99.ps
Run DFS once from each vertex. The graph is singly connected if and only if there are no forward edges and there are no cross edges within a component.
Complexity: O(V·E)
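A sketch of that check under assumptions: the directed graph has no parallel edges and is given as a hypothetical adjacency mapping `adj` in which every vertex is a key; `is_singly_connected` is an invented name. In a DFS from a single source, an edge into an already finished vertex is a forward or cross edge and so reveals a second simple path from the source, while back edges to vertices on the current path are allowed:

```python
def is_singly_connected(adj):
    """Return True if there is at most one simple path between every
    ordered pair of distinct vertices.

    adj is an assumed adjacency mapping: vertex -> iterable of successors.
    """
    def dfs(u, on_path, finished):
        on_path.add(u)
        for v in adj[u]:
            if v in finished:
                # Forward or cross edge: v is reachable in two different ways.
                return False
            # Back edges (v on the current path) are fine and are skipped.
            if v not in on_path and not dfs(v, on_path, finished):
                return False
        on_path.discard(u)
        finished.add(u)
        return True

    # One independent DFS per source vertex, each with fresh bookkeeping.
    return all(dfs(s, set(), set()) for s in adj)
```

The two-vertex cycle A->B, B->A mentioned above passes this check, while a diamond (two routes into the same vertex) fails it.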