DFS count isolated nodes for articulation points - algorithm

I have a graph and a starting node. I want to find how many nodes become isolated when I remove each node in the graph, for all nodes, using DFS.
For example, if I start on a fixed node 1, and remove node 2, how many isolated nodes I will have? and if I remove node 3?
I know I can just do DFS for all nodes(removing a different node each time), but doing so I will have to navigate the graph one time for each node, I want to solve it with just one run.
I have been told it has O(|V|*||A|), being |V|=number of edges, and |A|= number of nodes.
I've been playing with prenum and postnums, but with no success.

Let N be the number of vertices and M be the number of edges. If you just want a O(NM) solution as you stated, you don't need to go any further than running a DFS for each vertex.
The complexity for each DFS is O(N+M) so the total complexity will be of O(N(N+M)) = O(N²+NM). Usually we have more edges than vertices, so NM grows much faster than N² and we can say that the complexity is of O(NM). Keep in mind, though, that if you physically delete the current vertex at each step your implementation will have a much worse complexity, because physically deleting a vertex means removing entries from a lot of adjacency lists, which is costly no matter how you represent the graph. There is an implementation trick to speed up the process: instead of physically deleting the current vertex before each DFS, just mark the vertex as deleted, and when you are going through the adjacency lists during the DFS just ignore the marked vertex.
However, i feel that you can solve this problem in O(N+M) using Tarjan's algorithm for finding articulation points. This algorithm will find every vertex that, when removed from the graph, splits the graph in more than one connected component (these vertices are called articulation points). It's easy to see that there won't be isolated vertices if you remove a vertex that is not an articulation point. However, if you remove an articulation point, you will split the graph in two parts G and G', where G is the connected component of the starting vertex, and G' is the rest of the graph. All vertices from G' are isolated because you can't reach them if you run a DFS from the starting vertex. I think that you can find the size of G' for each vertex deletion efficiently, maybe you can even do this while running Tarjan's. If i find a solution i can edit this answer later.
EDIT: i managed to solve the problem in O(N+M). I will give some hints so you can find the answer by yourself:
Every undirected graph can be decomposed in (not disjoint) sets of biconnected components: each biconnected component is a subset of the vertices of the graph where every vertex in this subset will remain connected even if you remove any vertex of the graph
Tarjan's O(N+M) algorithm to find bridges and articulation points can be altered in order to find the biconnected components, finding which vertices belong to each biconnected component, or which biconnected components contain each vertex
If you remove any vertex that is not an articulation point, answer for this vertex is obviously N-1
If you remove an articulation point, every vertex in the same biconnected component of the starting vertex will still be acessible, but you don't know about the other biconnected components. Don't worry, there is a way to find this efficiently
You can compress every graph G in a graph B of its biconnected components. The compression algorithm is simple: every biconnected component becomes a vertex in B, and you link biconnected components that share some articulation point. We can prove that the resulting graph B is a tree. You must use this tree somehow in order to solve the problem presented in step 4
Good luck!

Related

Algorithm for Finding Graph Connectivity

I'm tackling an interesting question in programming. It is this: we keep adding undirected edges to a graph, until the graph (or subgraph) is connected (i.e. we can use some path to get from each vertex to any other vertex in that subgraph). We stop as soon as the graph is connected.
For example if we have vertices 1,2,3 and 4 and we want the subgraph 1,2,3 to be connected.
Let's say we have edges (3,4), then (2,3), then (1,4), then (1,3). We only need to add in the first 3 edges for the subgraph to be connected, then we stop (edge 1,3 isn't needed).
Obviously I can run a BFS every time an edge is added to see if we can reach the required vertices, but if there are say m edges then we would potentially have to run BFS m times which seems too slow. Any better options? Thanks.
You should research the marvelous "Disjoint-set data structure" and the corresponding union - find algorithm. It can seem magical, but the worst case time and space complexity are tiny, O(α(n)) and O(n) respectively, where α is the inverse Ackerman function.
You can run just one time the BFS to find connected components. Then, each time you add an edge, if it is between vertices of two different components, you can merge them by a reference. So, the complexity of this algorithm is |V| + |E|.
Notice that the implementation of this method should be done by some reference techniques, especially to update the component number of the vertices.
I would normally do this using a disjoint set structure, as Doug suggests. It would be like Kruskal's algorithm for finding the minimum spanning tree, except you process edges in the given order.
If you don't need a spanning tree as output, though, then you can do this with an incremental BFS or DFS:
Pick any vertex, and find the vertices connected to it with BFS or DFS. Color these vertices red. If you start with no edges, of course, then there will be only one red vertex at this stage.
As you add edges, don't do anything else until you add an edge that connects a red vertex to a non-red vertex. Then run BFS or DFS, excluding the new edge, to find all the new vertices that will connect to the red set. Color them all red.
Stop when all vertices are red.
This is a little simpler in practice than using disjoint set, and takes O(|V|+|E|) time, since each vertex will be traversed by exactly one BFS/DFS search.
It does the work in chunks, though, so if you need each edge test to be fast individually, then disjoint set is better.

I don't understand how this algorithm will tell me if a graph is biconnected

I'm doing some practice for an upcoming interview, and a practice question I found asks for an O(V+E) algorithm to tell if a graph is biconnected. This page from Princeton says that a graph is biconnected if it has no articulation vertices, where an articulation vertex is a vertex whose removal will increase the number of connected components (since a biconnected graph should have one connected component).
One common solution to this problem is to do DFS with additional tracking to see if any of the vertices are articulation vertices. This page says that a vertex is an articulation vertex if
V is a root node with at least 2 children
OR
V has a child that cannot reach an ancestor of V
However, the condition for root nodes doesn't make sense to me. Here is an example of a biconnected graph:
If we choose any of the nodes to be the root, they ALL have 2 or more children, so would be an articulation vertex thus making the graph not biconnected. This is a common algorithm for finding connected components, so assume I have misunderstood something. What do I actually need to check for the root node to see if a graph is biconnected?
You're supposed to do depth-first search, which means that the DFS tree always looks like
1 4
| |
2---3
because you can't explore the 1--4 link until you're done exploring everything that can be reached from 2 without going through 1, and you won't add 1--4 because it makes a cycle with the tree edges. No node in this tree has two children (1 is the root).

How to determine whether removing a given cycle will disconnect a graph

I have seen ways to detect a cycle in a graph, but I still have not managed to find a way to detect a "bridge-like" cycle. So let's say we have found a cycle in a connected (and undirected) graph. How can we determine whether removing this cycle will disconnect the graph or not? By removing the cycle, I mean removing the edges in the cycle (so the vertices are unaffected).
One way to do it is clearly to count the number of components before and after the removal. I'm just curious to know if there's a better way.
If there happens to be an established algorithm for that, could anyone please point me to a related work/paper/publication?
Here's the naive algorithm, complexity wise I don't think there's a more efficient way of doing the check.
Start with your list of edges (cycleEdges)
Get the set of vertices within cycleEdges (cycleVertices)
If a vertex in cycleVertices only contains edges that are part of cycleEdges return FALSE
For Each vetex In cycleVertices
Recursively follow vertex's edges that are not in cycleEdges (avoid already visited vertices)
If a vertex is reached that is not in cycleVertices add it to te set outsideVertices (stop recursively searching this path)
If only vertices that are in cycleVertices have been reached Return FALSE
If outsideVertices contains 1 element Return TRUE
Choose a vertex in outsideVertices and remove it from outsideVertices
Recursively follow that vertex's edges that are not in cycleEdges (avoid already visited vertices) (favor choosing edges that contain a vertex in outsideVertices to improve running time for large graphs)
If a vertex is reached that is in outsideVertices remove it from outsideVertices
If outsideVertices is empty Return TRUE
Return FALSE
You can do it for E+V.
You can get all bridges in your graph for E+V by dfs + dynamic programming.
http://www.geeksforgeeks.org/bridge-in-a-graph
Save them (just make boolean[E], and make true.
Then you can say for O(1) the edge is bridge or not.
You can just take all edges from your cycle and verify that it is bridge.
Vish's mentions articulation points which is definitely in the right direction. More can be said though. Articulation points can be found via a modified DFS algorithm that looks something like this:
Execute DFS, assigning each number with its DFS number (e.g. the number of nodes visited before it). When you encounter a vertex that has already been discovered compare its DFS number to the current vertex and you can store a LOW number associated with this vertex (e.g. the lowest DFS number that this node has "seen"). As you recurse back to the start vertex, you can compare the parent vertex with the child's LOW number. As you're recursing back, if a parent vertex ever sees a child's low number that is greater than or equal to its own DFS number, then that parent vertex is an articulation point.
I'm using "child" and "parent" here as descriptors a lot because in the DFS tree we have to consider a special case for the root. If it ever sees a child's low number that is greater than or equal to its own DFS number and it has two children in the tree, then the first vertex is an articulation.
Here's a useful art. point image
Another concept to be familiar with, especially for undirected graphs, is biconnected components, aka any subgraph whose vertices are in a cycle with all other vertices.
Here's a useful colored image with biconnected components
You can prove that any two biconnected components can only share one vertex max; two "shared" vertices would mean that the two are in a cycle, as well as with all the other vertices in the components so the two components are actually one large component. As you can see in the graph, any vertex shared by two components (has more than one color) is an articulation point. Removing the cycle that contains any articulation point will thus disconnect the graph.
Well, as in a cycle from any vertex x can be reached any other vertex y and vice-verse, then it's a strongly connected component, so we can contract a cycle into a single vertex that represents the cycle. The operation can be performed for O(n+m) using DFS. Now, we can apply DFS again in order to check whether the contracted cycles are articulation vertices, if they are, then removing them will disconnect a graph, else not. Total time is 2(n+m) = O(n+m)

Give an order for deleting vertices from a graph such that it doesn't disconnect the graph

This is a question from Algorithm Design by Steven Skiena (for interview prep):
An articulation vertex of a graph G is a vertex whose deletion disconnects G. Let G be a graph with n vertices and m edges. Give a simple O(n + m) that finds a deletion order for the n vertices such that no deletion disconnects the graph.
This is what I thought:
Run DFS on the graph and keep updating each node's oldest reachable ancestor (based on which we decide if it's a bridge cut node, parent cute node or root cut node)
If we find a leaf node(vertex) or a node which is not an articulation vertex delete it.
At the end of DFS, we'd be left with all those nodes in graph which were found to be articulation vertices
The graph will remain connected as the articulation vertices are intact. I've tried it on a couple of graphs and it seems to work but it feels too simple for the book.
in 2 steps:
make the graph DAG using any traversal algorithm
do topology sort
each step finishes without going beyond O(m+n)
Assuming the graph is connected, then any random node reaches a subgraph whose spanning tree may be deleted in post-order without breaking the connectedness of the graph. Repeat in this manner until the graph is all gone.
Utilize DFS to track the exit time of each vertex;
Delete vertices in the order of recorded exit time;
If we always delete leaves of a tree one by one, rest of the tree remain connected. One particular way of doing this is to assign a pre-order number to each vertex as the graph is traversed using DFS or BFS. Sort the vertices in descending order (based on pre-order numbers). Remove vertices in that order from graph. Note that the leaves are always deleted first.

Completely disconnecting a bipartite graph

I have a disconnected bipartite undirected graph. I want to completely disconnect the graph. Only operation that I can perform is to remove a node. Removing a node will automatically delete its edges. Task is to minimize the number of nodes to be removed. Each node in the graph has atmost 4 edges.
By completely disconnecting a graph, I mean that no two nodes should be connected through a link. Basically an empty edge set.
I think, you cannot prove your algorithm is optimal because, in fact, it is not optimal.
To completely disconnect your graph minimizing the number of nodes to be removed, you have to remove all the nodes belonging to the minimal vertex cover of your graph. Searching the minimal vertex cover is usually NP-complete, but for bipartite graphs there is a polynomial-time solution.
Find maximum matching in the graph (probably with Hopcroft–Karp algorithm). Then use König's theorem to get the minimal vertex cover:
Consider a bipartite graph where the vertices are partitioned into left (L) and right (R) sets. Suppose there is a maximum matching which partitions the edges into those used in the matching (E_m) and those not (E_0). Let T consist of all unmatched vertices from L, as well as all vertices reachable from those by going left-to-right along edges from E_0 and right-to-left along edges from E_m. This essentially means that for each unmatched vertex in L, we add into T all vertices that occur in a path alternating between edges from E_0 and E_m.
Then (L \ T) OR (R AND T) is a minimum vertex cover.
Here's a counter-example to your suggested algorithm.
The best solution is to remove both nodes A and B, even though they are different colors.
Since all the edges are from one set to another, find these two sets using say BFS and coloring using 2 colours. Then remove the nodes in smaller set.
Since there are no edges among themselves the rest of the nodes are disconnected as well.
[As a pre-processing step you can leave out nodes with 0 edges first.]
I have thought of an algorithm for it but am not able to prove if its optimal.
My algorithm: On each disconnected subgraph, I run a BFS and color it accordingly. Then I identify the number of nodes colored with each color and take the minimum of the two and store. I repeat the procedure for each subgraph and add up to get the required minimum. Help me prove the algorithm if it's correct.
EDIT: The above algorithm is not optimal. The accepted answer has been verified to be correct.

Resources