Creating a graph and finding strongly connected components in a single pass (not just Tarjan!)

I have a particular problem where each vertex of a directed graph has exactly four outward-pointing paths (which can point back to the same vertex).
At the beginning, I have only the starting vertex and I use DFS to discover/enumerate all the vertices and edges.
I can then use something like Tarjan's algo to break the graph into strongly connected components.
My question is: is there a more efficient way of doing this than discovering the graph first and then applying an algorithm? For example, is there a way of combining the two parts to make them more efficient?

To avoid having to "discover" the graph at the outset, the key property that Tarjan's algorithm needs is that, at any point in its execution, it depends only on the subgraph it has explored so far, and that it only ever extends this explored region by enumerating the neighbours of some already-visited vertex. (If, for example, it required knowing the total number of nodes or edges in the graph at the outset, you would be sunk.) From looking at the Wikipedia page, it seems that the algorithm does indeed have this property, so no, you don't need to perform a separate discovery phase at the start. You can discover each vertex "lazily" at two lines of the pseudocode: at "for each (v, w) in E do", enumerate all neighbours of v just as you currently do in your discovery DFS; at "for each v in V do", pick v to be any vertex that you have already discovered as a w in the previous step but have not yet visited with a call to strongconnect(v).
That said, since your initial DFS discovery phase only takes linear time anyway, I'd be surprised if eliminating it sped things up much. If your graph is so large that it doesn't fit in cache, it could halve the total time, though.
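A minimal sketch of the lazy variant, assuming the graph is only available through a callback. The names `lazy_scc` and `neighbours` are made up for this illustration; `neighbours(v)` stands in for however you currently enumerate the four outgoing paths of a vertex. It is called exactly once per vertex, the first time that vertex is visited, so no separate discovery pass is needed. The recursion of the textbook version is replaced by an explicit work stack so that large graphs do not hit Python's recursion limit.

```python
def lazy_scc(start, neighbours):
    """Tarjan's SCC over everything reachable from `start`.

    `neighbours(v)` (hypothetical callback) returns the out-neighbours of v
    and is invoked only once per vertex, when v is first visited.
    """
    index = {}        # discovery number of each visited vertex
    low = {}          # smallest discovery number reachable from v's subtree
    stack = []        # Tarjan's vertex stack
    on_stack = set()
    components = []

    index[start] = low[start] = 0
    stack.append(start)
    on_stack.add(start)
    work = [(start, iter(neighbours(start)))]   # explicit DFS stack

    while work:
        v, it = work[-1]
        descended = False
        for w in it:
            if w not in index:
                # First time we see w: discover it lazily and descend.
                index[w] = low[w] = len(index)
                stack.append(w)
                on_stack.add(w)
                work.append((w, iter(neighbours(w))))
                descended = True
                break
            if w in on_stack:
                low[v] = min(low[v], index[w])
        if descended:
            continue
        # All neighbours of v handled: propagate low-link and maybe emit an SCC.
        work.pop()
        if work:
            parent = work[-1][0]
            low[parent] = min(low[parent], low[v])
        if low[v] == index[v]:
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            components.append(comp)
    return components
```

For example, with `g = {0: [1], 1: [2], 2: [0, 3], 3: [3]}`, calling `lazy_scc(0, lambda v: g[v])` yields one component containing 0, 1 and 2 and another containing only 3.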

Related

Algorithm: Minimal path alternating colors

Let G be a directed weighted graph with nodes colored black or white, and all weights non-negative. No other information is specified--no start or terminal vertex.
I need to find a path (not necessarily simple) of minimal weight which alternates colors at least n times. My first thought is to run Kosaraju's algorithm to get the component graph, then find a minimal path between the components. Then you could select nodes with in-degree equal to zero since those will have at least as many color alternations as paths which start at components with in-degree positive. However, that also means that you may have an unnecessarily long path.
I've thought about maybe trying to modify the graph somehow, by perhaps making copies of the graph that black-to-white edges or white-to-black edges point into, or copying or deleting edges, but nothing that I'm brain-storming seems to work.
The comments mention using Dijkstra's algorithm, and in fact there is a way to make this work. If we create a new "root" vertex in the graph and connect it to every other vertex with a zero-weight directed edge, we can run a modified Dijkstra's algorithm from the root outwards, terminating as soon as a path reaches n inversions. It is important to note that we must allow each vertex to be revisited, so the key of each vertex in our priority queue is not merely node_id but a tuple (node_id, inversion_count), representing that vertex as reached with a given number of inversions. In doing so, we implicitly make one copy of each vertex per inversion count (n + 1 copies, for counts 0 through n). Visually, we are effectively making n + 1 copies of our graph and redirecting each edge between a black vertex and a white vertex so that it leads from the ith copy to the (i+1)th copy. We run the algorithm until we reach a path with n inversions. Alternatively, we can connect each vertex in the nth copy to a "sink" vertex and run any conventional pathfinding algorithm on this graph, unmodified. This will run in O(n(E + Vlog(nV))) time. You could optimize this quite heavily, and also consider using A* instead, with smallest_inversion_weight * (n - inversion_count) as a heuristic.
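A rough sketch of the layered Dijkstra idea, under some assumptions: the graph is given as a hypothetical `adj` dictionary mapping each vertex to a list of `(neighbour, weight)` pairs, and `colour` maps every vertex to its colour. Instead of adding an explicit root vertex, every vertex is seeded at distance 0 with zero inversions, which has the same effect. The function name and data layout are made up for illustration.

```python
import heapq
from itertools import count

def min_alternating_walk(adj, colour, n):
    """Minimum weight of a walk with at least n colour changes, or None."""
    tie = count()   # tie-breaker so the heap never has to compare vertices
    # State = (vertex, inversion_count). Seeding all vertices at cost 0
    # plays the role of the extra "root" vertex.
    dist = {(u, 0): 0 for u in colour}
    heap = [(0, next(tie), u, 0) for u in colour]
    heapq.heapify(heap)

    while heap:
        d, _, u, k = heapq.heappop(heap)
        if d > dist.get((u, k), float("inf")):
            continue            # stale queue entry
        if k == n:
            return d            # first settled state with n inversions is optimal
        for v, w in adj.get(u, ()):
            # Crossing a black/white edge moves us to the next copy of the graph.
            k2 = k + (colour[u] != colour[v])
            if d + w < dist.get((v, k2), float("inf")):
                dist[(v, k2)] = d + w
                heapq.heappush(heap, (d + w, next(tie), v, k2))
    return None                 # no walk alternates n times
```

Because the search stops as soon as a state with `n` inversions is popped, states beyond the nth copy are never created.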
Furthermore, another idea hit me regarding using knowledge of the inversion requirement to speed up the search, but I was unable to find a way to implement it without exceeding O(V^2) time. The idea is that you can use an addition chain (like binary exponentiation) to decompose the shortest n-inversion path into two smaller paths, and rinse and repeat in a divide-and-conquer fashion. The issue is that you would need to construct tables for the shortest i-inversion path between any two vertices, which would be O(V^2) entries per i, and O(V^2logn) overall. To construct each table, for every entry in the preceding table you'd need to append V other paths, so it'd be O(V^3logn) time overall. Maybe someone else will see a way to merge these two ideas into an O((logn)(E + Vlog(Vlogn))) time algorithm or something.

how to test for bipartite in directed graph

Although we can check whether a given undirected graph is bipartite using BFS or DFS (2-coloring), the same implementation may not work for a directed graph.
So, to test a directed graph, I am building a new undirected graph G2 from my source graph G1, such that for every edge u -> v in G1 I add an edge [u, v] to G2.
By applying a 2-coloring BFS, I can now find out whether G2 is bipartite.
The same result applies to G1, since the two are structurally the same. But this method is costly, as I am using extra space for the second graph. Though this will suffice for my purposes for now, I'd like to know if there are any better implementations.
Thanks In advance.
You can execute the algorithm to find the 2-partition of an undirected graph on a directed graph as well, you just need a little twist. (BTW, in the algorithm below I assume that you will eventually find a 2-coloring. If not, then you will run into a node that is already colored and you find you need to color it to the other color. Then you just exit saying it's not bipartite.)
Start from any node and do the 2-coloring by traversing the edges. If you have traversed every edge and every node in the graph then you have your partition. If not, then you have a component that is 2-colored and there are no edges leaving the component. Pick any node not in the component and repeat. If you get into a situation when you have a few components that are all 2-colored, and there are no edges leaving any of them, and you encounter an edge that originates in a node in the component you are currently building and goes into a node in one of the previous components then you just merge the current component with the older one (and possibly need to flip the color of every node in one of the components -- flip it in the smaller component). After merging just continue. You can do the merge, because at the time of the merge you have scanned only one edge between the two components, so flipping the coloring of one of the components leaves you in a valid state.
The time complexity is still O(max(|N|,|E|)), and all you need is an extra field for every node indicating which component that node is in.
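One way to get the same effect as the merge-and-flip step, without building an undirected copy, is a union-find structure that stores a parity bit per node. Note that this is a swapped-in technique, not the exact procedure described above: each directed edge is scanned once, components are merged when an edge joins two of them, and the "flip" is recorded lazily in the parity bit instead of recolouring nodes. The sketch assumes nodes are numbered 0 to `num_nodes - 1`; the function name is made up.

```python
def is_bipartite_directed(num_nodes, edges):
    """True if the graph is 2-colourable when edge directions are ignored."""
    parent = list(range(num_nodes))
    parity = [0] * num_nodes    # colour of a node relative to its parent

    def find(x):
        # Returns (root, colour of x relative to root), compressing the path.
        path = []
        while parent[x] != x:
            path.append(x)
            x = parent[x]
        root, p = x, 0
        for node in reversed(path):     # start with the node nearest the root
            p ^= parity[node]
            parity[node] = p
            parent[node] = root
        return root, p

    for u, v in edges:
        ru, pu = find(u)
        rv, pv = find(v)
        if ru == rv:
            if pu == pv:
                return False            # both ends forced to the same colour
        else:
            # Merge the two components so that u and v get opposite colours.
            parent[ru] = rv
            parity[ru] = pu ^ pv ^ 1
    return True
```

The only extra storage is the two per-node arrays, which matches the "one extra field per node" spirit of the answer.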

Finding the list of common children (descendants) for any two nodes in a cyclic graph

I have a cyclic directed graph and I was wondering if there is any algorithm (preferably an optimum one) to make a list of common descendants between any two nodes? Something almost opposite of what Lowest Common Ancestor (LCA) does.
As user1990169 suggests, you can compute the set of vertices reachable from each of the starting vertices using DFS and then return the intersection.
If you're planning to do this repeatedly on the same graph, then it might be worthwhile first to compute and contract the strong components to supervertices representing a set of vertices. As a side effect, you can get a topological order on supervertices. This allows a data-parallel algorithm to compute reachability from multiple starting vertices at the same time. Initialize all vertex labels to {}. For each start vertex v, set the label to {v}. Now, sweep all vertices w in topological order, updating the label of w's out-neighbors x by setting it to the union of x's label and w's label. Use bitsets for a compact, efficient representation of the sets. The downside is that we cannot prune as with single reachability computations.
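A small sketch of the label-propagation sweep, assuming the strong components have already been contracted so that the input is a DAG. The `dag` dictionary (vertex to list of out-neighbours) and the function name are hypothetical. Labels are stored as Python integers used as bitsets, one bit per start vertex, and the topological order comes from the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

def reach_labels(dag, starts):
    """For each vertex, a bitset of the start vertices that can reach it."""
    # graphlib expects a mapping from each node to its predecessors.
    preds = {v: set() for v in dag}
    for u, outs in dag.items():
        for x in outs:
            preds.setdefault(x, set()).add(u)

    bit = {s: 1 << i for i, s in enumerate(starts)}
    label = {v: bit.get(v, 0) for v in preds}   # start vertices carry their own bit

    # Sweep in topological order, pushing each label to the out-neighbours.
    for w in TopologicalSorter(preds).static_order():
        for x in dag.get(w, ()):
            label[x] |= label[w]
    return label
```

With two start vertices, the vertices whose label equals `0b11` are reachable from both. Each start vertex's label also contains its own bit, so decide separately whether a start vertex should count as its own descendant.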
I would recommend using a DFS (depth first search).
For each input node:
    Create a collection to store reachable nodes
    Perform a DFS to find reachable nodes
    When a node is reached:
        If it's already stored, stop searching that path // Prevent cycles
        Else store it and continue
Find the intersection between all collections of nodes
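The steps above could be sketched as follows, assuming the graph is stored as a hypothetical `adj` dictionary mapping each node to a list of its out-neighbours. The function names are made up for illustration. A node counts as its own descendant only if it can be reached again through a cycle.

```python
def reachable(adj, start):
    """All nodes reachable from `start` by following at least one edge."""
    seen = set()
    stack = list(adj.get(start, ()))
    while stack:
        v = stack.pop()
        if v in seen:
            continue            # already stored: stop here, prevents cycling
        seen.add(v)
        stack.extend(adj.get(v, ()))
    return seen

def common_descendants(adj, a, b):
    """Nodes reachable from both a and b."""
    return reachable(adj, a) & reachable(adj, b)
```

For instance, with `adj = {1: [3], 2: [3], 3: [4], 4: [3]}`, `common_descendants(adj, 1, 2)` gives the set containing 3 and 4.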
Note: You could easily use BFS (breadth first search) instead with the same logic if you wanted.
When you implement this keep in mind there will be a few special cases you can look for to further optimize your search such as:
If an input node doesn't have any outgoing edges, then there are no common nodes.
If one input node (A) reaches another input node (B), then A can reach everything B can. This means the common descendants of the pair are exactly the nodes reachable from B, so the rest of the search from A can be skipped.
etc.
Why not just reverse the direction of the edges and use LCA?

How to find the minimum set of vertices in a Directed Graph such that all other vertices can be reached

Given a directed graph, I need to find the minimum set of vertices from which all other vertices can be reached.
So the result of the function should be the smallest number of vertices, from which all other vertices can be reached by following the directed edges.
The largest result possible would be if there were no edges, so all nodes would be returned.
If there are cycles in the graph, for each cycle, one node is selected. It does not matter which one, but it should be consistent if the algorithm is run again.
I am not sure whether there is an existing algorithm for this. If so, does it have a name? I have tried doing my research, and the closest thing seems to be finding a mother vertex.
If it is that algorithm, could the actual algorithm be elaborated as the answer given in that link is kind of vague.
Given I have to implement this in javascript, the preference would be a .js library or javascript example code.
From my understanding, this is just finding the strongly connected components in a graph. Kosaraju's algorithm is one of the neatest approaches to do this. It uses two depth first searches as against some later algorithms that use just one, but I like it the most for its simple concept.
Edit: Just to expand on that, the minimum set of vertices is found as was suggested in the comments to this post :
1. Find the strongly connected components of the graph - reduce each component to a single vertex.
2. The remaining graph is a DAG (or a set of DAGs if there were disconnected components). Its roots, that is, the condensed vertices with in-degree zero, form the required set: pick one original vertex from each root component.
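The two steps can be sketched as below. The question asks for JavaScript, but this illustration is in Python and should translate directly. It assumes a hypothetical `adj` dictionary in which every vertex appears as a key, uses Kosaraju's two-pass approach for step 1, and picks the first vertex reached in the second pass as each component's representative, so repeated runs on the same input give the same answer. The final `sorted` call assumes the vertices are comparable.

```python
def minimum_root_set(adj):
    """One vertex from every strongly connected component with no incoming
    edge from another component. adj: {u: [v, ...]} with all vertices as keys."""
    # Pass 1: iterative DFS, recording vertices in order of completion.
    order, seen = [], set()
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(adj[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:
                order.append(v)     # all out-edges of v explored
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing completion order.
    radj = {v: [] for v in adj}
    for u in adj:
        for v in adj[u]:
            radj[v].append(u)
    comp = {}                       # vertex -> representative of its component
    for s in reversed(order):
        if s in comp:
            continue
        comp[s] = s
        stack = [s]
        while stack:
            v = stack.pop()
            for w in radj[v]:
                if w not in comp:
                    comp[w] = s
                    stack.append(w)

    # Components entered by an edge from a different component are not roots.
    has_incoming = {comp[v] for u in adj for v in adj[u] if comp[u] != comp[v]}
    return sorted({c for c in comp.values() if c not in has_incoming})
```

With no edges at all, every vertex is its own root component, so all vertices are returned, as the question expects.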
[EDIT #2: As Jason Orendorff mentions in a comment, finding the feedback vertex set is overkill and will produce a vertex set larger than necessary in general. kyun's answer is (or will be, when he/she adds in the important info in the comments) the right way to do it.]
[EDIT: I had the two steps round the wrong way... Now we should guarantee minimality.]
Call all of the vertices with in-degree zero Z. No vertex in Z can be reached by any other vertex, so it must be included in the final set.
Using a depth-first (or breadth-first) traversal, trace out all the vertices reachable from each vertex in Z and delete them -- these are the vertices already "covered" by Z.
The graph now consists purely of directed cycles. Find a feedback vertex set F which gives you a smallest-possible set of vertices whose removal would break every cycle in the graph. Unfortunately as that Wikipedia link shows, this problem is NP-hard for directed graphs.
The set of vertices you're looking for is Z+F.

How to detect if breaking an edge will make a graph disjoint?

I have a graph that starts off with a single, root node. Nodes are added one by one to the graph. At node creation time, they have to be linked either to the root node, or to another node, by a single edge. Edges can also be created and deleted (one by one, between any two nodes). Nodes can be deleted one at a time. Node and edge creation, deletion operations can happen in any arbitrary order.
OK, so here's my question: When an edge is deleted, is it possible to determine, in constant time (i.e. with an O(1) algorithm), if doing this will divide the graph into two disjoint subgraphs? If it will, then which side of the edge will the root node belong to?
I'm willing to maintain, within reasonable limits, any additional data structure that can facilitate the derivation of this information.
Maybe it is not possible to do it in O(1), if so any pointers to literature will be appreciated.
Edit: The graph is a directed graph.
Edit 2: OK, maybe I can restrict the case to deletion of edges from the root node. [Edit 3: not, actually] Also, no edge lands into the root node.
To speed things up a little over the obvious O(|V|+|E|) solution, you could keep a spanning tree which is fairly easy to update as the graph is changed.
If an edge not in the spanning tree is deleted, then the graph isn't disconnected and do nothing. If an edge in the spanning tree is deleted, then you must try to find a new path between those two vertices (if you find one, use it to update the spanning tree, otherwise the graph is disconnected).
So, best case O(1), worst-case O(|V|+|E|), but fairly simple to implement anyway.
Is this a directed graph? The below assumes undirected.
What you are looking for is whether the given edge is a Bridge in the graph. I believe this can be found using a traversal looking for cycles containing that edge and would be O(|V| + |E|).
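A simple O(|V| + |E|) check along these lines, for the undirected case: ignore the edge under test and see whether its two endpoints are still connected. This is a plain BFS sketch, not the dynamic data structure discussed later in the thread. It assumes a hypothetical `adj` dictionary mapping each node to a set of neighbours, with no parallel edges.

```python
from collections import deque

def is_bridge(adj, u, v):
    """True if deleting the undirected edge u-v disconnects u from v."""
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if x == u and y == v:
                continue        # skip the edge being tested
            if y == v:
                return False    # found another route from u to v
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return True
```

If the function returns `True`, the `seen` set built during the search is exactly the side of the split containing `u`, which could be used to tell which side holds the root.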
O(1) is too much to ask.
You might find that looking to maintain 2-edge connected components in dynamic graphs could be useful to you.
Eppstein et al have a paper on this: http://www.ics.uci.edu/~eppstein/pubs/EppGalIta-TR-93-20.pdf
which can maintain 2-edge connected components, in a graph of n nodes where edge insertions and deletions are allowed. It has O(sqrt(n)) time per update and O(log n) time per query.
So any time you delete, you can query in O(logn) to determine if the number of 2-edge connected components has changed. I suppose it can also tell you which component a specific node is in.
This paper is more general and applies to other graph problems, not only 2 edge connected components.
I suggest you look for bridges and dynamic 2-edge connectivity to get you started.
Hope that helps.
As Moron said just before, you are actually looking for a Bridge in your graph.
Now a Bridge is an edge that has the described attribute and also originates and ends up in Cut Vertices. A Cut Vertex is exactly what a Bridge is, but in a vertex (node) edition.
So the only way I can think of to get O(1) complexity for this (though it quite bends the initial data structure hypothesis) is to first check, for every node in your graph, whether it is a Cut Vertex, and then simply check in constant time whether the edge you want to delete is attached to one of those.
Finding if a node in a graph is a Cut Vertex takes O(m+n) where m = # edges and n= # nodes.
Cheers
