Can we detect cycles in directed graph using Union-Find data structure? - data-structures

I know that one can detect cycles in direct graphs using DFS and BFS. I want to know whether we can detect cycles in directed graphs using Union-Find or not?
If yes, then how? and
If we can't, then why?

No, we cannot use union-find to detect cycles in a directed graph. This is because a directed graph cannot be represented using the disjoint-set(the data structure on which union-find is performed).
When we say 'a union b' we cannot make out the direction of edge
is a going to b? (or)
is b going to a?
But, incase of unordered graphs, each connected component is equivalent to a set. So union-find can be used to detect a cycle. Whenever you try to perform union on two vertices belonging to the same connected component, we can say that cycle exists.

No.
Let me give you an example:
Q1. Take an undirected graph:
Is there a cycle in above undirected graph? Yes. And we can find the cycle using Union-Find algo.
Q2. Now look at the similar directed graph:
Is there a cycle in above directed graph? No!
BUT if you use Union-Find algo to detect cycle in above directed graph, it will say YES! Since union-find algo looks at the above diagram as below:
OR
Is there a cycle in above diagram? Yes! But the original(Q2) question was tampered and this is not what was asked. So Union-find algo will give wrong results for directed graphs.

No, we cannot use union-find to detect cycles in a directed graph.
This is because a directed graph cannot be represented using the
disjoint-set(the data structure on which union-find is performed).
When we say 'a union b' we cannot make out the direction of edge
is a going to b? (or) is b going to a? But, incase of unordered
graphs, each connected component is equivalent to a set. So union-find
can be used to detect a cycle. Whenever you try to perform union on
two vertices belonging to the same connected component, we can say
that cycle exists.
Just wanted to add to #Cherubim's answer, I just thought of this question that why can't we make out the direction of 'a union b' like we can consider it as a is connected to b right ('a union b' only nodes which are a->b)?
Well that will also fail, consider two nodes x and y, they are connected x -> y and they already have parents in the disjoint set, x_root and y_root.
So what happens when we say 'x union y' (x is connected to y), we make y_root the parent of x_root. well this tells us that:
x_root -> y_root (could be opposite too)
x_root and y_root are connected. (they could be disconnected)
So the main problem beside ambiguous direction, is also that not every node in a directed graph is connected to each other.

Related

Directed Graph walk - visit every node at least once

I’m familiar with the Hamilton path for a directed graph - visit every node exactly once.
I’m looking for an algorithm to walk the graph so that I visit every node at least once. I can’t find the standard name for this problem, if any.
This graph is walkable - root-a-d-b-c
This graph is not walkable - because in my walk, if I reach c, I have no directed edge to reach a & d and conversely, if I walk to a, d; there’s no directed edge that takes me to b & c
Hope that clarifies the question? Is there a standard name for this type of graph walk and an algorithm to solve it?
Hamiltonian path
Finding at most 2 leafs in the graph
I don't know if there's a name for a directed "walkable" graph, but it's not too hard to determine of a graph is walkable or not:
Find all the strongly connected components using Tarjan's algorithn, for example
Make a new directed graph of the connections between SCCs. This will be a DAG, and your original graph is walkable if and only if this DAG is walkable.
To determine whether or not a DAG is walkable, do a topological sort. Then check that each vertex has an edge to the next.
Each of these steps takes linear time, so you get O(|V|+|E|) complexity for the whole algorithm.
Theorem: a directed graph is walkable if and only if its strongly connected components are totally ordered by reachability.
Proof sketch: (walkable implies condition) The existence of a walk implies that, for each two strongly connected components, a vertex from one or the other appears first in the walk. That component can reach the other. (condition implies walkable) Since there is full connectivity inside a strongly connected component, each is walkable on its own. We just have to concatenate the walks according to the total order, adding the necessary transitions.
The proof is constructive, so we can extract an algorithm.
Algorithm
Compute the strongly connected components.
Concatenate the strongly connected components in topological order. Actually Tarjan's algorithm will list them in reverse topological order, so this doesn't need to be a separate step.
For each adjacent pair in the previous list, use breadth-first search to find a shortest path.
In general, this algorithm does not find the shortest walk (that's NP-hard by reduction from Hamiltonian path).

Finding a Minimal set of vertices that satisfy the given constraints

Note: no need for formal proof or anything, just the general idea of the algorithm and I will go deeper myself.
Given a directed graph: G(V,E), I want to find the smallest set of vertices T, such that for each vertex t in T the following edges don't exist: {(t,v) | for every v outside t} in O(V+E)
In other words, it's allowed for t to get edges from vertices outside T, but not to send.
(You can demonstrate it as phone call, where I am allowed to be called from outside and it's free but it's not allowed to call them from my side)
I saw this problem to be so close or similar to finding all strongly connected components (scc) in a directed graph which its time complexity is O(V+E) and I'm thinking of building a new graph and running this algorithm but not totally sure about that.
The main idea is to contract each strongly connected component (SCC) of G into a single vertex while keeping a score on how many vertices were contracted to create each vertex in the contracted graph (condensation of G). The resulting graph is a directed acyclic graph. The answer is the vertex with lower score among the ones with out-degree equal 0.
The answer structure is an union of strongly connected components because of the restriction over edges and you can prove that there is only a SCC involved in the answer because of the min restriction.

Make an undirected graph a strongly connected component (SCC)

Let G(V,E), an undirected, connected graph with no bridges at all. Describe an algorithm which direct the edges, so the new graph is an SCC.
My suggested algorithm
So we run DFS from an arbitrary vertex. We notice that since it's an undirected graph there are only tree-edges and back-edges (Right?). We connect edges accordingly (a tree-edge would be directed "down" and a back-edge would be directed "up").
A start of proof
We notice that the graph has no bridges. Therefore, every edge is part of some cycle. Therefore, the last edge to be discovered in some cycle must be a back-edge.
Now, I think the rest of the proof would need to show that we can always "climb" to the root, and so the graph is an SCC.
I'd be glad if you could help me connect the dots (or vertices XD)
Thanks
What you are looking for is a proof for Robbins' Theorem.
For a more formal proof you can look at this paper (See proof for theorem 2).
The below is not a formal proof but it's a way you can think of it:
As you already mentioned, since there are no bridges, every edge is a part of some cycle. Since you want your output graph to be SCC, a DFS on this output graph (from any vertex) must only have back-edges and tree-edges. It cannot have forward-edges or cross-edges.
Lets assume we have a forward-edge from s to t. This means that in the DFS we ran in order to build the graph, t was discovered in the sub-DFS (recursive call) of s and had no other gray or white adjacents. But this is not true because whenever we discover t in our DFS we would still have a gray adjacent.
Lets assume we have a cross-edge from s to t. This means that t's sub-DFS has ended before s was discovered. Again, this cannot happen in our DFS because either when discovering t first we would discover s in its sub-DFS or the opposite direction.
Here is a simple graph to help you see those cases.

Weakly connected balanced digraph

How can I prove that if a balanced digraph is weakly connected, then it is also strongly connected? (balanced digraph means that for every node, it's indegree and outdegree is the same and weakly connected means the non-directed version of this graph is connected).
What I can think of so far is: if the graph is balanced, it means it is a union of directed cycles. So if I remove any cycle, it will stay balanced. Also each vertex in the cycle has one edge coming into it and one edge leading out of it..
Then I guess I need to use some contradiction or induction to prove that the graph is strongly connected.. That's where I confused.
If two of those cycles intersect then they form a strongly connected component (since we can travel from any vertex in the cycle to the intersection of them then go around the other cycle and then come back again to complete the figure-8 )
Since the graph is weakly connected we know all the cycles intersect so therefore the graph is strongly connected.
I think you can flesh out the formality I left out.

How to find the minimum set of vertices in a Directed Graph such that all other vertices can be reached

Given a directed graph, I need to find the minimum set of vertices from which all other vertices can be reached.
So the result of the function should be the smallest number of vertices, from which all other vertices can be reached by following the directed edges.
The largest result possible would be if there were no edges, so all nodes would be returned.
If there are cycles in the graph, for each cycle, one node is selected. It does not matter which one, but it should be consistent if the algorithm is run again.
I am not sure that there is an existing algorithm for this? If so does it have a name? I have tried doing my research and the closest thing seems to be finding a mother vertex
If it is that algorithm, could the actual algorithm be elaborated as the answer given in that link is kind of vague.
Given I have to implement this in javascript, the preference would be a .js library or javascript example code.
From my understanding, this is just finding the strongly connected components in a graph. Kosaraju's algorithm is one of the neatest approaches to do this. It uses two depth first searches as against some later algorithms that use just one, but I like it the most for its simple concept.
Edit: Just to expand on that, the minimum set of vertices is found as was suggested in the comments to this post :
1. Find the strongly connected components of the graph - reduce each component to a single vertex.
2. The remaining graph is a DAG (or set of DAGs if there were disconnected components), the root(s) of which form the required set of vertices.
[EDIT #2: As Jason Orendorff mentions in a comment, finding the feedback vertex set is overkill and will produce a vertex set larger than necessary in general. kyun's answer is (or will be, when he/she adds in the important info in the comments) the right way to do it.]
[EDIT: I had the two steps round the wrong way... Now we should guarantee minimality.]
Call all of the vertices with in-degree zero Z. No vertex in Z can be reached by any other vertex, so it must be included in the final set.
Using a depth-first (or breadth-first) traversal, trace out all the vertices reachable from each vertex in Z and delete them -- these are the vertices already "covered" by Z.
The graph now consists purely of directed cycles. Find a feedback vertex set F which gives you a smallest-possible set of vertices whose removal would break every cycle in the graph. Unfortunately as that Wikipedia link shows, this problem is NP-hard for directed graphs.
The set of vertices you're looking for is Z+F.

Resources