Connected components of undirected graph - algorithm

Suppose I have an undirected graph G with vertices v1...vn and a set of edges. Right now it is stored in adjacency list representation.
At every time moment I have as input some subset of vertices that are "active" at that moment, and I need to find all connected components within this subset of vertices for that time moment.
Right now I've implemented this using union-find data structure like this:
initialize sets for every active vertex so that every vertex has itself as "representative" (also called "parent")
for every active vertex v
    for every neighbour v_neighbour of v in G
        if v_neighbour is active
            union set of v and set of v_neighbour
It should work OK, but I want to know whether there is a more efficient approach.
And what is the running time of that algorithm? O(N*M)?
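The union-find procedure described above can be sketched in Python as follows. This is only an illustration under assumptions: the function name `active_components`, the dict-of-lists adjacency format, and the sample graph are made up for the example. Path halving and union by size are added so that the total cost stays close to linear in the number of active vertices plus the edges scanned.

```python
def active_components(adj, active):
    """Group the active vertices into connected components with union-find.

    adj    -- hypothetical adjacency format: dict mapping vertex -> list of neighbours
    active -- iterable of the vertices that are active at this time moment
    """
    active = set(active)                 # O(1) "is this vertex active?" checks
    parent = {v: v for v in active}      # every active vertex is its own representative
    size = {v: 1 for v in active}

    def find(v):
        # Walk up to the representative, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:          # attach the smaller tree under the larger one
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for v in active:
        for w in adj.get(v, ()):
            if w in active:
                union(v, w)

    groups = {}
    for v in active:
        groups.setdefault(find(v), set()).add(v)
    return list(groups.values())


# Path 1-2-3-4 plus an isolated vertex 5; vertex 3 is inactive.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3], 5: []}
components = active_components(adj, [1, 2, 4, 5])
```

Because vertex 3 is inactive, vertex 4 ends up in a component of its own even though it is connected to 1 and 2 in G.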

A BFS or DFS that restarts once a connected component is exhausted can easily find all connected components of an undirected graph in O(V+E) time:
connectedComponentNumber = 0
while there is a node that was not discovered yet:
    connectedComponentNumber = connectedComponentNumber + 1
    v = some vertex that was not discovered yet
    vertices = BFS(v) // all vertices connected to v
    mark all nodes in vertices as belonging to connected component connectedComponentNumber
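A minimal Python sketch of this restart-BFS idea, restricted to the active vertices from the question, might look like the following. The names `bfs_components` and `labels` and the sample graph are assumptions made for the example; inactive vertices are simply never entered.

```python
from collections import deque


def bfs_components(adj, active):
    """Label every active vertex with the number of its connected component.

    adj    -- hypothetical adjacency format: dict mapping vertex -> list of neighbours
    active -- iterable of active vertices; all other vertices are ignored
    """
    active = set(active)
    component = {}          # vertex -> component number, doubles as the "discovered" set
    count = 0
    for start in active:
        if start in component:
            continue        # already reached from an earlier start vertex
        count += 1
        component[start] = count
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj.get(v, ()):
                if w in active and w not in component:
                    component[w] = count
                    queue.append(w)
    return component


labels = bfs_components({1: [2], 2: [1, 3], 3: [2, 4], 4: [3], 5: []}, [1, 2, 4, 5])
```

Each active vertex is enqueued once and each of its adjacency lists is scanned once, which matches the O(V+E) bound as long as the activity check is a constant-time set lookup.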

I want to add one thing to the above reply: if you check whether a vertex is active by searching through a list, each check takes up to V steps, so the running time can grow to O(V^3). The check takes constant time if we maintain a separate data structure for it (for example, a boolean array indexed by vertex).

Related

Finding a Minimal set of vertices that satisfy the given constraints

Note: no need for formal proof or anything, just the general idea of the algorithm and I will go deeper myself.
Given a directed graph G(V,E), I want to find, in O(V+E) time, the smallest set of vertices T such that no edge leaves T: that is, no edge (t,v) exists with t in T and v outside T.
In other words, vertices in T are allowed to receive edges from vertices outside T, but not to send edges to them.
(You can picture it as phone calls: I can be called from outside for free, but I am not allowed to call outside from my side.)
This problem looks close to finding all strongly connected components (SCCs) of a directed graph, which takes O(V+E) time. I'm thinking of building a new graph and running that algorithm on it, but I'm not totally sure about that.
The main idea is to contract each strongly connected component (SCC) of G into a single vertex while keeping a score of how many vertices were contracted to create each vertex in the contracted graph (the condensation of G). The resulting graph is a directed acyclic graph. The answer is the vertex with the lowest score among those with out-degree 0.
The answer must be a union of strongly connected components because of the restriction on edges, and you can prove that only one SCC is involved in the answer because of the minimality requirement.
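As a rough sketch of the condensation idea, the code below labels SCCs with Kosaraju's two-pass algorithm (one of several linear-time SCC algorithms; the answer does not say which one to use) and then returns the smallest SCC that has no outgoing edge to another SCC. The function name `smallest_sink_scc` and the sample graph are assumptions for illustration.

```python
def smallest_sink_scc(vertices, edges):
    """Return the smallest SCC with out-degree 0 in the condensation (non-empty graph)."""
    adj = {v: [] for v in vertices}
    radj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        radj[v].append(u)

    # Pass 1: iterative DFS on G, recording vertices in order of finishing time.
    order, seen = [], set()
    for s in vertices:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(adj[s]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:                       # all neighbours explored: v is finished
                order.append(v)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finishing time.
    comp, members = {}, []
    for s in reversed(order):
        if s in comp:
            continue
        cid = len(members)
        comp[s] = cid
        members.append([s])
        stack = [s]
        while stack:
            v = stack.pop()
            for w in radj[v]:
                if w not in comp:
                    comp[w] = cid
                    members[cid].append(w)
                    stack.append(w)

    # An SCC is a sink of the condensation if no edge leaves it.
    has_out = [False] * len(members)
    for u, v in edges:
        if comp[u] != comp[v]:
            has_out[comp[u]] = True
    sinks = [m for cid, m in enumerate(members) if not has_out[cid]]
    return set(min(sinks, key=len))     # the "score" is the SCC size


# SCCs: {1, 2}, {3, 4}, {5}; the sinks are {3, 4} and {5}.
T = smallest_sink_scc([1, 2, 3, 4, 5],
                      [(1, 2), (2, 1), (2, 3), (3, 4), (4, 3), (2, 5)])
```

The condensation is never built explicitly here; a per-component flag for "has an outgoing edge" is enough to identify the sinks.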

Find Minimum Vertex Connected Sub-graph

First of all, I have to admit I'm not good at graph theory.
I have a weakly connected directed graph G=(V,E) where |V| is about 16 million and |E| is about 180 million.
For a given set S, which is a subset of V (the size of S will be around 30), is it possible to find a weakly connected sub-graph G'=(V',E') where S is a subset of V', while keeping |V'| and |E'| as small as possible?
The graph G may change and I hope there's a way to find the sub-graph in real time. (When a process is writing into G, G will be locked, so don't worry about G changing while your sub-graph calculation is still running.)
My current solution is to find the shortest path for each pair of vertices in S and merge those paths to get the sub-graph. The result is OK but the running time is pretty expensive.
Is there a better way to solve this problem?
If you're happy with the results from your current approach, then it's certainly possible to do at least as well a lot faster:
Assign each vertex in S to a set in a disjoint set data structure: https://en.wikipedia.org/wiki/Disjoint-set_data_structure. Then:
Do a breadth-first-search of the graph, starting with S as the root set.
When the search discovers a new vertex, remember its predecessor and assign it to the same set as its predecessor.
When you discover an edge that connects two sets, merge the sets and follow the predecessor links to add the connecting path to G'
Another way to think about doing exactly the same thing:
Sort all the edges in E according to their distance from S. You can use BFS discovery order for this.
Use Kruskal's algorithm to generate a spanning tree for G, processing the edges in that order (https://en.wikipedia.org/wiki/Kruskal%27s_algorithm)
Pick a root in S, and remove any subtrees that don't contain a member of S. When you're done, every leaf will be in S.
This will not necessarily find the smallest possible subgraph, but it will minimize its maximum distance from S.
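The first formulation (multi-source BFS plus a disjoint-set structure) could be sketched roughly as below. It is a sketch under assumptions: the function name `connect_terminals`, the dict-based adjacency format with edge directions ignored (since only weak connectivity is required), and the sample graph are all hypothetical. The search stops as soon as all members of S have been merged into one set.

```python
from collections import deque


def connect_terminals(adj, terminals):
    """Return (vertices, edges) of a connected subgraph containing all terminals.

    adj -- hypothetical format: dict vertex -> neighbours, edge directions ignored
    """
    terminals = list(dict.fromkeys(terminals))   # drop duplicates, keep order
    parent = {s: s for s in terminals}           # disjoint-set forest
    pred = {s: None for s in terminals}          # BFS predecessor links

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    sub_vertices, sub_edges = set(terminals), set()

    def add_path(v):
        # Follow predecessor links from v back to the terminal it was reached from.
        while v is not None:
            sub_vertices.add(v)
            if pred[v] is not None:
                sub_edges.add(frozenset((v, pred[v])))
            v = pred[v]

    remaining = len(terminals) - 1               # merges still needed
    queue = deque(terminals)
    while queue and remaining > 0:
        v = queue.popleft()
        for w in adj.get(v, ()):
            if w not in parent:                  # newly discovered vertex
                parent[w] = find(v)
                pred[w] = v
                queue.append(w)
            elif find(v) != find(w):             # edge joins two different sets
                parent[find(v)] = find(w)
                add_path(v)
                add_path(w)
                sub_edges.add(frozenset((v, w)))
                remaining -= 1
                if remaining == 0:
                    break
    return sub_vertices, sub_edges


# Path 1-2-3-4-5 with a side branch 3-6; connect the terminals 1 and 5.
adj = {1: [2], 2: [1, 3], 3: [2, 4, 6], 4: [3, 5], 5: [4], 6: [3]}
vs, es = connect_terminals(adj, [1, 5])
```

In this example the side branch through vertex 6 is never added, because only the predecessor paths of edges that merge two sets are copied into G'.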

Can graphs have isolated vertices?

Can a graph have an isolated vertex without edges?
Would this count as 1 graph or 2?
The definition of a graph is that it is just an ordered pair of a vertex set say V and an edge set say E, where E is a subset of VxV .
There is no restriction saying that every vertex should have at least one edge.
Of course we can have local definitions to solve a particular problem, but generally what you drew is just a single graph with 2 connected components.

Finding the heaviest edge in the graph that forms a cycle

Given an undirected graph, I want an algorithm (in O(|V|+|E|)) that will find the heaviest edge in the graph that is part of a cycle. For example, if my graph is as below, and I run DFS(A), then the heaviest such edge in the graph will be BC.
(*) In this problem, I have at most 1 cycle.
I'm trying to write a modified DFS, that will return the desired heavy edge, but I'm having some trouble.
Because I have at most 1 cycle, I can save the edges in the cycle in an array, and find the maximum edge easily at the end of the run, but I think this answer seems a bit messy, and I'm sure there's a better recursive answer.
I think the easiest way to solve this is to use a union-find data structure (https://en.wikipedia.org/wiki/Disjoint-set_data_structure) in a manner similar to Kruskal's MST algorithm:
Put each vertex in its own set
Iterate through the edges in order of weight. For each edge, merge the sets of the adjacent vertices if they're not already in the same set.
Remember the last edge for which you found that its adjacent vertices were already in the same set. That's the one you're looking for.
This works because the last and heaviest edge that you visit in any cycle must already have its adjacent vertices connected by edges you visited earlier.
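The union-find steps above can be sketched in Python as follows. The function name `heaviest_cycle_edge`, the `(weight, u, v)` edge format, and the way the weights 10, 8, 5, 2 are assigned to edges are assumptions, since the original figure is not reproduced here. Note that sorting the edges costs O(|E| log |E|), so this sketch does not by itself meet the O(|V|+|E|) bound asked for in the question.

```python
def heaviest_cycle_edge(vertices, weighted_edges):
    """Return the heaviest (weight, u, v) edge that closes a cycle, or None."""
    parent = {v: v for v in vertices}    # each vertex starts in its own set

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    answer = None
    for w, u, v in sorted(weighted_edges, key=lambda e: e[0]):
        ru, rv = find(u), find(v)
        if ru == rv:
            answer = (w, u, v)           # endpoints already connected: edge closes a cycle
        else:
            parent[ru] = rv              # otherwise merge the two sets
    return answer


# Hypothetical weights: A-B is a bridge, B-C-D form the only cycle.
edges = [(10, "A", "B"), (8, "B", "C"), (2, "C", "D"), (5, "B", "D")]
result = heaviest_cycle_edge("ABCD", edges)
```

On this sample graph the edges of weight 2 and 5 are merged first, the edge of weight 8 then finds B and C already connected, and the edge of weight 10 is merged last without closing a cycle.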
Use Tarjan's Strongly Connected Components algorithm.
Once you have split your graph into many strongly connected graphs assign a COMP_ID to each node which specifies the component ID to which this node belongs (This can be done with a small edit on the algorithm. Define a global integer value which starts at 1. Every time you pop nodes from the stack they all correspond to the same component, save the value of this variable to the COMP_ID of these nodes. When the pop loop ends increment the value of this integer by one).
Now, iterate over all the edges. You have 2 possibilities:
1. If this edge links two nodes from two different components, then this edge can't be the answer, since it can't possibly be a part of a cycle.
2. If this edge links two nodes from the same component, then this edge is a part of some cycle. All you have left to do now is to choose the maximum edge among all the edges of type 2.
The described approach runs in a total complexity of O(|V| + |E|) because every node and edge corresponds to at most one strongly connected component.
In the graph example you provided COMP_ID will be as follows:
COMP_ID[A] = 1
COMP_ID[B] = 2
COMP_ID[C] = 2
COMP_ID[D] = 2
Edge 10 connects COMP_ID 1 with COMP_ID 2, so it can't be the answer. The answer is the maximum among edges {2, 5, 8}, since they all connect COMP_ID 2 with itself; thus the answer is 8.
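For an undirected graph, one way to realize this "same component" test is bridge detection: an undirected edge lies on a cycle exactly when it is not a bridge. The sketch below uses a low-link DFS for bridges, which is a closely related technique swapped in for illustration rather than the SCC code described above. The function name, the `(weight, u, v)` edge format, and the weight assignment are assumptions, and the recursive DFS would need to be made iterative for very deep graphs.

```python
def heaviest_non_bridge_edge(vertices, weighted_edges):
    """Return the heaviest (weight, u, v) edge that lies on a cycle, or None."""
    adj = {v: [] for v in vertices}
    for idx, (_, u, v) in enumerate(weighted_edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))

    disc, low = {}, {}      # discovery time and low-link value per vertex
    bridges = set()         # indices of edges that lie on no cycle
    timer = 0

    def dfs(v, parent_edge):
        nonlocal timer
        disc[v] = low[v] = timer
        timer += 1
        for w, idx in adj[v]:
            if idx == parent_edge:
                continue                      # don't reuse the edge we arrived by
            if w in disc:
                low[v] = min(low[v], disc[w])     # back edge
            else:
                dfs(w, idx)
                low[v] = min(low[v], low[w])
                if low[w] > disc[v]:
                    bridges.add(idx)          # nothing below w reaches v or above

    for v in vertices:
        if v not in disc:
            dfs(v, None)

    on_cycle = [e for i, e in enumerate(weighted_edges) if i not in bridges]
    return max(on_cycle, key=lambda e: e[0]) if on_cycle else None


# Hypothetical weights: A-B (10) is a bridge, B-C-D form the cycle {8, 2, 5}.
edges = [(10, "A", "B"), (8, "B", "C"), (2, "C", "D"), (5, "B", "D")]
best = heaviest_non_bridge_edge("ABCD", edges)
```

Each vertex and edge is handled a constant number of times, so this runs in O(|V| + |E|) without sorting.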

Minimize set of edges in a directed graph keeping connected components

Here is the full question:
Assume we have a directed graph G = (V,E), we want to find a graph G' = (V,E') that has the following properties:
G' has the same connected components as G
G' has the same component graph as G
E' is minimized. That is, E' is as small as possible.
Here is what I got:
First, run the strongly connected components algorithm. Now we have the strongly connected components. Now go to each strongly connected component and, within that SCC, make a simple cycle; that is, a cycle where the only node that is repeated is the start/finish node. This will minimize the edges within each SCC.
Now, we need to minimize the edges between the SCCs. Alas, I can't think of a way of doing this.
My 2 questions are: (1) Does the algorithm prior to the part about minimizing edges between SCCs sound right? (2) How does one go about minimizing the edges between SCCs.
For (2), I know that this is equivalent to minimizing the number of edges in a DAG. (Think of the SCCs as the vertices). But this doesn't seem to help me.
The algorithm seems right, as long as you allow for closed walks (i.e. repeating vertices.) Proper cycles might not exist (e.g. in an "8" shaped component) and finding them is NP-hard.
It seems that it is sufficient to group the inter-component edges by ordered pairs of components they connect and leave only one edge in each group.
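The grouping step suggested here might be sketched as follows, assuming the SCCs have already been computed. The function name `thin_inter_component_edges` and the `comp` mapping (vertex to component id) are hypothetical inputs for the example.

```python
def thin_inter_component_edges(edges, comp):
    """Keep one edge per ordered pair of distinct components.

    edges -- iterable of directed edges (u, v)
    comp  -- hypothetical mapping: vertex -> id of its strongly connected component
    """
    kept = {}
    for u, v in edges:
        cu, cv = comp[u], comp[v]
        if cu != cv:
            # The first edge seen for the pair (cu, cv) represents the whole group.
            kept.setdefault((cu, cv), (u, v))
    return list(kept.values())


comp = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}
edges = [(1, 3), (2, 3), (2, 4), (3, 4), (4, 5)]
inter = thin_inter_component_edges(edges, comp)
```

Edges inside a component are skipped here because they are handled separately by the cycle built within each SCC.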
Regarding step 2 (minimizing the edges between the SCCs), you could randomly select a vertex and run DFS from it, keeping only the longest path for each (root, end) pair and removing the other paths. Store all the vertices visited in a list L.
Then choose another vertex: if it is already in L, skip to the next vertex; if not, repeat the procedure above.

Resources