Number of connected components after deleting k vertices - algorithm

I am trying to solve the following graph problem:
We are given a general unweighted, undirected graph and k (k < |V|) vertices that are known beforehand. The vertices are deleted sequentially. After each deletion, how many connected components are there?
I thought of running Tarjan's algorithm at each step to check whether the vertex about to be deleted is a cut vertex, so that when the deletion is performed we can simply add the number of neighbours to the number of connected components. The complexity of this algorithm is O(V(V+E)).
I was told that there is an O(V+E) algorithm for this task, but I cannot figure it out, and searching Google did not reveal much either. Could anyone please advise me?

We can use the fact that the vertices are known beforehand.
Let's solve a "reverse" problem: given a graph and a list of vertices that are ADDED to it sequentially, compute the number of connected components in the graph after each addition.
The solution is pretty straightforward: we maintain a disjoint set union structure and, whenever a vertex is added, union it with every neighbour that is already present in the graph. It's easy to keep the number of components alongside this structure: it increases by one when a vertex is added and decreases by one each time a union actually merges two sets.
The original problem is reduced to the "reverse" one in the following way:
Let's add all edges that are not incident to any of the deleted vertices to the disjoint set union.
Now we can reverse the list of deleted vertices and add them one by one as described above.
After that, we need to reverse the resulting list that contains the number of components.
Note: this solution is not actually O(V + E); it's O(V + E * alpha(V)), where alpha(x) is the inverse Ackermann function (assuming the disjoint set union uses path compression together with union by rank or size). It is very close to linear for all practical purposes.
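The reduction above can be sketched roughly as follows. This is a minimal illustration rather than a reference implementation: the names Dsu and componentsAfterDeletions are made up for this example, vertices are assumed to be numbered from 0, and the deletion order is assumed to contain no repeated vertex.

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Disjoint set union with path halving (a form of path compression) and union by size.
struct Dsu {
    std::vector<int> parent, size;
    explicit Dsu(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // skip one level on the way up
            x = parent[x];
        }
        return x;
    }
    // Returns true only if a and b were in different sets and have now been merged.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);
        parent[b] = a;
        size[a] += size[b];
        return true;
    }
};

// Hypothetical helper: result[i] is the number of connected components
// after the vertices order[0..i] have been deleted.
std::vector<int> componentsAfterDeletions(
        int n,
        const std::vector<std::pair<int, int>>& edges,
        const std::vector<int>& order) {
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }

    // Start from the final state: every vertex in `order` is absent.
    std::vector<bool> present(n, true);
    for (int v : order) present[v] = false;

    Dsu dsu(n);
    int components = n - static_cast<int>(order.size());
    for (const auto& e : edges) {
        if (present[e.first] && present[e.second] && dsu.unite(e.first, e.second)) {
            --components;
        }
    }

    // Replay the deletions backwards as additions.
    std::vector<int> result(order.size());
    for (int i = static_cast<int>(order.size()) - 1; i >= 0; --i) {
        result[i] = components;  // state after deletions 0..i
        int v = order[i];
        present[v] = true;
        ++components;            // v starts as its own component
        for (int w : adj[v]) {
            if (present[w] && dsu.unite(v, w)) --components;
        }
    }
    return result;
}
```

For the path 0-1-2 with deletion order {1, 0}, the function returns {2, 1}: deleting vertex 1 leaves the isolated vertices 0 and 2, and deleting vertex 0 afterwards leaves only vertex 2.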

Here is my implementation of the algorithm in C++ using a disjoint set (note that it reads a list of edges to delete, rather than vertices):
#include <bits/stdc++.h>
using namespace std;
#define pb push_back
typedef pair<int, int> pii;
const int M=2e5+137;
class DisjointSet {
public:
int connected_comp;
int parent[100000];
void makeSet(int n){
for (int i=1;i<n+1; ++i)
parent[i] = i;
connected_comp = n;
}
int Find(int l) {
if (parent[l] == l)
return l;
// path compression: point l directly at its root so later lookups stay fast
return parent[l] = Find(parent[l]);
}
void Union(int m, int n) {
int x = Find(m);
int y = Find(n);
if(x==y) return;
if(x<y){
parent[y] = x;
connected_comp--;
}
else{
parent[x] = y;
connected_comp--;
}
}
};
set<pii> not_delete;
vector<pii> to_add;
int main(){
int node, edge;
cout<<"enter number of nodes and edges"<<"\n";
cin>>node>>edge;
DisjointSet dis;
dis.makeSet(node);
cout<<"enter two nodes to add edges"<<"\n";
for(int i=0;i<edge;i++){
int u,v;
cin>>u>>v;
if(u>v){
not_delete.insert({u,v});
}
else{
not_delete.insert({v,u});
}
}
int deletions;
cout<<"enter number of deletions"<<"\n";
cin>>deletions;
cout<<"enter two node to delete edge between them"<<"\n";
for(int i=0;i<deletions;i++){
int u,v;
cin>>u>>v;
if(u>v){
not_delete.erase({u,v}); // not_delete keeps only the edges that are never deleted
to_add.pb({u,v}); // edges that will be deleted, in deletion order
}
else{
not_delete.erase({v,u});
to_add.pb({v,u});
}
}
vector<int> res;
// first adding edges that never delete from graph
for(pii x: not_delete){
dis.Union(x.first, x.second);
}
res.pb(dis.connected_comp);
// then adding edges that will be deleted from graph backwards
reverse(to_add.begin(), to_add.end());
for(pii x: to_add){
dis.Union(x.first, x.second);
res.pb(dis.connected_comp);
}
cout<<"connected components after each deletion:"<<"\n";
for (auto it = ++res.rbegin(); it != res.rend(); ++it)
cout << *it << "\n";
return 0;
}

Related

How to increase efficiency of Prim's algorithm used in finding minimum spanning tree from adjacency matrix of an undirected graph?

I have implemented an undirected graph using an adjacency matrix. Now I want to find the edges of the minimum spanning tree obtained with Prim's algorithm (using a priority queue). I did that with the classic method; it gives correct results but is highly inefficient on larger data sets (of vertices and the vertices they are connected to).
This is the implementation of Prim's algorithm using a priority queue that I based my code on. (The code is from the GeeksforGeeks site; the code I wrote is inspired by it.)
void Graph::primMST()
{
// Create a min-priority queue of (key, vertex) pairs for the
// vertices that are candidates for the MST. The template
// arguments below are how C++ declares a min-heap.
// Refer below link for details of this syntax
// http://geeksquiz.com/implement-min-heap-using-stl/
priority_queue< iPair, vector <iPair> , greater<iPair> > pq;
int src = 0; // Taking vertex 0 as source
// Create a vector for keys and initialize all
// keys as infinite (INF)
vector<int> key(V, INF);
// To store parent array which in turn store MST
vector<int> parent(V, -1);
// To keep track of vertices included in MST
vector<bool> inMST(V, false);
// Insert source itself in priority queue and initialize
// its key as 0.
pq.push(make_pair(0, src));
key[src] = 0;
/* Looping till priority queue becomes empty */
while (!pq.empty())
{
// The first vertex in pair is the minimum key
// vertex, extract it from priority queue.
// vertex label is stored in second of pair (it
// has to be done this way to keep the vertices
// sorted key (key must be first item
// in pair)
int u = pq.top().second;
pq.pop();
//Different key values for same vertex may exist in the priority queue.
//The one with the least key value is always processed first.
//Therefore, ignore the rest.
if(inMST[u] == true){
continue;
}
inMST[u] = true; // Include vertex in MST
// 'i' is used to get all adjacent vertices of a vertex
list< pair<int, int> >::iterator i;
for (i = adj[u].begin(); i != adj[u].end(); ++i)
{
// Get vertex label and weight of current adjacent
// of u.
int v = (*i).first;
int weight = (*i).second;
// If v is not in MST and weight of (u,v) is smaller
// than current key of v
if (inMST[v] == false && key[v] > weight)
{
// Updating key of v
key[v] = weight;
pq.push(make_pair(key[v], v));
parent[v] = u;
}
}
}
// Print edges of MST using parent array
for (int i = 1; i < V; ++i)
printf("%d - %d\n", parent[i], i);
}
Thanks in advance.

Why does Prim's algorithm need a distance array?

I have some questions about Prim's algorithm.
Prim's algorithm finds an MST. The usual implementation initializes the key of every node to INF, but I don't understand why this initialization is needed.
Here is my implementation:
#include<iostream>
#include<queue>
#include<tuple>
#include<algorithm>
#include<vector>
using namespace std;
typedef tuple<int,int,int> ti;
int main(void)
{
ios::sync_with_stdio(0);
cin.tie(0);
bool vis[1005] = {false}; // zero-initialize: a local array is otherwise uninitialized
vector<pair<int,int>> vertex[1005];
int V,E;
int u,v,w;
int sum = 0;
int cnt = 0;
priority_queue<ti,vector<ti>,greater<ti>> pq;
cin >> V >> E;
for(int i = 0; i < E; i++)
{
cin >> u >> v >> w;
vertex[u].push_back({v,w});
vertex[v].push_back({u,w});
}
for(auto i : vertex[1]){
pq.push({i.second,1,i.first});
}
vis[1] = true;
while(!pq.empty())
{
tie(w,u,v) = pq.top(); pq.pop();
if(vis[v]) continue;
vis[v] = true;
sum += w;
cnt++;
for(auto i : vertex[v]){
if(!vis[i.first])
pq.push({i.second,v,i.first});
}
if(cnt == V-1) break;
}
// VlogV
cout << sum;
return 0;
}
Please ignore the indentation (code paste error).
This code finds the total weight of the MST in O(VlogV). We can also record each vertex's parent (when setting vis[v] = true, set pre[v] = u), so we know the order in which the MST is built.
If we don't need the distance array, we can implement Prim's algorithm in O(VlogV), and in almost every case (not in the MST case) it would always be faster than Kruskal's.
I know I'm getting something wrong, so I want to know where my mistake is.
So is there any reason why we use a distance array?
Your conclusion that this algorithm works in O(Vlog(V)) is wrong. Here is why:
while(!pq.empty()) // up to |E| iterations: one per edge pushed onto pq
{
tie(w,u,v) = pq.top();
pq.pop(); // O(log(|V|)) for each pop operation
if(vis[v]) continue;
vis[v] = true;
sum += w;
cnt++;
for(auto i : vertex[v]){ // scans the edges incident to v - O(|E|) steps in total
if(!vis[i.first])
pq.push({i.second,v,i.first}); // O(log(|V|)) for each push operation
}
if(cnt == V-1) break;
}
First of all, notice that the priority queue stores edges, not vertices. Only |V| - 1 iterations of the while loop add a new vertex to the tree, but every edge that was pushed may also have to be popped, so the loop can run up to |E| times.
Also notice that the inner loop traverses all the edges incident to v:
for(auto i : vertex[v])
Each adjacency list is scanned once, so this takes O(|E|) operations in total.
Notice that each push and pop operation costs O(log(|E|)), which is O(log(|V|)) for a graph without parallel edges, because |E| < |V|^2.
So what do we have?
We have |V| - 1 iterations that add a vertex, each with an O(log(|V|)) pop, which makes V * log(V) operations.
On the other hand, we have up to |E| pushes in total, and just as many pops of entries that are later discarded, each costing O(log(|V|)), which makes E * log(V) operations.
In conclusion, we have V*log(V) + E*log(V) operations in total. A connected graph has E >= V - 1, therefore the time complexity can be written as O(E*log(V)).
So the time complexity of Prim's algorithm doesn't depend on keeping a distance array. The distance array only keeps edges that cannot improve a vertex's key out of the queue; you still have to make the iterations mentioned above.
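For comparison, here is a rough sketch of the variant that does keep a distance (key) array. The function name primMstWeight is invented for this example, vertices are assumed to be numbered from 0, and the graph is assumed to be connected. The key array only decides whether an edge is worth pushing; the number of heap operations is still bounded by the number of edges, so the bound stays O(E log V).

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// adj[u] holds (neighbour, weight) pairs of an undirected, connected graph.
// Returns the total weight of a minimum spanning tree.
long long primMstWeight(const std::vector<std::vector<std::pair<int, int>>>& adj) {
    const int n = static_cast<int>(adj.size());
    if (n == 0) return 0;

    const int INF = std::numeric_limits<int>::max();
    std::vector<int> key(n, INF);       // cheapest known edge into each vertex
    std::vector<bool> inMst(n, false);

    using Item = std::pair<int, int>;   // (key, vertex), ordered by key
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;

    key[0] = 0;
    pq.push({0, 0});
    long long total = 0;

    while (!pq.empty()) {
        Item top = pq.top();
        pq.pop();
        int u = top.second;
        if (inMst[u]) continue;         // stale entry for a vertex already in the tree
        inMst[u] = true;
        total += top.first;

        for (const auto& e : adj[u]) {
            int v = e.first;
            int w = e.second;
            // Push only edges that improve the best known key of v.
            if (!inMst[v] && w < key[v]) {
                key[v] = w;
                pq.push({w, v});
            }
        }
    }
    return total;
}
```

On a triangle with edge weights 1, 2 and 3, this returns 3 (the two cheapest edges).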

How to check for the presence of a cycle in an undirected graph?

#include <bits/stdc++.h>
using namespace std;
int n,m;
vector<int> adj[51];
int visited[51];
bool flag;
void dfs(int i,int parent){
vector<int>::iterator it;
for(it = adj[i].begin();it!=adj[i].end();it++){
if(!visited[*it]){
visited[*it]=1;
dfs(*it,i); // passing parent element
}
if(visited[*it] && (*it !=parent )){
flag=true; return;
}
}
}
int main(){
int a,b;
cin>>n>>m;
for(int i=0;i<m;i++){ // graph ready.
cin>>a>>b;
if(a==b){
cout<<"YES"; return 0;
}
adj[a].push_back(b);
adj[b].push_back(a);
}
for(int i=1;i<=n;i++){
std::vector<int>::iterator it;
for(it=adj[i].begin();it!=adj[i].end();it++){
if(!visited[*it]){
visited[*it]=1;
dfs(*it,-1);
}
}
}
if(flag){
cout<<"YES"<<endl;
}else{
cout<<"NO"<<endl;
}
}
Can anyone check my code and tell me which test case I'm missing here? I got only 60/100 on HackerEarth. I'm using the parent variable here so that a single edge is not considered to be a loop.
You are getting the wrong output because in an adjacency list every edge is listed twice.
So say we have the graph with 3 vertices and 2 edges as:
1------2------3
Clearly, there is no cycle. But your code returns YES for this input as well. The reason is that once a particular vertex i has been visited from its parent j, the next time dfs is called for i, vertex j turns out to be visited already, and therefore the output is YES, which is wrong.
FIX
Whenever we reach a vertex i that is already visited, we must not immediately declare that we have found a cycle. We must first make sure that vertex i is not the parent of the vertex whose dfs we called; only then will you get the right answer. In the posted code this check also has to be the else branch of the "not visited" test, because after dfs(*it, i) returns, *it is always visited.
The code is easier to write once you have understood what is going wrong.
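One possible shape of the corrected check is sketched below. The function names are made up for this example, vertices are assumed to be numbered from 0, and the graph is assumed to have no parallel edges (a duplicated edge between the same two vertices would be mistaken for the parent edge). The already-visited test is the else branch of the unvisited test, and the search starts from the vertex itself with parent -1.

```cpp
#include <utility>
#include <vector>

// Returns true if the DFS started at u finds an edge to an already visited
// vertex other than the one we arrived from.
bool dfsHasCycle(int u, int parent,
                 const std::vector<std::vector<int>>& adj,
                 std::vector<bool>& visited) {
    visited[u] = true;
    for (int v : adj[u]) {
        if (!visited[v]) {
            if (dfsHasCycle(v, u, adj, visited)) return true;
        } else if (v != parent) {
            return true;  // visited and not our parent: this edge closes a cycle
        }
    }
    return false;
}

// Hypothetical wrapper: builds the adjacency list and checks every component.
bool hasCycle(int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    std::vector<bool> visited(n, false);
    for (int s = 0; s < n; ++s) {
        if (!visited[s] && dfsHasCycle(s, -1, adj, visited)) return true;
    }
    return false;
}
```

With this version the path 0-1-2 reports no cycle, while the triangle 0-1, 1-2, 2-0 reports one.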

Count number of cycles in directed graph using DFS

I want to count total number of directed cycles available in a directed graph (Only count is required).
You can assume graph is given as adjacency matrix.
I know DFS but could not make a working algorithm for this problem.
Please provide some pseudo code using DFS.
This algorithm based on DFS seems to work, but I don't have a proof.
This algorithm is modified from the dfs for topological sorting
(https://en.wikipedia.org/wiki/Topological_sorting#Depth-first_search).
class Solution {
vector<Edge> edges;
// graph[vertex_id] -> vector of index of outgoing edges from #vertex_id.
vector<vector<int>> graph;
vector<bool> mark;
vector<bool> pmark;
int cycles;
void dfs(int node) {
if (pmark[node]) {
return;
}
if (mark[node]) {
cycles++;
return;
}
mark[node] = true;
// Try all outgoing edges.
for (int edge_index : graph[node]) {
dfs(edges[edge_index].to);
}
pmark[node] = true;
mark[node] = false;
}
int CountCycles() {
// Build graph.
// ...
cycles = 0;
mark = vector<bool>(graph.size(), false);
pmark = vector<bool>(graph.size(), false);
for (int i = 0; i < (int) graph.size(); i++) {
dfs(i);
}
return cycles;
}
};
Let us colour the nodes with three colours. If a node is yet to be discovered, its colour is white. If the node is discovered but some of its descendants are yet to be discovered, its colour is grey. Otherwise its colour is black. Now, if during DFS we meet an edge from a grey node to another grey node (a back edge), the graph has a cycle. The count obtained this way is the number of times we meet this situation, i.e. the number of back edges. Note that every back edge closes at least one distinct cycle, so this number is a lower bound on the number of simple cycles rather than the exact count in general.
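A minimal sketch of the three-colour idea follows, with invented names (Color, visit, countBackEdges) and vertices assumed to be numbered from 0. It counts back edges, which is what the colouring argument above measures. For the graph 0->1, 1->2, 2->0, 1->3, 3->2 it reports 1, although that graph contains two simple cycles (0-1-2-0 and 0-1-3-2-0), so the result should be read as "number of back edges found" and not as an exact cycle count.

```cpp
#include <vector>

enum class Color { White, Grey, Black };

// DFS from u; increments backEdges for every edge that leads to a grey vertex.
void visit(int u, const std::vector<std::vector<int>>& adj,
           std::vector<Color>& color, int& backEdges) {
    color[u] = Color::Grey;              // discovered, descendants still pending
    for (int v : adj[u]) {
        if (color[v] == Color::White) {
            visit(v, adj, color, backEdges);
        } else if (color[v] == Color::Grey) {
            ++backEdges;                 // edge into the current DFS path
        }
    }
    color[u] = Color::Black;             // u and all its descendants are finished
}

// adj[u] lists the targets of the outgoing edges of u.
int countBackEdges(const std::vector<std::vector<int>>& adj) {
    std::vector<Color> color(adj.size(), Color::White);
    int backEdges = 0;
    for (int s = 0; s < static_cast<int>(adj.size()); ++s) {
        if (color[s] == Color::White) visit(s, adj, color, backEdges);
    }
    return backEdges;
}
```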

As I read, Dijkstra's algorithm fails for negative edges, but I implemented the same concept and my code is working. Is there some bug?

The code works for negative edges too, and I have used a priority queue.
Please check it and let me know what's wrong with it and why it works even for negative edges.
Constraint: edge weights should be less than 10000.
Am I doing something wrong here?
#include<iostream>
#include<queue>
using namespace std;
struct reach
{
int n;
int weight;
};
struct cmp
{
bool operator()(reach a, reach b)
{
return a.weight>b.weight;
}
};
class sp
{
int *dist;
int n;
int **graph;
int src;
public:
sp(int y)
{
n=y;
src=1;
dist=new int[n+1];
graph=new int*[n+1];
for(int i=0;i<=n;i++)
{
graph[i]=new int[n+1];
}
for(int i=2;i<=n;i++)
{
dist[i]=10000;
}
// cout<<INT_MAX<<endl;
dist[src]=0;
for(int i=1;i<=n;i++)
{
for(int j=1;j<=n;j++)
{
graph[i][j]=10000;
}
}
graph[1][1]=0;
}
void read()
{
int a;
cout<<"enter number of edges"<<endl;
cin>>a;
cout<<"now enter the two vertices which has an edge followed by the weight of the edge"<<endl;
while(a--)
{//cout<<"location: "<<i<<" : "<<j<<endl;
int as, ad,val;
cin>>as>>ad>>val;
graph[as][ad]=val;
}
}
void finder()
{cout<<"entered"<<endl;
priority_queue<reach, vector<reach>, cmp> q;
for(int i=1;i<=n;i++)
{
if(dist[src]+graph[src][i]<dist[i])
{
reach temp;
temp.n=i;
cout<<i<<endl;
temp.weight=graph[src][i];
q.push(temp);
dist[i]=graph[src][i]+dist[src];
}
}
while(q.empty()!=1)
{
reach now =q.top();
//cout<<" we have here: "<<now.n<<endl;
q.pop();
for(int i=1;i<=n;i++)
{
if((dist[now.n] + graph[now.n][i])<dist[i])
{
dist[i]=dist[now.n]+graph[now.n][i];
cout<<"it is: "<<i<<" : "<<dist[i]<<endl;
reach temp;
temp.n=i;
//cout<<temp.n<<endl;
temp.weight=graph[now.n][i];
q.push(temp);
}
}
}
}
void print()
{
for(int i=1;i<=n;i++)
{
cout<<"we have: "<<dist[i]<<" at "<<i;
cout<<endl;
}
cout<<endl;
}
};
int main()
{cout<<"enter no. of vertices"<<endl;
int n;
cin>>n;
sp sp(n);
sp.read();
sp.finder();
sp.print();
}
Consider this example:
Undirected graph (6 vertices, 8 edges)
Edges (v1-v2 (weight)):
1-2 (2)
2-3 (1)
1-3 (-2)
1-6 (-2)
3-6 (-3)
5-6 (1)
4-5 (2)
2-5 (1)
Now, let the source be 1 and the destination be 4. There is a cycle from the source back to itself (1-3-6-1) that weighs -7. So, let's list a few paths from source to destination with their weights:
1-6-5-4 (1)
1-3-6-5-4 (-2)
1-3-6-1-3-6-5-4 (-9)
1-3-6-1-3-6-1-3-6-5-4 (-16)
etc.
So, which path is the shortest? There is none: every extra trip around the negative cycle lowers the cost, so the question is ill-posed and the algorithm does not work. Now, you can argue that once a node is visited, you will not update it again. In that case, you will not get the correct answer. Maybe there are some cases where, in spite of having negative edges, the algorithm gives correct results, but this is not how Dijkstra works.
A really simple way to understand Dijkstra is that it performs a BFS-like traversal of the graph, from the source to the destination, and at every step it updates the costs of the nodes it reaches. So suppose there is a node n with cost c, and a few levels deeper in the traversal its cost becomes k (< c). Then you have to update again all the nodes reached from n, because the path to n is now shorter. If the graph has a negative cycle, n will keep being updated indefinitely and the process never ends.
The simplest graph for which Dijkstra's algorithm fails with negative weights has adjacency matrix
0 1 2
1 0 -3
2 -3 0
and looks for a route from vertex 0 to vertex 1. The first vertex to come off the priority queue is vertex 1 at distance 1, so that's the route returned. But there was a route of total weight -1 via a vertex which is still in the priority queue, with weight 2.
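The three-vertex example can be reproduced with a small sketch. It assumes the textbook form of Dijkstra's algorithm, which treats a vertex's distance as final once it is popped and stops when the target is popped; the function name dijkstraMatrix is made up for this illustration, and every off-diagonal matrix entry is treated as an edge weight.

```cpp
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Textbook Dijkstra over a weight matrix w. Returns the distance it reports
// for target, which is only guaranteed correct for non-negative weights.
int dijkstraMatrix(const std::vector<std::vector<int>>& w, int source, int target) {
    const int n = static_cast<int>(w.size());
    const int INF = std::numeric_limits<int>::max();
    std::vector<int> dist(n, INF);
    std::vector<bool> done(n, false);

    using Item = std::pair<int, int>;  // (distance, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;
    dist[source] = 0;
    pq.push({0, source});

    while (!pq.empty()) {
        int u = pq.top().second;
        pq.pop();
        if (done[u]) continue;
        done[u] = true;                    // distance of u is now treated as final
        if (u == target) return dist[u];   // stop as soon as the target is settled
        for (int v = 0; v < n; ++v) {
            if (v == u || done[v]) continue;
            if (dist[u] + w[u][v] < dist[v]) {
                dist[v] = dist[u] + w[u][v];
                pq.push({dist[v], v});
            }
        }
    }
    return INF;  // target not reachable
}
```

Called with the matrix above, source 0 and target 1, it returns 1, even though the route 0 -> 2 -> 1 has total weight 2 + (-3) = -1.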

Resources