Correctness of algorithm to calculate maximal independent set - algorithm

I am trying to find the maximal set for an undirected graph and here is the algorithm that i am using to do so:
1) Select the node with minimum number of edges
2) Eliminate all it's neighbors
3) From the rest of the nodes, select the node with minimum number of edges
4) Repeat the steps until the whole graph is covered
Can someone tell me if this is right? If not, then why is this method wrong to calculate the maximal independent set in a graph?

What you have described will pick a maximal independent set. We can see this as follows:
This produces an independent set. By contradiction, suppose that it didn't. Then there would have to be two nodes connected by edges that were added into the set you produced. Take whichever one of them was picked first (call it u, let the other be v) Then when it was added to the set, you would have removed all of its neighboring nodes from the set, including node v. Then v wouldn't have been added to the set, giving a contradiction.
This produces a maximal independent set. By contradiction, suppose that it didn't. This means that there is some node v that can be added to the independent set produced by your algorithm, but was not added. Since this node wasn't added, it must have been removed from the graph by the algorithm. This means that it must have been adjacent to some node added to the set already. But this is impossible, because it would mean that the node v cannot be added to the produced independent set without making the result not an independent set. We have a contradiction.
Hope this helps!

There is not one definite maximal independent set in any graph; take for example the cycle over 3 nodes, each of the nodes forms a maximal independent set. Your algorithm will give you one of the maximal independent sets of the graph, without guaranteeing that it has maximum cardinality.On the other hand, finding the maximum independent set in a graph is NP-complete (since that problem is complementary to that of finding a maximum clique), so there probably isn't an efficient algorithm.

After your clarify situation in comments, your solutions is right.
Even better, according to Corollary 3 from this paper http://courses.engr.illinois.edu/cs598csc/sp2011/Lectures/lecture_7.pdf
your get good aproximation for subset order.
Greedy gives a 1 / (d + 1) -approximation for (unweighted) MIS in graphs of degree at most d

Related

How to check if given set of nodes is the vertex cut set of the graph?

I am looking for efficient algorithm to discover whether removing a set of nodes in graph would split graph into multiple components.
Formally, given undirected graph G = (V,E) and nonempty set od vertices W ⊆ V, return true iff W is vertex cut set. There are no edge weights in graph.
What occurs to my mind until now is using disjoint set:
The disjoint set is initialized with all neigbors of nodes in W where every set contains one such neighbor.
During breadth-first traversal of all nodes of V \ W, exactly one of following cases holds for every newly explored node X:
X is already in same set as its predecessor
X is absent in disjoint set ⇒ is added into same set as its predecessor (both cases mean the connected component is being further explored)
X is in different set ⇒ merge disjoint sets (two components so far appeared as disconnected but turned out to be connected)
Whenever the disjoint set contains only single set (even before traversal is finished), result is false.
If the disjoint set contains 2 or more sets when traversal is finished, result is true.
Time complexity is O(|V|+|E|) (assuming time complexity of disjoint set is O(1) instead of more precise inverse Ackermann function).
Do you know better solution (or see any flaw in proposed one)?
Note: Since it occurs very often in Google search results, I would like to explicitly state I'm not looking for algorithm to find so-far-unknown vertex cut set, let alone optimal. The vertex set is given, the task is just to say yes or no.
Note 2: Also, I don't search for edge cut set validation (I'm aware of Find cut set in a graph but can't think up similar solution for vertices).
Thanks!
UPDATE: I figured out that in case of true result I will also need data of nodes in disconnected components in hierarchical structure depending on distance from removed nodes. Hence the choice of BFS. I apologize for post-edit.
The practical case behind is an outage in telecommunication network. When some node breaks so the whole network get disconnected, one component (containing the node with connectivity to higher level network) is still ok, every other components need to be reported.
Have a unordered_set<int> r to store the set of vertices you want to remove.
Run a DFS normally, but only go to adjacents who are not in r. Always you visit an unvisited node, add 1 to the number of nodes visited.
If in the end, the number of visited nodes is smaller than |V| - r, this set divides the graph.
With this approach, you don't need to make changes in the graph, just ignore nodes that are in r, which you can check in O(1) using an unordered_set<int>.
The complexity is the same than a normal DFS.
Can’t you get O(|V|) complexity by performing a depth first search on the graph? Eliminate the set W from V and perform DFS. Record the number of nodes processed and stop when you cannot reach any more nodes. If the number of nodes processed is less than |V| - |W|, then W is a cut set.

Will a standard Kruskal-like approach for MST work if some edges are fixed?

The problem: you need to find the minimum spanning tree of a graph (i.e. a set S of edges in said graph such that the edges in S together with the respective vertices form a tree; additionally, from all such sets, the sum of the cost of all edges in S has to be minimal). But there's a catch. You are given an initial set of fixed edges K such that K must be included in S.
In other words, find some MST of a graph with a starting set of fixed edges included.
My approach: standard Kruskal's algorithm but before anything else join all vertices as pointed by the set of fixed edges. That is, if K = {1,2}, {4,5} I apply Kruskal's algorithm but instead of having each node in its own individual set initially, instead nodes 1 and 2 are in the same set and nodes 4 and 5 are in the same set.
The question: does this work? Is there a proof that this always yields the correct result? If not, could anyone provide a counter-example?
P.S. the problem only inquires finding ONE MST. Not interested in all of them.
Yes, it will work as long as your initial set of edges doesn't form a cycle.
Keep in mind that the resulting tree might not be minimal in weight since the edges you fixed might not be part of any MST in the graph. But you will get the lightest spanning tree which satisfies the constraint that those fixed edges are part of the tree.
How to implement it:
To implement this, you can simply change the edge-weights of the edges you need to fix. Just pick the lowest appearing edge-weight in your graph, say min_w, subtract 1 from it and assign this new weight,i.e. (min_w-1) to the edges you need to fix. Then run Kruskal on this graph.
Why it works:
Clearly Kruskal will pick all the edges you need (since these are the lightest now) before picking any other edge in the graph. When Kruskal finishes the resulting set of edges is an MST in G' (the graph where you changed some weights). Note that since you only changed the values of your fixed set of edges, the algorithm would never have made a different choice on the other edges (the ones which aren't part of your fixed set). If you think of the edges Kruskal considers, as a sorted list of edges, then changing the values of the edges you need to fix moves these edges to the front of the list, but it doesn't change the order of the other edges in the list with respect to each other.
Note: As you may notice, giving the lightest weight to your edges is basically the same thing as you suggest. But I think it is a bit easier to reason about why it works. Go with whatever you prefer.
I wouldn't recommend Prim, since this algorithm expands the spanning tree gradually from the current connected component (in the beginning one usually starts with a single node). The case where you join larger components (because your fixed edges might not all be in a single component), would be needed to handled separately - it might not be hard, but you would have to take care of it. OTOH with Kruskal you don't have to adapt anything, but simply manipulate your graph a bit before running the regular algorithm.
If I understood the question properly, Prim's algorithm would be more suitable for this, as it is possible to initialize the connected components to be exactly the edges which are required to occur in the resulting spanning tree (plus the remaining isolated nodes). The desired edges are not permitted to contain a cycle, otherwise there is no spanning tree including them.
That being said, apparently Kruskal's algorithm can also be used, as it is explicitly stated that is can be used to find an edge that connects two forests in a cost-minimal way.
Roughly speaking, as the forests of a given graph form a Matroid, the greedy approach yields the desired result (namely a weight-minimal tree) regardless of the independent set you start with.

Size of Special Vertex Set on DAG

In Singapore, this year's (2016) NOI (National Olympiad in Informatics) included the following problem "ROCKCLIMBING" (I was unable to solve it during the contest.) :
Abridged Problem Statement
Given a DAG with N <= 500 vertices, find the maximum number of vertices in a subset of the original vertices such that there is no path from 1 vertex in the set to another vertex in the same set, directly or indirectly.
Solution
The solution was to use transitive closure algorithm, and then to form a bipartite graph by duplicating each vertex i to form i' such that if vertex j can be reached from vertex i directly or indirectly in the original graph, then there is a directed edge from i to j' in the new graph.
However, during the solution presentation, the presenters did not explain how or why N - MCBM (MCBM being the Maximum Cardinality Bipartite Matching) of the new bipartite graph is also the maximum size of the set of vertices that cannot reach each other directly or indirectly in the original DAG.
I looked up other problems related to DAGs and bipartite graphs, such as the Minimum Path Cover problem on DAGs, but I could not find anything that explains this.
Does anyone know a way in which to prove this equality?
The problem statement can be found here: ROCKCLIMBING
Thank you in advance.
There are two things going on here:
A set is independent if and only if its complement is a vertex cover (see wikipedia). This means that the size of a max independent set is equal to the size of a minimum vertex cover.
Konig's theorem proves that
In any bipartite graph, the number of edges in a maximum matching equals the number of vertices in a minimum vertex cover.
Therefore to find the size of the max independent set we first compute the size MCBM of the max matching, and then compute its complement which equals N-MCBM.
An alternative viewpoint is as follows:
If we use A<B to mean we can climb from A to B, we have defined a partially ordered set
There is a result called Dilworth's theorem that says the maximum number of incomparable elements is equal to the minimum number of chains
The proof shows how to construct the minimum number of chains by constructing a maximum matching in your bipartite graph.

Destroy weighted edges in a bidirected graph

On my recent interview I was asked the following question:
There is a bidirectional graph G with no cycles. Each edge has some weight. Also there is a set of nodes K which should be disconnected from each other (by removing some edges). There is only one path between any two nodes in K set. The goal is to minimize total weight of removed edges and disconnect all nodes (from set K) from each other.
My approach was to run BFS for each node from K set and determine all paths between all pairs of nodes from K. So then I'll have set of paths (each path is a set of edges).
My next step is to apply dynamic programming approach to find minimum total weight of removed edges. And here I stuck. Could you please help me (just direct me) of how DP should be done.
Thanks.
This sounds like the Multiway Cut problem in trees, assuming a "bidirectional" graph is just like an undirected one. It can be solved in polynomial time by a straightforward dynamic programming. See Chopra and Rao: "On the multiway cut polyhedron", Networks 21(1):51–89, 1991. See also this question.
This problem can be solved using disjoint Sets. In this problem you need to make set of each vertex like forest. Then, join the vertices which belong to two different sets through the edges which are in the graph such that -
1) if one of the set contains a node which is mentioned in the k set of nodes, then the node becomes the representative element.
2) if both the sets contain the unwanted node (node present in the k set of nodes), then find the minimum weight edge in both the sets and compare them, then compare the minimum one of them with the edge to be joined and find the minimum. Then delete the edge which we found so that no path exists b/w them.
3) if none of the set contain the unwanted node, then join one set to the other set.
In this way you find the minimum total weight of the destroyed edges in a very good time complexity of the order of O(nlogn).
Your approach will also work but it will prove to be costly in terms of time complexity.
Here is the complete code - (GameAlgorithm.cpp)
https://github.com/KARTHEEKCIC/RoboAttack

Algorithm to find 'maximal' independent set in a simple graph

in words, can someone post directions towards finding the 'maximal' independent set in a simple graph?
I read up something from ETH site which said one can find such in O(n) by simply picking a random vertex v and than scanning the rest and attempting to find if there's an edge from v to the rest.
Thanks
Yes, by definition, a maximal indpendent set is an independent set to which no more vertices can be added without violating the 'independence' condition.
So just picking vertices till you can pick no more would give you a maximal independent set, can be done in linear time (i.e. linear in |V| + |E|).
Note, this is different from maximum independent set, which is an independent set of the largest possible size and finding that is NP-Hard.
Found this on the web, probably from a class" To accompany the text ``Introduction to Parallel Computing'', Addison Wesley, 2003
Finding a Maximal Independent Set (MIS)
parallel MIS algorithms use randimization to gain
concurrency (Luby's algorithm for graph coloring).
Initially, each node is in the candidate set C. Each
node generates a (unique) random number and
communicates it to its neighbors.
If a nodes number exceeds that of all its neighbors, it
joins set I. All of its neighbors are removed from C.
This process continues until C is empty.
On average, this algorithm converges after O(log|V|)
such steps.

Resources