Is this bipartite graph optimization task NP-complete? - algorithm

I have been trying to find a polynomial-time algorithm for this problem, but in vain. I'm not familiar with NP-completeness. I'm just wondering whether this problem is actually NP-complete, so that I should not waste any further effort trying to come up with a polynomial-time algorithm.
The problem is easy to describe and understand. Given a bipartite graph with vertex sets A and B, what is the minimum number of vertices you have to select from A so that each vertex in B is adjacent to at least one selected vertex?

Unfortunately, this is NP-hard; there's an easy reduction from Set Cover (in fact it's arguably just a different way of expressing the same problem). In Set Cover we're given a ground set F, a collection C of subsets of F, and a number k, and we want to know whether we can cover all n ground set elements of F by choosing at most k of the sets in C. To reduce this to your problem: make a vertex in B for each ground element, a vertex in A for each set in C, and add an edge uv whenever ground element v is in set u. If there were an algorithm that efficiently solved the problem you describe, it could solve the instance just constructed, and that would immediately solve the original Set Cover instance, so your problem is NP-hard too.
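To make the reduction concrete, here is a minimal Python sketch (the function and variable names are my own, purely for illustration):

    def set_cover_to_bipartite(sets):
        # Turn a Set Cover instance (a dict mapping set names to their elements)
        # into a bipartite instance: A = set names, B = ground elements, with an
        # edge (s, e) whenever element e belongs to set s.
        A = list(sets)
        B = sorted({e for members in sets.values() for e in members})
        edges = [(s, e) for s, members in sets.items() for e in members]
        return A, B, edges

    # k sets cover the ground set iff k vertices of A dominate all of B.
    A, B, edges = set_cover_to_bipartite({"S1": {1, 2}, "S2": {2, 3}, "S3": {3}})
    print(A, B, edges)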
Interestingly, the related problem in which we may choose vertices from the entire graph and must cover every edge (rather than every vertex of B), i.e. minimum vertex cover, is solvable in polynomial time on bipartite graphs using maximum matching algorithms, thanks to Kőnig's theorem.

Related

Edge clique cover algorithm

I am trying to write an algorithm that computes the edge clique cover number (the smallest number of cliques that cover all edges) of an input graph (undirected and with no self-loops). My idea would be to:
1. Calculate all maximal cliques with the Bron-Kerbosch algorithm, and
2. Try whether any 1, 2, 3, ... of them would cover all edges, until I find the minimum number.
Would that work and does anyone know a better method; is there a standard algorithm? To my surprise, I couldn't find any such algorithm. I know that the problem is NP-hard, so I don't expect a fast solution.
I would gather maximal cliques as you do now (or perhaps using a different algorithm, as suggested by CaptainTrunky), but then use branch and bound. This won't guarantee a speedup, but will often produce a large speedup on "easy" instances.
In particular:
Instead of trying all subsets of maximal cliques in increasing subset size order, pick an edge uv and branch on it. This means:
For each maximal clique C containing uv:
- Make a tentative new partial solution that contains all cliques in the current solution.
- Add C to this partial solution.
- Make a new subproblem containing the current subproblem's graph, but with all vertices in C collapsed into a single vertex.
- Recurse to solve this smaller subproblem.
Keep track of the best complete solution so far. This is your upper bound (UB). You do not need to continue processing any subproblem that has already reached this upper bound but still has edges present; a better solution already exists!
It's best to pick an edge to branch on that is covered by as few cliques as possible. When choosing in what order to try those cliques, try whichever you think is likely to be the best (probably the largest one) first; see the sketch below.
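Here is a simplified Python sketch of this branch and bound scheme (the names are my own; it tracks the set of still-uncovered edges directly instead of collapsing vertices, which keeps the sketch short but drops that optimization):

    from itertools import combinations

    def clique_edges(clique):
        # All edges inside a clique, as two-element frozensets.
        return {frozenset(pair) for pair in combinations(clique, 2)}

    def branch_and_bound(uncovered, cliques, partial=(), best=None):
        # uncovered: set of frozenset edges not yet covered.
        # cliques: the maximal cliques, each a set of vertices.
        if not uncovered:
            return partial                               # complete solution found
        if best is not None and len(partial) + 1 >= len(best):
            return best                                  # prune: cannot beat the incumbent (UB)
        # Branch on the uncovered edge contained in the fewest cliques.
        uv = min(uncovered, key=lambda e: sum(e <= c for c in cliques))
        candidates = sorted((c for c in cliques if uv <= c), key=len, reverse=True)
        for c in candidates:                             # try the largest clique first
            result = branch_and_bound(uncovered - clique_edges(c), cliques,
                                      partial + (c,), best)
            if best is None or len(result) < len(best):
                best = result
        return best

    cliques = [{1, 2, 3}, {3, 4}]
    edges = clique_edges({1, 2, 3}) | clique_edges({3, 4})
    print(branch_and_bound(edges, cliques))              # ({1, 2, 3}, {3, 4}) or equivalent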
And here is an idea for a lower bound to improve the pruning level:
If a subgraph G' contains an independent set of size s, then you will need at least s cliques to cover G' (since no clique can contain two vertices of an independent set). Computing the largest possible IS is NP-hard and thus impractical here, but you can get a cheap bound from the 2-approximation for Vertex Cover: just keep choosing an edge and throwing out both its vertices until no edges are left; if you threw out the endpoints of k edges, then the vertices that remain form an IS whose size is within k of the maximum.
You can add the size of this IS to the total number of cliques in your solution so far; if that is larger than the current UB, you can abort this subproblem, since we know that fleshing it out further cannot produce a better solution than one we have already seen.
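A sketch of that lower bound, in the same vein (again, the names are my own):

    def is_lower_bound(uncovered):
        # Cheap lower bound via the 2-approximation for Vertex Cover: repeatedly
        # discard both endpoints of an uncovered edge. The surviving vertices
        # (all of which still had an uncovered edge) form an independent set,
        # and no single clique can cover edges at two of them.
        alive = {v for e in uncovered for v in e}
        for e in uncovered:
            u, v = tuple(e)
            if u in alive and v in alive:
                alive -= {u, v}
        return len(alive)

In the branch and bound above, you would then prune whenever len(partial) + is_lower_bound(uncovered) >= len(best).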
I was working on a similar problem two years ago and I never found any standard existing approach to it. I did the following:
1. Compute all maximal cliques. MACE was way better than Bron-Kerbosch in my case.
2. Build a constraint-satisfaction problem for determining the minimum number of cliques required to cover the graph. You could use SAT, MiniZinc, or MIP tools to do so. Which one to pick? It depends on your skills, time resources, environment and dozens of other parameters. If we are talking about a proof of concept, I would stick with MiniZinc.
A bit more detail on the second part. Define a Boolean variable for each edge: if its value is True, the edge is covered; otherwise, it is not. Add constraints that only allow covering sets of edges together, one set per clique. Then add a Boolean variable for each clique: if it is True, the clique is used; otherwise, it is not. Finally, require that all edges are covered AND that the number of used cliques is minimal.
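As an illustration of that model, here is a sketch using Python's PuLP as a stand-in for the MiniZinc/MIP tools mentioned above (it folds the per-edge Booleans directly into covering constraints, and all names are my own):

    import pulp

    def min_edge_clique_cover(edges, cliques):
        # One Boolean per clique; every edge must lie inside at least one chosen clique.
        prob = pulp.LpProblem("edge_clique_cover", pulp.LpMinimize)
        use = [pulp.LpVariable(f"use_{i}", cat="Binary") for i in range(len(cliques))]
        prob += pulp.lpSum(use)                          # minimize the number of cliques used
        for u, v in edges:
            covering = [use[i] for i, c in enumerate(cliques) if u in c and v in c]
            prob += pulp.lpSum(covering) >= 1            # edge (u, v) must be covered
        prob.solve()
        return [c for i, c in enumerate(cliques) if use[i].value() > 0.5]

    print(min_edge_clique_cover([(1, 2), (2, 3), (1, 3), (3, 4)],
                                [{1, 2, 3}, {3, 4}, {2, 3}]))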

NP-complete to determine vertex cover

Is it right that "it is NP-complete to determine if a graph contains a vertex cover of size 99"?
And is it right that "it takes linear time to determine if a graph contains a vertex cover of size 99"?
One more: is it right to say that "no NP-complete problem can be solved in polynomial time unless the VERTEX COVER problem admits a polynomial-time algorithm"?
"is it NP-complete to determine if a graph contains a vertex cover of size 99"
Pedantically: no.
This problem can be solved in polynomial time. However, the following algorithm is completely useless in practice.
The approach for a graph with n vertices is simply to test all C(n, 99) possible choices of vertex cover. For each choice, we test all edges (at most n(n-1)/2 in a simple graph) to see whether at least one of their endpoints is included.
There are fewer than n^99 ways of choosing the candidate cover, and each check takes O(n^2) time, so overall this algorithm has polynomial complexity, roughly O(n^101).
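A Python sketch of this brute-force check (hypothetical names, purely to show the shape of the algorithm):

    from itertools import combinations

    def has_vertex_cover_of_size(vertices, edges, k=99):
        # Try every size-k subset of vertices; for fixed k this is polynomial
        # (fewer than n^k subsets, O(n^2) work each) but astronomically slow.
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return True
        return False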
As noted by j_random_hacker, this answer assumes that the cover size 99 is a known constant. If the 99 is meant to be a variable that is part of the input, then the problem becomes the standard NP-complete vertex cover problem.

Back Tracking Vs. Greedy Algorithm Maximum Independent Set

I implemented a maximum independent set algorithm using both a greedy algorithm and a backtracking algorithm.
The back tracking algorithm is as follows:
MIS(G = (V, E): a graph): largest set of independent vertices
if |V| = 0 then return ∅ end if
if |V| = 1 then return V end if
pick u ∈ V
G_out ← G − {u}            {remove u from V and E}
G_in  ← G − {u} − N(u)     {N(u) are the neighbors of u}
S_out ← MIS(G_out)
S_in  ← MIS(G_in) ∪ {u}
return maxsize(S_out, S_in)   {return S_in if there's a tie; there's a reason for this}
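Cleaned up, that recursion corresponds to something like the following Python sketch (the adjacency-dict representation is my own choice):

    def mis(graph):
        # graph: dict mapping each vertex to the set of its neighbours.
        # Exhaustively explores both branches, so it returns a true maximum IS.
        if not graph:
            return set()
        u = next(iter(graph))                              # pick any vertex u
        # Branch "u excluded": delete u from the graph.
        g_out = {v: nbrs - {u} for v, nbrs in graph.items() if v != u}
        s_out = mis(g_out)
        # Branch "u included": delete u and all of its neighbours.
        banned = graph[u] | {u}
        g_in = {v: nbrs - banned for v, nbrs in graph.items() if v not in banned}
        s_in = mis(g_in) | {u}
        return s_in if len(s_in) >= len(s_out) else s_out  # prefer s_in on ties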
The greedy algorithm is to iteratively pick the node with the smallest degree, place it in the MIS and then remove it and its neighbors from G.
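For comparison, a sketch of that greedy heuristic:

    def greedy_mis(graph):
        # Repeatedly take a minimum-degree vertex into the IS, then delete it
        # together with all of its neighbours.
        graph = {v: set(nbrs) for v, nbrs in graph.items()}   # work on a copy
        result = set()
        while graph:
            u = min(graph, key=lambda v: len(graph[v]))
            result.add(u)
            banned = graph[u] | {u}
            graph = {v: nbrs - banned for v, nbrs in graph.items() if v not in banned}
        return result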
After running the algorithms on graphs of varying sizes, where each edge exists with probability 0.5, I have empirically found that the backtracking algorithm always found a smaller maximum independent set than the greedy algorithm. Is this expected?
Your solution is strange. Backtracking is usually used for yes/no problems, not for optimization. The algorithm you wrote depends heavily on how you pick u, and it is not really backtracking, because you never backtrack.
Such problem can be solved in a number of ways, e.g.:
genetic programming,
exhaustive searching,
solving the problem on the complement graph, where it becomes the maximum clique problem.
According to Wikipedia, this is an NP-hard problem:
A maximum independent set is an independent set of the largest possible size for a given graph G.
This size is called the independence number of G, and denoted α(G).
The problem of finding such a set is called the maximum independent set problem and is an NP-hard optimization problem.
As such, it is unlikely that there exists an efficient algorithm for finding a maximum independent set of a graph.
So, to find the maximum independent set of a graph, you essentially have to test all candidate sets (with an algorithm whose time complexity is exponential). All faster algorithms (such as greedy, genetic or randomized ones) cannot guarantee finding the exact answer. They can guarantee to find a maximal independent set, but not the maximum one.
In conclusion, your backtracking approach is slower but exact, whereas the greedy approach is only an approximation algorithm.

Verification algorithm for minimum vertex cover?

We know that minimum vertex cover is NP-complete, which means that it is in the set of problems that can be verified in polynomial time.
As I understand it, the verification process would require the following:
Verify that the solution is a vertex cover at all
Verify that the solution is the smallest possible subset of the source graph that satisfies condition #1
I'm finding it hard to establish that step #2 can be done in polynomial time. Can anyone explain how it is?
Minimum vertex cover is NP-hard. It is only NP-complete if it is restated as a decision problem, which can be verified in polynomial time.
The minimum vertex cover problem is the optimization problem of finding a smallest vertex cover in a given graph.
INSTANCE: Graph G
OUTPUT: Smallest number k such that G has a vertex cover of size k.
If the problem is stated as a decision problem, it is called the vertex cover problem:
INSTANCE: Graph G and positive integer k.
QUESTION: Does G have a vertex cover of size at most k?
Restating a problem as a decision problem is a common way to make problems NP-complete. Basically you turn an open-ended problem of the form "find the smallest solution k" into a yes/no question, "for a given k, does a solution exist?"
For example, for the travelling salesman problem, verifying that a proposed solution is the shortest tour through all cities is NP-hard. But if the problem is restated as finding a tour shorter than some total distance k, then verifying a proposed solution is easy: you just compute its length and check that it is less than k.
The decision problem formulation can easily be used to solve the general formulation: to find the optimum, you just ratchet down the value of k until no solution is found (or binary search over k, as sketched below).
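A sketch of that search, where decide is a hypothetical decision oracle ("does a vertex cover of size at most k exist?"):

    def minimum_cover_size(n, decide):
        # Binary search over k; the n vertices of the graph always form a cover,
        # so the answer lies in [0, n]. Uses O(log n) calls to the oracle.
        lo, hi = 0, n
        while lo < hi:
            mid = (lo + hi) // 2
            if decide(mid):
                hi = mid           # a cover of size mid exists; try smaller
            else:
                lo = mid + 1       # no cover of size mid; we need more vertices
        return lo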

System of nondistinct representatives efficiently solvable?

We have sets S1, S2, ..., Sn. These sets do not have to be disjoint. Our task is to select a representative member for each set, such that the total number of elements selected is as small as possible. One element may be present in more than one set, and can represent all the sets it is included in. Is there an algorithm to solve this efficiently?
It is easier to answer this question after a restatement: let the original sets S1, S2, ..., Sn be the elements of the universe, and let the original set members become sets themselves: T1, T2, ..., Tm (where Ti contains the elements {Sj}, i.e. exactly those original sets that contain the corresponding member).
Now we have to cover the universe S1, S2, ..., Sn with the sets T1, T2, ..., Tm, which is exactly the Set Cover problem. It is a well-known NP-hard problem, so there is no algorithm to solve it efficiently (unless P = NP, as theorists usually say). As you can see from the Wikipedia page, there is a greedy approximation algorithm; it is efficient, but its approximation ratio, roughly ln n, is not very good.
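For reference, a sketch of that greedy approximation (the names are my own):

    def greedy_set_cover(universe, subsets):
        # Repeatedly pick the subset covering the most still-uncovered elements;
        # this achieves roughly a ln(n) approximation ratio.
        uncovered = set(universe)
        chosen = []
        while uncovered:
            best = max(subsets, key=lambda s: len(uncovered & s))
            if not uncovered & best:
                raise ValueError("universe is not coverable by the given subsets")
            chosen.append(best)
            uncovered -= best
        return chosen

    print(greedy_set_cover({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}]))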
I'm assuming that by "efficiently," you mean in polynomial time.
Evgeny Kluev is correct: the problem is NP-hard. The decision version of it is known as the hitting set problem, and was shown to be what we now call NP-complete soon after the introduction of that concept. While it's true that Evgeny's reduction is from the hitting set problem to the set cover problem, it's not hard to see an explicit inverse reduction.
Given a collection C = {C1, C2, ..., Cm} whose union is U = {u1, u2, ..., un}, we want to find a minimum-cardinality subcollection C' whose union is also equal to U. Define Si in your initial problem as {Cj in C | ui is an element of Cj}. The minimum hitting set of S = {S1, S2, ..., Sn} is then equal to our desired C'.
Not to steal Evgeny's glory, but here's a rather straightforward way of showing perhaps more rigorously that the general case of the poster's problem is NP-hard.
Consider the minimum vertex cover problem of finding a minimum-size set X of vertices of a simple graph (V, E) such that every edge in E is adjacent to at least one vertex in X.
An edge can be represented by an unordered two-element set {va, vb} where va and vb are distinct elements in V. Note that an edge e represented as {va, vb} is adjacent to vc if and only if vc is an element of {va, vb}.
Hence, the minimum vertex cover problem is the same as finding a minimum-size subset X of V such that each edge set {va, vb} defined by an edge in E contains an element that is in X.
If one had an algorithm to efficiently solve the originally stated problem, then one would have an algorithm to efficiently solve the above problem, and therefore one could solve the minimum vertex cover problem efficiently as well.
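That reduction is tiny in code; a hypothetical Python illustration:

    def vertex_cover_as_representatives(edges):
        # Each edge {u, v} becomes one of the poster's sets; a minimum choice of
        # representatives (one per set, reusing elements) is then exactly a
        # minimum vertex cover of the original graph.
        return [set(e) for e in edges]

    sets = vertex_cover_as_representatives([("a", "b"), ("b", "c"), ("c", "d")])
    print(sets)   # representatives {b, c} cover every set, i.e. every edge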
A couple of algorithms to look at are simulated annealing and genetic algorithms, if you can live with a solution close to optimal (they might find the optimal solution, but not necessarily). Simulated annealing can be made to work in production electronic CAD autoplacement (I was part of the development team for Wintek's autoplacement program).
