How to prove a problem is NP-complete and is in NP? - algorithm

A department needs a committee to select the department’s head. The committee cannot include people who have conflicts of interest with each other. The input consists of:
the desired committee size
a list of all people
a list of all pairs of people that are conflicted.
The goal is to determine whether there’s a conflict-free committee of that size.
How can I show that this problem is NP-complete and is in NP?

This is 99.99% homework, so I will only give you a very brief "answer":
Try to reduce the Independent Set decision problem to your problem.
Also, a useful note: if you prove the problem is NP-complete, then it is in NP by definition.
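To make the hint concrete without claiming it is the full proof, here is a minimal sketch of what such a reduction could look like. The function names are made up for illustration, and the brute-force decider is exponential and only there to sanity-check tiny instances. The idea: each vertex becomes a person, each edge becomes a conflicting pair, and the target size k becomes the committee size.

```python
from itertools import combinations

def independent_set_to_committee(vertices, edges, k):
    """Map an Independent Set instance (graph, k) to a committee instance.

    Hypothetical helper: one person per vertex, one conflict per edge.
    Runs in time linear in the size of the graph.
    """
    people = list(vertices)
    conflicts = [tuple(e) for e in edges]
    return k, people, conflicts

def has_conflict_free_committee(size, people, conflicts):
    """Brute-force decider (exponential time), only for checking small cases."""
    conflict_set = {frozenset(pair) for pair in conflicts}
    for group in combinations(people, size):
        if all(frozenset(p) not in conflict_set for p in combinations(group, 2)):
            return True
    return False

# Path graph a-b-c: {a, c} is an independent set of size 2.
instance = independent_set_to_committee(["a", "b", "c"], [("a", "b"), ("b", "c")], 2)
print(has_conflict_free_committee(*instance))
```

The graph has an independent set of size k exactly when the produced instance has a conflict-free committee of size k, which is the property the reduction needs.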

Showing that a problem is NP-Complete requires you to show that it is in NP.
Get familiar with a subset of NP-complete problems.
Prove NP-hardness: reduce an arbitrary instance of an NP-complete problem to an instance of your problem, in polynomial time. This is the biggest piece of the pie and where familiarity with NP-complete problems pays off. The reduction will be more or less difficult depending on the NP-complete problem you choose.
Prove that your problem is in NP: design an algorithm which can verify in polynomial time whether a proposed solution (a certificate) for an instance is valid.
Showing that it is in NP :
Given a random subset of people of size N, how do you check if they form a conflict-free committee?
Should be easy enough. The algorithm doesn't have to be clever or memory-efficient, just correct and polynomial in running time. Form all possible pairs in the subset and check whether any pair is in the list of conflicts.
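A minimal sketch of that check, assuming people are represented by hashable names and conflicts by pairs of names (the function name and argument layout are my own choice, not something given in the question):

```python
from itertools import combinations

def is_valid_committee(committee, size, people, conflicts):
    """Polynomial-time check of a proposed committee (the certificate)."""
    members = set(committee)
    if len(committee) != size or len(members) != size:
        return False  # wrong size, or the same person listed twice
    if not members <= set(people):
        return False  # someone on the committee is not in the list of people
    conflict_set = {frozenset(pair) for pair in conflicts}
    # About size^2 / 2 pairs, each looked up in expected constant time.
    return all(frozenset(pair) not in conflict_set
               for pair in combinations(members, 2))

people = ["ann", "bob", "cy", "dee"]
conflicts = [("ann", "bob"), ("cy", "dee")]
print(is_valid_committee(["ann", "cy"], 2, people, conflicts))
print(is_valid_committee(["ann", "bob"], 2, people, conflicts))
```

The number of pairs is quadratic in the committee size, so the whole check is polynomial in the input length, which is all that membership in NP asks for.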
Familiarity with NP Completeness:
There are some specific NP-complete problems which are very popular for proving NP-hardness. For instance, see Karp's 21 NP-complete problems.
Proof:
From a quick analysis of your problem, I might initially try the Vertex Cover NP-complete problem, especially because of the conflict clause. Given that you have a restriction on the committee size, maybe you could first try minimum vertex cover.
Good luck.

To prove that the problem is NP-complete you must first prove that the problem is in NP. You can do so by describing a certificate: given a committee size, a list of people, and a list of people with conflicts of interest, the certificate is a proposed committee. If you can verify (not find) in polynomial time whether the committee is valid, then the problem is in NP.
From there you can prove that the problem is NP-complete by transforming, in polynomial time, a problem that is already proven to be NP-complete into your problem.
If you have done both, then the problem is both in NP and NP-complete.

NP-completeness proofs usually work by reduction from a known NP-complete problem. See for example Karp's 21 NP-complete problems. SAT is the one most used (see also the Cook-Levin theorem). You could try to create logic gates using a small number of people, where for one person being a committee member depends on the membership of two other persons. This is for example how NP proofs work for games like Conway's Game of Life and for Morpion solitaire.

Related

Why is the NP-complete set restricted to only decision problems?

Among P, NP, NP hard and NP-complete, only the NP-complete set is restricted to decision problems (those that have a binary solution). What is the reason for this? Why not define it simply as the intersection of NP and NP-hard? And this leads to another question - there must be problems that are not necessarily decision problems and also have the property that any problem in NP can be reduced to them in polynomial time. Is this then a set encompassing NP-complete? Is there already a name for this set?
EDIT: Per the comment by Matt and also the post: What are the differences between NP, NP-Complete and NP-Hard?, it seems P and NP are defined only for decision problems. That would resolve this question apart from why they would be defined this way. But this seems to be in contradiction to the book Introduction to Algorithms by Cormen et al. In chapter 34, the section titled "NP-completeness and the classes P and NP", they simply say: "P consists of those problems that are solvable in polynomial time". They even say later, "NP-completeness applies directly not to optimization problems, but to decision problems" but say no such thing about P and NP.
The classes P and NP are indeed classes of decision problems. I don’t have my copy of CLRS handy, but if they’re claiming that P and NP are all problems solvable in polynomial time or nondeterministic polynomial time, respectively, they are incorrect.
There are some good theoretical reasons why it’s helpful to define P and NP as sets of decision problems. Working with decision problems makes reducibility much easier (reducing one problem to another doesn’t require transforming output), simplifies the model by eliminating issues of how big the answer to a question must be, makes it easier to define nondeterministic computation, etc.
But none of those are “dealbreakers” in the sense that you can define analogous complexity classes that involve computing functions rather than just coming up with yes or no answers. For example, the classes FP and FNP are the function problem versions of P and NP, and the question of whether FP = FNP is similarly open.

How do we know NP-complete problems are the hardest in NP?

I get that if you can do a polynomial time reduction from "every" problem then it proves that the problem is at least as hard as every problem in NP. Except, how do we know that we've discovered every problem in NP? Can't there exist problems that we may not have discovered or proven exist in NP but CANNOT be reduced to any np-complete problem? Or is this still an open question?
As others have correctly stated, the existence of a problem that is in NP but is not NP-complete would imply that P != NP, so finding one would bring you a million dollars and eternal glory. One famous problem that is believed to belong in this class is integer factorization. However, your original question was
Can't there exist problems that we may not have discovered or proven
exist in NP but CANNOT be reduced to any np-complete problem?
The answer is no. By definition of NP-completeness, one of the two necessary conditions for a problem A to be NP-complete is that every problem in NP is reducible in polynomial time to A. If you want to find out how to prove that every single NP problem is reducible in polynomial time to some NP-complete problem, have a look at the proof of the Cook-Levin theorem, which states that the Boolean satisfiability problem (SAT) is NP-complete. It was the first problem proven NP-complete, and many other problems were later proven NP-complete by finding an appropriate reduction from SAT or 3-SAT to these problems.
NP consists of all problems that could (theoretically) be solved by being able to make lucky guesses, guessing the solution and checking in polynomial time that the solution is correct. For example, the travelling salesman problem "can I visit the capitals of all 50 states of the USA with a trip of less than 9,825 miles" can be solved by guessing a trip and checking that it is not too long.
And one problem in NP is basically simulating a programmable computer circuit with various inputs and checking whether a certain output can be achieved. And that programmable computer circuit is powerful enough to solve all problems in NP.
So yes, we know all about all problems in NP.
(Then of course an NP complete problem can by definition be used to solve any problem in NP. If there is a problem that it cannot solve, that problem is not in NP).
Except, how do we know that we've discovered every problem in NP?
We don't. The set of all problems in the universe is not only infinite, but uncountable.
Can't there exist problems that we may not have discovered or proven
exist in NP but CANNOT be reduced to any np-complete problem?
We don't know that. We suspect that this is the case, but this hasn't been proven yet. If we were to find an NP problem that is not in NP-Complete, it would be proof that P =/= NP.
It is one of the great unsolved problems in CS. Many brilliant minds have been taking a go at it, but this nut has been one tough one to crack.

Is the complement of the language CLIQUE element of NP?

I'm studying about the NP class and one of the slides mentions:
It seems that verifying that something is not present is more difficult than verifying that it is present.
Hence, the complements of CLIQUE and SubsetSUM are not obviously members of NP.
Was it ever proved, whether the complement of CLIQUE is an element of NP?
Also, do you have the proof?
This is an open problem, actually! The complexity class co-NP consists of the complements of all problems in NP. It's unknown whether NP = co-NP right now, and many people suspect the answer is no.
Just as CLIQUE is NP-complete, the complement of CLIQUE is co-NP-complete. (More generally, the complement of any NP-complete problem is co-NP-complete.) There's a theorem that if any co-NP-complete problem is in NP, then co-NP = NP, which would be a huge theoretical breakthrough.
If you're interested in learning more about this, check out the Wikipedia article on co-NP and look around online for more resources.
The general intuition here: it's easy to prove a graph contains an N-clique: just show me the clique. It's hard to prove a graph that doesn't have an N-clique in fact doesn't have an N-clique. What property of the graph are you going to exploit to do that?
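The "just show me the clique" half of that intuition is easy to write down. Here is a rough sketch, assuming an undirected graph given as a vertex list plus an edge list (the function name and representation are mine, purely for illustration):

```python
from itertools import combinations

def is_clique(vertices, edges, candidate, n):
    """Check a claimed n-clique in time polynomial in the graph size."""
    nodes = set(candidate)
    if len(nodes) != n or not nodes <= set(vertices):
        return False  # wrong size, repeated vertex, or vertex not in the graph
    edge_set = {frozenset(e) for e in edges}
    # Every pair of chosen vertices must be joined by an edge.
    return all(frozenset(pair) in edge_set for pair in combinations(nodes, 2))

vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(is_clique(vertices, edges, [1, 2, 3], 3))
print(is_clique(vertices, edges, [2, 3, 4], 3))
```

Nothing comparably short is known for the other direction: there is no obvious small object you could hand to a checker like this to convince it that no n-clique exists anywhere in the graph.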
Sure, for some families of graphs you can -- for example, graphs with sufficiently few edges can't possibly have such a clique. It's entirely possible that all graphs can have similar proofs built around them, although it'd be surprising -- almost as surprising as P=NP.
This is why the complements of languages in NP are not, in general, obviously in NP -- in fact, we have the term "co-NP" (as in "the complement is in NP") to refer to languages like !CLIQUE.
One common approach to making progress in complexity theory, where we haven't made progress against the hard questions, is to show that some specific hard-to-prove result would imply something surprising. Showing that NP=co-NP is a common target of these proofs -- for example, any hard problem in both NP and co-NP probably isn't complete for either, because if it were, it would be complete for both and the two classes would be equal, so somehow you would have those crazy general graph proofs.
This even generalizes -- you can start talking about what happens if your NTMs (or certificate checkers) have an oracle for an NP-complete language like CLIQUE. Obviously both CLIQUE and !CLIQUE are in P^CLIQUE, but now there are (probably) new languages in NP^CLIQUE and co-NP^CLIQUE, and you can keep going further until you have an entire hierarchy of complexity classes -- the "polynomial hierarchy". This hierarchy intuitively goes on forever, but may well collapse at some point or even not exist at all (if P=NP).
The polynomial hierarchy makes this general argument technique more powerful -- showing that some result would make the polynomial hierarchy collapse to the 2nd or 3rd level would make that result pretty surprising. Even showing it collapses at all would be somewhat surprising.

Is it necessary for NP problems to be decision problems?

Professor Tim Roughgarden from Stanford University, while teaching a MOOC, said that solutions to problems in the class NP must be polynomial in length. But the Wikipedia article says that NP problems are decision problems. So what type of problems are actually in the class NP? And is it unnecessary to say that solutions to such problems have polynomial-length output (as decision problems necessarily output either 0 or 1)?
He was probably talking about witnesses and verifiers.
For every problem in NP, there is a verifier (read: an algorithm or Turing machine) that can verify "yes"-claims in polynomial time.
The idea is, that you have some kind of information—the witness—to help you do this given the time constraints.
For instance, in the travelling salesman problem:
TSP = {(G, k) : G has a Hamiltonian cycle of cost <= k}
For a given input (G, k), you only need to determine whether or not the problem instance is in TSP. That's a yes/no answer.
Now, if someone comes along and says: This problem instance is in TSP, you will demand a proof. The other person will then probably give you a sequence of cities. You can then simply check whether the cities in that order form a Hamiltonian cycle and whether the total cost of the cycle is ≤ k.
You can perform this procedure in polynomial time—given that the witness is polynomial in length.
Using this sequence of cities, you were thus able to correctly determine that the problem instance was indeed in TSP.
That's the idea of verifiers: they take a proof object/witness that is polynomial in length and check, in polynomial time, that a certain problem instance is in the language.
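As a rough sketch of such a verifier for the TSP example, assume the graph is given as a dict of dicts of edge costs, where a missing entry means there is no edge (that representation and the function name are my own assumptions):

```python
def verify_tour(dist, k, tour):
    """Verifier for (G, k) in TSP: `tour` is the witness, a list of cities."""
    if len(tour) != len(dist) or set(tour) != set(dist):
        return False  # must visit every city exactly once
    total = 0
    # Pair each city with its successor, wrapping around to close the cycle.
    for a, b in zip(tour, tour[1:] + tour[:1]):
        if b not in dist[a]:
            return False  # no edge between consecutive cities
        total += dist[a][b]
    return total <= k

dist = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2},
    "C": {"A": 4, "B": 2},
}
print(verify_tour(dist, 7, ["A", "B", "C"]))  # tour cost is 1 + 2 + 4 = 7
print(verify_tour(dist, 6, ["A", "B", "C"]))
```

The witness is a list of n cities and the loop does one lookup per city, so both the witness length and the checking time are polynomial in the input size.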
The standard definition of NP is that it is a class of decision problems only. Decision problems always produce a yes/no answer and thus have constant-sized output.
Didn't watch the video/course, but I am guessing he was talking about certificates/verification and not solutions. Big difference.

Explaining computational complexity theory

Assuming some background in mathematics, how would you give a general overview of computational complexity theory to the naive?
I am looking for an explanation of the P = NP question. What is P? What is NP? What is a NP-Hard?
Sometimes Wikipedia is written as if the reader already understands all concepts involved.
Hoooo, doctoral comp flashback. Okay, here goes.
We start with the idea of a decision problem, a problem for which an algorithm can always answer "yes" or "no." We also need the idea of two models of computer (Turing machine, really): deterministic and non-deterministic. A deterministic computer is the regular computer we always think of; a non-deterministic computer is one that is just like we're used to except that it has unlimited parallelism, so that any time you come to a branch, you spawn a new "process" and examine both sides. Like Yogi Berra said, when you come to a fork in the road, you should take it.
A decision problem is in P if there is a polynomial-time algorithm to get that answer. A decision problem is in NP if there is a polynomial-time algorithm for a non-deterministic machine to get the answer.
Problems known to be in P are trivially in NP --- the nondeterministic machine just never troubles itself to fork another process, and acts just like a deterministic one. There are problems that are known to be neither in P nor NP; a simple example is to enumerate all the bit vectors of length n. No matter what, that takes 2^n steps.
(Strictly, a decision problem is in NP if a nondeterministic machine can arrive at an answer in poly-time, and a deterministic machine can verify that the solution is correct in poly time.)
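If the exponential count in the bit-vector example is not obvious, a tiny sketch shows it; there are 2^n bit vectors of length n, so merely listing them takes that many steps (the helper name is mine):

```python
from itertools import product

def bit_vectors(n):
    """Return an iterator over every 0/1 vector of length n."""
    return product((0, 1), repeat=n)

for n in range(1, 6):
    count = sum(1 for _ in bit_vectors(n))
    print(n, count, 2 ** n)  # the last two columns agree
```

Each extra bit doubles the number of vectors, so no algorithm can write them all out in time polynomial in n.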
But there are some problems which are known to be in NP for which no poly-time deterministic algorithm is known; in other words, we know they're in NP, but don't know if they're in P. The traditional example is the decision-problem version of the Traveling Salesman Problem (decision-TSP): given the cities and distances, is there a route that covers all the cities, returning to the starting point, in less than x distance? It's easy in a nondeterministic machine, because every time the nondeterministic traveling salesman comes to a fork in the road, he takes it: his clones head on to the next city they haven't visited, and at the end they compare notes and see if any of the clones took less than x distance.
(Then, the exponentially many clones get to fight it out for which ones must be killed.)
It's not known whether decision-TSP is in P: there's no known poly-time solution, but there's no proof such a solution doesn't exist.
Now, one more concept: given decision problems P and Q, if an algorithm can transform a solution for P into a solution for Q in polynomial time, it's said that Q is poly-time reducible (or just reducible) to P.
A problem is NP-complete if you can prove that (1) it's in NP, and (2) show that it's poly-time reducible to a problem already known to be NP-complete. (The hard part of that was proving the first example of an NP-complete problem: that was done by Steve Cook in Cook's Theorem.)
So really, what it says is that if anyone ever finds a poly-time solution to one NP-complete problem, they've automatically got one for all the NP-complete problems; that will also mean that P=NP.
A problem is NP-hard if and only if it's "at least as" hard as an NP-complete problem. The more conventional Traveling Salesman Problem of finding the shortest route is NP-hard, not strictly NP-complete.
Michael Sipser's Introduction to the Theory of Computation is a great book, and is very readable. Another great resource is Scott Aaronson's Great Ideas in Theoretical Computer Science course.
The formalism that is used is to look at decision problems (problems with a Yes/No answer, e.g. "does this graph have a Hamiltonian cycle") as "languages" -- sets of strings -- inputs for which the answer is Yes. There is a formal notion of what a "computer" is (Turing machine), and a problem is in P if there is a polynomial time algorithm for deciding that problem (given an input string, say Yes or No) on a Turing machine.
A problem is in NP if it is checkable in polynomial time, i.e. if, for inputs where the answer is Yes, there is a (polynomial-size) certificate given which you can check that the answer is Yes in polynomial time. [E.g. given a Hamiltonian cycle as certificate, you can obviously check that it is one.]
It doesn't say anything about how to find that certificate. Obviously, you can try "all possible certificates" but that can take exponential time; it is not clear whether you will always have to take more than polynomial time to decide Yes or No; this is the P vs NP question.
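Here is a small sketch of that gap for the Hamiltonian cycle example: a fast checker for one certificate, and a naive search that tries all candidate orderings. The names and the graph representation (vertex list plus undirected edge list) are assumptions made for illustration.

```python
from itertools import permutations

def is_hamiltonian_cycle(vertices, edges, order):
    """Polynomial-time check that `order` is a Hamiltonian cycle."""
    if len(order) < 3 or len(order) != len(vertices):
        return False
    if set(order) != set(vertices):
        return False  # some vertex is missing or repeated
    edge_set = {frozenset(e) for e in edges}
    # Consecutive vertices (wrapping around) must all be adjacent.
    return all(frozenset((a, b)) in edge_set
               for a, b in zip(order, order[1:] + order[:1]))

def has_hamiltonian_cycle(vertices, edges):
    """Try every ordering: correct, but factorial time in the worst case."""
    vertices = list(vertices)
    if len(vertices) < 3:
        return False
    first, rest = vertices[0], vertices[1:]
    return any(is_hamiltonian_cycle(vertices, edges, (first,) + p)
               for p in permutations(rest))

square = [(1, 2), (2, 3), (3, 4), (4, 1)]
star = [(0, 1), (0, 2), (0, 3)]
print(has_hamiltonian_cycle([1, 2, 3, 4], square))
print(has_hamiltonian_cycle([0, 1, 2, 3], star))
```

The checker is what puts the problem in NP. The search loop is the part nobody knows how to replace with something polynomial.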
A problem is NP-hard if being able to solve that problem means being able to solve all problems in NP.
Also see this question:
What is an NP-complete in computer science?
But really, all these are probably only vague to you; it is worth taking the time to read e.g. Sipser's book. It is a beautiful theory.
This is a comment on Charlie's post.
A problem is NP-complete if you can prove that (1) it's in NP, and
(2) show that it's poly-time reducible to a problem already known to
be NP-complete.
There is a subtle error with the second condition. Actually, what you need to prove is that a known NP-complete problem (say Y) is polynomial-time reducible to this problem (let's call it problem X).
The reasoning behind this manner of proof is that if you could reduce an NP-complete problem to this problem and somehow manage to solve this problem in poly-time, then you've also succeeded in finding a poly-time solution to the NP-complete problem, which would be a remarkable (if not impossible) thing, since then you'll have succeeded in resolving the long-standing P = NP problem.
Another way to look at this proof is as an application of the contrapositive, which essentially states that if A --> B, then ~B --> ~A. The reduction shows that a poly-time solution for X gives a poly-time solution for Y; in other words, if Y cannot be solved in polynomial time, then X cannot be solved in poly-time either. On the other hand, if you could solve X in poly-time, then you could solve Y in poly-time as well. Further, you could solve all problems that reduce to Y in poly-time as well by transitivity.
I hope my explanation above is clear enough. A good source is Chapter 8 of Algorithm Design by Kleinberg and Tardos or Chapter 34 of Cormen et al.
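To illustrate the direction of the reduction with a concrete pair, take Y = Independent Set and X = Clique. This is only a sketch with invented names: the Clique "solver" below is a brute-force stand-in, and the point is just that any solver for X, plus a poly-time translation, gives a solver for Y.

```python
from itertools import combinations

def complement_edges(vertices, edges):
    """Poly-time reduction step: build the complement graph's edge list."""
    present = {frozenset(e) for e in edges}
    return [pair for pair in combinations(vertices, 2)
            if frozenset(pair) not in present]

def has_clique(vertices, edges, k):
    """Stand-in solver for X (Clique); brute force, exponential time."""
    edge_set = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in edge_set for p in combinations(group, 2))
               for group in combinations(vertices, k))

def has_independent_set(vertices, edges, k):
    """Solver for Y (Independent Set) built by reducing Y to X."""
    return has_clique(vertices, complement_edges(vertices, edges), k)

# Path a-b-c has the independent set {a, c} but no independent set of size 3.
print(has_independent_set(["a", "b", "c"], [("a", "b"), ("b", "c")], 2))
print(has_independent_set(["a", "b", "c"], [("a", "b"), ("b", "c")], 3))
```

If `has_clique` were ever replaced by a polynomial-time routine, `has_independent_set` would become polynomial too, which is exactly why the known NP-complete problem has to be the one being reduced to the new problem.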
Unfortunately, the best two books I am aware of (Garey and Johnson and Hopcroft and Ullman) both start at the level of graduate proof-oriented mathematics. This is almost certainly necessary, as the whole issue is very easy to misunderstand or mischaracterize. Jeff nearly got his ears chewed off when he attempted to approach the matter in too folksy/jokey a tone.
Perhaps the best way is to simply do a lot of hands-on work with big-O notation using lots of examples and exercises. See also this answer. Note, however, that this is not quite the same thing: individual algorithms can be described by asymptotes, but saying that a problem is of a certain complexity is a statement about every possible algorithm for it. This is why the proofs are so complicated!
I remember "Computational Complexity" from Papadimitriou (I hope I spelled the name right) as a good book
Very much simplified: a problem is NP-hard if, as far as anyone knows, the only way to solve it exactly is essentially by enumerating all possible answers and checking each one.
Here are a few links on the subject:
Clay Mathematics statement of P vs NP problem
P vs NP Page
P, NP, and Mathematics
If you are familiar with the idea of set cardinality, that is the number of elements in a set, then one could view the question like P representing the cardinality of Integer numbers while NP is a mystery: Is it the same or is it larger like the cardinality of all Real numbers?
My simplified answer would be: "Computational complexity is the analysis of how much harder a problem becomes when you add more elements."
In that sentence, the word "harder" is deliberately vague because it could refer either to processing time or to memory usage.
In computer science it is not enough to be able to solve a problem. It has to be solvable in a reasonable amount of time. So while in pure mathematics you come up with an equation, in CS you have to refine that equation so you can solve a problem in reasonable time.
That is the simplest way I can think to put it, though it may be too simple for your purposes.
Depending on how long you have, maybe it would be best to start at DFA, NDFA, and then show that they are equivalent. Then they understand ND vs. D, and will understand regular expressions a lot better as a nice side effect.

Resources