I am trying to understand the relationships between P, NP, NP-Complete and NP-Hard.
I believe I am starting to understand the general idea, but I am hung up on this question (see title).
What is an example of a problem that is not solvable in P time, is verifiable in P time but is not NP-Complete?
If there is some piece of understanding I am missing please fill me in.
Thanks in advance
As noted in the comments, this is the wrong site for this question. However, it can be answered briefly:
What is an example of a problem that is not solvable in P time, is verifiable in P time but is not NP-Complete?
If I understand you, what you want are problems that are (1) not in P, (2) in NP, and (3) not in NPC. Such problems are the NP-intermediate (NPI) problems.
It is not known if there is any such problem, because it is not known if P=NP.
If P=NP then clearly there are no such problems: every problem which can be verified in P time is then also solvable in P time, so condition (1) can never hold. (In that case P and NPC also coincide, apart from the two trivial languages that accept nothing or accept everything.)
If P!=NP then it is known that there are such problems; at least one exists. Unfortunately we do not know if any real-world problems we face are in NPI provided that P!=NP. A list of likely candidates can be found here:
https://en.wikipedia.org/wiki/NP-intermediate
In short: knowing whether NPI is empty or not is equivalent to resolving P vs NP, so get cracking! If you can find a problem that is definitely in NP but definitely not in P or NPC, then there's a big money prize awaiting you.
Assuming P != NP
The Euler diagram shows a region of NP that is part of neither P nor NP-complete. I read on Wikipedia that this set is called NP-intermediate.
Euler Diagram
I have some doubts about how NPI problems are defined.
An NP-intermediate problem is a decision problem that
is in NP (that is, "yes" answers can be verified in polynomial time),
is not in P (that is, there is no polynomial-time algorithm for solving the problem), and
is not NP-complete.
That last criterion can be stated in a number of different ways. One way to say this is that there is no polynomial-time mapping reduction from SAT to that particular problem.
These problems are primarily of theoretical interest right now because we don't know if any NP-intermediate problems exist - if we could find one, we'd have a problem in NP that's not in P, meaning that P ≠ NP! However, they're interesting because if we can prove that P ≠ NP, then we know that there are some problems in NP that are too hard to be solved in polynomial time, but which aren't among the "hardest" of the hard problems in NP (the problems that are NP-complete).
In the event that P = NP, then there would not be any NP-intermediate problems because you couldn't have a problem in NP but not in P. If P ≠ NP, then Ladner's theorem guarantees at least one NP-intermediate problem exists, but does so by specifically constructing a problem that is highly artificial and designed solely to be NP-intermediate in that case. Right now, with a few exceptions (notably the graph isomorphism problem), all the problems we know of in NP are either squarely in P or known to be NP-complete.
Is there any language in RE that is complete with regard to polynomial-time reductions?
I think that A_TM would be a good example, but I'm not sure...
Yes, A_TM is RE-complete with respect to polynomial-time reductions. Given any RE language L, let M be a recognizer for it. Then the function f(w) = ⟨M, w⟩ can be computed in polynomial time (for some reasonable representation of tuples) because M is a fixed machine, so the encoding ⟨M, w⟩ is at most polynomially longer than the original input w. We also have that w ∈ L if and only if M accepts w if and only if ⟨M, w⟩ ∈ A_TM, so f is a polynomial-time reduction from an arbitrary RE language L to A_TM, making A_TM RE-complete with respect to polynomial-time reductions.
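To make the running-time claim concrete, here is a minimal Python sketch of the map f. The machine description is a placeholder string and the JSON pair encoding is my own assumption about what a "reasonable representation of tuples" could look like; the names reduce_to_atm and decode_pair are hypothetical.

```python
import json
from typing import Tuple

# Hypothetical fixed description of a recognizer M for L. Its contents are a
# placeholder; only the fact that its length is a constant matters here.
M_DESCRIPTION = "<encoding of the fixed recognizer M>"


def reduce_to_atm(w: str) -> str:
    """Map w to an encoding of the pair (M, w), in time linear in len(w)."""
    return json.dumps([M_DESCRIPTION, w])


def decode_pair(encoded: str) -> Tuple[str, str]:
    """Recover (machine description, input) from the encoding."""
    machine, w = json.loads(encoded)
    return machine, w
```

The output is only a constant number of characters longer than M's description plus (a constant multiple of) w, which is why f is computable in polynomial time even though deciding membership in A_TM is not computable at all.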
I'm not sure why you'd be interested in this particular notion of RE-completeness, since RE is mostly useful for notions of computability (can you solve this problem at all?) while polynomial-time reductions are usually for complexity (can you solve this problem efficiently?) If you do have an interesting use case for them, though, I'd love to hear about it!
Professor Tim Roughgarden from Stanford University, while teaching a MOOC, said that solutions to problems in the class NP must be polynomial in length. But the Wikipedia article says that NP problems are decision problems. So what type of problems are actually in the class NP? And is it unnecessary to say that solutions to such problems have polynomial-length output (as decision problems necessarily output either 0 or 1)?
He was probably talking about witnesses and verifiers.
For every problem in NP, there is a verifier (read: an algorithm or Turing machine) that can verify "yes"-claims in polynomial time.
The idea is, that you have some kind of information—the witness—to help you do this given the time constraints.
For instance, in the travelling salesman problem:
TSP = {(G, k) : G has a Hamiltonian cycle of cost ≤ k}
For a given input (G, k), you only need to determine whether or not the problem instance is in TSP. That's a yes/no answer.
Now, if someone comes along and says: This problem instance is in TSP, you will demand a proof. The other person will then probably give you a sequence of cities. You can then simply check whether the cities in that order form a Hamiltonian cycle and whether the total cost of the cycle is ≤ k.
You can perform this procedure in polynomial time—given that the witness is polynomial in length.
Using this sequence of cities, you were thus able to correctly determine that the problem instance was indeed in TSP.
That's the idea of verifiers: They take a proof object/witness that is polynomial in length to check in polynomial time, that a certain problem instance is in the language.
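The checking procedure described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the graph is given as a dictionary from unordered city pairs to costs, and the names verify_tsp and Weights are hypothetical.

```python
from typing import Dict, FrozenSet, Sequence

Weights = Dict[FrozenSet[str], float]  # undirected edge {u, v} -> cost


def verify_tsp(cities: Sequence[str], weights: Weights, k: float,
               witness: Sequence[str]) -> bool:
    """Return True iff `witness` is a Hamiltonian cycle of cost <= k."""
    # The witness must list every city exactly once.
    if len(witness) != len(cities) or set(witness) != set(cities):
        return False
    total = 0.0
    for i, u in enumerate(witness):
        v = witness[(i + 1) % len(witness)]  # wrap around to close the cycle
        edge = frozenset((u, v))
        if edge not in weights:  # missing edge: not a cycle in G
            return False
        total += weights[edge]
    return total <= k
```

The loop runs once per city in the witness, so the check is polynomial in the input size. Note that the verifier only confirms a "yes" claim; rejecting one particular witness says nothing about whether some other witness would work.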
The standard definition of NP is that it is a class of decision problems only. Decision problems always produce a yes/no answer and thus have constant-sized output.
Didn't watch the video/course, but I am guessing he was talking about certificates/verification and not solutions. Big difference.
Assuming some background in mathematics, how would you give a general overview of computational complexity theory to the naive?
I am looking for an explanation of the P = NP question. What is P? What is NP? What is a NP-Hard?
Sometimes Wikipedia is written as if the reader already understands all concepts involved.
Hoooo, doctoral comp flashback. Okay, here goes.
We start with the idea of a decision problem, a problem for which an algorithm can always answer "yes" or "no." We also need the idea of two models of computer (Turing machine, really): deterministic and non-deterministic. A deterministic computer is the regular computer we always think of; a non-deterministic computer is one that is just like we're used to except that it has unlimited parallelism, so that any time you come to a branch, you spawn a new "process" and examine both sides. Like Yogi Berra said, when you come to a fork in the road, you should take it.
A decision problem is in P if there is a polynomial-time algorithm to get that answer. A decision problem is in NP if there is a polynomial-time algorithm for a non-deterministic machine to get the answer.
Problems known to be in P are trivially in NP --- the nondeterministic machine just never troubles itself to fork another process, and acts just like a deterministic one. There are problems that are known to be neither in P nor NP; a simple example is to enumerate all the bit vectors of length n. No matter what, that takes 2^n steps.
(Strictly, a decision problem is in NP if a nondeterministic machine can arrive at an answer in poly-time, and a deterministic machine can verify that the solution is correct in poly time.)
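As a small illustration of the bit-vector example mentioned above, here is a Python sketch (the function names are my own) that enumerates every bit vector of length n. Whatever the implementation, the output alone has exponentially many entries.

```python
from itertools import product


def bit_vectors(n: int):
    """Return an iterator over every bit vector of length n."""
    return product((0, 1), repeat=n)


def count_bit_vectors(n: int) -> int:
    """Count the vectors by walking through all of them, one step each."""
    return sum(1 for _ in bit_vectors(n))
```

Each increase of n by one doubles the number of vectors, so no algorithm can list them all in time polynomial in n.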
But there are some problems which are known to be in NP for which no poly-time deterministic algorithm is known; in other words, we know they're in NP, but don't know if they're in P. The traditional example is the decision-problem version of the Traveling Salesman Problem (decision-TSP): given the cities and distances, is there a route that covers all the cities, returning to the starting point, in less than x distance? It's easy in a nondeterministic machine, because every time the nondeterministic traveling salesman comes to a fork in the road, he takes it: his clones head on to the next city they haven't visited, and at the end they compare notes and see if any of the clones took less than x distance.
(Then, the exponentially many clones get to fight it out for which ones must be killed.)
It's not known whether decision-TSP is in P: there's no known poly-time solution, but there's no proof such a solution doesn't exist.
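For comparison, here is what a deterministic machine can do today in the obvious way: play every clone one after another. This is a brute-force sketch under my own assumptions (a distance matrix dist and the hypothetical name decision_tsp), not a claim about the best known algorithm.

```python
from itertools import permutations
from typing import List


def decision_tsp(dist: List[List[float]], x: float) -> bool:
    """Is there a round trip through all cities of total distance < x?

    dist[i][j] is the distance from city i to city j.
    """
    n = len(dist)
    if n == 0:
        return False
    # Fix city 0 as the start; each ordering of the rest is one "clone".
    for rest in permutations(range(1, n)):
        route = (0,) + rest + (0,)
        length = sum(dist[a][b] for a, b in zip(route, route[1:]))
        if length < x:
            return True
    return False
```

The loop visits (n-1)! routes in the worst case, which is what makes this approach infeasible for more than a handful of cities.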
Now, one more concept: given decision problems P and Q, if an algorithm can transform any instance of Q into an instance of P with the same yes/no answer in polynomial time (so that a way of solving P also gives a way of solving Q), it's said that Q is poly-time reducible (or just reducible) to P.
A problem is NP-complete if you can prove that (1) it's in NP, and (2) show that it's poly-time reducible to a problem already known to be NP-complete. (The hard part of that was proving the first example of an NP-complete problem: that was done by Steve Cook in Cook's Theorem.)
So really, what it says is that if anyone ever finds a poly-time solution to one NP-complete problem, they've automatically got one for all the NP-complete problems; that will also mean that P=NP.
A problem is NP-hard if and only if it's "at least as" hard as an NP-complete problem. The more conventional Traveling Salesman Problem of finding the shortest route is NP-hard, not strictly NP-complete.
Michael Sipser's Introduction to the Theory of Computation is a great book, and is very readable. Another great resource is Scott Aaronson's Great Ideas in Theoretical Computer Science course.
The formalism that is used is to look at decision problems (problems with a Yes/No answer, e.g. "does this graph have a Hamiltonian cycle") as "languages" -- sets of strings -- inputs for which the answer is Yes. There is a formal notion of what a "computer" is (Turing machine), and a problem is in P if there is a polynomial time algorithm for deciding that problem (given an input string, say Yes or No) on a Turing machine.
A problem is in NP if it is checkable in polynomial time, i.e. if, for inputs where the answer is Yes, there is a (polynomial-size) certificate given which you can check that the answer is Yes in polynomial time. [E.g. given a Hamiltonian cycle as certificate, you can obviously check that it is one.]
It doesn't say anything about how to find that certificate. Obviously, you can try "all possible certificates" but that can take exponential time; it is not clear whether you will always have to take more than polynomial time to decide Yes or No; this is the P vs NP question.
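The "try all possible certificates" idea can be sketched generically. The example problem here is subset sum, which is my own choice for illustration (the text above uses Hamiltonian cycles), with a bit string as certificate saying which numbers to pick; all names are hypothetical.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

Certificate = Tuple[int, ...]


def verify_subset_sum(instance: Tuple[Sequence[int], int],
                      cert: Certificate) -> bool:
    """Polynomial-time check: do the selected numbers add up to the target?"""
    numbers, target = instance
    return (len(cert) == len(numbers)
            and sum(x for x, bit in zip(numbers, cert) if bit) == target)


def decide_by_search(instance, cert_len: int,
                     verifier: Callable[..., bool]) -> bool:
    """Answer Yes iff some bit-string certificate of length cert_len verifies.

    Tries up to 2**cert_len candidates, so it is exponential in the worst case.
    """
    return any(verifier(instance, cert)
               for cert in product((0, 1), repeat=cert_len))
```

Checking one certificate is cheap; the cost comes entirely from not knowing which certificate to try. Whether that search can always be avoided is exactly the P vs NP question.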
A problem is NP-hard if being able to solve that problem means being able to solve all problems in NP.
Also see this question:
What is an NP-complete in computer science?
But really, all these are probably only vague to you; it is worth taking the time to read e.g. Sipser's book. It is a beautiful theory.
This is a comment on Charlie's post.
A problem is NP-complete if you can prove that (1) it's in NP, and
(2) show that it's poly-time reducible to a problem already known to
be NP-complete.
There is a subtle error with the second condition. Actually, what you need to prove is that a known NP-complete problem (say Y) is polynomial-time reducible to this problem (let's call it problem X).
The reasoning behind this manner of proof is that if you could reduce an NP-complete problem to this problem and somehow manage to solve this problem in poly-time, then you've also succeeded in finding a poly-time solution to the NP-complete problem, which would be a remarkable (if not impossible) thing, since then you'll have succeeded in resolving the long-standing P = NP problem.
Another way to look at this proof is to consider it as using the contrapositive proof technique, which essentially states that if A --> B, then ~B --> ~A. Here the reduction from Y to X gives "X is solvable in poly-time --> Y is solvable in poly-time". In other words, if Y cannot be solved in polynomial time, then X cannot be solved in poly-time either. On the other hand, if you could solve X in poly-time, then you could solve Y in poly-time as well. Further, you could solve all problems that reduce to Y in poly-time as well by transitivity.
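Here is a small Python sketch of the direction of the reduction. I take Y to be the Hamiltonian cycle problem and X to be decision TSP; that pairing is my own choice of example, and solve_tsp is only a brute-force stand-in for "some solver for X". All names are hypothetical.

```python
from itertools import permutations
from typing import Iterable, List, Tuple


def solve_tsp(dist: List[List[int]], k: int) -> bool:
    """Stand-in solver for X: is there a tour of total cost <= k?"""
    n = len(dist)
    for rest in permutations(range(1, n)):
        route = (0,) + rest + (0,)
        if sum(dist[a][b] for a, b in zip(route, route[1:])) <= k:
            return True
    return False


def reduce_ham_cycle_to_tsp(n: int, edges: Iterable[Tuple[int, int]]):
    """Polynomial-time transformation of a Y instance into an X instance."""
    undirected = {frozenset(e) for e in edges}
    # Real edges cost 1, missing edges cost 2.
    dist = [[1 if frozenset((i, j)) in undirected else 2 for j in range(n)]
            for i in range(n)]
    return dist, n  # a tour of cost n can only use real edges


def has_hamiltonian_cycle(n: int, edges: Iterable[Tuple[int, int]]) -> bool:
    """Solve Y by transforming the instance and calling the solver for X."""
    if n < 3:  # a simple graph needs at least 3 vertices for a cycle
        return False
    dist, k = reduce_ham_cycle_to_tsp(n, edges)
    return solve_tsp(dist, k)
```

The transformation itself takes about n^2 steps, so any poly-time replacement for solve_tsp would immediately give a poly-time has_hamiltonian_cycle. That is the sense in which Y reduces to X and not the other way around.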
I hope my explanation above is clear enough. A good source is Chapter 8 of Algorithm Design by Kleinberg and Tardos or Chapter 34 of Cormen et al.
Unfortunately, the best two books I am aware of (Garey and Johnson and Hopcroft and Ullman) both start at the level of graduate proof-oriented mathematics. This is almost certainly necessary, as the whole issue is very easy to misunderstand or mischaracterize. Jeff nearly got his ears chewed off when he attempted to approach the matter in too folksy/jokey a tone.
Perhaps the best way is to simply do a lot of hands-on work with big-O notation using lots of examples and exercises. See also this answer. Note, however, that this is not quite the same thing: individual algorithms can be described by asymptotes, but saying that a problem is of a certain complexity is a statement about every possible algorithm for it. This is why the proofs are so complicated!
I remember "Computational Complexity" by Papadimitriou (I hope I spelled the name right) as a good book.
Very much simplified: A problem is NP-hard if the only way to solve it is by enumerating all possible answers and checking each one.
Here are a few links on the subject:
Clay Mathematics statement of P vs NP problem
P vs NP Page
P, NP, and Mathematics
If you are familiar with the idea of set cardinality, that is, the number of elements in a set, then one could view the question like this: P represents the cardinality of the integers while NP is a mystery. Is it the same, or is it larger, like the cardinality of the real numbers?
My simplified answer would be: "Computational complexity is the analysis of how much harder a problem becomes when you add more elements."
In that sentence, the word "harder" is deliberately vague because it could refer either to processing time or to memory usage.
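To put numbers on "how much harder", here is a tiny Python sketch comparing step counts for three hypothetical kinds of algorithm on n elements. The categories and the function name are my own illustration.

```python
def steps(n: int) -> dict:
    """Step counts of three hypothetical algorithms on n elements."""
    return {
        "linear scan": n,               # look at each element once
        "all pairs": n * (n - 1) // 2,  # compare every pair of elements
        "all subsets": 2 ** n,          # examine every subset
    }
```

Going from 10 to 20 elements doubles the linear count, roughly quadruples the pairwise count, and multiplies the subset count by 1024.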
In computer science it is not enough to be able to solve a problem. It has to be solvable in a reasonable amount of time. So while in pure mathematics you come up with an equation, in CS you have to refine that equation so you can solve a problem in reasonable time.
That is the simplest way I can think to put it, though it may be too simple for your purposes.
Depending on how long you have, maybe it would be best to start at DFA, NDFA, and then show that they are equivalent. Then they understand ND vs. D, and will understand regular expressions a lot better as a nice side effect.
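One way to show that equivalence is the subset construction, sketched below in Python under some assumptions of mine: the nondeterministic automaton has no epsilon moves, its transitions are a dictionary from (state, symbol) to a set of states, and the names nfa_to_dfa and dfa_accepts are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

NFA = Dict[Tuple[str, str], Set[str]]  # (state, symbol) -> set of next states


def nfa_to_dfa(delta: NFA, start: str, alphabet: str):
    """Subset construction for an NFA without epsilon moves."""
    start_set = frozenset([start])
    dfa: Dict[Tuple[FrozenSet[str], str], FrozenSet[str]] = {}
    todo, seen = [start_set], {start_set}
    while todo:
        current = todo.pop()
        for symbol in alphabet:
            # A DFA state is the set of NFA states reachable so far.
            target = frozenset(q for s in current
                               for q in delta.get((s, symbol), set()))
            dfa[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa, start_set


def dfa_accepts(dfa, start_set, accepting: Set[str], word: str) -> bool:
    """Run the DFA; accept if the final set contains an accepting NFA state."""
    state = start_set
    for symbol in word:
        state = dfa[(state, symbol)]
    return bool(state & accepting)
```

The deterministic machine tracks every branch at once by remembering a set of states. For finite automata that costs at most an exponential blow-up in the number of states but no extra power, which is a useful warm-up for asking whether the same is true of polynomial-time Turing machines.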