Are there decision problems which are decidable but not in NP? [closed] - complexity-theory

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
This is my first Stack Overflow question, so be gentle. I apologize in advance if this has been beaten to death already... I read a few threads on NP but I haven't found a satisfying answer to my question (if anything, I came up with some new ones). Briefly:
Are there decision problems which are decidable but not in NP?
If so, are problems which ask for a solution harder than the equivalent decision problem?
My suspicion is that the answer to the first question is a resounding 'yes' and that the answer to the second is a resounding 'no'.
In the first case, an example problem might be "given a set S, a subset T of S and a function f with domain 2^S, determine whether T maximizes f". For generic S, T and f, you can't even verify this without checking f(X) for all subsets X of S, right?
In the second case... well, I'll admit this is more of a hunch. For some reason, it doesn't seem like it should matter whether an answer contains one bit (for decision problems) or any (finite) number of bits... or, in other words, why you shouldn't be able to consider the symbols left on the tape after the TM halts as part of the "answer".
EDIT:
Actually, I have a question... how precisely does your construction show that function problems are "no harder" than decision problems? If anything, you've shown that it's no easier to answer a function problem than a decision problem... which is trivial. Perhaps this is my fault for asking the question in a sloppy fashion.
Given a TM T1 in NP which solves the problem "Is X a solution to problem P" for variable X and (for the sake of argument) fixed P, is it guaranteed that there will be a TM T2 in NP which halts everywhere T1 halts, which ends in the "halt accept" state everywhere it halts, and leaves e.g. a binary representation of X on the tape?

(1) Yes, there are decision problems that are decidable but not in NP. It is a consequence of the nondeterministic time hierarchy theorem that NP ⊊ NEXP, and since NP is closed under polynomial-time reductions, no NEXP-hard problem is in NP. The canonical example is this problem:
Given a non-deterministic Turing machine M and an integer n written in binary, does M accept the empty string in at most n steps?
This problem is certainly decidable (just simulate all computation paths for M of length n and see if any of them accepts).
See this question on cstheory.stackexchange.com for many more NEXP-complete problems. And of course, there are decision problems outside NEXP: indeed, a whole exponential hierarchy of them...
(2) The answer to your second question is no: function problems are no harder than decision problems. (In a particular technical sense of "no harder than".) Suppose we have a function problem which asks for output N. Then there's a decision problem which takes an input k and asks whether the kth bit of N is 1. Solve this decision problem for every bit in the answer, and you're done!
For example, FSAT (the problem of finding a satisfying assignment to a Boolean formula) can be polynomial-time reduced to SAT (the problem of determining whether a Boolean formula has a satisfying assignment). Suppose you can solve SAT, and you are asked to find a satisfying assignment for the formula φ. Consider the first variable x in your formula φ, and ask SAT whether x∧φ is satisfiable. If it is, there must be a satisfying assignment for φ in which x is true; if not, and φ is satisfiable, then there must be a satisfying assignment for φ in which x is false. Continue with the second variable y, asking whether x∧y∧φ (or ¬x∧y∧φ, according to the answer to your first question) is satisfiable. And so on for each variable in the formula.
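The reduction just described can be sketched in code. This is only an illustration under assumptions: formulas are given in CNF as lists of integer literals (3 means the third variable, -3 its negation), the names `is_satisfiable` and `find_assignment` are made up for this sketch, and the SAT "oracle" is a brute-force stand-in rather than a real solver.

```python
from itertools import product

def is_satisfiable(clauses, num_vars):
    """Stand-in SAT oracle: tries every assignment (exponential time)."""
    for values in product([False, True], repeat=num_vars):
        # A literal v is true when variable |v| has the matching truth value.
        if all(any(values[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

def find_assignment(clauses, num_vars):
    """Search-to-decision: fix one variable at a time using the oracle."""
    if not is_satisfiable(clauses, num_vars):
        return None
    assignment = {}
    for var in range(1, num_vars + 1):
        # Ask whether the formula stays satisfiable with var forced to true.
        if is_satisfiable(clauses + [[var]], num_vars):
            clauses = clauses + [[var]]
            assignment[var] = True
        else:
            # The formula is satisfiable, so var must be false in some solution.
            clauses = clauses + [[-var]]
            assignment[var] = False
    return assignment

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
print(find_assignment(formula, 3))
```

The search routine makes one oracle call per variable plus one initial call, so it is polynomial apart from the cost of the oracle itself.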
[I ought to add an important caveat about this example. Here SAT and FSAT are "naturally" related: they are both concerned with the same kind of thing, namely satisfying assignments to formulae. But in my argument that search in general can be reduced to decision, I used a highly artificial decision problem about the kth bit of the output. So although search reduces to decision, it doesn't necessarily reduce to the natural corresponding decision problem. In particular, Bellare and Goldwasser showed that sometimes search can't be so reduced.]
M. Bellare and S. Goldwasser (1994). "The complexity of decision versus search." SIAM Journal on Computing. 23:1 pp. 97–119.

Related

Proof a sorting algorithm is incorrect [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 months ago.
I was given the algorithm below and my aim is to prove that it is incorrect. The thing is that it would be too obvious to just use a counterexample, which is why I was looking for a more formal approach. I have thought about a proof by induction, but in the past I have used it only to prove the correctness of an algorithm and I can't really figure out how to go the opposite way.
GoodSort(A, left, right)
{
    if (A[left] > A[right])
        swap(A[left], A[right]);
    if (left+1 >= right)
        return;
    pivot = floor((right-left+1)/3);
    GoodSort(A, left, right-pivot);
    GoodSort(A, left+pivot, right);
}
For any purpose but a class assignment where it's forbidden, a proof by counterexample, e.g. [3,2,4,0,1,4], would be ideal. As some commenters said, clarity and simplicity are desirable.
Assuming this is a class assignment and you need to categorize the set of inputs (or a set of inputs) where this will fail that's broader than a single counterexample, take some minimal input that fails, and analyze why it fails, then generalize that.
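To look for a minimal failing input, one option is to transcribe the pseudocode and search small inputs exhaustively. The sketch below does that in Python; `good_sort` follows the question's GoodSort line by line, while `smallest_failure` is a helper name made up here, and testing only permutations of distinct values is an assumption of this sketch.

```python
from itertools import permutations

def good_sort(a, left, right):
    """Line-by-line transcription of GoodSort; sorts a[left..right] in place."""
    if a[left] > a[right]:
        a[left], a[right] = a[right], a[left]
    if left + 1 >= right:
        return
    pivot = (right - left + 1) // 3
    good_sort(a, left, right - pivot)
    good_sort(a, left + pivot, right)

def smallest_failure(max_len):
    """Return the first input (shortest length first) that ends up unsorted."""
    for n in range(1, max_len + 1):
        for perm in permutations(range(n)):
            a = list(perm)
            good_sort(a, 0, n - 1)
            if a != sorted(perm):
                return list(perm)
    return None

example = [3, 2, 4, 0, 1, 4]
good_sort(example, 0, len(example) - 1)
print(example)              # the counterexample from the answer, after "sorting"
print(smallest_failure(7))  # a shortest failing permutation
```

Once a shortest failing input is in hand, tracing which elements the two recursive calls never compare is a natural starting point for generalizing it.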
An algorithm is either correct or incorrect. If you are having a hard time finding a way to prove its incorrectness, you can instead try to prove its correctness. If that attempt leads to a contradiction, you can conclude that the algorithm is incorrect. This method is called reductio ad absurdum.

How to select the number of cluster centroid in K means [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I am going through a list of algorithms that I found and trying to implement them for learning purposes. Right now I am coding k-means and am confused about the following:
How do you know how many clusters there are in the original data set?
Is there any particular rule that I have to follow in choosing the initial cluster centroids, besides all centroids having to be different? For example, does the algorithm converge if I choose cluster centroids that are different but close together?
Any advice would be appreciated
Thanks
With k-means you are minimizing a sum of squared distances. One approach is to try all plausible values of k. As k increases the sum of squared distances should decrease, but if you plot the result you may see that the sum of squared distances decreases quite sharply up to some value of k, and then much more slowly after that. The last value that gave you a sharp decrease is then the most plausible value of k.
k-means isn't guaranteed to find the best possible answer each run, and it is sensitive to the starting values you give it. One way to reduce problems from this is to start it many times, with different starting values, and pick the best answer. It looks a bit odd if the sum of squared distances found for a larger k is actually larger than the one found for a smaller k. One way to avoid this is to use the best answer found for k clusters as the basis (with slight modifications) for one of the starting points for k+1 clusters.
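Both ideas (looking at the sum of squared distances as k grows, and starting the run for k+1 clusters from the answer for k clusters) can be sketched in a few lines. This is only an illustration under assumptions: the data is one-dimensional and made up, the function names are mine, and instead of many random restarts each new centroid is simply placed on the point farthest from its current centroid.

```python
def sse(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min((p - c) ** 2 for c in centroids) for p in points)

def lloyd(points, centroids, max_iterations=100):
    """Plain k-means iterations (assign, then recompute means) on 1-D data."""
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: (p - centroids[i]) ** 2)
            clusters[nearest].append(p)
        # An empty cluster keeps its old centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def elbow_curve(points, max_k):
    """Map each k to its SSE, seeding k+1 clusters from the k-cluster result."""
    centroids = [sum(points) / len(points)]
    curve = {1: sse(points, centroids)}
    for k in range(2, max_k + 1):
        # Assumed seeding rule: add the point farthest from its centroid.
        farthest = max(points,
                       key=lambda p: min((p - c) ** 2 for c in centroids))
        centroids = lloyd(points, centroids + [farthest])
        curve[k] = sse(points, centroids)
    return curve

data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]  # three obvious groups
for k, value in elbow_curve(data, 5).items():
    print(k, round(value, 3))
```

Because every run for k+1 starts from the k-cluster centroids plus one extra, the reported sum of squared distances cannot go up as k increases, and on this toy data the sharp drop stops at k = 3.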
In standard k-means the value of K is chosen by you, sometimes based on the problem itself (when you know how many classes exist or how many classes you want to exist), other times as a more or less arbitrary value. Typically the first iteration consists of randomly selecting K points from the dataset to serve as centroids. In the following iterations the centroids are adjusted.
After checking out the k-means algorithm, I suggest you also look at k-means++, which improves on the first version by choosing the initial centroids more carefully (K is still chosen by you), avoiding the sometimes poor clusterings found by the standard k-means algorithm.
If you need more specific details on implementation of some machine learning algorithm, please let me know.

Subset product & quantum computers, is an instance solvable [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 8 years ago.
Suppose you have a quantum computer that can run Shor's algorithm for factorization of integers.
Is it then possible to produce an oracle that determines if no solution exists for an instance of the Subset Product problem, with 100% confidence, in sub-exponential time?
So, the oracle is given a sequence x1, ... xn, as the description of a subset product problem.
It responds either Yes, a solution to this instance does not exist, or No, a solution to this instance may or may not exist.
If we take the prime factors of all elements in the sequence and then check to see if all of them are present in the target product's factors, this should tell us if a solution is not at all possible. A solution may exist if and only if all the prime factors are accounted for. On quantum computers, prime factorization is sub-exponential.
I would like some feedback on whether this logic is correct, whether it works, and whether the complexity is indeed different between classical and quantum systems for this oracle/algorithm. I would also appreciate an explanation of reductions: can Subset Product be reduced to 3SAT without consequence?
Your algorithm, if I understood it correctly, will fail for the elements [6, 15] and the target 10. It will determine that 6*15 = 2*3*3*5, which has all of the factors used in 10=2*5, and incorrectly assert that this means you can make 10 by multiplying 6 and 15.
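The counterexample can be checked mechanically. The sketch below is my reading of the proposed check (every prime factor of the target must occur in some element), with plain trial division standing in for Shor's algorithm and an exhaustive subset search as ground truth; all function names are made up for this illustration.

```python
from itertools import combinations
from math import prod

def prime_factors(n):
    """Set of prime factors of n by trial division (classical stand-in)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def passes_factor_check(elements, target):
    """True when every prime factor of the target occurs in some element."""
    available = set().union(*(prime_factors(x) for x in elements))
    return prime_factors(target) <= available

def has_subset_product(elements, target):
    """Ground truth by trying every non-empty subset (exponential time)."""
    return any(prod(subset) == target
               for size in range(1, len(elements) + 1)
               for subset in combinations(elements, size))

print(passes_factor_check([6, 15], 10))  # the factor check does not rule it out
print(has_subset_product([6, 15], 10))   # but no subset multiplies to 10
```

So a passed check cannot be read as "a solution exists"; at best a failed check rules a solution out.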
There are two reasons that it's unlikely you'll be able to fix this algorithm:
Subset Product is NP-Complete. Finding a polynomial time quantum algorithm for it, or showing that no such algorithm exists, is probably as hard as determining if P=NP. That is to say, very very hard.
You don't want the prime factors, you want the "no-need-to-reduce" factors. For example, if every time a number in the problem has a prime factor of 13 it's accompanied by a factor of 17 then there's no need to break 221 into 13*17. You can apply Euclid's gcd algorithm to various combinations of elements to find these no-need-to-reduce factors, no quantum-ness required.

Optimization similar to Knapsack [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 9 years ago.
I am trying to find a way to solve an Optimization problem as follows:
I have 22 different objects that can be selected more than once. I have an evaluation function f that takes the multiplicities and calculates the total value.
f is a product over fractions of linear (affine) terms and as such, differentiable and even smooth in the allowed region.
I want to optimize f with respect to the 22 variables, with the additional conditions that certain sums may not exceed certain values (for example, if a,...,v are my variables, a + e + i + m + q + s <= 9). By this, all of the variables are bounded.
If f were strictly monotonic, this could be solved optimally by a (minimally modified) knapsack solution. However, the function isn't convex. That means it cannot even be assumed that if taking an object A is better than B on an empty knapsack, this choice still holds after adding a third object C (as C could modify B's benefit to be better than A's). This means that a greedy algorithm cannot be used.
Are there similar algorithms that solve such a problem in an optimal (or at least nearly optimal) way?
EDIT: As requested, an example of what the problem is (I chose 5 variables a,b,c,d,e for simplicity)
for example,
f(a,b,c,d,e) = e*(a*0.45+b*1.2-1)/(c+d)
(Every variable only appears once, if this helps at all)
Also, for example, a+b+c=4, d+e=3
The problem is to optimize that with respect to a,b,c,d,e as integers. There is a bunch of optimization algorithms that hold for convex functions, but very few for non-convex...
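For an instance as small as the five-variable example, exhaustive enumeration at least gives a reference optimum against which any heuristic can be compared. The sketch below assumes that "optimize" means maximize, that the variables are non-negative integers, and that points with c + d = 0 are excluded because f is undefined there; none of this is claimed to scale to the full 22-variable problem.

```python
from itertools import product

def f(a, b, c, d, e):
    """The example objective from the question."""
    return e * (a * 0.45 + b * 1.2 - 1) / (c + d)

def brute_force_maximum():
    """Enumerate every integer point with a+b+c = 4 and d+e = 3."""
    best_value, best_point = None, None
    for a, b, d in product(range(5), range(5), range(4)):
        c = 4 - a - b   # forced by a + b + c = 4
        e = 3 - d       # forced by d + e = 3
        if c < 0 or c + d == 0:
            continue    # infeasible, or f undefined
        value = f(a, b, c, d, e)
        if best_value is None or value > best_value:
            best_value, best_point = value, (a, b, c, d, e)
    return best_value, best_point

print(brute_force_maximum())
```

The equality constraints remove two free variables, which is why only a, b and d are enumerated.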

How many fewer questions can I ask to guess a number? [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
Suppose I ask a person to pick a number between 1 and 1200 in his mind. If I can only ask questions to which he will reply YES or NO, how many questions will I need to ask before I arrive at the number he picked?
I am looking for the smallest possible number of questions. Any proven solution would be appreciated.
To determine which of the numbers between 1 and n was chosen, you will need to ask at least log₂ n questions. There is no possible way to do better.
The intuition for this answer is as follows. Suppose that you ask a total of k questions. The maximum number of different possible answers you can receive to those questions, even if they're dependent on one another, is 2^k. Since there are n possible numbers that could be picked, you need to choose k such that
2^k ≥ n
Which happens precisely when
k ≥ log₂ n
In other words, you have to ask at least log₂ n questions to be able to even have enough different possible outcomes to associate each possible number with some possible outcome. Since the number of questions must always be a natural number, the minimum number of questions you can ask must be at least ⌈log₂ n⌉.
This is purely a lower bound on the answer. At this point, we can't rule out the possibility that maybe you need far more questions than this to get the answer. However, the fact that we know about the binary search algorithm means that we know that you never need more than ⌈log₂ n⌉ questions to get the answer, since this is the number of questions you'd ask if you were doing a binary search. This means that the binary search algorithm has to be optimal, since there is no possible way of asking a smaller number of questions.
Hope this helps!
The log base 2 of 1200, rounded up to an integer: that's 11. Basically, every question cuts the possible range in half, so you just continue with a binary search until the possible range has length 1.
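The figure of 11 is easy to confirm by simulation. The sketch below plays the binary search against every possible secret number and records the worst case; the function name and the particular question ("is it greater than mid?") are just one way to phrase the halving.

```python
import math

def count_questions(secret, low=1, high=1200):
    """Count yes/no questions binary search needs to pin down the secret."""
    questions = 0
    while low < high:
        mid = (low + high) // 2
        questions += 1
        if secret > mid:      # the answer is YES
            low = mid + 1
        else:                 # the answer is NO
            high = mid
    return questions

worst_case = max(count_questions(s) for s in range(1, 1201))
print(worst_case, math.ceil(math.log2(1200)))
```

Some numbers are found after 10 questions, but no questioning scheme can guarantee fewer than 11 for every number.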
Ask for all the bits in the number. 11 questions are enough for that.
Edit: I would argue, that it's impossible to do better, due to the bijectivity between the binary and decimal representation - at least for the worst case.
This is also a classic example of the adversary argument method, which is used to find lower bounds on complexity. In our case the person who knows the number is the adversary, so he is free to change his number whenever you ask a new question, as long as he stays consistent with his earlier answers. How does he determine his answer at each step? Suppose, for example, that the number is between 1 and 100.
You ask: is n >= 50? He may say YES or NO; both are about equally good for him, since the two intervals are about equal in size. Let's assume he says YES.
Then you know 50 <= n <= 100. Let's say you ask: is n >= 80? Then he should say NO, even if the number he originally picked is 80 or larger, because 50 <= n < 80 is the larger interval. Now the number may be anywhere between 50 and 79.
Continuing this way, he guarantees the maximum number of questions, which is about log₂ n, since the interval size shrinks as in binary search.
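The adversary's strategy is simple enough to simulate. In the sketch below the adversary never commits to a number; for each question he gives whichever answer leaves more candidates alive. The two questioners (`halving_question` and `one_by_one_question`) are made-up examples of questioning strategies, not part of the original argument.

```python
import math

def play_against_adversary(n, choose_question):
    """Return how many questions are needed before one candidate remains."""
    candidates = set(range(1, n + 1))
    questions = 0
    while len(candidates) > 1:
        predicate = choose_question(candidates)
        yes = {x for x in candidates if predicate(x)}
        no = candidates - yes
        # The adversary keeps the larger side, so at most half is eliminated.
        candidates = yes if len(yes) >= len(no) else no
        questions += 1
    return questions

def halving_question(candidates):
    """Ask 'is n >= the median candidate?' (binary search)."""
    ordered = sorted(candidates)
    pivot = ordered[len(ordered) // 2]
    return lambda x: x >= pivot

def one_by_one_question(candidates):
    """Ask 'is n equal to the smallest remaining candidate?'."""
    guess = min(candidates)
    return lambda x: x == guess

print(play_against_adversary(100, halving_question))
print(play_against_adversary(100, one_by_one_question))
print(math.ceil(math.log2(100)))
```

Whatever question is asked, at least half of the candidates survive each round, which is exactly why no strategy can beat ⌈log₂ n⌉ questions against this adversary.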
