How can I show the correctness of a brute-force TSP algorithm? - complexity-theory

I'm working on exploring TSP, so I have to prove the correctness of the brute-force algorithm on graphs (picking the best permutation out of all ~O(n!) permutations that exist). I've been reading many books and sites, but I can't find a proof of correctness. Does such a proof exist in books or published papers? If anyone has faced this problem before or knows how to solve it, could you please give me some advice?

The proof for all brute-force algorithms is basically the same:
Let BF be a brute-force algorithm and X the set of all possible solutions.
Suppose BF has returned x in X. Assume, for contradiction, that there exists y in X such that y is a better solution than x. But BF is a brute-force algorithm, so it examined y as well, compared it against the best candidate found so far, and still returned x; hence x is at least as good as y. Contradiction. So no such y exists, and x is optimal.
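To make the argument concrete, here is a minimal brute-force TSP sketch in Python (my own illustration, not taken from any particular book). It assumes the graph is given as a complete distance matrix dist; because the loop examines every tour starting at city 0, the tour it returns can be no worse than any other.

from itertools import permutations

def brute_force_tsp(dist):
    n = len(dist)
    best_tour, best_cost = None, float("inf")
    # Fix city 0 as the start and enumerate every ordering of the remaining
    # cities, so each of the (n-1)! candidate tours is examined exactly once.
    for rest in permutations(range(1, n)):
        tour = (0,) + rest
        cost = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
        if cost < best_cost:  # keep the cheapest tour seen so far
            best_tour, best_cost = tour, cost
    return best_tour, best_cost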

Related

Complexity of solving a Diophantine equation with potential solutions

Say I have an equation, given as
13x + 5y = M, where M is given each time.
Evidently this is a Diophantine equation and could take a long time to solve, depending on the case. However, something I read claimed that if we have a set of k unique integer "possible solutions" for X and Y stored in a binary search tree (meaning the correct values for X and Y are contained in there somewhere), we can compute the solution pair (x, y) to the equation in O(k) time.
Now, I'm stuck on this logic because I do not see how storing the elements in this data structure helps us, or how it prevents us from having to plug in each of the k elements for X or Y, solve for the other variable, and check whether the data structure contains that value. The only thing I can think of would be somehow keeping two pointers that move along the tree, but that doesn't seem feasible.
Could someone explain the reasoning behind this logic?
Solving linear Diophantine equations (which is what you seem to be thinking of) is trivial and requires nothing more than the extended Euclidean algorithm. On the other hand, the resolution of Hilbert's tenth problem implies that there is no algorithm which is able to solve arbitrary Diophantine equations.
There is a vast gap between linear and arbitrary. Perhaps you can focus your question on the type of equation you are interested in.
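For the linear case, a hedged sketch of the extended Euclidean approach in Python (function names are my own, not from the quoted reading):

def extended_gcd(a, b):
    # Returns (g, s, t) with g = gcd(a, b) and a*s + b*t = g.
    if b == 0:
        return a, 1, 0
    g, s, t = extended_gcd(b, a % b)
    return g, t, s - (a // b) * t

def solve_linear_diophantine(a, b, m):
    # One integer solution (x, y) of a*x + b*y = m, or None if none exists.
    g, s, t = extended_gcd(a, b)
    if m % g != 0:
        return None
    return s * (m // g), t * (m // g)

print(solve_linear_diophantine(13, 5, 42))  # e.g. (84, -210), since 13*84 + 5*(-210) = 42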

If Y is reducible to X in polynomial time, then how is it true that X is at least as hard as Y?

I am having difficulty understanding the relationship between the complexity of two classes of problems, say NP-hard and NP-complete problems.
The answer at https://stackoverflow.com/a/1857342/ states:
NP Hard
Intuitively, these are the problems that are at least as hard as the NP-complete problems. Note that NP-hard problems do not have to be in NP, and they do not have to be decision problems.
The precise definition here is that a problem X is NP-hard, if there is an NP-complete problem Y, such that Y is reducible to X in polynomial time.
If a problem Y can be reduced to X in polynomial time, should we not say that Y is at least as hard as X? If a problem Y is reducible to X in polynomial time, then the time required to solve Y is polynomial time + the time required to solve X. So it appears to me that problem Y is at least as hard as X.
But the quoted text above says just the opposite. It says, if an NP-complete problem Y is reducible to an NP-hard problem X, then the NP-hard problem is at least as hard as the NP-complete problem.
How does this make sense? Where am I making an error in thinking?
Your error is in supposing that you have to solve X in order to solve Y. Y might actually be much easier, but one way to solve it is to transform it into an instance of problem X. And since we are talking big-O and problems in NP, we are well past linear algorithms, so you can always safely discard any linear parts of an algorithm. In fact, you can almost safely discard any polynomial parts until the P = NP question is resolved. That means O(f(n) + n) = O(f(n)) whenever n = O(f(n)).
Example (which obviously involves neither NP-hard nor NP-complete problems but is just an illustration): you are to find the lowest number in an unsorted array of n numbers. There is an obvious solution: iterate over the whole list and remember the lowest number found so far; pretty straightforward and a solid O(n).
Someone else comes along and says: OK, let's change it to sorting the array; then we can just take the first number and it will be the lowest. Note that this conversion of the problem was O(1), but we could, for example, pretend that some preprocessing of the array was needed, making it O(n). The overall solution is O(n + n log n) = O(n log n).
Here you, too, changed an easy problem into a hard one, thereby showing that the hard problem is at least as hard as the easy one.
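A toy version of that reduction (my own illustration): the conversion from "find the minimum" to sorting is trivial, and the running time is dominated by the sort.

def minimum_via_sorting(numbers):
    # The conversion itself is cheap; the heavy O(n log n) work happens
    # inside sorted(), yet this still solves the easy problem correctly.
    return sorted(numbers)[0]

print(minimum_via_sorting([7, 3, 9, 1, 4]))  # -> 1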
Basically, what the definition of NP-hard means is that X is at least as hard as an NP-complete problem Y. If you find an NP-complete problem Y that you can solve by solving problem X, it means either that X is as hard as or harder than Y, in which case X is indeed NP-hard, or, if X turns out to be simpler, that you have found an algorithm that solves Y faster than any known algorithm, potentially even moving it out of the NP-complete class.
Another example: let's pretend convolution is in my set of "complete" problems and normally takes O(n²). Then you come up with the Fast Fourier Transform at O(n log n), and you find out you can solve convolution by transforming it into an FFT problem. Now you have a solution for convolution which is o(n²), more specifically O(n log n).
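A hedged sketch of the convolution-via-FFT idea, using NumPy (my own illustration; the padding choices are assumptions, not part of the original answer):

import numpy as np

def convolve_fft(a, b):
    n = len(a) + len(b) - 1             # length of the full linear convolution
    size = 1 << (n - 1).bit_length()    # round up to a power of two for the FFT
    fa = np.fft.rfft(a, size)
    fb = np.fft.rfft(b, size)
    # Pointwise multiplication in the frequency domain equals convolution
    # in the time domain; O(n log n) instead of the naive O(n^2).
    return np.fft.irfft(fa * fb, size)[:n]

a, b = [1.0, 2.0, 3.0], [0.0, 1.0, 0.5]
print(np.allclose(convolve_fft(a, b), np.convolve(a, b)))  # -> True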
Let I_X be the indicator function of X (i.e., 1 if the input is in X and 0 otherwise) and I_Y the indicator function of Y. If Y reduces to X via a function f that can be computed in polynomial time, then I_Y = I_X ∘ f, where ∘ denotes function composition. X is at least as hard as Y because, given an algorithm for I_X, the formula above yields an algorithm for I_Y; moreover, for any class of running times closed under polynomial substitution (e.g., polynomial, exponential, finite), if the algorithm for I_X belongs to the class, then so does the resulting algorithm for I_Y. The contrapositive of this statement is: if Y has no fast decision procedure, then X has no fast decision procedure.
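A toy illustration of that composition in Python (the languages and the reduction below are made up purely to show the shape of I_Y = I_X ∘ f):

def decides_X(z):
    # Hypothetical polynomial-time decider for X = multiples of 6.
    return z % 6 == 0

def reduction_f(y):
    # Hypothetical polynomial-time reduction from Y = multiples of 3:
    # y is a multiple of 3 exactly when 2*y is a multiple of 6.
    return 2 * y

def decides_Y(y):
    return decides_X(reduction_f(y))  # I_Y = I_X ∘ f

print(decides_Y(9), decides_Y(10))  # -> True False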

Lambert's O(n^2) algorithm on pairwise sum sort

Sorting the pairwise sums of X and Y with O(n^2) comparisons is quite hard.
http://cs.smith.edu/~orourke/TOPP/P41.html
http://en.wikipedia.org/wiki/X_%2B_Y_sorting
It seems Lambert has concluded that if some non-comparison operations are allowed, then it is achievable.
I found his original paper: http://www.sciencedirect.com/science/article/pii/030439759290089X
Also I found a functional approach for his algorithm.
However, I really don't understand it, even after several attempts.
So what I roughly know is that his algorithm tries to:
- tag each element with its original index
- negate it
- use the equivalence X - Y < X' - Y' ⟺ X - X' < Y - Y' (a quick numeric check of this follows below)
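The equivalence itself can be sanity-checked numerically; this only verifies the algebra and is not Lambert's algorithm:

import random

for _ in range(10000):
    x, xp, y, yp = (random.randint(-100, 100) for _ in range(4))
    assert (x - y < xp - yp) == (x - xp < y - yp)
print("equivalence holds on all sampled tuples")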
Other than that I am totally lost.
So anyone here really understands that algorithm? Can anyone clearly explain it in a simple way?

Finding maximum subsequence below or equal to a certain value

I'm learning dynamic programming and I've been having a great deal of trouble understanding the more complex problems. When given a problem, I've been taught to find a recursive algorithm, memoize it, and then create an iterative, bottom-up version. I have an issue at almost every step. For the recursive algorithm, I can write it in several different ways, but often only one of them is suitable for dynamic programming, and I can't tell which aspects of a recursive algorithm make memoization easier. For memoization, I don't understand which values to use as indices. For the conversion to a bottom-up version, I can't figure out in which order to fill the array/double array.
This is what I understand:
- it should be possible to split the main problem into subproblems
In terms of the problem mentioned, I've come up with a recursive algorithm whose important lines of code are:
int optionOne = values[i] + find(values, i+1, limit - values[i]); // take values[i] and reduce the remaining limit
int optionTwo = find(values, i+1, limit); // skip values[i] and keep the limit unchanged
If I'm being unclear or this is not the correct Q&A site, let me know.
Edit:
Example: Given array x: [4,5,6,9,11] and max value m: 20
The maximum subsequence of x with sum under or equal to m would be [4, 5, 11], as 4 + 5 + 11 = 20.
I think this problem is NP-hard, meaning that unless P = NP there isn't a polynomial-time algorithm for solving the problem.
There's a simple reduction from the subset-sum problem to this problem. In subset-sum, you're given a set of n numbers and a target number k and want to determine whether there's a subset of those numbers that adds up to exactly k. You can solve subset-sum with a solver for your problem as follows: create an array of the numbers in the set and find the largest subsequence whose sum is less than or equal to k. If that adds up to exactly k, the set has a subset that adds up to k. Otherwise, it does not.
This reduction takes polynomial time, so because subset-sum is NP-hard, your problem is NP-hard as well. Therefore, I doubt there's a polynomial-time algorithm.
That said - there is a pseudopolynomial-time algorithm for subset-sum, which is described on Wikipedia. This algorithm uses DP in two variables and isn't strictly polynomial time, but it will probably work in your case.
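For reference, a hedged sketch of that DP for this exact problem (the names are mine; it mirrors the optionOne/optionTwo recursion from the question and assumes non-negative values):

from functools import lru_cache

def max_sum_at_most(values, limit):
    @lru_cache(maxsize=None)
    def best(i, remaining):
        # Largest sum <= remaining achievable using values[i:].
        if i == len(values):
            return 0
        option_two = best(i + 1, remaining)                       # skip values[i]
        if values[i] <= remaining:                                # take values[i]
            option_one = values[i] + best(i + 1, remaining - values[i])
            return max(option_one, option_two)
        return option_two
    return best(0, limit)

print(max_sum_at_most((4, 5, 6, 9, 11), 20))  # -> 20 (e.g. 4 + 5 + 11)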
Hope this helps!

Is it necessary for NP problems to be decision problems?

Professor Tim Roughgarden from Stanford University, while teaching a MOOC, said that solutions to problems in the class NP must be polynomial in length. But the Wikipedia article says that NP problems are decision problems. So what type of problems are actually in the class NP? And is it unnecessary to say that solutions to such problems have polynomial-length output (as decision problems necessarily output either 0 or 1)?
He was probably talking about witnesses and verifiers.
For every problem in NP, there is a verifier—read: an algorithm/Turing machine—that can verify "yes"-claims in polynomial time.
The idea is that you have some kind of extra information—the witness—that helps you do this within the time constraints.
For instance, in the travelling salesman problem:
TSP = {(G, k) : G has a Hamiltonian cycle of cost ≤ k}
For a given input (G, k), you only need to determine whether or not the problem instance is in TSP. That's a yes/no answer.
Now, if someone comes along and says: This problem instance is in TSP, you will demand a proof. The other person will then probably give you a sequence of cities. You can then simply check whether the cities in that order form a Hamiltonian cycle and whether the total cost of the cycle is ≤ k.
You can perform this procedure in polynomial time—given that the witness is polynomial in length.
Using this sequence of cities, you were thus able to correctly determine that the problem instance was indeed in TSP.
That's the idea of verifiers: they take a proof object/witness that is polynomial in length and use it to check, in polynomial time, that a certain problem instance is in the language.
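A minimal sketch of such a verifier for the TSP language above (my own illustration; it assumes the graph is given as a distance matrix and the witness is an ordering of the cities):

def verify_tsp(dist, k, witness):
    n = len(dist)
    # The witness must visit every city exactly once (a Hamiltonian cycle).
    if sorted(witness) != list(range(n)):
        return False
    # Sum the cost of the cycle induced by the witness ordering.
    cost = sum(dist[witness[i]][witness[(i + 1) % n]] for i in range(n))
    return cost <= k  # accept exactly when the claimed bound holds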
The standard definition of NP is that it is a class of decision problems only. Decision problems always produce a yes/no answer and thus have constant-sized output.
Didn't watch the video/course, but I am guessing he was talking about certificates/verification and not solutions. Big difference.
