If algorithm in P, efficient way to extract solutions?

Maybe this is very obvious, but if a decision problem is in P (so we have an algorithm that gives a yes/no answer in polynomial time), is there a more efficient way to find an actual solution than just guessing and checking?
So, suppose SAT is in P (I know this is an NP-Complete problem, but this seems like the best example for what I'm trying to ask). This means that there is a polynomial time algorithm that will tell you yes or no depending on whether or not the given input is satisfiable.
It would seem that there should thus be an efficient way to find/extract this satisfying assignment (rather than just know it exists, if there is one). However, I can't think of any efficient way to utilize this poly-time algorithm to find such an assignment.
** side note **
For maximization/minimization problems (e.g. Knapsack) I know that you can use binary search to find your solution, but my question pertains more to non-optimization problems like SAT.

You don't have to guess the entire thing and then test it.
You can get a satisfying valuation (if it exists) like this:
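Pick a variable and set it to false: delete every clause that contains its negation (those clauses are now satisfied) and delete the variable's positive literal from the remaining clauses. Consult the SAT oracle, which apparently runs in polynomial time today. If the reduced formula is still satisfiable, fine, keep that choice. Otherwise the variable must be true (assuming the formula was satisfiable to begin with), so restore the clauses and simplify them for the value true instead. Repeat for the next variable. There's no backtracking, just one call to SAT for every variable, so the whole thing is still in polynomial time.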
Or if that's what you had in mind, then well, that's it. Does it really matter though? Polynomial time is polynomial time, and this isn't usable in practice anyway so wall-clock time is hardly a concern.
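To make the variable-by-variable procedure above concrete, here is a minimal Python sketch. It rests on assumptions that are not in the answer: clauses are lists of non-zero integers (so -3 means "not x3"), the names simplify, brute_force_sat and find_assignment are made up for illustration, and the exponential brute-force checker only stands in for the hypothetical polynomial-time SAT decider.

```python
from itertools import product


def simplify(clauses, var, value):
    """Fix var to value: drop satisfied clauses, delete falsified literals."""
    true_lit = var if value else -var
    return [[lit for lit in clause if lit != -true_lit]
            for clause in clauses if true_lit not in clause]


def brute_force_sat(clauses):
    """Stand-in decision oracle (exponential!). It answers only yes or no."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for bits in product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


def find_assignment(clauses, is_satisfiable=brute_force_sat):
    """Extract a satisfying assignment using only yes/no oracle answers."""
    if not is_satisfiable(clauses):
        return None
    assignment = {}
    for var in sorted({abs(lit) for clause in clauses for lit in clause}):
        reduced = simplify(clauses, var, False)
        if is_satisfiable(reduced):
            assignment[var] = False
        else:
            # Setting var to false broke satisfiability, so it must be true.
            reduced = simplify(clauses, var, True)
            assignment[var] = True
        clauses = reduced  # commit to the choice; no backtracking
    return assignment


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(find_assignment([[1, 2], [-1, 3], [-2, -3]]))
```

With n variables this makes n + 1 oracle calls, so a polynomial-time decider would give a polynomial-time search procedure.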

Related

Why are chess, checkers, Go, etc. in EXP but conjectured to be in NP?

If I tell you the moves for a game of chess and declare who wins, why can't it be checked in polynomial time if the winner does really win? This would make it an NP problem from my understanding.
First of all: the number of positions you can set up with 32 pieces on an 8x8 board is limited. We need to consider any pawn being promoted to any other piece and include every such position, too. Of course, among all these there are some positions that cannot be reached by following the rules of chess, but this does not matter. The important thing is: we have a limit. Let's simply name this limit MaxPositions.
Now for any given position, let's build up a tree as follows:
The given position is the root.
Add any position (legal chess position or not) as child.
For any of these children, add any position as child again.
Continue this way, until your tree reaches a depth of MaxPositions.
I'm too tired right now to work out whether we need one additional level of depth for the idea (proof?), but heck, let's just add it. The important thing is: the tree constructed like this is limited.
Next step: Of this tree, remove any sub-tree that is not reachable from the root via legal chess moves. Repeat this step for the remaining children, grand-children, ..., until there is no unreachable position left in the whole tree. The number of steps must be limited, as the tree is limited.
Now do a breadth-first search and turn any node into a leaf if its position has already been found earlier. It must be marked as such (a draw candidate?). Do the same for any mate position.
How do we find out whether there is a forced mate? In any subtree, if it is your turn, there must be at least one child leading to a forced mate. If it is the opponent's move, every child must lead to a mate, that is, for every child there must be a grandchild that leads to a mate. This applies recursively, of course. However, as the tree is limited, this whole algorithm is limited.
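The recursive rule above ("at least one child on my move, every child on the opponent's move") can be sketched in a few lines of Python. This is only an illustration on a toy tree: the dictionary layout, the position names and the function name has_forced_mate are hypothetical, and a real chess tree would of course be far too large to build this way.

```python
def has_forced_mate(tree, mates, position, my_turn):
    """True if the side we are rooting for can force a mate from position.

    tree  maps a position to the positions reachable in one move
    mates is the set of positions in which the opponent is mated
    """
    if position in mates:
        return True
    children = tree.get(position, [])
    if not children:
        return False  # leaf that is not a mate: draw or repeated position
    outcomes = (has_forced_mate(tree, mates, child, not my_turn)
                for child in children)
    # My move: one good child is enough. Opponent's move: all must lose.
    return any(outcomes) if my_turn else all(outcomes)


toy_tree = {
    "root": ["a", "b"],   # my move
    "a": ["a1", "a2"],    # opponent's replies after a
    "b": ["b1"],          # opponent's only reply after b
}
print(has_forced_mate(toy_tree, {"a1", "b1"}, "root", my_turn=True))
```

In the toy tree, move b forces a mate because the opponent's only reply leads to a mated position, while move a does not because the opponent can escape via a2.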
[censored], this whole algorithm is limited! There is some constant limiting the whole thing. So: although the limit is incredibly high (and far beyond what up-to-date hardware can handle), it is a limit (please do not ask me to calculate it...). So: our problem actually is O(1)!!!
The same for checkers, go, ...
This applies for the forced mate, so far. What is the best move? First, check if we can find a forced mate. If so, fine, we found the best move. If there are several, select the one with the least moves necessary (still there might be more than one...).
If there is no such forced mate, then we need to measure by some means the 'best' one. Possibly count the number of available successions to mate. Other propositions for measurement? As long as operating on this tree from top to down, we still remain limited. So again, we are O(1).
Now what did we miss? Have a look at the link in your comment again. They are talking about NxN checkers! The author is varying the size of the board!
So have a look back at how we constructed the tree. I think it is obvious that the tree grows exponentially with the size of the field (try to prove it yourself...).
I know very well that this answer is not a proof that the problem is in EXP(TIME). Actually, I admit, it is not really an answer at all. But I think what I illustrated still gives quite a good impression of the complexity of the problem. And as long as no one provides a better answer, I dare to claim that this is better than nothing at all...
Addendum, considering your comment:
Allow me to refer to Wikipedia. Actually, it should be sufficient to transform the other problem in exponential time, not polynomial as in the link, as applying the transformation and then solving the resulting problem still remains exponential. But I'm not sure about the exact definition...
It is sufficient to show this for a problem of which you know already it is EXP complete (transforming any other problem to this one and then to the chess problem again remains exponential, if both transformations are exponential).
Apparently, J.M. Robson found a way to do this for NxN checkers. It must be possible for generalized chess, too, probably by simply modifying Robson's algorithm. I do not think it is possible for classical 8x8 chess, though...
O(1) applies to classical chess only, not to generalized chess. But it is the latter that we assume not to be in NP! Actually, in my answer up to this addendum, one proof is missing: that the size of the tree (which is limited for any fixed N) does not grow faster than exponentially with growing N (so the answer actually is incomplete!).
And to prove that generalized chess is not in NP, we have to prove that there is no polynomial algorithm that solves the problem on a non-deterministic Turing machine. This I leave open again, and my answer remains even less complete...
"If I tell you the moves for a game of chess and declare who wins, why can't it be checked in polynomial time if the winner does really win? This would make it an NP problem from my understanding."
Because in order to check whether the winner (white) really wins, you also have to evaluate all the possible moves that the loser (black) could have made in order to win instead. That makes the checking exponential as well.

Any NP to SAT. How to do that and prove that it is possible?

Let's start here:
It is said that all NP problems can be reduced to SAT (the Boolean satisfiability problem). To be more precise, to Circuit SAT, because all decision problems, like those in NP, should end up with the answer yes or no.
But now, if I have an arbitrary NP problem, how do I build a Boolean circuit to test it: how do I group my input, and what kinds of gates (AND, NOT, OR, etc.) should connect those inputs? So basically, my question is how to design a Boolean circuit that gives the answer TRUE or FALSE.
One last thing: what does that answer mean? I understand TRUE as meaning this NP problem can be solved in polynomial time and FALSE as meaning it cannot; am I correct?
My mind is a huge mess, so please don't be too outraged if I made logical mistakes in explaining my question :) I hope you understood it.
Excitedly waiting for answers.
I understand the confusion but your understanding is not quite how it works.
NP-hardness is a qualification of decision problems, that is, problems whose answer is yes or no. If we want to show that a decision problem is NP-hard, we do so by showing that it is at least as hard as a problem that we already know to be NP-hard, for instance SAT.
How can we show that problem A is at least as hard as problem B? Well, we can phrase that as
if we can solve A, we can also solve B
So, given an instance of problem B, we convert it to an instance of problem A, use our solution to A to solve it, and convert the result back to a solution to B. Assuming that both conversions are easy, we know that A cannot be easier than B, since a solution to A is also a solution to B.
Your understanding thus had it backwards. In order to show that some problem is NP-hard, we need to show that it is at least as hard as SAT, that is, given an arbitrary instance of SAT, convert it to an instance of your problem, and then solve that problem. If the answer is "yes", then the original SAT problem was satisfiable, otherwise it wasn't.
Now, as I wrote in a comment, there is no standard way to do the conversion. You somehow need to manipulate your problem such that it "looks like SAT", in order to make the conversion. For some problems that's easier than for others, but I'd claim it's the hardest part of the NP-hardness proof.
What people typically do instead is that they look for another problem, which is known to be NP-hard already, but looks a bit more like their own problem. That way, the reduction becomes a bit easier. But still it requires a lot of work and creativity. I recommend you look at some existing proofs to see how others do this.
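As one example of what such a conversion looks like, here is a small Python sketch of the textbook reduction from SAT (in CNF) to Independent Set. It is not taken from the answer above; the function names and the integer encoding of literals (-2 meaning "not x2") are assumptions for illustration, and the brute-force solver only stands in for "our solution to A".

```python
from itertools import combinations


def sat_to_independent_set(clauses):
    """Convert a CNF formula into (nodes, edges, k).

    One node per literal occurrence. Edges join literals of the same
    clause and contradictory literals (x and not x). The formula is
    satisfiable exactly when the graph has an independent set of size k.
    """
    nodes = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = set()
    for u, v in combinations(nodes, 2):
        same_clause = u[0] == v[0]
        contradictory = u[1] == -v[1]
        if same_clause or contradictory:
            edges.add((u, v))
    return nodes, edges, len(clauses)


def has_independent_set(nodes, edges, k):
    """Brute-force solver for the target problem (exponential)."""
    for subset in combinations(nodes, k):
        chosen = set(subset)
        if not any(u in chosen and v in chosen for u, v in edges):
            return True
    return False


formula = [[1, 2], [-1, 3], [-2, -3]]
print(has_independent_set(*sat_to_independent_set(formula)))
```

The conversion itself is cheap (quadratic in the number of literal occurrences); all the hard work is left to whoever solves Independent Set, which is the point of the argument.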

What is a "naive" algorithm, and what is a "closed - form" solution?

I have a few questions regarding the semantics of terminology used when describing algorithms.
Firstly, what is meant by a 'naive' algorithm? How does this differ from other solutions to a given problem? What other forms can solutions take?
Secondly, I have heard much reference to having a 'closed - form' solution. I have no idea what this means either - but often it appears when trying to solve recurrence relations...
Thanks for your time
A Naive algorithm is usually the most obvious solution when one is asked a problem. It may not be a smart algorithm but will probably get the job done (...eventually.)
Eg. Trying to search for an element in a sorted array.
A Naive algorithm would be to use a Linear Search.
A Not-So Naive Solution would be to use the Binary Search.
A better example would be substring search, where the naive algorithm is far less efficient than the Boyer–Moore or Knuth–Morris–Pratt algorithm.
A closed-form solution is a single expression that gives the answer directly, without any loops, recursion, etc.
E.g.:
Iterative algorithm for the sum of the integers from 1 to n:
n = 100
s = 0
for i in range(1, n + 1):  # i runs over 1, 2, ..., n
    s = s + i
print(s)
Closed form (for the same problem):
s = n * (n + 1) // 2
Naive algorithm is a very simple algorithm, one with very simple rules. Sometimes the first one that comes to mind. It may be stupid and very slow, it may not even solve the problem. It may sometimes be the best possible. Here's an example of a problem and "naive" algorithms:
Problem: You are in a (2-dimensional) maze. Find your way out. (meaning: to a spot with an "EXIT" sign :)
Naive algorithm 1: Start walking and take the rightmost path at every intersection you meet (until you find "EXIT").
Naive algorithm 2: Start walking and take a random path at every intersection you meet (until you find "EXIT").
Algorithm 1 will not even get you out of some mazes!
Algorithm 2 will get you out of all mazes with probability 1 (although this is rather hard to prove).
Closed form means you can give a single expression as the solution, one that solves the problem without recurrence or recursion. One should remark here that it is not always possible to find such a closed form.
Naive means just that what it says: A first, stupid solution to the problem, that solves it, but maybe not very time-/space efficient. What one really considers 'naive' depends on the speaker, the context, and the weather of the next day. Often it is used to distinguish a very sophisticated solution (that uses some kind of trick) from the obvious implementation.

N-Puzzle with 5x5 grid, theory question

I'm writing a program which solves a 24-puzzle (5x5 grid) using two heuristics. The first counts how many blocks are in the incorrect place and the second uses the Manhattan distance between each block's current place and its desired place.
I have different functions in the program which use each heuristic with an A* and a greedy search and compares the results (so 4 different parts in total).
I'm curious whether my program is wrong or whether it's a limitation of the puzzle. The puzzle is generated randomly with pieces being moved around a few times and most of the time (~70%) a solution is found with most searches, but sometimes they fail.
I can understand why greedy would fail, as it's not complete, but seeing as A* is complete this leads me to believe that there's an error in my code.
So could someone please tell me whether this is an error in my thinking or a limitation of the puzzle? Sorry if this is badly worded, I'll rephrase if necessary.
Thanks
EDIT:
So I'm fairly sure it's something I'm doing wrong. Here's a step-by-step list of how I'm doing the searches; is anything wrong here?
Create a new list for the fringe, sorted by whichever heuristic is being used
Create a set to store visited nodes
Add the initial state of the puzzle to the fringe
while the fringe isn't empty..
pop the first element from the fringe
if the node has been visited before, skip it
if node is the goal, return it
add the node to our visited set
expand the node and add all descendants back to the fringe
If you mean the sliding puzzle: it becomes unsolvable if you exchange two pieces of a working solution, so if you don't find a solution, that doesn't tell you anything about the correctness of your algorithm.
It's just that your seed is flawed.
Edit: If you start with the solution and make (random) legal moves, then a correct algorithm would find a solution (as reversing the order is a solution).
It is not completely clear who invented it, but Sam Loyd popularized the 14-15 puzzle during the late 19th century; it is the 4x4 version of your 5x5.
From the Wikipedia article, a parity argument proved that half of the possible configurations are unsolvable. You are probably running into something similar when your search fails.
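The parity argument can be turned into a quick solvability test. The sketch below is an illustration, not code from the question: it assumes the board is a flat tuple read row by row, 0 marks the blank, the goal is 1, 2, ... with the blank last, and the name is_solvable is made up. For an odd board width such as 5, the position is solvable exactly when the number of inversions is even; for an even width the row of the blank also matters.

```python
def is_solvable(tiles, width):
    """Parity test for a width x width sliding puzzle (0 is the blank)."""
    flat = [t for t in tiles if t != 0]
    inversions = sum(
        1
        for i in range(len(flat))
        for j in range(i + 1, len(flat))
        if flat[i] > flat[j]
    )
    if width % 2 == 1:
        return inversions % 2 == 0
    # Even width: also count the blank's row, numbered from the bottom (1-based).
    blank_row_from_bottom = width - tiles.index(0) // width
    return (inversions + blank_row_from_bottom) % 2 == 1


goal_5x5 = tuple(range(1, 25)) + (0,)
swapped = goal_5x5[:22] + (24, 23, 0)   # tiles 23 and 24 exchanged
print(is_solvable(goal_5x5, 5), is_solvable(swapped, 5))
```

Running such a check on every generated start state, before any search, separates "unsolvable input" from "bug in the search".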
I'm going to assume your code is correct, and you implemented all the algorithms and heuristics correctly.
This leaves us with the "generated randomly" part of your puzzle initialization. Are you sure you are generating correct states of the puzzle? If you generate an illegal state, obviously there will be no solution.
While the steps you have listed seem a little incomplete, you have listed enough to ensure that your A* will reach a solution if there is one (albeit not optimal as long as you are just simply skipping nodes).
It sounds like either your puzzle generation is flawed or your algorithm isn't implemented correctly. To easily verify your puzzle generation, store the steps used to generate the puzzle, and run it in reverse and check if the result is a solution state before allowing the puzzle to be sent to the search routines. If you ever generate an invalid puzzle, dump the puzzle, and expected steps and see where the problem is. If the puzzle passes and the algorithm fails, you have at least narrowed down where the problem is.
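The "store the steps and run them in reverse" check could look roughly like the following Python sketch. The state layout (flat tuple, 0 for the blank) and the names slide, scramble and replay_reversed are assumptions for illustration, not the asker's code.

```python
import random

DELTAS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
OPPOSITE = {"up": "down", "down": "up", "left": "right", "right": "left"}


def slide(state, width, move):
    """Move the blank (0) one cell; return the new state, or None if off-board."""
    blank = state.index(0)
    row, col = divmod(blank, width)
    d_row, d_col = DELTAS[move]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < width and 0 <= new_col < width):
        return None
    target = new_row * width + new_col
    cells = list(state)
    cells[blank], cells[target] = cells[target], cells[blank]
    return tuple(cells)


def scramble(width, steps, rng):
    """Start from the goal and apply `steps` random legal moves."""
    state = tuple(range(1, width * width)) + (0,)
    history = []
    while len(history) < steps:
        move = rng.choice(list(DELTAS))
        moved = slide(state, width, move)
        if moved is not None:       # ignore moves that leave the board
            state = moved
            history.append(move)
    return state, history


def replay_reversed(state, width, history):
    """Undo the recorded moves; a valid scramble must end at the goal."""
    for move in reversed(history):
        state = slide(state, width, OPPOSITE[move])
    return state


start, moves = scramble(5, 40, random.Random(0))
assert replay_reversed(start, 5, moves) == tuple(range(1, 25)) + (0,)
```

If this assertion ever fails, the generator is producing states by something other than legal moves, and the search routines are not the place to look.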
If it turns out to be your algorithm, post a more detailed explanation of the steps you have actually implemented (not just how A* works, we all know that), for instance when you run the evaluation function and where you re-sort the list that acts as your queue. That will make it easier to pinpoint a problem within your implementation.
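For comparison with whatever is actually implemented, here is a minimal A* sketch in Python. It is an assumption-laden illustration rather than a reference implementation: states are flat tuples with 0 as the blank, the names manhattan, neighbours and a_star are made up, and every move costs 1. One detail worth checking against your own code is that A* orders the fringe by g + h (cost so far plus heuristic), whereas ordering by the heuristic alone gives greedy best-first search.

```python
import heapq


def manhattan(state, width):
    """Sum of Manhattan distances of all tiles from their goal cells."""
    total = 0
    for index, tile in enumerate(state):
        if tile:
            goal_index = tile - 1
            total += abs(index // width - goal_index // width)
            total += abs(index % width - goal_index % width)
    return total


def neighbours(state, width):
    """All states reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, width)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < width and 0 <= c < width:
            cells = list(state)
            target = r * width + c
            cells[blank], cells[target] = cells[target], cells[blank]
            yield tuple(cells)


def a_star(start, width):
    """Return the length of a shortest solution, or None if there is none."""
    goal = tuple(range(1, width * width)) + (0,)
    best_g = {start: 0}
    fringe = [(manhattan(start, width), 0, start)]   # entries are (f, g, state)
    while fringe:
        _, g, state = heapq.heappop(fringe)
        if state == goal:
            return g
        if g > best_g[state]:
            continue                                  # stale queue entry
        for nxt in neighbours(state, width):
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                f = new_g + manhattan(nxt, width)
                heapq.heappush(fringe, (f, new_g, nxt))
    return None


print(a_star((1, 2, 3, 4, 5, 6, 0, 7, 8), 3))
```

On an unsolvable start state this only returns None after exhausting every reachable state, which is hopeless for a 5x5 board, so a search that "fails" there may simply be running out of time or memory.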

Why does backtracking make an algorithm non-deterministic?

So I've had at least two professors mention that backtracking makes an algorithm non-deterministic without giving too much explanation into why that is. I think I understand how this happens, but I have trouble putting it into words. Could somebody give me a concise explanation of the reason for this?
It's not so much the case that backtracking makes an algorithm non-deterministic.
Rather, you usually need backtracking to process a non-deterministic algorithm, since (by the definition of non-deterministic) you don't know which path to take at a particular time in your processing, but instead you must try several.
I'll just quote Wikipedia:
A nondeterministic programming language is a language which can specify, at certain points in the program (called "choice points"), various alternatives for program flow. Unlike an if-then statement, the method of choice between these alternatives is not directly specified by the programmer; the program must decide at runtime between the alternatives, via some general method applied to all choice points. A programmer specifies a limited number of alternatives, but the program must later choose between them. ("Choose" is, in fact, a typical name for the nondeterministic operator.) A hierarchy of choice points may be formed, with higher-level choices leading to branches that contain lower-level choices within them.
One method of choice is embodied in backtracking systems, in which some alternatives may "fail", causing the program to backtrack and try other alternatives. If all alternatives fail at a particular choice point, then an entire branch fails, and the program will backtrack further, to an older choice point. One complication is that, because any choice is tentative and may be remade, the system must be able to restore old program states by undoing side-effects caused by partially executing a branch that eventually failed.
From the Nondeterministic Programming article.
Consider an algorithm for coloring a map of the world, where no two adjacent countries may share a color. The algorithm starts at an arbitrary country and colors it an arbitrary color. It then moves along, coloring countries and changing the color at each step, until, "uh oh", two adjacent countries have the same color. Well, now we have to backtrack and make a new color choice. We aren't making the choice the way a nondeterministic algorithm would; that's not possible on our deterministic computers. Instead, we are simulating the nondeterministic algorithm with backtracking. A nondeterministic algorithm would have made the right choice for every country.
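A minimal Python sketch of that map-coloring idea follows. The adjacency dictionary, the country names and the function name colour_map are invented for illustration. The for loop over colours plays the role of the nondeterministic "choose": the deterministic program tries the alternatives one by one and undoes a choice when it leads to a dead end.

```python
def colour_map(neighbours, colours):
    """Backtracking search: return {country: colour} or None if impossible."""
    countries = list(neighbours)
    assignment = {}

    def extend(index):
        if index == len(countries):
            return True                       # every country is coloured
        country = countries[index]
        for colour in colours:                # the "choice point"
            if all(assignment.get(n) != colour for n in neighbours[country]):
                assignment[country] = colour
                if extend(index + 1):
                    return True
                del assignment[country]       # dead end: undo and try the next
        return False

    return assignment if extend(0) else None


borders = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
print(colour_map(borders, ["red", "green", "blue"]))
print(colour_map(borders, ["red", "green"]))
```

Nothing here is random: given the same map and the same colour order, the function always explores the same sequence of choices.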
The worst-case running time of backtracking on a deterministic computer is typically exponential, or even factorial (i.e. in O(n!)) when the choices form a permutation.
Where a non-deterministic computer could instantly guess correctly in each step, a deterministic computer has to try all possible combinations of choices.
Since it is impossible to build a non-deterministic computer, what your professor probably meant is the following:
A provably hard problem in the complexity class NP (all problems that a non-deterministic computer can solve efficiently by always guessing correctly) cannot be solved efficiently, i.e. in polynomial time, on real computers, so in the worst case we fall back on exhaustive search such as backtracking.
The above statement is true if the complexity classes P (all problems that a deterministic computer can solve efficiently) and NP are not the same. This is the famous P vs. NP problem. The Clay Mathematics Institute has offered a $1 million prize for its solution, but the problem has resisted proof for many years. However, most researchers believe that P is not equal to NP.
A simple way to sum it up would be: Most interesting problems a non-deterministic computer could solve efficiently by always guessing correctly, are so hard that a deterministic computer would probably have to try all possible combinations of choices, i.e. use backtracking.
Thought experiment:
1) Hidden from view there is some distribution of electric charges which you feel a force from and you measure the potential field they create. Tell me exactly the positions of all the charges.
2) Take some charges and arrange them. Tell me exactly the potential field they create.
Only the second question has a unique answer. This is the non-uniqueness of vector fields. This situation may be in analogy with some non-deterministic algorithms you are considering. Further consider in math limits which do not exist because they have different answers depending on which direction you approach a discontinuity from.
I wrote a maze runner that uses backtracking (of course), which I'll use as an example.
You walk through the maze. When you reach a junction, you flip a coin to decide which route to follow. If you chose a dead end, trace back to the junction and take another route. If you tried them all, return to the previous junction.
This algorithm is non-deterministic, not because of the backtracking, but because of the coin flipping.
Now change the algorithm: when you reach a junction, always try the leftmost route you haven't tried yet first. If that leads to a dead end, return to the junction and again try the leftmost route you haven't tried yet.
This algorithm is deterministic. There's no chance involved, it's predictable: you'll always follow the same route in the same maze.
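The two maze runners can be compared in a short Python sketch. Everything in it is assumed for illustration: the maze is a list of strings with "#" for walls and "E" for the exit, the function name solve is made up, and "leftmost route first" is simplified to a fixed order of directions. Passing a random generator turns the fixed order into the coin flip.

```python
import random


def solve(maze, start, rng=None):
    """Depth-first maze search with backtracking; returns a path or None."""
    visited = set()
    path = []

    def walk(cell):
        row, col = cell
        inside = 0 <= row < len(maze) and 0 <= col < len(maze[row])
        if not inside or maze[row][col] == "#" or cell in visited:
            return False
        visited.add(cell)
        path.append(cell)
        if maze[row][col] == "E":
            return True
        steps = [(0, 1), (1, 0), (0, -1), (-1, 0)]   # fixed order of routes
        if rng is not None:
            rng.shuffle(steps)                       # the coin flip
        for d_row, d_col in steps:
            if walk((row + d_row, col + d_col)):
                return True
        path.pop()                                   # dead end: backtrack
        return False

    return path if walk(start) else None


MAZE = [
    "S.#",
    ".##",
    "..E",
]
print(solve(MAZE, (0, 0)))                       # same route on every run
print(solve(MAZE, (0, 0), random.Random()))      # exploration order may vary
```

Both variants backtrack in exactly the same way; only the second one makes choices that can differ from run to run.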
If you allow backtracking you allow infinite looping in your program which makes it non-deterministic since the actual path taken may always include one more loop.
Non-Deterministic Turing Machines (NDTMs) could take multiple branches in a single step. DTMs on the other hand follow a trial-and-error process.
You can think of DTMs as regular computers. Quantum computers are sometimes compared to NDTMs, but they are not the same thing: they are not known to solve NP-complete problems efficiently, although they do solve some specific problems much faster (e.g. see their application in breaking cryptography). For an NDTM, by contrast, backtracking would effectively be a linear process.
I like the maze analogy. Let's think of the maze, for simplicity, as a binary tree in which there is only one path out.
Now you want to try a depth first search to find the correct way out of the maze.
A non-deterministic computer would, at every branching point, duplicate/clone itself and run all further calculations in parallel. It is as if the person in the maze would duplicate/clone himself (like in the movie The Prestige) at each branching point and send one copy of himself into the left sub-branch of the tree and the other copy into the right sub-branch.
The computers/persons who end up at a dead end die (terminate without an answer).
Only one computer will survive (terminate with an answer), the one who gets out of the maze.
The difference between backtracking and non-determinism is the following.
In the case of backtracking there is only one computer alive at any given moment. He does the traditional maze-solving trick: he marks his path with chalk, and when he gets to a dead end he simply backtracks to a branching point whose sub-branches he has not yet explored completely, just like in a depth-first search.
In contrast:
A non-deterministic computer can clone himself at every branching point and check for the way out by running parallel searches in the sub-branches.
So the backtracking algorithm simulates/emulates the cloning ability of the non-deterministic computer on a sequential/non-parallel/deterministic computer.
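The cloning picture can itself be imitated on an ordinary computer by keeping a list of all clones that are still alive and advancing them in lockstep. The Python sketch below is only an illustration: the tree layout, the node names and the function name clone_search are hypothetical. Note that the list of clones can double at every level, which is exactly the cost a real non-deterministic machine would not have to pay.

```python
def clone_search(tree, root, exit_node):
    """Advance every living clone one step per round until one finds the exit."""
    alive = [[root]]                      # each clone remembers its own path
    while alive:
        next_generation = []
        for path in alive:
            here = path[-1]
            if here == exit_node:
                return path               # this clone survives with the answer
            for child in tree.get(here, []):
                next_generation.append(path + [child])   # one clone per branch
            # a clone at a dead end adds nothing and simply disappears
        alive = next_generation
    return None


binary_maze = {
    "start": ["L", "R"],
    "L": ["LL", "LR"],
    "R": ["RL", "RR"],
}
print(clone_search(binary_maze, "start", "RL"))
```

A backtracking search on the same tree would instead visit start, L, LL, LR, R, RL one after another with a single "person" alive at any moment.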

Resources