Comparison of A* heuristics for solving an N-puzzle - algorithm

I am trying to solve the N-puzzle using the A* algorithm with 3 different heuristic functions. I want to know how to compare each of the heuristics in terms of time complexity. The heuristics I am using are: manhattan distance , manhattan distance + linear conflict, N-max swap. And specifically for an 8-puzzle and an 15-puzzle.

The N-puzzle is, in general, NP hard to find the shortest solution, so no matter what heuristic you use it's unlikely you'll be able to find any difference in complexity between them, since you won't be prove the tightness of any bound.
If you restrict yourself to the 8-puzzle or 15-puzzle, an A* algorithm with any admissible heuristic will run in O(1) time since there are a finite (albeit large) number of board positions.

As #Harold said in his comment, the approach to compare time complexity of heuristic functions is tipically by experimental tests. In your case, generate a set of n random problems for the 8-puzzle and the 15-puzzle and solve them using the different heuristic functions. Things to be aware of are:
The comparison will always depend on several factors, like hardware expecs, programming language, your skills when implementing the algorithm, ...
Generally speaking, a more informed heuristic will always expand less nodes than a less informed one, and will probably be faster.
And finally, in order to compare the three heuristics for each problem set, I would suggest a graphic with average running times (repeat for example 5 times each problem) where:
The problems are in the x-axis sorted by difficulty.
The running times are in the y-axis for each heuristic function (perhaps in logarithmic scale if the difference between the alternatives cannot be easily seen).
and a similar graphic with the number of explored states.

Related

Difference between a stochastic and a heuristic algorithm

Extending the question of streetparade, I would like to ask what is the difference, if any, between a stochastic and a heuristic algorithm.
Would it be right to say that a stochastic algorithm is actually one type of heuristic?
TTBOMK, "stochastic algorithm" is not a standard term. "Randomized algorithm" is, however, and it's probably what is meant here.
Randomized: Uses randomness somehow. There are two flavours: Monte Carlo algorithms always finish in bounded time, but don't guarantee an optimal solution, while Las Vegas algorithms aren't necessarily guaranteed to finish in any finite time, but promise to find the optimal solution. (Usually they are also required to have a finite expected running time.) Examples of common Monte Carlo algorithms: MCMC, simulated annealing, and Miller-Rabin primality testing. Quicksort with randomized pivot choice is a Las Vegas algorithm that always finishes in finite time. An algorithm that does not use any randomness is deterministic.
Heuristic: Not guaranteed to find the correct answer. An algorithm that is not heuristic is exact.
Many heuristics are sensitive to "incidental" properties of the input that don't affect the true solution, such as the order items are considered in the First-Fit heuristic for the Bin Packing problem. In this case they can be thought of as Monte Carlo randomized algorithms: you can randomly permute the inputs and rerun them, always keeping the best answer you find. OTOH, other heuristics don't have this property -- e.g. the First-Fit-Decreasing heuristic is deterministic, since it always first sorts the items in decreasing size order.
If the set of possible outputs of a particular randomized algorithm is finite and contains the true answer, then running it long enough is "practically guaranteed" to eventually find it (in the sense that the probability of not finding it can be made arbitrarily small, but never 0). Note that it's not automatically the case that some permutation of the inputs to a heuristic will result in getting the exact answer -- in the case of First-Fit, it turns out that this is true, but this was only proven in 2009.
Sometimes stronger statements about convergence of randomized algorithms can be made: these are usually along the lines of "For any given small threshold d, after t steps we will be within d of the optimal solution with probability f(t, d)", with f(t, d) an increasing function of t and d.
Booth approaches are usually used to speed up genere and test solutions to NP complete problems
Stochastic algorithms use randomness
They use all combinations but not in order but instead they use random ones from the whole range of possibilities hoping to hit the solution sooner. Implementation is fast easy and single iteration is also fast (constant time)
Heuristics algorithms
They pick up the combinations not randomly but based on some knowledge on used process, input dataset, or usage instead. So they lower the number of combinations significantly to only those they are probably the solution and use only those but usually all of them until solution is found.
Implementation complexity depends on the problem, single iteration is usually much much slower then stochastic approach (constant time) so heuristics is used only if the number of possibilities is lowered enough to actual speed up is visible because even if algorithm complexity with heuristic is usually much lower sometimes the constant time is big enough to even slow things down ... (in runtime terms)
Booth approaches can be combined together

I need to solve an NP-hard problem. Is there hope?

There are a lot of real-world problems that turn out to be NP-hard. If we assume that P ≠ NP, there aren't any polynomial-time algorithms for these problems.
If you have to solve one of these problems, is there any hope that you'll be able to do so efficiently? Or are you just out of luck?
If a problem is NP-hard, under the assumption that P ≠ NP there is no algorithm that is
deterministic,
exactly correct on all inputs all the time, and
efficient on all possible inputs.
If you absolutely need all of the above guarantees, then you're pretty much out of luck. However, if you're willing to settle for a solution to the problem that relaxes some of these constraints, then there very well still might be hope! Here are a few options to consider.
Option One: Approximation Algorithms
If a problem is NP-hard and P ≠ NP, it means that there's is no algorithm that will always efficiently produce the exactly correct answer on all inputs. But what if you don't need the exact answer? What if you just need answers that are close to correct? In some cases, you may be able to combat NP-hardness by using an approximation algorithm.
For example, a canonical example of an NP-hard problem is the traveling salesman problem. In this problem, you're given as input a complete graph representing a transportation network. Each edge in the graph has an associated weight. The goal is to find a cycle that goes through every node in the graph exactly once and which has minimum total weight. In the case where the edge weights satisfy the triangle inequality (that is, the best route from point A to point B is always to follow the direct link from A to B), then you can get back a cycle whose cost is at most 3/2 optimal by using the Christofides algorithm.
As another example, the 0/1 knapsack problem is known to be NP-hard. In this problem, you're given a bag and a collection of objects with different weights and values. The goal is to pack the maximum value of objects into the bag without exceeding the bag's weight limit. Even though computing an exact answer requires exponential time in the worst case, it's possible to approximate the correct answer to an arbitrary degree of precision in polynomial time. (The algorithm that does this is called a fully polynomial-time approximation scheme or FPTAS).
Unfortunately, we do have some theoretical limits on the approximability of certain NP-hard problems. The Christofides algorithm mentioned earlier gives a 3/2 approximation to TSP where the edges obey the triangle inequality, but interestingly enough it's possible to show that if P ≠ NP, there is no polynomial-time approximation algorithm for TSP that can get within any constant factor of optimal. Usually, you need to do some research to learn more about which problems can be well-approximated and which ones can't, since many NP-hard problems can be approximated well and many can't. There doesn't seem to be a unified theme.
Option Two: Heuristics
In many NP-hard problems, standard approaches like greedy algortihms won't always produce the right answer, but often do reasonably well on "reasonable" inputs. In many cases, it's reasonable to attack NP-hard problems with heuristics. The exact definition of a heuristic varies from context to context, but typically a heuristic is either an approach to a problem that "often" gives back good answers at the cost of sometimes giving back wrong answers, or is a useful rule of thumb that helps speed up searches even if it might not always guide the search the right way.
As an example of the first type of heuristic, let's look at the graph-coloring problem. This NP-hard problem asks, given a graph, to find the minimum number of colors necessary to paint the nodes in the graph such that no edge's endpoints are the same color. This turns out to be a particularly tough problem to solve with many other approaches (the best known approximation algorithms have terrible bounds, and it's not suspected to have a parameterized efficient algorithm). However, there are many heuristics for graph coloring that do quite well in practice. Many greedy coloring heuristics exist for assigning colors to nodes in a reasonable order, and these heuristics often do quite well in practice. Unfortunately, sometimes these heuristics give terrible answers back, but provided that the graph isn't pathologically constructed the heuristics often work just fine.
As an example of the second type of heuristic, it's helpful to look at SAT solvers. SAT, the Boolean satisfiability problem, was the first problem proven to be NP-hard. The problem asks, given a propositional formula (often written in conjunctive normal form), to determine whether there is a way to assign values to the variables such that the overall formula evaluates to true. Modern SAT solvers are getting quite good at solving SAT in many cases by using heuristics to guide their search over possible variable assignments. One famous SAT-solving algorithm, DPLL, essentially tries all possible assignments to see if the formula is satisfiable, using heuristics to speed up the search. For example, if it finds that a variable is either always true or always false, DPLL will try assigning that variable its forced value before trying other variables. DPLL also finds unit clauses (clauses with just one literal) and sets those variables' values before trying other variables. The net effect of these heuristics is that DPLL ends up being very fast in practice, even though it's known to have exponential worst-case behavior.
Option Three: Pseudopolynomial-Time Algorithms
If P ≠ NP, then no NP-hard problem can be solved in polynomial time. However, in some cases, the definition of "polynomial time" doesn't necessarily match the standard intuition of polynomial time. Formally speaking, polynomial time means polynomial in the number of bits necessary to specify the input, which doesn't always sync up with what we consider the input to be.
As an example, consider the set partition problem. In this problem, you're given a set of numbers and need to determine whether there's a way to split the set into two smaller sets, each of which has the same sum. The naive solution to this problem runs in time O(2n) and works by just brute-force testing all subsets. With dynamic programming, though, it's possible to solve this problem in time O(nN), where n is the number of elements in the set and N is the maximum value in the set. Technically speaking, the runtime O(nN) is not polynomial time because the numeric value N is written out in only log2 N bits, but assuming that the numeric value of N isn't too large, this is a perfectly reasonable runtime.
This algorithm is called a pseudopolynomial-time algorithm because the runtime O(nN) "looks" like a polynomial, but technically speaking is exponential in the size of the input. Many NP-hard problems, especially ones involving numeric values, admit pseudopolynomial-time algorithms and are therefore easy to solve assuming that the numeric values aren't too large.
For more information on pseudopolynomial time, check out this earlier Stack Overflow question about pseudopolynomial time.
Option Four: Randomized Algorithms
If a problem is NP-hard and P ≠ NP, then there is no deterministic algorithm that can solve that problem in worst-case polynomial time. But what happens if we allow for algorithms that introduce randomness? If we're willing to settle for an algorithm that gives a good answer on expectation, then we can often get relatively good answers to NP-hard problems in not much time.
As an example, consider the maximum cut problem. In this problem, you're given an undirected graph and want to find a way to split the nodes in the graph into two nonempty groups A and B with the maximum number of edges running between the groups. This has some interesting applications in computational physics (unfortunately, I don't understand them at all, but you can peruse this paper for some details about this). This problem is known to be NP-hard, but there's a simple randomized approximation algorithm for it. If you just toss each node into one of the two groups completely at random, you end up with a cut that, on expectation, is within 50% of the optimal solution.
Returning to SAT, many modern SAT solvers use some degree of randomness to guide the search for a satisfying assignment. The WalkSAT and GSAT algorithms, for example, work by picking a random clause that isn't currently satisfied and trying to satisfy it by flipping some variable's truth value. This often guides the search toward a satisfying assignment, causing these algorithms to work well in practice.
It turns out there's a lot of open theoretical problems about the ability to solve NP-hard problems using randomized algorithms. If you're curious, check out the complexity class BPP and the open problem of its relation to NP.
Option Five: Parameterized Algorithms
Some NP-hard problems take in multiple different inputs. For example, the long path problem takes as input a graph and a length k, then asks whether there's a simple path of length k in the graph. The subset sum problem takes in as input a set of numbers and a target number k, then asks whether there's a subset of the numbers that dds up to exactly k.
Interestingly, in the case of the long path problem, there's an algorithm (the color-coding algorithm) whose runtime is O((n3 log n) · bk), where n is the number of nodes, k is the length of the requested path, and b is some constant. This runtime is exponential in k, but is only polynomial in n, the number of nodes. This means that if k is fixed and known in advance, the runtime of the algorithm as a function of the number of nodes is only O(n3 log n), which is quite a nice polynomial. Similarly, in the case of the subset sum problem, there's a dynamic programming algorithm whose runtime is O(nW), where n is the number of elements of the set and W is the maximum weight of those elements. If W is fixed in advance as some constant, then this algorithm will run in time O(n), meaning that it will be possible to exactly solve subset sum in linear time.
Both of these algorithms are examples of parameterized algorithms, algorithms for solving NP-hard problems that split the hardness of the problem into two pieces - a "hard" piece that depends on some input parameter to the problem, and an "easy" piece that scales gracefully with the size of the input. These algorithms can be useful for finding exact solutions to NP-hard problems when the parameter in question is small. The color-coding algorithm mentioned above, for example, has proven quite useful in practice in computational biology.
However, some problems are conjectured to not have any nice parameterized algorithms. Graph coloring, for example, is suspected to not have any efficient parameterized algorithms. In the cases where parameterized algorithms exist, they're often quite efficient, but you can't rely on them for all problems.
For more information on parameterized algorithms, check out this earlier Stack Overflow question.
Option Six: Fast Exponential-Time Algorithms
Exponential-time algorithms don't scale well - their runtimes approach the lifetime of the universe for inputs as small as 100 or 200 elements.
What if you need to solve an NP-hard problem, but you know the input is reasonably small - say, perhaps its size is somewhere between 50 and 70. Standard exponential-time algorithms are probably not going to be fast enough to solve these problems. What if you really do need an exact solution to the problem and the other approaches here won't cut it?
In some cases, there are "optimized" exponential-time algorithms for NP-hard problems. These are algorithms whose runtime is exponential, but not as bad an exponential as the naive solution. For example, a simple exponential-time algorithm for the 3-coloring problem (given a graph, determine if you can color the nodes one of three colors each so that no edge's endpoints are the same color) might work checking each possible way of coloring the nodes in the graph, testing if any of them are 3-colorings. There are 3n possible ways to do this, so in the worst case the runtime of this algorithm will be O(3n · poly(n)) for some small polynomial poly(n). However, using more clever tricks and techniques, it's possible to develop an algorithm for 3-colorability that runs in time O(1.3289n). This is still an exponential-time algorithm, but it's a much faster exponential-time algorithm. For example, 319 is about 109, so if a computer can do one billion operations per second, it can use our initial brute-force algorithm to (roughly speaking) solve 3-colorability in graphs with up to 19 nodes in one second. Using the O((1.3289n)-time exponential algorithm, we could solve instances of up to about 73 nodes in about a second. That's a huge improvement - we've grown the size we can handle in one second by more than a factor of three!
As another famous example, consider the traveling salesman problem. There's an obvious O(n! · poly(n))-time solution to TSP that works by enumerating all permutations of the nodes and testing the paths resulting from those permutations. However, by using a dynamic programming algorithm similar to that used by the color-coding algorithm, it's possible to improve the runtime to "only" O(n2 2n). Given that 13! is about one billion, the naive solution would let you solve TSP for 13-node graphs in roughly a second. For comparison, the DP solution lets you solve TSP on 28-node graphs in about one second.
These fast exponential-time algorithms are often useful for boosting the size of the inputs that can be exactly solved in practice. Of course, they still run in exponential time, so these approaches are typically not useful for solving very large problem instances.
Option Seven: Solve an Easy Special Case
Many problems that are NP-hard in general have restricted special cases that are known to be solvable efficiently. For example, while in general it’s NP-hard to determine whether a graph has a k-coloring, in the specific case of k = 2 this is equivalent to checking whether a graph is bipartite, which can be checked in linear time using a modified depth-first search. Boolean satisfiability is, generally speaking, NP-hard, but it can be solved in polynomial time if you have an input formula with at most two literals per clause, or where the formula is formed from clauses using XOR rather than inclusive-OR, etc. Finding the largest independent set in a graph is generally speaking NP-hard, but if the graph is bipartite this can be done efficiently due to König’s theorem.
As a result, if you find yourself needing to solve what might initially appear to be an NP-hard problem, first check whether the inputs you actually need to solve that problem on have some additional restricted structure. If so, you might be able to find an algorithm that applies to your special case and runs much faster than a solver for the problem in its full generality.
Conclusion
If you need to solve an NP-hard problem, don't despair! There are lots of great options available that might make your intractable problem a lot more approachable. No one of the above techniques works in all cases, but by using some combination of these approaches, it's usually possible to make progress even when confronted with NP-hardness.

Confused about the definition of "Exact Algorithm"

What does it mean by saying "an algorithm is exact" in terms of Optimization and/or Computer Science? I need a precisely logical/epistemological definition.
Exact and approximate algorithms are methods for solving optimization problems.
Exact algorithms are algorithms that always find the optimal solution to a given optimization problem.
However, in combinatorial problems or total optimization problems, conventional methods are usually not effective enough, especially when the problem search area is large and complex. Among other methods we can use heuristics to solve that problems. Heuristics tend to give suboptimal solutions. A subset of heuristics are approximation algorithms.
When we use approximation algorithms we can prove a bound on the ratio between the optimal solution and the solution produced by the algorithm.
E.g. In some NP-hard problems there are some polynomial-time approximation algorithms while the best known exact algorithms need exponential time.
For example while there is a polynomial-time approximation algorithm for Vertex Cover, the best exact algorithm (using memoization) runs in O(1.1889n) pp 62-63.
The term exact is usually used to mean "the opposite of approximate". An approximation algorithm finds a solution to a slight variation of an optimzation problem that admits soltions that are "close" to the optimum in some sense, but nonetheless are desirable. As #Sirko said in the comments, the approximation is usually of interest because the exact problem is intractable or undecidable, where the approximate version is not. Often, more than one kind of approximation may be of interest.
Here are examples:
An exact algorithm for the Traveling Salesman problem is NP Complete. The TSP is to find a route of minimum length L for visiting each of N cities on a map. NP Completeness says the best known algorithms still need time that is an exponential function of N. An approximation algorithm for TSP finds a route of length no more than cL for some fixed c > 1. For example, you can easily construct the minimum spanning tree of the cities in time that is a polynomial in N and walk around the tree, covering each edge twice, to obtain an approximatoin algorithm for the case c = 2. The implied goal is to find algorithms for constants c as close to one as possible.
An exact algorithm for compiling a program that produces correct results in minimum time from any given source code is - under reasonable assumptions - undecidable. Yet of course we use "optimizing compilers" every day that improve the speed of code with no promise of true optimality.
In optimization, there are two kinds of algorithms. Exact and approximate algorithms.
Exact algorithms can find the optimum solution with precision.
Approximate algorithms can find a near optimum solution.
The main difference is that exact algorithms apply in "easy" problems.
What makes a problem "easy" is that it can be solved in reasonable time and the computation time doesn't scale up exponentially if the problem gets bigger. This class of problems is known as P(Deterministic Polynomial Time). Problems of this class are used to be optimized using exact algorithms.
For every other class of problems approximate algorithms are preferred.

When locally optimal solutions equal global optimal? Thinking about greedy algorithm

Recently I've been looking at some greedy algorithm problems. I am confused about locally optimal. As you know, greedy algorithms are composed of locally optimal choices. But combining of locally optimal decisions doesn't necessarily mean globally optimal, right?
Take making change as an example: using the least number of coins to make 15¢, if we have
10¢, 5¢, and 1¢ coins then you can achieve this with one 10¢ and one 5¢. But if we add in a 12¢ coin the greedy algorithm fails as (1×12¢ + 3×1¢) uses more coins than (1×10¢ + 1×5¢).
Consider some classic greedy algorithms, e.g. Huffman, Dijkstra. In my opinion, these algorithms are successful as they have no degenerate cases which means a combination of locally optimal steps always equals global optimal. Do I understand right?
If my understanding is correct, is there a general method for checking if a greedy algorithm is optimal?
I found some discussion of greedy algorithms elsewhere on the site.
However, the problem doesn't go into too much detail.
Generally speaking, a locally optimal solution is always a global optimum whenever the problem is convex. This includes linear programming; quadratic programming with a positive definite objective; and non-linear programming with a convex objective function. (However, NLP problems tend to have a non-convex objective function.)
Heuristic search will give you a global optimum with locally optimum decisions if the heuristic function has certain properties. Consult an AI book for details on this.
In general, though, if the problem is not convex, I don't know of any methods for proving global optimality of a locally optimal solution.
There are some theorems that express problems for which greedy algorithms are optimal in terms of matroids (also:greedoids.) See this Wikipedia section for details: http://en.wikipedia.org/wiki/Matroid#Greedy_algorithms
A greedy algorithm almost never succeeds in finding the optimal solution. In the cases that it does, this is highly dependent on the problem itself. As Ted Hopp explained, with convex curves, the global optimal can be found, assuming you are to find the maximum of the objective function of course (conversely, concave curves also work if you are to minimise). Otherwise, you will almost certainly get stuck in the local optima. This assumes that you already know the objective function.
Another factor which I can think of is the neighbourhood function. Certain neighbourhoods, if large enough, will encompass both the global and local maximas, so that you can avoid the local maxima. However, you can't make the neighbourhood too large or search will be slow.
In other words, whether you find a global optimal or not with greedy algorithms is problem specific, although for most cases, you will not find the globally optimal.
You need to design a witness example where your premise that the algorithm is a global one fails. Design it according to the algorithm and the problem.
Your example of the coin change was not a valid one. Coins are designed purposely to have all the combinations possible, but not to add confusion. Your addition of 12c is not warranted and is extra.
With your addition, the problem is not coin change but a different one (even though the subject are coins, you can change the example to what you want). For this, you yourself gave a witness example to show the greedy algorithm for this problem will get stuck in a local maximum.

Minimum Bandwidth Problem

I'm interesting the NP-complete "minimum bandwidth" problem for finding the minimum bandwidth of a graph. For those not familiar, here is a link about it...
http://en.wikipedia.org/wiki/Graph_bandwidth
I've implemented the Cuthill-McKee algorithm, and this was very successful at giving me a permutation of the vertices in which the bandwidth was reduced; however, I'm looking for the minimum bandwidth, not just a reduced bandwidth that is close. If any of you have experience with this problem, what algorithms provide solutions that are the minimum and not just reduced? I don't need actual implementation of any algorithm, I just want suggestions for what algorithms to research that yield actual minimum bandwidths.
That's interesting problem, but when I read Wiki (your link):
Both the unweighted and weighted
versions are special cases of the
quadratic bottleneck assignment
problem. The bandwidth problem is
NP-hard, even for some special
cases.[4] Regarding the existence of
efficient approximation algorithms, it
is known that the bandwidth is NP-hard
to approximate within any constant,
and this even holds when the input
graphs are restricted to caterpillar
trees (Dubey, Feige & Unger 2010). On
the other hand, a number of
polynomially-solvable special cases
are known.
So wiki says it's NP-Hard to approximate it with any constant (So there is no PTAS for this problem) and your chance is just use heuristic algorithms, sure brute force algorithm works, (numbering node with numbers between 1..n randomly in startup, after that use brute force) but you should spend 1000 year to solve it for caterpillar.
You should search for heuristic algorithms, not approximation and exact algorithms.
As it is NP complete you have to use some kind of "brute force" algorith. So mainly you have the different brute force as option, e.g. like branch-and-bound or linear programming (its LIP, so its in NP).
As it is NP complete you can also take every solution to a different NP complete problem (TSP, SAT,...) by transforming the problem instance from the NP-completeness proof, apply the algorith, and transform it back.
The simplest improvement you can do, is probably to take the result of your Cuthill-McKee algorithm and throw Tabu Search on it.
See this answer for an overview on some of the algorithms that can be applied.

Resources