Is Manhattan distance still an admissible heuristic in this modified n-puzzle?

I successfully implemented an 8-puzzle solver using the A* algorithm and now I am adding a twist to it: there can be more than one empty space in the puzzle, and the numbers on the tiles are no longer unique (several tiles may carry the same number).
The algorithm still works after I modified it to generate successor states for all empty spaces, but it doesn't solve the game in the smallest number of moves possible (I actually came up with a smaller number of moves when I solved it by hand, surprise!).
Question: Is Manhattan distance still a viable heuristic in this puzzle? If not, what could the heuristic be?

Yes, an admissible heuristic for this problem can involve Manhattan distance.
The simplest approach is just to take the Manhattan distance to the closest possible target location for each tile.
This is clearly admissible: no tile can reach any of its candidate target locations in fewer moves than the Manhattan distance to the closest one, even if it could ignore every obstacle on the way.
But we can do better - for two identical tiles A and B with target positions 1 and 2, rather than calculating the distance to the closest one for each, we can calculate the distance of all possible assignments of tiles to positions, so:
min(dist(A,1) + dist(B,2), dist(A,2) + dist(B,1))
This can be generalized to any number of tiles, but keep in mind that for n identical tiles there are n! such assignments, so the calculation quickly becomes expensive.
Seeing why this is admissible is still fairly easy: since we take the minimum over all possible assignments of tiles to target positions, and any actual solution must realize one of those assignments, the true cost can never be smaller.
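As an illustration, here is a minimal sketch of this heuristic in Python. The board representation and all names are my own assumptions, not from the question; it brute-forces the n! assignments with itertools.permutations, and for larger groups of identical tiles you would want the Hungarian algorithm, which solves the assignment problem in polynomial time.

```python
from itertools import permutations

def manhattan(p, q):
    """Manhattan distance between two (row, col) positions."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def group_heuristic(tile_positions, target_positions):
    """Cheapest assignment of a group of identical tiles to their targets.

    Brute-forces all n! assignments; admissible because it lower-bounds
    the moves needed even if tiles could pass through everything.
    """
    return min(
        sum(manhattan(t, g) for t, g in zip(tile_positions, perm))
        for perm in permutations(target_positions)
    )

def heuristic(positions_by_value, targets_by_value):
    """Sum the group heuristic over every distinct tile value."""
    return sum(
        group_heuristic(positions_by_value[v], targets_by_value[v])
        for v in positions_by_value
    )

# Example: two identical '1' tiles at (0, 0) and (2, 2), targets (0, 1) and (2, 1).
print(heuristic({1: [(0, 0), (2, 2)]}, {1: [(0, 1), (2, 1)]}))  # -> 2
```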

Related

Manhattan, Euclidean and Chebyshev in an A* algorithm

I am confused about the purpose of the Manhattan, Euclidean and Chebyshev metrics in an A* algorithm. Is it just the distance calculation, or does the A* algorithm find paths in different ways depending on the metric (vertical & horizontal, diagonal, or all three)? My impression was that the three metrics are simply different ways of calculating distance, as described on this page: https://lyfat.wordpress.com/2012/05/22/euclidean-vs-chebyshev-vs-manhattan-distance/
But some people tell me that the A* algorithm can move only vertically and horizontally if the Manhattan metric is used, only diagonally for Euclidean, and in all three ways for Chebyshev.
So what I wanted to clarify is: does the A* algorithm move in different directions based on the metric (Manhattan, Chebyshev or Euclidean), or does it move in all directions but with different heuristic costs based on the metric? I am a student and have been confused by this, so any clarification is appreciated!
Actually, things are a little bit the other way around, i.e. we usually know the movement type that we are interested in, and this movement type determines which metric (Manhattan, Chebyshev, Euclidean) is the best one to use in the heuristic.
Changing the heuristic will not change the connectivity of neighboring cells.
In order to make the A* algorithm find paths according to a particular movement type (i.e. only horizontal+vertical, or diagonal, etc), the neighbor enumeration procedure should be set accordingly. (This enumeration of the neighbors of a node is done somewhere inside the main loop of the algorithm, after a node is popped from the queue).
In brief, not the heuristic, but the way the neighbors of a node are enumerated determines which type of movements the A* algorithm allows.
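A hedged sketch of what that enumeration might look like in Python (the grid representation and names are my own assumptions): the movement type is fixed entirely by which offsets you generate, not by the heuristic.

```python
def neighbors(cell, grid, diagonal=False):
    """Enumerate the reachable neighbors of a cell. This choice of offsets,
    not the heuristic, decides which movements A* is allowed to make."""
    x, y = cell
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]         # horizontal + vertical
    if diagonal:
        offsets += [(1, 1), (1, -1), (-1, 1), (-1, -1)]  # 8-connected movement
    for dx, dy in offsets:
        nx, ny = x + dx, y + dy
        if 0 <= nx < len(grid) and 0 <= ny < len(grid[0]) and grid[nx][ny] == 0:
            yield (nx, ny)  # assumes 0 marks a walkable cell
```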
Afterwards, once a movement type has been established and encoded into the algorithm as described above, it is also important to find a good heuristic. The heuristic needs to satisfy certain criteria in order to be valid (it must not over-estimate the distance to the target), so some heuristics are incompatible with certain movement types. With an invalid heuristic, A* is no longer guaranteed to find an optimal solution. A good choice is to use precisely the metric that measures distance under the selected movement type (e.g. Manhattan for horizontal/vertical movement, and so on).
It is also worth mentioning the octile distance, which is a very accurate estimate of the distance when traveling on a grid that allows diagonal moves. It estimates a direct path from A to B where diagonal moves cost sqrt(2) instead of the 1 charged for cardinal moves. In other words, it is a kind of Manhattan distance with diagonals.
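For reference, the four metrics as plain Python functions (a sketch assuming unit cost for cardinal moves and sqrt(2) for diagonal ones, which is the usual convention):

```python
import math

def manhattan(a, b):   # admissible for 4-directional movement
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def chebyshev(a, b):   # admissible for 8-directional movement with diagonal cost 1
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def euclidean(a, b):   # admissible for unconstrained movement in the plane
    return math.hypot(a[0] - b[0], a[1] - b[1])

def octile(a, b):      # 8-directional movement with diagonal cost sqrt(2)
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return (dx + dy) + (math.sqrt(2) - 2) * min(dx, dy)
```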
A very good resource on all of these grid heuristics is Amit Patel's page: http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html

Hill climbing in an n-dimensional space: finding the neighbor

In hill climbing in one dimension, I try two neighbors, a small delta to the left and one to the right of my current point, and then keep the one that gives a higher value of the objective function. How do I extend this to an n-dimensional space? How does one define a neighbor in n dimensions? Do I have to try 2^n neighbors (a ±delta applied in each of the dimensions)?
You don't need to compare each pair of neighbors; you need to compute a set of neighbors, e.g. on a circle (a sphere or hypersphere in higher dimensions) with radius delta, and then take the one with the highest value to "climb up". In any case you will discretize the neighborhood of your current solution and compute the score function for each neighbor. If your function is differentiable, gradient ascent/descent algorithms may solve your problem:
1) Compute the gradient (direction of steepest ascent)
2) Take a small step in the direction of the gradient
3) Stop if the solution does not change
A common problem with these algorithms is that they often find only local maxima/minima. You can find a great overview of gradient descent/ascent algorithms here: http://sebastianruder.com/optimizing-gradient-descent/
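A minimal sketch of steps 1-3 above in Python (the gradient grad_f and the step size are assumptions you would supply for your own problem):

```python
import numpy as np

def gradient_ascent(grad_f, x0, step=0.01, tol=1e-8, max_iters=10_000):
    """Steps 1-3 above: follow the gradient until the iterate stops moving."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        x_new = x + step * grad_f(x)  # small step in the steepest-ascent direction
        if np.linalg.norm(x_new - x) < tol:
            break                     # solution no longer changes: stop
        x = x_new
    return x

# Example: maximize f(x, y) = -(x - 1)^2 - (y + 2)^2, whose optimum is (1, -2).
grad_f = lambda x: np.array([-2.0 * (x[0] - 1.0), -2.0 * (x[1] + 2.0)])
print(gradient_ascent(grad_f, [0.0, 0.0]))  # approaches [1, -2]
```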
If you are using IEEE-754 floating point numbers, then the obvious answer is something like (2^52*(log_2(delta)+1023))^(n-1)+1 if delta >= 2^(-1022) (more or less, depending on your search space...), as that is the only way you can be certain that no neighboring solutions within a distance of delta remain unchecked.
Even assuming you instead take a random fixed-size sample of all points within a distance of delta, let's say delta = 0.1, you would still have the problem that if the distance from the local optimum were 0.0001, the probability of finding an improvement in just one dimension would be less than 0.0001/0.1/2 = 0.05%, so you would need more and more random samples as you get closer to the local optimum (whose value you don't know...).
Obviously hill climbing is not intended for the real number space or theoretical graph spaces with infinite degree. You should instead be using a global search algorithm.
One example of a multidimensional search algorithm which needs only O(n) neighbours instead of O(2^n) neighbours is the Torczon simplex method described in Multidirectional search: A direct search algorithm for parallel machines (1989). I chose this over the more widely known Nelder-Mead method because the Torczon simplex method has a convergence proof (convergence to a local optimum given some reasonable conditions).
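To make the "O(n) neighbours" point concrete, here is a hedged sketch (my own, much simpler than Torczon's method) of hill climbing that evaluates only the 2n axis-aligned neighbours, ±delta along each dimension, rather than 2^n combinations:

```python
import numpy as np

def hill_climb(f, x0, delta=0.1, shrink=0.5, min_delta=1e-6):
    """Climb by trying +/-delta along each axis: 2n neighbours per step.
    When no neighbour improves, halve delta; stop once it is tiny."""
    x = np.asarray(x0, dtype=float)
    best = f(x)
    while delta > min_delta:
        improved = False
        for i in range(len(x)):
            for sign in (1.0, -1.0):
                candidate = x.copy()
                candidate[i] += sign * delta
                value = f(candidate)
                if value > best:
                    x, best, improved = candidate, value, True
        if not improved:
            delta *= shrink  # refine the step size near a local optimum
    return x, best

# Example: the same quadratic bowl as above, maximized from the origin.
print(hill_climb(lambda x: -(x[0] - 1) ** 2 - (x[1] + 2) ** 2, [0.0, 0.0]))
```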

How to tell when the A* algorithm is a good option, and how to choose a good heuristic?

Recently I wrote a solver for the famous "15 puzzle" using the A* algorithm, with a heuristic function based on the sum of the Manhattan distances from each tile to its destination spot.
This led me to wonder two things:
How do you know when the A* algorithm is even something to use? If I hadn't come across online tutorials, I would never have guessed that the 15 puzzle could be solved this way.
How do you know which heuristic function to use? At first, for the 15 puzzle, I considered a simple "number of tiles not in position" heuristic. So if no piece were in its right spot, the heuristic for the 15 puzzle would return 15, whereas 0 would indicate a solved board. But somehow the sum of the distances is better. How does one know, going into it?
If you're exploring a graph to find a path that is in some way "shortest" (the cost doesn't have to be a "distance", but it has to be monotone), you can already use Dijkstra's algorithm. Your problem will typically look nothing like path-finding at first glance, though; you're not planning to "travel over a route". It's more abstract than that.
Then if you can use Dijkstra and you have some admissible heuristic (that's the hard part), you can use A*.
An often used technique for finding heuristics is dropping some constraint of your problem. For example, if you could teleport each tile to its destination regardless of whether there's already a tile there, solving the puzzle would take #displacements teleports; there's the first heuristic. If you have to slide the tiles but they may slide through each other, the cost for each tile is the Manhattan distance to its destination. Then you can look at improving the heuristic. The Manhattan distance heuristic obviously ignores that tiles interfere with each other as they move, but there is a simple case where we know tiles must conflict and use extra moves: consider two tiles (pretend there are no other tiles) in the same row, whose destinations are also in that row, but arranged so that the tiles would have to pass through each other to get there. They have to go around each other instead, adding two vertical moves. This gives the Linear Conflicts heuristic. Even more interference can be taken into account, for example with pattern databases.
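Those relaxations translate almost directly into code. Below is a hedged sketch (the flat tuple board with 0 as the blank is my own assumed representation) of the misplaced-tiles and Manhattan heuristics, plus a row-only version of linear conflicts:

```python
def misplaced(board, goal):
    """Relaxation 1: tiles teleport to their goals -> count displaced tiles."""
    return sum(1 for b, g in zip(board, goal) if b != 0 and b != g)

def manhattan(board, goal, width=4):
    """Relaxation 2: tiles slide through each other -> sum of distances."""
    goal_pos = {v: divmod(i, width) for i, v in enumerate(goal)}
    total = 0
    for i, v in enumerate(board):
        if v != 0:
            r, c = divmod(i, width)
            gr, gc = goal_pos[v]
            total += abs(r - gr) + abs(c - gc)
    return total

def _lis_length(seq):
    """Length of the longest increasing subsequence (O(n^2), fine for width 4)."""
    best = [1] * len(seq)
    for j in range(len(seq)):
        for i in range(j):
            if seq[i] < seq[j] and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
    return max(best, default=0)

def row_conflicts(board, goal, width=4):
    """Row-only linear conflicts: tiles already in their goal row must end up
    in increasing goal-column order; each tile off a longest increasing
    subsequence must detour around the others, costing 2 extra moves."""
    goal_pos = {v: divmod(i, width) for i, v in enumerate(goal)}
    extra = 0
    for r in range(width):
        goal_cols = []
        for c in range(width):
            v = board[r * width + c]
            if v != 0 and goal_pos[v][0] == r:
                goal_cols.append(goal_pos[v][1])
        extra += 2 * (len(goal_cols) - _lis_length(goal_cols))
    return extra

def heuristic(board, goal, width=4):
    """Manhattan + linear conflicts stays admissible: the detour moves are
    vertical, so they are not double-counted by the Manhattan term."""
    return manhattan(board, goal, width) + row_conflicts(board, goal, width)
```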

Find member of set with minimizing total distance property

I'm looking for an efficient solution to the following problem: for a given set of points in n-dimensional Euclidean space, find the member of this set that minimizes the total distance to the other points in the set.
The obvious naïve approach is quadratic, so I'm looking for something less than quadratic.
My first thought was that all I need is to find the center of the bounding sphere and then take the point in the set closest to it. But this is actually not true: imagine a right triangle. All its vertices are equidistant from that center, yet exactly one vertex meets our requirement.
It would be nice if someone could shed some light on this issue.
A first guess: the average minimizes the total squared distance to all the points, so after you find the average you could pick the member closest to it. As correctly pointed out in the comments, it is the median, not the average, that minimizes total distance. Strictly speaking, the coordinate-wise median minimizes total Manhattan distance; total Euclidean distance is minimized by the geometric median, for which the coordinate-wise median is only an approximation. The median can be calculated in O(n) per dimension, so for high-dimensional datasets this solution is O(n*m), where m is the number of dimensions.
Also some links:
See accepted answer here: Algorithm to find point of minimum total distance from locations
And link provided by mcdowella: http://en.wikipedia.org/wiki/Geometric_median
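A hedged numpy sketch of that suggestion, with the caveat above in mind (names are my own):

```python
import numpy as np

def approx_best_member(points):
    """points: (n, m) array. Return the member closest to the coordinate-wise
    median, an O(n*m) approximation to the true minimizer."""
    median = np.median(points, axis=0)                   # O(n*m) overall
    distances = np.linalg.norm(points - median, axis=1)  # each member's distance
    return points[np.argmin(distances)]

# The right-triangle example from the question: (0, 0) is the correct answer.
print(approx_best_member(np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])))
```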
I am making this up as I go along, but there appears to be a close connection between "best point of a set" and "best point" in convex optimization.
Your score function is a sum of distances. Each distance is convex U-shaped (OK, V-shaped in this case), so their sum is convex too. In particular, it has a perfectly good derivative everywhere except at the points of the set themselves, and this derivative is optimistic: if you take the value and the derivative at a point, neglecting the term contributed by the point you are standing on, then predictions based on them will be optimistic. The line formed using the derivative lies almost entirely beneath the correct function but grazes it at a single point.
This leads to the following algorithm:
Repeatedly:
1) Pick a remaining point at random and check whether it is the best point so far. If so, take note of it.
2) Take the derivative of the sum of distances at this point.
3) Use this derivative, and the value at that point, to work out the predicted sum of distances at every other point, and discard as possible answers the points where this prediction is already worse than the best answer so far (you still need to take them into account when working out distances and derivatives). These are the points on the far side of a plane drawn through the chosen point, normal to the derivative.
4) Discard the chosen point as a contender as well, and repeat while there are any points left to consider.
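A hedged numpy sketch of that loop (names are my own): by convexity, f(y) >= f(x) + g·(y - x), so any point whose lower bound already exceeds the best sum found so far can be pruned.

```python
import numpy as np

def best_point(points):
    """points: (n, m) array. Return the member minimizing the sum of
    Euclidean distances to all members, pruning candidates via convexity."""
    pts = np.asarray(points, dtype=float)
    alive = np.ones(len(pts), dtype=bool)  # still-candidate answers
    best_val, best_idx = np.inf, -1
    while alive.any():
        i = np.random.choice(np.flatnonzero(alive))
        diff = pts - pts[i]                  # to ALL points, not just candidates
        dists = np.linalg.norm(diff, axis=1)
        f = dists.sum()
        if f < best_val:
            best_val, best_idx = f, i
        # Derivative of the sum of distances, skipping the point we stand on.
        nonzero = dists > 0
        grad = -(diff[nonzero] / dists[nonzero, None]).sum(axis=0)
        # Convexity: f(y) >= f(x) + grad . (y - x); prune hopeless candidates.
        alive &= (f + diff @ grad) <= best_val
        alive[i] = False                     # the chosen point has been scored
    return pts[best_idx], best_val

print(best_point(np.random.rand(1000, 2)))
```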
I would expect this to be something like O(n log n) on randomly chosen points. However, if the set of points forms the vertices of a regular polygon, it will cost O(n^2), discarding only the chosen point each time: any of the n points is in fact a correct answer, and they all have the same sum of distances to each other.
I will of course up-vote anybody who can confirm or deny this general principle for finding the best of a set of given points under a convex objective function.
OK, I was interested enough in this to program it up, so I have 200+ lines of Java to dump in here if anybody cares. In 2 dimensions it's very fast, but at 20 dimensions you gain only a factor of two or so. This is reasonably understandable: each iteration cuts off points by projecting the problem down to a line and chopping off the fraction of points beyond a threshold on that line. A randomly chosen point will be about half as far from the centre as the other points, and very roughly you can expect the projection to cut off all but some multiple of the d-th root of 1/2, so as d increases, the fraction of points you can discard in each iteration shrinks.

What are some good methods to finding a heuristic for the A* algorithm?

You have a map of square tiles where you can move in any of the 8 directions. Given a function cost(tile1, tile2) which tells you the cost of moving from one adjacent tile to another, how do you find a heuristic function h(y, goal) that is both admissible and consistent? Can a method for finding the heuristic be generalized in this setting, or would it vary depending on the cost function?
Amit's tutorial is one of the best I've seen on A* (Amit's page); you should find some very useful hints about heuristics there.
Here is the quote about your problem:
On a square grid that allows 8 directions of movement, use Diagonal distance (L∞).
It depends on the cost function.
There are a couple of common heuristics, such as Euclidean distance (the straight-line distance between two tiles on a 2D plane) and Manhattan distance (the sum of the absolute x and y deltas). But these assume that the actual cost of a move is never less than what the metric reports. Manhattan distance is ruled out if your agent can move diagonally at a discount (i.e. the cost of a diagonal move is less than 2). Euclidean distance is ruled out if the cost of moving to a neighbouring tile can be less than the straight-line length of that move (e.g. if the adjacent tile is "downhill" from this one).
Edit
Regardless of your cost function, you always have an admissible and consistent heuristic in h(t1, t2) = -∞. It's just not a good one (h = 0 is the non-degenerate version, and it reduces A* to Dijkstra's algorithm).
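One generic construction (my own hedged sketch, not from the answers): if you know a lower bound c_min on the cost of any single move, scaling the diagonal (Chebyshev) distance by it is admissible and consistent for 8-directional movement, whatever the rest of the cost function looks like.

```python
def h(t1, t2, c_min):
    """Admissible for 8-way grids whenever every cost(a, b) >= c_min:
    the Chebyshev distance is the minimum number of moves between the
    tiles, and each of those moves costs at least c_min. It is also
    consistent, since one move changes the Chebyshev distance by at most 1."""
    moves = max(abs(t1[0] - t2[0]), abs(t1[1] - t2[1]))
    return c_min * moves
```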
Yes, the heuristic is dependent on the cost function, in a couple of ways. First, it must be in the same units. Second, the actual cost of any path through the nodes must never be lower than the heuristic's estimate.
In the real world, used for things like navigation on a road network, your heuristic might be "the time a car would take on a direct path at 1.5x the speed limit." The cost for each road segment would use the actual speed limit, which will give a higher cost.
So, what is your cost function between tiles? Is it based on physical properties, or defined outside of your graph?
