Manhattan, Euclidean and Chebyshev in an A* Algorithm

I am confused about the purpose of Manhattan, Euclidean, and Chebyshev distance in an A* algorithm. Is it just the distance calculation, or does the A* algorithm find paths in different ways depending on the metric (vertical & horizontal, diagonal, or all of them)? My impression was that these three metrics simply have different methods of calculating distance, as described on this website: https://lyfat.wordpress.com/2012/05/22/euclidean-vs-chebyshev-vs-manhattan-distance/
But some people tell me that the A* algorithm moves only vertically and horizontally if the Manhattan metric is used and must be drawn that way, only diagonally for Euclidean, and in all directions for Chebyshev.
So what I wanted to clarify is: does the A* algorithm move in different directions based on the metric (Manhattan, Chebyshev, or Euclidean), or does it move in all directions but assign different heuristic costs based on the metric? I am a student and have been confused by this, so any clarification is appreciated!

Actually, things are a little bit the other way around: we usually know the movement type that we are interested in, and this movement type determines which metric (Manhattan, Chebyshev, Euclidean) is best to use in the heuristic.
Changing the heuristic will not change the connectivity of neighboring cells.
In order to make the A* algorithm find paths according to a particular movement type (i.e. only horizontal+vertical, or diagonal, etc), the neighbor enumeration procedure should be set accordingly. (This enumeration of the neighbors of a node is done somewhere inside the main loop of the algorithm, after a node is popped from the queue).
In brief, not the heuristic, but the way the neighbors of a node are enumerated determines which type of movements the A* algorithm allows.
Afterwards, once a movement type has been established and encoded into the algorithm as described above, it is also important to find a good heuristic. The heuristic needs to satisfy certain criteria in order to be valid (it must not over-estimate the distance to the target), so some heuristics are incompatible with certain movement types. Choosing an invalid heuristic means A* is no longer guaranteed to find an optimal solution. A good choice is to use precisely the metric that measures distance under the selected movement type (e.g. Manhattan for horizontal/vertical movement, and so on).
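To make this concrete, here is a minimal Python sketch (the boolean walkability grid and the function names are illustrative assumptions, not a fixed API). Swapping `neighbors_4` for `neighbors_8` changes which moves A* may take; the heuristic only guides the search.

```python
import heapq
import math

# Minimal sketch: `grid[r][c]` is True for walkable cells (hypothetical setup).
# The neighbor enumeration, not the heuristic, fixes the movement type.

def neighbors_4(cell, grid):
    """Horizontal/vertical movement only -- pair with the Manhattan distance."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc]:
            yield (nr, nc), 1.0                      # uniform step cost

def neighbors_8(cell, grid):
    """8-directional movement -- pair with Chebyshev or octile distance."""
    r, c = cell
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc]:
                yield (nr, nc), math.sqrt(2) if dr and dc else 1.0

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(start, goal, grid, neighbors, heuristic):
    """Generic A*: `neighbors` decides which moves exist,
    `heuristic` only decides how the search is guided."""
    frontier = [(heuristic(start, goal), 0.0, start)]
    g_score = {start: 0.0}
    came_from = {}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        for nbr, step in neighbors(current, grid):
            tentative = g + step
            if tentative < g_score.get(nbr, float("inf")):
                g_score[nbr] = tentative
                came_from[nbr] = current
                heapq.heappush(frontier, (tentative + heuristic(nbr, goal), tentative, nbr))
    return None

# Example: 4-connected movement paired with a matching Manhattan heuristic.
grid = [[True] * 5 for _ in range(5)]
print(a_star((0, 0), (4, 4), grid, neighbors_4, manhattan))
```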

It is also worth mentioning the octile distance, which is a very accurate estimate of the distance when traveling on a grid where diagonal moves are allowed. It essentially estimates a direct path from A to B using diagonal moves with a cost of sqrt(2) instead of 1 for cardinal moves. In other words, it is a kind of Manhattan distance with diagonals.
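As a short sketch, the octile distance can be computed directly from the coordinate differences, assuming unit cardinal cost and sqrt(2) diagonal cost:

```python
import math

def octile(a, b):
    """Octile distance: cardinal steps cost 1, diagonal steps cost sqrt(2)."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return (dx + dy) + (math.sqrt(2) - 2.0) * min(dx, dy)
```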
A very good resource on all of these grid heuristics can be found here: http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html

Related

Shortest distance between two points on a surface

I'm working on my bachelor thesis (in Computer Science) and right now I'm facing the problem of finding the shortest path between two points on a 3D triangular mesh that is manifold. I have already read about MMP, but that computes a distance function $d(x)$ between a given point and every vertex $x$ on the mesh.
I learned that the problem I'm solving is called the geodesic problem, but what I really couldn't find is a good algorithm that uses A* for finding the shortest path between two given vertices.
I also 'invented' an algorithm which uses A* with a Euclidean distance heuristic and a correction step after finding a new point on any edge.
I also have the edges stored in a half-edge structure.
So my main idea is this:
We find the closest edge with the A* algorithm and find on this edge the point minimizing the function $f(x) + g(x)$, where $f$ is our current distance and $g$ is the heuristic (Euclidean distance).
Every time we find a new edge, we unfold the current mesh and find the closest path to our starting point.
So now my questions:
Do you know of any research papers that discuss this problem?
Why has nobody written about an algorithm that uses A*?
What is your opinion of the algorithm I proposed?
Here are some papers and tools related to finding geodesics (or approximations) on a surface mesh:
A Survey of Algorithms for Geodesic Paths and Distances
You Can Find Geodesic Paths in Triangle Meshes by Just Flipping Edges (code)
The Vector Heat Method (code)
You can find more papers in the survey paper.
I implemented the algorithm you mentioned (MMP) a long time ago; it is quite difficult to get right and quite time-consuming, since the number of splits along an edge grows very fast.
I am no expert in the matter, so take this with a grain of salt. Also, sorry, this is more of a comment than an answer...
First, you should clarify some things:
is the mesh convex or concave?
are the paths always on the surface, or can they fly between faces on the outside (if concave) but never inside?
are the start/end points on edges of faces, or can they be inside a face?
Assuming concave, points on edges and only surface paths...
I think the plain graph A* approach is unusable, as there are infinitely many possible paths between a point and any edge of the same face, so how would you test all of them?
If you really want A*, then you can do something similar to raster A* (see the sketch after this list):
resample all your edges to more points, either a fixed n points per edge or some density like 10 points per average edge length or some detail size
use graph A* on the resampled points (do not handle them as edges anymore)
However, this will only produce a path close to the shortest one, so to improve accuracy you should recursively resample the edges near the used points with higher and higher density until the distance between resampled points becomes smaller than your accuracy.
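A rough sketch of this resampling idea might look like the following. The mesh representation (`verts`, `faces`), the sample density, and the brute-force all-pairs neighbour test are illustrative assumptions rather than a fixed API; the start/goal would be the sample indices nearest to your two query points, and a real implementation would index samples by face and merge duplicate vertex samples.

```python
import heapq
import numpy as np

# Rough sketch, assuming a hypothetical mesh given as:
#   verts: (V, 3) array of vertex positions
#   faces: list of (i, j, k) vertex-index triples

def sample_edges(verts, faces, samples_per_edge=10):
    """Put `samples_per_edge` points on every edge and remember which
    faces own each edge (and therefore each sample)."""
    edge_faces = {}
    for f_idx, (i, j, k) in enumerate(faces):
        for a, b in ((i, j), (j, k), (k, i)):
            edge_faces.setdefault(tuple(sorted((a, b))), []).append(f_idx)

    points, point_faces = [], []
    for (a, b), owners in edge_faces.items():
        for t in np.linspace(0.0, 1.0, samples_per_edge):
            points.append((1.0 - t) * verts[a] + t * verts[b])
            point_faces.append(set(owners))
    return np.array(points), point_faces

def a_star_on_samples(points, point_faces, start_idx, goal_idx):
    """Graph A* on the samples: two samples are neighbours if they share a
    face; step cost and heuristic are both straight-line (Euclidean) distance."""
    goal = points[goal_idx]
    h = lambda i: np.linalg.norm(points[i] - goal)
    frontier = [(h(start_idx), 0.0, start_idx)]
    best = {start_idx: 0.0}
    while frontier:
        _, g, cur = heapq.heappop(frontier)
        if cur == goal_idx:
            return g                    # approximate geodesic length
        for nxt in range(len(points)):
            if nxt != cur and point_faces[cur] & point_faces[nxt]:
                ng = g + np.linalg.norm(points[cur] - points[nxt])
                if ng < best.get(nxt, float("inf")):
                    best[nxt] = ng
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return float("inf")
```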
Another option would be to use something similar to CCD (cyclic coordinate descent):
create a plane that goes through your 2 points and the center of your mesh
create a path that goes through all intersections of the plane and the faces between the 2 points (use the shorter of the 2 options)
iteratively move the intersections back and forth and use a greedy approach to get the result
However, this might get stuck in local minima... You could use search/fitting approaches instead, but those will get very slow with an increasing number of faces.
I got the feeling you might also do this using RANSAC ...
From my point of view, I think the first A* approach is the most promising. You just need a linked list of points per edge and one cost counter per point; from this you can easily encode even the recursive improvement of accuracy. It can even be done in place, so no reallocation is needed in the recursion... The algorithm is not complicated, so you should have no problems implementing it, and the result is guaranteed, which is not the case with the other approaches I mention... Another plus is that it can be used even if the start/end point does not belong to an edge...

Manhattan distance generalization

For research I'm working on, I'm trying to find a satisfactory heuristic based on Manhattan distance that can work with any problem and domain as input. This is also known as a domain-independent heuristic.
For now, I know how to apply Manhattan distance to grid-based problems.
Can someone give a tip on how to generalize it to work on every domain and problem, and not just grid-based ones?
The generalization of Manhattan distance is simple. It is a metric which defines the distance between two multi-dimensional points as the sum of the distances along each dimension:
$md(A, B) = dist(a_1, b_1) + dist(a_2, b_2) + \dots$
The distances along each dimension are assumed to be simple to calculate. For numbers, the distance is the absolute value of the difference between the values.
This can be extended to other domains as well. For instance, the distance between two strings could be taken as the Levenshtein distance -- and that would prove to be an interesting metric in conjunction with other dimensions.
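A minimal sketch of this generalization, assuming the caller supplies one distance function per dimension (the function names and example point layout are illustrative):

```python
def generalized_manhattan(a, b, dims):
    """Sum the per-dimension distances; `dims` supplies one distance
    function per dimension, matched to that dimension's type."""
    return sum(dist(x, y) for dist, x, y in zip(dims, a, b))

def abs_diff(x, y):
    return abs(x - y)

def levenshtein(s, t):
    """Plain dynamic-programming edit distance, for a string dimension."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = cur
    return prev[-1]

# Example point with two numeric dimensions and one string dimension:
a = (3, 10.0, "kitten")
b = (7, 12.5, "sitting")
print(generalized_manhattan(a, b, [abs_diff, abs_diff, levenshtein]))  # 4 + 2.5 + 3 = 9.5
```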
The Manhattan distance heuristic is an attempt to measure the minimum number of steps required to reach the goal state. The closer the estimate is to the actual number of steps, the fewer nodes have to be expanded during the search; at the extreme, with a perfect heuristic, you only expand nodes that are guaranteed to be on the goal path.
For a more academic approach to generalizing this idea, you want to search for domain-independent heuristics; there was a lot of research done on this in the late 1990s and early 2000s, although even today a small amount of domain knowledge can usually get you much better results. That being said, there are some good places to start:
delete relaxation: the expand function probably contains some restrictions; remove one or more of those restrictions and you'll end up with a much easier problem, one that can probably be solved in real time, and use the value obtained from that relaxed problem as the heuristic value. E.g. in the sliding tile puzzle, delete the constraint that a piece cannot move on top of other pieces and you end up with the Manhattan distance; relax the constraint that a piece can only move to adjacent squares (so it can jump anywhere) and you end up with the Hamming (misplaced tiles) heuristic.
abstraction: mapping every state in the real search to a smaller abstract state space that you can fully evaluate. Pattern databases are a very popular tool in this area.
critical paths: when you know you must pass through specific states (in either the real state space or an abstract state space), you can perform multiple searches between only the critical points to greatly cut down the number of nodes you would have to search in the full state space
landmarks: very accurate heuristics at the cost of typically high computation time. Landmarks are specific locations for which you precompute the distance to every other possible state (typically 5-25 landmarks are used, depending on graph size), and then you compute a lower bound on the distance from those precomputed values when evaluating each node (see the sketch after this list).
There are a few other classes of domain independent heuristics, but these are the most popular and widely used in classical planning applications.
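As a sketch of the landmark idea, assuming the per-landmark distance tables have already been computed (e.g. with Dijkstra), the usual lower bound follows from the triangle inequality:

```python
def landmark_heuristic(node, goal, landmark_tables):
    """Lower bound from precomputed landmark distances.  `landmark_tables`
    is a list with one dict per landmark, mapping each state to its exact
    distance from that landmark.  By the triangle inequality,
        |d(L, node) - d(L, goal)| <= d(node, goal),
    so the maximum over landmarks never overestimates."""
    return max(abs(t[node] - t[goal]) for t in landmark_tables)
```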

How to tell when the A* algorithm is a good option, and how to choose a good heuristic?

Recently I wrote a solver for the famous "15 puzzle" using the A* algorithm, with a heuristic function based on the sum of the Manhattan distances from each tile to its destination spot.
This led me to wonder two things:
How do you know when the A* algorithm is even something to use? If I hadn't come across online tutorials, I would never have guessed that the 15 puzzle could be solved this way.
How do you know which heuristic function to use? At first, for the 15 puzzle, I considered a simple "number of tiles not in position" heuristic. So if none of the pieces were in their right spots, the heuristic might return 15, whereas 0 would indicate a solved board. But somehow the sum of the distances is better. How does one know, going into it?
If you're exploring a graph to find a path that is in some way "shortest" (the cost doesn't have to be a "distance", but it has to be monotone), you can already use Dijkstra's. Your problem will typically look nothing like path-finding at a first glance though, as in, you're not planning to "travel over a route". It's more abstract than that.
Then if you can use Dijkstra and you have some admissible heuristic (that's the hard part), you can use A*.
An often used technique for finding heuristics is dropping some constraint of your problem. For example, if you could teleport each tile to its destination regardless of whether there's already a tile there, it would take #displacements teleports. So there's the first heuristic. If you have to slide the tiles but they can slide through each other, the cost for each tile is the Manhattan distance to its destination. Then you can look at improving the heuristic: the Manhattan distance heuristic obviously ignores that tiles interfere with each other as they move, but there is a simple case where we know tiles must conflict and use more moves. Consider two tiles (pretend there are no other tiles) in the same row whose destinations are also in that row, but in order to get there they would have to pass through each other. They have to go around each other instead, adding two vertical moves. This gives the Linear Conflicts heuristic. Even more interference can be taken into account, for example with pattern databases.
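As an illustration of how the two relaxations above turn into code, here is a sketch of the misplaced-tile and Manhattan-sum heuristics for the 15 puzzle, assuming states are flat tuples with 0 as the blank (a representation chosen here for illustration):

```python
def hamming(state, goal):
    """Misplaced-tile count: the relaxation where a tile may jump anywhere."""
    return sum(1 for s, g in zip(state, goal) if s != g and s != 0)

def manhattan_sum(state, goal, width=4):
    """Sum of per-tile Manhattan distances: the relaxation where tiles may
    slide through each other.  States are flat tuples, 0 is the blank."""
    target = {tile: divmod(i, width) for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        r, c = divmod(i, width)
        gr, gc = target[tile]
        total += abs(r - gr) + abs(c - gc)
    return total
```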

Is Manhattan distance still an admissible heuristic in this modified n-puzzle?

I successfully implemented an 8-puzzle solver using the A* algorithm and now I am adding a twist to it: there could be more than one empty space in the puzzle and the numbers on the tiles are no longer unique (there could be numbers that are the same).
While the algorithm works after I modified it to generate successor states for all empty spaces, it didn't solve the game in the smallest number of moves possible (I actually found a solution with fewer moves when I tried solving it by hand, surprise!).
Question: Is Manhattan distance still a viable heuristic in this puzzle? If not, what could the heuristic be?
Yes, an admissible heuristic for this problem can involve Manhattan distance.
The simplest approach is just to take the Manhattan distance to the closest possible target location for each tile.
This is clearly admissible, because no tile can reach any target location in fewer moves than by moving directly to the closest one while ignoring all obstacles.
But we can do better - for two identical tiles A and B with target positions 1 and 2, rather than calculating the distance to the closest one for each, we can calculate the distance of all possible assignments of tiles to positions, so:
min(dist(A,1) + dist(B,2), dist(A,2) + dist(B,1))
This can be generalized to any number of tiles, but keep in mind that, for n identical tiles, there are n! such assignments, so it quickly becomes expensive to calculate.
Seeing why this is admissible is still fairly easy - since we're calculating the shortest possible distance for all assignments of tiles to positions, there's no way that the actual shortest distance could be any less.
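A small sketch of this brute-force assignment minimum, using the two-tile case from above (the coordinates are made up purely for illustration):

```python
from itertools import permutations

def best_assignment_distance(tiles, targets):
    """Minimum total Manhattan distance over every way of assigning a group
    of identical tiles to their candidate target positions (n! assignments)."""
    md = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
    return min(sum(md(t, p) for t, p in zip(tiles, perm))
               for perm in permutations(targets))

# Two identical tiles A and B with target positions 1 and 2:
print(best_assignment_distance([(0, 0), (2, 3)], [(1, 1), (0, 3)]))  # -> 4
```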

What are some good methods to finding a heuristic for the A* algorithm?

You have a map of square tiles where you can move in any of the 8 directions. Given that you have a function called cost(tile1, tile2) which tells you the cost of moving from one adjacent tile to another, how do you find a heuristic function h(y, goal) that is both admissible and consistent? Can a method for finding the heuristic be generalized given this setting, or would it vary depending on the cost function?
Amit's tutorial is one of the best I've seen on A* (Amit's page). You should find some very useful hints about heuristics on that page.
Here is the quote about your problem:
On a square grid that allows 8 directions of movement, use Diagonal distance (L∞).
It depends on the cost function.
There are a couple of common heuristics, such as Euclidean distance (the absolute distance between two tiles on a 2d plane) and Manhattan distance (the sum of the absolute x and y deltas). But these assume that the actual cost is never less than a certain amount. Manhattan distance is ruled out if your agent can efficiently move diagonally (i.e. the cost of moving to a diagonal is less than 2). Euclidean distance is ruled out if the cost of moving to a neighbouring tile is less than the absolute distance of that move (e.g. maybe if the adjacent tile was "downhill" from this one).
Edit
Regardless of your cost function, you always have an admissible and consistent heuristic in h(t1, t2) = −∞. It's just not a good one.
Yes, the heuristic is dependent on the cost function, in a couple of ways. First, it must be in the same units. Second, you can't have a lower-cost path through actual nodes than the cost of the heuristic.
In the real world, used for things like navigation on a road network, your heuristic might be "the time a car would take on a direct path at 1.5x the speed limit." The cost for each road segment would use the actual speed limit, which will give a higher cost.
So, what is your cost function between tiles? Is it based on physical properties, or defined outside of your graph?
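As a hedged illustration of "the heuristic must not exceed the actual cost": if you can extract (or bound from below) the cheapest possible per-step cost from your cost(tile1, tile2) function, scaling the diagonal distance by that bound gives an admissible heuristic. The function name and setup here are assumptions for the sketch, not a prescribed API.

```python
def safe_diagonal_heuristic(a, b, min_step_cost):
    """Diagonal (Chebyshev-style) distance scaled by the cheapest possible
    per-step cost.  Any 8-directional path needs at least max(dx, dy) steps,
    and each step costs at least `min_step_cost`, so this never overestimates.
    `min_step_cost` is assumed to be a lower bound taken from cost(tile1, tile2)."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return min_step_cost * max(dx, dy)
```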
