Pathfinding through four dimensional data

Pathfinding through four dimensional data - algorithm

The problem is finding an optimal route for a plane through four dimensional winds (winds at differing heights and that change as you travel (predicative wind model)).
I've used the traditional A* search algorithm and hacked it to get it to work in 3 dimensions and with wind vectors.
It works in a lot the cases but is extremely slow (im dealing with huge amounts of data nodes) and doesnt work for some edge cases.
I feel like I've got it working "well" but its feels very hacked together.
Is there a better more efficient way for path finding through data like this (maybe a genetic algorithm or neural network), or something I havent even considered? Maybe fluid dynamics? I dont know?
Edit: further details.
Data is wind vectors (direction, magnitude).
Data is spaced 15x15km at 25 different elevation levels.
By "doesnt always work" I mean it will pick a stupid path for an aircraft because the path weight is the same as another path. Its fine for path finding but sub-optimal for a plane.
I take many things into account for each node change:
Cost of elevation over descending.
Wind resistance.
Ignoring nodes with too high of a resistance.
Cost of diagonal tavel vs straight etc.
I use euclidean distance as my heuristic or H value.
I use various factors for my weight or G value (the list above).
Thanks!

You can always have a trade off of time-optimilaity by using a weighted A*.
Weighted A* [or A* epsilon], is expected to find a path faster then A*, but the path won't be optimal [However, it gives you a bound on its optimality, as a paremeter of your epsilon/weight].

A* isn't advertised to be the fastest search algorithm; it does guarantee that the first solution it finds will be the best (assuming you provide an admissible heuristic). If yours isn't working for some cases, then something is wrong with some aspect of your implementation (maybe w/ the mechanics of A*, maybe the domain-specific stuff; given you haven't provided any details, can't say more than that).
If it is too slow, you might want to reconsider the heuristic you are using.
If you don't need an optimal solution, then some other technique might be more appropriate. Again, given how little you have provided about the problem, hard to say more than that.

Are you planning offline or online?
Typically for these problems you don't know what the winds are until you're actually flying through them. If this really is an online problem you may want to consider trying to construct a near-optimal policy. There is a quite a lot of research in this area already, one of the best is "Autonomous Control of Soaring Aircraft by Reinforcement Learning" by John Wharington.

Related

Reduce number of nodes in 3D A* pathfinding using (part of a) uniform grid representation

I am calculating pathfinding inside a mesh which I have build a uniform grid around. The nodes (cells in the 3D grid) close to what I deem a "standable" surface I mark as accessible and they are used in my pathfinding. To get alot of detail (like being able to pathfind up small stair cases) the ammount of accessible cells in my grid have grown quite large, several thousand in larger buildings. (every grid cell is 0.5x0.5x0.5 m and the meshes are rooms with real world dimensions). Even though I only use a fraction of the actual cells in my grid for pathfinding the huge ammount slows the algorithm down. Other than that it works fine and finds the correct path through the mesh, using a weighted manhattan distance heuristic.
Imagine my grid looks like that and the mesh is inside it (can be more or less cubes but its always cubical), however the pathfinding will not be calculated on all the small cubes just a few marked as accessible (usually at the bottom of the grid but that can depend on how many floors the mesh has).
I am looking to reduce the search space for the pathfinding... I have looked at clustering like how HPA* does it and other clustering algorithms like Markov but they all seem to be best used with node graphs and not grids. One obvious solution would be to just increase the size of the small cubes building the grid but then I would lose alot of detail in the pathfinding and it would not be as robust. How could I cluster these small cubes? This is how a typical search space looks when I do my pathfinding (blue are accessible, green is path):
and as you see there is a lot of cubes to search through because the distance between them is quite small!
Never mind that the grid is an unoptimal solution for pathfinding for now.
Does anyone have an idea on how to reduce the ammount of cubes in the grid I have to search through and how would I access the neighbors after I reduce the space? :) Right now it only looks at the closest neighbors while expanding the search space.

A couple possibilities come to mind.
Higher-level Pathfinding
The first is that your A* search may be searching the entire problem space. For example, you live in Austin, Texas, and want to get into a particular building somewhere in Alberta, Canada. A simple A* algorithm would search a lot of Mexico and the USA before finally searching Canada for the building.
Consider creating a second layer of A* to solve this problem. You'd first find out which states to travel between to get to Canada, then which provinces to reach Alberta, then Calgary, and then the Calgary Zoo, for example. In a sense, you start with an overview, then fill it in with more detailed paths.
If you have enormous levels, such as skyrim's, you may need to add pathfinding layers between towns (multiple buildings), regions (multiple towns), and even countries (multiple regions). If you were making a GPS system, you might even need continents. If we'd become interstellar, our spaceships might contain pathfinding layers for planets, sectors, and even galaxies.
By using layers, you help to narrow down your search area significantly, especially if different areas don't use the same co-ordinate system! (It's fairly hard to estimate distance for one A* pathfinder if one of the regions needs latitude-longitude, another 3d-cartesian, and the next requires pathfinding through a time dimension.)
More efficient algorithms
Finding efficient algorithms becomes more important in 3 dimensions because there are more nodes to expand while searching. A Dijkstra search which expands x^2 nodes would search x^3, with x being the distance between the start and goal. A 4D game would require yet more efficiency in pathfinding.
One of the benefits of grid-based pathfinding is that you can exploit topographical properties like path symmetry. If two paths consist of the same movements in a different order, you don't need to find both of them. This is where a very efficient algorithm called Jump Point Search comes into play.
Here is a side-by-side comparison of A* (left) and JPS (right). Expanded/searched nodes are shown in red with walls in black:
Notice that they both find the same path, but JPS easily searched less than a tenth of what A* did.
As of now, I haven't seen an official 3-dimensional implementation, but I've helped another user generalize the algorithm to multiple dimensions.
Simplified Meshes (Graphs)
Another way to get rid of nodes during the search is to remove them before the search. For example, do you really need nodes in wide-open areas where you can trust a much more stupid AI to find its way? If you are building levels that don't change, create a script that parses them into the simplest grid which only contains important nodes.
This is actually called 'offline pathfinding'; basically finding ways to calculate paths before you need to find them. If your level will remain the same, running the script for a few minutes each time you update the level will easily cut 90% of the time you pathfind. After all, you've done most of the work before it became urgent. It's like trying to find your way around a new city compared to one you grew up in; knowing the landmarks means you don't really need a map.
Similar approaches to the 'symmetry-breaking' that Jump Point Search uses were introduced by Daniel Harabor, the creator of the algorithm. They are mentioned in one of his lectures, and allow you to preprocess the level to store only jump-points in your pathfinding mesh.
Clever Heuristics
Many academic papers state that A*'s cost function is f(x) = g(x) + h(x), which doesn't make it obvious that you may use other functions, multiply the weight of the cost functions, and even implement heatmaps of territory or recent deaths as functions. These may create sub-optimal paths, but they greatly improve the intelligence of your search. Who cares about the shortest path when your opponent has a choke point on it and has been easily dispatching anybody travelling through it? Better to be certain the AI can reach the goal safely than to let it be stupid.
For example, you may want to prevent the algorithm from letting enemies access secret areas so that they avoid revealing them to the player, and so that they AI seems to be unaware of them. All you need to achieve this is a uniform cost function for any point within those 'off-limits' regions. In a game like this, enemies would simply give up on hunting the player after the path grew too costly. Another cool option is to 'scent' regions the player has been recently (by temporarily increasing the cost of unvisited locations because many algorithms dislike negative costs).
If you know what places you won't need to search, but can't implement in your algorithm's logic, a simple increase to their cost will prevent unnecessary searching. There's a lot of ways to take advantage of heuristics to simplify and inform your pathfinding, but your biggest gains will come from Jump Point Search.
EDIT: Jump Point Search implicitly selects pathfinding direction using the same heuristics as A*, so you may be able to implement heuristics to a small degree, but their cost function won't be the cost of a node, but rather, the cost of traveling between the two nodes. (A* generally searches adjacent nodes, so the distinction between a node's cost and the cost of traveling to it tends to break down.)
Summary
Although octrees/quad-trees/b-trees can be useful in collision-detection, they aren't as applicable to searches because they section a graph based on its coordinates; not on its connections. Layering your graph (mesh in your vocabulary) into super graphs (regions) is a more effective solution.
Hopefully I've covered anything you'll find useful.
Good luck!

How to tell when the A* algorithm is a good option, and how to choose a good heuristic?

Recently I wrote a solver for the famous "15 puzzle" using the A* algorithm by using a heuristic function based off the sum of the Manhattan distances of the distances for each tile to their destination spots.
This led me to wonder two things:
How do you know when the A* algorithm is even something to use? Unless I came across online tutorials, I would have never guessed that the 15 puzzle could be solved this way.
How do you know which heuristic function to use? At first, for the 15 puzzle, I considered a simple "sum of tiles not in position" heuristic. So if all pieces weren't in their right spots, the heuristic for the 15 puzzle might return 15, whereas 0 would indicate a solved board. But somehow the sum of the distances are better. How does one know, going into it?

If you're exploring a graph to find a path that is in some way "shortest" (the cost doesn't have to be a "distance", but it has to be monotone), you can already use Dijkstra's. Your problem will typically look nothing like path-finding at a first glance though, as in, you're not planning to "travel over a route". It's more abstract than that.
Then if you can use Dijkstra and you have some admissible heuristic (that's the hard part), you can use A*.
An often used technique for finding heuristics is dropping some constraint of your problem. For example, if you can teleport each tile to its destination regardless of whether there's already a tile there, it will take #displacements teleports. So there's the first heuristic. If you have to slide the tiles but they can slide through each other, the cost for each tile is the Manhattan distance to its destination. Then you can look at improving the heuristic, for example the Manhattan distance heuristic obviously ignores that tiles interfere with each other as they move, but there is a simple case where we know where tiles must conflict and use more moves: consider two tiles (pretend there are no other tiles) in the same row and their destinations are also on that row but in order to get there they'd have to pass through each other. They'd have to go around each other, adding two vertical moves. This gives the Linear Conflicts heuristic. Even more interference can be taken into account, for example with pattern databases.

Algorithm to align and compare two sets of vectors which may be incomplete and ignoring scaling?

Here is the problem:
I have many sets of points, and want to come up with a function that can take one set and rank matches based on their similarity to the first. Scaling, translation, and rotation do not matter, and some points may be missing from any of the sets of points. The best match is the one that if scaled and translated in the ideal way has the least mean square error between points (maybe with a cap on penalty, or considering only the best fraction of points to handle missing points).
I am trying to come up with a good way to do this, and am wondering if there are any well known algorithms that can handle this type of problem? Just the name of something would be awesome! I lack a formal CSCI or math education, and am doing the best to teach myself.
A few things I have tried
The first thing that comes to mind is to normalize the points somehow, but I dont think that this is helpful because the missing points may throw things off.
The best way I can think of is to estimate a starting point by translating to match their centroids, scaling so that the largest distances from the centroid of the sets match. From there, do an A* search, scaling, rotating, and translating until I reach a maximum, and then compare the two sets. (I hope I am using the term A* correctly, I mean trying small translations and scalings and selecting the move giving the best match) I think this will find the global maximum most of the time, but is not guaranteed to. I am looking for a better way that will always be correct.
Thanks a ton for the help! It has been fun and interesting trying to figure this out so far, so I hope it is for you as well.

There's a very clever algorithm for identifying starfields. You find 4 points in a diamond shape and then using the two stars farthest apart you define a coordinate system locating the other two stars. This is scale and rotation invariant because the locations are relative to the first two stars. This forms a hash. You generate several of these hashes and use those to generate candidates. Once you have the candidates you look for ones where multiple hashes have the correct relationships.
This is described in a paper and a presentation on http://astrometry.net/ .

This paper may be useful: Shape Matching and Object Recognition Using Shape Contexts
Edit:
There is a couple of relatively simple methods to solve the problem:
To combine all possible pairs of points (one for each set) to nodes, connect these nodes where distances in both sets match, then solve the maximal clique problem for this graph. Since the maximal clique problem is NP-complete, the complexity is probably O(exp(n^2)), so if you have too many points, don't use this algorithm directly, use some approximation.
Use Generalised Hough transform to match two sets of points. This approach has less complexity (O(n^4)). But it is more complicated, so I cannot explain it here.
You can find the details in computer vision books, for example "Machine vision: theory, algorithms, practicalities" by E. R. Davies (2005).

efficient algorithm to find nearest point in a graph that does not have a known equation

I'm asking this questions out of curiostity, since my quick and dirty implementation seems to be good enough. However I'm curious what a better implementation would be.
I have a graph of real world data. There are no duplicate X values and the X value increments at a consistant rate across the graph, but Y data is based off of real world output. I want to find the nearest point on the graph from an arbitrary given point P programmatically. I'm trying to find an efficient (ie fast) algorithm for doing this. I don't need the the exact closest point, I can settle for a point that is 'nearly' the closest point.
The obvious lazy solution is to increment through every single point in the graph, calculate the distance, and then find the minimum of the distance. This however could theoretically be slow for large graphs; too slow for what I want.
Since I only need an approximate closest point I imagine the ideal fastest equation would involve generating a best fit line and using that line to calculate where the point should be in real time; but that sounds like a potential mathematical headache I'm not about to take on.
My solution is a hack which works only because I assume my point P isn't arbitrary, namely I assume that P will usually be close to my graph line and when that happens I can cross out the distant X values from consideration. I calculating how close the point on the line that shares the X coordinate with P is and use the distance between that point and P to calculate the largest/smallest X value that could possible be closer points.
I can't help but feel there should be a faster algorithm then my solution (which is only useful because I assume 99% of the time my point P will be a point close to the line already). I tried googling for better algorithms but found so many algorithms that didn't quite fit that it was hard to find what I was looking for amongst all the clutter of inappropriate algorithms. So, does anyone here have a suggested algorithm that would be more efficient? Keep in mind I don't need a full algorithm since what I have works for my needs, I'm just curious what the proper solution would have been.

If you store the [x,y] points in a quadtree you'll be able to find the closest one quickly (something like O(log n)). I think that's the best you can do without making assumptions about where the point is going to be. Rather than repeat the algorithm here have a look at this link.
Your solution is pretty good, by examining how the points vary in y couldn't you calculate a bound for the number of points along the x axis you need to examine instead of using an arbitrary one.

Let's say your point P=(x,y) and your real-world data is a function y=f(x)
Step 1: Calculate r=|f(x)-y|.
Step 2: Find points in the interval I=(x-r,x+r)
Step 3: Find the closest point in I to P.

If you can use a data structure, some common data structures for spacial searching (including nearest neighbour) are...
quad-tree (and octree etc).
kd-tree
bsp tree (only practical for a static set of points).
r-tree
The r-tree comes in a number of variants. It's very closely related to the B+ tree, but with (depending on the variant) different orderings on the items (points) in the leaf nodes.
The Hilbert R tree uses a strict ordering of points based on the Hilbert curve. The Hilbert curve (or rather a generalization of it) is very good at ordering multi-dimensional data so that nearby points in space are usually nearby in the linear ordering.
In principle, the Hilbert ordering could be applied by sorting a simple array of points. The natural clustering in this would mean that a search would usually only need to search a few fairly-short spans in the array - with the complication being that you need to work out which spans they are.
I used to have a link for a good paper on doing the Hilbert curve ordering calculations, but I've lost it. An ordering based on Gray codes would be simpler, but not quite as efficient at clustering. In fact, there's a deep connection between Gray codes and Hilbert curves - that paper I've lost uses Gray code related functions quite a bit.
EDIT - I found that link - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.133.7490

Beating a greedy algo

For class we have a grid and a bunch of squares on the grid that we need to detect and travel to. We start at (0,0). We scan tiny regions of the grid at a time (for reasons regarding our data structure it's mandatory), and when we detect squares that we need to travel, and then we travel. There are 32 locations on the grid, but we only need to travel 16 of them, and as fast as possible (we're racing someone else to them).
Dijkstra's algo would find the shortest path from our current location to our next location. This is sub-optimal however, because our next location may be really far from the location after that. It would be more beneficial if we could somehow figure out the density of locations as we scan, and then choose to go to a very dense place and travel all the locations within that area.
Is there an algorithm best suited for a situation like this? I know the greedy heuristic isn't optimal. A* and Dijkstra's are the ones we thought about at first but we are hoping there's a completely different solution.
PS this is being done in assembly, unfortunately.

Finding dense clumps of points (e.g. locations you have to visit) is called cluster analysis. See the link for several classes of algorithms.
Assembly language is a really painful way to experiment with high-level algorithms. Is your prof a sadist??

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio