3D Pathfinding AI algorithm recommendation and analysis - algorithm

I am trying to solve the following problem:
I have a 2D tiled game which consists in airplanes flying in the airspace, trying to land in the nearest airport (there can be 'n' goals). The idea is making the planes search for the best path by themselves, avoiding colisions.
So I was going to try the A* algorithm, but then I found this other restriction: The planes can change their altitude if they need to. So I had the idea to implement the same philosophy of A*, but in 3D (of expanding nodes to the possible moves, letting the plane move also up, down, down-north, up-east, etc., making an abstract 3D to handle a relative altitude, and thus letting the algorithm find the best path with 3D moves).
About the heuristics, I discarded the manhattan dinstance because I wanted the algorithm to be more efficient (because you know a good heuristic makes a more efficient search, manhattan overstimates the cost, and I am using diagonal moves), so I decided to implement the diagonal distance (which combines aspects from both manhattan and euclidean), recommended to 8-adjacencies (expanding nodes also in diagonal moves). But I have a lot more adjacencies, so I was trying to adapt the diagonal distance formulas to 16-adjacencies (excluding the up and down diagonals like up-northeast, down-sowthwest, and so on), so the manhattan estimate for every 'diagonal move' (except those I mention) has the same cost value (1 diagonal move = 2 ortogonal moves, not 3 as it'd be in the "up and down diagonals" I have excluded), and with that the formulas for this heuristic were generalized like this:
Let node A be the start, and B the goal, and their respective locations be (xa,ya,za) and (xb,yb,zb)
numberOfDiagonalSteps = min{|xa-xb|,|ya-yb|,|za-zb|}
manhattanDistance = |xa-xb| + |ya-yb| + |za-zb|
numberOfStraightSteps = manhattanDistance - 2*numberOfDiagonalSteps
And assuming diagonal steps cost sqrt(3) (you know, Pythagoras, having ortogonal costing 1):
The heuristic is: h(n) = numberOfStraightSteps + sqrt(3)*numberOfDiagonalSteps
Well... one of my questions is that, as planes are moving ("obstacle nodes"), the algorithm has to be refreshing, re-executing, so, what do you recommend me to do best?
I mean... is it better to try it like that, or better try to implement the D*-Lite?
And my other question is about time complexity. It is clear that the worst case for these algorithms is exponential, but it can be really improved from a good heuristic. But I don't find how the algorithm in my problem can be precisely analyzed. What time complexity can I give to that algorithm, or what do you suggest me to do in my case?
Thank you for your attention.

I would use simple map filling see:
http://en.wikipedia.org/wiki/Dijkstra's_algorithm
https://www.allegro.cc/forums/thread/599046
but the map will have more layers (flight altitudes). There can be just few of them (to limit time/memory wasting) for example 8 layers should be enough for up to 128 airplanes.
Of course it depends on 2D map area size and also after filling the map just take the shortest path from it. In filling the map consider any plane as obstacle (with some border around for safety). In this algorithm you can very simply add fuel consumption criteria or any other.
Also airfield selection can be very simple by this (first wants first gets the closest one). You have to have map for each airplane in time of decision (or refiling the same one for each plane separately). Do not need to be the whole map ... just the area between destination and plane
If you have to obey air traffic regulations then you need to apply flight plans + ad hoc scheduling instead. That is not an easy task (took me almost half a year to code it) and also the air traffic control is a bit complex especially the waiting ques in air and field sharing on the ground. All must by dynamically changeable (weather,waits,technical/political or security issues,...) but I strongly doubt this is the case so simple map filling above should do :)

Related

Manhattan distance generalization

For a research I'm working on I'm trying to find a satisfying heuristic that is based on Manhattan distance which can work with any problem and domain as an input. Which is also known as domain-independent heuristic.
For now, I know how to apply Manhattan distance on a grid based problems.
Can someone give a tip how to generalize it to work on every domain and problem and not just grid based problems?
The generalization of Manhattan distance is simple. It is a metric which defines the distance between two multi-dimensional points as the sum of the distances along each dimension:
md(A, B) = dist(a1, b1) + dist(a2, b2) + . . .
The distances along each dimension are assumed to be simple to calculate. For numbers, the distance is the absolute value of the difference between the values.
This can be extended to other domains as well. For instance, the distance between two strings could be taken as the Levenshtein distance -- and that would prove to be an interesting metric in conjunction with other dimensions.
The manhattan distance heuristic is an attempt to measure the minimum number of steps required to find a path to the goal state. The closer you get to the actual number of steps, the fewer nodes have to be expanded during search, where at the extreme with a perfect heuristic, you only expand nodes that are guaranteed to be on the goal path.
For a more academic approach to generalizing this idea, you want to search around for domain independent heuristics; there was a lot of research done on this in the late 1990s early 2000s although even today, a small amount of domain knowledge can usually get you much better results. That being said, there are some good places to start:
delete relaxation: the expand function probably contains some restrictions, remove one or more of those restrictions and you'll end up with a much easier problem, one that can probably be solved in real time and you'll and use the value generated by that relaxed problem as the heuristic value. e.g. in the sliding tile puzzle, delete the constraint that a piece cannot move on top of other pieces and you end up with the manhattan distance, relax that a piece can only move to adjacent squares and you end up with the hamming distance heuristic.
abstraction: mapping every state in the real search to a smaller abstract state space that you can fully evaluate. Pattern databases are a very popular tool in this area.
critical paths: when you know you must pass through specific states (in either the real state space or an abstract state space) you can perform multiple searches between only the critical points to cut down greatly the number of nodes you would have to search in the full state space
landmarks: very accurate heuristics at the cost of typically high computation time. landmarks are specific locations in which you precompute the distance to every possible other state from (typically 5-25 landmarks are used depending on graph size) and then you compute the lower bound possible distance using those precomputed values when evaluating each node.
There are a few other classes of domain independent heuristics, but these are the most popular and widely used in classical planning applications.

Trouble finding shortest path across a 2D mesh surface

I asked this question three days ago and I got burned by contributors because I didn't include enough information. I am sorry about that.
I have a 2D matrix and each array position relates to the depth of water in a channel, I was hoping to apply Dijkstra's or a similar "least cost path" algorithm to find out the least amount of concrete needed to build a bridge across the water.
It took some time to format the data into a clean version so I've learned some rudimentary Matlab skills doing that. I have removed most of the land so that now the shoreline is standardised to a certain value, my plan is to use a loop to move through each "pixel" on the "west" shore and run a least cost algorithm against it to the closest "east" shore and move through the entire mesh ultimately finding the least cost one.
This is my problem, fitting the data to any of the algorithms. Unfortunately I get overwhelmed by options and different formats because the other examples are for other use cases.
My other consideration is that when the shortest cost path is calculated that it will be a jagged line which would not be suitable for a bridge so I need to constrain the bend radius in the path if at all possible and I don't know how to go about doing that.
A picture of the channel:
Any advice in an approach method would be great, I just need to know if someone knows a method that should work, then I will spend the time learning how to fit the data.
You can apply Dijkstra to your problem in this way:
the two "dry" regions you want to connect correspond to matrix entries with value 0; the other cells have a positive value designating the depth (or the cost of filling this place with concrete)
your edges are the connections of neighbouring cells in your matrix. (It can be a 4- or 8-neighbourhood.) The weight of the edge is the arithmetic mean of the values of the connected cells.
then you apply the Dijkstra algorithm with a starting point in one "dry" region and an end point in the other "dry" region.
The cheapest path will connect two cells of value 0 and its weight will correspond to sum of costs of cells visited. (One half of each cell weight is coming from the edge going to the cell, the other half from the edge leaving the cell.)
This way you will get a possibly rather crooked path leading over the water, which may be a helpful hint for where to build a cheap bridge.
You can speed up the calculation by using the A*-algorithm. Therefore one needs a lower bound of the remaining costs for reaching the other side for each cell. Such a lower bound can be calculated by examining the "concentric rings" around a point as long as rings do not contain a 0-cell of the other side. The sum of the minimal cell values for each ring is then a lower bound of the remaining costs.
An alternative approach, which emphasizes the constraint that you require a non-jagged shape for your bridge, would be to use Monte-Carlo, simulated annealing or a genetic algorithm, where the initial "bridge" consisted a simple spline curve between two randomly chosen end points (one on each side of the chasm), plus a small number or randomly chosen intermediate points in the chasm. You would end up with a physically 'realistic' bridge and a reasonably optimized cost of concrete.

Reduce number of nodes in 3D A* pathfinding using (part of a) uniform grid representation

I am calculating pathfinding inside a mesh which I have build a uniform grid around. The nodes (cells in the 3D grid) close to what I deem a "standable" surface I mark as accessible and they are used in my pathfinding. To get alot of detail (like being able to pathfind up small stair cases) the ammount of accessible cells in my grid have grown quite large, several thousand in larger buildings. (every grid cell is 0.5x0.5x0.5 m and the meshes are rooms with real world dimensions). Even though I only use a fraction of the actual cells in my grid for pathfinding the huge ammount slows the algorithm down. Other than that it works fine and finds the correct path through the mesh, using a weighted manhattan distance heuristic.
Imagine my grid looks like that and the mesh is inside it (can be more or less cubes but its always cubical), however the pathfinding will not be calculated on all the small cubes just a few marked as accessible (usually at the bottom of the grid but that can depend on how many floors the mesh has).
I am looking to reduce the search space for the pathfinding... I have looked at clustering like how HPA* does it and other clustering algorithms like Markov but they all seem to be best used with node graphs and not grids. One obvious solution would be to just increase the size of the small cubes building the grid but then I would lose alot of detail in the pathfinding and it would not be as robust. How could I cluster these small cubes? This is how a typical search space looks when I do my pathfinding (blue are accessible, green is path):
and as you see there is a lot of cubes to search through because the distance between them is quite small!
Never mind that the grid is an unoptimal solution for pathfinding for now.
Does anyone have an idea on how to reduce the ammount of cubes in the grid I have to search through and how would I access the neighbors after I reduce the space? :) Right now it only looks at the closest neighbors while expanding the search space.
A couple possibilities come to mind.
Higher-level Pathfinding
The first is that your A* search may be searching the entire problem space. For example, you live in Austin, Texas, and want to get into a particular building somewhere in Alberta, Canada. A simple A* algorithm would search a lot of Mexico and the USA before finally searching Canada for the building.
Consider creating a second layer of A* to solve this problem. You'd first find out which states to travel between to get to Canada, then which provinces to reach Alberta, then Calgary, and then the Calgary Zoo, for example. In a sense, you start with an overview, then fill it in with more detailed paths.
If you have enormous levels, such as skyrim's, you may need to add pathfinding layers between towns (multiple buildings), regions (multiple towns), and even countries (multiple regions). If you were making a GPS system, you might even need continents. If we'd become interstellar, our spaceships might contain pathfinding layers for planets, sectors, and even galaxies.
By using layers, you help to narrow down your search area significantly, especially if different areas don't use the same co-ordinate system! (It's fairly hard to estimate distance for one A* pathfinder if one of the regions needs latitude-longitude, another 3d-cartesian, and the next requires pathfinding through a time dimension.)
More efficient algorithms
Finding efficient algorithms becomes more important in 3 dimensions because there are more nodes to expand while searching. A Dijkstra search which expands x^2 nodes would search x^3, with x being the distance between the start and goal. A 4D game would require yet more efficiency in pathfinding.
One of the benefits of grid-based pathfinding is that you can exploit topographical properties like path symmetry. If two paths consist of the same movements in a different order, you don't need to find both of them. This is where a very efficient algorithm called Jump Point Search comes into play.
Here is a side-by-side comparison of A* (left) and JPS (right). Expanded/searched nodes are shown in red with walls in black:
Notice that they both find the same path, but JPS easily searched less than a tenth of what A* did.
As of now, I haven't seen an official 3-dimensional implementation, but I've helped another user generalize the algorithm to multiple dimensions.
Simplified Meshes (Graphs)
Another way to get rid of nodes during the search is to remove them before the search. For example, do you really need nodes in wide-open areas where you can trust a much more stupid AI to find its way? If you are building levels that don't change, create a script that parses them into the simplest grid which only contains important nodes.
This is actually called 'offline pathfinding'; basically finding ways to calculate paths before you need to find them. If your level will remain the same, running the script for a few minutes each time you update the level will easily cut 90% of the time you pathfind. After all, you've done most of the work before it became urgent. It's like trying to find your way around a new city compared to one you grew up in; knowing the landmarks means you don't really need a map.
Similar approaches to the 'symmetry-breaking' that Jump Point Search uses were introduced by Daniel Harabor, the creator of the algorithm. They are mentioned in one of his lectures, and allow you to preprocess the level to store only jump-points in your pathfinding mesh.
Clever Heuristics
Many academic papers state that A*'s cost function is f(x) = g(x) + h(x), which doesn't make it obvious that you may use other functions, multiply the weight of the cost functions, and even implement heatmaps of territory or recent deaths as functions. These may create sub-optimal paths, but they greatly improve the intelligence of your search. Who cares about the shortest path when your opponent has a choke point on it and has been easily dispatching anybody travelling through it? Better to be certain the AI can reach the goal safely than to let it be stupid.
For example, you may want to prevent the algorithm from letting enemies access secret areas so that they avoid revealing them to the player, and so that they AI seems to be unaware of them. All you need to achieve this is a uniform cost function for any point within those 'off-limits' regions. In a game like this, enemies would simply give up on hunting the player after the path grew too costly. Another cool option is to 'scent' regions the player has been recently (by temporarily increasing the cost of unvisited locations because many algorithms dislike negative costs).
If you know what places you won't need to search, but can't implement in your algorithm's logic, a simple increase to their cost will prevent unnecessary searching. There's a lot of ways to take advantage of heuristics to simplify and inform your pathfinding, but your biggest gains will come from Jump Point Search.
EDIT: Jump Point Search implicitly selects pathfinding direction using the same heuristics as A*, so you may be able to implement heuristics to a small degree, but their cost function won't be the cost of a node, but rather, the cost of traveling between the two nodes. (A* generally searches adjacent nodes, so the distinction between a node's cost and the cost of traveling to it tends to break down.)
Summary
Although octrees/quad-trees/b-trees can be useful in collision-detection, they aren't as applicable to searches because they section a graph based on its coordinates; not on its connections. Layering your graph (mesh in your vocabulary) into super graphs (regions) is a more effective solution.
Hopefully I've covered anything you'll find useful.
Good luck!

How to break a geometry into blocks?

I am certain there is already some algorithm that does what I need, but I am not sure what phrase to Google, or what is the algorithm category.
Here is my problem: I have a polyhedron made up by several contacting blocks (hyperslabs), i. e. the edges are axis aligned and the angles between edges are 90°. There may be holes inside the polyhedron.
I want to break up this concave polyhedron in as little convex rectangular axis-aligned whole blocks are possible (if the original polyhedron is convex and has no holes, then it is already such a block, and therefore, the solution). To illustrate, some 2-D images I made (but I need the solution for 3-D, and preferably, N-D):
I have this geometry:
One possible breakup into blocks is this:
But the one I want is this (with as few blocks as possible):
I have the impression that an exact algorithm may be too expensive (is this problem NP-hard?), so an approximate algorithm is suitable.
One detail that maybe make the problem easier, so that there could be a more appropriated/specialized algorithm for it is that all edges have sizes multiple of some fixed value (you may think all edges sizes are integer numbers, or that the geometry is made up by uniform tiny squares, or voxels).
Background: this is the structured grid discretization of a PDE domain.
What algorithm can solve this problem? What class of algorithms should I
search for?
Update: Before you upvote that answer, I want to point out that my answer is slightly off-topic. The original poster have a question about the decomposition of a polyhedron with faces that are axis-aligned. Given such kind of polyhedron, the question is to decompose it into convex parts. And the question is in 3D, possibly nD. My answer is about the decomposition of a general polyhedron. So when I give an answer with a given implementation, that answer applies to the special case of polyhedron axis-aligned, but it might be that there exists a better implementation for axis-aligned polyhedron. And when my answer says that a problem for generic polyhedron is NP-complete, it might be that there exists a polynomial solution for the special case of axis-aligned polyhedron. I do not know.
Now here is my (slightly off-topic) answer, below the horizontal rule...
The CGAL C++ library has an algorithm that, given a 2D polygon, can compute the optimal convex decomposition of that polygon. The method is mentioned in the part 2D Polygon Partitioning of the manual. The method is named CGAL::optimal_convex_partition_2. I quote the manual:
This function provides an implementation of Greene's dynamic programming algorithm for optimal partitioning [2]. This algorithm requires O(n4) time and O(n3) space in the worst case.
In the bibliography of that CGAL chapter, the article [2] is:
[2] Daniel H. Greene. The decomposition of polygons into convex parts. In Franco P. Preparata, editor, Computational Geometry, volume 1 of Adv. Comput. Res., pages 235–259. JAI Press, Greenwich, Conn., 1983.
It seems to be exactly what you are looking for.
Note that the same chapter of the CGAL manual also mention an approximation, hence not optimal, that run in O(n): CGAL::approx_convex_partition_2.
Edit, about the 3D case:
In 3D, CGAL has another chapter about Convex Decomposition of Polyhedra. The second paragraph of the chapter says "this problem is known to be NP-hard [1]". The reference [1] is:
[1] Bernard Chazelle. Convex partitions of polyhedra: a lower bound and worst-case optimal algorithm. SIAM J. Comput., 13:488–507, 1984.
CGAL has a method CGAL::convex_decomposition_3 that computes a non-optimal decomposition.
I have the feeling your problem is NP-hard. I suggest a first step might be to break the figure into sub-rectangles along all hyperplanes. So in your example there would be three hyperplanes (lines) and four resulting rectangles. Then the problem becomes one of recombining rectangles into larger rectangles to minimize the final number of rectangles. Maybe 0-1 integer programming?
I think dynamic programming might be your friend.
The first step I see is to divide the polyhedron into a trivial collection of blocks such that every possible face is available (i.e. slice and dice it into the smallest pieces possible). This should be trivial because everything is an axis aligned box, so k-tree like solutions should be sufficient.
This seems reasonable because I can look at its cost. The cost of doing this is that I "forget" the original configuration of hyperslabs, choosing to replace it with a new set of hyperslabs. The only way this could lead me astray is if the original configuration had something to offer for the solution. Given that you want an "optimal" solution for all configurations, we have to assume that the original structure isn't very helpful. I don't know if it can be proven that this original information is useless, but I'm going to make that assumption in this answer.
The problem has now been reduced to a graph problem similar to a constrained spanning forest problem. I think the most natural way to view the problem is to think of it as a graph coloring problem (as long as you can avoid confusing it with the more famous graph coloring problem of trying to color a map without two states of the same color sharing a border). I have a graph of nodes (small blocks), each of which I wish to assign a color (which will eventually be the "hyperslab" which covers that block). I have the constraint that I must assign colors in hyperslab shapes.
Now a key observation is that not all possibilities must be considered. Take the final colored graph we want to see. We can partition this graph in any way we please by breaking any hyperslab which crosses the partition into two pieces. However, not every partition is meaningful. The only partitions that make sense are axis aligned cuts, which always break a hyperslab into two hyperslabs (as opposed to any more complicated shape which could occur if the cut was not axis aligned).
Now this cut is the reverse of the problem we're really trying to solve. That cutting is actually the thing we did in the first step. While we want to find the optimal merging algorithm, undoing those cuts. However, this shows a key feature we will use in dynamic programming: the only features that matter for merging are on the exposed surface of a cut. Once we find the optimal way of forming the central region, it generally doesn't play a part in the algorithm.
So let's start by building a collection of hyperslab-spaces, which can define not just a plain hyperslab, but any configuration of hyperslabs such as those with holes. Each hyperslab-space records:
The number of leaf hyperslabs contained within it (this is the number we are eventually going to try to minimize)
The internal configuration of hyperslabs.
A map of the surface of the hyperslab-space, which can be used for merging.
We then define a "merge" rule to turn two or more adjacent hyperslab-spaces into one:
Hyperslab-spaces may only be combined into new hyperslab-spaces (so you need to combine enough pieces to create a new hyperslab, not some more exotic shape)
Merges are done simply by comparing the surfaces. If there are features with matching dimensionalities, they are merged (because it is trivial to show that, if the features match, it is always better to merge hyperslabs than not to)
Now this is enough to solve the problem with brute force. The solution will be NP-complete for certain. However, we can add an additional rule which will drop this cost dramatically: "One hyperslab-space is deemed 'better' than another if they cover the same space, and have exactly the same features on their surface. In this case, the one with fewer hyperslabs inside it is the better choice."
Now the idea here is that, early on in the algorithm, you will have to keep track of all sorts of combinations, just in case they are the most useful. However, as the merging algorithm makes things bigger and bigger, it will become less likely that internal details will be exposed on the surface of the hyperslab-space. Consider
+===+===+===+---+---+---+---+
| : : A | X : : : :
+---+---+---+---+---+---+---+
| : : B | Y : : : :
+---+---+---+---+---+---+---+
| : : | : : : :
+===+===+===+ +---+---+---+
Take a look at the left side box, which I have taken the liberty of marking in stronger lines. When it comes to merging boxes with the rest of the world, the AB:XY surface is all that matters. As such, there are only a handful of merge patterns which can occur at this surface
No merges possible
A:X allows merging, but B:Y does not
B:Y allows merging, but A:X does not
Both A:X and B:Y allow merging (two independent merges)
We can merge a larger square, AB:XY
There are many ways to cover the 3x3 square (at least a few dozen). However, we only need to remember the best way to achieve each of those merge processes. Thus once we reach this point in the dynamic programming, we can forget about all of the other combinations that can occur, and only focus on the best way to achieve each set of surface features.
In fact, this sets up the problem for an easy greedy algorithm which explores whichever merges provide the best promise for decreasing the number of hyperslabs, always remembering the best way to achieve a given set of surface features. When the algorithm is done merging, whatever that final hyperslab-space contains is the optimal layout.
I don't know if it is provable, but my gut instinct thinks that this will be an O(n^d) algorithm where d is the number of dimensions. I think the worst case solution for this would be a collection of hyperslabs which, when put together, forms one big hyperslab. In this case, I believe the algorithm will eventually work its way into the reverse of a k-tree algorithm. Again, no proof is given... it's just my gut instinct.
You can try a constrained delaunay triangulation. It gives very few triangles.
Are you able to determine the equations for each line?
If so, maybe you can get the intersection (points) between those lines. Then if you take one axis, and start to look for a value which has more than two points (sharing this value) then you should "draw" a line. (At the beginning of the sweep there will be zero points, then two (your first pair) and when you find more than two points, you will be able to determine which points are of the first polygon and which are of the second one.
Eg, if you have those lines:
verticals (red):
x = 0, x = 2, x = 5
horizontals (yellow):
y = 0, y = 2, y = 3, y = 5
and you start to sweep through of X axis, you will get p1 and p2, (and we know to which line-equation they belong ) then you will get p3,p4,p5 and p6 !! So here you can check which of those points share the same line of p1 and p2. In this case p4 and p5. So your first new polygon is p1,p2,p4,p5.
Now we save the 'new' pair of points (p3, p6) and continue with the sweep until the next points. Here we have p7,p8,p9 and p10, looking for the points which share the line of the previous points (p3 and p6) and we get p7 and p10. Those are the points of your second polygon.
When we repeat the exercise for the Y axis, we will get two points (p3,p7) and then just three (p1,p2,p8) ! On this case we should use the farest point (p8) in the same line of the new discovered point.
As we are using lines equations and points 2 or more dimensions, the procedure should be very similar
ps, sorry for my english :S
I hope this helps :)

efficient algorithm to find nearest point in a graph that does not have a known equation

I'm asking this questions out of curiostity, since my quick and dirty implementation seems to be good enough. However I'm curious what a better implementation would be.
I have a graph of real world data. There are no duplicate X values and the X value increments at a consistant rate across the graph, but Y data is based off of real world output. I want to find the nearest point on the graph from an arbitrary given point P programmatically. I'm trying to find an efficient (ie fast) algorithm for doing this. I don't need the the exact closest point, I can settle for a point that is 'nearly' the closest point.
The obvious lazy solution is to increment through every single point in the graph, calculate the distance, and then find the minimum of the distance. This however could theoretically be slow for large graphs; too slow for what I want.
Since I only need an approximate closest point I imagine the ideal fastest equation would involve generating a best fit line and using that line to calculate where the point should be in real time; but that sounds like a potential mathematical headache I'm not about to take on.
My solution is a hack which works only because I assume my point P isn't arbitrary, namely I assume that P will usually be close to my graph line and when that happens I can cross out the distant X values from consideration. I calculating how close the point on the line that shares the X coordinate with P is and use the distance between that point and P to calculate the largest/smallest X value that could possible be closer points.
I can't help but feel there should be a faster algorithm then my solution (which is only useful because I assume 99% of the time my point P will be a point close to the line already). I tried googling for better algorithms but found so many algorithms that didn't quite fit that it was hard to find what I was looking for amongst all the clutter of inappropriate algorithms. So, does anyone here have a suggested algorithm that would be more efficient? Keep in mind I don't need a full algorithm since what I have works for my needs, I'm just curious what the proper solution would have been.
If you store the [x,y] points in a quadtree you'll be able to find the closest one quickly (something like O(log n)). I think that's the best you can do without making assumptions about where the point is going to be. Rather than repeat the algorithm here have a look at this link.
Your solution is pretty good, by examining how the points vary in y couldn't you calculate a bound for the number of points along the x axis you need to examine instead of using an arbitrary one.
Let's say your point P=(x,y) and your real-world data is a function y=f(x)
Step 1: Calculate r=|f(x)-y|.
Step 2: Find points in the interval I=(x-r,x+r)
Step 3: Find the closest point in I to P.
If you can use a data structure, some common data structures for spacial searching (including nearest neighbour) are...
quad-tree (and octree etc).
kd-tree
bsp tree (only practical for a static set of points).
r-tree
The r-tree comes in a number of variants. It's very closely related to the B+ tree, but with (depending on the variant) different orderings on the items (points) in the leaf nodes.
The Hilbert R tree uses a strict ordering of points based on the Hilbert curve. The Hilbert curve (or rather a generalization of it) is very good at ordering multi-dimensional data so that nearby points in space are usually nearby in the linear ordering.
In principle, the Hilbert ordering could be applied by sorting a simple array of points. The natural clustering in this would mean that a search would usually only need to search a few fairly-short spans in the array - with the complication being that you need to work out which spans they are.
I used to have a link for a good paper on doing the Hilbert curve ordering calculations, but I've lost it. An ordering based on Gray codes would be simpler, but not quite as efficient at clustering. In fact, there's a deep connection between Gray codes and Hilbert curves - that paper I've lost uses Gray code related functions quite a bit.
EDIT - I found that link - http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.133.7490

Resources