Nearest neighbor search in Octree

Nearest neighbor search in Octree - algorithm

How does a NN algorithm work on an octree? I have searched for a good explanation, but most of the time people just say used KD-tree instead. I cant do it, i need to visualize NN algorithm on octree step-by-step.
As i can think the most logical way would be to:
1) Find the sub-octant where the point belongs to.
2) Calculate distance to the nearest point in that octant
3) Check if there is any overlap with neighboring octants within that distance
4) If a closer point is found, recalculate the search distance.
5) Repeat until all possible octants have been traversed
6) Return the closest point
But i cant think up a good step by step visualization for this one.

To find the point closest to a search point, or to get list of points in order of increasing distance, you can use a priority queue that can hold both points and internal nodes of the tree, which lets you remove them in order of distance.
For points (leaves), the distance is just the distance of the point from the search point. For internal nodes (octants), the distance is the smallest distance from the search point to any point that could possibly be in the octant.
Now, to search, you just put the root of the tree in the priority queue, and repeat:
Remove the head of the priority queue;
If the head of the queue was a point, then it is the closest point in the tree that you have not yet returned. You know this, because if any internal node could possibly have a closer point, then it would have been returned first from the priority queue;
If the head of the queue was an internal node, then put its children back into the queue
This will produce all the points in the tree in order of increasing distance from the search point. The same algorithm works for KD trees as well.

Related

Finding the nearest neighboring edge in a Quadtree

I have stored a set of Points in a Quadtree. Once the quadtree has been created with the points, I then add all the edges to the quadtree such that each edge gets stored in ALL the leaf nodes that it crosses, begins or ends at.
Now, I have a point, say A, and I need to find the closest edge to it. In my current Algorithm, I recurse to the leaf node that contains this Point A and find the distances between A and all line segments contained by this leaf node.
Now this may look like the right solution but it isn't as I have to compare edges that are there in adjacent nodes as well to be able to give an accurate answer.
Now my questions are
a)How do I go about extracting the closest edges?
b)Should I just compare all edges contained in the parent(to the point of interest) node?
(But I know for a fact that putting a hard limit on the number of levels one must go up to find the nearest edge is incorrect based on intuition)

Every node on the quad-tree represents a cube in space (where some sides may be at infinitum) and you can calculate the minimum distance between that cube and the target point A. Note that the distance is 0 for cubes containing A.
Starting from the root node you have to calculate the distance for every of its child cubes (nodes) to A and insert it into a min-heap.
Iteratively, you get the nearest cube at the top of heap and repeat the process. When you reach some leaf node, you just search for the nearest edge to A inside it using brute force.
Once the distance of the cube at top of the heap is greater than the distance of the nearest edge found so far, you can stop the search.
Update: BTW, this is actually the general approach for searching for anything using a quad-tree or a kd-tree or probably most spatial structures.

You can try a voronoi diagram and look for edges inside the voronoi cell only.

How to find certain sized clusters of points

Given a list of points, I'd like to find all "clusters" of N points. My definition of cluster is loose and can be adjusted to whatever allows an easiest solution: it could be N points within a certain size circle or N points that are all within a distance of each other or something else that makes sense. Heuristics are acceptable.
Where N=2, and we're just looking for all point pairs that are close together, it's pretty easy to do ~efficiently with a k-d tree (e.g. recursively break the space into octants or something, where each area is a different branch on the tree and then for each point, compare it to other points with the same parent (if near the edge of an area, check up the appropriate number of levels as well)). I recognize that inductively with a solution for N=N', I can find solution for N=N'+1 by taking the intersections between different N' solutions, but that's super inefficient.
Anyone know a decent way to go about this?

You start by calculating the Euclidean minimum spanning tree, e.g CGAL can do this. From there the precise algorithm depends on your specific requirements, but it goes roughly like this: You sort the edges in that graph by length. Then delete edges, starting with the longest one. It's a singly connected graph, so with each deleted edge you split the graph into two sub-graphs. Check each created sub-graph if it forms a cluster according to your conditions. If not, continue deleting edges.

Minimum manhattan distance with certain blocked points

The minimum Manhattan distance between any two points in the cartesian plane is the sum of the absolute differences of the respective X and Y axis. Like, if we have two points (X,Y) and (U,V) then the distance would be: ABS(X-U) + ABS(Y-V). Now, how should I determine the minimum distance between several pairs of points moving only parallel to the coordinate axis such that certain given points need not be visited in the selected path. I need a very efficient algorithm, because the number of avoided points can range up to 10000 with same range for the number of queries. The coordinates of the points would be less than ABS(50000). I would be given the set of points to be avoided in the beginning, so I might use some offline algorithm and/or precomputation.
As an example, the Manhattan distance between (0,0) and (1,1) is 2 from either path (0,0)->(1,0)->(1,1) or (0,0)->(0,1)->(1,1). But, if we are given the condition that (1,0) and (0,1) cannot be visited, the minimum distance increases to 6. One such path would then be: (0,0)->(0,-1)->(1,-1)->(2,-1)->(2,0)->(2,1)->(1,1).

This problem can be solved by breadth-first search or depth-first search, with breadth-first search being the standard approach. You can also use the A* algorithm which may give better results in practice, but in theory (worst case) is no better than BFS.
This is provable because your problem reduces to solving a maze. Obviously you can have so many obstacles that the grid essentially becomes a maze. It is well known that BFS or DFS are the only way to solve mazes. See Maze Solving Algorithms (wikipedia) for more information.
My final recommendation: use the A* algorithm and hope for the best.

You are not understanding the solutions here or we are not understanding the problem:
1) You have a cartesian plane. Therefore, every node has exactly 4 adjacent nodes, given by x+/-1, y+/-1 (ignoring the edges)
2) Do a BFS (or DFS,A*). All you can traverse is x/y +/- 1. Prestore your 10000 obstacles and just check if the node x/y +/-1 is visitable on demand. you don't need a real graph object
If it's too slow, you said you can do an offline calculation - 10^10 only requires 1.25GB to store an indexed obstacle lookup table. leave the algorithm running?
Where am I going wrong?

Given a source node, dest node, and intermediate nodes, how does one detect if the shortest Manhattan Distance is blocked?

Here is the full title I would have posted, but it happens to be too long:
Given a source node, dest node, and intermediate nodes, how does one detect if the shortest Manhattan Distance is blocked by the intermediate nodes?
I've drawn a diagram to make it more clear. On the left side, "u" is the source node and "v" is the destination node. The nodes labeled 1 through 6 are the intermediate nodes. The shortest Manhattan Distance from u -> v would be 12, but the intermediate nodes form a wall blocking it. The diagram on the right, with u' being the source, and v' being the destination, shows that the intermediate nodes 1 through 5 do not block the shortest Manhattan distance from u' to v'.
I'm trying to find an algorithm that won't require me to actually do a graph search (e.g. BFS), because the distance between u and v could potentially be very large.

If all you want to do is detect whether the shortest path (one consisting of moves that monotonically take you in the right direction) is blocked, then you are trying to check whether the blocking nodes cut the rectangle whose corners are given by the source and destination node into two different regions that are disconnected. If no shortest path from the source to the destination is possible, then every path must have some point in it that's blocked.
Let's suppose for simplicity that your start point is below and to the left of the destination point. In that case, find, in O(n), all of the other points that are obstacle points contained in the bounding box holding the start and end point. You now want to see if there is some subset of those nodes that cuts the rectangle into two pieces, one containing the bottom-left corner and one containing the top-right corner. This is possible iff there is a path of the blocking nodes from the left side to the right side, from the left side to the bottom side, from the top side to the right side, or from the top side to the bottom side. Thus we just need to check if any of these are possible.
Fortunately, this can be done efficiently by modeling the problem as a graph search in a graph that has size O(n), where n is the number of blocking points, and has nothing to do with the size of the bounding box. That is, no matter how far apart the test points are, the size of the graph to search depends solely on the number of blocking points.
The graph you want to construct has two parts. First, build a graph where each blocking point is connected to each other blocking point in the 3x3 square surrounding it. These edges link together blocking points that could be part of the same barrier, in that no path from the source to the target could pass between two blocking points joined by an edge. Now, add in four new nodes representing the top wall, left wall, right wall, and bottom wall and connect them to each node that is adjacent to the appropriate wall. That way, for example, a path from the left wall node to the right wall node would represent a series of blocking nodes that make it impossible to get from the bottom-left node to the top-right node.
This graph has size O(n), where n is the number of blocking nodes, since there are O(n) nodes and each node can have at most 12 edges - one for each of the 8 neighbors and potentially one for each of the four walls. You could construct it in at worst quadratic time by scanning over each node and, for each other node, seeing if they are adjacent. There is probably a better way to do this, but nothing comes to me at the moment.
Now that you have the graph, for each of the pairs of walls that, if connected, would disconnect the graph, run a graph search in this graph between those two wall nodes. If a path exists, report that the shortest path is blocked. If not, report that some shortest path is unblocked. These searches could be done with a simple DFS, or since you're running multiple searches and just want to know if they're connected, using a strongly connected components algorithm once and checking if any pair of important nodes are in the same SCC. Either approach takes time O(n).
Thus the time to solve this problem is at most O(n2), with the bottleneck being the time required to construct the graph.
Hope this helps!

Here's my idea:
I'll describe the case when the destination is upper and to right from the source, for other cases, rotate. (For simple cases where the nodes have the same x/y coordinate, just checks whether there's a blocking node directly between them)
Take the matrix with source and destination in corners. Now, a column at a time, from left to right and inside the column, bottom up, mark blocked nodes. A node B is blocked iff any of following is true:
B is an intermediate node
the nodes left to B and bottom from B are both blocked (both were already checked given the order of processing) or outside the bounds of the matrix
In the end, if destination is blocked, there's no free shortest path.
The time required is O(m*n), where m, n are the lengths of sides of the matrix. So when you'll only have several intermediate nodes, templatetypedef's solution may be more appropriate.
EDIT: Got it a little wrong previously, now I hope I didn't miss anything

A* for finding shortest path and avoiding lines as obstacles

I have to get the (shortest)/(an optimal) distance between two points in 2D. I have to avoid shapes that are lines, that may be connected together. Any suggestion on how to represent the nodes I can travel on? I have thought of making a grid, but this doesn't sound very accurate or elegant. I would consider a node as unwalkable if any point of a line is inside a square (with the node being the center of the square).
An example would be going from point A to B.
Is the grid the recommended way to solve this? Thanks A LOT in advance!

I think this is essentially Larsmans' answer made a bit more algorithmic:
Nodes of the graph are the obstacle vertices. Each internal vertex actually represents two nodes: the concave and the convex side.
Push the start node onto the priority queue with a Euclidean heuristic distance.
Pop the top node from the priority queue.
Do a line intersection test from the node to the goal(possibly using ray-tracing data-structure techniques for speedup). If it fails,
Consider a ray from the current node to every other vertex. If there are no intersections between the current node the vertex under consideration, and the vertex is convex from the perspective of the current node, add the vertex to the priority queue, sorted using the accumulated distance in the current node plus the distance from the current node to the vertex plus the heuristic distance.
Return to 2.
You have to do extra pre-processing work if there are things like 'T' junctions in the obstacles and I wouldn't be surprised to discover that it breaks in a number of cases. You might be able to make things faster by only considering the vertices of the connected component that lies between the current node and the goal.
So in your example, after first attempting A,B, you'd push A,8, A,5, A,1, A,11, and A,2. The first nodes of consideration would be A,8, A,1, and A,5, but they can't get out and the nodes they can reach are already pushed on the queue with shorter accumulated distance. A,2 and A,11 will be considered and things will go from there.

I think you can solve this by an A* search on a graph defined as:
Vertices: origin, destination and all endpoints of obstacle edges.
Edges (successor function): any pair of the previous where the corresponding line does not cross any obstacle's edges. A naive implementation would just check for intersection between a potential edge and all obstacle edges. A faster implementation might use some kind of 2-d ray tracing algorithm.
I.e., it's not necessary to discretize the plane into a grid, but without it, the successor function becomes difficult to define.

A grid, and A* running through it is the way to go. All games i am aware of use A* or an adaptation of it for the global route and complement it with steering behaviour for local accuracy. Maybe you were not aware that using steering behaviour will resolve all your concerns about accuracy (search-engine it).
EDIT: i just remember a strategy game that uses flow-field. But it is not main-stream.
BTW: to get a feel for what steering behavior does for your objects take a look at the many youtube-videos about it. Some of them use the term path-finding in a more general sense: including the global algorithm (A*) as well as collision-avoidance, path-smoothing and object-inertia involved in steering-behavior.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio