Finding the nearest neighboring edge in a Quadtree - algorithm

I have stored a set of Points in a Quadtree. Once the quadtree has been created with the points, I then add all the edges to the quadtree such that each edge gets stored in ALL the leaf nodes that it crosses, begins or ends at.
Now, I have a point, say A, and I need to find the closest edge to it. In my current algorithm, I recurse to the leaf node that contains this point A and find the distances between A and all line segments contained in this leaf node.
This may look like the right solution, but it isn't: to give an accurate answer I also have to compare edges stored in adjacent nodes.
My questions are:
a) How do I go about extracting the closest edges?
b) Should I just compare all edges contained in the parent node (of the leaf containing the point of interest)?
(Intuitively, though, I'm fairly sure that putting a hard limit on the number of levels one must go up to find the nearest edge is incorrect.)

Every node of the quadtree represents a square cell in the plane (where some sides may be at infinity), and you can calculate the minimum distance between that cell and the target point A. Note that the distance is 0 for cells containing A.
Starting from the root node, calculate the distance from A to each of its child cells (nodes) and insert each child into a min-heap keyed by that distance.
Iteratively, take the nearest cell from the top of the heap and repeat the process. When you reach a leaf node, just search for the nearest edge to A inside it using brute force.
Once the distance of the cell at the top of the heap is greater than the distance of the nearest edge found so far, you can stop the search.
Update: BTW, this is actually the general approach for searching for anything using a quad-tree or a kd-tree or probably most spatial structures.
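The heap-based search described above can be sketched as follows. This is only a minimal illustration under assumptions: the `Node` class, its `bounds`/`children`/`edges` fields, and the helper names are hypothetical and not taken from the question's code; leaves are assumed to hold edges as pairs of (x, y) endpoints.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical quadtree node: bounds = (xmin, ymin, xmax, ymax).
    bounds: tuple
    children: list = field(default_factory=list)  # empty for a leaf
    edges: list = field(default_factory=list)     # ((x1, y1), (x2, y2)) pairs


def rect_distance(p, b):
    """Minimum distance from point p to rectangle b (0 if p is inside)."""
    dx = max(b[0] - p[0], 0.0, p[0] - b[2])
    dy = max(b[1] - p[1], 0.0, p[1] - b[3])
    return math.hypot(dx, dy)


def segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    vx, vy = b[0] - a[0], b[1] - a[1]
    len2 = vx * vx + vy * vy
    t = 0.0
    if len2 > 0:
        # Project p onto the segment and clamp to its endpoints.
        t = max(0.0, min(1.0, ((p[0] - a[0]) * vx + (p[1] - a[1]) * vy) / len2))
    return math.hypot(p[0] - (a[0] + t * vx), p[1] - (a[1] + t * vy))


def nearest_edge(root, p):
    """Best-first search: visit cells in order of their distance to p."""
    best, best_d = None, math.inf
    tie = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = [(rect_distance(p, root.bounds), next(tie), root)]
    while heap:
        d, _, node = heapq.heappop(heap)
        if d > best_d:
            break  # no remaining cell can contain a closer edge
        if node.children:
            for child in node.children:
                heapq.heappush(
                    heap, (rect_distance(p, child.bounds), next(tie), child))
        else:
            for edge in node.edges:  # brute force inside the leaf
                de = segment_distance(p, *edge)
                if de < best_d:
                    best, best_d = edge, de
    return best, best_d
```

Because neighbouring leaves are visited whenever their cell is closer than the best edge so far, an edge just across a cell boundary is found without any fixed "go up k levels" rule.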

You can try a Voronoi diagram and look for edges inside the Voronoi cell only.

Related

Nearest neighbor search in Octree

How does a NN algorithm work on an octree? I have searched for a good explanation, but most of the time people just say to use a KD-tree instead. I can't do that; I need to visualize the NN algorithm on an octree step by step.
As far as I can tell, the most logical way would be to:
1) Find the sub-octant where the point belongs to.
2) Calculate distance to the nearest point in that octant
3) Check if there is any overlap with neighboring octants within that distance
4) If a closer point is found, recalculate the search distance.
5) Repeat until all possible octants have been traversed
6) Return the closest point
But I can't think of a good step-by-step visualization for this one.
To find the point closest to a search point, or to get list of points in order of increasing distance, you can use a priority queue that can hold both points and internal nodes of the tree, which lets you remove them in order of distance.
For points (leaves), the distance is just the distance of the point from the search point. For internal nodes (octants), the distance is the smallest distance from the search point to any point that could possibly be in the octant.
Now, to search, you just put the root of the tree in the priority queue, and repeat:
1) Remove the head of the priority queue.
2) If the head of the queue was a point, then it is the closest point in the tree that you have not yet returned. You know this because if any internal node could possibly have a closer point, then it would have been returned first from the priority queue.
3) If the head of the queue was an internal node, then put its children back into the queue.
This will produce all the points in the tree in order of increasing distance from the search point. The same algorithm works for KD trees as well.
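A minimal sketch of that priority-queue loop is shown below, written as a generator so that points come out lazily in increasing distance. The `Octant` class and its fields are assumptions made for illustration only; any octree whose nodes expose an axis-aligned box, child nodes, and stored points would work the same way.

```python
import heapq
import itertools
import math


class Octant:
    """Hypothetical octree node: a box [lo, hi] with children and/or points."""

    def __init__(self, lo, hi, children=(), points=()):
        self.lo, self.hi = lo, hi
        self.children = list(children)
        self.points = list(points)


def box_distance(q, lo, hi):
    """Smallest distance from q to any location inside the box."""
    return math.sqrt(sum(max(l - c, 0.0, c - h) ** 2
                         for c, l, h in zip(q, lo, hi)))


def points_by_distance(root, q):
    """Yield (distance, point) pairs in order of increasing distance from q."""
    tie = itertools.count()  # keeps heap entries comparable on equal distance
    heap = [(box_distance(q, root.lo, root.hi), next(tie), False, root)]
    while heap:
        d, _, is_point, item = heapq.heappop(heap)
        if is_point:
            # Nothing left in the queue can be closer than this point.
            yield d, item
            continue
        for child in item.children:
            heapq.heappush(
                heap,
                (box_distance(q, child.lo, child.hi), next(tie), False, child))
        for pt in item.points:
            heapq.heappush(heap, (math.dist(q, pt), next(tie), True, pt))
```

Taking only the first yielded item gives the nearest neighbour; taking the first k gives the k nearest without any change to the loop.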

How to find certain sized clusters of points

Given a list of points, I'd like to find all "clusters" of N points. My definition of cluster is loose and can be adjusted to whatever allows an easiest solution: it could be N points within a certain size circle or N points that are all within a distance of each other or something else that makes sense. Heuristics are acceptable.
Where N=2, and we're just looking for all point pairs that are close together, it's pretty easy to do ~efficiently with a k-d tree (e.g. recursively break the space into octants or something, where each area is a different branch on the tree and then for each point, compare it to other points with the same parent (if near the edge of an area, check up the appropriate number of levels as well)). I recognize that inductively with a solution for N=N', I can find solution for N=N'+1 by taking the intersections between different N' solutions, but that's super inefficient.
Anyone know a decent way to go about this?
You start by calculating the Euclidean minimum spanning tree; CGAL, for example, can do this. From there the precise algorithm depends on your specific requirements, but it goes roughly like this: sort the edges of that tree by length, then delete edges, starting with the longest one. Because the graph is a tree, each deleted edge splits it into two sub-trees. Check whether each newly created sub-tree forms a cluster according to your conditions. If not, continue deleting edges.
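A rough sketch of the edge-deletion idea follows. It is an illustration under assumptions, not the answer's exact method: the spanning tree is built with a plain O(n^2) Prim's algorithm instead of CGAL, and `is_cluster` and `min_size` are hypothetical parameters standing in for "your conditions".

```python
import math


def euclidean_mst(points):
    """Prim's algorithm on the complete graph; returns (length, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def components(nodes, edges):
    """Connected components of `nodes` under `edges`, found by DFS."""
    adj = {v: [] for v in nodes}
    for _, i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, out = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, stack = [], [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        out.append(sorted(comp))
    return out


def find_clusters(points, is_cluster, min_size):
    """Split the MST at its longest edge until each piece passes is_cluster."""
    clusters = []
    work = [(list(range(len(points))), euclidean_mst(points))]
    while work:
        nodes, edges = work.pop()
        if len(nodes) < min_size:
            continue  # too small to ever be a cluster
        if is_cluster([points[i] for i in nodes]):
            clusters.append(nodes)
            continue
        if not edges:
            continue
        longest = max(edges)
        rest = [e for e in edges if e is not longest]
        for comp in components(nodes, rest):
            members = set(comp)
            work.append((comp, [e for e in rest if e[1] in members]))
    return clusters
```

For example, `is_cluster` could test that the largest pairwise distance in the group stays under some diameter; groups that fail are split again, and groups that shrink below `min_size` are dropped.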

Making a fully connected graph using a distance metric

Say I have a series of several thousand nodes. For each pair of nodes I have a distance metric. This distance metric could be a physical distance ( say x,y coordinates for every node ) or other things that make nodes similar.
Each node can connect to up to N other nodes, where N is small - say 6.
How can I construct a graph that is fully connected ( e.g. I can travel between any two nodes following graph edges ) while minimizing the total distance between all graph nodes.
That is I don't want a graph where the total distance for any traversal is minimized, but where for any node the total distance of all the links from that node is minimized.
I don't need an absolute minimum - as I think that is likely NP complete - but a relatively efficient method of getting a graph that is close to the true absolute minimum.
I'd suggest a greedy heuristic where you select edges until all vertices have 6 neighbors. For example, start with a minimum spanning tree. Then, for some random pairs of vertices, find a shortest path between them that uses at most one of the unselected edges (using Dijkstra's algorithm on two copies of the graph with the selected edges, connected by the unselected edges). Then select the edge that yielded in total the largest decrease of distance.
You can use a kernel to create edges only for nodes under a certain cutoff distance.
If you want non-weighted edges, you could simply use a basic cutoff to start with: add an edge between 2 points if d(v1,v2) < R.
You can tweak your cutoff R to get the right average number of edges between nodes.
If you want a weighted graph, the preferred kernel is often the gaussian one, with
K(x,y) = e^(-d(x,y)^2/d_0)
with a cutoff to keep away nodes with too low values. d_0 is the parameter to tweak to get the weights that suits you best.
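As a small illustration of both variants, here is a sketch that builds the edge set from a list of coordinates. The function names and the `min_weight` threshold parameter are assumptions made for this example; the distance d is taken to be Euclidean, although any metric could be substituted.

```python
import math


def cutoff_graph(points, radius):
    """Unweighted edges: connect i and j when d(points[i], points[j]) < radius."""
    return {(i, j)
            for i in range(len(points))
            for j in range(i + 1, len(points))
            if math.dist(points[i], points[j]) < radius}


def gaussian_kernel_graph(points, d0, min_weight):
    """Weighted edges with K(x, y) = exp(-d(x, y)^2 / d0), dropping weak ones."""
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            w = math.exp(-math.dist(points[i], points[j]) ** 2 / d0)
            if w >= min_weight:  # cutoff that discards near-zero weights
                edges[(i, j)] = w
    return edges
```

Both functions compare every pair, which is fine for a few thousand nodes; note that neither one enforces a maximum degree or guarantees that the result is connected, so those would need a separate check.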
While looking for references, I found this blog post that I didn't know about, but that seems very explanatory, with many more details: http://charlesmartin14.wordpress.com/2012/10/09/spectral-clustering/
This method is used in graph-based semi-supervised machine learning tasks, for instance in image recognition, where you tag a small part of an object, and have an efficient label propagation to identify the whole object.
You can search on google for : semi supervised learning with graph

Minimum manhattan distance with certain blocked points

The minimum Manhattan distance between any two points in the cartesian plane is the sum of the absolute differences of their respective X and Y coordinates. For example, if we have two points (X,Y) and (U,V), then the distance is ABS(X-U) + ABS(Y-V). Now, how should I determine the minimum distance between several pairs of points, moving only parallel to the coordinate axes, such that certain given points are not visited on the selected path? I need a very efficient algorithm, because the number of avoided points can range up to 10000, with the same range for the number of queries. The absolute value of each coordinate is less than 50000. I would be given the set of points to be avoided in the beginning, so I might use some offline algorithm and/or precomputation.
As an example, the Manhattan distance between (0,0) and (1,1) is 2 from either path (0,0)->(1,0)->(1,1) or (0,0)->(0,1)->(1,1). But, if we are given the condition that (1,0) and (0,1) cannot be visited, the minimum distance increases to 6. One such path would then be: (0,0)->(0,-1)->(1,-1)->(2,-1)->(2,0)->(2,1)->(1,1).
This problem can be solved by breadth-first search, which is the standard approach for shortest paths on an unweighted grid. (Depth-first search will tell you whether a path exists, but it does not guarantee the shortest one.) You can also use the A* algorithm, which may give better results in practice, but in theory (worst case) is no better than BFS.
The reason is that your problem reduces to solving a maze: you can have so many obstacles that the grid essentially becomes a maze, and in the worst case a graph search has to explore all of it. See Maze Solving Algorithms (wikipedia) for more information.
My final recommendation: use the A* algorithm and hope for the best.
You are not understanding the solutions here or we are not understanding the problem:
1) You have a cartesian plane. Therefore, every node has exactly 4 adjacent nodes, given by x+/-1, y+/-1 (ignoring the edges)
2) Do a BFS (or DFS, A*). All you can traverse is x/y +/- 1. Prestore your 10000 obstacles and just check on demand whether the node at x/y +/- 1 is visitable. You don't need a real graph object.
If it's too slow, you said you can do an offline calculation. A grid of roughly 10^10 cells (coordinates from -50000 to 50000 on each axis) needs only about 1.25GB at one bit per cell to store an indexed obstacle lookup table. Leave the algorithm running?
Where am I going wrong?
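For reference, the BFS with an on-demand obstacle check might look like the sketch below. The function name and the `limit` parameter are assumptions for illustration; `limit` bounds the search to coordinates with absolute value at most that number, which keeps the search finite.

```python
from collections import deque


def blocked_manhattan_distance(src, dst, blocked, limit=50000):
    """Shortest axis-parallel path length from src to dst avoiding `blocked`.

    Returns None when dst cannot be reached inside the bounded grid.
    """
    blocked = set(blocked)  # O(1) membership test, no explicit graph needed
    if src in blocked or dst in blocked:
        return None
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x, y = queue.popleft()
        if (x, y) == dst:
            return dist[(x, y)]
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if abs(nxt[0]) > limit or abs(nxt[1]) > limit:
                continue  # outside the allowed coordinate range
            if nxt in blocked or nxt in dist:
                continue
            dist[nxt] = dist[(x, y)] + 1
            queue.append(nxt)
    return None
```

On the example from the question, `blocked_manhattan_distance((0, 0), (1, 1), [(1, 0), (0, 1)], limit=10)` returns 6. With the full coordinate range a single query can visit a very large number of cells, so this is a baseline rather than an answer to the efficiency requirement.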

Given a source node, dest node, and intermediate nodes, how does one detect if the shortest Manhattan Distance is blocked?

Here is the full title I would have posted, but it happens to be too long:
Given a source node, dest node, and intermediate nodes, how does one detect if the shortest Manhattan Distance is blocked by the intermediate nodes?
I've drawn a diagram to make it more clear. On the left side, "u" is the source node and "v" is the destination node. The nodes labeled 1 through 6 are the intermediate nodes. The shortest Manhattan Distance from u -> v would be 12, but the intermediate nodes form a wall blocking it. The diagram on the right, with u' being the source, and v' being the destination, shows that the intermediate nodes 1 through 5 do not block the shortest Manhattan distance from u' to v'.
I'm trying to find an algorithm that won't require me to actually do a graph search (e.g. BFS), because the distance between u and v could potentially be very large.
If all you want to do is detect whether the shortest path (one consisting of moves that monotonically take you in the right direction) is blocked, then you are trying to check whether the blocking nodes cut the rectangle whose corners are given by the source and destination node into two different regions that are disconnected. If no shortest path from the source to the destination is possible, then every path must have some point in it that's blocked.
Let's suppose for simplicity that your start point is below and to the left of the destination point. In that case, find, in O(n), all of the other points that are obstacle points contained in the bounding box holding the start and end point. You now want to see if there is some subset of those nodes that cuts the rectangle into two pieces, one containing the bottom-left corner and one containing the top-right corner. This is possible iff there is a path of the blocking nodes from the left side to the right side, from the left side to the bottom side, from the top side to the right side, or from the top side to the bottom side. Thus we just need to check if any of these are possible.
Fortunately, this can be done efficiently by modeling the problem as a graph search in a graph that has size O(n), where n is the number of blocking points, and has nothing to do with the size of the bounding box. That is, no matter how far apart the test points are, the size of the graph to search depends solely on the number of blocking points.
The graph you want to construct has two parts. First, build a graph where each blocking point is connected to each other blocking point in the 3x3 square surrounding it. These edges link together blocking points that could be part of the same barrier, in that no path from the source to the target could pass between two blocking points joined by an edge. Now, add in four new nodes representing the top wall, left wall, right wall, and bottom wall and connect them to each node that is adjacent to the appropriate wall. That way, for example, a path from the left wall node to the right wall node would represent a series of blocking nodes that make it impossible to get from the bottom-left node to the top-right node.
This graph has size O(n), where n is the number of blocking nodes, since there are O(n) nodes and each node can have at most 12 edges - one for each of the 8 neighbors and potentially one for each of the four walls. You could construct it in at worst quadratic time by scanning over each node and, for each other node, seeing if they are adjacent. There is probably a better way to do this, but nothing comes to me at the moment.
Now that you have the graph, for each of the pairs of walls that, if connected, would disconnect the rectangle, run a graph search in this graph between those two wall nodes. If a path exists, report that the shortest path is blocked. If not, report that some shortest path is unblocked. These searches could be done with a simple DFS or, since you're running multiple searches and just want to know whether the wall nodes are connected, by running a connected components algorithm once (the graph is undirected) and checking whether any pair of important nodes ends up in the same component. Either approach takes time O(n).
Thus the time to solve this problem is at most O(n^2), with the bottleneck being the time required to construct the graph.
Hope this helps!
Here's my idea:
I'll describe the case where the destination is above and to the right of the source; for other cases, rotate. (For the simple cases where the nodes share an x or y coordinate, just check whether there's a blocking node directly between them.)
Take the matrix with the source and destination in opposite corners. Now, one column at a time from left to right, and bottom-up inside each column, mark blocked nodes. A node B is blocked iff either of the following is true:
1) B is an intermediate node
2) B is not the source, and the node to the left of B and the node below B are each either blocked (both were already checked, given the order of processing) or outside the bounds of the matrix
In the end, if destination is blocked, there's no free shortest path.
The time required is O(m*n), where m, n are the lengths of sides of the matrix. So when you'll only have several intermediate nodes, templatetypedef's solution may be more appropriate.
EDIT: Got it a little wrong previously, now I hope I didn't miss anything
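The column-by-column marking can be sketched as below. This is a minimal version under assumptions: the function name is made up, the "rotate" step is done by mirroring coordinates so the destination ends up above and to the right, and the source is treated explicitly as unblocked (it has no left or lower neighbour inside the matrix).

```python
def shortest_path_blocked(src, dst, blockers):
    """True if every shortest (monotone) path from src to dst hits a blocker."""
    sx, sy = src
    tx, ty = dst
    # Mirror the axes so the destination is up and to the right of the source.
    fx = 1 if tx >= sx else -1
    fy = 1 if ty >= sy else -1
    w, h = (tx - sx) * fx, (ty - sy) * fy
    blocked = set()
    for bx, by in blockers:
        x, y = (bx - sx) * fx, (by - sy) * fy
        if 0 <= x <= w and 0 <= y <= h:
            blocked.add((x, y))  # only blockers inside the matrix matter
    if (0, 0) in blocked or (w, h) in blocked:
        return True
    prev = None  # reachability flags of the previous (left) column
    for x in range(w + 1):
        col = [False] * (h + 1)
        for y in range(h + 1):
            if (x, y) in blocked:
                continue
            if x == 0 and y == 0:
                col[y] = True  # the source is free by definition
            else:
                from_left = prev[y] if x > 0 else False
                from_below = col[y - 1] if y > 0 else False
                col[y] = from_left or from_below
        prev = col
    return not prev[h]
```

Only two columns are held in memory at a time, so the space is O(n) for a matrix with n rows, while the running time stays O(m*n). Note that blockers do not have to touch each other to block every shortest path: two staggered walls, one reaching in from the left side and one from the right, can force a detour that moves left at some point.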

Resources