Algorithm for Octree for nearest neighbor search - algorithm

Problem Statement: To find the nearest GRID ID of each of the particles using Octree.
Fig[1]:
Fig[2]:
I have a system of particles(~6k, movable) for which I need to check which grid point (rigid; in picture) is nearest to. Somebody have suggested me to go for Octree as it is fast(est) for 3D Grids.
Is this the correct Algorithm for recursive Octree to get the nearest grid point of the grid?
Get a input as point P Start coordinate C (first time it [0,0,0])
Start Size = [Sx, Sy, Sz]
Get all 8 mid point Mi = {M1,..,M8} get minimum distance of Mi and P
Say M get start position of M as Cn set size Sn = [Sx/8, Sy/8, Sz/8]
if distance of M and P is less than 2 * (Grid Space G):
5.1. Iterate all the grid points from Cn to Sn
5.2. Print least as result
else
6.1. set Start coordinate as Cn
6.2. set Size as Sn
6.3. Goto 1
Problem: The last iteration eat all speed if the particle is out or nearly on the border as it checks all A x B x C.
Please suggest if you have a better way to solve this problem.

There is no need to use an octree here. Octree is useful for the reverse problem (given a grid point, find the nearest particule) but completely useless here.
Assuming the size of a grid cell is (a, b, c), then the nearest grid point from (x, y, z) is (a*floor(x/a+0.5), b*floor(y/b+0.5), c*floor(z/c+0.5)).

Related

Fast algorithm for testing if every rectangle built using any two points from a list contains a third point from that list

The problem is as it follows:
Given N (N <= 100,000) points by their Cartesian coordinates, test if every rectangle (with an area > 0) built using any two of them contains at least another point from that list either inside the rectangle or on his margin.
I know the algorithm for O(N^2) time complexity. I want to find the solution for O(N * logN). Memory is not a problem.
When I say two points define a rectangle, I mean that those two points are in two of its opposite corners. The sides of the rectangles are parallel to the Cartesian axes.
Here are two examples:
N = 4
Points:
1 1
3 5
2 4
8 8
INCORRECT: (the rectangle (1, 1) - (2, 4) doesn't have any point inside of it or on its margin).
N = 3
Points:
10 9
13 9
10 8
CORRECT.
Sort the points in order of the max of their coordinate pairs, from lowest to highest (O(n*log(n))). Starting with the lower-left point (lowest max coordinate), if the next point in the ordering does not share either the original point's x-value or its y-value (e.g. (1,2) and (5,2) share a y-coordinate of 2, but (1,2) and (2, 1) have neither coordinate in common), the set of points fails the test. Otherwise, move to the next point. If you reach the final point in this way (O(n)), the set is valid. The algorithm overall is O(n*log(n)).
Explanation
Two points that share a coordinate value lie along a line parallel to one of the axes, so there is no rectangle drawn between them (since such a rectangle would have area=0).
For a given point p1, if the next "bigger" point in the ordering, p2, is directly vertical or horizontal from p1 (i.e. it shares a coordinate value), then all points to the upper-right of p2 form rectangles with p1 that include p2, so there are no rectangles in the set that have p1 as the lower-left corner and lack an internal point.
If, however, the next-bigger point is diagonal from p2, then the p1 <-> p2 rectangle has no points from the set inside it, so the set is invalid.
For every point P = (a, b) in the set, search the nearest points of the form Y = (x, b) and X = (a, y) such that x > a and y > b.
Then, find if the rectangle defined by the two points X, Y contains any* internal point R besides P, X and Y. If that is the case, it's easy to see that the rectangle P, R does not contain any other point in the set.
If no point in the set exists matching the restrictions for X or Y, then you have to use (a, ∞) or (∞, b) respectively instead.
The computational cost of the algorithm is O(NlogN):
Looking for X or Y can be done using binary search [O(logN)] over a presorted list [O(NlogN)].
Looking for R can be done using some spatial tree structure as a quadtree or a k-d tree [O(logN)].
*) If X, Y contains more than one internal point, R should be selected as the nearest to P.
Update: the algorithm above works for rectangles defined by its botton-left and upper-right corners. In order to make it work also for rectangles defined by its botton-right and upper-left corners, a new point X' (such that it is the nearest one to P of the form X' = (a, y') where y' < b) and the corresponding rectangle defined by X', Y should also be considered for every point in the set.

Filling a polygon with maximum number of grid squares

I've got a 2d polygon.
I'm placing it on a square grid and marking the squares that are completely inside the shape.
I need to find the placement that maximizes number of marked squares. The polygon orientation is fixed, it can be only translated.
How can I do so?
Let's solve the problem for the case if we can only move our polygon P to the right and the width of the cell equals w.
First of all, notice that it is enough to explore shifts for the distance dP in [0; w), because if we move P to w to the right we get the same situation as if we didn't move it at all.
Let SP to be an amount of cells currently in P. Suppose we moved P for a little. What could happen? Well, some of the vertices of the grid now could be out of P (let's call this set O), and some new vertices could be now in P (set I). How to determine if we lost or acquired any cells? If a cell was in P and had corner, which is in O, then we should decrease SP. If, on the contrary, a cell has corners in I and other corners are already in P, then we should increase SP.
Let's now sort these events (acquiring and losing of vertices) by increasing it's distances from the initial position of P. Thus we formalized the order of our "little steps" in algorithm.
Now we can write some pseudocode:
def signedDistance(vertex, edge):
p = [ closest point of edge to vertex ]
return vertex.x - p.x
Events = { (vertex, edge) : 0 <= signedDistance(vertex, edge) < w }
sort(Events, [ by increasing signedDistance ])
EventsEquiv = { E' : E' is subset of Events and
for any a, b from E'
signedDistance(a.vertex, a.edge) = signedDistance(b.vertex, b.edge) }
S = [ cells in P initially ]
maxS = S
for E' in EventsEquiv:
for e in E':
if e is loss: S -= 1
else if e is acquirement: S += 1
if S > maxS:
maxS = S
The method is close to the sweep line.
UPD: To generalize this we need to notice that for any optimal position of P exists another optimal position that P has a grid vertex on an edge. So the solution is to fix some grid vertex G and move P "around" it so P always has G on an edge step by step, where steps are produced by the events occuring, which were described above. Algorithm takes O(|P| / w).
The polygon is large, the size of the grid squares small (e.g. 10x10 pixel)?
Then there is one simple bruteforce solution:
Start with one grid, count the squares.
2.1-2.10 Move the grid 1px to the right, count and update best score if nec.
Move the grid 1px up, count and update best score if nec.
repeat aka go to 2.1 until all possible solutions checked...
It's easy to check if a square is within a polygon or not. You could implement an algorithm that goes along the edges...

find all points within a range to any point of an other set

I have two sets of points A and B.
I want to find all points in B that are within a certain range r to A, where a point b in B is said to be within range r to A if there is at least one point a in A whose (Euclidean) distance to b is equal or smaller to r.
Each of the both sets of points is a coherent set of points. They are generated from the voxel locations of two non overlapping objects.
In 1D this problem fairly easy: all points of B within [min(A)-r max(A)+r]
But I am in 3D.
What is the best way to do this?
I currently repetitively search for every point in A all points in B that within range using some knn algorithm (ie. matlab's rangesearch) and then unite all those sets. But I got a feeling that there should be a better way to do this. I'd prefer a high level/vectorized solution in matlab, but pseudo code is fine too :)
I also thought of writing all the points to images and using image dilation on object A with a radius of r. But that sounds like quite an overhead.
You can use a k-d tree to store all points of A.
Iterate points b of B, and for each point - find the nearest point in A (let it be a) in the k-d tree. The point b should be included in the result if and only if the distance d(a,b) is smaller then r.
Complexity will be O(|B| * log(|A|) + |A|*log(|A|))
I archived further speedup by enhancing #amit's solution by first filtering out points of B that are definitely too far away from all points in A, because they are too far away even in a single dimension (kinda following the 1D solution mentioned in the question).
Doing so limits the complexity to O(|B|+min(|B|,(2r/res)^3) * log(|A|) + |A|*log(|A|)) where res is the minimum distance between two points and thus reduces run time in the test case to 5s (from 10s, and even more in other cases).
example code in matlab:
r=5;
A=randn(10,3);
B=randn(200,3)+5;
roughframe=[min(A,[],1)-r;max(A,[],1)+r];
sortedout=any(bsxfun(#lt,B,roughframe(1,:)),2)|any(bsxfun(#gt,B,roughframe(2,:)),2);
B=B(~sortedout,:);
[~,dist]=knnsearch(A,B);
B=B(dist<=r,:);
bsxfun() is your friend here. So, say you have 10 points in set A and 3 points in set B. You want to have them arrange so that the singleton dimension is at the row / columns. I will randomly generate them for demonstration
A = rand(10, 1, 3); % 10 points in x, y, z, singleton in rows
B = rand(1, 3, 3); % 3 points in x, y, z, singleton in cols
Then, distances among all the points can be calculated in two steps
dd = bsxfun(#(x,y) (x - y).^2, A, B); % differences of x, y, z in squares
d = sqrt(sum(dd, 3)); % this completes sqrt(dx^2 + dy^2 + dz^2)
Now, you have an array of the distance among points in A and B. So, for exampl, the distance between point 3 in A and point 2 in B should be in d(3, 2). Hope this helps.

Algorithm to calculate the distances between many geo points

I have a matrix having around 1000 geospatial points(Longitude, Latitude) and i am trying to find the points that are in 1KM range.
NOTE: "The points are dynamic, Imagine 1000 vehicles are moving, so i have to re-calculate all distances every few seconds"
I did some searches and read about Graph algorithms like (Floyd–Warshall) to solve this, and I ended up with many keywords, and i am kinda lost now. I am considering the performance and since the search radius is short, I will not consider the curvature of the earth.
Basically, It appears that i have to calculate the distance between every point to every other point then sort the distances starting from every point in the matrix and get the points that are in its range. So if I have 1000 co-ordinates, I have to perfom this process (1000^2-1000) times and I do not beleive this is the optimum solution. Thank You.
If you make a modell with a grid of 1km spacing:
0 1 2 3
___|____|____|____
0 | | |
c| b|a | d
___|____|____|____
1 | | |
| |f |
___|e___|____|____
2 | |g |
let's assume your starting point is a.
If your grid is of 1km size, points in 1km reach have to be in the same cell or one of the 8 neighbours (Points b, d, e, f).
Every other cell can be ignored (c,g).
While d is nearly of the same distance to a as c, c can be dropped early, because there are 2 barriers to cross, while a and d lie on opposite areas of their border, and are therefore nearly 2 km away from each other.
For early dropping of element, you can exclude, it is enough to check the x- or y-part of the coordinate. Since a belongs to (0,2), if x is 0 or smaller, or > 3, the point is already out of range.
After filtering only few candidates, you may use exhaustive search.
In your case, you should be looking at the GeoHash which allows you to quickly query the coordinates within a given distance.
FYI, MongoDB uses geohash internally and it's performing excellently.
Try with an R-Tree. The R-Tree supports the operation to find all the points closest to a given point that are not further away than a given radius. The execution time is optimal and I think it's O(number_of_points_in_the_result).
You could compute geocodes of 1km range around each of those 1000 coordinates and check, whether some points are in that range. May be it's not optimum, but you will save yourself some sorting.
If you want to lookup the matrix for each point vs. each point then you already got the right formula (1000^2-1000). There isn't any shortcut for this calculation. However when you know where to start the search and you want look for points within a 1KM radius you can use a grid or spatial algorithm to speed up the lookup. Most likely it's uses a divide and conquer algorithm and the cheapest of it is a geohash or a z curve. You can also try a kd-tree. Maybe this is even simpler. But if your points are in euklidian space then there is this planar method describe here: http://en.wikipedia.org/wiki/Closest_pair_of_points_problem.
Edit: When I say 1000^2-1000 then I mean the size of the grid but it's actually 1000^(1000 − 1) / 2 pairs of points so a lot less math.
I have something sort of similar on a web page I worked on, I think. The user clicks a location on the map and enters a radius, and a function returns all the locations within a database within the given radius. Do you mean you are trying to find the points that are within 1km of one of the points in the radius? Or are you trying to find the points that are within 1km of each other? I think you should do something like this.
radius = given radius
x1 = latitude of given point;
y1 = longitude of given point;
x2 = null;
y2 = null;
x = null;
y = null;
dist = null;
for ( i=0; i<locationArray.length; i++ ) {
x2 = locationArray[i].latitude;
y2 = locationArray[i].longitude;
x = x1 - x2;
y = y1 - y2;
dist = sqrt(x^2 + y^2);
if (dist <= radius)
these are your points
}
If you are trying to calculate all of the points that are within 1km of another point, you could add an outer loop giving the information of x1 and y1, which would then make the inner loop test the distance between the given point and every other point giving every point in your matrix as input. The calculations shouldn't take too long, since it is so basic.
I had the same problem but in a web service development
In my case to avoid the calculation time problem i used a simple divide & conquer solution : The idea was start the calculation of the distance between the new point and the others in every new data insertion, so that my application access directly the distance between those tow points that had been already calculated and put in my database

Algorithm to find length of a segment that connects center of two rectangles

Ok, here is the story : I found this problem in one of the pizza boxes a few weeks ago. It said if you can solve this before you could finish the pizza, you would get hired at tripadviser. Though I am not looking to get hired, this problem got my eyes and spoiled my focus on pizza and dinner. I worked out something but with some assumptions. Here is the question :
Assume we know P,Q R and S. There is the line connecting centers of each rectangle. We need to find out points C and D. I am not sure if there is some other variable that we should know to solve this.
EDIT
Looking for a programmatic or psudo-code explanation- no need to move to maxthexchange.
Any suggestions ?
It's pretty simple to do step-by-step:
Compute A = (P + Q) / 2 and B = R + S / 2 (component-by-component)
An equation for the line between A and B is L(t) = t * A + (1 - t) * (B - A). Just solve
this linear equation for a t* such that L(t*).y = Q.y to get C = L(t*). Do a similar thing with L(t).y = R.y to get D.
You can also use the values of t* that you get when solving for C and D to determine pathological cases like overlapping rectangles.
You actually don't need to find the points C and D to find the distance.
I assume you already know the coordinates of the rectangle. It's trivial to compute the coordinates of the center points and the lenghts of the edges.
Now, imagine a vertical line passing through A and a horizontal line passing through B. They intersect at a point, call it X. Also, imagine a vertical line passing through C and call its intersection point with the top edge of rectangle RS - C'.
You can trivially compute the length of AX. But the length of AX is half the height of RS + half the height of PQ (both of which you know) + the length of CC'.
So now you know the length of CC' (call it x).
You can also compute the angle (call it n) that AB makes with CC' from A and B's coordinates, since you know CC' is vertical.
Ergo, the length of the segment CD is x * cos(n).

Resources