Nearest Points between two set of points - algorithm

There are two sets of points, said set A and B, and for each point a in set A, try to get a subset sub_B of B, that the distance between b in sub_B and a is smaller than the distance between b and any other A's point. And the point is 2-dimensional. For example, A = { (0,0), (3,0) }, B = { (1,1), (2,1) }, then for (0,0) in A, its set is {(1,1)}, for (3,0) in A, its set is {(2,1)}. Obviously, a brute force method could get the result in O(mn), where m is A's size and n is B's size. My question is whether there's a better solution?

All the points in set A can be arranged into a Space Paritioned Tree and every point in B can be used as query to find the nearest neighbor in set A and taking all the set of points with smallest distance. This gives a O(N*log(N)+M*log(N)) solution using kd-trees.

Related

Given a simple polygon P, consisting of n vertices, and Set S Of k points, determine if each of the polygon vertices are covered by some point from S

Given a simple polygon P, consisting of n vertices, and Set S Of k points, determine if each of the polygon vertices are covered by some point from S.
My best solution was to check for every P vertex if there exist such point in S - total complexity of O(n*k). I belive there should be a more efficient solution. any hints?
Whether P is a polygon or not seems to be irrelevant. So the generalized question becomes: Given 2 sets of points A (with a points) and B (with b points), find out whether A is a subset of B or not?
The simple solution is O(a * b) but you can also get O(a + b) by doing some preprocessing.
Put all the points of B in a hash map with the x-coordinate as key and a hash set with the y-coordinates as values (Map<Number,Set<Number>>). This lets you query whether a point (x, y) is in B in O(1): map.containsKey(x) && map.get(x).contains(y).
Go through all the points of A and check whether the point is in B using the datastructure created above.
Step 1 is O(b) and Step 2 is O(a) which gives O(a + b).

How to find independent points in a unit square in O(n log n)?

Consider a unit square containing n 2D points. We say that two points p and q are independent in a square, if the Euclidean distance between them is greater than 1. A unit square can contain at most 3 mutually independent points. I would like to find those 3 mutually independent points in the given unit square in O(n log n). Is it possible? Please help me.
Can this problem be solved in O(n^2) without using any spatial data structures such as Quadtree, kd-tree, etc?
Use a spatial data structure such as a Quadtree to store your points. Each node in the quadtree has a bounding box and a set of 4 child nodes, and a list of points (empty except for the leaf nodes). The points are stored in the leaf nodes.
The point quadtree is an adaptation of a binary tree used to represent two-dimensional point data. It shares the features of all quadtrees but is a true tree as the center of a subdivision is always on a point. The tree shape depends on the order in which data is processed. It is often very efficient in comparing two-dimensional, ordered data points, usually operating in O(log n) time.
For each point, maintain a set of all points that are independent of that point.
Insert all your points into the quadtree, then iterate through the points and use the quadtree to find the points that are independent of each:
main()
{
for each point p
insert p into quadtree
set p's set to empty
for each point p
findIndependentPoints(p, root node of quadtree)
}
findIndependentPoints(Point p, QuadTreeNode n)
{
Point f = farthest corner of bounding box of n
if distance between f and p < 1
return // none of the points in this node or
// its children are independent of p
for each point q in n
if distance between p and q > 1
find intersection r of q's set and p's set
if r is non-empty then
p, q, r are the 3 points -> ***SOLVED***
add p to q's set of independent points
add q to p's set of independent points
for each subnode m of n (up 4 of them)
findIndependentPoints(p, m)
}
You could speed up this:
find intersection r of q's set and p's set
by storing each set as a quadtree. Then you could find the intersection by searching in q's quadtree for a point independent of p using the same early-out technique:
// find intersection r of q's set and p's set:
// r = findMututallyIndependentPoint(p, q's quadtree root)
Point findMututallyIndependentPoint(Point p, QuadTreeNode n)
{
Point f = farthest corner of bounding box of n
if distance between f and p < 1
return // none of the points in this node or
// its children are independent of p
for each point r in n
if distance between p and r > 1
return r
for each subnode m of n (up 4 of them)
findMututallyIndependentPoint(p, m)
}
An alternative to using Quadtrees is using K-d trees, which produces more balanced trees where each leaf node is a similar depth from the root. The algorithm for finding independent points in that case would be the same, except that there would only be up to 2 and not 4 child nodes for each node in the data structure, and the bounding boxes at each level would be of variable size.
You might want to try this out.
Pick the top left point (Y) with coordinate (0,1). Calculate distance from each point from the List to point Y.
Sort the result in increasing order into SortedPointList (L)
If the first point (A) and the last point (B) in list L are independent:
Foreach point P in list L:
if P is independent to both A and B:
Return A, B, P
Pick the top right point (X) with coordinate (1,1). Calculate distance from each point from the List to point X.
Sort the result in increasing order into SortedPointList (S)
If the first point (C) and the last point (D) in list L are independent:
Foreach point O in list S:
if P is independent to both C and D:
Return C, D, O
Return null
This is a wrong solution. Kept it just for comments. If one finds another solution based on smallest enclosing circle, please put a link as a comment.
Solve the Smallest-circle problem.
If diameter of a circle <= 1, return null.
If the circle is determined by 3 points, check which are "mutually independent". If there are only two of them, try to find the third by iteration.
If the circle is determined by 2 points, they are "mutually independent". Try to find the third one by iteration.
Smallest-sircle problem can be solved in O(N), thus the whole problem complexity is also O(N).

How could I calculate if XY coordinate is within certain area?

So lets imagine that I have a grid of 10x10 (can be any size, but just for the sake of example say 10), and with that grid there is 3 points marking vertexes of a triangle (again can be any amount of points delimiting any arbitrary shape).
So my question is.. Is there a way given just this information to determine programatically if any given coordinate is within that shape?
Lest say the coordinates are 3,2-7,3-5,5. Can I while iterating over the given grid pick out the cells that fall within these points?
Call P the point that you are checking, and S1,S2,...,Sn the n vertices of the shape.
Assume that P ≠ Si for all i.
Is P on the boundary?
If 1 is no, then randomly choose a line L that passes through P
Pick a point F that you know is outside the polygon
Follow the sequence of intersections of L with the shape from F until you hit P (Call the Sequence F, ..., P)
Count the sequence F, ..., P, store the value in M
If M is even, then P is in the polygon, Otherwise P is not in the polygon
NOTE: By introducing the starting point F, we change the parity mentioned in the point in polygon algorithm description on wikipedia

Matching points in 2 D space

I have 2 matrices A and B both of size Rows X 2 where Rows = m , n for A and B respectively. These m and n denote the points in the euclidean space.
The task I wish to perform is to match the maximum number of points from A and B ( assuming A has less number of points than B ) given the condition that the distance is less than a threshold d and each pair is unique.
I have seen this nearest point pairs but this won't work on my problem because for every point in A it select the minimum left in B. However it may happen that the first pair I picked from A and B was wrong leading to less number of matching pairs.
I am looking for a fast solution since both A and B consists of about 1000 points each. Again, some points will be left and I am aware that this would somehow lead to an exhaustive search.
I am looking for a solution where there is some sort of inbuilt functions in matlab or using data structures that can help whose matlab code is available such as kd-trees. As mentioned I have to find unique nearest matching points from B to A.
You can use pdist2 to compute a pairwise distance between two pairs of observations (of different sizes). The final distance matrix will be an N x M matrix which you can probe for all values above the desired threshold.
A = randn(1000, 2);
B = randn(500, 2);
D = pdist2(A, B, 'euclidean'); % euclidean distance
d = 0.5; % threshold
indexD = D > d;
pointsA = any(indexD, 2);
pointsB = any(indexD, 1);
The two vectors provide logical indexes to the points in A and B that have at least one match, defined by the minimum distance d, on the other. The resulting sets will be composed of the entire set of elements from matrix A (or B) with distance above d from any element of the other matrix B (or A).
You can also generalize to more than 2 dimensions or different distance metrics.

Algorithm to find the closest 3 points that when triangulated cover another point

Picture a canvas that has a bunch of points randomly dispersed around it. Now pick one of those points. How would you find the closest 3 points to it such that if you drew a triangle connecting those points it would cover the chosen point?
Clarification: By "closest", I mean minimum sum of distances to the point.
This is mostly out of curiosity. I thought it would be a good way to estimate the "value" of a point if it is unknown, but the surrounding points are known. With 3 surrounding points you could extrapolate the value. I haven't heard of a problem like this before, doesn't seem very trivial so I thought it might be a fun exercise, even if it's not the best way to estimate something.
Your problem description is ambiguous. Which triangle are you after in this figure, the red one or the blue one?
The blue triangle is closer based on lexicographic comparison of the distances of the points, while the red triangle is closer based on the sum of the distances of the points.
Edit: you clarified it to make it clear that you want the sum of distances to be minimized (the red triangle).
So, how about this sketch algorithm?
Assume that the chosen point is at the origin (makes description of algorithm easy).
Sort the points by distance from the origin: P(1) is closest, P(n) is farthest.
Start with i = 3, s = ∞.
For each triple of points P(a), P(b), P(i) with a < b < i, if the triangle contains the origin, let s = min(s, |P(a)| + |P(b)| + |P(i)|).
If s ≤ |P(1)| + |P(2)| + |P(i)|, stop.
If i = n, stop.
Otherwise, increment i and go back to step 4.
Obviously this is O(n³) in the worst case.
Here's a sketch of another algorithm. Consider all pairs of points (A, B). For a third point to make a triangle containing the origin, it must lie in the grey shaded region in this figure:
By representing the points in polar coordinates (r, θ) and sorting them according to θ, it is straightforward to examine all these points and pick the closest one to the origin.
This is also O(n³) in the worst case, but a sensible order of visiting pairs (A, B) should yield an early exit in many problem instances.
Just a warning on the iterative method. You may find a triangle with 3 "near points" whose "length" is greater than another resulting by adding a more distant point to the set. Sorry, can't post this as a comment.
See Graph.
Red triangle has perimeter near 4 R while the black one has 3 Sqrt[3] -> 5.2 R
Like #thejh suggests, sort your points by distance from the chosen point.
Starting with the first 3 points, look for a triangle covering the chosen point.
If no triangle is found, expand you range to include the next closest point, and try all combinations.
Once a triangle is found, you don't necessarily have the final answer. However, you have now limited the final set of points to check. The furthest possible point to check would be at a distance equal to the sum of the distances of the first triangle found. Any further than this, and the sum of the distances is guaranteed to exceed the first triangle that was found.
Increase your range of points to include the last point whose distance <= the sum of the distances of the first triangle found.
Now check all combinations, and the answer is the triangle found from this set with the minimal sum of distances.
second shot
subsolution: (analytic geometry basics, skip if you are familiar with this) finding point of the opposite half-plane
Example: Let's have two points: A=[a,b]=[2,3] and B=[c,d]=[4,1]. Find vector u = A-B = (2-4,3-1) = (-2,2). This vector is parallel to AB line, so is the vector (-1,1). The equation for this line is defined by vector u and point in AB (i.e. A):
X = 2 -1*t
Y = 3 +1*t
Where t is any real number. Get rid of t:
t = 2 - X
Y = 3 + t = 3 + (2 - X) = 5 - X
X + Y - 5 = 0
Any point that fits in this equation is in the line.
Now let's have another point to define the half-plane, i.e. C=[1,1], we get:
X + Y - 5 = 1 + 1 - 5 < 0
Any point with opposite non-equation sign is in another half-plane, which are these points:
X + Y - 5 > 0
solution: finding the minimum triangle that fits the point S
Find the closest point P as min(sqrt( (Xp - Xs)^2 + (Yp - Ys)^2 ))
Find perpendicular vector to SP as u = (-Yp+Ys,Xp-Xs)
Find two closest points A, B from the opposite half-plane to sigma = pP where p = Su (see subsolution), such as A is on the different site of line q = SP (see final part of the subsolution)
Now we have triangle ABP that covers S: calculate sum of distances |SP|+|SA|+|SB|
Find the second closest point to S and continue from 1. If the sum of distances is smaller than that in previous steps, remember it. Stop if |SP| is greater than the smallest sum of distances or no more points are available.
I hope this diagram makes it clear.
This is my first shot:
split the space into quadrants
with picked point at the [0,0]
coords
find the closest point
from each quadrant (so you have 4
points)
any triangle from these
points should be small enough (but not necesarilly the smallest)
Take the closest N=3 points. Check whether the triange fits. If not, increment N by one and try out all combinations. Do that until something fits or nothing does.

Resources