Searching for K minimum distances between points in 3D - algorithm

I have two disjoint sets of points in 3D. I need to find the k pairs of points with the smallest distances. Each point has (x, y, z) coordinates.
Constraints: The solution has to be an optimal serial solution. No multithreading, please. Approaches such as divide and conquer or dynamic programming can be used.
My current approach is:
listOfPairs = []
for all points a in setA
    for all points b in setB
        distance = calcDistance(a, b)
        listOfPairs.append((a, b, distance))
sortByDistance(listOfPairs) // using the built-in sort method
PrintPointsAndDistances(listOfPairs, k) // print the first k elements
Thanks.

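This can be done with a priority queue. Keep the double loop you already have, but push each pair into a bounded priority queue instead of a list:
priorityQueue = PriorityQueue(k) // holds at most k entries; when full, a push evicts the pair with the largest distance
for all points a in setA
    for all points b in setB
        distance = calcDistance(a, b)
        priorityQueue.push_with_priority((a, b), distance)
What you are left with are the k shortest-distance pairs, and the algorithm runs in O(N*log(k)) time, where N = |setA| * |setB| is the number of pairs.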
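A minimal Python sketch of the bounded-queue idea above. It relies on the standard library's heapq.nsmallest, which keeps only k candidates in memory while scanning all pairs; the function name k_closest_pairs is made up for this illustration.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def k_closest_pairs(set_a, set_b, k):
    """Return the k (distance, a, b) triples with the smallest distance."""
    # Generator: pairs are produced lazily, so memory stays O(k).
    pairs = ((dist(a, b), a, b) for a, b in product(set_a, set_b))
    # nsmallest maintains a heap of size k while consuming the generator.
    return heapq.nsmallest(k, pairs, key=lambda triple: triple[0])


set_a = [(0, 0, 0), (10, 10, 10)]
set_b = [(1, 0, 0), (0, 2, 0), (10, 10, 13)]
closest = k_closest_pairs(set_a, set_b, 2)
```

This still examines every pair, so it only removes the cost of sorting the full list; it does not reduce the number of distance computations.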

Related

What are the geometric data structures that can support nearest neighbor queries in the 2d plane?

I was wondering if there exists a data structure that can support the following operations (ideally in log(n) time, where n is the number of points):
Nearest neighbor queries, where the nearest neighbor of a queried point is defined as the stored point that minimizes the sum of its weight and its distance from the queried point.
Insertion of a new point into the data structure.
Bulk updating of the weights of all current points in the structure by a given number.
Assuming that the weights are never negative, we can define a distance function on R² × R+ (points × weights) as d((p, w), (p′, w′)) = d(p, p′) + |w − w′|. Weird metric, but it plugs right into the cover tree nearest neighbors algorithm. Then we query a point p by embedding it as (p, v), where v is initially 0.
To add a constant c to all of the weights, we adjust the “vantage point” v by v ← v − c. A new point p added to the structure with weight w embeds as (p, w + v), so that its stored weight minus v equals w at the moment of insertion.
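A small Python sketch of the vantage-point bookkeeping, under stated assumptions: a linear scan stands in for the cover tree query, and the class and method names are invented for this illustration. A point is stored with weight w + v, so that its current weight is always the stored weight minus v.

```python
from math import dist


class WeightedNearest:
    """Weighted nearest neighbor with O(1) bulk weight updates (illustrative)."""

    def __init__(self):
        self.v = 0.0      # vantage point; a bulk add of c shifts it by -c
        self.items = []   # (point, stored_weight) with stored_weight = w + v

    def insert(self, p, w):
        # Store relative to the current vantage point.
        self.items.append((p, w + self.v))

    def add_to_all(self, c):
        # Adding c to every weight only moves the vantage point.
        self.v -= c

    def nearest(self, q):
        # Linear scan as a placeholder for a cover tree query on (q, v).
        def cost(item):
            p, stored = item
            return dist(q, p) + abs(stored - self.v)
        best = min(self.items, key=cost)
        return best[0], cost(best)


s = WeightedNearest()
s.insert((0, 0), 5)
s.insert((3, 0), 1)
first = s.nearest((0, 0))    # cost 3 + 1 beats cost 0 + 5
s.add_to_all(2)              # weights are now 7 and 3
second = s.nearest((0, 0))
s.insert((0, 1), 0.5)
third = s.nearest((0, 0))    # cost 1 + 0.5
```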

Given a simple polygon P, consisting of n vertices, and Set S Of k points, determine if each of the polygon vertices are covered by some point from S

Given a simple polygon P, consisting of n vertices, and Set S Of k points, determine if each of the polygon vertices are covered by some point from S.
My best solution was to check, for every vertex of P, whether such a point exists in S, for a total complexity of O(n*k). I believe there should be a more efficient solution. Any hints?
Whether P is a polygon or not seems to be irrelevant. So the generalized question becomes: Given 2 sets of points A (with a points) and B (with b points), find out whether A is a subset of B or not?
The simple solution is O(a * b), but you can also get expected O(a + b) by doing some preprocessing.
1. Put all the points of B in a hash map with the x-coordinate as key and a hash set of the y-coordinates as value (Map<Number,Set<Number>>). This lets you query whether a point (x, y) is in B in expected O(1): map.containsKey(x) && map.get(x).contains(y).
2. Go through all the points of A and check whether each point is in B using the data structure created above.
Step 1 is O(b) and Step 2 is O(a), which gives O(a + b).
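The two steps can be sketched in Python with a single set of coordinate tuples in place of the nested map. This assumes, as the answer does, that "covered" means a point of S coincides exactly with the vertex; the function name is made up.

```python
def all_vertices_covered(vertices, points):
    """True if every polygon vertex coincides with some point in `points`."""
    lookup = set(points)                         # step 1: O(b) preprocessing
    return all(v in lookup for v in vertices)    # step 2: O(a) expected


polygon = [(0, 0), (4, 0), (4, 3)]
covered = all_vertices_covered(polygon, [(4, 3), (0, 0), (9, 9), (4, 0)])
not_covered = all_vertices_covered(polygon, [(0, 0), (4, 0)])
```

With floating-point coordinates, exact equality is fragile, so this fits best when the coordinates are integers or otherwise exactly comparable.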

Maximum area rectangle

Given a set of points (x[1]; y[1]), (x[2]; y[2]), ..., (x[n]; y[n]), we need to find the maximum area of a rectangle whose vertices all belong to the set. The rectangle does not have to be axis-aligned. For example, the answer for (1; 1), (2; 2), (2; 0), (3; 1) is 2.
n <= 1300; -10^9 <= x[i], y[i] <= 10^9.
Can someone help me with this problem? My solution is brute-force O(N^3), and it gets TLE. I select every triple of points and look up the fourth.
Every pair of points determines a line L, which has a slope m and an intercept c. (Ignore vertical lines for now.) Instead of considering the intercept, let's work with a different quantity that gives much the same information: The distance d(L) between the line and the origin, i.e., the length of a line segment R perpendicular to L and connecting L to the origin. Additionally, we can talk about the "displacement" of a point along L: We can say that the point p on L where it meets R has displacement 0, and the point on L that is x "above" p (has distance x from p and higher y coordinate) has displacement x, with negative displacements for points "below" p. In fact, we don't need the intercept or d(L) to define the displacement of a point with respect to a line L -- just the line's slope. Define disp(m, q) to be the displacement of point q on a line with slope m.
Suppose a, b, c, d are the vertices of a rectangle, with sides ab, bc, cd and da. Observe that the line containing ab has the same slope m as the line containing cd, and (disp(m, a), disp(m, b)) = (disp(m, d), disp(m, c)). So the only 4-tuples of vertices that we need to test are those comprised of pairs of vertex pairs like ab and cd -- vertex pairs having the same slope and displacement pairs. Furthermore, one side length (shared by ab and cd) is equal to |disp(m, b) - disp(m, a)|, and the other side length will be |d(Lab) - d(Lcd)|, where Lab and Lcd are the lines containing the line segments ab and cd, respectively.
To find these 4-tuples of vertices efficiently:
1. For all pairs of vertices i, j: Let L be the line passing through i and j. Compute its slope m and its distance d(L) from the origin (use a signed distance, so that parallel lines on opposite sides of the origin get different values). Also compute disp(m, i) and disp(m, j). If disp(m, i) <= disp(m, j), add the tuple (m, disp(m, i), disp(m, j), d(L)) to an array Z.
2. Sort Z lexicographically. This will place all point pairs lying on lines of the same slope and having equal displacements in a contiguous block, ordered by increasing d(L).
3. Scan through the array, looking for block boundaries -- positions k at which any of the first three tuple elements changes (treat the end of the array as a boundary too). Let prev be the last such k found (initially, prev = 0). For each such k: Compute (Z[k-1][3] - Z[prev][3]) * (Z[k-1][2] - Z[k-1][1]). This is the area of the largest rectangle having a pair of sides with slope Z[k-1][0] and length (Z[k-1][2] - Z[k-1][1]). If this is greater than the maximum rectangle size found so far, update it.
This algorithm takes O(n^2 log n) time and O(n^2) space.
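A hedged Python sketch of the same grouping idea, with two substitutions that are not in the answer: integer direction vectors replace the floating-point slope (which also covers vertical lines and avoids rounding, assuming integer coordinates), and a dictionary replaces the sort-and-scan. The function name and variable names are illustrative.

```python
from math import gcd


def max_rectangle_area(points):
    """Largest area of a rectangle with all four vertices in `points` (0 if none)."""
    extremes = {}  # (dx, dy, s_lo, s_hi) -> (min offset, max offset)
    for i in range(len(points)):
        x1, y1 = points[i]
        for j in range(i + 1, len(points)):
            x2, y2 = points[j]
            dx, dy = x2 - x1, y2 - y1
            if dx == 0 and dy == 0:
                continue  # duplicate point
            g = gcd(dx, dy)
            dx, dy = dx // g, dy // g           # primitive direction = "slope"
            if dx < 0 or (dx == 0 and dy < 0):
                dx, dy = -dx, -dy               # one canonical sign per direction
            # Displacements along the direction (scaled by its length).
            s1, s2 = x1 * dx + y1 * dy, x2 * dx + y2 * dy
            key = (dx, dy, min(s1, s2), max(s1, s2))
            # Signed offset of the line from the origin (same scaling).
            offset = x1 * dy - y1 * dx
            lo, hi = extremes.get(key, (offset, offset))
            extremes[key] = (min(lo, offset), max(hi, offset))
    best = 0
    for (dx, dy, s_lo, s_hi), (o_lo, o_hi) in extremes.items():
        # The scale factors cancel: side length times line distance
        # equals (steps along the direction) * (offset difference).
        steps = (s_hi - s_lo) // (dx * dx + dy * dy)
        best = max(best, steps * (o_hi - o_lo))
    return best


example = max_rectangle_area([(1, 1), (2, 2), (2, 0), (3, 1)])
```

Grouping with a dictionary gives expected O(n^2) time instead of O(n^2 log n), at the same O(n^2) space.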

How to get distance matrix from Adjacency matrix matlab

I have an adjacency matrix, call it A, of size n*n,
where A(k,j)=A(j,k)=1 if k and j are connected in one hop.
Now it looks like if I take
Dist=double(A)*double(A)>0 %getting all two hops connectivity
Dist=double(Dist)*double(A)>0 %getting all three hops connectivity
Dist=double(Dist)*double(A)>0 %getting all four hops connectivity
Is this right at all?
I tried it with some simple graphs and it looks legit.
Can I use this fact to create a distance matrix,
where the distance matrix shows the minimum number of hops from j to k?
P.S.:
If it is legit, I would be happy to understand why it is right; I did not find any info on Google.
Yes, this is perfectly right: the entries of the adjacency matrix give you the connections between vertices. Taking powers of the adjacency matrix concatenates walks. The (i,j)th entry of the kth power of the adjacency matrix tells you the number of walks of length k from vertex i to vertex j.
This can be quite easily proven by induction.
Be aware that the powers of the adjacency matrix count the number of i→j walks, not paths (a walk can repeat vertices, while a path cannot). So, to create a distance matrix you need to iteratively power your adjacency matrix, and as soon as an (i,j)th element becomes non-zero you assign the distance k in your distance matrix.
Here is a try:
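% Adjacency matrix (random symmetric example without self-loops)
n = 5;
A = triu(rand(n) > 0.5, 1);
A = double(A | A');
D = NaN(n);
D(1:n+1:end) = 0;   % distance from a vertex to itself is 0
B = A;
k = 1;
% A shortest path has at most n-1 hops, so stop there;
% pairs that are still NaN afterwards are not connected
while any(isnan(D(:))) && k < n
    % Check for new walks, and assign distance
    D(B > 0 & isnan(D)) = k;
    % Iteration
    k = k + 1;
    B = double(B*A > 0);   % only reachability is needed, keep entries bounded
end
% Now D contains the distance matrix (NaN marks unreachable pairs)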
Note that if you are searching for the shortest paths in a graph, you can also use Dijkstra's algorithm.
Finally, note that this is completely compatible with sparse matrices. As adjacency matrices are often good candidates for sparse storage, this may be highly beneficial in terms of performance.
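For hop counts, where every edge has weight 1, Dijkstra's algorithm reduces to a breadth-first search from each vertex. A minimal Python sketch, assuming the graph is given as a 0/1 adjacency matrix in nested lists (the function name is made up; None plays the role of NaN for unreachable pairs):

```python
from collections import deque


def hop_distances(adj):
    """All-pairs hop counts by BFS from every vertex; None means unreachable."""
    n = len(adj)
    dist = [[None] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                # First visit of v from s is along a shortest walk.
                if adj[u][v] and dist[s][v] is None:
                    dist[s][v] = dist[s][u] + 1
                    queue.append(v)
    return dist


# Path 0 - 1 - 2 plus an isolated vertex 3.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 0]]
D = hop_distances(adj)
```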

How to find independent points in a unit square in O(n log n)?

Consider a unit square containing n 2D points. We say that two points p and q are independent in a square, if the Euclidean distance between them is greater than 1. A unit square can contain at most 3 mutually independent points. I would like to find those 3 mutually independent points in the given unit square in O(n log n). Is it possible? Please help me.
Can this problem be solved in O(n^2) without using any spatial data structures such as Quadtree, kd-tree, etc?
Use a spatial data structure such as a Quadtree to store your points. Each node in the quadtree has a bounding box and a set of 4 child nodes, and a list of points (empty except for the leaf nodes). The points are stored in the leaf nodes.
The point quadtree is an adaptation of a binary tree used to represent two-dimensional point data. It shares the features of all quadtrees but is a true tree as the center of a subdivision is always on a point. The tree shape depends on the order in which data is processed. It is often very efficient in comparing two-dimensional, ordered data points, usually operating in O(log n) time.
For each point, maintain a set of all points that are independent of that point.
Insert all your points into the quadtree, then iterate through the points and use the quadtree to find the points that are independent of each:
main()
{
    for each point p
        insert p into quadtree
        set p's set to empty
    for each point p
        findIndependentPoints(p, root node of quadtree)
}
findIndependentPoints(Point p, QuadTreeNode n)
{
    Point f = corner of the bounding box of n that is farthest from p
    if distance between f and p <= 1
        return // none of the points in this node or
               // its children are independent of p
    for each point q in n
        if distance between p and q > 1
            find intersection r of q's set and p's set
            if r is non-empty then
                p, q and any point in r are the 3 points -> ***SOLVED***
            add p to q's set of independent points
            add q to p's set of independent points
    for each subnode m of n (up to 4 of them)
        findIndependentPoints(p, m)
}
You could speed up this:
find intersection r of q's set and p's set
by storing each set as a quadtree. Then you could find the intersection by searching in q's quadtree for a point independent of p using the same early-out technique:
// find intersection r of q's set and p's set:
// r = findMutuallyIndependentPoint(p, q's quadtree root)
Point findMutuallyIndependentPoint(Point p, QuadTreeNode n)
{
    Point f = corner of the bounding box of n that is farthest from p
    if distance between f and p <= 1
        return null // none of the points in this node or
                    // its children are independent of p
    for each point r in n
        if distance between p and r > 1
            return r
    for each subnode m of n (up to 4 of them)
        Point r = findMutuallyIndependentPoint(p, m)
        if r is not null
            return r
    return null
}
An alternative to using Quadtrees is using K-d trees, which produces more balanced trees where each leaf node is a similar depth from the root. The algorithm for finding independent points in that case would be the same, except that there would only be up to 2 and not 4 child nodes for each node in the data structure, and the bounding boxes at each level would be of variable size.
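As a reference to test a tree-based version against, here is a brute-force Python sketch of the "set of independent points per point" bookkeeping described above, without any spatial pruning. It is not O(n log n): it inspects all pairs, and each set intersection can cost up to O(n). The function name is invented for this illustration.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def find_independent_triple(points):
    """Return three mutually independent points (pairwise distance > 1), or None."""
    independent = [set() for _ in points]   # indices independent of each point
    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) > 1:
            # Any index already independent of both i and j completes a triple.
            common = independent[i] & independent[j]
            if common:
                k = next(iter(common))
                return points[i], points[j], points[k]
            independent[i].add(j)
            independent[j].add(i)
    return None


pts = [(0, 0), (1, 0.27), (0.27, 1), (0.5, 0.5)]
triple = find_independent_triple(pts)
```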
You might want to try this out.
Pick the top-left corner (Y) with coordinates (0,1). Calculate the distance from each point in the list to Y.
Sort the points by that distance in increasing order into a sorted list (L).
If the first point (A) and the last point (B) in list L are independent:
    Foreach point P in list L:
        if P is independent of both A and B:
            Return A, B, P
Pick the top-right corner (X) with coordinates (1,1). Calculate the distance from each point in the list to X.
Sort the points by that distance in increasing order into a sorted list (S).
If the first point (C) and the last point (D) in list S are independent:
    Foreach point O in list S:
        if O is independent of both C and D:
            Return C, D, O
Return null
This is a wrong solution. Kept it just for comments. If one finds another solution based on smallest enclosing circle, please put a link as a comment.
Solve the Smallest-circle problem.
If diameter of a circle <= 1, return null.
If the circle is determined by 3 points, check which are "mutually independent". If there are only two of them, try to find the third by iteration.
If the circle is determined by 2 points, they are "mutually independent". Try to find the third one by iteration.
The smallest-circle problem can be solved in O(N), so the whole algorithm would also run in O(N).

Resources