I have a large number of point-sets.
Each set is a collection of P points in three dimensions. All sets have the same P. You can think of the points as representing the vertices of a polyhedron that has been rotated many different ways. Since the polyhedron has a large number of symmetries, it is difficult to find the closest neighbouring orientation with reference to its rotation alone.
For a given point-set, I want to find the nearest neighbouring point-set.
My current algorithm is as follows:
I throw all of the points into a kd-tree. For a given point-set, I use the kd-tree to find the X nearest neighbours to each point. I then determine which point-sets, if any, are represented in each neighbour-group. Since two nearby point-sets must have their points among the nearest neighbors of each other, this is a quickish way to find candidates.
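A minimal sketch of this candidate step, assuming that "represented in each neighbour-group" means a set must show up in the neighbour group of every point of the query set. A brute-force search stands in for the kd-tree so the example has no dependencies; `candidate_sets`, `point_sets` and `x` are names made up for illustration.

```python
import heapq
import math

def candidate_sets(point_sets, query_index, x):
    """Indices of point-sets that occur among the x nearest neighbours
    of every point in point_sets[query_index]."""
    # Flatten the points of all other sets, remembering the owning set.
    others = [(p, s) for s, pts in enumerate(point_sets)
              if s != query_index for p in pts]
    common = None
    for q in point_sets[query_index]:
        nearest = heapq.nsmallest(
            x, others, key=lambda item: math.dist(q, item[0]))
        owners = {s for _, s in nearest}
        # Keep only the sets seen in every neighbour group so far.
        common = owners if common is None else common & owners
    return common or set()

sets = [
    [(0, 0, 0), (1, 0, 0)],
    [(0.1, 0, 0), (1.1, 0, 0)],
    [(10, 0, 0), (11, 0, 0)],
]
print(candidate_sets(sets, 0, 1))
```

A real kd-tree query would replace the `heapq.nsmallest` call; the bookkeeping around it stays the same.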
At the moment, I am defining the distance between two point-sets like so: of the possible pairings of points between sets A and B, I choose the one that minimizes the sum of the Euclidean distances between the paired points.
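To make that definition concrete, here is a small sketch that tries every pairing. It is only practical for small P because the number of pairings grows as P!; the function name `set_distance` is hypothetical.

```python
import itertools
import math

def set_distance(a, b):
    """Minimum, over all one-to-one pairings of a with b, of the summed
    Euclidean distances between paired points."""
    if len(a) != len(b):
        raise ValueError("both point-sets must contain the same number of points")
    return min(
        sum(math.dist(p, q) for p, q in zip(a, perm))
        for perm in itertools.permutations(b)
    )

a = [(0, 0, 0), (1, 0, 0)]
b = [(1, 0, 0.5), (0, 0, 0.5)]
print(set_distance(a, b))
```

For larger P the same minimum can be found in polynomial time with an assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment`, applied to the P x P matrix of pairwise distances.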
My question is whether there is a more efficient way of accomplishing this?
Related
Suppose I have some n points (in my case, 4 points) in 3 dimensions. I want to determine both the point a which minimizes the squared distance to each of these n points, as well as the largest difference that can exist between the distance from an arbitrary point b and any two of these n points (i.e. the two "farthest points").
How can this be most efficiently accomplished? I know that, in 2 dimensions and with 3 points, the solution to the point that minimized distance is the centroid of the triangle formed by the 3 points, and the solution to the largest difference can be found by taking a point located precisely at one (any?) of the 3 points. It seems that the same should be true in 3 dimensions, although I am unsure.
I want to determine both the point that minimizes distance from each of these n points
The centroid minimizes the sum of the squared distances to every point in the set, but it does not minimize the maximum distance (the distance to the farthest point).
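A quick numerical illustration of the first claim, under the assumption that points are given as coordinate tuples. The helper names `centroid` and `sum_sq_dist` are made up for this sketch.

```python
def centroid(points):
    """Coordinate-wise mean of the points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def sum_sq_dist(center, points):
    """Sum of squared Euclidean distances from center to every point."""
    return sum(sum((a - b) ** 2 for a, b in zip(center, p)) for p in points)

pts = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2)]
c = centroid(pts)
nudged = (c[0] + 0.1, c[1], c[2])
# Moving away from the centroid in any direction increases the sum.
print(sum_sq_dist(c, pts), sum_sq_dist(nudged, pts))
```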
I suspect that you are interested in computing the center and radius of the minimal sphere containing every point in the set. This is a classic problem in computational geometry that can be solved approximately in linear time quite easily, or exactly if you implement the algorithm proposed by Emmerich Welzl.
If the number of points is as small as 4, an approximate solution is to search for the pair of points with maximum distance (there are 6 possible pairs) and take the midpoint as the center and half the distance as the radius. Then check that the other two points are also inside the sphere, and grow it if necessary.
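The approximate procedure could be sketched as follows. The growth step is an assumption on my part: it enlarges the sphere just enough to include an outside point while still containing the old sphere, in the style of Ritter's heuristic. The result is a valid enclosing sphere but not necessarily the minimal one.

```python
import itertools
import math

def approx_bounding_sphere(points):
    """Approximate enclosing sphere: start from the farthest pair,
    then grow for any point that is still outside."""
    p, q = max(itertools.combinations(points, 2),
               key=lambda pair: math.dist(*pair))
    center = tuple((a + b) / 2 for a, b in zip(p, q))
    radius = math.dist(p, q) / 2
    for pt in points:
        d = math.dist(center, pt)
        if d > radius:
            # New sphere touches pt and the far side of the old sphere.
            new_radius = (radius + d) / 2
            shift = (d - new_radius) / d
            center = tuple(c + (x - c) * shift for c, x in zip(center, pt))
            radius = new_radius
    return center, radius

pts = [(0, 0, 0), (2, 0, 0), (1, 1.5, 0), (1, 0, 0)]
center, radius = approx_bounding_sphere(pts)
print(center, radius)
```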
See more information at
https://en.wikipedia.org/wiki/Bounding_sphere
https://en.wikipedia.org/wiki/Smallest-circle_problem
The largest difference between the distances of a point to two given points is achieved when the three points are aligned and the unknown point is "outside" (there are infinitely many solutions). In this configuration, the difference is just the distance between the two given points.
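This follows from the triangle inequality. Writing b for the unknown point and p, q for the two given points (symbols chosen here for illustration):

```latex
\bigl|\,\lVert b - p\rVert - \lVert b - q\rVert\,\bigr| \;\le\; \lVert p - q\rVert ,
\qquad \text{with equality for } b = p + t\,(p - q),\ t \ge 0,
\quad \text{since then } \lVert b - q\rVert - \lVert b - p\rVert = (1+t)\lVert p - q\rVert - t\,\lVert p - q\rVert .
```

The same holds on the other side of the segment, for b = q + t(q - p) with t >= 0.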
If you mean to maximize all differences simultaneously (or rather the sum of differences), you must go to infinity in some direction. That direction maximizes the sum of the lengths of the projections of all edges.
I have two sets of 3D points and I want to find the closest point in the second set for each point in the first set. In a more difficult case, the sets may have different numbers of points, and I need to find the closest pairs of points. I'm not sure what this problem is called, but I have some brute-force ideas for solving it. For example, I could calculate the distance between all pairs of points and choose the pairs with the shortest total distance. The maximum number of points in each set is 20, so I don't need the most efficient solution.
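With at most 20 points per set, the first task (closest point in the second set for each point in the first) can be sketched by brute force. The function name `nearest_in_second` is hypothetical, and the sets may have different sizes.

```python
import math

def nearest_in_second(first, second):
    """For each point in first, return (index into second, distance)
    of its closest point in second."""
    result = []
    for p in first:
        j = min(range(len(second)), key=lambda k: math.dist(p, second[k]))
        result.append((j, math.dist(p, second[j])))
    return result

first = [(0, 0, 0), (5, 5, 5)]
second = [(4, 5, 5), (0, 0, 1), (9, 9, 9)]
print(nearest_in_second(first, second))
```

Note that several points of the first set may end up with the same nearest neighbour. If a one-to-one pairing with the smallest total distance is wanted instead, that variant is usually called the assignment problem.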
Consider this question relative to graph theory:
Let G be a complete (every vertex is connected to all the other vertices) undirected graph with an N x N distance matrix. Two "salesmen" travel as follows: the first always visits the nearest unvisited vertex, the second the farthest, until both have visited all the vertices. We must generate a matrix of distances and the starting points for the two salesmen (they can be different) such that:
All the distances are unique (Edit: and positive integers)
The distance from a vertex to itself is always 0.
The difference between the total distance covered by the two salesmen must be a specific number, D.
The distance from A to B is equal to the distance from B to A
What efficient algorithms can be useful to help me? I can only think of backtracking, but I don't see any way to reduce the work to be done by the program.
Geometry is helpful.
Using the distances of points on a circle seems like it would work. It seems like you could adjust D by making the circle radius larger or smaller.
Alternatively, really any 2D shape where the distances are all different could probably be used as well. In this case you should scale the shape up or down to obtain the correct D.
Edit: Now that I think about it, the simplest solution may be to simply pick N random 2D points, say 32 bit integer coordinates to lower the chances of any distances being too close to equal. If two distances are too close, just pick a different point for one of them until it's valid.
Ideally, you'd then just need to work out a formula for the relationship between D and the scaling factor, which I'm not sure of offhand. If nothing else, you could also use binary search or interpolation search to find the scaling factor that gives the required D, but that's a slower method.
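Whatever way the matrix is generated, it has to be checked by simulating both salesmen. A minimal sketch, assuming the salesmen do not return to their starting vertex (the question does not say) and using made-up names `greedy_tour_length` and `tour_difference`. One observation on scaling: multiplying every distance by the same positive factor does not change which vertex is nearest or farthest, so both tours stay the same and D is multiplied by that factor.

```python
def greedy_tour_length(dist, start, pick):
    """Length of the tour that repeatedly moves to the unvisited vertex
    selected by pick (min for nearest, max for farthest)."""
    unvisited = set(range(len(dist))) - {start}
    current, total = start, 0
    while unvisited:
        nxt = pick(unvisited, key=lambda v: dist[current][v])
        total += dist[current][nxt]
        unvisited.remove(nxt)
        current = nxt
    return total

def tour_difference(dist, start_near, start_far):
    """Farthest-first tour length minus nearest-first tour length."""
    return (greedy_tour_length(dist, start_far, max)
            - greedy_tour_length(dist, start_near, min))

dist = [
    [0, 1, 2, 3],
    [1, 0, 4, 5],
    [2, 4, 0, 6],
    [3, 5, 6, 0],
]
scaled = [[3 * d for d in row] for row in dist]
print(tour_difference(dist, 0, 0), tour_difference(scaled, 0, 0))
```

The sketch works on any symmetric matrix with unique off-diagonal entries, so it can also serve as the validity check inside a backtracking or random search.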
I'm crossposting this from the mathematics stack exchange at the suggestion of one user who thought somebody here with experience in embedding algorithms might be able to help, though it should be noted that I'm not trying to do a strict graph embedding (which would not allow for vertices to intersect).
Does anybody know of some algorithmic way to tell if it is possible to plot a set of distance-constrained points on a Cartesian plane? Or, better still, a method to determine the minimum number of dimensions required to accurately depict the points?
As an example: If you have three points and a constraint that says they are all one unit away from each other, you can plot this easily on a cartesian plane as an equilateral triangle.
However, if you have the constraints A->B = 1, A->C = 1, and B->C = 3 then you will not be able to plot these points while maintaining their distances.
However in my case I have a graph with many more than three vertices. The graph is definitely non-planar: one such case involves 1407 vertices all of which are connected by a weighted bidirectional edge that defines the "distance" between the two vertices.
The question is, is there some way to tell if I can depict this graph with accurate distances on a cartesian plane. I know I can't depict it without edges crossing, but I don't care about doing that. I just want the points on the plane an appropriate distance from each other.
Additional information about the graph in case it helps:
1) Each node represents a set of points. 2) The edge weights are derived by optimally overlaying the point sets from each pair of nodes and then taking the RMSD of the resulting point sets. 3) The sets of points represented by any two nodes can be paired with each other. That is, we can think of each node as a set of 8 points numbered 1-8. This numbering is static. When I overlay node A and node B, the points are numbered identically to when I overlay A and C and B and C.
My thoughts: Because RMSD is a metric on R^3 (At least I believe so. This paper claims to prove it http://onlinelibrary.wiley.com/doi/10.1107/S0108767397010325/abstract), it should be possible for me to do this in R^3 at the very least.
As my real goal here is to turn this set of points into a nice figure, a three dimensional depiction would actually suffice, as I could depict the 3D figure in 2D. I also recognize that numerical instability in the particular optimal overlay algorithm I'm using will cause issues, but I'm interested in the answer for an ideal case.
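For the ideal (noise-free) case, one standard test is the Gram matrix criterion used in classical multidimensional scaling: a full distance matrix is realizable in R^k exactly when the Gram matrix built from it is positive semidefinite with rank at most k. The sketch below is not from the post; it assumes a complete symmetric matrix `dist`, and the function name and tolerance are made up. It finds the rank with a pivoted elimination in plain Python. Note that satisfying the triangle inequality is necessary for an embedding but not sufficient.

```python
def embedding_dimension(dist, tol=1e-9):
    """Smallest k such that the distances fit in R^k, or None if no
    Euclidean embedding exists (up to the tolerance)."""
    n = len(dist)
    # Gram matrix of the vectors from point 0 to every other point.
    g = [[(dist[0][i] ** 2 + dist[0][j] ** 2 - dist[i][j] ** 2) / 2.0
          for j in range(1, n)] for i in range(1, n)]
    eps = tol * max((abs(v) for row in g for v in row), default=0.0)
    remaining = list(range(n - 1))
    rank = 0
    while remaining:
        p = max(remaining, key=lambda i: g[i][i])
        pivot = g[p][p]
        if pivot <= eps:
            # No positive pivot left: the rest must vanish, otherwise
            # the Gram matrix is not positive semidefinite.
            if any(abs(g[i][j]) > eps for i in remaining for j in remaining):
                return None
            break
        remaining.remove(p)
        for i in remaining:
            for j in remaining:
                g[i][j] -= g[i][p] * g[p][j] / pivot
        rank += 1
    return rank

unit_triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
impossible = [[0, 1, 1], [1, 0, 3], [1, 3, 0]]
print(embedding_dimension(unit_triangle), embedding_dimension(impossible))
```

With measured RMSD values the small pivots will be noisy, so the tolerance would need tuning. For drawing a figure, a multidimensional scaling routine from a numerical library is probably the more practical route.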
I'm wondering about Manhattan distance. It is very specific, and (I don't know if it's a good word) simple. For example when we are given a set of n points in this metric, then it is very easy to find the distance between two farthest points, in linear time. But is it also easy to find two closest points?
I heard that there exists a universal algorithm for finding the two closest points in any metric, but that it's complicated. I'm wondering if in this situation (Manhattan metric) it is possible to use special properties of this distance and come up with an easier algorithm that is friendlier to implement.
EDIT: n points on a plane, and let's say -10^9 <= x,y <= 10^9 for all points.
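For reference, the linear-time farthest-pair computation mentioned in the question can be sketched like this. It relies on the identity |dx| + |dy| = max(|du|, |dv|) with u = x + y and v = x - y; the function name is made up.

```python
def max_manhattan_distance(points):
    """Largest Manhattan distance between any two points, in O(n)."""
    sums = [x + y for x, y in points]
    diffs = [x - y for x, y in points]
    return max(max(sums) - min(sums), max(diffs) - min(diffs))

pts = [(0, 0), (3, 4), (-1, 2), (5, -5)]
print(max_manhattan_distance(pts))
```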
Assuming you're talking about n points on a plane, find the minimal and maximal values of the x and y coordinates. Create a matrix of size (maxX-minX+1) x (maxY-minY+1), so that every point is representable by a cell in the matrix. Fill the matrix with the n given points (not all cells will be filled; set NaN there, for example). Scan the matrix: the shortest distance is between adjacent filled cells (there might be several such pairs).
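With coordinates up to 10^9 in magnitude, such a matrix would be far too large to allocate. As an alternative sketch (a different technique from the matrix scan above), the points can be sorted by x and each point compared only with later points whose x offset is still smaller than the best distance found so far. This is simple to implement but quadratic in the worst case, for example when all points share the same x. The function name is hypothetical.

```python
def closest_manhattan_pair(points):
    """Return (distance, (p, q)) for a closest pair under the
    Manhattan metric, using a sort-by-x sweep."""
    pts = sorted(points)
    best = float("inf")
    best_pair = None
    for i in range(len(pts)):
        x1, y1 = pts[i]
        for j in range(i + 1, len(pts)):
            x2, y2 = pts[j]
            # The x offset alone is a lower bound on the distance.
            if x2 - x1 >= best:
                break
            d = (x2 - x1) + abs(y2 - y1)
            if d < best:
                best, best_pair = d, ((x1, y1), (x2, y2))
    return best, best_pair

pts = [(0, 0), (10, 10), (3, 1), (-5, 7), (10**9, -10**9), (4, 3)]
print(closest_manhattan_pair(pts))
```

The classic divide-and-conquer closest-pair algorithm brings this down to O(n log n) and also works for the Manhattan metric.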