I have a collection of points in some N-dimensional space, where all I know is the distances between them. Let's say it's an unordered collection of structs like the following:
struct {
int first; // Just some identifier that uniquely specifies a point
int second; // No importance to which point is first or second
float separation; // The distance between the first and second points -- always positive
};
Of course the algorithm doesn't have to be C code. I just wrote the struct in this style to make the problem clear. It rather upsets me that the struct spoils the symmetry between the two end-points, but fixing this just makes things more complicated.
Let's say that the separations are defined by the Pythagorean distance between them, and the space is Euclidean. Let's also specify that the separations are internally consistent. For example, given separations AB, BC and AC, we know that AB + BC >= AC.
I want an algorithm that finds the minimal dimensional space that can contain all the points. Within this algorithm, we can assume that separations that deviate from that defined by the space by less than some specified tolerance can be ignored.
Does anyone know an algorithm that does this? So far, I've only been able to think up non-polynominal algorithms. Can anybody improve on that, or at least make something that is clean and extensible?
Why is this interesting? In Physics there are some low-level theories such as String Theory or Quantum Loop Gravity that do not obviously predict our three dimensional world. This algorithm could be part of a project to find how a 3d world can be emergent.
Thank you everybody who posted ideas here. I now have an answer to my own question. It's not great, in that it executes O(n^3) but at least it's polynomial. Roughly, it works like this:
Represent the problem as a symmetric matrix with zero diagonal -- representing the distances between any two points. This is equivalent to the representation using structs, but much easier to work with.
Assume the ordering of the points implied by the matrix (first column/row = first point) is sensible. (It may be worth pivoting to find a better ordering, but that is todo.)
Now create a rectangular coordinate system to fit the points, starting with the first point, which WLOG we take to be the origin.
Second point defines the x axis
For each subsequent point, we calculate its coordinates one at a time, starting with the x axis. We know the distance from the origin and the distance from point 2. This allows us to calculate the x coordinate, as we end up with two simultaneous equations x^2 + y^2 + ... = s1^2 and (x - x2)^2 + y^2 + ... = s2^2, which allows us to calculate x easily from x2, the x coordinate of point 2, and the distances from points 1 and 2, s1 and s2.
Each new coordinate can be calculated easily, because the matrix of coordinates calculated so far is triangular -- there is only one unknown each time.
The last coordinate for each point is on a new axis -- a dimension that has not yet been used. Calculate its coordinate using Pythagoras on the distance from the origin, as we know all the other coordinates.
It is possible that the coordinate on the new axis will come out imaginary -- a general set of distances cannot always be represented by a coordinate system of any number of dimensions -- at least not with real numbers. If this is the case, I error.
Keep going in this way for each new point, building up a vector of coordinate vectors for each point. In general, this is triangular, but there may be cases where the final coordinate we calculate is near enough to zero that we consider the point's position to be represented by the existing dimensions. I store the coordinates anyway, but keep the number of dimensions the same as the previous point. I also skip these points, as they are not needed for calculating further points (see step 10).
Finally, we have represented all points such that the distances are consistent.
As a final check, I validate that the distances match for all points, including those skipped in step 9.
The number of dimensions needed is the number used for the last point.
If anyone is interested in an implementation of this (in Haskell), it is on my GitHub page at https://github.com/MarcusRainbow/EmergentDimensions/coords.hs.
Related
I have a question.
I have N objects and N x N matrix M. Each entry M(i, j) contains (a kind of) relative gravitational force indicating how strongly i pulls j toward it (or inversely pull it away from it).
I want to place these N objects on a two-dimensional R x R plane by assigning a coordinate to each object.
Is there an algorithm/method that does this? There must be some commonly methods used in astrophysics, physics, chemistry, etc.
Thank you for your help.
You are interested in assigning co-ordinates (xi,yi,zi) and mass (mi) to each object such that gravitational force is consistent, right?
Consider 8 points at a time. You have a total of 32 unknowns and 28 equations. You can assume that first point is at origin and second point on x axis. That means, you will have 28 unknown and 28 equations.
So, first device and algorithm to solve for 8 points at a time. Then incrementally add one point at each iteration.
===Walkthrough===
Consider you are given n points in D dimensions. You only have distances between the points, but not the co-ordinates. Goal is to find co-ordinates for each point.
If D=1, you need to consider only two (+1) points at a time. Place first point at origin. Place second point on the positive side of origin. You can place third point in relation to origin, but place it right or left of it depending on the distance to first point and so on...
If D=2, place point 1 at origin, point to on positive side on x axis, third point on positive side of y axis depending on distance. From fourth point onward, you can use any two placed points to place the next and use any other point to refine the options (there will be two options).
Similar with D=3. Place first three points on xy plane (z=0) for all three. Next, place 4th point ofn postive part of z axist. And so on.
Coming back to gravity:
Your problem is complicated because you cannot exactly place mass at the origin. So you would need more than 5 points to place them. As I have shown above, you need at most 8 points though.
In case your mass are all equal, you can calculate distance (~inverse of gravity) and apply the case when D=3.
The problem is, given that we know the n*n distances between n objects, how to obtain their positions?
1. Put the first one, say a, at (0,0)
2. Put the second one b at ( |b-a|, 0 )
3. For the third one c, it is at the one of the two intersections of the two circles:
|p-a|=|c-a| and |p-b|=|c-b|.
Solve this system of quadratic equations using the well-known formula, choose
either of the solutions as the position of c.
4. For any other points p, do the same thing as we're done for c, but choose one of the
two solutions that is consistent with the distance |p-c|. And check the distance
between p and all previous points. If the check fails, return with failure.
I have read about DDA. But I just came across the term symmetric DDA. What is it ? How is it different from DDA ?
The DDA (Digital Differential Analyzer) algorithm is used to find out interpolating points between any given two points, linearly (i.e. straight line). Now since this is to be done on a digital computer - speed is an important factor.
The equation of a straight line is given by m=Δx/Δy eq(i), where Δx = x(2)-x(1) & Δy = y(2)-y(1),now using this equation we could compute successive points that lie on the line. But then this is the discrete world of raster graphics - so we require integral coordinates.
In simple DDA eq(i) is transformed to m=eΔx/eΔy where e, call it the increment factor, is a positive real number. since putting the same number in numerator and denominator does not change anything - but if suitably chosen - it can help us in generating discrete points thereby reducing the overload of having to round off the resultant points.
Basically what we need to do is: increment the coordinates by a fixed small amount, beginning from the starting point, and each time we have a new point progressing towards the end point.
In simple DDA - e is chosen as 1/max(|Δx|,|Δy|) such that one of the coordinate is integral and only the other coordinate has to be rounded. i.e. P(i+1) = P(i)+(1,Round(e*Δy)) here one coordinate is being incremented by 1 and the other by e*Δy
In symmetric DDA - e is chosen such that though both the co-ordinates of the resultant points has to be rounded off, it can be done so very efficiently, thus quickly.
Specifically e is chosen as 1/2^n where 2^(n-1) <= max(|Δx|,|Δy|) < 2^n. In other words the length of the line is taken to be 2^n aligned. The increments for the two coordinates are e*Δx and e*Δy. With suitably chosen initial fraction part of the beginning coordinates: this causes the points to be generated as mixed fractions whose fractional parts are in a cyclic series, i.e. they repeat over a small length. The resultant coordinates can thus easily be rounded off based on two fixed length look-up tables, one for each coordinate.
refer http://w3.msi.vxu.se/~gsu/DAB726-Ht06/Symm-DDA.pdf for an example.
Notice the cyclic repetition in the fractional part of the resultant coordinates.
Given a 3d heightmap (from a laser scanner), how do I find the saddle points?
I.e. given something like this:
I am looking for all points where the curvature is positive in one direction and negative in the other.
(These directions should not need to be aligned with the X and Y axis.
I know how to check whether the curvature in X direction has the opposite sign as the curvature in Y direction, but that does not cover all cases. To make matters worse, the resolution in X is different from the resolution in Y)
Ideally I am looking for an algorithm that can tolerate some amount of noise and only mark "significant" saddle points.
I've been exploring a similar problem for a computational topology class and have had some success with the method outlined below.
First you will need a comparison function that will evaluate the height at two input points and will return < or > (not equal) for any input. One way to do this is that if the points are equal height you use some position-based or random index to find the greater point. You can think of this as adding an infinitesimal perturbation to the height.
Now, for each point, you will compare the height at all the surrounding neighbors (there will be 8 neighbors on a 2D rectangular grid). The lower link for a point will be the set of all neighbors for which the height is less than the point.
If all the neighboring values are in the lower link, you are at a local maximum. If none of the points are in the lower link you are at a local minimum. Otherwise, if the lower link is a single connected set, you are at a regular point on a slope. But if the lower link is two unconnected sets, you are at a saddle.
In 2D you can construct a list of the 8 neighboring point in cyclic order around the point you are checking. You assign a value of +/-1 for each neighbor depending on your comparison function. You can then step through that list (remember to compare the two end points) and count how many times the sign changes to determine the number of connected components in the lower link.
Determining which saddles are "important" is a more difficult analysis. You may wish to look at this: http://www.cs.jhu.edu/~misha/ReadingSeminar/Papers/Gyulassy08.pdf for some guidance.
-Michael
(From a guess at the maths rather than practical experience)
Fit a quadratic to the surface in a small patch around each candidate point, e.g. with least squares. How big the patch is is one way of controlling noise, and you might gain by weighting points depending on their distance from the candidate point. In matrix notation, you can represent the quadratic as x'Ax + b'x + c, where A is symmetric.
The quadratic will have zero gradient at x = (A^-1)b/2. If this not within the patch, discard it.
If A has both +ve and -ve eigenvalues you have a saddle point at x. Since A is only 2x2 and so has at most two eigenvalues, you can ignore the case when it as a zero eigenvalue and so you couldn't invert it at the previous stage.
Problem
Given a set of known cartesian points (set A), and a 2d transformation (rotation, translation, scale) of some subset of those points (set B), find the orientation of the subset (rotation, translation, scale) relative to the original set of points.
I.E. Suppose I take a "picture" of a known set of 2d points on a wall. I want to know what position the camera was in relative to "upright and centered" when the picture was taken. Some of the points may not be visible in the picture (they may be occluded). (in this analogy, assume the camera is orthoganal and always pointed directly at the plane of the wall, so you don't need to take distortion or perspective into account)
Proposed approach:
Step 1: Scale B to the same "range" as A
Don't know how; open to suggestions. Maybe take the area of a convex hull around all the points in B, and scale it to nearly that of the convex hull around A. This is tricky, because points may be missing from B.
Step 2: Match some arbitrary point in "B" to its twin in "A"
Pick some random point in set B. Call this point K. Somehow take a "fingerprint" of K relative to all the other points in B (using distance only). Find its match in A by fingerprinting all points in A and taking the point with the most similar fingerprint of K.
Step 3: Rotate B (around K) until all points in B are aligned with a point in A
Multiple solutions are possible, so keep rotating though 360d looking for solutions.
That's just shooting from the hip, I may be way off base. Anyone have any ideas?
Assuming you don't actually know the correspondence between the points in the two clouds, you could try a statistical approach.
First, compute the mean x0 of the original cloud, then compute the mean x1 of the subset cloud. The difference of the mean vectors, x1-x0, is a good estimate of the required translation.
Now, subtract the relevant mean vector from each set to give two clouds centered at the origin. Compute the covariance matrix for each cloud and find its eigenvalues and eigenvectors. The required rotation can be found from the eigenvectors, while the scaling corresponds to the eigenvalues.
Compose all of this and you should have a good statistical estimate of the desired transform. Obviously, its quality will be a function of how well the subset spans the original set.
"Give me a place to stand on, and I will move the Earth" Archimede
I think we should follow the steps of Archimede
Arpi's algoritm:
We must choose a point (X1) of set A with coordinates (0, 0). (this will be the place to stand on)
Choose another point (X2) and put it on the OX vector (to simplify things)
All the other points' coordinates from set A will be calculated based on the coordinates of X1(0, 0) and X2(some_Coordinate, 0).
Now, choose a point from set B (Y1) and that will be the center of the B set. Choose another point from set B (Y2) and put it to OX of the B set. Now, we have a scale scalar and a rotation angle. If this will be a solution, than Y1 in the B set represents X1 from the A set and Y2 from the B set represents X2 from the A set. If we can find a map between the B set and A set based on this, using all the points of the B set and Yi <> Yj if i <> j, where i and j are the indexes of the points in our representation than we have a potential solution and we store that.
End of Arpi's algoritm
To find all the potential solutions you must do the following:
foreach point in A as X1 do
foreach point in A as X2 do
arpi's algoritm(X1, X2)
Of course, you can optimize this, but for the sake of simplicity I described it without optimizations (complications), it will be your job to optimize this and only if you need that.
I would attempt to minimize the deviation between the target points and the found points. Meaning I would pair each target point with a found point, and apply any transformation (rotation, scale or skew) to all the target points which decreases the sum of the deviations. I would repeat this for all potential pairs, eventually taking the match to be the set of pairs and the necessary transformations with the smallest total deviation.
The real question is how you optimize this so the performance to be better than O(n^2). I suppose some sort of heuristic matching, perhaps caching the intermediary results, or finding a method of eliminating some pairs earlier in the process.
I have two shapes which are cross sections of a channel. I want to calculate the cross section of an intermediate point between the two defined points.
What's the simplest (relatively simple?) algorithm to use in this situation?
P.S.: I came across several algorithms like natural neighbor and poisson, which seemed complex. I'm looking for a simple solution, which could be implemented quickly.
EDIT: I removed the word "Simplest" from the title since it might be misleading
This is simple:
On each cross section draw N points at evenly spaced intervals along the boundary of the cross-section.
Draw straight lines from the n-th point on cross-section 1 to the n-th point on cross-section 2.
Take off your new cross-section at the desired distance between the old cross-sections.
Simpler still:
Use one of the existing cross-sections without modification.
This second suggestion might be too simple I suppose, but I bet no-one suggests a simpler one !
EDIT following OP's comment: (too much for a re-comment)
Well, you did ask for a simple method ! I'm not sure I see the same problem with the first method as you do. If the cross sections are not too weird (probably best if they are convex polygons) and you don't do anything strange such as map the left side of one cross-section to the right side of the other (thereby forcing lots of crossing lines) then the method should produce some kind of sensible cross section. In the case you suggest of a triangle and a rectangle, suppose the triangle is sitting on its base, one vertex at the top. Map that point to, say, the top left corner of the rectangle, then proceed in the same direction (clockwise or anti-clockwise) around the boundaries of both cross-sections joining corresponding points. I don't see any crossing lines, and I see a well-defined shape at any distance between the two cross-sections.
Note there are some ambiguities about High Performance Mark's answers you will probably need to address and will define the quality of the output of his method. The most important one is, when you draw the n points on both cross-sections, what sort of correspondence do you determine between them, that is if you do it that way High Performance Mark suggested, then the order of labeling the points becomes important.
I suggest rotating (orthogonal) plane simultaneously through both cross sections, then the set of points which intersect that plane on one cross section just need to be matched to the set of points that intersect that plane on the other cross section. Hypothetically, there is no limit on the number of points in these sets, but it certainly reduces the complexity of the correspondence problem in the original situation.
Here is another try at the problem, which I think is a much better attempt.
Given the two cross-sections C_1, C_2
Place each C_i into a global reference frame with coordinate system (x,y) so that the way they are relatively situated makes sense. Split each C_i into an upper and lower curve U_i and L_i. The idea is going to be that you will want to continuously deform curve U_1 to U_2 and L_1 to L_2. (Note you can extend this method to split each C_i into m curves if you wish.)
The way to do this is as follows. For each T_i = U_i, or L_i sample n points, and determine the interpolating polynomial P{T_i}(x). As some one duly noted below, interpolating polynomials are susceptible to oscillation especially at the endpoints. Instead of the interpolating polynomial, one may instead use the least squares fit polynomial which would be much more robust. Then define the deformation of the polynomial P{U_1}(x) = a_0 + a_1 * x + ... + a_n * x^n to P{U_2}(x) = b_0 + b_1 * x + ... + b_n * x^n as Q{P{U_1},P{U_2}}(x, t) = ( t * a_0 + (1 - t ) b_0 ) + ... + (t * a_n + (1-t) * b_n ) * x^n where the deformation Q is defined over 0<=t<=1 where t defines at which point the deformation is at (i.e. at t=0 we are at U_2 and at t=1 we are at U_1 and at every other t we are at some continuous deformation of the two.)
The exact same follows for Q{P{L_1},P{L_2}}(x, t). These two deformations construct you a continuous representation between the two cross-sections which you can sample at any t.
Note all this is really doing is linearly interpolation the coefficients of the interpolation polynomials of the two pieces of both cross-sections. Note also when spliting the cross-sections you should probably put the constraint that they must be split at end points that match up otherwise you may have "holes" in your deformation.
I hope thats clear.
edit: addressed the issue of oscillation in interpolating polynomials.