Calculate Mapping of Nearest Points of 2 matrices - algorithm

I have two matrices A and B. Each of them has 2 columns having the coordinates of a point ( x , y ).
I need to compute a mapping of points from A to B such that the points have least euclidean distance among them.
Essentially I am trying to emulate what sift does on images but will not carry out the steps that sift does for matching the points...
Thus for all points in A, I compute euclidean distance with all points in B and then remove the mapping of 2 points which have the least distance. Then i continue to do this until A and B are both empty.
Could someone tell me what could be the most efficient way of doing this ?
EDIT
Can somebody help me ... The issue I am facing is that I need to compute all v/s all distances before selecting the minimum of them as the first mapping. Then I need to do this all over again making the computation really long...
Is there any way this can be done efficiently in MATLAB ?

Are you referring to the Procrustes distance between the two different configurations of points? If so, Matlab has a built-in function that computes the smallest-norm transformation that brings the points into alignment (this is the Procrustes distance).
See this documentation for how to use it. If you don't have the Statistics Toolbox, then you should check the Matlab Central File Exchange first to see if anyone's written a non-toolbox version of the procrustes() function before seeking to write your own.

Related

Good algorithm for finding subsets of point sets

I'm trying to find suitable algorithms for searching subsets of 2D points in larger set.
A picture is worth thousand words, so:
Any ideas on how one could achieve this? Note that the transformations are just rotation and scaling.
It seems that the most closely problem is Point set registration [1].
I was experimenting with CPD and other rigid and non-rigid algorithms' implementations, but they don't seem to perform
too well on finding small subsets in larger sets of points.
Another approach could be using star tracking algorithms like the Angle method mentioned in [2]
or more robust methods like [3]. But again, they all seem to be meant for large input sets and target sets. I'm looking for something less reliable but more minimalistic...
Thanks for any ideas!
[1]: http://en.wikipedia.org/wiki/Point_set_registration
[2]: http://www.acsu.buffalo.edu/~johnc/star_gnc04.pdf
[3]: http://arxiv.org/abs/0910.2233
here's some papers probably related to your question:
Geometric Pattern Matching under Euclidean Motion (1993) by L. Paul Chew , Michael T. Goodrich , Daniel P. Huttenlocher , Klara Kedem , Jon M. Kleinberg , Dina Kravets.
A fast expected time algorithm for the 2-D point pattern (2004) by Wamelena, Iyengarb.
Simple algorithms for partial point set pattern matching under rigid motion (2006) by Bishnua, Dasb, Nandyb, Bhattacharyab.
Exact and approximate Geometric Pattern Matching for point sets in the plane under similarity transformations (2007) by Aiger and Kedem.
and by the way, your last reference reminded me of:
An Application of Point Pattern Matching in Astronautics (1994) by G. Weber, L. Knipping and H. Alt.
I think you should start with a subset of the input points and determine the required transformation to match a subset of the large set. For example:
choose any two points of the input, say A and B.
map A and B to a pair of the large set. This will determine the scale and two rotation angles (clockwise or counter clockwise)
apply the same scaling and transformation to a third input point C and check the large set to see if a point exists there. You'll have to check two positions, one for each of rotation angle. If the point C exists where it should be in the large set, you can check the rest of the points.
repeat for each pair of points in the large set
I think you could also try to match a subset of 3 input points, knowing that the angles of a triangle will be invariant under scaling and rotations.
Those are my ideas, I hope they help solve your problem.
I would try the Iterative Closest Point algorithm. A simple version like the one you need should be easy to implement.
Take a look at geometric hashing. It allows finding geometric patterns under different transformations. If you use only rotation and scale, it will be quite simple.
The main idea is to encode the pattern in "native" coordinates, which is invariant under transformations.
You can try a geohash. Translate the points to a binary and interleave it. Measure the distance and compare it with the original. You can also try to rotate the geohash, i.e. z-curve or morton curve.

algorithm to create bounding rectangles for 2D points

The input is a series of point coordinates (x0,y0),(x1,y1) .... (xn,yn) (n is not very large, say ~ 1000). We need to create some rectangles as bounding box of these points. There's no need to find the global optimal solution. The only requirement is if the euclidean distance between two point is less than R, they should be in the same bounding rectangle. I've searched for sometime and it seems to be a clustering problem and K-means method might be a useful one.
However, the input point coordinates didn't have specific pattern from time to time. So it maybe not possible to set a specific K in K-mean. I am wondering if there is any algorithm or method possible to solve this problem?
The only requirement is if the euclidean distance between two point is less than R, they should be in the same bounding rectangle
This is the definition of single-linkage hierarchical clustering cut at a height of R.
Note that this may yield overlapping rectangles.
For much faster and highly efficient methods, have a look at bulk loading strategies for R*-trees, such as sort-tile-recursive. It won't satisfy your "only" requirement above, but it will yield well balanced, non-overlapping rectangles.
K-means is obviously not appropriate for your requirements.
With only 1000 points I would do the following:
1) Work out the difference between all pairs of points. If the distance of a pair is less than R, they need to go in the same bounding rectangle, so use http://en.wikipedia.org/wiki/Disjoint-set_data_structure to record this.
2) For each subset that comes out of your Disjoint set data structure, work out the min and max co-ordinates of the points in it and use this to create a bounding box for the points in this subset.
If you have more points or are worried about efficiency, you will want to make stage (1) more efficient. One easy way would be to go through the points in order of x co-ordinate, keeping only points at most R to the left of the most recent point seen, and using a balanced tree structure to find from these the points at most R above or below the most recent point seen, before calculating the distance to the most recent point seen. One step up from this would be to create a spatial data structure to get yet more efficiency in finding pairs with distance R of each other.
Note that for some inputs you will get just one huge bounding box because you have long chains of points, and for some other inputs you will get bounding boxes inside bounding boxes, for instance if your points are in concentric circles.

K nearest neighbour search with weights on dimensions

I have a floor on which various sensors are placed at different location on the floor. For every transmitting device, sensors may detect its readings. It is possible to have 6-7 sensors on a floor, and it is possible that a particular reading may not be detected by some sensors, but are detected by some other sensors.
For every reading I get, I would like to identify the location of that reading on the floor. We divide floor logically into TILEs (5x5 feet area) and find what ideally the reading at each TILE should be as detected by each sensor device (based on some transmission pathloss equation).
I am using the precomputed readings from 'N' sensor device at each TILE as a point in N-dimensional space. When I get a real life reading, I find the nearest neighbours of this reading, and assign this reading to that location.
I would like to know if there is a variant of K nearest neighbours, where a dimension could be REMOVED from consideration. This will especially be useful, when a particular sensor is not reporting any reading. I understand that putting weightage on a dimension will be impossible with algorithms like kd-tree or R trees. However, I would like to know if it would be possible to discard a dimension when computing nearest neighbours. Is there any such algorithm?
EDIT:
What I want to know is if the same R/kd tree could be used for k nearest search with different queries, where each query has different dimension weightage? I don't want to construct another kd-tree for every different weightage on dimensions.
EDIT 2:
Is there any library in python, which allows you to specify the custom distance function, and search for k nearest neighbours? Essentially, I would want to use different custom distance functions for different queries.
Both for R-trees and kd-trees, using weighted Minkowski norms is straightforward. Just put the weights into your distance equations!
Putting weights into Eulidean point-to-rectangle minimum distance is trivial, just look at the regular formula and plug in the weight as desired.
Distances are not used at tree construction time, so you can vary the weights as desired at query time.
After going through a lot of questions on stackoverflow, and finally going into details of scipy kd tree source code, I realised the answer by "celion" in following link is correct:
KD-Trees and missing values (vector comparison)
Excerpt:
"I think the best solution involves getting your hands dirty in the code that you're working with. Presumably the nearest-neighbor search computes the distance between the point in the tree leaf and the query vector; you should be able to modify this to handle the case where the point and the query vector are different sizes. E.g. if the points in the tree are given in 3D, but your query vector is only length 2, then the "distance" between the point (p0, p1, p2) and the query vector (x0, x1) would be
sqrt( (p0-x0)^2 + (p1-x1)^2 )
I didn't dig into the java code that you linked to, but I can try to find exactly where the change would need to go if you need help.
-Chris
PS - you might not need the sqrt in the equation above, since distance squared is usually equivalent."

Algorithm: Find 2d orientation from constellation of known points?

Problem
Given a set of known cartesian points (set A), and a 2d transformation (rotation, translation, scale) of some subset of those points (set B), find the orientation of the subset (rotation, translation, scale) relative to the original set of points.
I.E. Suppose I take a "picture" of a known set of 2d points on a wall. I want to know what position the camera was in relative to "upright and centered" when the picture was taken. Some of the points may not be visible in the picture (they may be occluded). (in this analogy, assume the camera is orthoganal and always pointed directly at the plane of the wall, so you don't need to take distortion or perspective into account)
Proposed approach:
Step 1: Scale B to the same "range" as A
Don't know how; open to suggestions. Maybe take the area of a convex hull around all the points in B, and scale it to nearly that of the convex hull around A. This is tricky, because points may be missing from B.
Step 2: Match some arbitrary point in "B" to its twin in "A"
Pick some random point in set B. Call this point K. Somehow take a "fingerprint" of K relative to all the other points in B (using distance only). Find its match in A by fingerprinting all points in A and taking the point with the most similar fingerprint of K.
Step 3: Rotate B (around K) until all points in B are aligned with a point in A
Multiple solutions are possible, so keep rotating though 360d looking for solutions.
That's just shooting from the hip, I may be way off base. Anyone have any ideas?
Assuming you don't actually know the correspondence between the points in the two clouds, you could try a statistical approach.
First, compute the mean x0 of the original cloud, then compute the mean x1 of the subset cloud. The difference of the mean vectors, x1-x0, is a good estimate of the required translation.
Now, subtract the relevant mean vector from each set to give two clouds centered at the origin. Compute the covariance matrix for each cloud and find its eigenvalues and eigenvectors. The required rotation can be found from the eigenvectors, while the scaling corresponds to the eigenvalues.
Compose all of this and you should have a good statistical estimate of the desired transform. Obviously, its quality will be a function of how well the subset spans the original set.
"Give me a place to stand on, and I will move the Earth" Archimede
I think we should follow the steps of Archimede
Arpi's algoritm:
We must choose a point (X1) of set A with coordinates (0, 0). (this will be the place to stand on)
Choose another point (X2) and put it on the OX vector (to simplify things)
All the other points' coordinates from set A will be calculated based on the coordinates of X1(0, 0) and X2(some_Coordinate, 0).
Now, choose a point from set B (Y1) and that will be the center of the B set. Choose another point from set B (Y2) and put it to OX of the B set. Now, we have a scale scalar and a rotation angle. If this will be a solution, than Y1 in the B set represents X1 from the A set and Y2 from the B set represents X2 from the A set. If we can find a map between the B set and A set based on this, using all the points of the B set and Yi <> Yj if i <> j, where i and j are the indexes of the points in our representation than we have a potential solution and we store that.
End of Arpi's algoritm
To find all the potential solutions you must do the following:
foreach point in A as X1 do
foreach point in A as X2 do
arpi's algoritm(X1, X2)
Of course, you can optimize this, but for the sake of simplicity I described it without optimizations (complications), it will be your job to optimize this and only if you need that.
I would attempt to minimize the deviation between the target points and the found points. Meaning I would pair each target point with a found point, and apply any transformation (rotation, scale or skew) to all the target points which decreases the sum of the deviations. I would repeat this for all potential pairs, eventually taking the match to be the set of pairs and the necessary transformations with the smallest total deviation.
The real question is how you optimize this so the performance to be better than O(n^2). I suppose some sort of heuristic matching, perhaps caching the intermediary results, or finding a method of eliminating some pairs earlier in the process.

Is there an efficient algorithm to generate a 2D concave hull?

Having a set of (2D) points from a GIS file (a city map), I need to generate the polygon that defines the 'contour' for that map (its boundary). Its input parameters would be the points set and a 'maximum edge length'. It would then output the corresponding (probably non-convex) polygon.
The best solution I found so far was to generate the Delaunay triangles and then remove the external edges that are longer than the maximum edge length. After all the external edges are shorter than that, I simply remove the internal edges and get the polygon I want. The problem is, this is very time-consuming and I'm wondering if there's a better way.
One of the former students in our lab used some applicable techniques for his PhD thesis. I believe one of them is called "alpha shapes" and is referenced in the following paper:
http://www.cis.rit.edu/people/faculty/kerekes/pdfs/AIPR_2007_Gurram.pdf
That paper gives some further references you can follow.
This paper discusses the Efficient generation of simple polygons for characterizing the shape of a set of points in the plane and provides the algorithm. There's also a Java applet utilizing the same algorithm here.
The guys here claim to have developed a k nearest neighbors approach to determining the concave hull of a set of points which behaves "almost linearly on the number of points". Sadly their paper seems to be very well guarded and you'll have to ask them for it.
Here's a good set of references that includes the above and might lead you to find a better approach.
The answer may still be interesting for somebody else: One may apply a variation of the marching square algorithm, applied (1) within the concave hull, and (2) then on (e.g. 3) different scales that my depend on the average density of points. The scales need to be int multiples of each other, such you build a grid you can use for efficient sampling. This allows to quickly find empty samples=squares, samples that are completely within a "cluster/cloud" of points, and those, which are in between. The latter category then can be used to determine easily the poly-line that represents a part of the concave hull.
Everything is linear in this approach, no triangulation is needed, it does not use alpha shapes and it is different from the commercial/patented offering as described here ( http://www.concavehull.com/ )
A quick approximate solution (also useful for convex hulls) is to find the north and south bounds for each small element east-west.
Based on how much detail you want, create a fixed sized array of upper/lower bounds.
For each point calculate which E-W column it is in and then update the upper/lower bounds for that column. After you processed all the points you can interpolate the upper/lower points for those columns that missed.
It's also worth doing a quick check beforehand for very long thin shapes and deciding wether to bin NS or Ew.
A simple solution is to walk around the edge of the polygon. Given a current edge om the boundary connecting points P0 and P1, the next point on the boundary P2 will be the point with the smallest possible A, where
H01 = bearing from P0 to P1
H12 = bearing from P1 to P2
A = fmod( H12-H01+360, 360 )
|P2-P1| <= MaxEdgeLength
Then you set
P0 <- P1
P1 <- P2
and repeat until you get back where you started.
This is still O(N^2) so you'll want to sort your pointlist a little. You can limit the set of points you need to consider at each iteration if you sort points on, say, their bearing from the city's centroid.
Good question! I haven't tried this out at all, but my first shot would be this iterative method:
Create a set N ("not contained"), and add all points in your set to N.
Pick 3 points from N at random to form an initial polygon P. Remove them from N.
Use some point-in-polygon algorithm and look at points in N. For each point in N, if it is now contained by P, remove it from N. As soon as you find a point in N that is still not contained in P, continue to step 4. If N becomes empty, you're done.
Call the point you found A. Find the line in P closest to A, and add A in the middle of it.
Go back to step 3
I think it would work as long as it performs well enough — a good heuristic for your initial 3 points might help.
Good luck!
You can do it in QGIS with this plug in;
https://github.com/detlevn/QGIS-ConcaveHull-Plugin
Depending on how you need it to interact with your data, probably worth checking out how it was done here.
As a wildly adopted reference, PostGIS starts with a convexhull and then caves it in, you can see it here.
https://github.com/postgis/postgis/blob/380583da73227ca1a52da0e0b3413b92ae69af9d/postgis/postgis.sql.in#L5819
The Bing Maps V8 interactive SDK has a concave hull option within the advanced shape operations.
https://www.bing.com/mapspreview/sdkrelease/mapcontrol/isdk/advancedshapeoperations?toWww=1&redig=D53FACBB1A00423195C53D841EA0D14E#JS
Within ArcGIS 10.5.1, the 3D Analyst extension has a Minimum Bounding Volume tool with the geometry types of concave hull, sphere, envelope, or convex hull. It can be used at any license level.
There is a concave hull algorithm here: https://github.com/mapbox/concaveman

Resources