With a truedepth camera, I have measured two sets of three points in 3D space -- basically keypoints on an object that has moved over time. I would like to calculate the 4x4 transformation matrix that matches one set to the other set (which won't be perfect, but want to get as close as possible). The best technique I can come up with is some sort of gradient descent optimization on the terms of the matrix, minimizing the sum of squared distances between the calculated set and target. But I feel like there's a better, more elegant approach out there. Does anyone have any thoughts or pointers? Thanks!
What you probably want is to compute the best rigid transformation matching two sets of points. There is a closed form solution for that. That problem has several names: Kabsch Problem, Whaba Problem, Procrustes Problem and Attitude Estimation problem (with some well known algorithms like QUEST and FOAM)
See:
https://en.m.wikipedia.org/wiki/Kabsch_algorithm
Related
I'm trying to find suitable algorithms for searching subsets of 2D points in larger set.
A picture is worth thousand words, so:
Any ideas on how one could achieve this? Note that the transformations are just rotation and scaling.
It seems that the most closely problem is Point set registration [1].
I was experimenting with CPD and other rigid and non-rigid algorithms' implementations, but they don't seem to perform
too well on finding small subsets in larger sets of points.
Another approach could be using star tracking algorithms like the Angle method mentioned in [2]
or more robust methods like [3]. But again, they all seem to be meant for large input sets and target sets. I'm looking for something less reliable but more minimalistic...
Thanks for any ideas!
[1]: http://en.wikipedia.org/wiki/Point_set_registration
[2]: http://www.acsu.buffalo.edu/~johnc/star_gnc04.pdf
[3]: http://arxiv.org/abs/0910.2233
here's some papers probably related to your question:
Geometric Pattern Matching under Euclidean Motion (1993) by L. Paul Chew , Michael T. Goodrich , Daniel P. Huttenlocher , Klara Kedem , Jon M. Kleinberg , Dina Kravets.
A fast expected time algorithm for the 2-D point pattern (2004) by Wamelena, Iyengarb.
Simple algorithms for partial point set pattern matching under rigid motion (2006) by Bishnua, Dasb, Nandyb, Bhattacharyab.
Exact and approximate Geometric Pattern Matching for point sets in the plane under similarity transformations (2007) by Aiger and Kedem.
and by the way, your last reference reminded me of:
An Application of Point Pattern Matching in Astronautics (1994) by G. Weber, L. Knipping and H. Alt.
I think you should start with a subset of the input points and determine the required transformation to match a subset of the large set. For example:
choose any two points of the input, say A and B.
map A and B to a pair of the large set. This will determine the scale and two rotation angles (clockwise or counter clockwise)
apply the same scaling and transformation to a third input point C and check the large set to see if a point exists there. You'll have to check two positions, one for each of rotation angle. If the point C exists where it should be in the large set, you can check the rest of the points.
repeat for each pair of points in the large set
I think you could also try to match a subset of 3 input points, knowing that the angles of a triangle will be invariant under scaling and rotations.
Those are my ideas, I hope they help solve your problem.
I would try the Iterative Closest Point algorithm. A simple version like the one you need should be easy to implement.
Take a look at geometric hashing. It allows finding geometric patterns under different transformations. If you use only rotation and scale, it will be quite simple.
The main idea is to encode the pattern in "native" coordinates, which is invariant under transformations.
You can try a geohash. Translate the points to a binary and interleave it. Measure the distance and compare it with the original. You can also try to rotate the geohash, i.e. z-curve or morton curve.
I'm developping a tool for radiotherapy inverse planning based in a pencil-beam approach. An important step in these methods (particularly in dose calculation) is a ray-tracing from many sources and one of the most used algorithms is Siddon's one (here there is a nice short description http://on-demand.gputechconf.com/gtc/2014/poster/pdf/P4218_CT_reconstruction_iterative_algebraic.pdf). Now, I will try to simplify my question:
The input data is a CT image (a 3D matrix with values) and some source positions around the image. You can imagine a cube and many points around, all at same distance but different orientation angles, where the radiation rays come from. Each ray will go through the volume and a value is assigned to each voxel according to the distance from the source. The advantage of Siddon's algorithm is that the length is calculated on-time during the iterative process of the ray-tracing. However, I know that Bresenham's algorithm is an efficient way to evaluate the path from one point to another in a matrix. Thus, the length from the source to a specific voxel could be easily calculated as the euclidean distance two points, even during Bresenham's iterative process.
So then, knowing that both are methods quite old already and efficient, there is a definitive advantage of using Siddon instead of Bresenham? Maybe I'm missing an important detail here but it is weird to me that in these dose calculation procedures Bresenham is not really an option and always Siddon appears as the gold standard.
Thanks for any comment or reply!
Good day.
It seems to me that in most applications involving medical ray tracing, you want not only the distance from a source to a particular voxel, but also the intersection lengths of that path with every single voxel on its way. Now, Bresenham gives you the voxels on that path, but not the intersection lengths, while Siddon does.
Here is the problem:
I have many sets of points, and want to come up with a function that can take one set and rank matches based on their similarity to the first. Scaling, translation, and rotation do not matter, and some points may be missing from any of the sets of points. The best match is the one that if scaled and translated in the ideal way has the least mean square error between points (maybe with a cap on penalty, or considering only the best fraction of points to handle missing points).
I am trying to come up with a good way to do this, and am wondering if there are any well known algorithms that can handle this type of problem? Just the name of something would be awesome! I lack a formal CSCI or math education, and am doing the best to teach myself.
A few things I have tried
The first thing that comes to mind is to normalize the points somehow, but I dont think that this is helpful because the missing points may throw things off.
The best way I can think of is to estimate a starting point by translating to match their centroids, scaling so that the largest distances from the centroid of the sets match. From there, do an A* search, scaling, rotating, and translating until I reach a maximum, and then compare the two sets. (I hope I am using the term A* correctly, I mean trying small translations and scalings and selecting the move giving the best match) I think this will find the global maximum most of the time, but is not guaranteed to. I am looking for a better way that will always be correct.
Thanks a ton for the help! It has been fun and interesting trying to figure this out so far, so I hope it is for you as well.
There's a very clever algorithm for identifying starfields. You find 4 points in a diamond shape and then using the two stars farthest apart you define a coordinate system locating the other two stars. This is scale and rotation invariant because the locations are relative to the first two stars. This forms a hash. You generate several of these hashes and use those to generate candidates. Once you have the candidates you look for ones where multiple hashes have the correct relationships.
This is described in a paper and a presentation on http://astrometry.net/ .
This paper may be useful: Shape Matching and Object Recognition Using Shape Contexts
Edit:
There is a couple of relatively simple methods to solve the problem:
To combine all possible pairs of points (one for each set) to nodes, connect these nodes where distances in both sets match, then solve the maximal clique problem for this graph. Since the maximal clique problem is NP-complete, the complexity is probably O(exp(n^2)), so if you have too many points, don't use this algorithm directly, use some approximation.
Use Generalised Hough transform to match two sets of points. This approach has less complexity (O(n^4)). But it is more complicated, so I cannot explain it here.
You can find the details in computer vision books, for example "Machine vision: theory, algorithms, practicalities" by E. R. Davies (2005).
I'm working on an app that lets users select regions by finger painting on top of a map. The points then get converted to a latitude/longitude and get uploaded to a server.
The touch screen is delivering way too many points to be uploaded over 3G. Even small regions can accumulate up to ~500 points.
I would like to smooth this touch data (approximate it within some tolerance). The accuracy of drawing does not really matter much as long as the general area of the region is the same.
Are there any well known algorithms to do this? Is this work for a Kalman filter?
There is the Ramer–Douglas–Peucker algorithm (wikipedia).
The purpose of the algorithm is, given
a curve composed of line segments, to
find a similar curve with fewer
points. The algorithm defines
'dissimilar' based on the maximum
distance between the original curve
and the simplified curve. The
simplified curve consists of a subset
of the points that defined the
original curve.
You probably don't need anything too exotic to dramatically cut down your data.
Consider something as simple as this:
Construct some sort of error metric. An easy one would be a normalized sum of the distances from the omitted points to the line that was approximating them. Decide what a tolerable error using this metric is.
Then starting from the first point construct the longest line segment that falls within the tolerable error range. Repeat this process until you have converted the entire path into a polyline.
This will not give you the globally optimal approximation but it will probably be good enough.
If you want the approximation to be more "curvey" you might consider using splines or bezier curves rather than straight line segments.
You want to subdivide the surface into a grid with a quadtree or a space-filling-curve. A sfc reduce the 2d complexity to a 1d complexity. You want to look for Nick's hilbert curve quadtree spatial index blog.
I was going to do something this in an app, but was intending on generating a path from the points on-the-fly. I was going to use a technique mentioned in this Point Sequence Interpolation thread
I need to evaluate if two sets of 3d points are the same (ignoring translations and rotations) by finding and comparing a proper geometric hash. I did some paper research on geometric hashing techniques, and I found a couple of algorithms, that however tend to be complicated by "vision requirements" (eg. 2d to 3d, occlusions, shadows, etc).
Moreover, I would love that, if the two geometries are slightly different, the hashes are also not very different.
Does anybody know some algorithm that fits my need, and can provide some link for further study?
Thanks
Your first thought may be trying to find the rotation that maps one object to another but this a very very complex topic... and is not actually necessary! You're not asking how to best match the two, you're just asking if they are the same or not.
Characterize your model by a list of all interpoint distances. Sort the list by that distance. Now compare the list for each object. They should be identical, since interpoint distances are not affected by translation or rotation.
Three issues:
1) What if the number of points is large, that's a large list of pairs (N*(N-1)/2). In this case you may elect to keep only the longest ones, or even better, keep the 1 or 2 longest ones for each vertex so that every part of your model has some contribution. Dropping information like this however changes the problem to be probabilistic and not deterministic.
2) This only uses vertices to define the shape, not edges. This may be fine (and in practice will be) but if you expect to have figures with identical vertices but different connecting edges. If so, test for the vertex-similarity first. If that passes, then assign a unique labeling to each vertex by using that sorted distance. The longest edge has two vertices. For each of THOSE vertices, find the vertex with the longest (remaining) edge. Label the first vertex 0 and the next vertex 1. Repeat for other vertices in order, and you'll have assigned tags which are shift and rotation independent. Now you can compare edge topologies exactly (check that for every edge in object 1 between two vertices, there's a corresponding edge between the same two vertices in object 2) Note: this starts getting really complex if you have multiple identical interpoint distances and therefore you need tiebreaker comparisons to make the assignments stable and unique.
3) There's a possibility that two figures have identical edge length populations but they aren't identical.. this is true when one object is the mirror image of the other. This is quite annoying to detect! One way to do it is to use four non-coplanar points (perhaps the ones labeled 0 to 3 from the previous step) and compare the "handedness" of the coordinate system they define. If the handedness doesn't match, the objects are mirror images.
Note the list-of-distances gives you easy rejection of non-identical objects. It also allows you to add "fuzzy" acceptance by allowing a certain amount of error in the orderings. Perhaps taking the root-mean-squared difference between the two lists as a "similarity measure" would work well.
Edit: Looks like your problem is a point cloud with no edges. Then the annoying problem of edge correspondence (#2) doesn't even apply and can be ignored! You still have to be careful of the mirror-image problem #3 though.
There a bunch of SIGGRAPH publications which may prove helpful to you.
e.g. "Global Non-Rigid Alignment of 3-D Scans" by Brown and Rusinkiewicz:
http://portal.acm.org/citation.cfm?id=1276404
A general search that can get you started:
http://scholar.google.com/scholar?q=siggraph+point+cloud+registration
spin images are one way to go about it.
Seems like a numerical optimisation problem to me. You want to find the parameters of the transform which transforms one set of points to as close as possible by the other. Define some sort of residual or "energy" which is minimised when the points are coincident, and chuck it at some least-squares optimiser or similar. If it manages to optimise the score to zero (or as near as can be expected given floating point error) then the points are the same.
Googling
least squares rotation translation
turns up quite a few papers building on this technique (e.g "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns").
Update following comment below: If a one-to-one correspondence between the points isn't known (as assumed by the paper above), then you just need to make sure the score being minimised is independent of point ordering. For example, if you treat the points as small masses (finite radius spheres to avoid zero-distance blowup) and set out to minimise the total gravitational energy of the system by optimising the translation & rotation parameters, that should work.
If you want to estimate the rigid
transform between two similar
point clouds you can use the
well-established
Iterative Closest Point method. This method starts with a rough
estimate of the transformation and
then iteratively optimizes for the
transformation, by computing nearest
neighbors and minimizing an
associated cost function. It can be
efficiently implemented (even
realtime) and there are available
implementations available for
matlab, c++... This method has been
extended and has several variants,
including estimating non-rigid
deformations, if you are interested
in extensions you should look at
Computer graphics papers solving
scan registration problem, where
your problem is a crucial step. For
a starting point see the Wikipedia
page on Iterative Closest Point
which has several good external
links. Just a teaser image from a matlab implementation which was designed to match to point clouds:
(source: mathworks.com)
After aligning you could the final
error measure to say how similar the
two point clouds are, but this is
very much an adhoc solution, there
should be better one.
Using shape descriptors one can
compute fingerprints of shapes which
are often invariant under
translations/rotations. In most cases they are defined for meshes, and not point clouds, nevertheless there is a multitude of shape descriptors, so depending on your input and requirements you might find something useful. For this, you would want to look into the field of shape analysis, and probably this 2004 SIGGRAPH course presentation can give a feel of what people do to compute shape descriptors.
This is how I would do it:
Position the sets at the center of mass
Compute the inertia tensor. This gives you three coordinate axes. Rotate to them. [*]
Write down the list of points in a given order (for example, top to bottom, left to right) with your required precision.
Apply any algorithm you'd like for a resulting array.
To compare two sets, unless you need to store the hash results in advance, just apply your favorite comparison algorithm to the sets of points of step 3. This could be, for example, computing a distance between two sets.
I'm not sure if I can recommend you the algorithm for the step 4 since it appears that your requirements are contradictory. Anything called hashing usually has the property that a small change in input results in very different output. Anyway, now I've reduced the problem to an array of numbers, so you should be able to figure things out.
[*] If two or three of your axis coincide select coordinates by some other means, e.g. as the longest distance. But this is extremely rare for random points.
Maybe you should also read up on the RANSAC algorithm. It's commonly used for stitching together panorama images, which seems to be a bit similar to your problem, only in 2 dimensions. Just google for RANSAC, panorama and/or stitching to get a starting point.