Suppose we have a 2D array of gray-scale pixels. As we iterate through a large number of images, we find that each pixel has some level of correlation with all the others - across all images, certain subsets of pixels tend to be "on" at the same time as each other. How would I algorithmically rearrange the pixels' locations so that pixels which correlate with each other are also located near each other in the image?
Below is a visual (though maybe not technically accurate) depiction of what I want to do. Imagine that similarly-colored pixels are correlated across all images. We want to rearrange the pixels' locations so that these correlated pixels are co-located on the grid.
I have tried a genetic algorithm; the fitness function considered both the Euclidean distance between two pixels and their correlation. However, it's too slow for my application, and I feel like there should be a more elegant approach.
Any thoughts would be appreciated.
Here is my formulation of your problem: You have a 2-dimensional array of vectors with correlations between these vectors. You want to rearrange the array so that vectors which are highly-correlated with each other are close together. The problem, presumably, is that any objective function which involves iterating over the entire array is too expensive to use with something like a genetic algorithm.
Here is one idea: Decide on a notion of the neighborhood of a cell (position in the array). I would go discrete rather than Euclidean since there is less computational overhead. Maybe the neighborhood of a cell is just itself and its immediate neighbors -- but you could go farther. The important thing is to keep the neighborhood small compared to the overall array. For each neighborhood -- calculate something like the average correlation between vectors in that neighborhood. Think of this as a local objective function. The overall objective function could be obtained by summing or averaging the values of these smaller objective functions.
Use some sort of hill-climbing approach (or maybe simulated annealing or tabu search) which involves making local changes on a single array rather than maintaining an entire population of arrays. Use local changes which consist of swapping two entries (and also experiment with such things as permuting triples of elements). A key insight is that such a local change only involves changing a handful of these local objective functions. In particular -- you can reject a move as being non-improving without having to recompute the overall objective function at all. Furthermore -- once you accept a move as improving, you can update the overall objective function without having to recompute very much.
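To make the swap-move idea concrete, here is a minimal serial sketch in Python/NumPy. The grid size, the 4-neighborhood, and the random placeholder correlation matrix are all illustrative assumptions, not part of the original suggestion:

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 16, 16
n = H * W
corr = rng.random((n, n))
corr = (corr + corr.T) / 2            # placeholder symmetric correlations
grid = np.arange(n).reshape(H, W)     # grid[r, c] = id of the pixel stored there

def cell_score(r, c):
    """Local objective: summed correlation between cell (r, c) and its 4-neighbors."""
    s, me = 0.0, grid[r, c]
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < H and 0 <= cc < W:
            s += corr[me, grid[rr, cc]]
    return s

def swap_delta(a, b):
    """Change in the local objectives if cells a and b are swapped.
    Only the two affected neighborhoods are re-evaluated, and the sign of
    this delta matches the sign of the change in the overall objective."""
    before = cell_score(*a) + cell_score(*b)
    grid[a], grid[b] = grid[b], grid[a]
    after = cell_score(*a) + cell_score(*b)
    grid[a], grid[b] = grid[b], grid[a]          # undo the trial swap
    return after - before

# plain hill climbing: keep random swaps that raise neighborhood correlation
for _ in range(100_000):
    a = (rng.integers(H), rng.integers(W))
    b = (rng.integers(H), rng.integers(W))
    if swap_delta(a, b) > 0:
        grid[a], grid[b] = grid[b], grid[a]
```

A simulated-annealing variant would occasionally accept negative deltas to escape local optima; either way, no full pass over the array is needed per move.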
I am vaguely uncomfortable with the idea of using correlation as the measure of similarity since correlation is signed. If you can't get good results with your present approach, perhaps rather than maximizing correlation you can try to minimize the squared distance between neighboring vectors.
I am looking for a method that can help me in a project I am working on. The idea is that there are some rectangles in 2D space, and I want to query a rectangular area to see whether it overlaps any rectangle in that area. If it does, the query fails. If it doesn't, meaning the space is empty, it succeeds.
I was linked to z-order curves to help turn 2d coordinates into 1d. While I was reading about it, I encountered the Hilbert curve. I read that the Hilbert curve is preferred over a z-order curve because it maintains better proximity of points. I also read that the Hilbert curve is used to make more efficient quadtrees and octrees.
I was reading this article https://sigmodrecord.org/publications/sigmodRecord/0103/3.lawder.pdf for a possible solution but I don't know if this applies to my case.
I also saw this comment https://news.ycombinator.com/item?id=14217760 which mentioned multiple index entries for non point objects.
Is there an elegant method where I can use the Hilbert curve to achieve this? Is it possible with just an array of rectangles?
I am pretty sure it is possible and can be very efficient. But there is a lot of complexity involved.
I have implemented this for a z-order curve and called it PH-Tree; you can find implementations in Java and C++, as well as theoretical background, via the link. The PH-Tree is similar to a z-ordered quadtree but with several modifications that allow taking full advantage of the space filling curve.
There are a few things to unpack here.
Hilbert curves vs z-order curves: Yes, Hilbert curves have slightly better proximity properties than z-order curves. Better proximity does two things: 1) it reduces the number of elements (nodes, data) to look at, and 2) it improves linear access patterns (less hopping around). Both are important if the spatial index is stored on disk and I/O is expensive (especially on old disk drives).
If I remember correctly, Hilbert curves and z-order curves are similar enough that they always access the same amount of data; the only thing Hilbert curves are better at is linear access.
However, Hilbert curves are much more complicated to calculate. In the tests I made with an in-memory index (not very thorough testing, I admit), I found that z-order curves are considerably more efficient, because the reduced CPU time outweighs the cost of accessing data slightly out of order.
Space filling curves (at least Hilbert and z-order curves) allow for some neat bit-level hacks that can speed up index operations. However, even for z-ordering I found that getting these right required a lot of thinking (I wrote a whole paper about it). I believe these operations can be adapted for Hilbert curves, but it may take some time and, as I wrote above, it may not improve performance at all.
Storing rectangles in a spatial curve index:
There are different approaches to encoding rectangles with a space filling curve. All approaches that I am aware of encode the rectangle as a multi-dimensional point. I found the following encoding to work best (assuming axis-aligned rectangles). We define the rectangle by its lower-left minimum corner and its upper-right maximum corner, e.g. min = {min0, min1} = {2, 3} and max = {max0, max1} = {8, 4}. We transform this (by interleaving the min/max values) into the 4-dimensional point {2, 8, 3, 4} and store that point in the index. Other approaches use a different ordering (2, 3, 8, 4) or, instead of two corners, store the center point and the lengths of the edges.
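As a tiny illustration of the interleaved encoding (the function name is mine):

```python
def encode(rect):
    """((min0, min1), (max0, max1)) -> interleaved 4D point (min0, max0, min1, max1)."""
    (min0, min1), (max0, max1) = rect
    return (min0, max0, min1, max1)

print(encode(((2, 3), (8, 4))))   # -> (2, 8, 3, 4), as in the example above
```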
Querying:
If you want to find any rectangles that overlap/intersect with a given region (we call that a window query), we need to create a 4-dimensional query box, i.e. an axis-aligned box that is defined by a 4D min-point and a 4D max-point (copied from here):
min = {−∞, min_0, −∞, min_1} = {−∞, 2, −∞, 3}
max = {max_0, +∞, max_1, +∞} = {8, +∞, 4, +∞}
We can process the dimensions one by one. The first min/max pair is {−∞, 8}; it will match any 2D rectangle whose minimum x-coordinate is 8 or lower. All coordinates:
d=0: min/max pair is {−∞, 8}: matches any 2D min-x <= 8
d=1: min/max pair is {2, +∞}: matches any 2D max-x >= 2
d=2: min/max pair is {−∞, 4}: matches any 2D min-y <= 4
d=3: min/max pair is {3, +∞}: matches any 2D max-y >= 3
If all these conditions hold true, then the stored rectangle overlaps with the query window.
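A small sketch of that per-dimension test (function names are mine; a real index evaluates this against whole tree nodes, not one stored point at a time):

```python
INF = float('inf')

def window_query_box(qmin, qmax):
    """Build the 4D query box for 'overlaps the 2D window [qmin, qmax]'."""
    lo = (-INF, qmin[0], -INF, qmin[1])   # lower 4D corner
    hi = (qmax[0], INF, qmax[1], INF)     # upper 4D corner
    return lo, hi

def overlaps(stored, lo, hi):
    """True iff the stored 4D point (min0, max0, min1, max1) lies in the box."""
    return all(l <= s <= h for s, l, h in zip(stored, lo, hi))

lo, hi = window_query_box((2, 3), (8, 4))
print(overlaps((2, 8, 3, 4), lo, hi))   # True: the rectangle overlaps itself
```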
Final words:
This sounds complicated but can be implemented very efficiently (it also lends itself to vectorization). I found it to be on par with other indexes (quadtree, R-star-tree); see some benchmarks I made here.
Specifically, I found that the z-ordered indexes have very good insertion/update/removal times (I don't know whether that matters for you) and are very good for small query result sizes (it sounds like you often expect it to be zero, i.e. no overlapping rectangle found). They generally work well with large datasets (thousands to millions of entries) and datasets that have strong clusters.
For smaller datasets, or if you typically expect to find many results (you can of course abort a query early once you find the first match!), other index types may be better.
On a last note, I found the dataset characteristics to have considerable influence on which index worked best. Also, implementation appears to be at least as important as the underlying algorithms. Especially for quadtrees I found huge variations in performance from different implementations.
I'm working on K-medoids algorithm implementation. It is a clustering algorithm and one of its steps includes finding the most representative point in a cluster.
So, here's the thing
I have a certain number of clusters
Each cluster contains a certain number of points
I need to find the point in each cluster that results in the least error if it is picked as the cluster representative
The distance from each point to all the others in the cluster needs to be calculated
This distance calculation could be as simple as Euclidean or more complex, like DTW (Dynamic Time Warping) between two signals
There are two approaches: one is to calculate a distance matrix that stores the distances between all points in the dataset, and the other is to calculate distances during clustering, which means that distances between some points will be calculated repeatedly.
On one hand, to build the distance matrix you must calculate distances between all points in the whole dataset, and some of the calculated values will never be used.
On the other hand, if you don't build the distance matrix, you will repeat some calculations over a certain number of iterations.
Which is the better approach?
I'm also considering MapReduce implementation, so opinions from that angle are also welcome.
Thanks
A third approach could be a combination of both: lazily evaluating the distance matrix. Initialize the matrix with sentinel values (unrealistic ones, like negative distances); when you need the distance between two points, check the matrix first, and if the value is already present, just take it from there.
Otherwise, calculate it and store it in the matrix.
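A minimal sketch of such a lazily evaluated matrix in Python (class and method names are mine; a production version would also worry about the O(N^2) memory):

```python
import numpy as np

class LazyDistanceMatrix:
    """Memoize pairwise distances: compute on first request, reuse afterwards."""
    def __init__(self, points, metric):
        self.points = points
        self.metric = metric
        self.d = np.full((len(points), len(points)), -1.0)  # -1.0 = not computed

    def dist(self, i, j):
        if self.d[i, j] < 0.0:                 # sentinel hit: compute and cache
            self.d[i, j] = self.d[j, i] = self.metric(self.points[i], self.points[j])
        return self.d[i, j]

pts = np.random.default_rng(0).random((100, 8))
m = LazyDistanceMatrix(pts, lambda a, b: np.linalg.norm(a - b))
m.dist(3, 7)   # computed and stored
m.dist(7, 3)   # served from the cache
```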
This lazy approach trades calculations for more branches in the code and a few extra instructions (it is optimal in that it performs the lowest possible number of pair calculations). However, due to branch predictors, I assume this overhead will not be that dramatic.
I predict it will perform better when the distance calculation is relatively expensive.
Another optimization could be to dynamically switch to a plain matrix implementation (and calculate the remaining part of the matrix eagerly) once the number of already-calculated entries exceeds a certain threshold. This can be achieved quite nicely in OOP languages by switching the implementation of the interface when the threshold is met.
Which implementation is actually better will depend heavily on the cost of the distance function and on the data you are clustering, since some datasets will need the same distances calculated more often than others.
I suggest doing a benchmark, and using statistical tools to evaluate which method is actually better.
In my problem, there are N points in the domain, distributed more or less randomly. For each point I need to find all neighbor points closer than a given double-precision floating-point number, DIST.
Is there an efficient way to do this in Thrust?
In serial, I would use a neighborhood table and hope to achieve approximately O(n) instead of the naive O(n^2) algorithm.
I have found a Thrust example for 2D bucket sort, which is a perfect fit for the first part of my problem. But that is not enough: for each bucket I need to find all points in the neighboring buckets, then compute their distances and see whether any of them is less than DIST. Finding neighbors and computing distances should be relatively easy, but adding those eligible points to a result array seems really difficult for me to implement in Thrust.
A way to rephrase this particular problem is this: I have two 2D arrays, A1 and A2; the column number represents the index of the 2D bucket, and each column has a different number of elements, which are indices of my points. Each element in column(i) of A1 will form a potential pair with each element in column(i) of A2, and all eligible pairs should be recorded to a result array.
I could use a CUDA kernel and allocate tons of potentially unused memory as a workaround, but that would be the last thing I would want to do.
Thanks in advance.
The full solution is out of the scope of a single Stack Overflow answer, but there's a discussion on how to use Thrust to build a 2D spatial index in this repository:
https://github.com/jaredhoberock/thrust-workshop
Another possibility, simpler than creating a quad-tree, is using a neighborhood matrix.
First place all your points into a 2D square matrix (or a 3D cubic grid, if you are dealing with three dimensions). Then run a full or partial spatial sort, so the points become ordered inside the matrix.
Points with small Y move to the top rows of the matrix, and likewise, points with large Y go to the bottom rows. The same happens with X: points with small X coordinates move to the columns on the left, and symmetrically, points with large X values go to the right columns.
After you have done the spatial sort (there are many ways to achieve this, with either serial or parallel algorithms), you can look up the nearest points of a given point P by just visiting the cells adjacent to the one where P is stored in the neighborhood matrix.
If this matrix is placed into texture memory, you can use all the spatial caching from CUDA to have very fast accesses to all neighbors!
You can read more details for this idea in the following paper (you will find PDF copies of it online): Supermassive Crowd Simulation on GPU based on Emergent Behavior.
The sorting step gives you interesting choices. You can use just the even-odd transposition sort described in the paper, which is very simple to implement (even in CUDA). If you run just one pass of it, you get a partial sort, which can already be useful if your matrix is near-sorted; that is, if your points move slowly, it will save you a lot of computation.
If you need a full sort, you can run such even-odd transposition pass several times (as described in the following Wikipedia page):
http://en.wikipedia.org/wiki/Odd%E2%80%93even_sort
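Here is a rough serial Python sketch of the neighborhood-matrix idea (grid size, pass count, and names are my assumptions; the papers describe massively parallel GPU versions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16                                   # grid side: holds n*n points, one per cell
pts = rng.random((n * n, 2))
grid = np.arange(n * n).reshape(n, n)    # grid[r, c] = index of the point stored there

def transposition_pass(grid, pts):
    """One even-odd transposition pass: order rows by x, then columns by y."""
    rows, cols = grid.shape
    for parity in (0, 1):
        for r in range(rows):
            for c in range(parity, cols - 1, 2):
                if pts[grid[r, c], 0] > pts[grid[r, c + 1], 0]:
                    grid[r, c], grid[r, c + 1] = grid[r, c + 1], grid[r, c]
        for c in range(cols):
            for r in range(parity, rows - 1, 2):
                if pts[grid[r, c], 1] > pts[grid[r + 1, c], 1]:
                    grid[r, c], grid[r + 1, c] = grid[r + 1, c], grid[r, c]

for _ in range(n):        # repeat passes for a fuller sort; one pass = partial sort
    transposition_pass(grid, pts)

def neighbor_candidates(r, c):
    """Indices of points stored in the 3x3 block of cells around (r, c)."""
    return grid[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2].ravel()
```

Candidates from the 3x3 block still need an explicit distance check against DIST, and since the sort only approximately co-locates points, a slightly larger block may be needed in practice.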
There is a second paper from the same authors describing an extension to 3D, using three passes of bitonic sort (which is highly parallel, but not a spatial sort). They claim it is both more precise than a single even-odd transposition pass and more efficient than a full sort. The paper is A Neighborhood Grid Data Structure for Massive 3D Crowd Simulation on GPU.
I've an indeterminate number of closed CGPath elements of various shapes and sizes, all containing a single concave Bezier curve, like the red and blue shapes in the diagram below.
What is the simplest and most efficient method of dividing these shapes into n regions of (roughly) equal size?
What you want is Delaunay triangulation. Here is an example which resembles what you want to do; it uses an AS3 library. Here is an iOS port that should help you:
https://github.com/czgarrett/delaunay-ios
I don't really understand the context of what you want to achieve and what the constraints are. For instance, is there a hard requirement that the subdivided regions are equal size?
Often the solution to a performance problem is not a faster algorithm but a different approach, usually one or more of the following:
Pre-compute the values, or compute as much as possible offline, say by using another server API which is able to do the subdivision offline and cache the results for multiple clients. You could serve the precomputed result as a bitmap where each colour indexes into the table of values you want to display. Looking up a value would then be a simple matter of indexing the pixel at the touch position.
Simplify or approximate a solution. Would a grid subdivision be accurate enough? At 500 x 6 = 3000 subdivisions, you only have about 51 square points for each region; that's a region of around 7x7 points. At that size the user isn't going to notice whether the region is perfectly accurate. You may need to end up aggregating adjacent regions anyway due to touch resolution.
Progressive refinement. You often don't need to compute the entire algorithm up front. Very often algorithms run in discrete (often symmetrical) units, meaning you're often re-using the information from previous steps. You could compute just the first step up front, then use a background thread to progressively fill in the rest of the detail. You could also defer the final calculation until the touch occurs. A delay of up to a second is still tolerable at that point, or in the worst case you can display an animation while the calculation is in progress.
You could use some hybrid approach: compute one or two levels using Delaunay triangulation, then use a simple, fast triangular subdivision for two more levels.
Depending on the required accuracy, and if discrete samples are not required, the final levels could be approximated using a weighted average between the points of the triangle, i.e., if the touch is halfway between two points, pick the average of their values; see the sketch below.
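For that last approximation, a small sketch of barycentric interpolation inside one triangle (function names are mine):

```python
import numpy as np

def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of 2D point p in triangle (a, b, c)."""
    a, b, c = map(np.asarray, (a, b, c))
    T = np.array([[a[0] - c[0], b[0] - c[0]],
                  [a[1] - c[1], b[1] - c[1]]])
    w = np.linalg.solve(T, np.asarray(p, dtype=float) - c)
    return np.array([w[0], w[1], 1.0 - w[0] - w[1]])

def interpolate(p, tri, vals):
    """Weighted average of the triangle's vertex values at p."""
    return barycentric_weights(p, *tri) @ np.asarray(vals)

# midpoint of the edge between the 2nd and 3rd vertices -> average of 2.0 and 3.0
print(interpolate((0.5, 0.5), ((0, 0), (1, 0), (0, 1)), [1.0, 2.0, 3.0]))  # 2.5
```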
I need to evaluate whether two sets of 3D points are the same (ignoring translations and rotations) by finding and comparing a proper geometric hash. I did some research on geometric hashing techniques and found a couple of algorithms, which however tend to be complicated by "vision requirements" (e.g. 2D-to-3D matching, occlusions, shadows, etc.).
Moreover, I would love it if, when the two geometries are only slightly different, the hashes are also not very different.
Does anybody know some algorithm that fits my need, and can provide some link for further study?
Thanks
Your first thought may be trying to find the rotation that maps one object to the other, but this is a very complex topic... and it is not actually necessary! You're not asking how to best match the two, you're just asking whether they are the same or not.
Characterize your model by a list of all interpoint distances. Sort the list by that distance. Now compare the list for each object. They should be identical, since interpoint distances are not affected by translation or rotation.
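A minimal sketch of that signature in Python (SciPy's pdist does the pair enumeration; the function names are mine):

```python
import numpy as np
from scipy.spatial.distance import pdist

def distance_signature(pts):
    """Sorted list of all N*(N-1)/2 interpoint distances: invariant under
    rotation and translation of the point set."""
    return np.sort(pdist(pts))

def same_up_to_rigid_motion(a, b, tol=1e-9):
    sa, sb = distance_signature(a), distance_signature(b)
    return sa.shape == sb.shape and np.allclose(sa, sb, atol=tol)
```

Note that this signature is also blind to reflections, which is exactly issue 3 below.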
Three issues:
1) What if the number of points is large? That's a large list of pairs: N*(N-1)/2 of them. In this case you may elect to keep only the longest distances, or even better, keep the one or two longest distances for each vertex so that every part of your model has some contribution. Dropping information like this, however, changes the problem from deterministic to probabilistic.
2) This only uses vertices to define the shape, not edges. This may be fine (and in practice it will be), but it matters if you expect figures with identical vertices but different connecting edges. If so, test for vertex similarity first. If that passes, then assign a unique label to each vertex using the sorted distances. The longest edge has two vertices. For each of THOSE vertices, find the vertex with the longest (remaining) edge. Label the first vertex 0 and the next vertex 1. Repeat for the other vertices in order, and you'll have assigned tags which are shift- and rotation-independent. Now you can compare edge topologies exactly (check that for every edge in object 1 between two vertices, there's a corresponding edge between the same two vertices in object 2). Note: this starts getting really complex if you have multiple identical interpoint distances, in which case you need tiebreaker comparisons to make the assignments stable and unique.
3) There's a possibility that two figures have identical edge-length populations but still aren't identical: this happens when one object is the mirror image of the other. This is quite annoying to detect! One way to do it is to take four non-coplanar points (perhaps the ones labeled 0 to 3 from the previous step) and compare the "handedness" of the coordinate system they define. If the handedness doesn't match, the objects are mirror images.
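A small sketch of the handedness test via the sign of the scalar triple product, assuming the four labeled points are non-coplanar:

```python
import numpy as np

def handedness(p0, p1, p2, p3):
    """+1 or -1 depending on the orientation of the frame (p1-p0, p2-p0, p3-p0);
    mirror images of the same figure give opposite signs."""
    p0, p1, p2, p3 = map(np.asarray, (p0, p1, p2, p3))
    return np.sign(np.linalg.det(np.array([p1 - p0, p2 - p0, p3 - p0], dtype=float)))
```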
Note the list-of-distances gives you easy rejection of non-identical objects. It also allows you to add "fuzzy" acceptance by allowing a certain amount of error in the orderings. Perhaps taking the root-mean-squared difference between the two lists as a "similarity measure" would work well.
Edit: Looks like your problem is a point cloud with no edges. Then the annoying problem of edge correspondence (#2) doesn't even apply and can be ignored! You still have to be careful of the mirror-image problem #3 though.
There are a bunch of SIGGRAPH publications which may prove helpful to you.
e.g. "Global Non-Rigid Alignment of 3-D Scans" by Brown and Rusinkiewicz:
http://portal.acm.org/citation.cfm?id=1276404
A general search that can get you started:
http://scholar.google.com/scholar?q=siggraph+point+cloud+registration
Spin images are one way to go about it.
Seems like a numerical optimisation problem to me. You want to find the parameters of the transform which brings one set of points as close as possible to the other. Define some sort of residual or "energy" which is minimised when the points are coincident, and chuck it at some least-squares optimiser or similar. If it manages to optimise the score to zero (or as near as can be expected given floating-point error), then the points are the same.
Googling
least squares rotation translation
turns up quite a few papers building on this technique (e.g. "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns").
Update following comment below: If a one-to-one correspondence between the points isn't known (as assumed by the paper above), then you just need to make sure the score being minimised is independent of point ordering. For example, if you treat the points as small masses (finite radius spheres to avoid zero-distance blowup) and set out to minimise the total gravitational energy of the system by optimising the translation & rotation parameters, that should work.
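A toy sketch of such an order-independent score, minimized over the six rigid-motion parameters with SciPy. The energy form, eps, and optimizer choice are my assumptions, and this landscape has local minima, so real use would need several restarts:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.transform import Rotation

def energy(params, A, B, eps=1e-2):
    """'Gravitational' energy between set A and set B moved by (rotvec, t).
    Independent of point ordering; most negative when the sets coincide.
    eps plays the role of the finite sphere radius (avoids 1/0 blowup)."""
    R = Rotation.from_rotvec(params[:3]).as_matrix()
    Bt = B @ R.T + params[3:]
    d = np.linalg.norm(A[:, None, :] - Bt[None, :, :], axis=2)
    return -np.sum(1.0 / (d + eps))

rng = np.random.default_rng(0)
A = rng.random((15, 3))
R_true = Rotation.from_rotvec([0.2, -0.1, 0.3]).as_matrix()
B = rng.permutation(A) @ R_true.T + [0.5, 0.0, -0.2]   # same shape, scrambled order

res = minimize(energy, np.zeros(6), args=(A, B), method='Nelder-Mead',
               options={'maxiter': 5000})
print(res.fun)   # very negative (near-coincident sets) vs. modest values otherwise
```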
If you want to estimate the rigid transform between two similar point clouds, you can use the well-established Iterative Closest Point (ICP) method. It starts from a rough estimate of the transformation and then iteratively refines it by computing nearest neighbors and minimizing an associated cost function. It can be implemented efficiently (even in real time), and implementations are available for MATLAB, C++, and more. The method has been extended and has several variants, including ones that estimate non-rigid deformations; if you are interested in extensions, look at computer graphics papers on the scan registration problem, where your problem is a crucial step. For a starting point, see the Wikipedia page on Iterative Closest Point, which has several good external links. There is, for example, a MATLAB implementation designed to match two point clouds (source: mathworks.com).
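For reference, here is a compact point-to-point ICP sketch in Python (NumPy + SciPy). It is a generic textbook version under simplifying assumptions (decent initial alignment, full overlap), not any particular library's implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=50):
    """Minimal point-to-point ICP: match nearest neighbors, then solve the
    best rigid transform with the SVD-based Kabsch/Procrustes step."""
    tree = cKDTree(dst)
    R_total, t_total = np.eye(3), np.zeros(3)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                   # nearest-neighbor matches
        matched = dst[idx]
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)      # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # no reflection
        R = Vt.T @ D @ U.T
        t = mu_d - R @ mu_s
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    rmse = np.sqrt(np.mean(np.sum((cur - dst[idx]) ** 2, axis=1)))
    return R_total, t_total, rmse
```

The returned RMSE is exactly the kind of final error measure discussed next.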
After aligning, you could use the final error measure to say how similar the two point clouds are, but this is very much an ad hoc solution; there should be a better one.
Using shape descriptors, one can compute fingerprints of shapes which are often invariant under translations/rotations. In most cases they are defined for meshes rather than point clouds; nevertheless, there is a multitude of shape descriptors, so depending on your input and requirements you might find something useful. For this, you would want to look into the field of shape analysis, and this 2004 SIGGRAPH course presentation can probably give you a feel for what people do to compute shape descriptors.
This is how I would do it:
1. Position the sets at the center of mass.
2. Compute the inertia tensor. This gives you three coordinate axes; rotate to them. [*]
3. Write down the list of points in a given order (for example, top to bottom, left to right) with your required precision.
4. Apply any algorithm you'd like to the resulting array.
To compare two sets, unless you need to store the hash results in advance, just apply your favorite comparison algorithm to the sets of points from step 3. This could be, for example, computing a distance between the two sets.
I'm not sure I can recommend an algorithm for step 4, since it appears that your requirements are contradictory: anything called hashing usually has the property that a small change in the input results in a very different output. Anyway, now I've reduced the problem to an array of numbers, so you should be able to figure things out.
[*] If two or three of your axes coincide, select the coordinate axes by some other means, e.g. along the longest interpoint distance. But this is extremely rare for random points.
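A hedged Python sketch of steps 1-3 (NumPy). The per-axis sign fix and the tie handling from the footnote are treated only crudely here, and mirror images may still collide, so treat it as a starting point:

```python
import numpy as np

def canonical_points(pts, decimals=6):
    """Center at the center of mass, rotate to the principal axes of the
    inertia tensor, fix axis signs, and list points in a fixed order."""
    p = pts - pts.mean(axis=0)                       # step 1: center of mass
    # step 2: inertia tensor of unit point masses
    inertia = np.sum(p * p) * np.eye(3) - p.T @ p
    _, axes = np.linalg.eigh(inertia)                # columns = principal axes
    q = p @ axes
    for k in range(3):                               # crude per-axis sign fix
        if np.sum(q[:, k] ** 3) < 0:                 # (fails on symmetric shapes)
            q[:, k] = -q[:, k]
    # step 3: fixed ordering at the required precision
    q = np.round(q, decimals)
    return q[np.lexsort((q[:, 2], q[:, 1], q[:, 0]))]

def same_up_to_rigid_motion(a, b, tol=1e-5):
    # note: mirror images may compare equal; check handedness separately if needed
    return a.shape == b.shape and np.allclose(
        canonical_points(a), canonical_points(b), atol=tol)
```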
Maybe you should also read up on the RANSAC algorithm. It's commonly used for stitching together panorama images, which seems to be a bit similar to your problem, only in 2 dimensions. Just google for RANSAC, panorama and/or stitching to get a starting point.