Mapping 2D points to a fixed grid - algorithm

I have any number of points on an imaginary 2D surface. I also have a grid on the same surface with points at regular intervals along the X and Y access. My task is to map each point to the nearest grid point.
The code is straight forward enough until there are a shortage of grid points. The code I've been developing finds the closest grid point, displaying an already mapped point if the distance will be shorter for the current point.
I then added a second step that compares each mapped point to another and, if swapping the mapping with another point produces a smaller sum of the total mapped distance of both points, I swap them.
This last step seems important as it reduces the number crossed map lines. (This would be used to map points on a plate to a grid on another plate, with pins connecting the two, and lines that don't cross seem to have a higher chance that the pins would not make contact.)
Questions:
Can anyone comment on my thinking that if the image above were truly optimized, (that is, the mapped points--overall--would have the smallest total distance), then none of the lines were cross?
And has anyone seen any existing algorithms to help with this. I've searched but came up with nothing.

The problem could be approached as a variation of the Assignment Problem, with the "agents" being the grid squares and the points being the "tasks", (or vice versa) with the distance between them being the "cost" for that agent-task combination. You could solve with the Hungarian algorithm.
To handle the fact that there are more grid squares than points, find a bounding box for the possible grid squares you want to consider and add dummy points that have a cost of 0 associated with all grid squares.
The Hungarian algorithm is O(n3), perhaps your approach is already good enough.
See also:
How to find the optimal mapping between two sets?
How to optimize assignment of tasks to agents with these constraints?

If I understand your main concern correctly, minimising total length of line segments, the algorithm you used does not find the best mapping and it is clear in your image. e.g. when two line segments cross each other, simple mathematic says that if you rearrange their endpoints such that they do not cross, it provides a better total sum. You can use this simple approach (rearranging crossed items) to get better approximation to the optimum, you should apply swapping for more somehow many iterations.
In the following picture you can see why crossing has longer length than non crossing (first question) and also why by swapping once there still exists crossing edges (second question and w.r.t. Comments), I just drew one sample, in fact one may need many iterations of swapping to get non crossed result.
This is a heuristic algorithm certainly not optimum but I expect to be very good and efficient and simple to implement.

Related

Solving the sliding puzzle-like problem with arbitrary number of holes

I've tried searching for a while, but haven't come across a solution, so figured I would ask my own.
Consider an MxM 2D grid of holes, and a set of N balls which are randomly placed in the grid. You are given some final configuration of the N balls in the grid, and your goal is to move the balls in the grid to achieve this final configuration in the shortest time possible.
The only move you are allowed to make is to move any contiguous subsection of the grid (on either a row or column) by one space. That sounds a bit confusing; basically you can select any set of points in a straight line in the grid, and shift all the balls in that subsection by one spot to the left or right if it is a row, or one spot up or down if it is a hole. If that is confusing, it's fine to consider the alternate problem where the only move you can make is to move a single ball to any adjacent spot. The caveat is that two balls can never overlap.
Ultimately this problem basically boils down to a version of the classic sliding tile puzzle, with two key differences: 1) there can be an arbitrary number of holes, and 2) we don't a priori know the numbering of the tiles - we don't care which balls end up in the final holes, we just want to final holes to be filled after it is all said and done.
I'm looking for suggestions about how to go about adapting classic sliding puzzle solutions to these two constraints. The arbitrary number of holes is likely pretty easy to implement efficiently, but the fact that we don't know which balls are destined to go in which holes at the start is throwing me for a loop. Any advice (or implementations of similar problems) would be greatly appreciated.
If I understood well:
all the balls are equal and cannot be distinguished - they can occupy any position on the grid, the starting state is a random configuration of balls and holes on the grid.
there are nxn = balls + holes = number of cells in the grid
your target is to reach a given configuration.
It seems a rather trivial problem, so maybe I missed some constraints. If this is indeed the problem, solving it can be approached like this:
Consider that you move the holes, not the balls.
conduct a search between each hole and each hole position in the target configuration.
Minimize the number of steps to walk the holes to their closest target. (maybe with a BFS if it is needed) - That is to say that you can use this measure as a heuristic to order the moves in a flavor of A* maybe. I think for a 50x50 grid, the search will be very fast, because your heuristic is extremely precise and nearly costless to calculate.
Solving the problem where you can move a hole along multiple positions on a line, or a file is not much more complicated; you can solve it by adding to the possible moves/next steps in your queue.

Algorithm to Produce an Evenly Spaced Grid

I'm looking for a general algorithm for creating an evenly spaced grid, and I've been surprised how difficult it is to find!
Is this a well solved problem whose name I don't know?
Or is this an unsolved problem that is best done by self organising map?
More specifically, I'm attempting to make a grid on a 2D Cartesian plane in which the Euclidean distance between each point and 4 bounding lines (or "walls" to make a bounding box) are equal or nearly equal.
For a square number, this is as simple as making a grid with sqrt(n) rows and sqrt(n) columns with equal spacing positioned in the center of the bounding box. For 5 points, the pattern would presumably either be circular or 4 points with a point in the middle.
I didn't find a very good solution, so I've sadly left the problem alone and settled with a quick function that produces the following grid:
There is no simple general solution to this problem. A self-organizing map is probably one of the best choices.
Another way to approach this problem is to imagine the points as particles that repel each others and that are also repelled by the walls. As an initial arrangement, you could already evenly distribute the points up to the next smaller square number - for this you already have a solution. Then randomly add the remaining points.
Iteratively modify the locations to minimize the energy function based on the total force between the particles and walls. The result will of course depend on the force law, i.e. how the force depends on the distance.
To solve this, you can use numerical methods like FEM.
A simplified and less efficient method that is based on the same principle is to first set up an estimated minimal distance, based on the square number case which you can calculate. Then iterate through all points a number of times and for each one calculate the distance to its closest neighbor. If this is smaller than the estimated distance, move your point into the opposite direction by a certain fraction of the difference.
This method will generally not lead to a stable minimum but should find an acceptable solution after a number ot iterations. You will have to experiment with the stepsize and the number of iterations.
To summarize, you have three options:
FEM method: Efficient but difficult to implement
Self organizing map: Slightly less efficient, medium complexity of implementation.
Iteration described in last section: Less efficient but easy to implement.
Unfortunately your problem is still not very clearly specified. You say you want the points to be "equidistant" yet in your example, some pairs of points are far apart (eg top left and bottom right) and the points are all different distances from the walls.
Perhaps you want the points to have equal minimum distance? In which case a simple solution is to draw a cross shape, with one point in the centre and the remainder forming a vertical and horizontal crossed line. The gap between the walls and the points, and the points in the lines can all be equal and this can work with any number of points.

How to break a geometry into blocks?

I am certain there is already some algorithm that does what I need, but I am not sure what phrase to Google, or what is the algorithm category.
Here is my problem: I have a polyhedron made up by several contacting blocks (hyperslabs), i. e. the edges are axis aligned and the angles between edges are 90°. There may be holes inside the polyhedron.
I want to break up this concave polyhedron in as little convex rectangular axis-aligned whole blocks are possible (if the original polyhedron is convex and has no holes, then it is already such a block, and therefore, the solution). To illustrate, some 2-D images I made (but I need the solution for 3-D, and preferably, N-D):
I have this geometry:
One possible breakup into blocks is this:
But the one I want is this (with as few blocks as possible):
I have the impression that an exact algorithm may be too expensive (is this problem NP-hard?), so an approximate algorithm is suitable.
One detail that maybe make the problem easier, so that there could be a more appropriated/specialized algorithm for it is that all edges have sizes multiple of some fixed value (you may think all edges sizes are integer numbers, or that the geometry is made up by uniform tiny squares, or voxels).
Background: this is the structured grid discretization of a PDE domain.
What algorithm can solve this problem? What class of algorithms should I
search for?
Update: Before you upvote that answer, I want to point out that my answer is slightly off-topic. The original poster have a question about the decomposition of a polyhedron with faces that are axis-aligned. Given such kind of polyhedron, the question is to decompose it into convex parts. And the question is in 3D, possibly nD. My answer is about the decomposition of a general polyhedron. So when I give an answer with a given implementation, that answer applies to the special case of polyhedron axis-aligned, but it might be that there exists a better implementation for axis-aligned polyhedron. And when my answer says that a problem for generic polyhedron is NP-complete, it might be that there exists a polynomial solution for the special case of axis-aligned polyhedron. I do not know.
Now here is my (slightly off-topic) answer, below the horizontal rule...
The CGAL C++ library has an algorithm that, given a 2D polygon, can compute the optimal convex decomposition of that polygon. The method is mentioned in the part 2D Polygon Partitioning of the manual. The method is named CGAL::optimal_convex_partition_2. I quote the manual:
This function provides an implementation of Greene's dynamic programming algorithm for optimal partitioning [2]. This algorithm requires O(n4) time and O(n3) space in the worst case.
In the bibliography of that CGAL chapter, the article [2] is:
[2] Daniel H. Greene. The decomposition of polygons into convex parts. In Franco P. Preparata, editor, Computational Geometry, volume 1 of Adv. Comput. Res., pages 235–259. JAI Press, Greenwich, Conn., 1983.
It seems to be exactly what you are looking for.
Note that the same chapter of the CGAL manual also mention an approximation, hence not optimal, that run in O(n): CGAL::approx_convex_partition_2.
Edit, about the 3D case:
In 3D, CGAL has another chapter about Convex Decomposition of Polyhedra. The second paragraph of the chapter says "this problem is known to be NP-hard [1]". The reference [1] is:
[1] Bernard Chazelle. Convex partitions of polyhedra: a lower bound and worst-case optimal algorithm. SIAM J. Comput., 13:488–507, 1984.
CGAL has a method CGAL::convex_decomposition_3 that computes a non-optimal decomposition.
I have the feeling your problem is NP-hard. I suggest a first step might be to break the figure into sub-rectangles along all hyperplanes. So in your example there would be three hyperplanes (lines) and four resulting rectangles. Then the problem becomes one of recombining rectangles into larger rectangles to minimize the final number of rectangles. Maybe 0-1 integer programming?
I think dynamic programming might be your friend.
The first step I see is to divide the polyhedron into a trivial collection of blocks such that every possible face is available (i.e. slice and dice it into the smallest pieces possible). This should be trivial because everything is an axis aligned box, so k-tree like solutions should be sufficient.
This seems reasonable because I can look at its cost. The cost of doing this is that I "forget" the original configuration of hyperslabs, choosing to replace it with a new set of hyperslabs. The only way this could lead me astray is if the original configuration had something to offer for the solution. Given that you want an "optimal" solution for all configurations, we have to assume that the original structure isn't very helpful. I don't know if it can be proven that this original information is useless, but I'm going to make that assumption in this answer.
The problem has now been reduced to a graph problem similar to a constrained spanning forest problem. I think the most natural way to view the problem is to think of it as a graph coloring problem (as long as you can avoid confusing it with the more famous graph coloring problem of trying to color a map without two states of the same color sharing a border). I have a graph of nodes (small blocks), each of which I wish to assign a color (which will eventually be the "hyperslab" which covers that block). I have the constraint that I must assign colors in hyperslab shapes.
Now a key observation is that not all possibilities must be considered. Take the final colored graph we want to see. We can partition this graph in any way we please by breaking any hyperslab which crosses the partition into two pieces. However, not every partition is meaningful. The only partitions that make sense are axis aligned cuts, which always break a hyperslab into two hyperslabs (as opposed to any more complicated shape which could occur if the cut was not axis aligned).
Now this cut is the reverse of the problem we're really trying to solve. That cutting is actually the thing we did in the first step. While we want to find the optimal merging algorithm, undoing those cuts. However, this shows a key feature we will use in dynamic programming: the only features that matter for merging are on the exposed surface of a cut. Once we find the optimal way of forming the central region, it generally doesn't play a part in the algorithm.
So let's start by building a collection of hyperslab-spaces, which can define not just a plain hyperslab, but any configuration of hyperslabs such as those with holes. Each hyperslab-space records:
The number of leaf hyperslabs contained within it (this is the number we are eventually going to try to minimize)
The internal configuration of hyperslabs.
A map of the surface of the hyperslab-space, which can be used for merging.
We then define a "merge" rule to turn two or more adjacent hyperslab-spaces into one:
Hyperslab-spaces may only be combined into new hyperslab-spaces (so you need to combine enough pieces to create a new hyperslab, not some more exotic shape)
Merges are done simply by comparing the surfaces. If there are features with matching dimensionalities, they are merged (because it is trivial to show that, if the features match, it is always better to merge hyperslabs than not to)
Now this is enough to solve the problem with brute force. The solution will be NP-complete for certain. However, we can add an additional rule which will drop this cost dramatically: "One hyperslab-space is deemed 'better' than another if they cover the same space, and have exactly the same features on their surface. In this case, the one with fewer hyperslabs inside it is the better choice."
Now the idea here is that, early on in the algorithm, you will have to keep track of all sorts of combinations, just in case they are the most useful. However, as the merging algorithm makes things bigger and bigger, it will become less likely that internal details will be exposed on the surface of the hyperslab-space. Consider
+===+===+===+---+---+---+---+
| : : A | X : : : :
+---+---+---+---+---+---+---+
| : : B | Y : : : :
+---+---+---+---+---+---+---+
| : : | : : : :
+===+===+===+ +---+---+---+
Take a look at the left side box, which I have taken the liberty of marking in stronger lines. When it comes to merging boxes with the rest of the world, the AB:XY surface is all that matters. As such, there are only a handful of merge patterns which can occur at this surface
No merges possible
A:X allows merging, but B:Y does not
B:Y allows merging, but A:X does not
Both A:X and B:Y allow merging (two independent merges)
We can merge a larger square, AB:XY
There are many ways to cover the 3x3 square (at least a few dozen). However, we only need to remember the best way to achieve each of those merge processes. Thus once we reach this point in the dynamic programming, we can forget about all of the other combinations that can occur, and only focus on the best way to achieve each set of surface features.
In fact, this sets up the problem for an easy greedy algorithm which explores whichever merges provide the best promise for decreasing the number of hyperslabs, always remembering the best way to achieve a given set of surface features. When the algorithm is done merging, whatever that final hyperslab-space contains is the optimal layout.
I don't know if it is provable, but my gut instinct thinks that this will be an O(n^d) algorithm where d is the number of dimensions. I think the worst case solution for this would be a collection of hyperslabs which, when put together, forms one big hyperslab. In this case, I believe the algorithm will eventually work its way into the reverse of a k-tree algorithm. Again, no proof is given... it's just my gut instinct.
You can try a constrained delaunay triangulation. It gives very few triangles.
Are you able to determine the equations for each line?
If so, maybe you can get the intersection (points) between those lines. Then if you take one axis, and start to look for a value which has more than two points (sharing this value) then you should "draw" a line. (At the beginning of the sweep there will be zero points, then two (your first pair) and when you find more than two points, you will be able to determine which points are of the first polygon and which are of the second one.
Eg, if you have those lines:
verticals (red):
x = 0, x = 2, x = 5
horizontals (yellow):
y = 0, y = 2, y = 3, y = 5
and you start to sweep through of X axis, you will get p1 and p2, (and we know to which line-equation they belong ) then you will get p3,p4,p5 and p6 !! So here you can check which of those points share the same line of p1 and p2. In this case p4 and p5. So your first new polygon is p1,p2,p4,p5.
Now we save the 'new' pair of points (p3, p6) and continue with the sweep until the next points. Here we have p7,p8,p9 and p10, looking for the points which share the line of the previous points (p3 and p6) and we get p7 and p10. Those are the points of your second polygon.
When we repeat the exercise for the Y axis, we will get two points (p3,p7) and then just three (p1,p2,p8) ! On this case we should use the farest point (p8) in the same line of the new discovered point.
As we are using lines equations and points 2 or more dimensions, the procedure should be very similar
ps, sorry for my english :S
I hope this helps :)

Space partitioning algorithm

I have a set of points which are contained within the rectangle. I'd like to split the rectangles into subrectangles based on point density (giving a number of subrectangles or desired density, whichever is easiest).
The partitioning doesn't have to be exact (almost any approximation better than regular grid would do), but the algorithm has to cope with the large number of points - approx. 200 millions. The desired number of subrectangles however is substantially lower (around 1000).
Does anyone know any algorithm which may help me with this particular task?
Just to understand the problem.
The following is crude and perform badly, but I want to know if the result is what you want>
Assumption> Number of rectangles is even
Assumption> Point distribution is markedly 2D (no big accumulation in one line)
Procedure>
Bisect n/2 times in either axis, looping from one end to the other of each previously determined rectangle counting "passed" points and storing the number of passed points at each iteration. Once counted, bisect the rectangle selecting by the points counted in each loop.
Is that what you want to achieve?
I think I'd start with the following, which is close to what #belisarius already proposed. If you have any additional requirements, such as preferring 'nearly square' rectangles to 'long and thin' ones you'll need to modify this naive approach. I'll assume, for the sake of simplicity, that the points are approximately randomly distributed.
Split your initial rectangle in 2 with a line parallel to the short side of the rectangle and running exactly through the mid-point.
Count the number of points in both half-rectangles. If they are equal (enough) then go to step 4. Otherwise, go to step 3.
Based on the distribution of points between the half-rectangles, move the line to even things up again. So if, perchance, the first cut split the points 1/3, 2/3, move the line half-way into the heavy half of the rectangle. Go to step 2. (Be careful not to get trapped here, moving the line in ever decreasing steps first in one direction, then the other.)
Now, pass each of the half-rectangles in to a recursive call to this function, at step 1.
I hope that outlines the proposal well enough. It has limitations: it will produce a number of rectangles equal to some power of 2, so adjust it if that's not good enough. I've phrased it recursively, but it's ideal for parallelisation. Each split creates two tasks, each of which splits a rectangle and creates two more tasks.
If you don't like that approach, perhaps you could start with a regular grid with some multiple (10 - 100 perhaps) of the number of rectangles you want. Count the number of points in each of these tiny rectangles. Then start gluing the tiny rectangles together until the less-tiny rectangle contains (approximately) the right number of points. Or, if it satisfies your requirements well enough, you could use this as a discretisation method and integrate it with my first approach, but only place the cutting lines along the boundaries of the tiny rectangles. This would probably be much quicker as you'd only have to count the points in each tiny rectangle once.
I haven't really thought about the running time of either of these; I have a preference for the former approach 'cos I do a fair amount of parallel programming and have oodles of processors.
You're after a standard Kd-tree or binary space partitioning tree, I think. (You can look it up on Wikipedia.)
Since you have very many points, you may wish to only approximately partition the first few levels. In this case, you should take a random sample of your 200M points--maybe 200k of them--and split the full data set at the midpoint of the subsample (along whichever axis is longer). If you actually choose the points at random, the probability that you'll miss a huge cluster of points that need to be subdivided will be approximately zero.
Now you have two problems of about 100M points each. Divide each along the longer axis. Repeat until you stop taking subsamples and split along the whole data set. After ten breadth-first iterations you'll be done.
If you have a different problem--you must provide tick marks along the X and Y axis and fill in a grid along those as best you can, rather than having the irregular decomposition of a Kd-tree--take your subsample of points and find the 0/32, 1/32, ..., 32/32 percentiles along each axis. Draw your grid lines there, then fill the resulting 1024-element grid with your points.
R-tree
Good question.
I think the area you need to investigate is "computational geometry" and the "k-partitioning" problem. There's a link that might help get you started here
You might find that the problem itself is NP-hard which means a good approximation algorithm is the best you're going to get.
Would K-means clustering or a Voronoi diagram be a good fit for the problem you are trying to solve?
That's looks like Cluster analysis.
Would a QuadTree work?
A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are most often used to partition a two dimensional space by recursively subdividing it into four quadrants or regions. The regions may be square or rectangular, or may have arbitrary shapes. This data structure was named a quadtree by Raphael Finkel and J.L. Bentley in 1974. A similar partitioning is also known as a Q-tree. All forms of Quadtrees share some common features:
They decompose space into adaptable cells
Each cell (or bucket) has a maximum capacity. When maximum capacity is reached, the bucket splits
The tree directory follows the spatial decomposition of the Quadtree

Compare three-dimensional structures

I need to evaluate if two sets of 3d points are the same (ignoring translations and rotations) by finding and comparing a proper geometric hash. I did some paper research on geometric hashing techniques, and I found a couple of algorithms, that however tend to be complicated by "vision requirements" (eg. 2d to 3d, occlusions, shadows, etc).
Moreover, I would love that, if the two geometries are slightly different, the hashes are also not very different.
Does anybody know some algorithm that fits my need, and can provide some link for further study?
Thanks
Your first thought may be trying to find the rotation that maps one object to another but this a very very complex topic... and is not actually necessary! You're not asking how to best match the two, you're just asking if they are the same or not.
Characterize your model by a list of all interpoint distances. Sort the list by that distance. Now compare the list for each object. They should be identical, since interpoint distances are not affected by translation or rotation.
Three issues:
1) What if the number of points is large, that's a large list of pairs (N*(N-1)/2). In this case you may elect to keep only the longest ones, or even better, keep the 1 or 2 longest ones for each vertex so that every part of your model has some contribution. Dropping information like this however changes the problem to be probabilistic and not deterministic.
2) This only uses vertices to define the shape, not edges. This may be fine (and in practice will be) but if you expect to have figures with identical vertices but different connecting edges. If so, test for the vertex-similarity first. If that passes, then assign a unique labeling to each vertex by using that sorted distance. The longest edge has two vertices. For each of THOSE vertices, find the vertex with the longest (remaining) edge. Label the first vertex 0 and the next vertex 1. Repeat for other vertices in order, and you'll have assigned tags which are shift and rotation independent. Now you can compare edge topologies exactly (check that for every edge in object 1 between two vertices, there's a corresponding edge between the same two vertices in object 2) Note: this starts getting really complex if you have multiple identical interpoint distances and therefore you need tiebreaker comparisons to make the assignments stable and unique.
3) There's a possibility that two figures have identical edge length populations but they aren't identical.. this is true when one object is the mirror image of the other. This is quite annoying to detect! One way to do it is to use four non-coplanar points (perhaps the ones labeled 0 to 3 from the previous step) and compare the "handedness" of the coordinate system they define. If the handedness doesn't match, the objects are mirror images.
Note the list-of-distances gives you easy rejection of non-identical objects. It also allows you to add "fuzzy" acceptance by allowing a certain amount of error in the orderings. Perhaps taking the root-mean-squared difference between the two lists as a "similarity measure" would work well.
Edit: Looks like your problem is a point cloud with no edges. Then the annoying problem of edge correspondence (#2) doesn't even apply and can be ignored! You still have to be careful of the mirror-image problem #3 though.
There a bunch of SIGGRAPH publications which may prove helpful to you.
e.g. "Global Non-Rigid Alignment of 3-D Scans" by Brown and Rusinkiewicz:
http://portal.acm.org/citation.cfm?id=1276404
A general search that can get you started:
http://scholar.google.com/scholar?q=siggraph+point+cloud+registration
spin images are one way to go about it.
Seems like a numerical optimisation problem to me. You want to find the parameters of the transform which transforms one set of points to as close as possible by the other. Define some sort of residual or "energy" which is minimised when the points are coincident, and chuck it at some least-squares optimiser or similar. If it manages to optimise the score to zero (or as near as can be expected given floating point error) then the points are the same.
Googling
least squares rotation translation
turns up quite a few papers building on this technique (e.g "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns").
Update following comment below: If a one-to-one correspondence between the points isn't known (as assumed by the paper above), then you just need to make sure the score being minimised is independent of point ordering. For example, if you treat the points as small masses (finite radius spheres to avoid zero-distance blowup) and set out to minimise the total gravitational energy of the system by optimising the translation & rotation parameters, that should work.
If you want to estimate the rigid
transform between two similar
point clouds you can use the
well-established
Iterative Closest Point method. This method starts with a rough
estimate of the transformation and
then iteratively optimizes for the
transformation, by computing nearest
neighbors and minimizing an
associated cost function. It can be
efficiently implemented (even
realtime) and there are available
implementations available for
matlab, c++... This method has been
extended and has several variants,
including estimating non-rigid
deformations, if you are interested
in extensions you should look at
Computer graphics papers solving
scan registration problem, where
your problem is a crucial step. For
a starting point see the Wikipedia
page on Iterative Closest Point
which has several good external
links. Just a teaser image from a matlab implementation which was designed to match to point clouds:
(source: mathworks.com)
After aligning you could the final
error measure to say how similar the
two point clouds are, but this is
very much an adhoc solution, there
should be better one.
Using shape descriptors one can
compute fingerprints of shapes which
are often invariant under
translations/rotations. In most cases they are defined for meshes, and not point clouds, nevertheless there is a multitude of shape descriptors, so depending on your input and requirements you might find something useful. For this, you would want to look into the field of shape analysis, and probably this 2004 SIGGRAPH course presentation can give a feel of what people do to compute shape descriptors.
This is how I would do it:
Position the sets at the center of mass
Compute the inertia tensor. This gives you three coordinate axes. Rotate to them. [*]
Write down the list of points in a given order (for example, top to bottom, left to right) with your required precision.
Apply any algorithm you'd like for a resulting array.
To compare two sets, unless you need to store the hash results in advance, just apply your favorite comparison algorithm to the sets of points of step 3. This could be, for example, computing a distance between two sets.
I'm not sure if I can recommend you the algorithm for the step 4 since it appears that your requirements are contradictory. Anything called hashing usually has the property that a small change in input results in very different output. Anyway, now I've reduced the problem to an array of numbers, so you should be able to figure things out.
[*] If two or three of your axis coincide select coordinates by some other means, e.g. as the longest distance. But this is extremely rare for random points.
Maybe you should also read up on the RANSAC algorithm. It's commonly used for stitching together panorama images, which seems to be a bit similar to your problem, only in 2 dimensions. Just google for RANSAC, panorama and/or stitching to get a starting point.

Resources