I'm working on a small house design project and one of its most important parts is a section where the user can give some info about how he wants his rooms (for example, a 10 x 10 meter house with a 3x3 living room, a 3x3 kitchen, two 4x5 bedrooms, and a 4x2 bathroom), and then the program generates a map of the house according to the requirements given.
For now, I'm not worried about drawing the map, just about arranging the rooms so they don't overlap (yes, the output can be pretty ugly). I've already done some searching and found that what I want is very similar to the packing problem, for which there are algorithms that handle it pretty well (although it's an NP-complete problem).
But then I had one more restriction: the user can specify "links" between rooms. For example, he may wish that a room must have a "door" to a bathroom, the living room must have a direct connection to the kitchen, etc. (that is, the rooms must be placed side by side), and this is where things get complicated.
I'm pretty sure that what I want is an NP-hard problem, so I'm asking for tips to construct a good, but not necessarily optimal, implementation. The idea I have is to use graphs to represent the relationships between rooms, but I can't figure out how to adapt the existing packing algorithms to fit this new restriction. Can anyone help me?
I don't have a full answer for you, but I do have a hint: Your connectivity constraints will form what is known as a planar graph (if they don't, the solution is impossible with a single-story house). Rooms in the final solution will correspond to areas enclosed by edges in the dual of the constraint graph, so all you need to do then is take said dual, and adjust the shape of its edges, without introducing intersections, to fit sizing constraints. Note that you will need to introduce a vertex to represent 'outside' in the constraint graph, and ensure it is not surrounded in the dual. You may also need to introduce additional edges in the constraint graph to ensure all the rooms are connected (and add rooms for hallways, etc).
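As a quick feasibility check for that hint, here is a minimal sketch using networkx, whose check_planarity also returns a combinatorial embedding (a cyclic neighbor order around each room) that could seed construction of the dual. The room names and links are hypothetical:

```python
import networkx as nx

rooms = ["living", "kitchen", "bedroom1", "bedroom2", "bath", "outside"]
links = [("living", "kitchen"), ("living", "bath"),
         ("bedroom1", "bath"), ("living", "outside")]  # hypothetical input

G = nx.Graph(links)
G.add_nodes_from(rooms)                 # keep rooms with no links in the graph

is_planar, embedding = nx.check_planarity(G)
if not is_planar:
    print("No single-story layout can satisfy these adjacency constraints.")
else:
    # The embedding fixes a cyclic order of neighbors around each room,
    # which is a starting point for constructing the dual / floor plan.
    print(embedding.get_data())
```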
You might find this interesting.
It's a grammar for constructing Palladian villas.
To apply something like that to your problem, I would have a way to construct a layout at random, be able to make random changes to it, and drive those changes with a simulated annealing algorithm.
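A rough skeleton of that loop; random_layout, mutate, and cost are hypothetical placeholders for your own representation (cost would penalize overlaps, out-of-bounds rooms, and unsatisfied links):

```python
import math
import random

def anneal(random_layout, mutate, cost, steps=20_000, t0=1.0, t_end=1e-3):
    layout = random_layout()
    current_cost = cost(layout)
    best, best_cost = layout, current_cost
    for i in range(steps):
        t = t0 * (t_end / t0) ** (i / steps)   # geometric cooling schedule
        candidate = mutate(layout)
        c = cost(candidate)
        # Always accept improvements; accept regressions with probability
        # exp(-delta/t), which shrinks as the temperature drops.
        if c <= current_cost or random.random() < math.exp((current_cost - c) / t):
            layout, current_cost = candidate, c
            if c < best_cost:
                best, best_cost = candidate, c
    return best
```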
I want to know some more applications of the widest path problem.
It seems like something that can be used in a multitude of places, but I couldn't get anything constructive from searching on the internet.
Can someone please share as to where else this might be used?
Thanks in advance.
(What I searched for included uses in P2P networks and CDNs, but I couldn't find exactly how it is used / the papers were too long for me to scour.)
The widest path problem has a variety of applications in areas such as network routing problems, digital compositing and voting theory. Some specific applications include:
Finding the route with maximum transmission speed between two nodes.
This comes almost directly from the widest-path problem definition. We want to find the path between two nodes which maximizes the minimum-weight edge in the path.
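For illustration, a minimal sketch of this bottleneck variant of Dijkstra's algorithm, assuming graph maps each node to a list of (neighbor, capacity) pairs with positive capacities:

```python
import heapq

def widest_path(graph, source, target):
    # width[v] = best bottleneck found so far on any source->v path
    width = {source: float("inf")}
    heap = [(-float("inf"), source)]          # max-heap via negated widths
    while heap:
        w, u = heapq.heappop(heap)
        w = -w
        if u == target:
            return w
        if w < width.get(u, 0):
            continue                          # stale queue entry
        for v, capacity in graph.get(u, ()):
            bottleneck = min(w, capacity)
            if bottleneck > width.get(v, 0):
                width[v] = bottleneck
                heapq.heappush(heap, (-bottleneck, v))
    return 0                                  # target unreachable
```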
Computing the strongest path strengths in Schulze’s method.
Schulze's method is a system in voting theory for finding a single winner among multiple candidates. Each voter provides an ordered preference list. We then construct a weighted graph where vertices represent candidates and the weight of an edge (u, v) represents the number of voters who prefer candidate u over candidate v. Next, we want to find the strength of the strongest path between each pair of candidates. This is the part of Schulze's method that can be solved as a widest-path problem: we simply run a widest-path algorithm for each pair of vertices.
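A sketch of the Floyd-Warshall-style computation usually used for this step, assuming d[u][v] is the pairwise-preference count described above:

```python
def strongest_paths(d, candidates):
    # p[u][v] starts as the direct strength: d[u][v] if u beats v, else 0.
    p = {u: {v: (d[u][v] if d[u][v] > d[v][u] else 0)
             for v in candidates if v != u} for u in candidates}
    for k in candidates:                      # allowed intermediate candidate
        for u in candidates:
            if u == k:
                continue
            for v in candidates:
                if v in (u, k):
                    continue
                # A u->v path may detour through k; its strength is the
                # weaker of its two halves.
                p[u][v] = max(p[u][v], min(p[u][k], p[k][v]))
    return p
```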
Mosaicking of digital photographic maps. This is a technique for merging two maps into a single bigger map. The challenge is that the two original photos might have different light intensity, colors, etc. One way to do mosaicking is to produce seams where each pixel in the resulting picture comes entirely from one photo or the other. We want the seam to appear invisible in the final product. The problem of finding the optimal seam can be modeled as a widest-path problem. Details of the modeling are found in the original paper.
Metabolic path analysis for living organisms. The objective of this type of analysis is to identify critical reactions in living organisms. A network is constructed based on the stoichiometry of the reactions. We wish to find the path which is energetically favored in the production of a particular metabolite, i.e. the path whose bottleneck (weakest step) is as large as possible. This corresponds to the widest-path problem.
I am calculating pathfinding inside a mesh around which I have built a uniform grid. The nodes (cells in the 3D grid) close to what I deem a "standable" surface I mark as accessible, and they are used in my pathfinding. To get a lot of detail (like being able to pathfind up small staircases), the amount of accessible cells in my grid has grown quite large, several thousand in larger buildings (every grid cell is 0.5x0.5x0.5 m and the meshes are rooms with real-world dimensions). Even though I only use a fraction of the actual cells in my grid for pathfinding, the huge amount slows the algorithm down. Other than that it works fine and finds the correct path through the mesh, using a weighted Manhattan distance heuristic.
Imagine my grid looks like that, with the mesh inside it (there can be more or fewer cubes, but it's always cubical). However, the pathfinding will not be calculated on all the small cubes, just a few marked as accessible (usually at the bottom of the grid, though that depends on how many floors the mesh has).
I am looking to reduce the search space for the pathfinding... I have looked at clustering, like how HPA* does it, and at other clustering algorithms like Markov clustering, but they all seem to be best used with node graphs rather than grids. One obvious solution would be to just increase the size of the small cubes building the grid, but then I would lose a lot of detail in the pathfinding and it would not be as robust. How could I cluster these small cubes? This is how a typical search space looks when I do my pathfinding (blue are accessible, green is the path):
and as you can see there are a lot of cubes to search through, because the distance between them is quite small!
Never mind for now that the grid is a suboptimal solution for pathfinding.
Does anyone have an idea how to reduce the amount of cubes in the grid I have to search through, and how I would access the neighbors after I reduce the space? :) Right now it only looks at the closest neighbors while expanding the search space.
A couple possibilities come to mind.
Higher-level Pathfinding
The first is that your A* search may be searching the entire problem space. For example, you live in Austin, Texas, and want to get into a particular building somewhere in Alberta, Canada. A simple A* algorithm would search a lot of Mexico and the USA before finally searching Canada for the building.
Consider creating a second layer of A* to solve this problem. You'd first find out which states to travel between to get to Canada, then which provinces to reach Alberta, then Calgary, and then the Calgary Zoo, for example. In a sense, you start with an overview, then fill it in with more detailed paths.
If you have enormous levels, such as Skyrim's, you may need to add pathfinding layers between towns (multiple buildings), regions (multiple towns), and even countries (multiple regions). If you were making a GPS system, you might even need continents. If we'd become interstellar, our spaceships might contain pathfinding layers for planets, sectors, and even galaxies.
By using layers, you help to narrow down your search area significantly, especially if different areas don't use the same co-ordinate system! (It's fairly hard to estimate distance for one A* pathfinder if one of the regions needs latitude-longitude, another 3d-cartesian, and the next requires pathfinding through a time dimension.)
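A minimal sketch of the layering idea, assuming the grid has already been partitioned into regions; region_of and local_astar are hypothetical placeholders for your own code:

```python
from collections import deque

def region_path(region_graph, start_region, goal_region):
    # Plain BFS on the small region graph: which regions to travel through.
    parent = {start_region: None}
    q = deque([start_region])
    while q:
        r = q.popleft()
        if r == goal_region:
            path = []
            while r is not None:
                path.append(r)
                r = parent[r]
            return path[::-1]
        for nxt in region_graph[r]:
            if nxt not in parent:
                parent[nxt] = r
                q.append(nxt)
    return None

def hierarchical_path(grid, start, goal, region_graph, region_of, local_astar):
    regions = region_path(region_graph, region_of(start), region_of(goal))
    if regions is None:
        return None
    # Refine: run the detailed A* only inside the corridor of chosen regions,
    # which keeps the searched fraction of the grid small.
    return local_astar(grid, start, goal, allowed_regions=set(regions))
```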
More efficient algorithms
Finding efficient algorithms becomes more important in 3 dimensions because there are more nodes to expand while searching. A Dijkstra search that expands on the order of x^2 nodes in 2D would expand on the order of x^3 nodes in 3D, with x being the distance between the start and the goal. A 4D game would require yet more efficiency in pathfinding.
One of the benefits of grid-based pathfinding is that you can exploit structural properties of the grid, like path symmetry: if two paths consist of the same movements in a different order, you don't need to explore both of them. This is where a very efficient algorithm called Jump Point Search comes into play.
Here is a side-by-side comparison of A* (left) and JPS (right). Expanded/searched nodes are shown in red with walls in black:
Notice that they both find the same path, but JPS easily searched less than a tenth of what A* did.
As of now, I haven't seen an official 3-dimensional implementation, but I've helped another user generalize the algorithm to multiple dimensions.
Simplified Meshes (Graphs)
Another way to get rid of nodes during the search is to remove them before the search. For example, do you really need nodes in wide-open areas where you can trust a much more stupid AI to find its way? If you are building levels that don't change, create a script that parses them into the simplest grid which only contains important nodes.
This is actually called 'offline pathfinding'; basically finding ways to calculate paths before you need to find them. If your level will remain the same, running the script for a few minutes each time you update the level will easily cut 90% of the time you pathfind. After all, you've done most of the work before it became urgent. It's like trying to find your way around a new city compared to one you grew up in; knowing the landmarks means you don't really need a map.
Similar approaches to the 'symmetry-breaking' that Jump Point Search uses were introduced by Daniel Harabor, the creator of the algorithm. They are mentioned in one of his lectures, and allow you to preprocess the level to store only jump-points in your pathfinding mesh.
Clever Heuristics
Many academic papers state that A*'s cost function is f(x) = g(x) + h(x), which doesn't make it obvious that you may use other functions, weight the cost terms, and even feed heatmaps of territory or recent deaths in as cost functions. These may create sub-optimal paths, but they greatly improve the intelligence of your search. Who cares about the shortest path when your opponent has a choke point on it and has been easily dispatching anybody travelling through it? Better to be certain the AI can reach the goal safely than to let it be stupid.
For example, you may want to prevent the algorithm from letting enemies access secret areas, so that they avoid revealing them to the player and the AI seems unaware of them. All you need to achieve this is a uniform cost increase for any point within those 'off-limits' regions. In a game like this, enemies would simply give up on hunting the player once the path grew too costly. Another cool option is to 'scent' regions the player has visited recently (by temporarily increasing the cost of unvisited locations, since many algorithms dislike negative costs).
If you know what places you won't need to search, but can't implement in your algorithm's logic, a simple increase to their cost will prevent unnecessary searching. There's a lot of ways to take advantage of heuristics to simplify and inform your pathfinding, but your biggest gains will come from Jump Point Search.
EDIT: Jump Point Search implicitly selects pathfinding direction using the same heuristics as A*, so you may be able to implement heuristics to a small degree, but their cost function won't be the cost of a node, but rather, the cost of traveling between the two nodes. (A* generally searches adjacent nodes, so the distinction between a node's cost and the cost of traveling to it tends to break down.)
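To make the cost-shaping concrete, here is a minimal A* sketch with a hypothetical danger map added to the step cost and a weight w on the heuristic (w > 1 gives weighted A*: faster searches, possibly sub-optimal paths); neighbors, h, and danger are placeholders for your own code:

```python
import heapq

def astar(neighbors, start, goal, h, danger, w=1.0):
    g = {start: 0.0}
    parent = {start: None}
    heap = [(w * h(start, goal), start)]
    while heap:
        _, u = heapq.heappop(heap)
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v, step_cost in neighbors(u):
            # Danger penalties stay non-negative, so the search stays sane.
            cand = g[u] + step_cost + danger.get(v, 0.0)
            if cand < g.get(v, float("inf")):
                g[v] = cand
                parent[v] = u
                heapq.heappush(heap, (cand + w * h(v, goal), v))
    return None
```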
Summary
Although octrees/quad-trees/b-trees can be useful in collision detection, they aren't as applicable to searches because they section a graph based on its coordinates, not on its connections. Layering your graph (mesh in your vocabulary) into super-graphs (regions) is a more effective solution.
Hopefully I've covered everything you'll find useful.
Good luck!
I recently started playing Flow Free Game.
Connect matching colors with pipe to create a flow. Pair all colors, and cover the entire board to solve each puzzle in Flow Free. But watch out, pipes will break if they cross or overlap!
I realized it is just a pathfinding game between given pairs of points, with the condition that no two paths overlap. I was interested in writing a solver for the game but don't know where to start. I thought of using backtracking, but for very large board sizes it will have high time complexity.
Is there any suitable algorithm to solve the game efficiently? Can using heuristics help? Just give me a hint on where to start, and I will take it from there.
I observed in most of the boards that usually:
For the furthest points, you need to follow a path along the edge.
For points nearest to each other, follow the direct path if there is one.
Is this correct observation and can it be used to solve it efficiently?
Reduction to SAT
Basic idea
Reduce the problem to SAT
Use a modern SAT solver to solve the problem
Profit
Complexity
The problem is obviously in NP: if you guess a board configuration, it is easy (poly-time) to check whether it solves the problem.
Whether it is NP-hard (meaning at least as hard as every problem in NP, e.g. SAT) is not clear. Surely modern SAT solvers will not care and will solve large instances in a breeze anyway (I'd guess up to 100x100).
Literature on Number Link
Here I just copy Nuclearman's comment to the OP:
Searching for "SAT formulation of numberlink" and "NP-completeness of numberlink" leads to a couple references. Unsurprisingly, the two most interesting ones are in Japanese. The first is the actual paper proof of NP-completeness. The second describes how to solve NumberLink using the SAT solver, Sugar. –
Hint for reduction to SAT
There are several possibilities to encode the problem. I'll give one that I could make up quickly.
Remark
j_random_hacker noted that free-standing cycles are not allowed. The following encoding does allow them, which makes the SAT encoding a bit less attractive. The simplest method I could think of to forbid free-standing loops would introduce O(n^2) new variables, where n is the number of tiles on the board (count the distance to the nearest sink for each tile), unless one uses a log encoding for this, which would bring it down to O(n log n), possibly making the problem harder for the solver.
Variables
One variable per tile, piece type and color. For example, if the variable X-Y-T-C is true, it encodes that the tile at position X/Y is of type T and has color C. You don't need the empty tile type, since this cannot happen in a solution.
Set initial variables
Set the variables for the sink/sources and say no other tile can be sink/source.
Constraints
For every position, exactly one color/piece combination is true (cardinality constraint).
For every variable (position, type, color), the four adjacent tiles have to be compatible (if the color matches).
I might have missed something, but it should be easily fixed. A small sketch of this encoding follows below.
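Here is a small sketch of the variable numbering and the per-cell "exactly one" constraint, using the python-sat (pysat) package; the board parameters are hypothetical, and the adjacency-compatibility clauses described above are omitted for brevity:

```python
from pysat.card import CardEnc, EncType
from pysat.formula import CNF
from pysat.solvers import Glucose3

W, H, COLORS, TYPES = 5, 5, 4, 6          # hypothetical board parameters

def var(x, y, t, c):
    # Bijection from (position, type, color) to a positive DIMACS variable.
    return 1 + ((y * W + x) * TYPES + t) * COLORS + c

cnf = CNF()
top = var(W - 1, H - 1, TYPES - 1, COLORS - 1)   # highest base variable
for y in range(H):
    for x in range(W):
        # Exactly one (type, color) combination holds at each position.
        lits = [var(x, y, t, c) for t in range(TYPES) for c in range(COLORS)]
        enc = CardEnc.equals(lits=lits, bound=1, top_id=top,
                             encoding=EncType.seqcounter)
        top = max(top, enc.nv)                   # account for auxiliary vars
        cnf.extend(enc.clauses)

with Glucose3(bootstrap_with=cnf.clauses) as solver:
    print("satisfiable:", solver.solve())
```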
I suspect that no polynomial-time algorithm is guaranteed to solve every instance of this problem. But since one of the requirements is that every square must be covered by pipe, a similar approach to what both people and computers use for solving Sudoku should work well here:
For every empty square, form a set of possible colours for that square, and then repeatedly perform logical deductions at each square to shrink the allowed set of colours for that square.
Whenever a square's set of possible colours shrinks to size 1, the colour for that square is determined.
If we reach a state where no more logical deductions can be performed and the puzzle is not completely solved yet (i.e. there is at least one square with more than one possible colour), pick one of these undecided squares and recurse on it, trying each of the possible colours in turn. Each try will either lead to a solution, or a contradiction; the latter eliminates that colour as a possibility for that square.
When picking a square to branch on, it's generally a good idea to pick a square with as few allowed colours as possible.
[EDIT: It's important to avoid the possibility of forming invalid "loops" of pipe. One way to do this is by maintaining, for each allowed colour i of each square x, 2 bits of information: whether the square x is connected by a path of definite i-coloured tiles to the first i-coloured endpoint, and the same thing for the second i-coloured endpoint. Then when recursing, don't ever pick a square that has two neighbours with the same bit set (or with neither bit set) for any allowed colour.]
You actually don't need to use any logical deductions at all, but the more and better deductions you use, the faster the program will run as they will (possibly dramatically) reduce the amount of recursion. Some useful deductions include:
If a square is the only possible way to extend the path for some particular colour, then it must be assigned that colour.
If a square has colour i in its set of allowed colours, but it does not have at least 2 neighbouring squares that also have colour i in their sets of allowed colours, then it can't be "reached" by any path of colour i, and colour i can be eliminated as a possibility.
More advanced deductions based on path connectivity might help further -- e.g. if you can determine that every path connecting some pair of connectors must pass through a particular square, you can immediately assign that colour to the square.
This simple approach infers a complete solution without any recursion in your 5x5 example: the squares at (5, 2), (5, 3), (4, 3) and (4, 4) are forced to be orange; (4, 5) is forced to be green; (5, 5) is also forced to be green by virtue of the fact that no other colour could get to this square and then back again; now the orange path ending at (4, 4) has nowhere to go except to complete the orange path at (3, 4). Also (3, 1) is forced to be red; (3, 2) is forced to be yellow, which in turn forces (2, 1) and then (2, 2) to be red, which finally forces the yellow path to finish at (3, 3). The red pipe at (2, 2) forces (1, 2) to be blue, and the red and blue paths wind up being completely determined, "forcing each other" as they go.
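For concreteness, here is a minimal sketch of the candidate-set propagation loop, implementing just the second deduction above; cells whose colour is already decided are represented as singleton sets, and endpoint cells (which need only one matching neighbour) would be special-cased:

```python
def propagate(allowed, neighbors):
    # allowed: dict mapping each cell to its set of candidate colours.
    changed = True
    while changed:
        changed = False
        for cell, colors in allowed.items():
            for color in list(colors):
                # A path of this colour needs at least two like-coloured
                # neighbours to pass through this cell (deduction 2).
                support = sum(1 for n in neighbors(cell)
                              if color in allowed.get(n, set()))
                if support < 2:
                    colors.discard(color)
                    changed = True
    return allowed
```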
I found a blog post on Needlessly Complex that completely explains how to use SAT to solve this problem.
The code is open-source as well, so you can look at it (and understand it) in action.
I'll provide a quote from it here that describes the rules you need to implement in SAT:
Every cell is assigned a single color.
The color of every endpoint cell is known and specified.
Every endpoint cell has exactly one neighbor which matches its color.
The flow through every non-endpoint cell matches exactly one of the six direction types.
The neighbors of a cell specified by its direction type must match its color.
The neighbors of a cell not specified by its direction type must not match its color.
Thank you @Matt Zucker for creating this!
I like solutions that are similar to human thinking. You can (pretty easily) get the answer of a Sudoku by brute force, but it's more useful to get a path you could have followed to solve the puzzle.
I observed in most of the boards that usually:
1. For the furthest points, you need to follow a path along the edge.
2. For points nearest to each other, follow the direct path if there is one.
Is this correct observation and can it be used to solve it efficiently?
These are true "most of the times", but not always.
I would replace your first rule with this one: if both sinks are along an edge, you need to follow the path along the edge. (You could build a counter-example, but it's true most of the time.) After you make a path along the edge, the blocks along the edge should be considered part of the edge, so your algorithm will try to follow the new edge made by the previous pipe. I hope this sentence makes sense...
Of course, before using those "most of the time" rules, you need to follow the absolute rules (see the two deductions in j_random_hacker's post).
Another thing is to try to eliminate boards that can't lead to a solution. Let's call an unfinished pipe (one that starts from a sink but does not yet reach the other sink) a snake, and call the last square of the unfinished pipe the snake's head. If you can't find a path of blank squares between the two heads of the same color, your board can't lead to a solution and should be discarded (or you need to backtrack, depending on your implementation).
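Here is a sketch of that discard test as a flood fill (BFS) over blank squares; blank, the head positions, and neighbors are placeholders for your own board representation:

```python
from collections import deque

def heads_can_meet(blank, head_a, head_b, neighbors):
    # blank: set of empty squares; a connecting path may use blanks only.
    seen, q = {head_a}, deque([head_a])
    while q:
        cell = q.popleft()
        for n in neighbors(cell):
            if n == head_b:
                return True
            if n in blank and n not in seen:
                seen.add(n)
                q.append(n)
    return False   # this color is cut off: discard the board / backtrack
```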
The Flow Free game (and other similar games) accepts as a valid solution a board where two lines of the same color run side by side, but I believe there always exists a solution without side-by-side lines. That would mean that any square that is not a sink would have exactly two neighbors of the same color, and sinks would have exactly one. If this rule happens to be always true (I believe it is, but can't prove it), it would be an additional constraint to decrease your number of possibilities. I solved some of Flow Free's puzzles using side-by-side lines, but most of the time I found another solution without them. I haven't seen side-by-side lines on Flow Free's solutions web site.
A few rules that lead to a sort of algorithm for solving levels in Flow, based on the iOS versions by Big Duck Games (this company seems to produce the canonical versions). The rest of this answer assumes no walls, bridges or warps.
Even if you're uncannily good, the huge 15x18 square boards are a good example of how just going at it in ways that seem likely gets you stuck just before the end over and over again, practically forcing you to start again from scratch. This probably has to do with the already-mentioned exponential time complexity of the general case. But it doesn't mean that a simple strategy isn't overwhelmingly effective for most boards.
1. Blocks are never left empty, therefore orphaned blocks mean you've done something wrong.
2. Cardinally neighbouring cells of the same colour must be connected. This rules out 2x2 blocks of the same colour and, on the hexagonal grid, triangles of 3 neighbouring cells.
3. You can often make permanent progress by establishing that a colour goes in, or is excluded from, a certain square.
4. Due to points 1 and 2, on hexagonally-shaped boards on the hexagonal grid, a pipe going along an edge is usually stuck going along it all the way round to the exit, effectively moving the outer edge in and making the board smaller, so the process can be repeated. It is predictable what sorts of neighbouring conditions guarantee this cycle and what sorts can break it, for both sorts of grid.
Most if not all 3rd-party variants I've found lack rules 1 to 4, but given these constraints, generating valid boards may be a difficult task.
Answer:
Point 3 suggests storing a value for each cell that is either a colour, or a set of false/indeterminate flags, one for each colour.
A solver could repeatedly use points 1 and 2, along with the data stored for point 3, on small neighbourhoods of paths around the ends of pipes, progressively setting colours or setting the indeterminate flags to false.
A few of us have spent quite a bit of time thinking about this. I summarised our work in a Medium article here: https://towardsdatascience.com/deep-learning-vs-puzzle-games-e996feb76162
Spoiler: so far, good old SAT seems to beat fancy AI algorithms!
If you're not familiar with it, the game consists of a collection of cars of varying sizes, set either horizontally or vertically, on an NxM grid that has a single exit.
Each car can move forward/backward in the direction it's set in, as long as another car is not blocking it. You can never change the direction of a car.
There is one special car, usually it's the red one. It's set in the same row that the exit is in, and the objective of the game is to find a series of moves (a move - moving a car N steps back or forward) that will allow the red car to drive out of the maze.
I've been trying to think how to generate instances for this problem, grading levels of difficulty by the minimum number of moves needed to solve the board.
Any idea of an algorithm or a strategy to do that?
Thanks in advance!
The board given in the question has at most 4*4*4*5*5*3*5 = 24,000 possible configurations, given the placement of cars.
A graph with 24,000 nodes is not very large for today's computers. So a possible approach would be to
construct the graph of all positions (nodes are positions, edges are moves),
find the minimum number of moves to win for all nodes (e.g. using Dijkstra), and
select a node with a large distance from the goal.
One possible approach would be creating it in reverse.
Generate a random board, that has the red car in the winning position.
Build the graph of all reachable positions.
Select a position that has the largest distance from every winning position.
The number of reachable positions is not that big (probably always below 100k), so (2) and (3) are feasible.
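Here is a sketch of steps 2 and 3, under the assumption that successors(board) is a hypothetical function returning every board reachable in one move and boards are hashable (e.g. tuples). Since moves in this game are reversible, a BFS outward from the winning position(s) yields true minimal solution lengths:

```python
from collections import deque

def distances_from_goal(winning_boards, successors):
    # BFS over the implicit move graph, starting from all winning boards.
    dist = {b: 0 for b in winning_boards}
    q = deque(winning_boards)
    while q:
        board = q.popleft()
        for nxt in successors(board):
            if nxt not in dist:
                dist[nxt] = dist[board] + 1
                q.append(nxt)
    return dist

# A hardest reachable instance is then: max(dist, key=dist.get)
```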
How to create harder instances through local search
It's possible that the above approach will not yield hard instances, as most random instances don't give rise to complex interlocking behavior of the cars.
You can do some local search, which requires
a way to generate other boards from an existing one
an evaluation/fitness function
(2) is simple: you could use the length of the minimal solution, see above (though this is quite costly to evaluate).
(1) requires some thought. Possible modifications are:
add a car somewhere
remove a car (I assume this will always make the board easier)
Those two are enough to reach all possible boards. But one might want to add other ways, because removing a car makes the board easier. Here are some ideas:
move a car perpendicularly to its driving direction
swap cars within the same lane (aaa..bb.) -> (bb..aaa.)
Hill climbing/steepest ascent is probably bad because of the large branching factor. One can try to subsample the set of possible neighbouring boards, i.e., don't look at all of them but only at a few random ones.
I know this is ancient but I recently had to deal with a similar problem so maybe this could help.
Constructing instances by applying random operators from a terminal state (i.e., reverse) will not work well. This is due to the symmetry in the state space. On average you end up in a state that is too close to the terminal state.
Instead, what worked better was to generate initial states (by placing random cars on the grid) and then to try to solve it with some bounded heuristic search algorithm such as IDA* or branch and bound. If an instance cannot be solved under the bound, discard it.
Try to avoid plain A*. If you have a definition of what you consider a "hard" instance (I find 16 moves to be pretty difficult), you can use A* with a pruning rule that prevents expansion of nodes x with g(x)+h(x) > T (T being your threshold, e.g. 16).
Heuristic function - Since you don't have to be optimal when solving it, you can use any simple inadmissible heuristic, such as the number of obstacle squares to the goal. Alternatively, if you need a stronger heuristic function, you can implement a Manhattan-distance-style function by generating the entire set of winning states for the generated puzzle and then using the minimal distance from the current state to any of the terminal states.
I need to evaluate whether two sets of 3D points are the same (ignoring translations and rotations) by finding and comparing a proper geometric hash. I did some research on geometric hashing techniques, and I found a couple of algorithms, which however tend to be complicated by "vision requirements" (e.g. 2D-to-3D, occlusions, shadows, etc.).
Moreover, I would love it if, when the two geometries are slightly different, the hashes are also not very different.
Does anybody know some algorithm that fits my need, and can provide some link for further study?
Thanks
Your first thought may be trying to find the rotation that maps one object to another, but this is a very complex topic... and is not actually necessary! You're not asking how to best match the two, you're just asking whether they are the same or not.
Characterize your model by a list of all interpoint distances. Sort the list by that distance. Now compare the list for each object. They should be identical, since interpoint distances are not affected by translation or rotation.
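A minimal sketch of this signature and comparison, assuming each point set is given as an (N, 3) numpy array:

```python
import numpy as np
from scipy.spatial.distance import pdist

def distance_signature(points):
    # All N*(N-1)/2 pairwise distances, sorted: invariant to rotation,
    # translation, and point ordering.
    return np.sort(pdist(points))

def same_shape(a, b, tol=1e-6):
    sa, sb = distance_signature(a), distance_signature(b)
    # Compare with a tolerance for floating-point noise; loosening tol gives
    # the "fuzzy" acceptance discussed below.
    return sa.shape == sb.shape and np.allclose(sa, sb, atol=tol)
```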
Three issues:
1) What if the number of points is large? That's a large list of pairs: N*(N-1)/2 of them. In this case you may elect to keep only the longest distances, or even better, keep the 1 or 2 longest ones for each vertex so that every part of your model has some contribution. Dropping information like this, however, changes the problem from deterministic to probabilistic.
2) This only uses vertices to define the shape, not edges. This may be fine (and in practice it will be), but it matters if you expect figures with identical vertices but different connecting edges. If so, test for vertex similarity first. If that passes, then assign a unique labelling to each vertex by using that sorted distance list. The longest edge has two vertices. For each of THOSE vertices, find the vertex with the longest (remaining) edge. Label the first vertex 0 and the next vertex 1. Repeat for other vertices in order, and you'll have assigned tags which are shift- and rotation-independent. Now you can compare edge topologies exactly (check that for every edge in object 1 between two vertices, there's a corresponding edge between the same two vertices in object 2). Note: this starts getting really complex if you have multiple identical interpoint distances, and then you need tiebreaker comparisons to make the assignments stable and unique.
3) There's a possibility that two figures have identical edge-length populations but aren't identical: this is true when one object is the mirror image of the other. This is quite annoying to detect! One way to do it is to use four non-coplanar points (perhaps the ones labelled 0 to 3 from the previous step) and compare the "handedness" of the coordinate system they define. If the handedness doesn't match, the objects are mirror images.
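A small sketch of that handedness comparison: the sign of the scalar triple product of three edge vectors taken from four labelled, non-coplanar points flips under mirroring, so differing signs between the two objects mean mirror images:

```python
import numpy as np

def handedness(p0, p1, p2, p3):
    # Sign of det([p1-p0; p2-p0; p3-p0]); +1 and -1 indicate opposite
    # orientations of the frame defined by the four points.
    return np.sign(np.linalg.det(np.stack([p1 - p0, p2 - p0, p3 - p0])))
```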
Note the list-of-distances gives you easy rejection of non-identical objects. It also allows you to add "fuzzy" acceptance by allowing a certain amount of error in the orderings. Perhaps taking the root-mean-squared difference between the two lists as a "similarity measure" would work well.
Edit: Looks like your problem is a point cloud with no edges. Then the annoying problem of edge correspondence (#2) doesn't even apply and can be ignored! You still have to be careful of the mirror-image problem #3 though.
There are a bunch of SIGGRAPH publications which may prove helpful to you.
e.g. "Global Non-Rigid Alignment of 3-D Scans" by Brown and Rusinkiewicz:
http://portal.acm.org/citation.cfm?id=1276404
A general search that can get you started:
http://scholar.google.com/scholar?q=siggraph+point+cloud+registration
Spin images are one way to go about it.
Seems like a numerical optimisation problem to me. You want to find the parameters of the transform which bring one set of points as close as possible to the other. Define some sort of residual or "energy" which is minimised when the points are coincident, and chuck it at some least-squares optimiser or similar. If it manages to optimise the score to zero (or as near as can be expected given floating-point error), then the points are the same.
Googling
least squares rotation translation
turns up quite a few papers building on this technique (e.g. "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns").
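As a side note, when a one-to-one correspondence between the points is known (as those papers assume), the optimal rotation actually has a closed form rather than requiring an iterative optimiser. A minimal numpy sketch of that closed-form solution (the Kabsch algorithm), assuming p and q are corresponding (N, 3) arrays:

```python
import numpy as np

def kabsch(p, q):
    pc, qc = p - p.mean(axis=0), q - q.mean(axis=0)   # remove translation
    u, _, vt = np.linalg.svd(pc.T @ qc)
    d = np.sign(np.linalg.det(u @ vt))                # guard against reflections
    r = u @ np.diag([1.0, 1.0, d]) @ vt               # optimal rotation
    residual = np.linalg.norm(pc @ r - qc)            # ~0 if shapes match
    return r, residual
```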
Update following comment below: If a one-to-one correspondence between the points isn't known (as assumed by the paper above), then you just need to make sure the score being minimised is independent of point ordering. For example, if you treat the points as small masses (finite radius spheres to avoid zero-distance blowup) and set out to minimise the total gravitational energy of the system by optimising the translation & rotation parameters, that should work.
If you want to estimate the rigid transform between two similar point clouds you can use the well-established Iterative Closest Point method. This method starts with a rough estimate of the transformation and then iteratively optimizes it by computing nearest neighbors and minimizing an associated cost function. It can be implemented efficiently (even in realtime) and implementations are available for MATLAB, C++, and more. The method has been extended and has several variants, including ones that estimate non-rigid deformations; if you are interested in extensions you should look at computer graphics papers on the scan registration problem, where your problem is a crucial step. For a starting point, see the Wikipedia page on Iterative Closest Point, which has several good external links. (A teaser image from a MATLAB implementation designed to match two point clouds appeared here; source: mathworks.com.)
After aligning, you could use the final error measure to say how similar the two point clouds are, but this is very much an ad-hoc solution; there should be a better one.
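For reference, a bare-bones ICP sketch using scipy's KD-tree for the nearest-neighbour step and the closed-form least-squares rotation for the update; real implementations add outlier rejection, point-to-plane error metrics, and convergence tests:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=50):
    """Iteratively align src (N, 3) to dst (M, 3); returns aligned src + residual."""
    src = src.copy()
    tree = cKDTree(dst)
    for _ in range(iters):
        _, idx = tree.query(src)               # pair each src point with its
        matched = dst[idx]                     # current nearest neighbor
        # Closed-form best rotation/translation for the current pairing:
        sc, mc = src.mean(axis=0), matched.mean(axis=0)
        u, _, vt = np.linalg.svd((src - sc).T @ (matched - mc))
        d = np.sign(np.linalg.det(u @ vt))     # avoid reflections
        r = u @ np.diag([1.0, 1.0, d]) @ vt
        src = (src - sc) @ r + mc
    return src, float(tree.query(src)[0].mean())
```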
Using shape descriptors, one can compute fingerprints of shapes which are often invariant under translations/rotations. In most cases they are defined for meshes rather than point clouds; nevertheless, there is a multitude of shape descriptors, so depending on your input and requirements you might find something useful. For this, you would want to look into the field of shape analysis; this 2004 SIGGRAPH course presentation can give a feel for what people do to compute shape descriptors.
This is how I would do it:
Position the sets at the center of mass
Compute the inertia tensor. This gives you three coordinate axes. Rotate to them. [*]
Write down the list of points in a given order (for example, top to bottom, left to right) with your required precision.
Apply any algorithm you'd like for a resulting array.
To compare two sets, unless you need to store the hash results in advance, just apply your favorite comparison algorithm to the sets of points of step 3. This could be, for example, computing a distance between two sets.
I'm not sure I can recommend an algorithm for step 4, since it appears that your requirements are contradictory: anything called hashing usually has the property that a small change in input results in a very different output. Anyway, now I've reduced the problem to an array of numbers, so you should be able to figure things out.
[*] If two or three of your axes coincide, select the coordinates by some other means, e.g. along the longest distance. But this is extremely rare for random points.
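A numpy sketch of steps 1-3; note the eigenvector signs of the inertia tensor are still ambiguous, so a full solution would also fix them (e.g. using third moments):

```python
import numpy as np

def canonicalize(points):
    p = points - points.mean(axis=0)          # step 1: center of mass
    # Inertia tensor for unit masses: I = sum(|r|^2 * Id - r r^T)
    inertia = (p ** 2).sum() * np.eye(3) - p.T @ p
    _, axes = np.linalg.eigh(inertia)         # eigenvectors = principal axes
    q = p @ axes                              # step 2: rotate into that frame
    # Step 3: sort rows into a fixed order, then round to required precision.
    return np.round(q[np.lexsort(q.T)], decimals=6)
```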
Maybe you should also read up on the RANSAC algorithm. It's commonly used for stitching together panorama images, which seems to be a bit similar to your problem, only in 2 dimensions. Just google for RANSAC, panorama and/or stitching to get a starting point.