What is the efficient Algorithm for Solving Jigsaw Puzzle? - algorithm

Yesterday I was just playing Jigsaw Puzzle and somehow wondered what would be algorithm for solving it.
As human, steps which I followed where:
Separate all pieces in 3 parts, single flat edge, double flat edge and no edge at all.
Separate flat edge pieces as they would be corners of image
Separate single edge pieces as they would form 4 end edges of images
Lastly, pieces with no edges would form internal of the image.
Match the color and image pieces to put pieces together.
I was wondering what would be the efficient algorithm to solve this puzzle efficiently and what datastructure would provide optimum efficient solution.
Thanks.

Solving problems like this can be deceptively complicated, especially if no constraints are placed on the size and complexity of the puzzle.
Here's my thoughts on an approach to writing a program to solve such a puzzle.
There are four key pieces of information that you can use individually and together as clues to solving a jigsaw puzzle:
The shape information of each of the pieces (how their edges appear)
The color information of each of the pieces (adjacent pieces will generally have smooth transitions)
The orientation information of each piece (where flat and corner edges may lie)
The overall size and number of pieces provide the general dimensions of the puzzle
So what kind of information will the program will be supplied - let's assume that each puzzle piece is an small rectangular image with transparency information used to identify the portion of the puzzle piece that represent non-rectangular edges.
From this, it is relatively easy to identify the four corner pieces (in a typical jigsaw). These would have exactly two edges that have flat contours (see contour map below).
Next, I would build information about the shape of each of the four edges of a puzzle piece. This information can be used to build an adjacency matrix indicating which pieces fit together.
Now we can prune this adjacency matrix to identify just those pieces that have smooth color transitions in their adjacent configuration. This is somewhat tricky because it requires a level of fuzzy matching - since not every pixel-to-pixel boundary will necessarily have a smooth color transition.
Using the four corner pieces originally identified, we should now be able to reconstruct the dimensions and positions of all of the pieces in the puzzle.
A convenient data structure for representing edge shapes may be a contour map - essentially a set of integers representing the incremental deltas in distance from the opposing side of the image to the last non-transparent pixel in each of the four sides of the puzzle piece. Pieces that match should have mirror-image contour maps.

Make sure to scan for male/female portions of a piece--this will cut the search in half.

Assuming you're not going to get into any computer vision stuff, it would be very small variations on a search of the entire problem space, i.e. trying every piece until one fits, and repeating. The major optimization would be not trying the same piece in the same place if you know it doesn't fit. Side/corner pieces make up relatively few of the pieces and probably couldn't be considered in any major optimization.
The data structure would probably be something like a hash matrix, where you could quickly check if you're already tried a piece in a position.
An easy optimization that includes computer vision would be to try pieces at each position after sorting pieces by how close their average color matches adjacent positions.
This just off the top of my head of course.

I don't think that the human way would be that helpful for an implementation - a computer can look at all pieces many times a second and I see no (big) win by categorizing the pieces into corner, edge, and inner pieces, especially because there are only three categories and they have very different sizes.
Given a set of images of all pieces I would try to derive a simple descriptor for every piece or edge. The descriptor must contain information about the rough shape and the color of the piece respectively the four edges. Given a puzzle with 1000 pieces, there are 4000 edges and always two must be equal (ignoring the border of the puzzle). In consequence the descriptor must be able to distinguish 2000 edges requiring at least 11 bits.
Dividing one piece into a 3 x 3 check board pattern with nine fields will give three colors per edge - with eight bits per channel we already have 72 bits. I first tended to suggest to reduce the color resolution, but this seems not to be a good idea - for example a blue sky might really benefit from a high color resolution. Note that calculating the colors probably requires separating the piece from the background and trying to align the edges to the horizontal and vertical axises.
In very uniform areas like blue skies the color information will probably not be enough to find good matches and additional geometric information will be required. I would try to describe the shape of the edge by its curvature or a derived measure. Maybe sampled at ten to twenty points per edge. This probably again relies on background separation and edge alignment.
Finally the computer can do the easy part - compare all pairs of edge descriptors and find the best matches. This process should probably be controlled to become more local instead of simple best match first because when ever you have found a corner (Correct English word? I mean three pieces in a L-shape.) you have two edges constraining the piece to find and one can track back early if no good match can be found (indicating an error made before or a hard puzzle).

Passing over this I thought of an interesting solution which solves it at increasing costs over a series of steps.
Separate all puzzle pieces into sets of two. Test to see if they fit together. If not, try a different piece it hasn't seen before. If it does, put the set into a correct pile. Repeat until all sets of two has found a match.
From the correct pile combine the set of twos to make a set with sets of twos i.e {{1,2},{5,6}}. See if at least one puzzle piece from one set of two fits with at least another puzzle piece from the other set of two. If not, try a different set of two it hasn't seen before. If it does, combine the two sets into one set of four in the correct orientation with the piece you found to fit together and put the combined set into a correct pile. Repeat until all sets of four has been found.
Repeat the steps until the final problem where set n/2 is combined with set n/2.
Not positive what the computation time for this would be.

Related

Solving the sliding puzzle-like problem with arbitrary number of holes

I've tried searching for a while, but haven't come across a solution, so figured I would ask my own.
Consider an MxM 2D grid of holes, and a set of N balls which are randomly placed in the grid. You are given some final configuration of the N balls in the grid, and your goal is to move the balls in the grid to achieve this final configuration in the shortest time possible.
The only move you are allowed to make is to move any contiguous subsection of the grid (on either a row or column) by one space. That sounds a bit confusing; basically you can select any set of points in a straight line in the grid, and shift all the balls in that subsection by one spot to the left or right if it is a row, or one spot up or down if it is a hole. If that is confusing, it's fine to consider the alternate problem where the only move you can make is to move a single ball to any adjacent spot. The caveat is that two balls can never overlap.
Ultimately this problem basically boils down to a version of the classic sliding tile puzzle, with two key differences: 1) there can be an arbitrary number of holes, and 2) we don't a priori know the numbering of the tiles - we don't care which balls end up in the final holes, we just want to final holes to be filled after it is all said and done.
I'm looking for suggestions about how to go about adapting classic sliding puzzle solutions to these two constraints. The arbitrary number of holes is likely pretty easy to implement efficiently, but the fact that we don't know which balls are destined to go in which holes at the start is throwing me for a loop. Any advice (or implementations of similar problems) would be greatly appreciated.
If I understood well:
all the balls are equal and cannot be distinguished - they can occupy any position on the grid, the starting state is a random configuration of balls and holes on the grid.
there are nxn = balls + holes = number of cells in the grid
your target is to reach a given configuration.
It seems a rather trivial problem, so maybe I missed some constraints. If this is indeed the problem, solving it can be approached like this:
Consider that you move the holes, not the balls.
conduct a search between each hole and each hole position in the target configuration.
Minimize the number of steps to walk the holes to their closest target. (maybe with a BFS if it is needed) - That is to say that you can use this measure as a heuristic to order the moves in a flavor of A* maybe. I think for a 50x50 grid, the search will be very fast, because your heuristic is extremely precise and nearly costless to calculate.
Solving the problem where you can move a hole along multiple positions on a line, or a file is not much more complicated; you can solve it by adding to the possible moves/next steps in your queue.

How to fill an area with a collection of rectangles and no space left

I have a collection of different sized rectangles that need to be placed on a 2D surface of a known dimension, with the condition that:
there is no overlap between the blocks,
the rectangles may not be rotated,
there is no area left blank (whole surface needs to be filled).
The rectangles are actually generated by breaking down the surface area, so they ought to fill up the area completely once you start putting everything together.
I figured there would be a dozen algorithms available for this problem, but the algorithms I find are more like best effort algorithms such as sprite generators that do not have the precondition that the whole area needs to be (can be...) filled -- which obviously is not necessary when building sprites, however, it is in my case.
I am a bit lost here, either this problem isn't as simple as I thought, or I am searching on wrong keywords.
Some topics I have found but do not fully suit my needs:
What algorithm can be used for packing rectangles of different sizes into the smallest rectangle possible in a fairly optimal way? (in my case, the area is preset)
How to arrange N rectangles to cover minimum area (in my case, minimum area must equal zero)
Is there any algorithm out there that may suit my needs?
IMHO, the most natural solution is recursive. For the form of source area is not set. And after removing a rectangle from it, we have the same task, only with smaller area and -1 rectangle.
I would start from the edges, because there the possible variants are already limited. So, simply go by spiral, trying to put rectangles along the edge. If no rectangle fits, go back. That will be the simplest and not so slow raw force method.

How to fit more than one line to data points

I am trying to fit more than one line to a list of points in 2D. My points are quite low in number (16 or 32).
These points are coming from a simulated environment of a robot with laser range finders attached to its side. If the points lie on a line it means that they detected a wall, if not, it means they detected an obstacle. I am trying to detect the walls and calculate their intersection, and for this I thought the best idea is to fit lines on the dataset.
Fitting one line to a set of points is not a problem, if we know all those points line on or around a line.
My problem is that I don't know how can I detect which sets of points should be classified for fitting on the same line and which should not be, for each line. Also, I don't even now the number of points on a line, while naturally it would be the best to detect the longest possible line segment.
How would you solve this problem? If I look at all the possibilities for example for groups of 5 points for all the 32 points then it gives 32 choose 5 = 201376 possibilities. I think it takes way too much time to try all the possibilities and try to fit a line to all 5-tuples.
So what would be a better algorithm what would run much faster? I could connect points within limit and create polylines. But even connecting the points is a hard task, as the edge distances change even within a single line.
Do you think it is possible to do some kind of Hough transform on a discrete dataset with such a low number of entries?
Note: if this problem is too hard to solve, I was thinking about using the order of the sensors and use it for filtering. This way the algorithm could be easier but if there is a small obstacle in front of a wall, it would distract the continuity of the line and thus break the wall into two halves.
A good way to find lines in noisy point data like this is to use RANSAC. Standard RANSAC usage is to pick the best hypothesis (=line in this case) but you can just as easy pick the best 2 or 4 lines given your data. Have a look at the example here:
http://www.janeriksolem.net/2009/06/ransac-using-python.html
Python code is available here
http://www.scipy.org/Cookbook/RANSAC
The first thing I would point out is that you seem to be ignoring a vital aspect of the data, which is you know which sensors (or readings) are adjacent to each other. If you have N laser sensors you know where they are bolted to the robot and if you are rotating a sensor you know the order (and position) in which the measurements are taken. So, connecting points together to form a piecewise linear fit (polylines) should be trivial. Once you had these correspondances you could take each set of four points and determine if they can be modeled effectively by only 2 lines, or something.
Secondly, it's well known that finding a globally optimal fit for even two lines to an arbitrary set of points is NP-Hard as it can be reduced to k-means clustering, so I would not expect to find an efficient algorithm for this. When I used the Hough transform it was for finding corners based on pixel intensities, in theory it is probably applicable to this sort of problem but it is just an approximation and it is probably going to take a fair bit of work to figure out and implement.
I hate to suggest it but it seems like you might benefit by looking at the problem in a slightly different way. When I was working in autonomous navigation with a laser range finder we solved this problem by discretizing the space into an occupancy grid, which is the default approach. Of course, this just assumes the walls are just another object, which is not a particularly outrageous idea IMO. Beyond that, can you assume you have a map of the walls and you are just trying to find the obstacles and localize (find the pose of) the robot? If so, there are a large number of papers and tech reports on this problem.
A part of the solution could be (it's where I would start) to investigate the robot. Questions such as:
How fast is the robot turning/moving?
At what interval is the robot shooting the laser?
Where was the robot and what was its orientation when a point was found?
The answers to these questions might help you better than trying to find some algorithm that relies on the points only. Especially when there are so very few.
The answers to these questions give you more information/points. E.g., if you know the robot was at a specific position when it detected a point, there are no points in between the position of the robot and the detected point. that is very valuable.
For all the triads, fit a line through them and compute how much the line is deviating or not from the points.
Then use only the good (little-deviating) triads and merge them if they have two points at common, and the grow the sets by appending all triads that have (at least) two points in the set.
As the last step you can ditch the triads that haven't been merged with any other but have some common element with result set of at least 4 elements (to get away with crossing lines) but keep those triads that did not merge with anyone but does not have any element in common with sets (this would yield three points on the right side of your example as one of the lines).
I presume this would find the left line and bottom line in the second step, and the right line the third.

Merging and splitting overlapping rectangles to produce non-overlapping ones

I am looking for an algorithm as follows:
Given a set of possibly overlapping rectangles (All of which are "not rotated", can be uniformly represented as (left,top,right,bottom) tuplets, etc...), it returns a minimal set of (non-rotated) non-overlapping rectangles, that occupy the same area.
It seems simple enough at first glance, but prooves to be tricky (at least to be done efficiently).
Are there some known methods for this/ideas/pointers?
Methods for not necessarily minimal, but heuristicly small, sets, are interesting as well, so are methods that produce any valid output set at all.
Something based on a line-sweep algorithm would work, I think:
Sort all of your rectangles' min and max x coordinates into an array, as "start-rectangle" and "end-rectangle" events
Step through the array, adding each new rectangle encountered (start-event) into a current set
Simultaneously, maintain a set of "non-overlapping rectangles" that will be your output set
Any time you encounter a new rectangle you can check whether it's completely contained already in the current / output set (simple comparisons of y-coordinates will suffice)
If it isn't, add a new rectangle to your output set, with y-coordinates set to the part of the new rectangle that isn't already covered.
Any time you hit a rectangle end-event, stop any rectangles in your output set that aren't covering anything anymore.
I'm not completely sure this covers everything, but I think with some tweaking it should get the job done. Or at least give you some ideas... :)
So, if I were trying to do this, the first thing I'd do is come up with a unified grid space. Find all unique x and y coordinates, and create a mapping to an index space. So if you have x values { -1, 1.5, 3.1 } then map those to { 0, 1, 2 }, and likewise for y. Then every rectangle can be exactly represented with these packed integer coordinates.
Then I'd allocate a bitvector or something that covers the entire grid, and rasterize your rectangles in the grid. The nice thing about having a grid is that it's really easy to work with, and by limiting it to unique x and y coordinates it's minimal and exact.
One way to come up with a pretty quick solution is just dump every 'pixel' of your grid.. run them back through your mapping, and you're done. If you're looking for a more optimal number of rectangles, then you've got some sort of search problem on your hands.
Let's look at 4 neighboring pixels, a little 2x2 square. When I write algorithms like these, typically I think in terms of verts, edges, and faces. So, these are the faces around a vert. If 3 of them are on and 1 is off, then you've got a concave corner. This is the only invalid case. For example, if I don't have any concave corners, I just grab the extents and dump the whole thing as a single rectangle. For each concave corner, you need to decide whether to split horizontally, vertically, or both. I think of the splitting as marking edges not to cross when finding extents. You could also do it as coloring into sets, whatever is easier for you.
The concave corners and their split directions are your search space.. you can use whatever optimization algorithm you'd like. Branch/bound might work well. I bet you could find a simple heuristic that performs well (for example, if there's another concave corner directly across from the one you're considering, always split in that direction. Otherwise, split in the shorter direction). You could just go greedy. Or you could just split every concave vert in both directions, which would generally give you fewer rectangles than outputting every 'pixel' as a rect, and would be pretty simple.
Reading over this I realize that there may be areas that are unclear. Let me know if you want me to clarify anything.

Compare three-dimensional structures

I need to evaluate if two sets of 3d points are the same (ignoring translations and rotations) by finding and comparing a proper geometric hash. I did some paper research on geometric hashing techniques, and I found a couple of algorithms, that however tend to be complicated by "vision requirements" (eg. 2d to 3d, occlusions, shadows, etc).
Moreover, I would love that, if the two geometries are slightly different, the hashes are also not very different.
Does anybody know some algorithm that fits my need, and can provide some link for further study?
Thanks
Your first thought may be trying to find the rotation that maps one object to another but this a very very complex topic... and is not actually necessary! You're not asking how to best match the two, you're just asking if they are the same or not.
Characterize your model by a list of all interpoint distances. Sort the list by that distance. Now compare the list for each object. They should be identical, since interpoint distances are not affected by translation or rotation.
Three issues:
1) What if the number of points is large, that's a large list of pairs (N*(N-1)/2). In this case you may elect to keep only the longest ones, or even better, keep the 1 or 2 longest ones for each vertex so that every part of your model has some contribution. Dropping information like this however changes the problem to be probabilistic and not deterministic.
2) This only uses vertices to define the shape, not edges. This may be fine (and in practice will be) but if you expect to have figures with identical vertices but different connecting edges. If so, test for the vertex-similarity first. If that passes, then assign a unique labeling to each vertex by using that sorted distance. The longest edge has two vertices. For each of THOSE vertices, find the vertex with the longest (remaining) edge. Label the first vertex 0 and the next vertex 1. Repeat for other vertices in order, and you'll have assigned tags which are shift and rotation independent. Now you can compare edge topologies exactly (check that for every edge in object 1 between two vertices, there's a corresponding edge between the same two vertices in object 2) Note: this starts getting really complex if you have multiple identical interpoint distances and therefore you need tiebreaker comparisons to make the assignments stable and unique.
3) There's a possibility that two figures have identical edge length populations but they aren't identical.. this is true when one object is the mirror image of the other. This is quite annoying to detect! One way to do it is to use four non-coplanar points (perhaps the ones labeled 0 to 3 from the previous step) and compare the "handedness" of the coordinate system they define. If the handedness doesn't match, the objects are mirror images.
Note the list-of-distances gives you easy rejection of non-identical objects. It also allows you to add "fuzzy" acceptance by allowing a certain amount of error in the orderings. Perhaps taking the root-mean-squared difference between the two lists as a "similarity measure" would work well.
Edit: Looks like your problem is a point cloud with no edges. Then the annoying problem of edge correspondence (#2) doesn't even apply and can be ignored! You still have to be careful of the mirror-image problem #3 though.
There a bunch of SIGGRAPH publications which may prove helpful to you.
e.g. "Global Non-Rigid Alignment of 3-D Scans" by Brown and Rusinkiewicz:
http://portal.acm.org/citation.cfm?id=1276404
A general search that can get you started:
http://scholar.google.com/scholar?q=siggraph+point+cloud+registration
spin images are one way to go about it.
Seems like a numerical optimisation problem to me. You want to find the parameters of the transform which transforms one set of points to as close as possible by the other. Define some sort of residual or "energy" which is minimised when the points are coincident, and chuck it at some least-squares optimiser or similar. If it manages to optimise the score to zero (or as near as can be expected given floating point error) then the points are the same.
Googling
least squares rotation translation
turns up quite a few papers building on this technique (e.g "Least-Squares Estimation of Transformation Parameters Between Two Point Patterns").
Update following comment below: If a one-to-one correspondence between the points isn't known (as assumed by the paper above), then you just need to make sure the score being minimised is independent of point ordering. For example, if you treat the points as small masses (finite radius spheres to avoid zero-distance blowup) and set out to minimise the total gravitational energy of the system by optimising the translation & rotation parameters, that should work.
If you want to estimate the rigid
transform between two similar
point clouds you can use the
well-established
Iterative Closest Point method. This method starts with a rough
estimate of the transformation and
then iteratively optimizes for the
transformation, by computing nearest
neighbors and minimizing an
associated cost function. It can be
efficiently implemented (even
realtime) and there are available
implementations available for
matlab, c++... This method has been
extended and has several variants,
including estimating non-rigid
deformations, if you are interested
in extensions you should look at
Computer graphics papers solving
scan registration problem, where
your problem is a crucial step. For
a starting point see the Wikipedia
page on Iterative Closest Point
which has several good external
links. Just a teaser image from a matlab implementation which was designed to match to point clouds:
(source: mathworks.com)
After aligning you could the final
error measure to say how similar the
two point clouds are, but this is
very much an adhoc solution, there
should be better one.
Using shape descriptors one can
compute fingerprints of shapes which
are often invariant under
translations/rotations. In most cases they are defined for meshes, and not point clouds, nevertheless there is a multitude of shape descriptors, so depending on your input and requirements you might find something useful. For this, you would want to look into the field of shape analysis, and probably this 2004 SIGGRAPH course presentation can give a feel of what people do to compute shape descriptors.
This is how I would do it:
Position the sets at the center of mass
Compute the inertia tensor. This gives you three coordinate axes. Rotate to them. [*]
Write down the list of points in a given order (for example, top to bottom, left to right) with your required precision.
Apply any algorithm you'd like for a resulting array.
To compare two sets, unless you need to store the hash results in advance, just apply your favorite comparison algorithm to the sets of points of step 3. This could be, for example, computing a distance between two sets.
I'm not sure if I can recommend you the algorithm for the step 4 since it appears that your requirements are contradictory. Anything called hashing usually has the property that a small change in input results in very different output. Anyway, now I've reduced the problem to an array of numbers, so you should be able to figure things out.
[*] If two or three of your axis coincide select coordinates by some other means, e.g. as the longest distance. But this is extremely rare for random points.
Maybe you should also read up on the RANSAC algorithm. It's commonly used for stitching together panorama images, which seems to be a bit similar to your problem, only in 2 dimensions. Just google for RANSAC, panorama and/or stitching to get a starting point.

Resources