Snake cube puzzle correctness - algorithm

TrialPay posted a programming question about a snake cube puzzle on their blog.
Recently, one of our engineers introduced us to the snake cube. A snake cube is a puzzle composed of a chain of cubelets, connected by an elastic band running through each cubelet. Each cubelet can rotate 360° about the elastic band, so various structures can be built depending on how the chain is initially constructed, with the ultimate goal being to arrange the cubelets into a cube.
Example:
This particular arrangement contains 17 groups of cubelets, composed of 8 groups of two cubelets and 9 groups of three cubelets. This arrangement can be expressed in a variety of ways, but for the purposes of this exercise, let '0' denote pieces whose rotation does not change the orientation of the puzzle, or may be considered a "straight" piece, while '1' will denote pieces whose rotation changes the puzzle configuration, or "bend" the snake. Using that schema, the snake puzzle above could be described as 001110110111010111101010100.
Challenge:
Your challenge is to write a program, in any language of your choosing, that takes the cube dimensions (X, Y, Z) and a binary string as input, and outputs '1' (without quotes) if it is possible to solve the puzzle, i.e. construct a proper XYZ cube given the cubelet orientation, and '0' if the current arrangement cannot be solved.
I posted a semi-detailed explanation of the solution, but how do I determine if the program solves the problem? I thought about getting more test cases, but I ran into some problems:
The snake cube example from TrialPay's blog has the same combination as the picture on Wikipedia's Snake Cube page and www.mathematische-basteleien.de.
It's very tedious to manually convert an image into a string.
I tried to make a program that would churn out a lot of combinations:
#We can start at 16777216 (2**24), because smaller numbers have at least
#three leading 0s once padded to 27 bits, i.e. more than 2 consecutive 0s
i = 16777216
solved = []
while i < 2**27:
    s = bin(i)[2:].zfill(27)  #pad with leading 0s to 27 bits
    print(s)
    #Skip arrangements with more than 2 consecutive 0s
    if s.find("000") == -1:
        if snake_cube_solution(3, 3, 3, s) == 1:
            solved.append(s)
    i += 1
But it just takes forever to finish executing. Is there a better way to verify the program?
Thanks in advance!

TL;DR: This isn't a programming problem, but a mathematical one. You may be better served at math.stackexchange.com.
Since the cube size and snake length are passed as input, the space of inputs a checker program would need to verify is essentially infinite. Even though checking the solution's answer for a single input is reasonable, brute-forcing this check across the entire input space is clearly not.
If your solution fails on certain cases, your checker program can help you find them. However, it can't establish your program's correctness: if your solution is actually correct, the checker will simply run forever and leave you wondering.
Unfortunately (or not, depending on your tastes), what you are looking for is not a program but a mathematical proof.
Proving algorithm correctness is itself an entire field of study, and you can spend a long time in it. That said, proof by induction is often applicable, especially for recursive algorithms.
Other times, navigating between state configurations can be restated as optimizing a utility function. Proving things about the space being optimized (such as its having only one extremum) can then translate into a proof of program correctness.
Your state configurations in this second approach could be snake orientations, or they might be some deeper structure. For example, the general strategy for solving a Rubik's cube isn't usually stated in terms of literal cube states, but in terms of a group of relevant symmetries. This is how I personally expect your solution will eventually play out.
EDIT: Years later, I feel I should point out that for a given, fixed cube size and snake length, of course the search space is actually finite. You can write a program to brute-force check all combinations. If you were clever, you could even argue that the times to check a set of cases can be treated as a set of independent random variables. From this you could build a reasonable progress bar to estimate how (very) long your wait would be.

I think your assertion that there can not be three consecutive 0's is false. Consider this arrangement:
000
100
101
100
100
101
100
100
100
One of the problems I'm having with this puzzle is the notation. A 1 indicates that the cubelet can change the puzzle's orientation, but about which axis? In my example above, assume that the Y axis is vertical and the X axis is horizontal. A 1 on the left indicates the ability to rotate about the cubelet's Y axis, and a 1 on the right indicates the ability to rotate about the cubelet's X axis.
I think it's possible to construct an arrangement similar to that above, but with three 000 groups. But I don't have the notation for it. Clearly, the example above could be modified so that the first three lines are:
001
000
101
With the first segment's 1 indicating rotation about the Y axis.

I wrote a Java application for the same problem not long ago.
I used the backtracking algorithm for this.
You just have to do a recursive search through the whole cube, checking which directions are possible at each step. If you have found one solution, you can stop and print it (I chose to print out all solutions).
For the 3x3x3 cubes my program solved them in under a second; for the bigger ones it takes anywhere from about five seconds up to 15 minutes.
I'm sorry I couldn't find any code right now.
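Since the original code is lost, here is a minimal Python sketch of that backtracking search (my reconstruction, not the original Java). It assumes each character of the snake string governs the step out of the corresponding cubelet ('0' continues straight, '1' forces a 90° turn), and solvable is a hypothetical stand-in for the asker's snake_cube_solution:

from itertools import product

# The six axis-aligned unit steps in the cube.
DIRS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def solvable(X, Y, Z, s):
    def inside(p):
        return 0 <= p[0] < X and 0 <= p[1] < Y and 0 <= p[2] < Z

    def place(pos, d, k, occupied):
        if k == len(s):
            return True                      # every cubelet placed: solved
        # '0': keep going straight; '1': turn to a perpendicular direction.
        if s[k - 1] == '0':
            choices = [d]
        else:
            choices = [c for c in DIRS
                       if c[0]*d[0] + c[1]*d[1] + c[2]*d[2] == 0]
        for c in choices:
            nxt = (pos[0] + c[0], pos[1] + c[1], pos[2] + c[2])
            if inside(nxt) and nxt not in occupied:
                occupied.add(nxt)
                if place(nxt, c, k + 1, occupied):
                    return True
                occupied.discard(nxt)
        return False

    # Try every start cell and initial direction (symmetry could prune this).
    for start in product(range(X), range(Y), range(Z)):
        for d in DIRS:
            if place(start, d, 1, {start}):
                return 1
    return 0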

Related

Chess programming: minimax, detecting repeats, transposition tables

I'm building a database of chess evaluations (essentially a map from a chess position to an evaluation), and I want to use this to come up with a good move for given positions. The idea is to do a kind of "static" minimax, i.e.: for each position, use the stored evaluation if evaluations for child nodes (positions after next ply) are not available, otherwise use max (white to move)/min (black to move) evaluations of child nodes (which are determined in the same way).
The problem is, of course, loops in the graph, i.e. repeating positions. I can't fathom how to deal with this without making the whole thing vastly less efficient.
The ideas I have explored so far are:
assume an evaluation of 0 for any position that can be reached in a game with fewer moves than are currently evaluated. This is an invalid assumption, because - for example - if White plays A, it might not be desirable for Black to follow up with x, but if White plays B, then y -> A -> x -> -B -> -y might be the best line, resulting in the same position as A -> x, without any repetitions (-m denoting the inverse move to m here; lower case: Black moves, upper case: White moves).
having one instance for each possible way a position can be reached solves the loop problem, but this yields a bazillion of instances in some positions and is therefore not practical
the fact that there is a loop from a position back to that position doesn't mean that it's a draw by repetition, because playing the repeating line may not be the best choice
I've tried iterating through the loops a few times to see if the overall evaluation would become stable. It doesn't, because in some cases, assuming the repeat is the best line means it no longer is - and then it goes back to the draw being the best line, etc.
I know that chess engines use transposition tables to detect positions already reached before, but I believe this doesn't address my problem, and I actually wonder if there isn't an issue with them: a position may be reachable through two paths in the search tree - one of them going through the same position before, so it's a repeat, and the other path not doing that. Then the evaluation for path 1 would have to be 0, but the one for path 2 wouldn't necessarily be (path 1 may not be the best line), so whichever evaluation the transposition table holds may be wrong, right?
I feel sure this problem must have a "standard / best practice" solution, but google failed me. Any pointers / ideas would be very welcome!
I don't understand what the problem is. A minimax evaluation, unless we've added randomness to it, will have the exact same result for any given board position combined with whose turn it is and other key info. If we have the space available to store common board_position + whose_turn + castling + en_passant + draw_related tuples (or hashes thereof), go right ahead. When reaching that tuple in any other evaluation, just return the stored value, or rely on its more detailed record for more complex evaluations (if the search yielding that record was not exhaustive, we can interpret it differently in any one evaluation). If the program also plays chess with time limits on the game, an additional time dimension (maybe a few broad blocks) would probably be needed in the memoisation as well.
(I assume you've read common public info about transposition tables.)
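To make the idea concrete, here is a minimal Python sketch of that memoised lookup. The helpers position_key, legal_moves, apply_move and static_eval are hypothetical stand-ins for whatever board representation you already have:

table = {}  # transposition table: key -> evaluation

def evaluate(board, white_to_move, depth):
    # The key must encode side to move, castling rights and en passant
    # status, or distinct states will collide in the table.
    key = position_key(board, white_to_move)
    if key in table:
        return table[key]
    moves = legal_moves(board, white_to_move)
    if depth == 0 or not moves:
        # Leaf: fall back to the stored/static evaluation. A real engine
        # would distinguish mate and stalemate here, and handle repetitions
        # along the current search path separately.
        value = static_eval(board)
    else:
        children = [evaluate(apply_move(board, m), not white_to_move, depth - 1)
                    for m in moves]
        value = max(children) if white_to_move else min(children)
    table[key] = value
    return value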

Finding the number of cases of solving paint-tool-puzzle

I was making a kind of paint-tool-puzzle game.
It's pretty easy to understand the rule if you see the short previews of the puzzle.
Preview 1
Preview 2
As you can see, some color blocks have colored triangles. When you click a triangle, it changes all connected blocks of the same color around it into the triangle's own color.
The goal is to unify all the color blocks into one single color block.
I was trying to count the number of ways of solving the puzzle algorithmically. So I represented the puzzle as a simple graph data structure and set up an input format.
line 1) a pair of integers v and e: the number of vertices and the number of edges
lines 2...v+1) one character, or a pair of characters: the color of the vertex and, if present, the color of the triangle inside it
lines v+2...v+e+1) a pair of integers: the indexes of two vertices to be linked to each other
For example, the graph of Preview 1 can be written like this (vertices are listed from the leftmost to the rightmost color block):
5 5
A C
B C
C D
D A
C B
0 1
1 2
1 3
2 3
3 4
(The result should be 1. There's only one way to solve the puzzle.)
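For what it's worth, here is a minimal Python sketch of a parser for this input format (parse_puzzle is a hypothetical name), which may make it easier for others to experiment with the problem:

def parse_puzzle(text):
    # Parse the format above: header, then v vertex lines, then e edge lines.
    lines = text.strip().splitlines()
    v, e = map(int, lines[0].split())
    vertices = []  # (block colour, triangle colour or None)
    for line in lines[1:v + 1]:
        parts = line.split()
        vertices.append((parts[0], parts[1] if len(parts) > 1 else None))
    edges = [tuple(map(int, line.split())) for line in lines[v + 1:v + 1 + e]]
    return vertices, edges

example = """5 5
A C
B C
C D
D A
C B
0 1
1 2
1 3
2 3
3 4"""
print(parse_puzzle(example))

One way to tame the factorial blow-up might be to memoize on the merged graph that remains after each click rather than on the click sequence, since many different click orders produce the same intermediate puzzle.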
Then I wrote the code in C#: I made a structure representing each color block, plus several methods for merging adjacent blocks of the same color, changing a block's color when its triangle is clicked, and so on.
But all I can do with these is brute-force every possible sequence of triangle clicks, which takes an enormous amount of time on even slightly more complicated puzzles.
I need a more efficient way to solve the problem, or at least to know whether there is any algorithm that can run faster than factorial time.
I've tried dynamic programming to improve performance, but I don't think the problem can be broken down into smaller pieces, and I have no clue how to apply memoization to this much data.
I'd appreciate any ideas that could help me with the problem.
P.S. Sorry if my imperfect English was inconvenient to read; it's been a long time since I've written anything in English.

Procedural Maze Algorithm With Cells Determined Independently of Neighbors

I was thinking about maze algorithms recently (mostly because I'm working on a game, but I felt this is a more general question than game development related). In simple terms, I was wondering if there is a sort of maze algorithm that can generate (a possibly infinite number of) cells without any information specifically about the cell's neighbors. I imagine, if such a thing were possible, it would rely heavily upon noise functions such as Perlin or Simplex.
Each cell has four walls, these are used when actually rendering the maze so that corridors and walls are not the same thickness.
Let's say, for example, I'd like a cell at (32, 15) to generate its walls.
I know of algorithms like Eller's (which requires a limited number of columns, but allows infinite rows) and the Virtual fractal Mazes algorithm (which needs to know previous cells in order to build upon them infinitely in both x and y directions).
Does anyone know of any algorithm I could look into for this specific request? If not, are there any algorithms that are good for chunk-based mazes that you know of?
(Note: I did search around for a bit through StackOverflow to see if there were any questions with similar requests to mine, but I did not come across any. If you happen to know of one, a link would be greatly appreciated :D)
Thank you in advance.
Seeeeeecreeeets. My preeeeciooouss secretts. But yeah I can understand the frustration so I'll throw this one to you OP/SO. Feel free to update the PCG Wiki if you're not as lazy as me :3
There are actually many ways to do this. Some of the best techniques for procgen are:
Asking what you really want.
Design backwards. Play in reverse. Result is forwards.
Look at a random sampling of your target goal and try to see overall patterns.
But to get back to the question, there are two simple ways, and they both start from asking what you really want. I'll give those first.
The first is to create 2 layers. Both are random noise. You connect the top and the bottom so they're fully connected. No isolated portions. This is asking what you really want which is connected-ness. And then guaranteeing it in a local clean-up step. (tbh I forget the function applied to layer 2 that guarantees connected-ness. I can't find the code atm.... But I remember it was a really simple local function... XOR, Curl, or something similar. Maybe you can figure it out before I fix this).
The second way is using the properties of your functions. As long as your random function is smooth enough, you can take the gradient and assign a tile to it. The way you assign the tiles changes the maze structure, but you can guarantee connectivity by clever selection of tiles for each gradient (because similar or opposite gradients are more likely to be near each other on a smooth gradient function). From here your smooth random function can be any form of Perlin noise, etc. Once again, an "ask what you really want" technique.
For backwards-reversed you unfortunately have an NP problem (I'm not sure if it's hard, complete, or whatever; it's been a while since I've worked on this...). So while you could generate a random map of distances down a maze path, and then from there generate the actual obstacles... it's not really advisable. There's also a ton of considerations for different cases, even for small mazes...
012
123
234
Is simple. There's a column in the lower right corner of 0 and the middle 2 has an _| shaped wall.
042
123
234
This one makes less sense. You still are required to have the same basic walls as before on all the non-changed squares... But you can't have that 4. It needs to be within 1 of at least a single neighbor. (I mean you could have a +3 cost for that square by having something like a conveyor belt or something, but then we're out of the maze problem) Okay so....
032
123
234
Makes more sense, but the 2 in the corner is nonsense once again. Flipping that from a trough to a peak would give:
034
123
234
Which makes sense. At any rate, if you can get to this point, then looking at local neighbors will give you the walls: if the difference is +/-1 then no wall, otherwise wall. Also note that you can break the rules for the distance map in a consistent way and make a maze just fine. (Like, instead of allowing a column, picking a wall and throwing it up. This is just loop splitting at this point and should be safe.)
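To illustrate that neighbor rule, here is a small Python sketch that turns a distance map into a wall list under the "differ by exactly 1 means connected" assumption, using the corrected grid above:

dist = [[0, 3, 4],
        [1, 2, 3],
        [2, 3, 4]]

def walls(dist):
    h, w = len(dist), len(dist[0])
    out = []
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):      # east and south neighbours
                ny, nx = y + dy, x + dx
                if ny < h and nx < w and abs(dist[y][x] - dist[ny][nx]) != 1:
                    out.append(((y, x), (ny, nx)))  # wall between the two cells
    return out

print(walls(dist))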
For random sampling, as the final one that I'm going to look at... Certain maze generation algorithms in the limit take on some interesting properties, either as an average configuration or after millions of steps. Some form Voronoi regions. Some form concentric circles with a randomly flipped wall to allow a connection between loops. Etc. The loop one is a good example to work off of. Create a set of loops. Flip a random wall on each loop. One flip will delete a wall, which creates access to the next loop. Another will split a path and offer a dead-end and a continuation. For a random flip to be a failure there has to be an opening and a split made right next to each other (unless you allow diagonals, in which case we're good). So make loops. Generate random noise per loop. Xor together. Replace local failures with a fixed path if no diagonals are allowed.
So how do we get random noise per loop? Or how do we get better loops than just squares? Just take a random function. Separate divergence and now you have a loop map. If you have the differential equations for the source random function you can pick one random per loop. A simpler way might be to generate concentric circular walls and pick a random point at each radius to flip. Then distort the final result. You have to be careful your distortion doesn't violate any of your path-connected-ness conditions at that point though.

Minutiae-based fingerprint matching algorithm

The problem
I need to match two fingerprints and give a score of resemblance.
I have posted a similar question before, but I think I've made enough progress to warrant a new question.
The input
For each image, I have a list of minutiae (important points). I want to match the fingerprints by matching these two lists.
When represented graphically, they look like this:
A minutia consists of a triplet (i, j, theta) where:
i is the row in a matrix
j is the column in a matrix
theta is a direction. I don't use that parameter yet in my matching algorithm.
What I have done so far
For each list, find the "dense regions" or "clusters". Some areas have more points than others, and I have written an algorithm to find them. I can explain further if you want.
Shifting the second list in order to account for the difference in finger position between both images. I neglect differences in finger rotation. The shift is done by aligning the barycenters of the centers of the clusters. (It is more reliable than the barycenter of all minutiae)
I tried building a matrix for each list (post-shift) in which every minutia increments the corresponding element and its close neighbours, like below.
1 1 1 1 1 1 1
1 2 2 2 2 2 1
1 2 3 3 3 2 1
1 2 3 4 3 2 1
1 2 3 3 3 2 1
1 2 2 2 2 2 1
1 1 1 1 1 1 1
By subtracting the two matrices and adding up the absolute values of all elements in the resulting matrix, I hoped to get low numbers for close fingerprints.
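If it helps others reproduce this, here is a minimal Python sketch of that stamping-and-subtracting step (density_map and l1_distance are hypothetical names; the kernel reproduces the 7x7 pyramid above):

import numpy as np

def density_map(minutiae, shape, radius=3):
    m = np.zeros(shape)
    for i, j, _theta in minutiae:
        for di in range(-radius, radius + 1):
            for dj in range(-radius, radius + 1):
                ii, jj = i + di, j + dj
                if 0 <= ii < shape[0] and 0 <= jj < shape[1]:
                    # Weight falls off with Chebyshev distance: 4, 3, 2, 1.
                    m[ii, jj] += radius + 1 - max(abs(di), abs(dj))
    return m

def l1_distance(list_a, list_b, shape):
    # e.g. l1_distance(list_a, list_b, (rows, cols))
    return np.abs(density_map(list_a, shape) - density_map(list_b, shape)).sum()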
Results
I tested a few fingerprints and found that the number of clusters is very stable. Matching fingerprints very often have the same number of clusters, and different fingers give different numbers. So that will definitely be a factor in the overall resemblance score.
The sum of the differences didn't work at all however. There was no correlation between resemblance and the sum.
Thoughts
I may need to use the directions of the points but I don't know how yet
I could use the standard deviation of the points, or of the clusters.
I could repeat the process for different types of minutiae. Right now my algorithm detects ridge endings and ridge bifurcations but maybe I should process these separately.
Question: How can I improve my algorithm?
Edit
I've come a long way since posting this question, so here's my update.
I dropped the bifurcations altogether, because my thinning algorithm messes those up too often. I did however end up using the angles quite a lot.
My initial cluster-counting idea does hold up pretty well on the small scale tests I ran (different combinations of my fingers and those of a handful of volunteers).
I give a score based on the following tests (10 tests, so 10% per success. It's a bit naïve but I'll find a better way to turn these 10 results into a score, as each test has its specificities):
Cluster-thingy (all the following don't use clusters, but minutiae. This is the only cluster-related approach I took)
Mean i position
Mean angle
i variance
j variance
Angle variance
i kurtosis
j kurtosis
Angle kurtosis
j skewness
A statistical approach, indeed.
Same-finger comparisons pretty much always give between 80 and 100%. Different-finger comparisons give between 0 and 60% (not often 60%). I don't have exact numbers here, so I won't pretend this is a statistically significant success, but it seems like a good first shot.
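For reference, here is a minimal Python sketch of the statistical features above (minus the cluster count, which needs the clustering step). Note that taking the plain mean and variance of theta ignores that angles are circular, just as in the description:

import numpy as np
from scipy.stats import kurtosis, skew

def feature_vector(minutiae):
    # minutiae: list of (i, j, theta) triplets.
    i, j, theta = (np.array(c, dtype=float) for c in zip(*minutiae))
    return [i.mean(), theta.mean(),          # mean i position, mean angle
            i.var(), j.var(), theta.var(),   # variances
            kurtosis(i), kurtosis(j), kurtosis(theta),
            skew(j)]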
Your clustering approach is interesting, but one thing I'm curious about is how well you've tested it. For a new matching algorithm to be useful with respect to all the research and methods that already exists, you need to have a reasonably low EER. Have you tested your method with any of the standard databases? I have doubts as to the ability of cluster counts and locations alone to identify individuals at larger scales.
1) Fingerprint matching is a well studied problem and there are many good papers that can help you implement this. For a nice place to start, check out this paper, "Fingerprint Minutiae Matching Based on the Local and Global Structures" by Jiang & Yau. It's a classic paper, a short read (only 4 pages), and can be implemented fairly reasonably. They also define a scoring metric that can be used to quantify the degree to which two fingerprint images match. Again, this should only be a starting point because these days there are many algorithms that perform better.
2) If you want your algorithm to be robust, it should consider transformations of the fingerprint between images. Scanned fingerprints and certainly latent prints may not be consistent from image to image.
Also, calculating the direction of the minutiae points provides a method for handling fingerprint rotations. By measuring the angles between minutiae point directions, which will remain the same or close to the same across multiple images regardless of global rotation (though small inconsistencies may occur because skin is not rigid and may stretch slightly), you can find the best set of corresponding minutia pairs or triplets and use them as the basis for rotational alignment.
3) I recommend that you distinguish between ridge line endings and bifurcations. The more features you can isolate, the more accurately you can determine whether or not the fingerprints match. You might also consider the number of ridge lines that occur between each minutiae point.
The features used by Jiang and Yau are the following:
d: Euclidean distance between minutiae
θ: Angle measure between minutiae directions
φ: Global minutiae angle
n: Number of ridge lines between minutiae i and j
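As a rough illustration (one plausible reading of those symbols, not the paper's exact definitions), the first three features for a pair of minutiae could be computed like this; the ridge count n needs the skeleton image and is omitted:

import math

def pairwise_features(m_a, m_b):
    (i1, j1, t1), (i2, j2, t2) = m_a, m_b
    d = math.hypot(i2 - i1, j2 - j1)                # Euclidean distance
    theta = (t2 - t1) % (2 * math.pi)               # angle between directions
    phi = (math.atan2(i2 - i1, j2 - j1) - t1) % (2 * math.pi)  # line vs. direction
    return d, theta, phi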
If you haven't read the Handbook of Fingerprint Recognition, I recommend it.

What do you think of this interest point detection algorithm?

I've been trying to come up with an interest point detection algorithm and this is what I came up with:
You go through the X and Y axes 3n pixels at a time, creating 3n x 3n squares.
For the n x n square in the middle of the 3n x 3n square (let's call it square Z), the R, G, and B values are averaged and rounded to preset values to limit the number of colors, and that is the color the square will be treated as.
The same is done for the 8 surrounding n x n squares.
After that, the color of square Z is compared to the surrounding squares; if it matches x out of the 8 surrounding squares, where x <= 3 or x >= 5, then it is an interest point (a corner is detected).
And so on till all the image is covered.
The bigger n is, the faster the image will be scanned and the less accurate the detection is, and vice versa.
This, supposedly, detects "literal corners", that is corners you can actually SEE on the image.
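For concreteness, here is a minimal Python/NumPy sketch of the detector as described. It slides one n x n square at a time rather than 3n pixels, which only means more centres get tested; levels controls the size of the preset colour palette:

import numpy as np

def detect(img, n, levels=4):
    # img: H x W x 3 uint8 array. Mean colour of every n x n square,
    # rounded to a small palette of `levels` values per channel.
    h, w = img.shape[0] // n, img.shape[1] // n
    blocks = img[:h * n, :w * n].reshape(h, n, w, n, 3).mean(axis=(1, 3))
    blocks = np.round(blocks * (levels - 1) / 255)
    points = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            z = blocks[y, x]
            matches = sum(np.array_equal(z, blocks[y + dy, x + dx])
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0))
            if matches <= 3 or matches >= 5:
                points.append((y * n, x * n))   # top-left pixel of square Z
    return points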
What do you think of this algorithm? Is it efficient? Can it be used on a live video stream (say from the camera) on a hand-held device?
I'm sorry to say that I don't think this is likely to be very good. Your algorithm looks a bit like a simplistic version of Moravec's algorithm, which is itself one of the simplest corner detection algorithms. The hardcoded limits you test against effectively make your corner test a stepped function, unlike an approach based on sums of squared differences. This will almost certainly give you discontinuities in your detection function (corners that don't match when they should) for some values.
You also have the same problem as Moravec, namely that if the edge lies at an angle to the direction of neighbours being considered, then it won't be detected.
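For comparison, here is a minimal Python/NumPy sketch of Moravec's corner response (the window size and threshold are arbitrary here), which may help you benchmark your idea against the simplest established detector:

import numpy as np

def moravec(gray, window=3, threshold=100.0):
    # Corner score: minimum SSD between a window and the same window
    # shifted one pixel in each of four directions.
    r = window // 2
    h, w = gray.shape
    score = np.zeros_like(gray, dtype=float)
    shifts = [(1, 0), (0, 1), (1, 1), (1, -1)]
    for y in range(r + 1, h - r - 1):
        for x in range(r + 1, w - r - 1):
            patch = gray[y - r:y + r + 1, x - r:x + r + 1].astype(float)
            ssds = []
            for dy, dx in shifts:
                shifted = gray[y + dy - r:y + dy + r + 1,
                               x + dx - r:x + dx + r + 1].astype(float)
                ssds.append(((patch - shifted) ** 2).sum())
            score[y, x] = min(ssds)
    return np.argwhere(score > threshold)   # candidate corner coordinates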
Developing algorithms is fun, and if this isn't a business-critical project, then by all means, carry on tinkering and experimenting (and don't be put off by my comments!). But the fact is, for almost any practical problem, a better algorithm for the task you want to solve almost certainly already exists. The real challenge is identifying how you can best model your problem in such a way that you can solve it using an existing, well-understood approach, designed by experts.
In particular, robust identification and analysis of edge-cases and worst-case runtimes is a tricky business; unless you are a professional algorist, you are likely to find the going difficult. But I certainly encourage you to discover this for yourself by trying. nlucaroni mentions some excellent questions to use as starting points for your analysis.
Why not try it and see if it works the way you expect? It sounds like it should. How does the performance compare with other methods? What is the complexity of the algorithm? Is it efficient compared to others? Where can it be improved? What kind of false-positives and false negatives are expected? Are they within reason based on the data I plan to use this on? What threshold should be used to compare surrounding squares? ....
this is stuff you should be doing, not us.
I would suggest you look at the SIFT algorithm. It's the de facto standard for points of interest in an image. Unfortunately, it's also patented, because it's so good.
If you are interested in a real-time version of SIFT, you can get it to run on a GPU, but it's highly experimental at this point. Note that if you are developing a commercial application, you'd have to first purchase a license for using SIFT or get approval from David Lowe.

Resources