"Squares" Logic Riddle Solution in Prolog - prolog

"Squares"
Input:
Board size m × n (m, n ∈ {1,...,32}), list of triples (i, j, k), where i ∈ {1,...,m}, j ∈ {1,...,n}, k ∈ {0,...,(n-2)·(m-2)}) describing fields with numbers.
Output:
List of triples (i, j, d) showing solved riddle. Triple (i, j, d) describes square with opposite vertices at coordinates (i, j) and (i+d, j+d).
Example:
Input:
7.
7. [(3,3,0), (3,5,0), (4,4,1), (5,1,0), (6,6,3)].
Output:
[(1,1,2), (1,5,2), (2,2,4), (5,1,2), (4,4,3)]
Image:
Explanation:
I have to find placement for x squares(x = fields with numbers). On the
circuit of the each square, exactly at one of his corners should be
only one digit equal to amount of digits inside the square. Sides of
squares can't cover each other, same as corners. Square lines are
"filling fields" so (0,0,1) square fills 4 fields and have 0 fields inside.
I need a little help coding solution to this riddle in Prolog. Could someone direct me in the right direction? What predicates, rules I should use.

Since Naimads life is worth saving(!), here is some more directive help.
Programmers tend to work out their projects either in bottomp-up (starting from "lower" levels of functionality, building up to a complete application) or top-down (starting with a high level "shell" of functionality that provides input/output, but with the solution processing stubbed in for later refinement).
Either way I can recommend an approach for Prolog progrmming that has been championed by the Ruby programming community: Test Driven Development.
Let's pick a couple of functional requirement, one high and one low, and discuss how this bit of philosophy might help.
The high level predicate in Prolog is one which takes all the input and output arguments and semantically defines the problem. Note that we can write down such a predicate without giving any care as to how the solution process will get implemented:
/* squaresRiddle/4 takes inputs of Height and Width of a "board" with
a list of (nonnegative) integers located at cells of the board, and
produces a corresponding list of squares having one corner at each
of those locations "boxing in" exactly the specified counts of them
in their interiors. No edges can overlap nor corners coincide. */
squaresRiddle(Height,Width,BoxedCountList, OutputSquaresList).
So far we have just framed the problem. The useful progress is given by the comment (clearly defining the inputs and outputs at a high level), and the code at this point just guarantees "success" with whatever arguments are passed. So a "test" of this code:
?= squaresRiddle(7,7,ListIn,ListOut).
won't produce anything except free variables.
Next let's give some thought to how the input list and output list ought to be represented. In the Question it is proposed to use triples of integers as entries of these lists. I would recommend instead using a named functor, because the semantics of those input triples (giving an x,y coordinate of a cell with a count) are subtly different from that of the output triples (giving an x,y coordinate of upper left corner and an "extent" (height&width) of the square). Note in particular that an output square may be described by a different corner than the one in the corresponding input item, although the one in the input item needs to be one of the four corners of the output square.
So it tends to make the code more readable if these constructs are distinguished by simple functor names, e.g. BoxedCountList = [box(3,3,0),box(3,5,0),box(4,4,1),box(5,1,0),box(6,6,3)] vs. OutputSquaresList = [sqr(1,1,2), sqr(1,5,2), sqr(2,2,4), sqr(5,1,2), sqr(4,4,3)].
Let's give a little thought to how the search for solutions will proceed. The output items are in 1-to-1 correspondence with the input items, so the lists will have equal length in the end. However choosing the kth output item depends not only on the kth input item, but on all the input items (since we count how many are in the interior of the output square) and on the preceding output items (since we disallow corners that meet and edges that overlap).
There are several ways to manage the flow of decision making consistent with this requirement, and it seems to be the crux of this exercise to pick one approach and make it work. If you are familiar with difference lists or accumulators, you have a head start on ways to do it.
Now let's switch to discussing some lower level functionality. Given an input box there are potentially a number of output sqr that could correspond to it. Given a positive "extent" for the output, the X,Y coordinates of the input can be met with any of the four corners of the output square. While more checking is required to verify the solution, this aspect seems a good place to start testing:
/* check that coordinates X,Y are indeed a corner of candidate square */
hasCorner(sqr(SX,SY,Extent),X,Y) :-
(SX is X ; SX is X + Extent),
(SY is Y ; SY is Y + Extent).
How would we make use of this test? Well we could generate all possible squares (possibly limiting ourselves to those that fit inside the height and width of our "board"), and then check whether any have the required corner. This might be fairly inefficient. A better way (as far as efficiency goes) would use the corner to generate possible squares (by extending from the corner by Extent in any of four directions) subject to the parameters of board size. Implementation of this "optimization" is left as an exercise for the reader.

Related

Algorithm to find positions in a game board i can move to

The problem i have goes as follows (simplified):
I have a board, represented as a matrix of n x m squares (n might equal m)
In it, there are p game pieces
Each game piece has a pre-defined speed, which is how many steps it can take in it's turn
Pieces can't overlap
There are three types of cells: those which don't require extra movements to be crossed (you loose 0 extra speed when going through), those which require 1 extra movement to be crossed and some which you simply can't get through (like a wall)
So, given a game piece in a certain [i,j] position in my game board, i want to find out:
a) All the places it can move to, with it's speed
b) The path to a certain [k,l] position in the board
Having a) solved, b) is almost trivial.
Currently the algorithm i'm using goes as follows, assuming a language where arrays of size n go from 0 to n-1:
Create a sqaure matrix of speed*2+1 size which represents the cost of moving as if all cells had no extra cost to be crossed (the piece is on the position [speed, speed])
Create another square matrix of speed*2+1 size which has the extra costs of each cell (those which can't be crossed because either it's a wall or there is another piece in it has a value of infinite)(the piece is on the position [speed, speed])
Create another square matrix of speed*2+1 size which is the sum of the former two(the piece is on the position [speed, speed])
Correct the latter matrix making sure the value of each cell is: the minimal cost of all the adjacent cells + 1 + the extra cost of the cell. If it isn't, i correct it and start with the matrix all over again.
An example:
P are pieces, W are walls, E are empty cells which require no extra movement, X are cells which require 1 extra movement to be crossed.
X,E,X,X,X
X,X,X,X,X
W,E,E,E,W
W,E,X,E,W
E,P,P,P,P
The first matrix:
2,2,2,2,2
2,1,1,1,2
2,1,0,1,2
2,1,1,1,2
2,2,2,2,2
The second matrix:
1,0,1,inf,1
1,1,1,1,1
inf,0,0,0,inf
inf,0,1,0,inf
0,inf,inf,inf,inf
The sum:
3,2,3,3,3
3,2,2,2,3
inf,1,0,1,inf
inf,1,2,1,inf
inf,inf,inf,inf,inf
Since [0,0] is not 2+1+1, i correct it:
The sum:
4,2,3,3,3
3,2,2,2,3
inf,1,0,1,inf
inf,1,2,1,inf
inf,inf,inf,inf,inf
Since [0,1] is not 2+1+0, i correct it:
The sum:
4,3,3,3,3
3,2,2,2,3
inf,1,0,1,inf
inf,1,2,1,inf
inf,inf,inf,inf,inf
Since [0,2] is not 2+1+1, i correct it:
The sum:
4,2,4,3,3
3,2,2,2,3
inf,1,0,1,inf
inf,1,2,1,inf
inf,inf,inf,inf,inf
Which one is the correct answer?
What I want to know is if this problem has a name I can search it by (couldn't find anything) or if anybody can tell me how to solve the point a).
Note that I want the optimal solution, so I went with a dynamic programming algorithm. Might random walkers be better? AFAIK, this solution is not failing (yet), but I have no proof of correctness for it, and I want to be sure it works.
A-star is a standard algorithm to determine shortest path give obstacles on a 2d board and cost per square of moving. You can also use it to test if a specific move is valid, but to actually generate all valid moves I would simply start ay the start position, move in each direction by one square mark which squares are valid and then repeat from each of your new places making sure not to visit the same square again. It will be a recursive algorithm calling itself at most 4 times on each call and will generate you valid moves efficiently. If there are constraints like how many squares you can move at once with different costs just pass the running total of how far you've come for each square.

Querying large amount of multidimensional points in R^N

I'm looking at listing/counting the number of integer points in R^N (in the sense of Euclidean space), within certain geometric shapes, such as circles and ellipses, subject to various conditions, for small N. By this I mean that N < 5, and the conditions are polynomial inequalities.
As a concrete example, take R^2. One of the queries I might like to run is "How many integer points are there in an ellipse (parameterised by x = 4 cos(theta), y = 3 sin(theta) ), such that y * x^2 - x * y = 4?"
I could implement this in Haskell like this:
ghci> let latticePoints = [(x,y) | x <- [-4..4], y <-[-3..3], 9*x^2 + 16*y^2 <= 144, y*x^2 - x*y == 4]
and then I would have:
ghci> latticePoints
[(-1,2),(2,2)]
Which indeed answers my question.
Of course, this is a very naive implementation, but it demonstrates what I'm trying to achieve. (I'm also only using Haskell here as I feel it most directly expresses the underlying mathematical ideas.)
Now, if I had something like "In R^5, how many integer points are there in a 4-sphere of radius 1,000,000, satisfying x^3 - y + z = 20?", I might try something like this:
ghci> :{
Prelude| let latticePoints2 = [(x,y,z,w,v) | x <-[-1000..1000], y <- [-1000..1000],
Prelude| z <- [-1000..1000], w <- [-1000..1000], v <-[1000..1000],
Prelude| x^2 + y^2 + z^2 + w^2 + v^2 <= 1000000, x^3 - y + z == 20]
Prelude| :}
so if I now type:
ghci> latticePoints2
Not much will happen...
I imagine the issue is because it's effectively looping through 2000^5 (32 quadrillion!) points, and it's clearly unreasonably of me to expect my computer to deal with that. I can't imagine doing a similar implementation in Python or C would help matters much either.
So if I want to tackle a large number of points in such a way, what would be my best bet in terms of general algorithms or data structures? I saw in another thread (Count number of points inside a circle fast), someone mention quadtrees as well as K-D trees, but I wouldn't know how to implement those, nor how to appropriately query one once it was implemented.
I'm aware some of these numbers are quite large, but the biggest circles, ellipses, etc I'd be dealing with are of radius 10^12 (one trillion), and I certainly wouldn't need to deal with R^N with N > 5. If the above is NOT possible, I'd be interested to know what sort of numbers WOULD be feasible?
There is no general way to solve this problem. The problem of finding integer solutions to algebraic equations (equations of this sort are called Diophantine equations) is known to be undecidable. Apparently, you can write equations of this sort such that solving the equations ends up being equivalent to deciding whether a given Turing machine will halt on a given input.
In the examples you've listed, you've always constrained the points to be on some well-behaved shape, like an ellipse or a sphere. While this particular class of problem is definitely decidable, I'm skeptical that you can efficiently solve these problems for more complex curves. I suspect that it would be possible to construct short formulas that describe curves that are mostly empty but have a huge bounding box.
If you happen to know more about the structure of the problems you're trying to solve - for example, if you're always dealing with spheres or ellipses - then you may be able to find fast algorithms for this problem. In general, though, I don't think you'll be able to do much better than brute force. I'm willing to admit that (and in fact, hopeful that) someone will prove me wrong about this, though.
The idea behind the kd-tree method is that you recursive subdivide the search box and try to rule out whole boxes at a time. Given the current box, use some method that either (a) declares that all points in the box match the predicate (b) declares that no points in the box match the predicate (c) makes no declaration (one possibility, which may be particularly convenient in Haskell: interval arithmetic). On (c), cut the box in half (say along the longest dimension) and recursively count in the halves. Obviously the method can choose (c) all the time, which devolves to brute force; the goal here is to do (a) or (b) as much as possible.
The performance of this method is very dependent on how it's instantiated. Try it -- it shouldn't be more than a couple dozen lines of code.
For nicely connected region, assuming your shape is significantly smaller than your containing search space, and given a seed point, you could do a growth/building algorithm:
Given a seed point:
Push seed point into test-queue
while test-queue has items:
Pop item from test-queue
If item tests to be within region (eg using a callback function):
Add item to inside-set
for each neighbour point (generated on the fly):
if neighbour not in outside-set and neighbour not in inside-set:
Add neighbour to test-queue
else:
Add item to outside-set
return inside-set
The trick is to find an initial seed point that is inside the function.
Make sure your set implementation gives O(1) duplicate checking. This method will eventually break down with large numbers of dimensions as the surface area exceeds the volume, but for 5 dimensions should be fine.

How can a random number be produced with decreasing odds (approaching zero) towards its upper boundary?

I want a random number generator that almost never produces numbers that are near a given upper boundary. The odds should drop linearly to 0 to the upper boundary.
This is possibly best suited as a math-only question, but I need it in code form (pseudo-code is fine, more specifically any C-based language) for my use, so I'm putting it here.
If you want a linear drop-off what you're describing is called a triangle (or triangular) distribution. Given U, a source of uniformly distributed random numbers on the range [0,1), you can generate a triangle on the range [a,b) with its mode at a using:
def triangle(a,b)
return a + (b-a)*(1 - sqrt(U))
end
This can be derived by writing the equation of a triangle for the specified range, scaling it so it has area 1 to make it a valid density, integrating to get the CDF, and using inversion.
As an interesting aside, this will still work if a >= b. For equality, you always get a (which makes sense if the range is zero). Otherwise, you get a triangle which goes from b to a and has its mode at a.

What to use for flow free-like game random level creation?

I need some advice. I'm developing a game similar to Flow Free wherein the gameboard is composed of a grid and colored dots, and the user has to connect the same colored dots together without overlapping other lines, and using up ALL the free spaces in the board.
My question is about level-creation. I wish to make the levels generated randomly (and should at least be able to solve itself so that it can give players hints) and I am in a stump as to what algorithm to use. Any suggestions?
Note: image shows the objective of Flow Free, and it is the same objective of what I am developing.
Thanks for your help. :)
Consider solving your problem with a pair of simpler, more manageable algorithms: one algorithm that reliably creates simple, pre-solved boards and another that rearranges flows to make simple boards more complex.
The first part, building a simple pre-solved board, is trivial (if you want it to be) if you're using n flows on an nxn grid:
For each flow...
Place the head dot at the top of the first open column.
Place the tail dot at the bottom of that column.
Alternatively, you could provide your own hand-made starter boards to pass to the second part. The only goal of this stage is to get a valid board built, even if it's just trivial or predetermined, so it's worth keeping it simple.
The second part, rearranging the flows, involves looping over each flow, seeing which one can work with its neighboring flow to grow and shrink:
For some number of iterations...
Choose a random flow f.
If f is at the minimum length (say 3 squares long), skip to the next iteration because we can't shrink f right now.
If the head dot of f is next to a dot from another flow g (if more than one g to choose from, pick one at random)...
Move f's head dot one square along its flow (i.e., walk it one square towards the tail). f is now one square shorter and there's an empty square. (The puzzle is now unsolved.)
Move the neighboring dot from g into the empty square vacated by f. Now there's an empty square where g's dot moved from.
Fill in that empty spot with flow from g. Now g is one square longer than it was at the beginning of this iteration. (The puzzle is back to being solved as well.)
Repeat the previous step for f's tail dot.
The approach as it stands is limited (dots will always be neighbors) but it's easy to expand upon:
Add a step to loop through the body of flow f, looking for trickier ways to swap space with other flows...
Add a step that prevents a dot from moving to an old location...
Add any other ideas that you come up with.
The overall solution here is probably less than the ideal one that you're aiming for, but now you have two simple algorithms that you can flesh out further to serve the role of one large, all-encompassing algorithm. In the end, I think this approach is manageable, not cryptic, and easy to tweek, and, if nothing else, a good place to start.
Update: I coded a proof-of-concept based on the steps above. Starting with the first 5x5 grid below, the process produced the subsequent 5 different boards. Some are interesting, some are not, but they're always valid with one known solution.
Starting Point
5 Random Results (sorry for the misaligned screenshots)
And a random 8x8 for good measure. The starting point was the same simple columns approach as above.
Updated answer: I implemented a new generator using the idea of "dual puzzles". This allows much sparser and higher quality puzzles than any previous method I know of. The code is on github. I'll try to write more details about how it works, but here is an example puzzle:
Old answer:
I have implemented the following algorithm in my numberlink solver and generator. In enforces the rule, that a path can never touch itself, which is normal in most 'hardcore' numberlink apps and puzzles
First the board is tiled with 2x1 dominos in a simple, deterministic way.
If this is not possible (on an odd area paper), the bottom right corner is
left as a singleton.
Then the dominos are randomly shuffled by rotating random pairs of neighbours.
This is is not done in the case of width or height equal to 1.
Now, in the case of an odd area paper, the bottom right corner is attached to
one of its neighbour dominos. This will always be possible.
Finally, we can start finding random paths through the dominos, combining them
as we pass through. Special care is taken not to connect 'neighbour flows'
which would create puzzles that 'double back on themselves'.
Before the puzzle is printed we 'compact' the range of colours used, as much as possible.
The puzzle is printed by replacing all positions that aren't flow-heads with a .
My numberlink format uses ascii characters instead of numbers. Here is an example:
$ bin/numberlink --generate=35x20
Warning: Including non-standard characters in puzzle
35 20
....bcd.......efg...i......i......j
.kka........l....hm.n....n.o.......
.b...q..q...l..r.....h.....t..uvvu.
....w.....d.e..xx....m.yy..t.......
..z.w.A....A....r.s....BB.....p....
.D.........E.F..F.G...H.........IC.
.z.D...JKL.......g....G..N.j.......
P...a....L.QQ.RR...N....s.....S.T..
U........K......V...............T..
WW...X.......Z0..M.................
1....X...23..Z0..........M....44...
5.......Y..Y....6.........C.......p
5...P...2..3..6..VH.......O.S..99.I
........E.!!......o...."....O..$$.%
.U..&&..J.\\.(.)......8...*.......+
..1.......,..-...(/:.."...;;.%+....
..c<<.==........)./..8>>.*.?......#
.[..[....]........:..........?..^..
..._.._.f...,......-.`..`.7.^......
{{......].....|....|....7.......#..
And here I run it through my solver (same seed):
$ bin/numberlink --generate=35x20 | bin/numberlink --tubes
Found a solution!
┌──┐bcd───┐┌──efg┌─┐i──────i┌─────j
│kka│└───┐││l┌─┘│hm│n────n┌o│┌────┐
│b──┘q──q│││l│┌r└┐│└─h┌──┐│t││uvvu│
└──┐w┌───┘d└e││xx│└──m│yy││t││└──┘│
┌─z│w│A────A┌┘└─r│s───┘BB││┌┘└p┌─┐│
│D┐└┐│┌────E│F──F│G──┐H┐┌┘││┌──┘IC│
└z└D│││JKL┌─┘┌──┐g┌─┐└G││N│j│┌─┐└┐│
P──┐a││││L│QQ│RR└┐│N└──┘s││┌┘│S│T││
U─┐│┌┘││└K└─┐└─┐V││└─────┘││┌┘││T││
WW│││X││┌──┐│Z0││M│┌──────┘││┌┘└┐││
1┐│││X│││23││Z0│└┐││┌────M┌┘││44│││
5│││└┐││Y││Y│┌─┘6││││┌───┐C┌┘│┌─┘│p
5││└P│││2┘└3││6─┘VH│││┌─┐│O┘S┘│99└I
┌┘│┌─┘││E┐!!│└───┐o┘│││"│└─┐O─┘$$┌%
│U┘│&&│└J│\\│(┐)┐└──┘│8││┌*└┐┌───┘+
└─1└─┐└──┘,┐│-└┐│(/:┌┘"┘││;;│%+───┘
┌─c<<│==┌─┐││└┐│)│/││8>>│*┌?│┌───┐#
│[──[└─┐│]││└┐│└─┘:┘│└──┘┌┘┌┘?┌─^││
└─┐_──_│f││└,│└────-│`──`│7┘^─┘┌─┘│
{{└────┘]┘└──┘|────|└───7└─────┘#─┘
I've tested replacing step (4) with a function that iteratively, randomly merges two neighboring paths. However it game much denser puzzles, and I already think the above is nearly too dense to be difficult.
Here is a list of problems I've generated of different size: https://github.com/thomasahle/numberlink/blob/master/puzzles/inputs3
The most straightforward way to create such a level is to find a way to solve it. This way, you can basically generate any random starting configuration and determine if it is a valid level by trying to have it solved. This will generate the most diverse levels.
And even if you find a way to generate the levels some other way, you'll still want to apply this solving algorithm to prove that the generated level is any good ;)
Brute-force enumerating
If the board has a size of NxN cells, and there are also N colours available, brute-force enumerating all possible configurations (regardless of wether they form actual paths between start and end nodes) would take:
N^2 cells total
2N cells already occupied with start and end nodes
N^2 - 2N cells for which the color has yet to be determined
N colours available.
N^(N^2 - 2N) possible combinations.
So,
For N=5, this means 5^15 = 30517578125 combinations.
For N=6, this means 6^24 = 4738381338321616896 combinations.
In other words, the number of possible combinations is pretty high to start with, but also grows ridiculously fast once you start making the board larger.
Constraining the number of cells per color
Obviously, we should try to reduce the number of configurations as much as possible. One way of doing that is to consider the minimum distance ("dMin") between each color's start and end cell - we know that there should at least be this much cells with that color. Calculating the minimum distance can be done with a simple flood fill or Dijkstra's algorithm.
(N.B. Note that this entire next section only discusses the number of cells, but does not say anything about their locations)
In your example, this means (not counting the start and end cells)
dMin(orange) = 1
dMin(red) = 1
dMin(green) = 5
dMin(yellow) = 3
dMin(blue) = 5
This means that, of the 15 cells for which the color has yet to be determined, there have to be at least 1 orange, 1 red, 5 green, 3 yellow and 5 blue cells, also making a total of 15 cells.
For this particular example this means that connecting each color's start and end cell by (one of) the shortest paths fills the entire board - i.e. after filling the board with the shortest paths no uncoloured cells remain. (This should be considered "luck", not every starting configuration of the board will cause this to happen).
Usually, after this step, we have a number of cells that can be freely coloured, let's call this number U. For N=5,
U = 15 - (dMin(orange) + dMin(red) + dMin(green) + dMin(yellow) + dMin(blue))
Because these cells can take any colour, we can also determine the maximum number of cells that can have a particular colour:
dMax(orange) = dMin(orange) + U
dMax(red) = dMin(red) + U
dMax(green) = dMin(green) + U
dMax(yellow) = dMin(yellow) + U
dMax(blue) = dMin(blue) + U
(In this particular example, U=0, so the minimum number of cells per colour is also the maximum).
Path-finding using the distance constraints
If we were to brute force enumerate all possible combinations using these color constraints, we would have a lot less combinations to worry about. More specifically, in this particular example we would have:
15! / (1! * 1! * 5! * 3! * 5!)
= 1307674368000 / 86400
= 15135120 combinations left, about a factor 2000 less.
However, this still doesn't give us the actual paths. so a better idea would be to a backtracking search, where we process each colour in turn and attempt to find all paths that:
doesn't cross an already coloured cell
Is not shorter than dMin(colour) and not longer than dMax(colour).
The second criteria will reduce the number of paths reported per colour, which causes the total number of paths to be tried to be greatly reduced (due to the combinatorial effect).
In pseudo-code:
function SolveLevel(initialBoard of size NxN)
{
foreach(colour on initialBoard)
{
Find startCell(colour) and endCell(colour)
minDistance(colour) = Length(ShortestPath(initialBoard, startCell(colour), endCell(colour)))
}
//Determine the number of uncoloured cells remaining after all shortest paths have been applied.
U = N^(N^2 - 2N) - (Sum of all minDistances)
firstColour = GetFirstColour(initialBoard)
ExplorePathsForColour(
initialBoard,
firstColour,
startCell(firstColour),
endCell(firstColour),
minDistance(firstColour),
U)
}
}
function ExplorePathsForColour(board, colour, startCell, endCell, minDistance, nrOfUncolouredCells)
{
maxDistance = minDistance + nrOfUncolouredCells
paths = FindAllPaths(board, colour, startCell, endCell, minDistance, maxDistance)
foreach(path in paths)
{
//Render all cells in 'path' on a copy of the board
boardCopy = Copy(board)
boardCopy = ApplyPath(boardCopy, path)
uRemaining = nrOfUncolouredCells - (Length(path) - minDistance)
//Recursively explore all paths for the next colour.
nextColour = NextColour(board, colour)
if(nextColour exists)
{
ExplorePathsForColour(
boardCopy,
nextColour,
startCell(nextColour),
endCell(nextColour),
minDistance(nextColour),
uRemaining)
}
else
{
//No more colours remaining to draw
if(uRemaining == 0)
{
//No more uncoloured cells remaining
Report boardCopy as a result
}
}
}
}
FindAllPaths
This only leaves FindAllPaths(board, colour, startCell, endCell, minDistance, maxDistance) to be implemented. The tricky thing here is that we're not searching for the shortest paths, but for any paths that fall in the range determined by minDistance and maxDistance. Hence, we can't just use Dijkstra's or A*, because they will only record the shortest path to each cell, not any possible detours.
One way of finding these paths would be to use a multi-dimensional array for the board, where
each cell is capable of storing multiple waypoints, and a waypoint is defined as the pair (previous waypoint, distance to origin). The previous waypoint is needed to be able to reconstruct the entire path once we've reached the destination, and the distance to origin
prevents us from exceeding the maxDistance.
Finding all paths can then be done by using a flood-fill like exploration from the startCell outwards, where for a given cell, each uncoloured or same-as-the-current-color-coloured neigbour is recursively explored (except the ones that form our current path to the origin) until we reach either the endCell or exceed the maxDistance.
An improvement on this strategy is that we don't explore from the startCell outwards to the endCell, but that we explore from both the startCell and endCell outwards in parallel, using Floor(maxDistance / 2) and Ceil(maxDistance / 2) as the respective maximum distances. For large values of maxDistance, this should reduce the number of explored cells from 2 * maxDistance^2 to maxDistance^2.
I think you'll want to do this in two steps. Step 1) find a set of non-intersecting paths that connect all your points, then 2) Grow/shift those paths to fill the entire board
My thoughts on Step 1 are to essentially perform Dijkstra like algorithm on all points simultaneously, growing together the paths. Similar to Dijkstra, I think you'll want to flood-fill out from each of your points, chosing which node to search next using some heuristic (My hunch says chosing points with the least degrees of freedom first, then by distance, might be a good one). Very differently from Dijkstra though I think we might be stuck with having to backtrack when we have multiple paths attempting to grow into the same node. (This could of course be fairly problematic on bigger maps, but might not be a big deal on small maps like the one you have above.)
You may also solve for some of the easier paths before you start the above algorithm, mainly to cut down on the number of backtracks needed. In specific, if you can make a trace between points along the edge of the board, you can guarantee that connecting those two points in that fashion would never interfere with other paths, so you can simply fill those in and take those guys out of the equation. You could then further iterate on this until all of these "quick and easy" paths are found by tracing along the borders of the board, or borders of existing paths. That algorithm would actually completely solve the above example board, but would undoubtedly fail elsewhere .. still, it would be very cheap to perform and would reduce your search time for the previous algorithm.
Alternatively
You could simply do a real Dijkstra's algorithm between each set of points, pathing out the closest points first (or trying them in some random orders a few times). This would probably work for a fair number of cases, and when it fails simply throw out the map and generate a new one.
Once you have Step 1 solved, Step 2 should be easier, though not necessarily trivial. To grow your paths, I think you'll want to grow your paths outward (so paths closest to walls first, growing towards the walls, then other inner paths outwards, etc.). To grow, I think you'll have two basic operations, flipping corners, and expanding into into adjacent pairs of empty squares.. that is to say, if you have a line like
.v<<.
v<...
v....
v....
First you'll want to flip the corners to fill in your edge spaces
v<<<.
v....
v....
v....
Then you'll want to expand into neighboring pairs of open space
v<<v.
v.^<.
v....
v....
v<<v.
>v^<.
v<...
v....
etc..
Note that what I've outlined wont guarantee a solution if one exists, but I think you should be able to find one most of the time if one exists, and then in the cases where the map has no solution, or the algorithm fails to find one, just throw out the map and try a different one :)
You have two choices:
Write a custom solver
Brute force it.
I used option (2) to generate Boggle type boards and it is VERY successful. If you go with Option (2), this is how you do it:
Tools needed:
Write a A* solver.
Write a random board creator
To solve:
Generate a random board consisting of only endpoints
while board is not solved:
get two endpoints closest to each other that are not yet solved
run A* to generate path
update board so next A* knows new board layout with new path marked as un-traversable.
At exit of loop, check success/fail (is whole board used/etc) and run again if needed
The A* on a 10x10 should run in hundredths of a second. You can probably solve 1k+ boards/second. So a 10 second run should get you several 'usable' boards.
Bonus points:
When generating levels for a IAP (in app purchase) level pack, remember to check for mirrors/rotations/reflections/etc so you don't have one board a copy of another (which is just lame).
Come up with a metric that will figure out if two boards are 'similar' and if so, ditch one of them.

Which algorithm will be required to do this?

I have data of this form:
for x=1, y is one of {1,4,6,7,9,18,16,19}
for x=2, y is one of {1,5,7,4}
for x=3, y is one of {2,6,4,8,2}
....
for x=100, y is one of {2,7,89,4,5}
Only one of the values in each set is the correct value, the rest is random noise.
I know that the correct values describe a sinusoid function whose parameters are unknown. How can I find the correct combination of values, one from each set?
I am looking something like "travelling salesman"combinatorial optimization algorithm
You're trying to do curve fitting, for which there are several algorithms depending on the type of curve you want to fit your curve to (linear, polynomial, etc.). I have no idea whether there is a specific algorithm for sinusoidal curves (Fourier approximations), but my first idea would be to use a polynomial fitting algorithm with a polynomial approximation of the sine.
I wonder whether you need to do this in the course of another larger program, or whether you are trying to do this task on its own. If so, then you'd be much better off using a statistical package, my preferred one being R. It allows you to import your data and fit curves and draw graphs in just a few lines, and you could also use R in batch-mode to call it from a script or even a program (this is what I tend to do).
It depends on what you mean by "exactly", and what you know beforehand. If you know the frequency w, and that the sinusoid is unbiased, you have an equation
a cos(w * x) + b sin(w * x)
with two (x,y) points at different x values you can find a and b, and then check the generated curve against all the other points. Choose the two x values with the smallest number of y observations and try it for all the y's. If there is a bias, i.e. your equation is
a cos(w * x) + b sin(w * x) + c
You need to look at three x values.
If you do not know the frequency, you can try the same technique, unfortunately the solutions may not be unique, there may be more than one w that fits.
Edit As I understand your problem, you have a real y value for each x and a bunch of incorrect ones. You want to find the real values. The best way to do this is to fit curves through a small number of points and check to see if the curve fits some y value in the other sets.
If not all the x values have valid y values then the same technique applies, but you need to look at a much larger set of pairs, triples or quadruples (essentially every pair, triple, or quad of points with different y values)
If your problem is something else, and I suspect it is, please specify it.
Define sinusoid. Most people take that to mean a function of the form a cos(w * x) + b sin(w * x) + c. If you mean something different, specify it.
2 Specify exactly what success looks like. An example with say 10 points instead of 100 would be nice.
It is extremely unclear what this has to do with combinatorial optimization.
Sinusoidal equations are so general that if you take any random value of all y's these values can be fitted in sinusoidal function unless you give conditions eg. Frequency<100 or all parameters are integers,its not possible to diffrentiate noise and data theorotically so work on finding such conditions from your data source/experiment first.
By sinusoidal, do you mean a function that is increasing for n steps, then decreasing for n steps, etc.? If so, you you can model your data as a sequence of nodes connected by up-links and down-links. For each node (possible value of y), record the length and end-value of chains of only ascending or descending links (there will be multiple chain per node). Then you scan for consecutive runs of equal length and opposite direction, modulo some initial offset.

Resources