detect rolling channels in a series of high and low points

detect rolling channels in a series of high and low points - algorithm

Suppose that I have a sequence of lows and highs (1st and 2nd column respectively). The sequence length can be an arbitrary number. The length of 100 here is just an example.
$ awk -e 'BEGIN { for(i=1;i<=100;++i) { l=rand(); b=rand(); print l, l+b } }'
0.924046 1.51795
0.306394 0.885335
0.740133 1.52706
0.43637 0.768565
0.77888 0.879766
0.785084 1.62024
0.761209 1.25729
0.426298 1.3721
0.821802 1.53107
0.157828 0.27758
...
For each window of rows with size w (for example, if w=4, the windows will be rows1-4, rows2-5, ...), I want to find two lines $a1+bi$ and $a2+bi$ (channel) enclosing all the points in the window so that $|a2-a1|$ is minimized.
Is there an efficient algorithm to find channels like this?
The implementation can be in python and R.

Here is a brutal force way that should run in NWlogW time, that does not factor in the rolling window. I thought about some ways the rolling part can be used to cut down some running time, but overall the complexity level remains the same, so I won't bother there
First, find the convex hull or the 2W points in O(WlogW) time. The result is represented by all points on the border of the hull, and ordered in a list (up to O(W) in size).
Second, for each edge (formed by 2 adjacent points, except for the 2 vertical edges which we don't need), you can find the 'b' and 'a1'. Then find the furthest point on the convex hull by assuming a parallel line of slope 'b' going thru a point a find its corresponding 'a2'. this can be done in bi-sectional search and needs time LogW. Now, since there are O(W) edges, you need WlogW total time to find the edge-point combination that give you the smallest |a1-a2|
Third, you have O(N) windows. So the total time is O(NWlogW). I guess this is not a very efficient way, but can give you an upper bound of doing this.

Related

Choice of optimization algorithm for distributing lines inside a shape

Consider the follow representation of a concrete slab element with reinforcement bars and holes.
I need an algorithm that automatically distributes lines over an arbitrary shape with different holes.
The main constraints are:
Lines cannot be outside of the region or inside a hole
The distance between two side-by-side lines cannot exceed a variable D
Lines have to be positioned on a fixed interval I, i.e. y mod I = 0, where y is the line's Y coordinate.
Each available point inside the shape cannot be further from a line than D/2
I want to optimize the solution by minimizing the total number of lines N. What kind of optimization algorithm would suit this problem?
I assume most approaches involves simplifying the shape into a raster (with pixel height I) and disable or enable each pixel. I thought this was an obvious LP problem and tried to set it up with GLPK, but found it very hard to describe the problem using this simplified raster for an arbitrary number of lines. I also suspect that the solution space might be too big.
I have already implemented an algorithm in C# that does the job, but not very optimized. This is how it works:
Create a simplified raster of the geometry
Calculate a score for each cell using a complicated formula that takes possible line length and distances to other rods and obstacles into account.
Determine which needs reinforcement (where number of free cells in y direction > D)
Pick the cell with the highest score that needs reinforcement, and reinforce it as far as possible in -x and +x directions
Repeat
Depending on the complicated formula, this works pretty well but starts giving unwanted results when putting the last few lines, since it can never move an already put line.
Are there any other optimization techniques that I should have a look at?

I'm not sure what follows is what you want - I'm fairly sure it's not what you had in mind - but if it sounds reasonable you might try it out.
Because the distance is simply at most d, and can be anything less than that, it seems at first blush like a greedy algorithm should work here. Always place the next line(s) so that (1) as few as possible are needed and (2) they are as far away from existing lines as possible.
Assume you have an optimal algorithm for this problem, and it places the next line(s) at distance a <= d from the last line. Say it places b lines. Our greedy algorithm will definitely place no more than b lines (since the first criterion is to place as few as possible), and if it places b lines it will place them at distance c with a <= c <= d, since it then places the lines as far as possible.
If the greedy algorithm did not do what the optimal algorithm did, it differed in one of the following ways:
It placed the same or fewer lines farther away. Suppose the optimal algorithm had gone on to place b' lines at distance a' away at the next step. Then these lines would be at distance a+a' and there would be b+b' lines in total. But the greedy algorithm can mimic the optimal algorithm in this case by placing b' lines at displacement a+a' by choosing c' = (a+a') - c. Since c > a and a' < d, c' < d and this is a legal placement.
It placed fewer lines closer together. This case is actually problematic. It is possible that this places k unnecessary lines, if any placement requires at least k lines and the farthest ones require more, and the arrangement of holes is chosen so that (e.g.) the distance it spans is a multiple of d.
So the greedy algorithm turns out not to work in case 2. However, it does in other cases. In particular, our observation in the first case is very useful: for any two placements (distance, lines) and (distance', lines'), if distance >= distance' and lines <= lines', the first placement is always to be preferred. This suggests the following algorithm:
PlaceLines(start, stop)
// if we are close enough to the other edge,
// don't place any more lines.
if start + d >= stop then return ([], 0)
// see how many lines we can place at distance
// d from the last placed lines. no need to
// ever place more lines than this
nmax = min_lines_at_distance(start + d)
// see how that selection pans out by recursively
// seeing how line placement works after choosing
// nmax lines at distance d from the last lines.
optimal = PlaceLines(start + d, stop)
optimal[0] = [d] . optimal[0]
optimal[1] = nmax + optimal[1]
// we only need to try fewer lines, never more
for n = 1 to nmax do
// find the max displacement a from the last placed
// lines where we can place n lines.
a = max_distance_for_lines(start, stop, n)
if a is undefined then continue
// see how that choice pans out by placing
// the rest of the lines
candidate = PlaceLines(start + a, stop)
candidate[0] = [a] . candidate[0]
candidate[1] = n + candidate[1]
// replace the last best placement with the
// one we just tried, if it turned out to be
// better than the last
if candidate[1] < optimal[1] then
optimal = candidate
// return the best placement we found
return optimal
This can be improved by memoization by putting results (seq, lines) into a cache indexed by (start, stop). That way, we can recognize when we are trying to compute assignments that may already have been evaluated. I would expect that we'd have this case a lot, regardless of whether you use a coarse or a fine discretization for problem instances.
I don't get into details about how max_lines_at_distance and max_distance_for_lines functions might work, but maybe a word on these.
The first tells you at a given displacement how many lines are required to span the geometry. If you have pixelated your geometry and colored holes black, this would mean looking at the row of cells at the indicated displacement, considering the contiguous black line segments, and determining from there how many lines that implies.
The second tells you, for a given candidate number of lines, the maximum distance from the current position at which that number of lines can be placed. You could make this better by having it tell you the maximum distance at which that number of lines, or fewer, could be placed. If you use this improvement, you could reverse the direction in which you're iterating n and:
if f(start, stop, x) = a and y < x, you only need to search up to a, not stop, from then on;
if f(start, stop, x) is undefined and y < x, you don't need to search any more.
Note that this function can be undefined if it is impossible to place n or fewer lines anywhere between start and stop.
Note also that you can memorize separately for these functions to save repeated lookups. You can precompute max_lines_at_distance for each row and store it in a cache for later. Then, max_distance_for_lines could be a loop that checks the cache back to front inside two bounds.

Is it better to reduce the space complexity or the time complexity for a given program?

Grid Illumination: Given an NxN grid with an array of lamp coordinates. Each lamp provides illumination to every square on their x axis, every square on their y axis, and every square that lies in their diagonal (think of a Queen in chess). Given an array of query coordinates, determine whether that point is illuminated or not. The catch is when checking a query all lamps adjacent to, or on, that query get turned off. The ranges for the variables/arrays were about: 10^3 < N < 10^9, 10^3 < lamps < 10^9, 10^3 < queries < 10^9
It seems like I can get one but not both. I tried to get this down to logarithmic time but I can't seem to find a solution. I can reduce the space complexity but it's not that fast, exponential in fact. Where should I focus on instead, speed or space? Also, if you have any input as to how you would solve this problem please do comment.

Is it better for a car to go fast or go a long way on a little fuel? It depends on circumstances.
Here's a proposal.
First, note you can number all the diagonals that the inputs like on by using the first point as the "origin" for both nw-se and ne-sw. The diagonals through this point are both numbered zero. The nw-se diagonals increase per-pixel in e.g the northeast direction, and decreasing (negative) to the southwest. Similarly ne-sw are numbered increasing in the e.g. the northwest direction and decreasing (negative) to the southeast.
Given the origin, it's easy to write constant time functions that go from (x,y) coordinates to the respective diagonal numbers.
Now each set of lamp coordinates is naturally associated with 4 numbers: (x, y, nw-se diag #, sw-ne dag #). You don't need to store these explicitly. Rather you want 4 maps xMap, yMap, nwSeMap, and swNeMap such that, for example, xMap[x] produces the list of all lamp coordinates with x-coordinate x, nwSeMap[nwSeDiagonalNumber(x, y)] produces the list of all lamps on that diagonal and similarly for the other maps.
Given a query point, look up it's corresponding 4 lists. From these it's easy to deal with adjacent squares. If any list is longer than 3, removing adjacent squares can't make it empty, so the query point is lit. If it's only 3 or fewer, it's a constant time operation to see if they're adjacent.
This solution requires the input points to be represented in 4 lists. Since they need to be represented in one list, you can argue that this algorithm requires only a constant factor of space with respect to the input. (I.e. the same sort of cost as mergesort.)
Run time is expected constant per query point for 4 hash table lookups.
Without much trouble, this algorithm can be split so it can be map-reduced if the number of lampposts is huge.
But it may be sufficient and easiest to run it on one big machine. With a billion lamposts and careful data structure choices, it wouldn't be hard to implement with 24 bytes per lampost in an unboxed structures language like C. So a ~32Gb RAM machine ought to work just fine. Building the maps with multiple threads requires some synchronization, but that's done only once. The queries can be read-only: no synchronization required. A nice 10 core machine ought to do a billion queries in well less than a minute.

There is very easy Answer which works
Create Grid of NxN
Now for each Lamp increment the count of all the cells which suppose to be illuminated by the Lamp.
For each query check if cell on that query has value > 0;
For each adjacent cell find out all illuminated cells and reduce the count by 1
This worked fine but failed for size limit when trying for 10000 X 10000 grid

Clustering Circular area with different angles

Let's say that I have a circular area and n objects deployed randomly in this area. I want to divide the circle into K groups from position of center such that objects in same area treated as member of same group. Moreover, K person assigned to visit K groups from the position of circle center (Specifically, one person for one group). Now, I want to group such a way that, traveling distance of a person for each group close to each other.
If objects are deployed uniformly, it's very easy. Only divide the area into equal angles gives me the desired results. But, for random deployment of objects, I could not divide the circular area (specifically, could not fix the angle) which gives me traveling distance for each person in each group which is close to each other

As mentioned in the comments, the subproblem of finding the shortest path within a group is NP-hard. Hence, the overall problem is NP-hard. I will present a dynamic programming algorithm that can either solve the problem exactly (in exponential time) or approximate it with a heuristic (in polynomial time).
Both variants require a function f(g) that evaluates the travel distance of a group. In the exact variant, you need to solve TSP. In the approximate variant, you use a heuristic. You should try several heuristics and see which one fits best. For example, the area of the objects' bounding ring might be a good start (plus the distance of the closest object to the center).
The actual algorithm looks as follows: Calculate the angular position of each object and sort them with respect to this position. Also, calculate the distance to the center.
For now, we assume that the first group starts at the first object. Then, we want to find the grouping that minimizes the sum of f(g) over all groups. The states of the dynamic program are parameterized by the number of groups specified so far and by the object that belongs to the first next group. This makes a 2D table. You can easily initialize the first column by calculating f() for the resulting group:
groups | 1 2 3 4 5 6 ... K
next object |
--------------+------------------------
o2 | f(o1->o1)
o3 | f(o1->o2)
... | ...
on | f(o1->on-1)
o1 | f(o1->on)
Then, you fill the subsequent columns. For each entry, you have to compare the group to all groups in the previous column and find the one with the minimum sum:
entry(column i, object j) = min_k (entry(i - 1, k) + f(ok->j-1))
Actually, you don't have to calculate the entire column. You can leave entries at the beginning and end empty if they do not allow to fit in K groups. E.g. in the first column, you only need to calculate up to on-(K-1) because you have to leave K-1 objects unassigned in order to get a valid grouping. You may cache the results of f to avoid duplicate calculations.
After you filled the table, you're interested in entry(K, o1). Follow the path from this entry back to the start and you get the optimal grouping where the first group starts at o1. Do this for the first n-K objects as group starts and you get the overall optimum.
The time complexity of this algorithm is O(n^3 * K * T(f)) where T(f) is the complexity of calculating f(g).

What to use for flow free-like game random level creation?

I need some advice. I'm developing a game similar to Flow Free wherein the gameboard is composed of a grid and colored dots, and the user has to connect the same colored dots together without overlapping other lines, and using up ALL the free spaces in the board.
My question is about level-creation. I wish to make the levels generated randomly (and should at least be able to solve itself so that it can give players hints) and I am in a stump as to what algorithm to use. Any suggestions?
Note: image shows the objective of Flow Free, and it is the same objective of what I am developing.
Thanks for your help. :)

Consider solving your problem with a pair of simpler, more manageable algorithms: one algorithm that reliably creates simple, pre-solved boards and another that rearranges flows to make simple boards more complex.
The first part, building a simple pre-solved board, is trivial (if you want it to be) if you're using n flows on an nxn grid:
For each flow...
Place the head dot at the top of the first open column.
Place the tail dot at the bottom of that column.
Alternatively, you could provide your own hand-made starter boards to pass to the second part. The only goal of this stage is to get a valid board built, even if it's just trivial or predetermined, so it's worth keeping it simple.
The second part, rearranging the flows, involves looping over each flow, seeing which one can work with its neighboring flow to grow and shrink:
For some number of iterations...
Choose a random flow f.
If f is at the minimum length (say 3 squares long), skip to the next iteration because we can't shrink f right now.
If the head dot of f is next to a dot from another flow g (if more than one g to choose from, pick one at random)...
Move f's head dot one square along its flow (i.e., walk it one square towards the tail). f is now one square shorter and there's an empty square. (The puzzle is now unsolved.)
Move the neighboring dot from g into the empty square vacated by f. Now there's an empty square where g's dot moved from.
Fill in that empty spot with flow from g. Now g is one square longer than it was at the beginning of this iteration. (The puzzle is back to being solved as well.)
Repeat the previous step for f's tail dot.
The approach as it stands is limited (dots will always be neighbors) but it's easy to expand upon:
Add a step to loop through the body of flow f, looking for trickier ways to swap space with other flows...
Add a step that prevents a dot from moving to an old location...
Add any other ideas that you come up with.
The overall solution here is probably less than the ideal one that you're aiming for, but now you have two simple algorithms that you can flesh out further to serve the role of one large, all-encompassing algorithm. In the end, I think this approach is manageable, not cryptic, and easy to tweek, and, if nothing else, a good place to start.
Update: I coded a proof-of-concept based on the steps above. Starting with the first 5x5 grid below, the process produced the subsequent 5 different boards. Some are interesting, some are not, but they're always valid with one known solution.
Starting Point
5 Random Results (sorry for the misaligned screenshots)
And a random 8x8 for good measure. The starting point was the same simple columns approach as above.

Updated answer: I implemented a new generator using the idea of "dual puzzles". This allows much sparser and higher quality puzzles than any previous method I know of. The code is on github. I'll try to write more details about how it works, but here is an example puzzle:
Old answer:
I have implemented the following algorithm in my numberlink solver and generator. In enforces the rule, that a path can never touch itself, which is normal in most 'hardcore' numberlink apps and puzzles
First the board is tiled with 2x1 dominos in a simple, deterministic way.
If this is not possible (on an odd area paper), the bottom right corner is
left as a singleton.
Then the dominos are randomly shuffled by rotating random pairs of neighbours.
This is is not done in the case of width or height equal to 1.
Now, in the case of an odd area paper, the bottom right corner is attached to
one of its neighbour dominos. This will always be possible.
Finally, we can start finding random paths through the dominos, combining them
as we pass through. Special care is taken not to connect 'neighbour flows'
which would create puzzles that 'double back on themselves'.
Before the puzzle is printed we 'compact' the range of colours used, as much as possible.
The puzzle is printed by replacing all positions that aren't flow-heads with a .
My numberlink format uses ascii characters instead of numbers. Here is an example:
$ bin/numberlink --generate=35x20
Warning: Including non-standard characters in puzzle
35 20
....bcd.......efg...i......i......j
.kka........l....hm.n....n.o.......
.b...q..q...l..r.....h.....t..uvvu.
....w.....d.e..xx....m.yy..t.......
..z.w.A....A....r.s....BB.....p....
.D.........E.F..F.G...H.........IC.
.z.D...JKL.......g....G..N.j.......
P...a....L.QQ.RR...N....s.....S.T..
U........K......V...............T..
WW...X.......Z0..M.................
1....X...23..Z0..........M....44...
5.......Y..Y....6.........C.......p
5...P...2..3..6..VH.......O.S..99.I
........E.!!......o...."....O..$$.%
.U..&&..J.\\.(.)......8...*.......+
..1.......,..-...(/:.."...;;.%+....
..c<<.==........)./..8>>.*.?......#
.[..[....]........:..........?..^..
..._.._.f...,......-.`..`.7.^......
{{......].....|....|....7.......#..
And here I run it through my solver (same seed):
$ bin/numberlink --generate=35x20 | bin/numberlink --tubes
Found a solution!
┌──┐bcd───┐┌──efg┌─┐i──────i┌─────j
│kka│└───┐││l┌─┘│hm│n────n┌o│┌────┐
│b──┘q──q│││l│┌r└┐│└─h┌──┐│t││uvvu│
└──┐w┌───┘d└e││xx│└──m│yy││t││└──┘│
┌─z│w│A────A┌┘└─r│s───┘BB││┌┘└p┌─┐│
│D┐└┐│┌────E│F──F│G──┐H┐┌┘││┌──┘IC│
└z└D│││JKL┌─┘┌──┐g┌─┐└G││N│j│┌─┐└┐│
P──┐a││││L│QQ│RR└┐│N└──┘s││┌┘│S│T││
U─┐│┌┘││└K└─┐└─┐V││└─────┘││┌┘││T││
WW│││X││┌──┐│Z0││M│┌──────┘││┌┘└┐││
1┐│││X│││23││Z0│└┐││┌────M┌┘││44│││
5│││└┐││Y││Y│┌─┘6││││┌───┐C┌┘│┌─┘│p
5││└P│││2┘└3││6─┘VH│││┌─┐│O┘S┘│99└I
┌┘│┌─┘││E┐!!│└───┐o┘│││"│└─┐O─┘$$┌%
│U┘│&&│└J│\\│(┐)┐└──┘│8││┌*└┐┌───┘+
└─1└─┐└──┘,┐│-└┐│(/:┌┘"┘││;;│%+───┘
┌─c<<│==┌─┐││└┐│)│/││8>>│*┌?│┌───┐#
│[──[└─┐│]││└┐│└─┘:┘│└──┘┌┘┌┘?┌─^││
└─┐_──_│f││└,│└────-│`──`│7┘^─┘┌─┘│
{{└────┘]┘└──┘|────|└───7└─────┘#─┘
I've tested replacing step (4) with a function that iteratively, randomly merges two neighboring paths. However it game much denser puzzles, and I already think the above is nearly too dense to be difficult.
Here is a list of problems I've generated of different size: https://github.com/thomasahle/numberlink/blob/master/puzzles/inputs3

The most straightforward way to create such a level is to find a way to solve it. This way, you can basically generate any random starting configuration and determine if it is a valid level by trying to have it solved. This will generate the most diverse levels.
And even if you find a way to generate the levels some other way, you'll still want to apply this solving algorithm to prove that the generated level is any good ;)
Brute-force enumerating
If the board has a size of NxN cells, and there are also N colours available, brute-force enumerating all possible configurations (regardless of wether they form actual paths between start and end nodes) would take:
N^2 cells total
2N cells already occupied with start and end nodes
N^2 - 2N cells for which the color has yet to be determined
N colours available.
N^(N^2 - 2N) possible combinations.
So,
For N=5, this means 5^15 = 30517578125 combinations.
For N=6, this means 6^24 = 4738381338321616896 combinations.
In other words, the number of possible combinations is pretty high to start with, but also grows ridiculously fast once you start making the board larger.
Constraining the number of cells per color
Obviously, we should try to reduce the number of configurations as much as possible. One way of doing that is to consider the minimum distance ("dMin") between each color's start and end cell - we know that there should at least be this much cells with that color. Calculating the minimum distance can be done with a simple flood fill or Dijkstra's algorithm.
(N.B. Note that this entire next section only discusses the number of cells, but does not say anything about their locations)
In your example, this means (not counting the start and end cells)
dMin(orange) = 1
dMin(red) = 1
dMin(green) = 5
dMin(yellow) = 3
dMin(blue) = 5
This means that, of the 15 cells for which the color has yet to be determined, there have to be at least 1 orange, 1 red, 5 green, 3 yellow and 5 blue cells, also making a total of 15 cells.
For this particular example this means that connecting each color's start and end cell by (one of) the shortest paths fills the entire board - i.e. after filling the board with the shortest paths no uncoloured cells remain. (This should be considered "luck", not every starting configuration of the board will cause this to happen).
Usually, after this step, we have a number of cells that can be freely coloured, let's call this number U. For N=5,
U = 15 - (dMin(orange) + dMin(red) + dMin(green) + dMin(yellow) + dMin(blue))
Because these cells can take any colour, we can also determine the maximum number of cells that can have a particular colour:
dMax(orange) = dMin(orange) + U
dMax(red) = dMin(red) + U
dMax(green) = dMin(green) + U
dMax(yellow) = dMin(yellow) + U
dMax(blue) = dMin(blue) + U
(In this particular example, U=0, so the minimum number of cells per colour is also the maximum).
Path-finding using the distance constraints
If we were to brute force enumerate all possible combinations using these color constraints, we would have a lot less combinations to worry about. More specifically, in this particular example we would have:
15! / (1! * 1! * 5! * 3! * 5!)
= 1307674368000 / 86400
= 15135120 combinations left, about a factor 2000 less.
However, this still doesn't give us the actual paths. so a better idea would be to a backtracking search, where we process each colour in turn and attempt to find all paths that:
doesn't cross an already coloured cell
Is not shorter than dMin(colour) and not longer than dMax(colour).
The second criteria will reduce the number of paths reported per colour, which causes the total number of paths to be tried to be greatly reduced (due to the combinatorial effect).
In pseudo-code:
function SolveLevel(initialBoard of size NxN)
{
foreach(colour on initialBoard)
{
Find startCell(colour) and endCell(colour)
minDistance(colour) = Length(ShortestPath(initialBoard, startCell(colour), endCell(colour)))
}
//Determine the number of uncoloured cells remaining after all shortest paths have been applied.
U = N^(N^2 - 2N) - (Sum of all minDistances)
firstColour = GetFirstColour(initialBoard)
ExplorePathsForColour(
initialBoard,
firstColour,
startCell(firstColour),
endCell(firstColour),
minDistance(firstColour),
U)
}
}
function ExplorePathsForColour(board, colour, startCell, endCell, minDistance, nrOfUncolouredCells)
{
maxDistance = minDistance + nrOfUncolouredCells
paths = FindAllPaths(board, colour, startCell, endCell, minDistance, maxDistance)
foreach(path in paths)
{
//Render all cells in 'path' on a copy of the board
boardCopy = Copy(board)
boardCopy = ApplyPath(boardCopy, path)
uRemaining = nrOfUncolouredCells - (Length(path) - minDistance)
//Recursively explore all paths for the next colour.
nextColour = NextColour(board, colour)
if(nextColour exists)
{
ExplorePathsForColour(
boardCopy,
nextColour,
startCell(nextColour),
endCell(nextColour),
minDistance(nextColour),
uRemaining)
}
else
{
//No more colours remaining to draw
if(uRemaining == 0)
{
//No more uncoloured cells remaining
Report boardCopy as a result
}
}
}
}
FindAllPaths
This only leaves FindAllPaths(board, colour, startCell, endCell, minDistance, maxDistance) to be implemented. The tricky thing here is that we're not searching for the shortest paths, but for any paths that fall in the range determined by minDistance and maxDistance. Hence, we can't just use Dijkstra's or A*, because they will only record the shortest path to each cell, not any possible detours.
One way of finding these paths would be to use a multi-dimensional array for the board, where
each cell is capable of storing multiple waypoints, and a waypoint is defined as the pair (previous waypoint, distance to origin). The previous waypoint is needed to be able to reconstruct the entire path once we've reached the destination, and the distance to origin
prevents us from exceeding the maxDistance.
Finding all paths can then be done by using a flood-fill like exploration from the startCell outwards, where for a given cell, each uncoloured or same-as-the-current-color-coloured neigbour is recursively explored (except the ones that form our current path to the origin) until we reach either the endCell or exceed the maxDistance.
An improvement on this strategy is that we don't explore from the startCell outwards to the endCell, but that we explore from both the startCell and endCell outwards in parallel, using Floor(maxDistance / 2) and Ceil(maxDistance / 2) as the respective maximum distances. For large values of maxDistance, this should reduce the number of explored cells from 2 * maxDistance^2 to maxDistance^2.

I think you'll want to do this in two steps. Step 1) find a set of non-intersecting paths that connect all your points, then 2) Grow/shift those paths to fill the entire board
My thoughts on Step 1 are to essentially perform Dijkstra like algorithm on all points simultaneously, growing together the paths. Similar to Dijkstra, I think you'll want to flood-fill out from each of your points, chosing which node to search next using some heuristic (My hunch says chosing points with the least degrees of freedom first, then by distance, might be a good one). Very differently from Dijkstra though I think we might be stuck with having to backtrack when we have multiple paths attempting to grow into the same node. (This could of course be fairly problematic on bigger maps, but might not be a big deal on small maps like the one you have above.)
You may also solve for some of the easier paths before you start the above algorithm, mainly to cut down on the number of backtracks needed. In specific, if you can make a trace between points along the edge of the board, you can guarantee that connecting those two points in that fashion would never interfere with other paths, so you can simply fill those in and take those guys out of the equation. You could then further iterate on this until all of these "quick and easy" paths are found by tracing along the borders of the board, or borders of existing paths. That algorithm would actually completely solve the above example board, but would undoubtedly fail elsewhere .. still, it would be very cheap to perform and would reduce your search time for the previous algorithm.
Alternatively
You could simply do a real Dijkstra's algorithm between each set of points, pathing out the closest points first (or trying them in some random orders a few times). This would probably work for a fair number of cases, and when it fails simply throw out the map and generate a new one.
Once you have Step 1 solved, Step 2 should be easier, though not necessarily trivial. To grow your paths, I think you'll want to grow your paths outward (so paths closest to walls first, growing towards the walls, then other inner paths outwards, etc.). To grow, I think you'll have two basic operations, flipping corners, and expanding into into adjacent pairs of empty squares.. that is to say, if you have a line like
.v<<.
v<...
v....
v....
First you'll want to flip the corners to fill in your edge spaces
v<<<.
v....
v....
v....
Then you'll want to expand into neighboring pairs of open space
v<<v.
v.^<.
v....
v....
v<<v.
>v^<.
v<...
v....
etc..
Note that what I've outlined wont guarantee a solution if one exists, but I think you should be able to find one most of the time if one exists, and then in the cases where the map has no solution, or the algorithm fails to find one, just throw out the map and try a different one :)

You have two choices:
Write a custom solver
Brute force it.
I used option (2) to generate Boggle type boards and it is VERY successful. If you go with Option (2), this is how you do it:
Tools needed:
Write a A* solver.
Write a random board creator
To solve:
Generate a random board consisting of only endpoints
while board is not solved:
get two endpoints closest to each other that are not yet solved
run A* to generate path
update board so next A* knows new board layout with new path marked as un-traversable.
At exit of loop, check success/fail (is whole board used/etc) and run again if needed
The A* on a 10x10 should run in hundredths of a second. You can probably solve 1k+ boards/second. So a 10 second run should get you several 'usable' boards.
Bonus points:
When generating levels for a IAP (in app purchase) level pack, remember to check for mirrors/rotations/reflections/etc so you don't have one board a copy of another (which is just lame).
Come up with a metric that will figure out if two boards are 'similar' and if so, ditch one of them.

Looking for an algorithm (version of 2-dimensional binary search)

Easy problem and known algorithm:
I have a big array with 100 members. First X members are 0, and the rest are 1. Find X.
I am solving it by a binary search: Check member 50, if it is 0 - check member 75, etc, until I find adjacent 0 and 1.
I am looking for an optimized algorithm for the same problem in 2-dimensions:
I have 2-dimensional array 100*100. Those members that are on rows 0-X AND on columns 0-Y are 0, and the rest are 1. How to find Y and X?

Edit : The optimal solution consists in two simple binary search.
I'm very sorry for the long and convoluted post I did below. What the problem fundamentally consists in is to find a point in a space that contains 100*100 elements. The best you can do is to divide at each step this space in two. You can do it in a convoluted way (the one I did in the rest of the post) But if you realize that a binary search on the X axis still divides the research space in two at each step, (the same goes for the Y axis) then you understand that it's optimal.
I still let the thing I did, and I'm sorry that I made some peremptory affirmations in it.
If you're looking for a simple algorithm (though not optimal) just run the binary search twice as suggested.
However, if you want an optimal algorithm, you can look for the boundary on X and on Y at the same time. (You have to note that the two algorithm have same asymptotical complexity, but the optimal algorithm will still be faster)
In all the following graphics, the point (0, 0) is in the bottom left corner.
Basically when you choose a point and get the result, you cut your space in two parts. When you think about it that is actually the biggest amount of information you can extract from this.
If you choose the point (the black cross) and the result is 1 (red lines), this means that the point you're looking for can not be in the gray space (thus must be in the remaining white area)
On the other hand, if the value is 0 (blue lines), this means that the point you're looking for can not be in the gray area (thus must be in the remaining white area)
So, if you get one 0 result and one 1 result, this is what you'll get :
The point you're looking for is either in rectangle 1, 2 or 3. You just need to check the two corners of rectangle 3 to know which of the 3 rectangle is the good one.
So the algorithm is the following :
Note where are the bottom left and top right corner of the rectangle you're working with.
Do a binary search along the diagonal of the rectangle until you've stumbled at least once on a 1 result and once a 0 result.
Check the 2 other corners of the rectangle 3 (you'll necessary already know the values of the two corners on the diagonal) It is possible to check only one corner to know the right rectangle (but you'll have to check the two corners if the right rectangle is the rectangle 3)
Determine if the point you're looking for is in rectangle 1, 2 or 3
Repeat by reducing the problem to the good rectangle until the final rectangle is reduced to a point : it's the value you're looking for
Edit : if you want the supremum optimality, you'd not the when you choose the point (50, 50), you do not cut the space in equal part. One is three time bigger than the other. Ideally, you'll choose a point that cuts the space in two equal regions (area-wise)
You should compute once at the beginning the value of factor = (1.0 - 1.0/sqrt(2.0)). Then when you want to cut bewteen values a and b, choose the cutting point as a + factor*(b-a). When you cut the initial 100x100 rectangle at the point (100*factor, 100*factor) the two regions will have an area (100*100)/2, thus the convergence will be quicker.

Run your binary search twice. First determine X by running binary search on the last row and then determine Y by running binary search on last column.

Simple solution: go first in X-direction and then in Y-direction.
Check (0,50); If it is 0, check (0,75); until You find adjacent 0 and 1. Then go to Y direction from there.
Second solution:
Check member (50,50). If it is 1, check (25,25), until You find 0. Continue, until You find adjacent (X,X) and (X+1,X+1) that are 0 and 1. Then test (X,X+1) and (X+1,X). Neither or one of them will be 1. If neither, You are finished. If only one, say for example (X+1,X), then You know that the box's size is between (X+1,X) and (100,X). Use binary search to find box's height.
EDIT: As Chris pointed out, it seems that the simple approach is faster.
Second solution (modified):
Check member (50,50). If it is 1, check (25,25), until You find 0. Continue, until You find adjacent (X,X) and (X+1,X+1) that are 0 and 1. Then test (X,X+1). If it is 1, then do binary search on line (X,X+1)...(X,100). Else do binary search on line (X,X)...(100,X).
Even then I am probably beating a dead horse here. If it will be faster, then by neglible amount. This is just for theoretical fun. :)
EDIT 2 As Fezvez and Chris put it, binary search divides the search space in two most efficiently; My approach divides the area to 1/4 and 3/4 pieces. Fezvez pointed out that this could be remedied by calculating the dividing factor beforehand (but that would be extra calculation). In modified version of my algorithm I choose the direction where to go (X or Y direction), which effectively also divides the search space in two, and then conduct binary search. To conclude, this shows that this approach will always be a bit slower. (and more complicated to implement.)
Thank You, Igor Oks, for interesting question. :)

Use binary search on both dimensions and the 1D case:
Start with j=50. Now the 1-D array obtained by varying i is of the desired form - so find X from 1D case.
If X = 100 (i.e. no ones), then make j=75 (middle of the range in j dimension) and repeat.
If X < 100, then you have found it. All that is left is to fix i=X and find Y from the 1D case.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio