Generating Random Puzzle Boards for Rush Hour Game - algorithm

If you're not familiar with it, the game consists of a collection of cars of varying sizes, set either horizontally or vertically, on a NxM grid that has a single exit.
Each car can move forward/backward in the directions it's set in, as long as another car is not blocking it. You can never change the direction of a car.
There is one special car, usually it's the red one. It's set in the same row that the exit is in, and the objective of the game is to find a series of moves (a move - moving a car N steps back or forward) that will allow the red car to drive out of the maze.
I've been trying to think how to generate instances for this problem, generating levels of difficulty based on the minimum number to solve the board.
Any idea of an algorithm or a strategy to do that?
Thanks in advance!

The board given in the question has at most 4*4*4*5*5*3*5 = 24.000 possible configurations, given the placement of cars.
A graph with 24.000 nodes is not very large for todays computers. So a possible approach would be to
construct the graph of all positions (nodes are positions, edges are moves),
find the number of winning moves for all nodes (e.g. using Dijkstra) and
select a node with a large distance from the goal.

One possible approach would be creating it in reverse.
Generate a random board, that has the red car in the winning position.
Build the graph of all reachable positions.
Select a position that has the largest distance from every winning position.
The number of reachable positions is not that big (probably always below 100k), so (2) and (3) are feasible.
How to create harder instances through local search
It's possible that above approach will not yield hard instances, as most random instances don't give rise to a complex interlocking behavior of the cars.
You can do some local search, which requires
a way to generate other boards from an existing one
an evaluation/fitness function
(2) is simple, maybe use the length of the longest solution, see above. Though this is quite costly.
(1) requires some thought. Possible modifications are:
add a car somewhere
remove a car (I assume this will always make the board easier)
Those two are enough to reach all possible boards. But one might to add other ways, because of removing makes the board easier. Here are some ideas:
move a car perpendicularly to its driving direction
swap cars within the same lane (aaa..bb.) -> (bb..aaa.)
Hillclimbing/steepest ascend is probably bad because of the large branching factor. One can try to subsample the set of possible neighbouring boards, i.e., don't look at all but only at a few random ones.

I know this is ancient but I recently had to deal with a similar problem so maybe this could help.
Constructing instances by applying random operators from a terminal state (i.e., reverse) will not work well. This is due to the symmetry in the state space. On average you end up in a state that is too close to the terminal state.
Instead, what worked better was to generate initial states (by placing random cars on the grid) and then to try to solve it with some bounded heuristic search algorithm such as IDA* or branch and bound. If an instance cannot be solved under the bound, discard it.
Try to avoid A*. If you have your definition of what you mean is a "hard" instance (I find 16 moves to be pretty difficult) you can use A* with a pruning rule that prevents expansion of nodes x with g(x)+h(x)>T (T being your threshold (e.g., 16)).
Heuristics function - Since you don't have to be optimal when solving it, you can use any simple inadmissible heuristic such as number of obstacle squares to the goal. Alternatively, if you need a stronger heuristic function, you can implement a manhattan distance function by generating the entire set of winning states for the generated puzzle and then using the minimal distance from a current state to any of the terminal state.

Related

Does the removal of a few edges remove all paths to a node?

I'm making a game engine for a board game called Blockade and right now I'm trying to generate all legal moves in a position. The rules aren't exactly the same as the actual game and they don't really matter. The gist is: the board is a matrix and you move a pawn and place a wall every move.
In short, I have to find whether or not a valid path exists from every pawn to every goal after every potential legal move (imagine a pawn doesn't move and a wall is just placed), to rule out illegal moves. Or rather, if I simplify it to a subproblem, whether or not the removal of a few edges (placing a wall) removes all paths to a node.
Brute-forcing it would take O(k*n*m), where n and m are the board dimensions and k is the number of potential legal moves. Searching for a path (worst case; traversing most of the board) is very expensive, but I'm thinking with dynamic programming or some other idea/algorithm it can be done faster since the position is the same the wall placement just changes, or rather, in graph terms, the graph is the same which edges are removed is just changed. Any sort of optimization is welcome.
Edit:
To elaborate on the wall (blockade). A wall is two squares wide/tall (depending on whether it's horizontal or vertical) therefore it will usually remove at least four edges, eg:
p | r
q | t
In this 2x2 matrix, placing a wall in the middle (as shown) will remove jumping from and to:
p and t, q and r, p and r, and q and t
I apologize ahead of time if I don't fully understand your question as it is asked; there seems to be some tacit contextual knowledge you are hinting at in your question with respect to knowledge about how the blockade game works (which I am completely unfamiliar with.)
However, based on a quick scan on wikipedia about the rules of the game, and from what I gather from your question, my understanding is that you are effectively asking how to ensure that a move is legal. Based on what I understand, an illegal move is a wall/blockade placement that would make it impossible for any pawn to reach its goal state.
In this case, I believe a workable solution that would be fairly efficient would be as follows.
Define a path tree of a pawn to be a (possibly but not necessarily shortest) path tree from the pawn to each reachable position. The idea is, you want to maintain a path tree for every pawn so that it can be updated efficiently with every blockade placement. What is described in the previous sentence can be accomplished by observing and implementing the following:
when a blockade is placed it removes 2 edges from the graph, which can sever up to (at most) two edges in all your existing path trees
each pawn's path tree can be efficiently recomputed after edges are severed using the "adoption" algorithm of the Boykov-Komolgrov maxflow algorithm.
once every pawns path tree is recomputed efficiently, simply check that each pawn can still access its goal state, if not mark the move as illegal
repeat for each possible move (reseting graphs as needed during the search)
Here are resources on the adoption algorithm that is critical to doing what is described efficiently:
open-source implementation as part of the BK-maxflow: https://www.boost.org/doc/libs/1_66_0/libs/graph/doc/boykov_kolmogorov_max_flow.html
implementation by authors as part of BK-maxflow: https://pub.ist.ac.at/~vnk/software.html
detailed description of adoption (stage) algorithm of BK maxflow algorithm: section 3.2.3 of https://www.csd.uwo.ca/~yboykov/Papers/pami04.pdf
Note reading the description of the adopton algorithm included in the last
bullet point above would be most critical to understanding how to adopt
orphaned portions of your path-tree efficiently.
In terms of efficiency of this approach, I believe on average you should expect on average O(1) operations for each adopted edge, meaning this approach should take about O(k) time to compute where k is the number of board states which you wish to compute for.
Note, the pawn path tree should actually be a reverse directed tree rooted at the goal nodes, which will allow the computation to be done for all legal pawn placements given a blockade configuration.
A few suggestions:
To check if there's a path from A to B after ever
Every move removes a node from the graph/grid. So what we want to know is if there are critical nodes on the path from A to B (single points that could be blocked to break the path. This is a classic flow problem. For this application you want to set the vertex capacity to 1 and push 2 units of flow (basically just to verify that there are at least 2 paths). If there are 2 paths, no one block can disconnect you from the destination. You can optimize it a bit by using an implicit graph, but if you're new to this maybe create the graph to visualize it better. This should be O(N*M), the size of your grid.
Optimizations
Since this is a game, you know that the setup doesn't change dramatically from one step to another. So, you can keep track of the two paths. If the blocade is not placed on any of the paths, you can ignore it. You already have 2 paths to destination.
If the block does land on one of the paths, cancel only that path and then look for another (reusing the one you already have).
You can also speed up the pawn movement. This can be a bit trick, but what you want is to move the source. I'm assuming the pawn moves only a few cells at a time, maybe instead of finding completely new paths, you can simply adjust them to connect to the new position, speeding up the update.

Trouble finding shortest path across a 2D mesh surface

I asked this question three days ago and I got burned by contributors because I didn't include enough information. I am sorry about that.
I have a 2D matrix and each array position relates to the depth of water in a channel, I was hoping to apply Dijkstra's or a similar "least cost path" algorithm to find out the least amount of concrete needed to build a bridge across the water.
It took some time to format the data into a clean version so I've learned some rudimentary Matlab skills doing that. I have removed most of the land so that now the shoreline is standardised to a certain value, my plan is to use a loop to move through each "pixel" on the "west" shore and run a least cost algorithm against it to the closest "east" shore and move through the entire mesh ultimately finding the least cost one.
This is my problem, fitting the data to any of the algorithms. Unfortunately I get overwhelmed by options and different formats because the other examples are for other use cases.
My other consideration is that when the shortest cost path is calculated that it will be a jagged line which would not be suitable for a bridge so I need to constrain the bend radius in the path if at all possible and I don't know how to go about doing that.
A picture of the channel:
Any advice in an approach method would be great, I just need to know if someone knows a method that should work, then I will spend the time learning how to fit the data.
You can apply Dijkstra to your problem in this way:
the two "dry" regions you want to connect correspond to matrix entries with value 0; the other cells have a positive value designating the depth (or the cost of filling this place with concrete)
your edges are the connections of neighbouring cells in your matrix. (It can be a 4- or 8-neighbourhood.) The weight of the edge is the arithmetic mean of the values of the connected cells.
then you apply the Dijkstra algorithm with a starting point in one "dry" region and an end point in the other "dry" region.
The cheapest path will connect two cells of value 0 and its weight will correspond to sum of costs of cells visited. (One half of each cell weight is coming from the edge going to the cell, the other half from the edge leaving the cell.)
This way you will get a possibly rather crooked path leading over the water, which may be a helpful hint for where to build a cheap bridge.
You can speed up the calculation by using the A*-algorithm. Therefore one needs a lower bound of the remaining costs for reaching the other side for each cell. Such a lower bound can be calculated by examining the "concentric rings" around a point as long as rings do not contain a 0-cell of the other side. The sum of the minimal cell values for each ring is then a lower bound of the remaining costs.
An alternative approach, which emphasizes the constraint that you require a non-jagged shape for your bridge, would be to use Monte-Carlo, simulated annealing or a genetic algorithm, where the initial "bridge" consisted a simple spline curve between two randomly chosen end points (one on each side of the chasm), plus a small number or randomly chosen intermediate points in the chasm. You would end up with a physically 'realistic' bridge and a reasonably optimized cost of concrete.

Algorithm for solving Flow Free Game

I recently started playing Flow Free Game.
Connect matching colors with pipe to create a flow. Pair all colors, and cover the entire board to solve each puzzle in Flow Free. But watch out, pipes will break if they cross or overlap!
I realized it is just path finding game between given pair of points with conditions that no two paths overlap. I was interested in writing a solution for the game but don't know where to start. I thought of using backtracking but for very large board sizes it will have high time complexity.
Is there any suitable algorithm to solve the game efficiently. Can using heuristics to solve the problem help? Just give me a hint on where to start, I will take it from there.
I observed in most of the boards that usually
For furthest points, you need to follow path along edge.
For point nearest to each other, follow direct path if there is one.
Is this correct observation and can it be used to solve it efficiently?
Reduction to SAT
Basic idea
Reduce the problem to SAT
Use a modern SAT solver to solve the problem
Profit
Complexity
The problem is obviously in NP: If you guess a board constellation, it is easy (poly-time) to check whether it solves the problem.
Whether it is NP-hard (meaning as hard as every other problem in NP, e.g. SAT), is not clear. Surely modern SAT solvers will not care and solve large instances in a breeze anyway (I guess up to 100x100).
Literature on Number Link
Here I just copy Nuclearman's comment to the OP:
Searching for "SAT formulation of numberlink" and "NP-completeness of numberlink" leads to a couple references. Unsurprisingly, the two most interesting ones are in Japanese. The first is the actual paper proof of NP-completeness. The second describes how to solve NumberLink using the SAT solver, Sugar. –
Hint for reduction to SAT
There are several possibilities to encode the problem. I'll give one that I could make up quickly.
Remark
j_random_hacker noted that free-standing cycles are not allowed. The following encoding does allow them. This problem makes the SAT encoding a bit less attractive. The simplest method I could think of to forbid free-standing loops would introduce O(n^2) new variables, where n is the number of tiles on the board (count distance from next sink for each tile) unless one uses log encoding for this, which would bring it down to O(n*log n), possible making the problem harder for the solver.
Variables
One variable per tile, piece type and color. Example if some variable X-Y-T-C is true it encodes that the tile at position X/Y is of type T and has color C. You don't need the empty tile type since this cannot happen in a solution.
Set initial variables
Set the variables for the sink/sources and say no other tile can be sink/source.
Constraints
For every position, exactly one color/piece combination is true (cardinality constraint).
For every variable (position, type, color), the four adjacent tiles have to be compatible (if the color matches).
I might have missed something. But it should be easily fixed.
I suspect that no polynomial-time algorithm is guaranteed to solve every instance of this problem. But since one of the requirements is that every square must be covered by pipe, a similar approach to what both people and computers use for solving Sudoku should work well here:
For every empty square, form a set of possible colours for that square, and then repeatedly perform logical deductions at each square to shrink the allowed set of colours for that square.
Whenever a square's set of possible colours shrinks to size 1, the colour for that square is determined.
If we reach a state where no more logical deductions can be performed and the puzzle is not completely solved yet (i.e. there is at least one square with more than one possible colour), pick one of these undecided squares and recurse on it, trying each of the possible colours in turn. Each try will either lead to a solution, or a contradiction; the latter eliminates that colour as a possibility for that square.
When picking a square to branch on, it's generally a good idea to pick a square with as few allowed colours as possible.
[EDIT: It's important to avoid the possibility of forming invalid "loops" of pipe. One way to do this is by maintaining, for each allowed colour i of each square x, 2 bits of information: whether the square x is connected by a path of definite i-coloured tiles to the first i-coloured endpoint, and the same thing for the second i-coloured endpoint. Then when recursing, don't ever pick a square that has two neighbours with the same bit set (or with neither bit set) for any allowed colour.]
You actually don't need to use any logical deductions at all, but the more and better deductions you use, the faster the program will run as they will (possibly dramatically) reduce the amount of recursion. Some useful deductions include:
If a square is the only possible way to extend the path for some particular colour, then it must be assigned that colour.
If a square has colour i in its set of allowed colours, but it does not have at least 2 neighbouring squares that also have colour i in their sets of allowed colours, then it can't be "reached" by any path of colour i, and colour i can be eliminated as a possibility.
More advanced deductions based on path connectivity might help further -- e.g. if you can determine that every path connecting some pair of connectors must pass through a particular square, you can immediately assign that colour to the square.
This simple approach infers a complete solution without any recursion in your 5x5 example: the squares at (5, 2), (5, 3), (4, 3) and (4, 4) are forced to be orange; (4, 5) is forced to be green; (5, 5) is also forced to be green by virtue of the fact that no other colour could get to this square and then back again; now the orange path ending at (4, 4) has nowhere to go except to complete the orange path at (3, 4). Also (3, 1) is forced to be red; (3, 2) is forced to be yellow, which in turn forces (2, 1) and then (2, 2) to be red, which finally forces the yellow path to finish at (3, 3). The red pipe at (2, 2) forces (1, 2) to be blue, and the red and blue paths wind up being completely determined, "forcing each other" as they go.
I found a blog post on Needlessly Complex that completely explains how to use SAT to solve this problem.
The code is open-source as well, so you can look at it (and understand it) in action.
I'll provide a quote from it here that describes the rules you need to implement in SAT:
Every cell is assigned a single color.
The color of every endpoint cell is known and specified.
Every endpoint cell has exactly one neighbor which matches its color.
The flow through every non-endpoint cell matches exactly one of the six direction types.
The neighbors of a cell specified by its direction type must match its color.
The neighbors of a cell not specified by its direction type must not match its color.
Thank you #Matt Zucker for creating this!
I like solutions that are similar to human thinking. You can (pretty easily) get the answer of a Sudoku by brute force, but it's more useful to get a path you could have followed to solve the puzzle.
I observed in most of the boards that usually
1.For furthest points, you need to follow path along edge.
2.For point nearest to each other, follow direct path if there is one.
Is this correct observation and can it be used to solve it efficiently?
These are true "most of the times", but not always.
I would replace your first rule by this one : if both sinks are along edge, you need to follow path along edge. (You could build a counter-example, but it's true most of the times). After you make a path along the edge, the blocks along the edge should be considered part of the edge, so your algorithm will try to follow the new edge made by the previous pipe. I hope this sentence makes sense...
Of course, before using those "most of the times" rules, you need to follow absolutes rules (see the two deductions from j_random_hacker's post).
Another thing is to try to eliminate boards that can't lead to a solution. Let's call an unfinished pipe (one that starts from a sink but does not yet reach the other sink) a snake, and the last square of the unfinished pipe will be called the snake's head. If you can't find a path of blank squares between the two heads of the same color, it means your board can't lead to a solution and should be discarded (or you need to backtrack, depending of your implementation).
The free flow game (and other similar games) accept as a valid solution a board where there are two lines of the same color side-by-side, but I believe that there always exists a solution without side-by-side lines. That would mean that any square that is not a sink would have exactly two neighbors of the same color, and sinks would have exactly one. If the rule happens to be always true (I believe it is, but can't prove it), that would be an additional constraint to decrease your number of possibilities. I solved some of Free Flow's puzzles using side-by-side lines, but most of the times I found another solution without them. I haven't seen side-by-side lines on Free Flow's solutions web site.
A few rules that lead to a sort of algorithm to solve levels in flow, based on the IOS vertions by Big Duck Games, this company seems to produce the canonical versions. The rest of this answer assumes no walls, bridges or warps.
Even if your uncannily good, the huge 15x18 square boards are a good example of how just going at it in ways that seem likely get you stuck just before the end over and over again and practically having to start again from scratch. This is probably to do with the already mentioned exponential time complexity in the general case. But this doesn’t mean that a simple stratergy isn’t overwhelmingly effective for most boards.
Blocks are never left empty, therefore orphaned blocks mean you’ve done something wrong.
Cardinally neighbouring cells of the same colour must be connected. This rules out 2x2 blocks of the same colour and on the hexagonal grid triangles of 3 neighbouring cells.
You can often make perminent progress by establishing that a color goes or is excluded from a certain square.
Due to points 1 and 2, on the hexagonal grid on boards that are hexagonal in shape a pipe going along an edge is usually stuck going along it all the way round to the exit, effectively moving the outer edge in and making the board smaller so the process can be repeated. It is predictable what sorts of neighbouring conditions guarantee and what sorts can break this cycle for both sorts of grid.
Most if not all 3rd party variants I’ve found lack 1 to 4, but given these restraints generating valid boards may be a difficult task.
Answer:
Point 3 suggests a value be stored for each cell that is able to be either a colour, or a set of false/indeterminate values there being one for each colour.
A solver could repeatedly use points 1 and 2 along with the data stored for point 3 on small neighbourhoods of paths around the ends of pipes to increasingly set colours or set the indeterminate values to false.
A few of us have spent quite a bit of time thinking about this. I summarised our work in a Medium article here: https://towardsdatascience.com/deep-learning-vs-puzzle-games-e996feb76162
Spoiler: so far, good old SAT seems to beat fancy AI algorithms!

Marauders dilemma algorithm

I'm making this repost after the earlier one here with more details.
PROBLEM :
The problem consists of a marauder who has to travel to different cities spread over a map. The starting location is known. Each city has a fixed loot associated with it. The aim of marauder is to travel across various nature of terrain. By nature of terrain, I mean there is a varied cost of travel between each pair of cities. He has to maximize the booty gained.
What we have done:
We have generated an adjacancy matrix (booty-path cost in place for each node) and then employed a heuristic analysis. It gave some output which is reasonable.
Now, the problem now is that each city has few or more vehicles in them, which can be bought (by paying) and can be used to travel. What vehicle does in actual is that it reduces the path cost. Once a vehicle is bought, it remains upto the time when next vehicle is bought. It is to upto to decide whether to buy the vehicle or not and how.
I need help at this point. How to integrate the idea of vehicle into what we already have? Plus, any further ideas which may help us to maximize the profit. I can post the code, if required. Thanks!
One way to do it would be to have a directed edge bearing the cost of the vehicle towards a duplicate graph with the reduced costs. You can even make it so that the reduction is finer than just a percentage if you want to.
The downside is that this will probably increase the size of the graph a lot (as many copies as you have different vehicles, plus the links between them), and if your heuristic is not optimal, you may have to modify it so that it considers the new edge positively.
It sounds as though beam search would suit this problem. Beam search uses a heuristic function H and a parameter k and works like this:
Initialize the set S to the initial game position.
Set T to the empty set.
For each game position in S, generate all possible successor positions to S after one move by the marauder. (A move being to loot, to purchase a vehicle, to move to an adjacent city, or whatever else a marauder can do.) Add each such successor position to the set T.
For each position p in T, evaluate H(p) for a heuristic function H. (The heuristic function can take into account the amount of loot, the possession of a vehicle, the number of remaining unlooted cities, and whatever else you think is relevant and easy to compute.)
If you've run out of search time, return the best-scoring position in T.
Otherwise, set S to the best-scoring k positions in T and go back to step 2.
The algorithm works well if you store T in the form of a heap with k elements.

Calculate minimum moves to solve a puzzle

I'm in the process of creating a game where the user will be presented with 2 sets of colored tiles. In order to ensure that the puzzle is solvable, I start with one set, copy it to a second set, then swap tiles from one set to another. Currently, (and this is where my issue lies) the number of swaps is determined by the level the user is playing - 1 swap for level 1, 2 swaps for level 2, etc. This same number of swaps is used as a goal in the game. The user must complete the puzzle by swapping a tile from one set to the other to make the 2 sets match (by color). The order of the tiles in the (user) solved puzzle doesn't matter as long as the 2 sets match.
The problem I have is that as the number of swaps I used to generate the puzzle approaches the number of tiles in each set, the puzzle becomes easier to solve. Basically, you can just drag from one set in whatever order you need for the second set and solve the puzzle with plenty of moves left. What I am looking to do is after I finish building the puzzle, calculate the minimum number of moves required to solve the puzzle. Again, this is almost always less than the number of swaps used to create the puzzle, especially as the number of swaps approaches the number of tiles in each set.
My goal is to calculate the best case scenario and then give the user a "fudge factor" (i.e. 1.2 times the minimum number of moves). Solving the puzzle in under this number of moves will result in passing the level.
A little background as to how I currently have the game configured:
Levels 1 to 10: 9 tiles in each set. 5 different color tiles.
Levels 11 to 20: 12 tiles in each set. 7 different color tiles.
Levels 21 to 25: 15 tiles in each set. 10 different color tiles.
Swapping within a set is not allowed.
For each level, there will be at least 2 tiles of a given color (one for each set in the solved puzzle).
Is there any type of algorithm anyone could recommend to calculate the minimum number of moves to solve a given puzzle?
The minimum moves to solve a puzzle is essentially the shortest path from that unsolved state to a solved state. Your game implicitly defines a graph where the vertices are legal states, and there's an edge between two states if there's a legal move that enables that transition.
Depending on the size of your search space, a simple breadth-first search would be feasible, and would give you the minimum number of steps to reach any given state. In fact, you can generate the problems this way too: instead of making random moves to arrive at a state and checking its "distance" from the initial state, simply explore the search space in breadth-first/level-order, and pick a state at a given "distance" for your puzzle.
Related questions
Rush Hour - Solving the Game
BFS is used to solve Rush Hour, with source code in Java
Alternative
IF the search space is too huge for BFS (and I'm not yet convinced that it is), you can use iterative deepening depth-first search instead. It's space-efficient like DFS, but (cummulatively) level-order like BFS. Even though nodes would be visited many times, it is still asymptotically identical to BFS, but requiring much leser space.
I didn't quite understand the puzzle from your description, but two general ideas often useful in solving that kind of puzzles are backtracking and branch and bound.
The A* search algorithm. The idea is that you have some measure of how close a position is to the solution. A* is then a "best first" search in the sense that at each step it considers moves from the best position found so far. It's up to you to come up with some kind of measure of how close you are to a solution. (It doesn't have to be accurate, it's just a heuristic to guide the search.) In practice it often performs much better than a pure breadth first search because it's always guided by your closeness scoring function. But without understanding your problem description, it's hard to say. (A rule of thumb is that if there's a sense of "making progress" while doing a puzzle, rather than it all suddenly coming together at the end, then A* is a good choice.)

Resources