Maze Solving From All Start Points - algorithm

Recently I came across a problem that stated:
Assume a maze made of the characters *, ., and C. Here * represents walls, while . and C are allowed cells. There is only one point marked C. Now, given that a bot stands on any of the allowed points, there exists a series of commands (for example LDDRU or LLLRRDU, etc.) such that if the bot starts from any allowed point, it passes through C at least once.
Eg:
******
*.C..*
**.***
*....*
******
Command: RLLURUU
Now I know how to solve a maze using DFS/BFS (for the shortest path). But can anyone provide a hint on how to approach problems like this?
EDIT: If the next move is into a wall or outside the maze, it is ignored. As usual, L is left, R is right, U is up, and D is down.

This problem is related to the concept of synchronizing words or reset sequences for finite automata. You can imagine building an automaton where
each open space, plus C, is a state;
each state other than C transitions to itself for every move that hits a wall;
each state other than C transitions to a neighboring open state in the indicated direction if there's an open spot in that direction; and
the C state transitions to itself on all moves.
Given this automaton, you're now looking for a sequence that takes every state to the C state, hence the connection to synchronizing words. There are a number of algorithms for finding synchronizing words, and any of them could be adapted to solve this particular problem. One option would be to build the power automaton from the original automaton and to look for a shortest path from its start state (the set of all states) to the state containing only C, which (I believe) ends up being a theoretically optimal version of the comment about collapsing virtual robots together (in that it will always find the shortest command sequence).
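As a rough illustration of the power-automaton idea (not code from the original answer), here is a minimal Python sketch. It runs a breadth-first search over sets of cells the bot might currently occupy and treats C as absorbing, so a set collapses to {C} exactly when every start point has passed through C. The names `step` and `shortest_reset_sequence` are made up for this example, and the search is exponential in the number of open cells in the worst case, so it only suits small mazes.

```python
from collections import deque

MOVES = {"L": (0, -1), "R": (0, 1), "U": (-1, 0), "D": (1, 0)}


def step(grid, pos, cmd):
    """Apply one command to one bot; C is absorbing, blocked moves are ignored."""
    r, c = pos
    if grid[r][c] == "C":
        return pos
    dr, dc = MOVES[cmd]
    nr, nc = r + dr, c + dc
    if 0 <= nr < len(grid) and 0 <= nc < len(grid[nr]) and grid[nr][nc] != "*":
        return (nr, nc)
    return pos


def shortest_reset_sequence(grid):
    """BFS over subsets of cells; returns a shortest command string or None."""
    start = frozenset(
        (r, c) for r, row in enumerate(grid) for c, ch in enumerate(row) if ch != "*"
    )
    goal = frozenset(p for p in start if grid[p[0]][p[1]] == "C")
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        cells, cmds = queue.popleft()
        if cells == goal:
            return cmds
        for cmd in "LRUD":
            nxt = frozenset(step(grid, p, cmd) for p in cells)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, cmds + cmd))
    return None


MAZE = [
    "******",
    "*.C..*",
    "**.***",
    "*....*",
    "******",
]
print(shortest_reset_sequence(MAZE))
```

Because the search is breadth-first, the first sequence found is a shortest one for this absorbing-C model.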

Related

Does the removal of a few edges remove all paths to a node?

I'm making a game engine for a board game called Blockade and right now I'm trying to generate all legal moves in a position. The rules aren't exactly the same as the actual game and they don't really matter. The gist is: the board is a matrix and you move a pawn and place a wall every move.
In short, I have to find whether or not a valid path exists from every pawn to every goal after every potential legal move (imagine a pawn doesn't move and a wall is just placed), to rule out illegal moves. Or rather, if I simplify it to a subproblem, whether or not the removal of a few edges (placing a wall) removes all paths to a node.
Brute-forcing it would take O(k*n*m), where n and m are the board dimensions and k is the number of potential legal moves. Searching for a path (in the worst case, traversing most of the board) is very expensive, but I think dynamic programming or some other idea/algorithm could make it faster, since the position stays the same and only the wall placement changes. In graph terms, the graph is the same and only the set of removed edges changes. Any sort of optimization is welcome.
Edit:
To elaborate on the wall (blockade): a wall is two squares wide/tall (depending on whether it's horizontal or vertical), so it will usually remove at least four edges, e.g.:
p | r
q | t
In this 2x2 matrix, placing a wall in the middle (as shown) will remove jumping from and to:
p and t, q and r, p and r, and q and t
I apologize ahead of time if I don't fully understand your question as it is asked; there seems to be some tacit contextual knowledge you are hinting at in your question with respect to knowledge about how the blockade game works (which I am completely unfamiliar with.)
However, based on a quick scan of Wikipedia about the rules of the game, and from what I gather from your question, my understanding is that you are effectively asking how to ensure that a move is legal. As I understand it, an illegal move is a wall/blockade placement that would make it impossible for some pawn to reach its goal state.
In this case, I believe a workable solution that would be fairly efficient would be as follows.
Define a path tree of a pawn to be a (possibly but not necessarily shortest) path tree from the pawn to each reachable position. The idea is, you want to maintain a path tree for every pawn so that it can be updated efficiently with every blockade placement. What is described in the previous sentence can be accomplished by observing and implementing the following:
when a blockade is placed it removes 2 edges from the graph, which can sever at most two edges in each of your existing path trees
each pawn's path tree can be efficiently recomputed after edges are severed using the "adoption" algorithm of the Boykov-Kolmogorov maxflow algorithm.
once every pawn's path tree is recomputed, simply check that each pawn can still reach its goal state; if not, mark the move as illegal
repeat for each possible move (resetting graphs as needed during the search)
Here are resources on the adoption algorithm that is critical to doing what is described efficiently:
open-source implementation as part of the BK-maxflow: https://www.boost.org/doc/libs/1_66_0/libs/graph/doc/boykov_kolmogorov_max_flow.html
implementation by authors as part of BK-maxflow: https://pub.ist.ac.at/~vnk/software.html
detailed description of adoption (stage) algorithm of BK maxflow algorithm: section 3.2.3 of https://www.csd.uwo.ca/~yboykov/Papers/pami04.pdf
Note that reading the description of the adoption algorithm in the last resource above is the most critical step for understanding how to adopt orphaned portions of your path tree efficiently.
In terms of efficiency, I believe you should expect O(1) operations on average for each adopted edge, meaning this approach should take about O(k) time, where k is the number of board states you wish to evaluate.
Note, the pawn path tree should actually be a reverse directed tree rooted at the goal nodes, which will allow the computation to be done for all legal pawn placements given a blockade configuration.
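To make the path-tree idea a little more concrete, here is a simplified Python sketch. It is not the Boykov-Kolmogorov adoption stage: it only shows the cheap early exit (a wall that misses the pawn's tree path cannot disconnect it) and falls back to a fresh BFS otherwise. The graph format (`adj` as a dict of neighbour lists) and the function names are assumptions made for this example.

```python
from collections import deque


def bfs_tree(adj, goal, removed=frozenset()):
    """Reverse path tree rooted at the goal: parent[v] is the next hop toward goal."""
    parent = {goal: None}
    queue = deque([goal])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent and frozenset((u, v)) not in removed:
                parent[v] = u
                queue.append(v)
    return parent


def still_reachable(adj, goal, parent, pawn, removed_edges):
    """Check whether the pawn can still reach the goal after removing edges."""
    if pawn not in parent:
        return False  # the pawn was already cut off before the wall was placed
    removed = {frozenset(e) for e in removed_edges}
    node = pawn
    while parent[node] is not None:
        if frozenset((node, parent[node])) in removed:
            # The wall severs the pawn's tree path; a real implementation would
            # try to "adopt" the orphaned subtree, here we simply search again.
            return pawn in bfs_tree(adj, goal, removed)
        node = parent[node]
    return True  # the old tree path is untouched, so it is still valid
```

The tree is built once per position with `bfs_tree(adj, goal)` and then reused for every candidate wall, so most candidates are resolved by walking one short path rather than by searching the board.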
A few suggestions:
To check if there's a path from A to B after ever
Every move removes a node from the graph/grid. So what we want to know is whether there are critical nodes on the path from A to B (single points that could be blocked to break the path). This is a classic flow problem. For this application you want to set the vertex capacity to 1 and push 2 units of flow (basically just to verify that there are at least 2 paths). If there are 2 paths, no single block can disconnect you from the destination. You can optimize it a bit by using an implicit graph, but if you're new to this, maybe create the graph explicitly to visualize it better. This should be O(N*M), the size of your grid.
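A minimal Python sketch of that flow check, under some assumptions of mine: the board is a list of strings with '#' for blocked cells, moves are 4-directional, and each cell is split into an "in" and an "out" node so that the capacity of the edge between them acts as the vertex capacity. The function name is hypothetical.

```python
from collections import defaultdict, deque


def has_two_vertex_disjoint_paths(grid, src, dst):
    """Return True if two paths from src to dst share no intermediate cell."""
    cap = defaultdict(int)   # residual capacities
    adj = defaultdict(set)   # adjacency including residual (reverse) edges

    def add_edge(u, v, c):
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)

    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "#":
                continue
            cell = (r, c)
            # Vertex capacity 1, except the endpoints, which carry both paths.
            add_edge((cell, "in"), (cell, "out"), 2 if cell in (src, dst) else 1)
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                    add_edge((cell, "out"), ((nr, nc), "in"), 1)

    s, t = (src, "in"), (dst, "out")
    flow = 0
    while flow < 2:
        # BFS for one augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1
    return flow >= 2
```

Only two augmenting searches are ever needed, so the cost stays proportional to the size of the grid.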
Optimizations
Since this is a game, you know that the setup doesn't change dramatically from one step to another. So, you can keep track of the two paths. If the blockade is not placed on either of the paths, you can ignore it: you already have 2 paths to the destination.
If the block does land on one of the paths, cancel only that path and then look for another (reusing the one you still have).
You can also speed up the pawn movement. This can be a bit tricky, but what you want is to move the source. I'm assuming the pawn moves only a few cells at a time, so instead of finding completely new paths, you can simply adjust them to connect to the new position, speeding up the update.

Shortest path in a maze

I'm developing a game similar to Pacman: consider this maze:
Each white square is a node from the maze where an object located at P, say X, is moving towards node A in the right-to-left direction. X cannot switch to its opposite direction unless it encounters a dead-end such as A. Thus the shortest path joining P and B goes through A because X cannot reverse its direction towards the rightmost-bottom node (call it C). A common A* algorithm would output:
to get to B from P first go rightward, then go upward;
which is wrong. So I thought: well, I can set C's visited attribute to true before running A* and let the algorithm find the path. Obviously this method doesn't work for the linked maze, unless I allow it to rediscover some nodes (the question is: which nodes? How do I tell them apart from useless nodes?). The first thought that crossed my mind was: use the previous method while always keeping track of the last-visited cell; if the resulting path isn't empty, you are done. Otherwise, when A* fails, go to the last-visited dead end, say Y, then use standard A* from Y to get to the goal (I'm assuming the maze is connected). My questions are: is this guaranteed to always work? Is there a more efficient algorithm, such as an A*-derived algorithm modified for this purpose? How would you tackle this problem? I would greatly appreciate an answer explaining both optimal and non-optimal search techniques (actually I don't need the shortest path, a slightly longer path is fine, but I'm curious whether an optimal algorithm running as efficiently as Dijkstra's algorithm exists; if it does, what is its running time compared to a non-optimal algorithm?)
EDIT For Valdo: I added 3 cells in order to generalize a bit: please tell me if I got the idea:
Good question. I can suggest the following approach.
Use Dijkstra (or A*) algorithm on a directed graph. Each cell in your maze should be represented by multiple (up to 4) graph nodes, each node denoting the visited cell in a specific state.
That is, in your example you may be in the cell denoted by P in one of 2 states: while going left, and while going right. Each of them is represented by a separate graph node (though spatially it's the same cell). There's also no direct link between those 2 nodes, since you can't switch your direction in this specific cell.
According to your rules you may only switch direction when you encounter an obstacle, this is where you put links between the nodes denoting the same cell in different states.
You may also think of your graph as your maze copied into 4 layers, each layer representing the state of your pacman. In the layer that represents movement to the right you put only links to the right, following the geometry of your maze. In the cells with obstacles, where moving right is not possible, you put links to the same cells in different layers.
Update:
Regarding the scenario that you described in your sketch. It's actually correct, you've got the idea right, but it looks complicated because you decided to put links between different cells AND states.
I suggest the following diagram:
The idea is to split your inter-cell AND inter-state links. There are now 2 kinds of edges: inter-cell, marked by blue, and inter-state, marked by red.
Blue edges always connect nodes of the same state (arrow direction) between adjacent cells, whereas red edges connect different states within the same cell.
According to your rules the state change is possible where the obstacle is encountered, hence every state node is the source of either blue edges if no obstacle, or red if it encounters an obstacle (i.e. can't emit a blue edge). Hence I also painted the state nodes in blue and red.
If according to your rules a state transition happens instantly, without delay/penalty, then red edges have weight 0. Otherwise you may assign a non-zero weight to them; the weight ratio between red and blue edges should correspond to the ratio of turn time to travel time.
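The layered-graph idea can be sketched in Python roughly as follows. This is an illustration under assumptions, not the answerer's code: the maze is a list of strings with '#' for walls, a state is a (cell, direction) pair, a "blue" edge moves one cell forward at cost 1, and "red" edges change direction inside a cell at cost `turn_cost`, and only when the way forward is blocked. The function names and the `turn_cost` parameter are made up for the example.

```python
import heapq

DIRS = {"L": (0, -1), "R": (0, 1), "U": (-1, 0), "D": (1, 0)}


def is_open(grid, r, c):
    return 0 <= r < len(grid) and 0 <= c < len(grid[r]) and grid[r][c] != "#"


def shortest_path_cost(grid, start, start_dir, goal, turn_cost=0):
    """Dijkstra over (cell, direction) states; returns the cost or None."""
    dist = {(start, start_dir): 0}
    heap = [(0, start, start_dir)]
    while heap:
        d, cell, facing = heapq.heappop(heap)
        if cell == goal:
            return d
        if d > dist[(cell, facing)]:
            continue  # stale heap entry
        r, c = cell
        dr, dc = DIRS[facing]
        if is_open(grid, r + dr, c + dc):
            # Blue edge: keep the direction, move to the adjacent cell.
            edges = [(((r + dr, c + dc), facing), 1)]
        else:
            # Red edges: blocked, so switch state within the same cell.
            edges = [((cell, other), turn_cost) for other in DIRS if other != facing]
        for state, weight in edges:
            nd = d + weight
            if nd < dist.get(state, float("inf")):
                dist[state] = nd
                heapq.heappush(heap, (nd, state[0], state[1]))
    return None
```

For a single corridor `["....."]`, a start in the middle cell facing left has to run to the left wall before it can turn around, which is the behaviour the question asks for.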

Generating Random Puzzle Boards for Rush Hour Game

If you're not familiar with it, the game consists of a collection of cars of varying sizes, set either horizontally or vertically, on a NxM grid that has a single exit.
Each car can move forward/backward in the directions it's set in, as long as another car is not blocking it. You can never change the direction of a car.
There is one special car, usually it's the red one. It's set in the same row that the exit is in, and the objective of the game is to find a series of moves (a move - moving a car N steps back or forward) that will allow the red car to drive out of the maze.
I've been trying to think about how to generate instances of this problem, with levels of difficulty based on the minimum number of moves needed to solve the board.
Any idea of an algorithm or a strategy to do that?
Thanks in advance!
The board given in the question has at most 4*4*4*5*5*3*5 = 24,000 possible configurations, given the placement of cars.
A graph with 24,000 nodes is not very large for today's computers. So a possible approach would be to
construct the graph of all positions (nodes are positions, edges are moves),
find the number of moves needed to win from every node (e.g. using Dijkstra) and
select a node with a large distance from the goal.
One possible approach would be creating it in reverse.
Generate a random board, that has the red car in the winning position.
Build the graph of all reachable positions.
Select a position that has the largest distance from every winning position.
The number of reachable positions is not that big (probably always below 100k), so (2) and (3) are feasible.
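Here is a small Python sketch of the reverse approach, with several assumptions of mine: the board is a square of side `size`, each car is a tuple (orientation, fixed row or column, length), a position is a tuple with one offset per car, car 0 is the red car, it is horizontal, and it "wins" when it touches the right edge. One move slides one car any number of free cells. All names are hypothetical.

```python
from collections import deque


def occupied(cars, state):
    """Map every covered cell (row, col) to the index of the car covering it."""
    cells = {}
    for i, ((orient, line, length), pos) in enumerate(zip(cars, state)):
        for k in range(length):
            cells[(line, pos + k) if orient == "H" else (pos + k, line)] = i
    return cells


def neighbors(cars, state, size):
    """Yield every position reachable by sliding one car along its axis."""
    cells = occupied(cars, state)
    for i, ((orient, line, length), pos) in enumerate(zip(cars, state)):
        for direction in (-1, 1):
            new = pos
            while True:
                new += direction
                edge = new if direction < 0 else new + length - 1
                cell = (line, edge) if orient == "H" else (edge, line)
                if not (0 <= edge < size) or cell in cells:
                    break
                yield state[:i] + (new,) + state[i + 1:]


def hardest_position(cars, solved_state, size):
    """Return (position, moves) maximizing the distance to the nearest win."""
    # Step 2: collect every position reachable from the solved board.
    reachable = {solved_state}
    queue = deque([solved_state])
    while queue:
        s = queue.popleft()
        for n in neighbors(cars, s, size):
            if n not in reachable:
                reachable.add(n)
                queue.append(n)
    # Step 3: multi-source BFS from all winning positions (moves are reversible).
    red_length = cars[0][2]
    dist = {s: 0 for s in reachable if s[0] + red_length == size}
    queue = deque(dist)
    while queue:
        s = queue.popleft()
        for n in neighbors(cars, s, size):
            if n not in dist:
                dist[n] = dist[s] + 1
                queue.append(n)
    return max(dist.items(), key=lambda item: item[1])
```

Since every move can be undone, the position graph is undirected, and a plain BFS is enough when each move costs 1.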
How to create harder instances through local search
It's possible that above approach will not yield hard instances, as most random instances don't give rise to a complex interlocking behavior of the cars.
You can do some local search, which requires
a way to generate other boards from an existing one
an evaluation/fitness function
(2) is simple, maybe use the length of the longest solution, see above. Though this is quite costly.
(1) requires some thought. Possible modifications are:
add a car somewhere
remove a car (I assume this will always make the board easier)
Those two are enough to reach all possible boards. But one might want to add other modifications, because removing a car makes the board easier. Here are some ideas:
move a car perpendicularly to its driving direction
swap cars within the same lane (aaa..bb.) -> (bb..aaa.)
Hill climbing/steepest ascent is probably a bad choice because of the large branching factor. One can try to subsample the set of possible neighbouring boards, i.e., look only at a few random ones rather than all of them.
I know this is ancient but I recently had to deal with a similar problem so maybe this could help.
Constructing instances by applying random operators from a terminal state (i.e., reverse) will not work well. This is due to the symmetry in the state space. On average you end up in a state that is too close to the terminal state.
Instead, what worked better was to generate initial states (by placing random cars on the grid) and then to try to solve it with some bounded heuristic search algorithm such as IDA* or branch and bound. If an instance cannot be solved under the bound, discard it.
Try to avoid A*. If you have a definition of what counts as a "hard" instance (I find 16 moves to be pretty difficult), you can use A* with a pruning rule that prevents expansion of nodes x with g(x)+h(x)>T (T being your threshold, e.g., 16).
Heuristic function - Since you don't have to be optimal when solving it, you can use any simple inadmissible heuristic, such as the number of obstacle squares on the way to the goal. Alternatively, if you need a stronger heuristic function, you can implement a Manhattan distance function by generating the entire set of winning states for the generated puzzle and then using the minimal distance from the current state to any of the terminal states.

Find an algorithm to win this battle against crime!

A crime is committed in a city and the suspect starts to run away. A map of the city is given. At the moment, there are some police cars at given places, and they try to stop the suspect. The police cars and the suspect have the same maximum speed. The suspect can only pass a point if he reaches it earlier than any police car. There are several exits on the map, and the suspect escapes if he reaches any of them. Find an algorithm that allocates the police cars so that the suspect has no path by which to escape.
For example, below is a possible city map.
White circle is where the suspect starts, black circles are police cars, and little squares are exits. In this situation, suspect can be stopped. A possible plan is police car A goes to A', B stays and C goes to C'.
An equivalent description of my problem could be:
A chemical factory (marked by the white circle) explodes and poisonous fluid starts to flow at each possible direction at speed v, and the rescue teams (marked by black circles) whose maximum speed is also v are trying to block it. The little squares are villagers they are protecting.
My Thoughts
If we have n police cars, a highly inefficient approach is to list all possible k-element subsets P of vertices such that:
a) k <= n;
b) removing all vertices in P from the map makes every exit unreachable for the suspect;
c) removing any proper subset of P leaves at least one exit reachable for the suspect.
Then we can easily determine whether every vertex in P can be covered by a police car no later than the suspect arrives.
But how do I list all the possible Ps?
@Lior Kogan:
Look at this map:
If it is a turn-based game in which both sides know the other's strategy, the police win because they can just guard the side the suspect goes to.
But in my problem, the police lose because they never know which side the suspect will choose.
Edit2: Based on your clarifications:
I couldn't find any research concerning the exact posed problem.
Another close subject is virus spread and inoculation in networks. Here are some papers:
Inoculation strategies for victims of viruses and the sum-of-squares partition problem
Worm Versus Alert: Who Wins in a Battle for Control of a Large-Scale Network?
Protecting Against Network Infections: A Game Theoretic Perspective
I think that the posed problem is very interesting. Though I believe it is NP-hard.
Sorry for being unable to help any further.
--
Edit1: Changed from Cops and Robbers game to Graph guarding game.
New answer:
This is a variant of the Graph Guarding game.
A team of mobile agents, called guards, tries to keep an intruder out of an assigned area by blocking all possible attacks. In a graph model for this setting, the agents and the intruder are located on the vertices of a graph, and they move from node to node via connecting edges.
See: Guard Games on Graphs and How to Guard a Graph?
In your variant, there are two differences:
You are trying to guard more than one area
Each guarded area is a single node
--
Original answer:
This is a variant of the well studied Cops and Robbers game.
The Cops and Robbers game is played on undirected graphs where a group of cops tries to catch a robber. The game was defined independently by Winkler-Nowakowski and Quilliot in the 1980s and since that time has been studied intensively. Despite that, its computational complexity is still an open question.
The problem of determining if k cops can capture a robber on an undirected graph, as well as the problem of computing the minimum number of cops that can catch a robber on a given graph were proven to be NP-hard.
Here are some resources:
Chapter 6 of The Game of Cops and Robbers on Graphs
On tractability of Cops and Robbers game
Complexity of Cops and Robber Game
Talks on GRASTA 2011 (see ch.3)
Now I have a clearer view of my problem. Although simpler than the Cops and Robbers Game or Graph Guarding game, it is nevertheless an NP-hard problem.
This problem can actually be divided into two separate tasks:
Task a) Find a set of vertices whose removal makes every exit unreachable for the suspect.
Task b) Check whether all vertices in this set can be covered by police cars in time.
Now we are going to prove that Task a) is NP-complete.
First we consider when there is only one exit. Look at this simple map:
Assign False to a vertex if it is blocked by police and True if it's passable. We know that the suspect can evade if A & (B | D) & C == True. Now we clearly see that Task a) is equivalent to the famous NP-complete Boolean satisfiability problem.
If we have several exits, simply create several boolean expressions and connect them with AND(&).
Task b) is simply a bipartite graph matching problem, which can be easily solved by the Hungarian algorithm. Its time complexity is O(n^4).
So the whole problem is NP-hard.
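Task b) can be sketched in Python as follows. Note the substitutions: instead of the Hungarian algorithm this uses a plain augmenting-path bipartite matching (Kuhn's algorithm), which is enough because we only need to know whether every cut vertex gets its own car. The sketch also assumes unit-length roads, so travel times are BFS distances. The graph format and function names are made up for the example.

```python
from collections import deque


def bfs_dist(adj, source):
    """Hop distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def can_cover(adj, suspect, police, cut):
    """True if each vertex in cut can get a distinct car arriving in time."""
    suspect_dist = bfs_dist(adj, suspect)
    police_dist = [bfs_dist(adj, p) for p in police]
    # Car i may guard v if it arrives no later than the suspect would.
    options = {
        v: [
            i
            for i, d in enumerate(police_dist)
            if v in d and d[v] <= suspect_dist.get(v, float("inf"))
        ]
        for v in cut
    }
    assigned = {}  # car index -> vertex it guards

    def try_assign(v, visited):
        for car in options[v]:
            if car in visited:
                continue
            visited.add(car)
            # Take a free car, or re-route the vertex that currently holds it.
            if car not in assigned or try_assign(assigned[car], visited):
                assigned[car] = v
                return True
        return False

    return all(try_assign(v, set()) for v in cut)
```

Combined with the enumeration from the question, one would call `can_cover` for each candidate set P and stop at the first one that succeeds.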

Combinatorial optimization

Suppose we have a connected and undirected graph: G=(V,E).
Definition of connected-set: a group of points belonging to V of G forms a valid connected-set iff every point in this group is within T-1 edges of any other point in the same group, where T is the number of points in the group.
Please note that a connected-set is just a connected subgraph of G without the edges but with the points.
And we have an arbitrary function F defined on connected-sets, i.e. given an arbitrary connected-set CS, F(CS) will give us a real value.
Two connected-sets are said to be disjoint if their union is not a connected-set.
For a visual explanation, please see the graph below:
In the graph, the red,black,green point sets are all valid connected-sets, green set is disjoint to red set, but black set is not disjoint to the red one.
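The two definitions can be written down as small Python checks. This is only a sketch of my reading of the question: a connected-set is a set of points whose induced subgraph is connected, and two sets are disjoint when their union fails that test. The adjacency-dict format and the function names are assumptions.

```python
def is_connected_set(adj, points):
    """True if the subgraph induced by points is non-empty and connected."""
    points = set(points)
    if not points:
        return False
    start = next(iter(points))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            # Only walk along edges that stay inside the candidate set.
            if v in points and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == points


def are_disjoint(adj, first, second):
    """Two connected-sets are disjoint if their union is not a connected-set."""
    return not is_connected_set(adj, set(first) | set(second))
```

Under this definition, two sets that merely touch through a single edge already count as not disjoint, which matches the black and red sets in the description above.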
Now the question:
We want to find a bunch of disjoint connected-sets from G so that:
(1) every connected-set has at least K points (K is a global parameter);
(2) the sum of their function values, i.e. Σ F(CS), is maximized.
Is there any efficient algorithm to tackle such a problem other than an exhaustive search?
Thx!
For example, the graph can be a planar graph in the 2D Euclidean plane, and the function value F of a connected-set CS can be defined as the area of the minimum bounding rectangle of all the points in CS(minimum bounding rectangle is the smallest rectangle enclosing all the points in the CS).
If you can define your function and prove it is a Submodular Function (property analogous to that of Convexity in continuous Optimization) then there are very efficient (strongly polynomial) algorithms that will solve your problem e.g. Minimum Norm Point.
To prove that your function is submodular, you only need to prove the following: F(A) + F(B) >= F(A ∪ B) + F(A ∩ B) for all sets A and B in the domain of F.
There are several available implementations of the Minimum Norm Point algorithm e.g. Matlab Toolbox for Submodular Function Optimization
I doubt there is an efficient algorithm since for a complete graph for instance, you cannot solve the problem without knowing the value of F on every subgraph (except if you have assumptions on F: monotonicity for instance).
Nevertheless, I'd go for a non deterministic algorithm. Try simulated annealing, with transitions being:
Remove a point from a set (if it stays connected)
Move a point from a set to another (if they stay connected)
Remove a set
Add a set with one point
Good luck, this seems to be a difficult problem.
For such a general F, it is not an easy task to draft an optimized algorithm, far from the brute force approach.
For instance, since we want to find a bunch of CS where F(CS) is maximized, should we assume we want actually to find max(Σ F(CS)) for all CS or the highest F value from all possible CS, max(F(csi))? We don't know for sure.
Also, F being arbitrary, we cannot estimate the probability of having F(cs+p1) > F(cs) => F(cs+p1+p2) > F(cs).
However, we can still discuss it:
It seems we can deduce from the problem that we can treat each CS independently, meaning if n = F(cs1) adding any cs2 (being disjoint from cs1) will have no impact on the n value.
It seems also believable, and this is where we should be able to get some gain, that the calculation of F can be made starting from any point of a CS, and, in general, if CS = cs1+cs2, F(CS) = F(cs1+cs2) = F(cs2+cs1).
Then we want to inject memoization in the algorithm in order to speed up the process when a CS is grown up little by little in order to find max(F(cs)) [considering F general, the dynamic programming approach, for instance starting from a CS made of all points, then reducing it little by little, doesn't seem to have a big interest].
Ideally, we could start with a CS made of one point and extend it one point at a time, checking and storing F values (for each subset). Each test would first check whether the F value already exists, so as not to calculate it again; then repeat the process for another point, and so on, to find the best subsets that maximize F. For a large number of points, this is a very lengthy process.
A more reasonable approach would be to try random points and grow the CS up to a given size, then try another area distinct from the bigger CS obtained at the previous stage. One could try to assess the probability explained above, and direct the algorithm in a certain way depending on the result.
But, again due to lack of F properties, we can expect an exponential space need via memoization (like storing F(p1,...,pn), for all subsets). And an exponential complexity.
I would use dynamic programming. You can start out rephrasing your problem as a node coloring problem:
Your goal is to assign a color to each node. (In other words you are looking for a coloring of the nodes)
The available colors are black and white.
In order to judge a coloring you have to examine the set of "maximal connected sets of black nodes".
A set of black nodes is called connected if the induced subgraph is connected
A connected set of black nodes is called maximal if none of the nodes in the set has a black neighbor in the original graph that is not contained in the set
Your goal is to find the coloring that maximizes ΣF(CS). (Here you sum over the "maximal connected sets of black nodes")
You have some extra constraints that are specified in your original post.
Perhaps you could look for an algorithm that does something like the following
Pick a node
Try to color the chosen node white
Look for a coloring of the remaining nodes that maximizes ΣF(CS)
Try to color the chosen node black
Look for a coloring of the remaining nodes that maximizes ΣF(CS)
Each time you have colored a node white then you can examine whether or not the graph has become "decomposable" (I made up this word. It is not official):
A partially colored graph is called "decomposable" if it contains a pair of non-white nodes that are not connected by any path that does not contain a white node.
If your partially colored graph is decomposable then you can split your problem into two sub-problems.
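The "decomposable" test amounts to counting connected components once the white nodes are deleted. A minimal Python sketch, assuming the graph is an adjacency dict and `white` is the set of nodes colored white so far (both names are mine):

```python
def non_white_components(adj, white):
    """Connected components of the graph after deleting all white nodes."""
    remaining = set(adj) - set(white)
    components = []
    while remaining:
        start = remaining.pop()
        component = {start}
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in remaining:
                    remaining.remove(v)
                    component.add(v)
                    stack.append(v)
        components.append(component)
    return components


def is_decomposable(adj, white):
    """True if some pair of non-white nodes is separated by white nodes."""
    return len(non_white_components(adj, white)) > 1
```

Each returned component can then be handed to the recursive coloring search as an independent sub-problem.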
EDIT: I added an alternative idea and deleted it again. :)

Resources