Algorithm to fill out a monochrome area - ideas? - image

I am looking for an algorithm that does much the same as flood fill, i.e. fills out a monochrome area. Instead of recursion and the nearest-neighbor approach, I want the algorithm to behave like a "turtle" or "mouse" that fills out the image while leaving a path behind. This path must not contain diagonal movements. The result should be similar to a perfect Snake game, where the entire square is filled (the snake represents the path in this case). The path may cross itself, but crossings should be kept to a minimum and should only occur in special cases (e.g. when the "mouse" enters a passage of width 1px, where it would fill that passage and turn around). The number of changes in direction should also be kept to a minimum.
P.S.: note that this will be applied to an image, not a graph.

I would model this as a graph problem where you are trying to visit all the nodes in the graph. Each pixel is a node and an edge exists between nodes that are directly next to each other.
I believe that a variation of breadth-first search on a graph modeled like this would achieve the result you are looking for, as long as you ensure that you do not visit nodes twice.
You would need to adjust the breadth-first search so that it prefers a 'straight' path. For example, create an is_straight() check that tests whether the next node continues in the current direction, prefer the straight node, and only pick a non-straight node when is_straight() returns false for all the child nodes.
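A minimal sketch of the "prefer straight" idea follows. It swaps in a depth-first walk rather than a breadth-first search, because a breadth-first visiting order generally jumps between cells that are not adjacent, while a depth-first walk leaves one continuous path. The function name `straight_first_walk`, the grid encoding (1 = fillable) and the initial heading (right) are assumptions made for illustration.

```python
def straight_first_walk(grid, start):
    """Walk over all reachable fillable cells (value 1) using 4-connected moves.

    Returns the cells in the order the "mouse" stands on them, including the
    steps it takes when backing out of a dead end.
    """
    rows, cols = len(grid), len(grid[0])
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
    visited = {start}
    path = [start]
    last_new = 1                  # length of path when the last new cell was added
    stack = [(start, (0, 1))]     # (cell, heading used to arrive there)

    while stack:
        (r, c), heading = stack[-1]
        # Try the current heading first, then the other directions.
        candidates = [heading] + [m for m in moves if m != heading]
        for dr, dc in candidates:
            nxt = (r + dr, c + dc)
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and grid[nxt[0]][nxt[1]] == 1 and nxt not in visited):
                visited.add(nxt)
                path.append(nxt)
                last_new = len(path)
                stack.append((nxt, (dr, dc)))
                break
        else:
            # Dead end: step back one cell (this re-crosses the path).
            stack.pop()
            if stack:
                path.append(stack[-1][0])
    # Drop the trailing backtracking steps taken after the last new cell.
    return path[:last_new]
```

On an open rectangle this produces a snake-like sweep; in a 1px dead end the walk fills the passage and then retraces it, which is the only place the path overlaps itself. It does not attempt to minimise crossings or turns globally.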

Related

Can a quad-tree be used to accurately determine the closest object to a point?

I have a list of coordinates and I need to find the closest coordinate to a specific point which I'll call P.
At first I tried to just calculate the distance from each coordinate to P, but this is too slow.
I then tried to store these coordinates in a quad-tree, find the leaf node that contains P, and then find the closest coordinate in that leaf by comparing the distance of every coordinate in it to P. This gives a good approximation of the closest coordinate, but it can sometimes be wrong (when a coordinate is outside the leaf node, but closer). I've also tried searching through the leaf node's parent, but while that makes the search more accurate, it doesn't make it perfect.
If this can be done with a quad-tree, please let me know how. Otherwise, what other methods or data structures could I use that are reasonably efficient? Is it even possible to do this exactly in an efficient manner?
Try "loose quadtree". It does not have a fixed volume per node. So it can adjust each node's bounding volume to adapt to the items added.
If you don't like quadtree's traversing performance and if your objects are just points, adaptive-grid can perform fast or very close to O(N). But memory-wise, loose quadtree would be better.
There is an algorithm by Hjaltason and Samet described in their paper "Distance browsing in spatial databases". It can easily be applied to quadtrees; I have an implementation here.
Basically, you maintain a list of objects sorted by distance (closest first). The objects are either points in your tree (what you call the coordinates) or nodes in the tree (with the distance to the closest point of the node's area, or distance = 0 if the node overlaps with your search point).
You start by adding all nodes that overlap with your search point, and then add all points and subnodes in these nodes.
Then you simply return points from the top of the list until you have as many closest points as you want. If a node is at the top of the list, add the points/subnodes from that node to the list and check the top of the list again. Repeat.
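The distance-ordered list described above can be sketched with a priority queue. This is an illustrative sketch, not the linked implementation: the `Node` class, its `capacity` parameter and the `nearest` function are hypothetical names, and the sketch assumes every inserted point lies inside the root square.

```python
import heapq
import itertools
import math

class Node:
    """Square quadtree node covering [x, x + size] x [y, y + size]."""
    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size, self.capacity = x, y, size, capacity
        self.points = []      # points stored while this node is a leaf
        self.children = None  # four sub-nodes once the leaf has been split

    def insert(self, p):
        if self.children is None:
            self.points.append(p)
            if len(self.points) > self.capacity and self.size > 1e-9:
                self._split()
            return
        self._child_for(p).insert(p)

    def _split(self):
        h = self.size / 2
        self.children = [Node(self.x + dx * h, self.y + dy * h, h, self.capacity)
                         for dy in (0, 1) for dx in (0, 1)]
        pts, self.points = self.points, []
        for p in pts:
            self._child_for(p).insert(p)

    def _child_for(self, p):
        h = self.size / 2
        ix = 1 if p[0] >= self.x + h else 0
        iy = 1 if p[1] >= self.y + h else 0
        return self.children[iy * 2 + ix]

    def min_dist(self, q):
        """Distance from q to the nearest point of this square (0 if q is inside)."""
        dx = max(self.x - q[0], 0, q[0] - (self.x + self.size))
        dy = max(self.y - q[1], 0, q[1] - (self.y + self.size))
        return math.hypot(dx, dy)

def nearest(root, q, k=1):
    """Return the k points closest to q, closest first."""
    counter = itertools.count()  # tie-breaker so the heap never compares nodes
    heap = [(root.min_dist(q), next(counter), root)]
    found = []
    while heap and len(found) < k:
        _, _, item = heapq.heappop(heap)
        if isinstance(item, Node):
            # A node reached the top: replace it by its sub-nodes and points.
            for child in item.children or []:
                heapq.heappush(heap, (child.min_dist(q), next(counter), child))
            for p in item.points:
                heapq.heappush(heap, (math.dist(p, q), next(counter), p))
        else:
            found.append(item)   # a point reached the top: it is the next closest
    return found
```

Because a node's key is a lower bound on the distance to anything stored beneath it, a point that reaches the top of the heap cannot be beaten by any point still hidden inside an unexpanded node, which is what makes the result exact rather than approximate.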
Yes, you can find the closest coordinate in a quad-tree even when it is not inside the same leaf. To do that, you can use the following search algorithm:
1. Search for the closest position inside the leaf of the quad-tree that contains your point.
2. Take its distance from your initial position.
3. Starting from the root node, search for all the nodes inside the bounding box defined by that distance.
4. Return the closest node among all the nodes inside this bounding box.
However, this is a very basic algorithm with no performance optimizations. Among other things:
If the distance calculated in step 2 is less than the distance to the border of the tree node, then you don't need to do steps 3 and 4 (or you can start from a node that is not the root node).
Also, steps 3 and 4 could be merged into a single search that only descends into the tree using the distance to the closest node found so far as the bounding box.
You could also order the search inside the bounding box so that the nodes closest to your position are visited first.
I have not done a complexity analysis, but you should expect a worst case on some inputs that is as bad as, if not worse than, the plain search; in general, though, you should get a pretty decent speed-up while remaining error free.

Does the removal of a few edges remove all paths to a node?

I'm making a game engine for a board game called Blockade and right now I'm trying to generate all legal moves in a position. The rules aren't exactly the same as the actual game and they don't really matter. The gist is: the board is a matrix and you move a pawn and place a wall every move.
In short, I have to find whether or not a valid path exists from every pawn to every goal after every potential legal move (imagine a pawn doesn't move and a wall is just placed), to rule out illegal moves. Or rather, if I simplify it to a subproblem, whether or not the removal of a few edges (placing a wall) removes all paths to a node.
Brute-forcing it would take O(k*n*m), where n and m are the board dimensions and k is the number of potential legal moves. Searching for a path (worst case: traversing most of the board) is very expensive, but I think it can be done faster with dynamic programming or some other idea/algorithm, since the position stays the same and only the wall placement changes. In graph terms, the graph stays the same and only the set of removed edges changes. Any sort of optimization is welcome.
Edit:
To elaborate on the wall (blockade). A wall is two squares wide/tall (depending on whether it's horizontal or vertical) therefore it will usually remove at least four edges, eg:
p | r
q | t
In this 2x2 matrix, placing a wall in the middle (as shown) will remove jumping from and to:
p and t, q and r, p and r, and q and t
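For reference, the brute-force check described in the question can be sketched as a breadth-first search over a grid with a set of blocked edges. The function names and the wall encoding (top-left cell of the 2x2 block plus an orientation) are assumptions for illustration, and only orthogonal steps are modelled; diagonal jumps such as "p and t" would need extra entries in the blocked set.

```python
from collections import deque

def wall_edges(r, c, vertical=True):
    """Edges blocked by a wall through the middle of the 2x2 block with top-left cell (r, c)."""
    if vertical:   # wall between column c and column c + 1
        pairs = [((r, c), (r, c + 1)), ((r + 1, c), (r + 1, c + 1))]
    else:          # wall between row r and row r + 1
        pairs = [((r, c), (r + 1, c)), ((r, c + 1), (r + 1, c + 1))]
    return {frozenset(p) for p in pairs}

def reachable(rows, cols, start, goals, blocked):
    """Breadth-first search; True if any goal cell can be reached from start."""
    goals = set(goals)
    seen = {start}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell in goals:
            return True
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nxt[0] < rows and 0 <= nxt[1] < cols
                    and nxt not in seen
                    and frozenset((cell, nxt)) not in blocked):
                seen.add(nxt)
                queue.append(nxt)
    return False

def legal_wall(rows, cols, pawns, existing, wall):
    """pawns is a list of (start, goals); a wall is legal if every pawn keeps a path."""
    blocked = existing | wall
    return all(reachable(rows, cols, s, g, blocked) for s, g in pawns)
```

Each call to `reachable` is O(n*m), so testing k candidate walls this way gives the O(k*n*m) bound mentioned above; it is the baseline that any incremental approach has to beat.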
I apologize ahead of time if I don't fully understand your question as it is asked; there seems to be some tacit contextual knowledge you are hinting at in your question with respect to knowledge about how the blockade game works (which I am completely unfamiliar with.)
However, based on a quick scan of Wikipedia's description of the rules of the game, and from what I gather from your question, my understanding is that you are effectively asking how to ensure that a move is legal. As I understand it, an illegal move is a wall/blockade placement that would make it impossible for some pawn to reach its goal state.
In this case, I believe a workable solution that would be fairly efficient would be as follows.
Define a path tree of a pawn to be a (possibly but not necessarily shortest) path tree from the pawn to each reachable position. The idea is, you want to maintain a path tree for every pawn so that it can be updated efficiently with every blockade placement. What is described in the previous sentence can be accomplished by observing and implementing the following:
when a blockade is placed it removes 2 edges from the graph, which can sever at most two edges in each of your existing path trees
each pawn's path tree can be efficiently recomputed after edges are severed using the "adoption" stage of the Boykov-Kolmogorov maxflow algorithm
once every pawn's path tree has been recomputed, simply check that each pawn can still reach its goal state; if not, mark the move as illegal
repeat for each possible move (resetting the graphs as needed during the search)
Here are resources on the adoption algorithm that is critical to doing what is described efficiently:
open-source implementation as part of the BK-maxflow: https://www.boost.org/doc/libs/1_66_0/libs/graph/doc/boykov_kolmogorov_max_flow.html
implementation by authors as part of BK-maxflow: https://pub.ist.ac.at/~vnk/software.html
detailed description of adoption (stage) algorithm of BK maxflow algorithm: section 3.2.3 of https://www.csd.uwo.ca/~yboykov/Papers/pami04.pdf
Note that reading the description of the adoption algorithm in the last bullet point above is the most critical part of understanding how to adopt orphaned portions of your path tree efficiently.
In terms of efficiency, I believe you should expect on average O(1) operations for each adopted edge, meaning this approach should take about O(k) time, where k is the number of board states you wish to check.
Note, the pawn path tree should actually be a reverse directed tree rooted at the goal nodes, which will allow the computation to be done for all legal pawn placements given a blockade configuration.
A few suggestions:
To check if there's a path from A to B after ever
Every move removes a node from the graph/grid. So what we want to know is whether there are critical nodes on the path from A to B (single points that could be blocked to break the path). This is a classic flow problem. For this application you want to set the vertex capacity to 1 and push 2 units of flow (basically just to verify that there are at least 2 paths). If there are 2 paths, no single block can disconnect you from the destination. You can optimize it a bit by using an implicit graph, but if you're new to this, it may help to build the graph explicitly so you can visualize it. This should be O(N*M), the size of your grid.
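A minimal sketch of the flow check described above, assuming a grid where 1 marks a free cell and 0 a blocked one. Vertex capacities are modelled by the usual node-splitting trick (each cell becomes an "in" node and an "out" node joined by a capacity-1 edge), and augmenting paths are found with breadth-first search. The function name `disjoint_paths` and the grid encoding are illustrative assumptions.

```python
from collections import deque, defaultdict

def disjoint_paths(grid, src, dst, limit=2):
    """Count up to `limit` vertex-disjoint paths between src and dst on a grid."""
    rows, cols = len(grid), len(grid[0])
    cap = defaultdict(int)   # residual capacities
    adj = defaultdict(set)

    def add_edge(u, v, c):
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)        # reverse edge for the residual graph

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1:
                continue
            cell = (r, c)
            # Only the end points may be shared by both paths.
            add_edge((cell, "in"), (cell, "out"), limit if cell in (src, dst) else 1)
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                    add_edge((cell, "out"), ((nr, nc), "in"), 1)

    source, sink = (src, "in"), (dst, "out")
    flow = 0
    while flow < limit:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        v = sink
        while parent[v] is not None:   # push one unit along the path found
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1
    return flow
```

A result of 2 means no single blocked cell can cut the pawn off; a result of 1 means at least one critical cell exists, so a placement on it has to be checked explicitly.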
Optimizations
Since this is a game, you know that the setup doesn't change dramatically from one step to another. So, you can keep track of the two paths. If the blockade is not placed on either of the paths, you can ignore it: you already have 2 paths to the destination.
If the block does land on one of the paths, cancel only that path and then look for another (reusing the one you still have).
You can also speed up the pawn movement. This can be a bit tricky, but what you want is to move the source. I'm assuming the pawn moves only a few cells at a time, so instead of finding completely new paths, you can simply adjust the existing ones to connect to the new position, speeding up the update.

shortest path to surround a target in a weighted 2d array

I'm having some trouble finding the right approach to coding this.
Take a randomly generated 2D array, about 50x50, with each cell having a value of 1~99.
Starting at a random position "Green", the goal is to surround the target "Red" with the lowest number of actions.
Moving to a neighboring cell takes 1~99 actions depending on its value.
Example of a small array with low values: (image not included)
Currently the best idea I have is to generate 4 sets of checkpoints based on the diagonals of the target and then use a lot of Dijkstra runs to find a path that goes through all of them, as well as the starting point.
One problem I have is that this very quickly becomes an extreme number of paths.
FROM any starting point "NorthWest-1 to NW-20" TO any ending point in "NE-1 to NE-20" is 400 possibilities. Adding the 3rd and 4th diagonals makes that 400 * 20 * 20.
Another problem with using diagonal checkpoints is that the problem is not "shortest path from green to a diagonal" (the orange path in the image, not included here) but rather "shortest path from green to any point on the path around red".
Current pseudocode:
take 2 sets of diagonals nearest to Green/start
find the shortest path that connects those diagonals while going through Green
(backtracking through the path is free)
draw a line starting from the target point, in-between the 2 connected diagonals,
set those cells to value infinite to force going around them (and thus around the target)
find the shortest path connecting the now-separated diagonals
Unfortunately this pseudocode already includes some edge cases where the 'wall' blocks the most efficient path.
If relevant, this will be written in javascript.
Edit: as an edge case, the path could spiral around the target before surrounding it, though this is extremely rare.
Edit2; "Surround" means disconnect the target from the rest of the field, regardless of how large the surrounded area is, or even if it includes the starting point (eg, all edges are 0)
Here is another larger field with (probably) optimal path, and 2 fields in text-form:
https://i.imgur.com/yMA14sS.png
https://pastebin.com/raw/YD0AG6YD
For short, let us call paths that surround the target fences. A valid fence is a set of (connected) nodes that makes the target disconnected from the start, but does not include the target. A minimal fence is one that does so while having a minimal cost. A lasso could be a fence that includes a path to the start node. The goal is to build a minimal-cost lasso.
A simple algorithm would be to use the immediate neighborhood of the target as a fence, and run Dijkstra to any of those fence-nodes to build a (probably non-optimal) lasso. Note that, if optimal answers are required, the choice of fence actually influences the choice of path from the start to the fence -- and vice-versa, the choice of path from start to fence can influence how the fence itself is chosen. The problem cannot be split neatly into two parts.
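The simple (probably non-optimal) lasso described above can be sketched as follows. The cost model is an assumption taken from the question: entering a cell costs its value, and cells already on the lasso are free to revisit. The fence is taken to be the ring of up to 8 cells around the target, and `simple_lasso` is a hypothetical name.

```python
import heapq

def simple_lasso(grid, start, target):
    """Return (cost, cells): the ring around target plus a cheapest path from start to it."""
    rows, cols = len(grid), len(grid[0])
    tr, tc = target
    ring = {(r, c) for r in range(tr - 1, tr + 2) for c in range(tc - 1, tc + 2)
            if (r, c) != target and 0 <= r < rows and 0 <= c < cols}

    # Dijkstra from start; the target cell itself may not be entered.
    dist = {start: 0}
    prev = {}
    heap = [(0, start)]
    reached = None
    while heap:
        d, cell = heapq.heappop(heap)
        if d > dist[cell]:
            continue              # stale heap entry
        if cell in ring:
            reached = cell        # first ring cell popped is the cheapest to reach
            break
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and nxt != target:
                nd = d + grid[nxt[0]][nxt[1]]
                if nd < dist.get(nxt, float("inf")):
                    dist[nxt] = nd
                    prev[nxt] = cell
                    heapq.heappush(heap, (nd, nxt))
    if reached is None:
        return None

    cells = set(ring)
    cell = reached
    while cell != start:          # add the approach path, excluding start
        cells.add(cell)
        cell = prev[cell]
    cost = sum(grid[r][c] for r, c in cells if (r, c) != start)
    return cost, cells
```

This gives an upper bound that any smarter fence construction should improve on; it ignores the interaction between fence choice and approach path that the paragraph above points out.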
I believe that the following algorithm will yield optimal fences:
Build a path using Dijkstra from start to target (not including the end-points). Let us call this the yellow path.
Build 2 sets of nodes, one on each side of this yellow path, and neighboring it. Call those sets red and blue. Note that, for any given node that neighbors the path, it can either be part of the path, blue set, red set, or is actually an end-point.
For each node in the red set, run Dijkstra to find the shortest path to a node in the blue set that does not cross the yellow path.
For each of those previous paths, see which is shortest after adding the (missing) yellow-path bit to connect the blue and red ends together.
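The running cost is roughly length(yellowPath) * cost_of_Dijkstra(redStart, anyBlue), since one Dijkstra search is run per red node and the number of red nodes grows with the length of the yellow path.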
To make a good lasso, it would be enough to run Dijkstra from the start to any fence node. However, I am unsure of whether the final lasso will be optimal or not.
You might want to consider the A* search algorithm instead; you can probably adjust the algorithm to search for all 4 spots at once.
https://en.wikipedia.org/wiki/A*_search_algorithm
Basically, A* extends Dijkstra's algorithm by focusing its search on spots that are "closer" to the destination.
There are a number of other search algorithm variations in the "See also" section that may be more useful for your situation, though some of them are more suited to video game path planning than to 2D grid paths.
Edit after reviewing question again:
It seems each spot has a weight. This makes the distance calculation a bit less straightforward. In this case, I would treat it as an optimization. For the heuristic cost function, it may be best to just use the most direct path (diagonal) to the goal as the heuristic cost, and then use A* search to try to find an even better path.
As for the surround logic, I would treat that as its own logic and a separate step (likely the second step). Find the least-cost path to the target first. Then find the cheapest way to surround the target. Honestly, the cheapest way to surround a point is probably worth its own question.
Once you have both parts, it should be easy enough to merge the two. There will be some point where the two first overlap and that is where they are merged together.

Shortest path in a maze

I'm developing a game similar to Pacman: consider this maze:
Each white square is a node from the maze where an object located at P, say X, is moving towards node A in the right-to-left direction. X cannot switch to its opposite direction unless it encounters a dead-end such as A. Thus the shortest path joining P and B goes through A because X cannot reverse its direction towards the rightmost-bottom node (call it C). A common A* algorithm would output:
to get to B from P first go rightward, then go upward;
which is wrong. So I thought: well, I can set C's visited attribute to true before running A* and let the algorithm find the path. Obviously this method doesn't work for the linked maze, unless I allow it to rediscover some nodes (the question is: which nodes? How do I tell them apart from useless nodes?). The first thought that crossed my mind was: use the previous method, always keeping track of the last-visited cell; if the resulting path isn't empty, you are done. Otherwise, A* has failed, so go to the last-visited dead-end, say Y, and from there use standard A* to get to the goal (I'm assuming the maze is connected). My questions are: is this guaranteed to always work? Is there a more efficient algorithm, such as an A*-derived algorithm modified for this purpose? How would you tackle this problem? I would greatly appreciate an answer explaining both optimal and non-optimal search techniques (I don't actually need the shortest path; a slightly longer path is fine, but I'm curious whether an optimal algorithm that runs as efficiently as Dijkstra's algorithm exists; if it does, what is its running time compared to a non-optimal algorithm?)
EDIT For Valdo: I added 3 cells in order to generalize a bit: please tell me if I got the idea:
Good question. I can suggest the following approach.
Use Dijkstra (or A*) algorithm on a directed graph. Each cell in your maze should be represented by multiple (up to 4) graph nodes, each node denoting the visited cell in a specific state.
That is, in your example you may be in the cell denoted by P in one of 2 states: while going left, and while going right. Each of them is represented by a separate graph node (though spatially it's the same cell). There's also no direct link between those 2 nodes, since you can't switch your direction in this specific cell.
According to your rules you may only switch direction when you encounter an obstacle, this is where you put links between the nodes denoting the same cell in different states.
You may also think of your graph as your maze copied into 4 layers, each layer representing one state of your pacman. In the layer that represents movement to the right you put only links to the right, respecting the geometry of your maze. In the cells with obstacles, where moving right is not possible, you put links to the same cell in the other layers.
Update:
Regarding the scenario that you described in your sketch. It's actually correct, you've got the idea right, but it looks complicated because you decided to put links between different cells AND states.
I suggest the following diagram:
The idea is to split your inter-cell AND inter-state links. There are now 2 kinds of edges: inter-cell, marked by blue, and inter-state, marked by red.
Blue edges always connect nodes of the same state (arrow direction) between adjacent cells, whereas red edges connect different states within the same cell.
According to your rules the state change is possible where the obstacle is encountered, hence every state node is the source of either blue edges if no obstacle, or red if it encounters an obstacle (i.e. can't emit a blue edge). Hence I also painted the state nodes in blue and red.
If according to your rules state transition happens instantly, without delay/penalty, then red edges have weight 0. Otherwise you may assign a non-zero weight for them, the weight ratio between red/blue edges should correspond to the time period ratio of turn/travel.
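The layered graph described above can be sketched as Dijkstra over (cell, heading) states. This follows the rule as stated in this answer (a heading may only change when the cell ahead is blocked); the grid encoding (1 = walkable), the `turn_cost` parameter and the function name are assumptions for illustration.

```python
import heapq

DIRS = {"R": (0, 1), "L": (0, -1), "U": (-1, 0), "D": (1, 0)}

def shortest_with_heading(grid, start, heading, goal, turn_cost=0):
    """Dijkstra over (cell, heading) states; returns the path cost or None."""
    rows, cols = len(grid), len(grid[0])

    def free(cell):
        r, c = cell
        return 0 <= r < rows and 0 <= c < cols and grid[r][c] == 1

    dist = {(start, heading): 0}
    heap = [(0, start, heading)]
    while heap:
        d, cell, h = heapq.heappop(heap)
        if cell == goal:
            return d
        if d > dist[(cell, h)]:
            continue              # stale heap entry
        ahead = (cell[0] + DIRS[h][0], cell[1] + DIRS[h][1])
        if free(ahead):
            # Inter-cell ("blue") edge: keep the heading and move one cell.
            edges = [((ahead, h), 1)]
        else:
            # Inter-state ("red") edges: blocked ahead, so another heading may be taken.
            edges = [((cell, other), turn_cost) for other in DIRS if other != h]
        for state, w in edges:
            if d + w < dist.get(state, float("inf")):
                dist[state] = d + w
                heapq.heappush(heap, (d + w, state[0], state[1]))
    return None
```

In a four-cell corridor, starting in the third cell while heading left, the cheapest route to the rightmost cell costs 5 moves (two to the dead end, three back), whereas starting with heading right costs 1. If your rules also allow turning at junctions, only the branch that builds `edges` needs to change.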

How to find the neighbors of a graph efficiently

I have a program that creates graphs as shown below.
The algorithm starts at the green node and traverses the graph. Assume that a node (a linked-list-style node with 4 references: Left, Right, Up and Down) has been added to the graph at the position of the red dot in the image. In order to integrate the newly created node with its neighbors, I need to find the four neighboring objects and link them so that the graph connectivity is preserved.
The following is what I need to clarify:
Assume that all yellow nodes are null and that I do not keep another data structure to map nodes. What is the most efficient way to find which neighbors of the newly created node exist? I know the basic graph search algorithms like DFS and BFS, and shortest path algorithms, but I do not think any of these are efficient enough, because the graph can have about 10000 nodes, and running a graph search (starting from the green node) to find the neighbors every time a new node is added seems computationally expensive to me.
If the graph search is not avoidable, what is the best alternative structure? I thought of a large multi-dimensional array. However, this wastes memory and also has the issue of not supporting negative indexes, since the graph in the image can grow in any direction. My solution to this is to write a separate class that wraps an array-based data structure to emulate negative indexes. However, before taking this option I would like to know if I could still solve the problem without resorting to a new structure and save a lot of rework.
Thank you for any feedback and reading this question.
I'm not sure if I understand you correctly. Do you want to
Check that there is a path from (0,0) to (x1,y1)
or
Check if any of the neighbors of (x1,y1) are in the graph? (even if there is no path from (0,0) to any of this neighbors).
I assume that you are looking for a path (otherwise you won't use a linked-list), which implies that you can't store points which have no path to (0,0).
Also, you mentioned that you don't want to use any other data structure beside / instead of your 2D linked-list.
You can't avoid full graph search. BFS and DFS are the classic algorithms. I don't think that you care about the shortest path - any path would do.
Other approaches you may consider are A* (simple explanation here) or one of its variants (look here).
An alternative data structure would be a set of nodes (each node being a pair < x,y >, of course). You can easily run 4 checks to see whether any of a node's neighbors are already in the set. It would take O(n) space and O(log n) time for both check and add. If your programming language does not support pairs as elements of a set, you can use a single integer (x*(Ymax+1) + y) instead.
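A minimal sketch of the set-of-pairs idea, with a hypothetical helper name. Python's built-in set is hash-based, so membership checks are expected O(1) rather than the O(log n) of a tree-based set; the direction names follow the question's Left/Right/Up/Down references, with "Up" assumed to mean increasing y.

```python
def existing_neighbors(nodes, x, y):
    """Return which of the four neighbours of (x, y) are already in the graph."""
    offsets = {"Left": (-1, 0), "Right": (1, 0), "Up": (0, 1), "Down": (0, -1)}
    return {name: (x + dx, y + dy)
            for name, (dx, dy) in offsets.items()
            if (x + dx, y + dy) in nodes}

nodes = {(0, 0), (1, 0), (0, -1)}          # coordinates of the nodes added so far
found = existing_neighbors(nodes, 1, -1)   # neighbours to link the new node to
nodes.add((1, -1))                         # then register the new node itself
```

Negative coordinates need no special handling here. Using a dict that maps each coordinate pair to its node object, instead of a plain set, would give direct access to the objects whose Left/Right/Up/Down references need updating.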
Your data structure can be made to work, but probably not efficiently. And it will be a lot of work.
With your current data structure you can use an A* search (see https://en.wikipedia.org/wiki/A*_search_algorithm for a basic description) to find a path to the point, which necessarily finds a neighbor. Then pretend that you've got a little guy at that point, put his right hand on the wall, then have him find his way clockwise around the point. When he gets back, he'll have found the rest.
What do I mean by finding his way clockwise? For example, suppose that you go Down from the neighbor to get to the point. Then your guy should face the first of Right, Up, and Left in which he has a neighbor. If he can go Right, he will; then he will try the directions Down, Right, Up, and Left, in that order. (Just imagine trying to walk through the maze yourself with your right hand on the wall.)
This way lies insanity.
Here are two alternative data structures that are much easier to work with.
You can use a quadtree. See http://en.wikipedia.org/wiki/Quadtree for a description. With this inserting a node is logarithmic in time. Finding neighbors is also logarithmic. And you're only using space for the data you have, so even if your graph is very spread out this is memory efficient.
Alternatively, you can create a class for a type of array that takes both positive and negative indices, and then build on that a 2-D class that takes both positive and negative indices. Under the hood, that class would be implemented as a regular array and an offset, giving an array that can start at any number, positive or negative. If you ever try to insert a piece of data that is before the offset, you create a new offset that is below that piece by a fixed fraction of the length of the array, create a new array, and copy the data from the old to the new. Inserting and finding neighbors are then usually O(1), but it can be very wasteful of memory.
You can use a spatial index like a quadtree or an R-tree.
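A rough one-dimensional sketch of the "regular array plus offset" idea. The class name `OffsetArray` and the growth policy (pad by at least the current length, which roughly doubles the storage) are assumptions; a 2-D version could nest one such array inside another.

```python
class OffsetArray:
    """List-like container that accepts negative as well as positive indices."""

    def __init__(self):
        self.data = [None]
        self.offset = 0          # logical index stored at data[0]

    def _grow_to(self, i):
        if i < self.offset:
            # Extend downwards: pad in front and move the offset.
            pad = max(self.offset - i, len(self.data))
            self.data = [None] * pad + self.data
            self.offset -= pad
        elif i >= self.offset + len(self.data):
            # Extend upwards: pad at the end.
            pad = max(i - (self.offset + len(self.data)) + 1, len(self.data))
            self.data.extend([None] * pad)

    def __setitem__(self, i, value):
        self._grow_to(i)
        self.data[i - self.offset] = value

    def get(self, i):
        """Return the stored value, or None if nothing was stored at index i."""
        j = i - self.offset
        return self.data[j] if 0 <= j < len(self.data) else None
```

Lookups never allocate, so probing the four neighbours of a new node stays O(1); the memory cost is proportional to the span between the smallest and largest index used, not to the number of stored nodes.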

Resources