Find isolated groups of blocks in a grid - algorithm

I have a grid of "blocks" (in the form of a 2D array, could be 5*5, 17*17 or whatever) where I can add or remove blocks at will, except for the one at the center that should always remain there.
I can place blocks if they have a local neighbour : on their right/left/up/down (at least one of them).
By removing some blocks, it may leave other blocks isolated with no "connection" to the center-block, and I want to avoid this.
I'm looking for a quick solution to check if all my blocks have a connection to the center, the simplest possible (in terms of coding, I can accept to have a non optimal solution since this is supposed to be on executed on very small data and not so often). The first thing that came to my mind was to implement this as a path search but that seems overkill.
I'm using C++ but that should not make any difference.

You need to find the connected components using DFS/BFS.Construct the initial graph and as you add new blocks, you can add new edges, or when you remove blocks you can remove edges.When you remove a block, temporarily delete those edges in the graph and check if it causes two pieces of the graph to disconnect.This is simple, carry out DFS again.If it does not disconnect you can remove that block.
DFS is only about 8 lines to implement, and for small data sets this is elegant.

Related

Fill all connected grid squares of the same type

Foreword: I am aware there is another question like this, however mine has very specific restrictions. I have done my best to make this question applicable to many, as it is a generic grid issue, but if it still does not belong here, then I am sorry, and please be nice about it. I have found in the past stackoverflow to be a very picky and hostile environment to question askers, but I'm hoping that was just a bad couple people.
Goal(abstract): Check all connected grid squares in a 3D grid that are of the same type and touching on one face.
Goal(specific/implementation): Create a "fill bucket" tool in Minecraft with command blocks.
Knowledge of Minecraft not really necessary to answer, this is more of an algorithm question, and I will be staying away from Minecraft specifics.
Restrictions: I can do this in code with recursive functions, but in Minecraft there are some limitations I am wondering if are possible to get around. 1: no arrays(data structure) permitted. In Minecraft I can store an integer variable and do basic calculations with it (+,-,*,/,%(mod),=,==), but that's it. I cannot dynamically create variables or have the program create anything with a name that I did not set out ahead of time. I can do "IF" and "OR" statements, and everything that derives from them. I CANNOT have multiple program pointers - that is, I can't have things like recursive functions, which require a program to stop executing, execute itself from beginning to end, and then resume executing where it was - I have minimal control over the program flow. I can use loops and conditional exits (so FOR loops). I can have a marker on the grid in 3D space that can move regardless of the presence of blocks (I'm using an armour stand, for those who know), and I can test grid squares relative to that marker.
So say my grid is full of empty spaces only. There are separate clusters of filled squares in opposite corners, not touching each other. If I "use" my fillbucket tool on one block / filled grid square, I want it to use a single marker to check and identify all the connected grid squares - basically, I need to be sure that it traverses the entire shape, all the nooks and crannies, but not the squares that are not connected to that shape. So in the end, one of the two clusters, from me only selecting a single square of it, will be erased/replaced by another kind of block, without affecting the other blocks around it.
Again, apologies if this doesn't belong here. And only answer this if you WANT to tackle the challenge - it's not important or anything, I just want to do this. You don't have to answer it if you don't want to. Or if you can solve this problem for a 2D grid, that would be helpful as well, as I could possibly extend that to work for 3D.
Thank you, and if I get nobody degrading me for how I wrote this post or the fact that I did, then I will consider this a success :)
With help from this and other sources, I figured it out! It turns out that, since all recursive functions (or at least most of them) can be written as FOR loops, that I can make a recursive function in Minecraft. So I did, and the general idea of it is as follows:
For explaining the program, you may assuming the situation is a largely empty grid with a grouping of filled squares in one part of it, and the goal is to replace the kind of block that that grouping is made of with a different block. We'll say the grouping currently consists of red blocks, and we want to change them to blue blocks.
Initialization:
IDs - A objective (data structure) for holding each marker's ID (score)
numIDs - An integer variable for holding number of IDs/markers active
Create one marker at selected grid position with ID [1] (aka give it a score of 1 in the "IDs" objective). This grid position will be a filled square from which to start replacing blocks.
Increment numIDs
Main program:
FOR loop that goes from 1 to numIDs
{
at marker with ID [1], fill grid square with blue block
step 1. test block one to the +x for a red block
step 2. if found, create marker there with ID [numIDs]
step 3. increment numIDs
[//repeat steps 1 2 and 3 for the other five adjacent grid squares: +z, -x, -z, +y, and -y]
delete stand[1]
numIDs -= 1
subtract 1 from every marker's ID's, so that the next marker to evaluate, which was [2], now has ID [1].
} (end loop)
So that's what I came up with, and it works like a charm. Sorry if my explanation is hard to understand, I'm trying to explain in a way that might make sense to both coders and Minecraft players, and maybe achieving neither :P

Out of core connected components algorithms

I have 4,000,000,000 (four billion) edges for an undirected graph. They are represented in a large text file as pairs of node ids. I would like to compute the connected components of this graph. Unfortunately, once you load in the node ids with the edges into memory this takes more than the 128GB of RAM I have available.
Is there an out of core algorithm for finding connected components that is relatively simple to implement? Or even better, can be it cobbled together with Unix command tools and existing (python) libraries?
Based on the description of the problem you've provided and the answers you provided in the comments, I think the easiest way to do this might be to use an approach like the one #dreamzor described. Here's a more fleshed-out version of that answer.
The basic idea is to convert the data to a more compressed format that fits into memory, to run a regular connected components algorithm on that data, then to decompress it. Notice that if you assign each node a 32-bit numeric ID, then the total space required to store all the nodes is at most the space for four billion nodes and eight billion edges (assuming you store two copies of each edge), which is space for twelve billion 32-bit integers, only around 48GB of space, below your memory threshold.
To start off, write a script that reads in the edges file, assigns a numeric ID to each node (perhaps sequentially in the order in which they appear). Have this script write this mapping to a file and, as it goes, write a new edges file that uses the numeric IDs of the nodes rather than the string names. When you're done, you'll have a names file mapping IDs to names and an edges file that takes up much less space than before. You mentioned in the comments that you can fit all the node names into memory, so this step should be very reasonable. Note that you don't need to store all the edges in memory - you can stream them through the program - so that shouldn't be a bottleneck.
Next, write a program that reads the edges file - but not the names file - into memory and finds connected components using any reasonable algorithm (BFS or DFS would be great here). If you're careful with your memory (using something like C or C++ here would be a good call), this should fit comfortably into main memory. When you're done, write out all the clusters to an external file by numeric ID. You now have a list of all the CCs by ID.
Finally, write a program that reads in the ID to node mapping from the names file, then streams in the cluster IDs and writes out the names of all the nodes in each cluster to a final file.
This approach should be relatively straightforward to implement because the key idea is to keep the existing algorithms you're used to but just change the representation of the graph to be more memory efficient. I've used approaches like this before in the past when dealing with huge graphs (Wikipedia) and it's worked beautifully even on systems with less memory than yours.
You can hold only an array of vertices as their "color" (an int value), then run through the file without loading the entire set of links, marking vertices with a color, a new one if neither vertice is colored, the same color if one is colored and the other isn't, and lowest of two colors, together with repainting all the other vertices in the array that are painted with the highest color if both are colored. A pseudocode example:
int nextColor=1;
int merges=0;
int[] vertices;
while (!file.eof()) {
link=file.readLink();
c1=vertices[link.a];
c2=vertices[link.b];
if ((c1==0)&&(c2==0)) {
vertices[link.a]=nextColor;
vertices[link.b]=nextColor;
nextColor++;
} else if ((c1!=0)&&(c2!=0)) {
// both colored, merge
for (i=vertices.length-1;i>=0;i--) if (vertices[i]==c2) vertices[i]=c1;
merges++;
} else if (c1==0) vertices[link.a]=c2; // only c1 is 0
else vertices[link.b]=c1; // only c2 is 0
}
In case you choose the smaller than 32-bit type for storing color of a vertex, you might need to first check if nextColor is maxed, have an array of colors unused (released in merge), and skip coloring a new set of two vertices if no color can be used, then re-run the file reading process if both the colors are all used and any mergings occur.
UPDATE: Since the vertices aren't really ints but strings instead, you should also have a map of string to int while parsing that file. If your strings are limited by length, you can probably fit them all into memory as a hash table, but I'd pre-process the file by creating another file that would have all strings "s1" replaced with "1", "s2" with "2", etc, where "s1", "s2" are whatever names appear as vertices in the file, so that the data will be compacted to a list of pairs of ints. In case you'll be processing similar data later (that is, your graph isn't changing much, and contains largely the same names of vertices, store the "metadata" file with links from names to ints to ease further pre-processings.

Faster access to nodes in linked list

Access to nodes in linked lists can get pretty slow if the list gets large. I did think of a way to speed up the access: there is an array (also a LL) with short cuts to every 100th node. This way if I want to get the 205th element, the program will have to go through this "path": short cut to [100] -> short cut to [200] -> [201] -> ... -> [205]. This is much faster that going through the whole LL to the 205th element- 5 "steps", instead of 204. Yes, it gets slower if I want the n-hundred-and-99-th element, but the program will skip a large part of the LL to get there- faster in the long run.
But those short cuts require readjusting after adding and removing more elements. Removing isn't a real problem- remove an element and set certain short cuts to point to the next nodes- those cuts that point at the formal n-hundredth nodes. Adding more data is a problem- when adding a new element, certain nodes must be set to point to the previous nodes. In order to get to these elements, the program must go ALL the way trough the list, starting from the last short cut that still points at an n-hundredth element. Unless the nodes also point to the previous elements, the whole process can get as slow as if I am removing an element from a vector.
Is there a way to speed up the access, keeping the processes for adding and removing elements fairly fast? This is just a question of curiosity, not if it is a good idea to use it in a real program.
You're using the wrong data structure. Linked lists are best used for sequentially-accessed lists, not for randomly-accessed collections. For that, you're better off with a hash table of some sort.

Need a Ruby way to determine the elements of a matrix "touching" another element

I think I need a method called “Touching” (as in contiguous, not emotional.)
I need to identify those elements of a matrix that are next to an individual element or set of elements. At least that’s the way I’ve thought of to solve the problem at hand.
The matrix State in the program below represents, let’s say, some underwater topography. As I lower the water, eventually the highest point will stick out and become an “island”. When the “water level” is at 34 then the element State[2,3] is the single point of the island. The array atlantis holds the coordinates of that single point .
As we lower the water level further, additional points will be “above water.” Additional contiguous points will become part of the island and their coordinates would be added to the array atlantis. (For example, the next piece of land to be part of atlantis would be State[3,4] at 31.)
My thought about how to do this is to identify all the matrix elements that touch/are next to the element in the atlantis, find the one with the highest elevation and then add it to the array atlantis. Looking for the elements next to a single element is a challenge in itself, but we could write some code to examine the set [i,j-1], [i,j+1], [i-1,j-1], [i-1,j], [i-1,j+1], [i+1,j-1], [i+1,J], [i+1,j+1]. (I think I got that right.)
But as we add additional points, the task of determining which points surround the points in atlantis becomes increasingly difficult. So that’s my question: can anyone think of any mechanism by which to do this? Any kind of simplified algorithm using capabilities of ruby of which I am unaware? (which include all but the most basic.) If such a method could be written then I could write atlantis.touching and get an array, for example, containing all the coordinates of all the points presently contiguous to atlantis.
At least that’s how I’m thinking this could be done. Any other ideas would be welcome. And if anyone knows any kind of partnering site where I could seek others who might be interested in working with me on this, that would be great.
# create State database using matrix
require 'matrix'
State=Matrix[ [3,1,4,4,6,2,8,12,8,2],
[6,2,4,13,25,21,11,22,9,3,],
[6,20,27,34,22,14,12,11,2,5],
[6,28,17,23,31,18,11,9,18,12],
[9,18,11,13,8,9,10,14,24,11],
[3,9,7,16,9,12,28,24,29,21],
[5,8,4,7,17,14,19,30,33,4],
[7,17,23,9,5,9,22,21,12,21,],
[7,14,25,22,16,10,19,15,12,11],
[5,16,7,3,6,3,9,8,1,5] ]
#find sate elements contiguous to island
atlantis=[[2,3]]
find all state[i,j] "touching" atlantis
Only checking the points around the currently exposed area doesn't sound like it could cover every case - what if the next point to be exposed was the beginning of a new island?
I'd go about it like this: Have another array - let's call it sorted which contains your points sorted by height. Every time you raise the water level, pop all the elements higher than the new water level off sorted and onto atlantis.
In fact, there's no need for separate sorted and atlantis arrays if you do it this way. Just store the index of the highest point not above water, and you've essentially got two arrays in one - everything above water on one side, and everything below water on the other.
Hope that helps!

Mahjong - Arrange tiles to ensure at least one path to victory, regardless of layout

Regardless of the layout being used for the tiles, is there any good way to divvy out the tiles so that you can guarantee the user that, at the beginning of the game, there exists at least one path to completing the puzzle and winning the game?
Obviously, depending on the user's moves, they can cut themselves off from winning. I just want to be able to always tell the user that the puzzle is winnable if they play well.
If you randomly place tiles at the beginning of the game, it's possible that the user could make a few moves and not be able to do any more. The knowledge that a puzzle is at least solvable should make it more fun to play.
Place all the tiles in reverse (ie layout out the board starting in the middle, working out)
To tease the player further, you could do it visibly but at very high speed.
Play the game in reverse.
Randomly lay out pieces pair by pair, in places where you could slide them into the heap. You'll need a way to know where you're allowed to place pieces in order to end up with a heap that matches some preset pattern, but you'd need that anyway.
I know this is an old question, but I came across this when solving the problem myself. None of the answers here are quite perfect, and several of them have complicated caveats or will break on pathological layouts. Here is my solution:
Solve the board (forward, not backward) with unmarked tiles. Remove two free tiles at a time. Push each pair you remove onto a "matched pair" stack. Often, this is all you need to do.
If you run into a dead end (numFreeTiles == 1), just reset your generator :) I have found I usually don't hit dead ends, and have so far have a max retry count of 3 for the 10-or-so layouts I have tried. Once I hit 8 retries, I give up and just randomly assign the rest of the tiles. This allows me to use the same generator for both setting up the board, and the shuffle feature, even if the player screwed up and made a 100% unsolvable state.
Another solution when you hit a dead end is to back out (pop off the stack, replacing tiles on the board) until you can take a different path. Take a different path by making sure you match pairs that will remove the original blocking tile.
Unfortunately, depending on the board, this may loop forever. If you end up removing a pair that resembles a "no outlet" road, where all subsequent "roads" are a dead end, and there are multiple dead ends, your algorithm will never complete. I don't know if it is possible to design a board where this would be the case, but if so, there is still a solution.
To solve that bigger problem, treat each possible board state as a node in a DAG, with each selected pair being an edge on that graph. Do a random traversal, until you find a leaf node at depth 72. Keep track of your traversal history so that you never repeat a descent.
Since dead ends are more rare than first-try solutions in the layouts I have used, what immediately comes to mind is a hybrid solution. First try to solve it with minimal memory (store selected pairs on your stack). Once you've hit the first dead end, degrade to doing full marking/edge generation when visiting each node (lazy evaluation where possible).
I've done very little study of graph theory, though, so maybe there's a better solution to the DAG random traversal/search problem :)
Edit: You actually could use any of my solutions w/ generating the board in reverse, ala the Oct 13th 2008 post. You still have the same caveats, because you can still end up with dead ends. Generating a board in reverse has more complicated rules, though. E.g, you are guaranteed to fail your setup if you don't start at least SOME of your rows w/ the first piece in the middle, such as in a layout w/ 1 long row. Picking a completely random (legal) first move in a forward-solving generator is more likely to lead to a solvable board.
The only thing I've been able to come up with is to place the tiles down in matching pairs as kind of a reverse Mahjong Solitaire game. So, at any point during the tile placement, the board should look like it's in the middle of a real game (ie no tiles floating 3 layers up above other tiles).
If the tiles are place in matching pairs in a reverse game, it should always result in at least one forward path to solve the game.
I'd love to hear other ideas.
I believe the best answer has already been pushed up: creating a set by solving it "in reverse" - i.e. starting with a blank board, then adding a pair somewhere, add another pair in a solvable position, and so on...
If you a prefer "Big Bang" approach (generating the whole set randomly at the beginning), are a very macho developer or just feel masochistic today, you could represent all the pairs you can take out from the given set and how they depend on each other via a directed graph.
From there, you'd only have to get the transitive closure of that set and determine if there's at least one path from at least one of the initial legal pairs that leads to the desired end (no tile pairs left).
Implementing this solution is left as an exercise to the reader :D
Here are rules i used in my implementation.
When buildingheap, for each fret in a pair separately, find a cells (places), which are:
has all cells at lower levels already filled
place for second fret does not block first, considering if first fret already put onboard
both places are "at edges" of already built heap:
EITHER has at least one neighbour at left or right side
OR it is first fret in a row (all cells at right and left are recursively free)
These rules does not guarantee a build will always successful - it sometimes leave last 2 free cells self-blocking, and build should be retried (or at least last few frets)
In practice, "turtle" built in no more then 6 retries.
Most of existed games seems to restrict putting first ("first on row") frets somewhere in a middle. This come up with more convenient configurations, when there are no frets at edges of very long rows, staying up until last player moves. However, "middle" is different for different configurations.
Good luck :)
P.S.
If you've found algo that build solvable heap in one turn - please let me know.
You have 144 tiles in the game, each of the 144 tiles has a block list..
(top tile on stack has an empty block list)
All valid moves require that their "current__vertical_Block_list" be empty.. this can be a 144x144 matrix so 20k of memory plus a LEFT and RIGHT block list, also 20 k each.
Generate a valid move table from (remaning_tiles) AND ((empty CURRENT VERTICAL BLOCK LIST) and ((empty CURRENT LEFT BLOCK LIST) OR (empty CURRENT RIGHT BLOCK LIST)))
Pick 2 random tiles from the valid move table, record them
Update the (current tables Vert, left and right), record the Tiles removed to a stack
Now we have a list of moves that constitute a valid game. Assign matching tile types to each of the 72 moves.
for challenging games, track when each tile becomes available. find sets that have are (early early early late) and (late late late early) since it's blank, you find 1 EE 1 LL and 2 LE blocks.. of the 2 LE block, find an EARLY that blocks ANY other EARLY that (except rightblocking a left side piece)
Once youve got a valid game play around with the ordering.
Solitaire? Just a guess, but I would assume that your computer would need to beat the game(or close to it) to determine this.
Another option might be to have several preset layouts(that allow winning, mixed in with your current level.
To some degree you could try making sure that one of the 4 tiles is no more than X layers below another X.
Most games I see have the shuffle command for when someone gets stuck.
I would try a mix of things and see what works best.

Resources