Testing grid passability - algorithm

Consider this problem:
There's a square grid defined, each tile being either passable (1) or impassable (0).
At first, we have a simply connected passable space in the grid, surrounded by an impassable border.
We then start placing impassable obstacles of various dimensions (e.g. 1x1, 2x2,..) into the passable space. After each obstacle is placed, we need to test whether the remaining passable space is still connected (i.e. make sure we didn't split the passable space in two or more disconnected spaces). Tiles are connected diagonally, too.
The point is that after every obstacle placement, every remaining passable tile has a path that connects it to EVERY other remaining passable tile.
I'm aware of the possibility of searching for paths between possibly disconnected points, but I'm afraid that might be too inefficient. What I'm interested in is doing this testing as fast as possible.
Thanks for any help!

Implement a flood fill algorithm. As a side effect of performing the fill, count the number of squares filled. After placing your obstacles perform another flood fill starting from any open square and compare the number of filled squares to the original number minus the number of squares placed as obstacles. If they are not the same, you have disconnected regions.
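A minimal sketch of this check in Python, assuming the grid is a list of lists with 1 = passable, 0 = impassable, and 8-way connectivity as the question describes:

```python
from collections import deque

def count_reachable(grid, start):
    """Flood fill: count passable tiles (value 1) reachable from `start`,
    using 8-way connectivity (tiles connect diagonally too)."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and grid[nr][nc] == 1 and (nr, nc) not in seen):
                    seen.add((nr, nc))
                    queue.append((nr, nc))
    return len(seen)

def still_connected(grid, count_before, obstacle_size):
    """After placing an obstacle covering `obstacle_size` tiles,
    re-flood from any open tile and compare the counts."""
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if v == 1:
                return count_reachable(grid, (r, c)) == count_before - obstacle_size
    return True  # no passable tiles left at all
```

Do the first fill once to get the baseline count, then call `still_connected` after each obstacle placement.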

Wikipedia says that this can be done in amortized O(|V|) time using disjoint-set data structures, where V is the number of elements in the passable space (the second paragraph of that section). The citation is to this paper.
This is the same asymptotic complexity as Benjamin's answer and is presumably harder to implement, so I'd go with that. :)

Related

Solving the sliding puzzle-like problem with arbitrary number of holes

I've tried searching for a while, but haven't come across a solution, so figured I would ask my own.
Consider an MxM 2D grid of holes, and a set of N balls which are randomly placed in the grid. You are given some final configuration of the N balls in the grid, and your goal is to move the balls in the grid to achieve this final configuration in the shortest time possible.
The only move you are allowed to make is to move any contiguous subsection of the grid (on either a row or a column) by one space. That sounds a bit confusing; basically you can select any set of points in a straight line in the grid, and shift all the balls in that subsection by one spot to the left or right if it is a row, or one spot up or down if it is a column. If that is confusing, it's fine to consider the alternate problem where the only move you can make is to move a single ball to any adjacent spot. The caveat is that two balls can never overlap.
Ultimately this problem basically boils down to a version of the classic sliding tile puzzle, with two key differences: 1) there can be an arbitrary number of holes, and 2) we don't know the numbering of the tiles a priori - we don't care which balls end up in which holes, we just want the final holes to be filled when all is said and done.
I'm looking for suggestions about how to go about adapting classic sliding puzzle solutions to these two constraints. The arbitrary number of holes is likely pretty easy to implement efficiently, but the fact that we don't know which balls are destined to go in which holes at the start is throwing me for a loop. Any advice (or implementations of similar problems) would be greatly appreciated.
If I understood well:
all the balls are equal and cannot be distinguished - they can occupy any position on the grid, the starting state is a random configuration of balls and holes on the grid.
there are nxn = balls + holes = number of cells in the grid
your target is to reach a given configuration.
It seems a rather trivial problem, so maybe I missed some constraints. If this is indeed the problem, solving it can be approached like this:
Consider that you move the holes, not the balls.
conduct a search between each hole and each hole position in the target configuration.
Minimize the number of steps needed to walk the holes to their closest targets (with a BFS if needed). That is to say, you can use this measure as a heuristic to order the moves in a flavor of A*. I think for a 50x50 grid the search will be very fast, because the heuristic is extremely precise and nearly costless to calculate.
Solving the variant where you can move a hole along multiple positions in a row or a column is not much more complicated; you can solve it by adding those moves to the possible next steps in your queue.
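As a sketch of that heuristic, assuming holes and targets are given as (row, col) coordinates and using Manhattan distance (which matches single-step moves):

```python
def holes_heuristic(holes, target_holes):
    """Lower-bound estimate of the moves left: each hole must travel
    at least the Manhattan distance to its nearest target position.
    Admissible for A*, since holes move one step at a time."""
    return sum(
        min(abs(hr - tr) + abs(hc - tc) for tr, tc in target_holes)
        for hr, hc in holes)
```

Nearest-target matching can underestimate (two holes may share one nearest target), which is exactly what keeps the heuristic admissible.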

Mapping 2D points to a fixed grid

I have any number of points on an imaginary 2D surface. I also have a grid on the same surface, with points at regular intervals along the X and Y axes. My task is to map each point to the nearest grid point.
The code is straightforward enough until there is a shortage of grid points. The code I've been developing finds the closest grid point, displacing an already-mapped point if the distance would be shorter for the current point.
I then added a second step that compares each mapped point to another and, if swapping the mapping with another point produces a smaller sum of the total mapped distance of both points, I swap them.
This last step seems important, as it reduces the number of crossed map lines. (This would be used to map points on one plate to a grid on another plate, with pins connecting the two; crossed lines seem to have a higher chance that the pins would not make contact.)
Questions:
Can anyone comment on my thinking that if the image above were truly optimized (that is, the mapped points, overall, had the smallest total distance), then none of the lines would cross?
And has anyone seen any existing algorithms to help with this? I've searched but came up with nothing.
The problem could be approached as a variation of the Assignment Problem, with the "agents" being the grid squares and the points being the "tasks", (or vice versa) with the distance between them being the "cost" for that agent-task combination. You could solve with the Hungarian algorithm.
To handle the fact that there are more grid squares than points, find a bounding box for the possible grid squares you want to consider and add dummy points that have a cost of 0 associated with all grid squares.
The Hungarian algorithm is O(n³), but perhaps your approach is already good enough.
See also:
How to find the optimal mapping between two sets?
How to optimize assignment of tasks to agents with these constraints?
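To illustrate the assignment formulation on a toy instance, here is a brute-force sketch that tries every one-to-one mapping. This is only feasible for a handful of points; for real sizes use the Hungarian algorithm (e.g. SciPy's `linear_sum_assignment`):

```python
from itertools import permutations
import math

def best_assignment(points, grid_points):
    """Brute-force the assignment problem: try every one-to-one
    mapping of points onto grid points and keep the one with the
    smallest total Euclidean distance (the assignment 'cost')."""
    best_cost, best_map = float("inf"), None
    for perm in permutations(range(len(grid_points)), len(points)):
        cost = sum(math.dist(points[i], grid_points[j])
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_map = cost, perm
    return best_cost, best_map
```

In the optimal solution, no two mapped segments cross: uncrossing any pair of crossed segments strictly reduces the total distance, which is the observation behind the first question above.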
If I understand your main concern correctly (minimising the total length of the line segments), the algorithm you used does not find the best mapping, and this is visible in your image: when two line segments cross each other, simple geometry says that rearranging their endpoints so that they do not cross gives a smaller total length. You can use this simple move (rearranging crossed pairs) to get a better approximation of the optimum; you should apply the swapping for many iterations.
In the following picture you can see why crossing gives a longer total length than not crossing (first question), and also why after swapping once there can still be crossing edges (second question, and with respect to the comments). I drew just one sample; in practice you may need many iterations of swapping to reach a crossing-free result.
This is a heuristic algorithm, certainly not optimal, but I expect it to be very good, efficient, and simple to implement.

Creating "untransparent" squares

I'm writing a program that prints the circumference of squares to the screen according to coordinates and length of a side given by the user for each square.
The squares should be on top of each other if they overlap so that the bottom one is being hidden by the top one.
The order of the squares is set according to the order they were entered to the program (First is bottom).
For example:
&&&&
&  &
&  &$$$
&&&&  $
  $   $
  $   $
  $$$$$
The best algorithm I came up with is with time complexity of O(n^2) for each square.
Any suggestion for how to make the squares "untransparent"?
The O(n^2) algorithm you mention is probably the classic "painter's algorithm", in which you simply render ("rasterize") the squares one after another from the bottom ones to the top ones. This is a perfectly good algorithm, widely used in computer graphics. However, any "raster" algorithm will have the same time complexity of O(n^2) per square.
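A minimal painter's-algorithm sketch in Python; the (row, col, side, char) tuple representation is an assumption for illustration, not from the original question:

```python
def render_squares(squares, width, height):
    """Painter's algorithm: rasterize square outlines bottom-to-top.
    Each later square overwrites everything beneath it, including with
    its blank interior, which is what makes it 'untransparent'.
    `squares` is a list of (row, col, side, char) in input order."""
    canvas = [[' '] * width for _ in range(height)]
    for r0, c0, side, ch in squares:
        for r in range(r0, r0 + side):
            for c in range(c0, c0 + side):
                on_border = (r in (r0, r0 + side - 1)
                             or c in (c0, c0 + side - 1))
                # Border cells get the square's character; interior
                # cells get a blank, hiding whatever was below.
                canvas[r][c] = ch if on_border else ' '
    return [''.join(row).rstrip() for row in canvas]
```

Each square of side n touches n² canvas cells, which is exactly the O(n²)-per-square cost discussed above.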
But if you want an asymptotically faster algorithm, you have to look for a "vector" algorithm, i.e. the algorithm that works with the edges of the squares, but does not waste time processing their interiors. One way to build such an algorithm is to pre-calculate the final visible edge layout in vector form and then draw only the visible edges on the screen.
To implement something like that each square has to be initially represented by a set of four edges. Then a single pass of sweep-line algorithm will eliminate the invisible edges. And then you can render the remaining visible edges on the screen. This algorithm will be a lot more complex than "painter's algorithm", since you will have to implement the sweeping and edge elimination logic. But for this particular problem (especially considering that it deals with orthogonal geometry) it is not at all that difficult.
P.S. One critical point here is that the latter approach can only work if all the squares are known in advance, i.e. it is only applicable to an off-line problem. If you are dealing with an on-line problem, i.e. you have to draw the squares immediately as they are received from the input, without knowing all of them in advance, then in the general case there's no reason to attempt to improve anything here. Just use the painter's algorithm.

How can I determine optimally if an edge is correctly oriented on the Rubik's cube?

If an edge can be solved by only rotating the right, left, up and down faces, it is considered correctly oriented. If solving the edge requires the turning of the front or the back faces, it is considered misoriented or "bad". Rotating the cube, so that the front and back faces become different ones, is not allowed.
This site details a deductive way for humans to determine the edge orientation. I'm wondering whether there is a more efficient way to do it programmatically (the steps taken to scramble the cube are also known).
There seems to be an answer to your question already on the site.
Look at the U/D faces. If you see:
- L/R colour (orange/red) it's bad.
- F/B colour (green/blue) means you need to look round the side of the edge. If the side is U/D (white/yellow) it is bad.
Then look at the F/B faces of the E-slice (middle layer). The same rules apply. If you see:
- L/R colour (orange/red) it's bad.
- F/B colour (green/blue) means you need to look round the side of the edge. If the side is U/D (white/yellow) it is bad.
So it's simply a matter of looping through the colours on the U/D/F/B faces (or you could do it on a single-edge basis), and if any of them break the rules, you know that edge is bad. This only looks at each edge once, so I'd say it's fairly efficient. It ignores knowledge of the scramble algorithm, though.
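A minimal sketch of that check in Python, assuming each edge is given as its two sticker colours and the standard colour scheme (white/yellow on U/D, green/blue on F/B, orange/red on L/R); the names are illustrative, not from the original post:

```python
# Assumed standard colour scheme; adjust for your cube.
UD_COLOURS = {"white", "yellow"}
FB_COLOURS = {"green", "blue"}
LR_COLOURS = {"orange", "red"}

def edge_is_bad(facing, side):
    """Apply the rules above to a single edge.
    `facing` is the sticker seen on the U/D face (or, for E-slice
    edges, on the F/B face); `side` is the edge's other sticker."""
    if facing in LR_COLOURS:
        return True          # L/R colour facing up/out: bad
    if facing in FB_COLOURS and side in UD_COLOURS:
        return True          # F/B colour facing, U/D colour round the side: bad
    return False
```

Loop this over all twelve edges (or just the one you care about) to classify the whole cube.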
Simply using the scramble algorithm to determine edge orientation would be much harder as you'd have to watch for patterns in the turns and if the scramble is long enough this could end up taking more time than what is explained above. But for completeness I will give a short example of how it could be done.
Start with the state of all edges oriented, and note where they lie (there are only 12 positions, so number them accordingly). Or again, if you're interested in only one edge, track just that one.
Then iteratively go through the list
any time an F/B face is turned an odd number of quarter turns, flip the orientation of the edges on whichever face was turned.
That could be run backwards, keeping track of the state of an edge as you move it back to the solved position; if in the end your edge claims to be "misoriented", you'll know it was actually opposite the state you started with (since the solved cube has all edges oriented).
This, however, runs in O(n), where n is the length of the scramble, while the first method runs in O(1), so if you're expecting very short scrambles this second method may be better. But you're guaranteed speedy results with the first.
I would provide pseudo-code however I don't think these algorithms are very complex and I'm not sure how the data may be stored.

Space partitioning algorithm

I have a set of points which are contained within the rectangle. I'd like to split the rectangles into subrectangles based on point density (giving a number of subrectangles or desired density, whichever is easiest).
The partitioning doesn't have to be exact (almost any approximation better than a regular grid would do), but the algorithm has to cope with a large number of points - approx. 200 million. The desired number of subrectangles, however, is substantially lower (around 1000).
Does anyone know any algorithm which may help me with this particular task?
Just to understand the problem.
The following is crude and performs badly, but I want to know whether the result is what you want:
Assumption: the number of rectangles is even.
Assumption: the point distribution is markedly 2D (no big accumulation along one line).
Procedure:
Bisect n/2 times in either axis, looping from one end of each previously determined rectangle to the other, counting the "passed" points and storing the count at each iteration. Once counted, bisect the rectangle at the position selected by the counts gathered in each loop.
Is that what you want to achieve?
I think I'd start with the following, which is close to what @belisarius already proposed. If you have any additional requirements, such as preferring 'nearly square' rectangles to 'long and thin' ones, you'll need to modify this naive approach. I'll assume, for the sake of simplicity, that the points are approximately randomly distributed.
1. Split your initial rectangle in 2, with a line parallel to the short side of the rectangle running exactly through the mid-point.
2. Count the number of points in both half-rectangles. If they are equal (enough), go to step 4. Otherwise, go to step 3.
3. Based on the distribution of points between the half-rectangles, move the line to even things up again. So if, perchance, the first cut split the points 1/3 vs 2/3, move the line half-way into the heavy half of the rectangle. Go to step 2. (Be careful not to get trapped here, moving the line in ever-decreasing steps first in one direction, then the other.)
4. Now pass each of the half-rectangles into a recursive call to this function, at step 1.
I hope that outlines the proposal well enough. It has limitations: it will produce a number of rectangles equal to some power of 2, so adjust it if that's not good enough. I've phrased it recursively, but it's ideal for parallelisation. Each split creates two tasks, each of which splits a rectangle and creates two more tasks.
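The recursive bisection above can be sketched in Python. One simplification: instead of nudging the cut line iteratively (steps 2-3), this jumps straight to the median coordinate, which is where that iteration converges; the (x0, y0, x1, y1) rectangle tuples are an assumed representation:

```python
def partition(points, rect, depth):
    """Recursively bisect `rect` = (x0, y0, x1, y1) so each half holds
    about the same number of points. Cuts are perpendicular to the
    longer side, placed at the median coordinate along that axis.
    Returns up to 2**depth rectangles."""
    if depth == 0 or len(points) < 2:
        return [rect]
    x0, y0, x1, y1 = rect
    axis = 0 if (x1 - x0) >= (y1 - y0) else 1   # cut the longer dimension
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    cut = points[mid][axis]                      # median splits points evenly
    if axis == 0:
        halves = [(x0, y0, cut, y1), (cut, y0, x1, y1)]
    else:
        halves = [(x0, y0, x1, cut), (x0, cut, x1, y1)]
    return (partition(points[:mid], halves[0], depth - 1)
            + partition(points[mid:], halves[1], depth - 1))
```

The two recursive calls are independent, which is what makes the scheme easy to parallelise, as noted below.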
If you don't like that approach, perhaps you could start with a regular grid with some multiple (10 - 100 perhaps) of the number of rectangles you want. Count the number of points in each of these tiny rectangles. Then start gluing the tiny rectangles together until the less-tiny rectangle contains (approximately) the right number of points. Or, if it satisfies your requirements well enough, you could use this as a discretisation method and integrate it with my first approach, but only place the cutting lines along the boundaries of the tiny rectangles. This would probably be much quicker as you'd only have to count the points in each tiny rectangle once.
I haven't really thought about the running time of either of these; I have a preference for the former approach 'cos I do a fair amount of parallel programming and have oodles of processors.
You're after a standard Kd-tree or binary space partitioning tree, I think. (You can look it up on Wikipedia.)
Since you have very many points, you may wish to only approximately partition the first few levels. In this case, you should take a random sample of your 200M points--maybe 200k of them--and split the full data set at the midpoint of the subsample (along whichever axis is longer). If you actually choose the points at random, the probability that you'll miss a huge cluster of points that need to be subdivided will be approximately zero.
Now you have two problems of about 100M points each. Divide each along its longer axis. Repeat until the subproblems are small enough that you stop taking subsamples and split using the whole data set. After ten breadth-first iterations you'll be done.
If you have a different problem--you must provide tick marks along the X and Y axis and fill in a grid along those as best you can, rather than having the irregular decomposition of a Kd-tree--take your subsample of points and find the 0/32, 1/32, ..., 32/32 percentiles along each axis. Draw your grid lines there, then fill the resulting 1024-element grid with your points.
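The percentile-grid variant can be sketched as follows (pure Python; the number of divisions `k` and the subsample size are the assumptions described above):

```python
import random

def percentile_grid(points, k=32, sample_size=200_000):
    """Estimate the 0/k, 1/k, ..., k/k percentiles of x and y from a
    random subsample, giving k+1 grid lines per axis and k*k cells
    that each hold roughly the same number of points."""
    pts = random.sample(points, min(sample_size, len(points)))
    xs = sorted(p[0] for p in pts)
    ys = sorted(p[1] for p in pts)
    def quantile(sorted_vals, i):
        # i-th of k+1 evenly spaced order statistics
        return sorted_vals[min(i * len(sorted_vals) // k, len(sorted_vals) - 1)]
    x_lines = [quantile(xs, i) for i in range(k + 1)]
    y_lines = [quantile(ys, i) for i in range(k + 1)]
    return x_lines, y_lines
```

With k = 32 this yields the 1024-cell grid mentioned above; binning the full 200M points into the cells is then a single linear pass.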
R-tree
Good question.
I think the area you need to investigate is "computational geometry" and the "k-partitioning" problem. There's a link that might help get you started here
You might find that the problem itself is NP-hard which means a good approximation algorithm is the best you're going to get.
Would K-means clustering or a Voronoi diagram be a good fit for the problem you are trying to solve?
That looks like cluster analysis.
Would a QuadTree work?
A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are most often used to partition a two dimensional space by recursively subdividing it into four quadrants or regions. The regions may be square or rectangular, or may have arbitrary shapes. This data structure was named a quadtree by Raphael Finkel and J.L. Bentley in 1974. A similar partitioning is also known as a Q-tree. All forms of Quadtrees share some common features:
They decompose space into adaptable cells
Each cell (or bucket) has a maximum capacity. When maximum capacity is reached, the bucket splits
The tree directory follows the spatial decomposition of the Quadtree
