I'm currently stuck on a challenge our lecturer gave us at university. We've been looking at the most popular pathfinding algorithms, such as Dijkstra and A*. However, I think this challenge exercise requires something else, and it has me stumped.
A visual representation of the maze that needs solving:
Color legend
Blue = starting node
Gray = path
Green = destination node
The way it's supposed to be solved is that when a move is made, it has to continue until it collides with either the edge of the maze or an obstacle (the black borders). It also needs to be solved in the fewest moves possible (in this case 7).
My question: Could someone push me in the right direction on what algorithm to look at? I think Dijkstra/A* is not the way to go, considering the shortest path is not always the correct path given the assignment.
It is still solvable with Dijkstra/A*; what needs to change is the configuration of neighbors.
A little background, first:
Dijkstra and A* are general path finding algorithms formulated on graphs. When, instead of a graph, we have a character moving on a grid, it might not be that obvious where the graph is. But it's still there, and one way to construct a graph would be the following:
the nodes of the graph correspond to the cells of the grid
there are edges between nodes corresponding to neighboring cells.
Actually, in most problems involving some configurations and transitions between them, it is possible to construct a corresponding graph, and apply Dijkstra/A*. Thus, it is also possible to tackle problems such as sliding puzzle, rubik's cube, etc., which apparently differ significantly from a character moving on a grid. But they have states, and transitions between states, thus it is possible to try graph search methods (these methods, especially the uninformed ones such as Dijkstra's algorithm, might not always be feasible due to the large search space, but in principle it's possible to apply them).
In the problem that you mentioned, the graph would not differ much from the one with typical character movements:
the nodes can still be the cells of the grid
there will now be edges from a node to nodes corresponding to valid movements (ending near a border or an obstacle), which, in this case, will not always coincide with the four spatial immediate neighbors of the grid cell.
As pointed out by Tamas Hegedus in the comments section, it's not obvious what heuristic function should be chosen if A* is used.
The standard heuristics based on Manhattan or Euclidean distance would not be valid here, as they might over-estimate the distance to the target.
One valid heuristic would be id(row != destination_row) + id(col != destination_col), where id is the indicator function, with id(false) = 0 and id(true) = 1.
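As a concrete sketch of that heuristic (the class and method names here are mine, not from the answer):

```java
public class SlideHeuristic {
    // Admissible heuristic for the "slide until collision" maze:
    // a single slide can fix at most one of the two coordinates,
    // so counting mismatched coordinates never overestimates the
    // number of remaining moves.
    public static int heuristic(int row, int col, int destRow, int destCol) {
        return (row != destRow ? 1 : 0) + (col != destCol ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(heuristic(5, 5, 5, 1)); // one coordinate differs
    }
}
```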
Dijkstra/A* are fine. What you need is to carefully think about what you consider graph nodes and graph edges.
Standing in the blue cell (let's call it 5,5), you have three valid moves:
move one cell to the right (to 6,5)
move four cells to the left (to 1,5)
move five cells up (to 5,1)
Note that you can't go from 5,5 to 4,5 or 5,4. Apply the same reasoning to the new nodes (e.g. from 5,1 you can go to 1,1, 10,1 and 5,5) and you will get a graph on which you can run Dijkstra/A*.
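A minimal sketch of this neighbor construction (the 3x3 test grid and all names are my own illustration, and I use plain BFS since every edge has weight 1; Dijkstra/A* would search the same sliding-move graph):

```java
import java.util.ArrayDeque;
import java.util.Arrays;

public class SlideMaze {
    // grid[r][c] == true means the cell is blocked.
    // Returns the minimum number of slides from (sr,sc) to (dr,dc), or -1.
    public static int minSlides(boolean[][] grid, int sr, int sc, int dr, int dc) {
        int h = grid.length, w = grid[0].length;
        int[][] dist = new int[h][w];
        for (int[] row : dist) Arrays.fill(row, -1);
        ArrayDeque<int[]> queue = new ArrayDeque<>();
        dist[sr][sc] = 0;
        queue.add(new int[]{sr, sc});
        int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!queue.isEmpty()) {
            int[] cur = queue.poll();
            if (cur[0] == dr && cur[1] == dc) return dist[cur[0]][cur[1]];
            for (int[] d : dirs) {
                // Slide until the next cell is off the grid or blocked.
                int r = cur[0], c = cur[1];
                while (r + d[0] >= 0 && r + d[0] < h && c + d[1] >= 0 && c + d[1] < w
                        && !grid[r + d[0]][c + d[1]]) {
                    r += d[0];
                    c += d[1];
                }
                if (dist[r][c] == -1) {
                    dist[r][c] = dist[cur[0]][cur[1]] + 1;
                    queue.add(new int[]{r, c});
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        boolean[][] open = new boolean[3][3]; // empty 3x3 grid
        System.out.println(minSlides(open, 0, 0, 2, 2)); // corner to corner
    }
}
```

Note that on an empty grid the center cell is unreachable, because every slide ends against a wall; that is exactly the "non-obvious neighbors" property described above.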
You need to evaluate every possible move and take the move that results in the minimum distance. Something like the following:
int minDistance(int x, int y, int prevX, int prevY, int distance) {
    if (CollisionWithBorder(x, y)) // can't take this path
        return int.MAX_VALUE;
    if (NoCollisionWithBorder(x, y)) // it's OK to take this path
    {
        // update the distance only when there is a long change in direction
        if (LongDirectionChange(x, y, prevX, prevY))
            distance = distance + 1;
    }
    if (ReachedDestination(x, y)) // we're done
        return distance;
    // find the path with the minimum distance
    return min(minDistance(x, y + 1, x, y, distance),  // go right
               minDistance(x + 1, y, x, y, distance),  // go up
               minDistance(x - 1, y, x, y, distance),  // go down
               minDistance(x, y - 1, x, y, distance)); // go left
}

bool LongDirectionChange(int x, int y, int prevX, int prevY) {
    if ((y - 2 == prevY && x == prevX) || (y == prevY && x - 2 == prevX))
        return true;
    return false;
}
This is assuming diagonal moves are not allowed. If they are, add them to the min() call:

    minDistance(x + 1, y + 1, x, y, distance), // go up diagonally to the right
    minDistance(x - 1, y - 1, x, y, distance), // go down diagonally to the left
    minDistance(x + 1, y - 1, x, y, distance), // go up diagonally to the left
    minDistance(x - 1, y + 1, x, y, distance), // go down diagonally to the right
I am trying to solve the SPOJ problem on rectangles.
Problem statement:
There are n rectangles drawn on the plane. Each rectangle has sides parallel to the coordinate axes and integer coordinates of vertices.
We define a block as follows:
each rectangle is a block,
if two distinct blocks have a common segment, then they form a new block; otherwise we say that these blocks are separate.
Write a program that for each test case:
reads the number of rectangles and coordinates of their vertices;
finds the number of separate blocks formed by the rectangles;
writes the result to the standard output.
Input:
The number of test cases t is in the first line of input, then t test cases follow separated by an empty line.
In the first line of a test case there is an integer n, 1 <= n <= 7000, which is the number of rectangles. In the following n lines there are coordinates of rectangles. Each rectangle is described by four numbers: coordinates x, y of the bottom-left vertex and coordinates x, y of the top-right vertex. All these coordinates are non-negative integers not greater than 10000.
Output:
For each test case you should output one line with the number of separate blocks formed by the given rectangles.
My approach:
Check, for every pair of rectangles r_i and r_j, whether they are separate or not; based on that, set the adjacency-matrix entries mat[i][j] and mat[j][i] to true or false accordingly.
Then run DFS on the constructed graph to count the number of connected components. This count represents the number of separate blocks.
As the number of rectangles is at most 7000, looking at every pair is roughly 2.5 * 10^7 checks. Still, I am getting TLE (time limit exceeded).
How can I solve this problem more efficiently?
void comp() {
    list.clear();
    scanI(n);
    REP(i,1,n) {
        Rec rec;
        scanI(rec.p);
        scanI(rec.q);
        scanI(rec.r);
        scanI(rec.s);
        list.pb(rec);
    }
    REP(i,0,list.size()-2) {
        Rec rec = list[i];
        p = rec.p;
        q = rec.q;
        r = rec.r;
        s = rec.s;
        REP(j,i+1,list.size()-1) {
            Rec m = list[j];
            a = m.p;
            b = m.q;
            c = m.r;
            d = m.s;
            if (!isSeparate()) {
                eList[i].pb(j); // adjacency list for rec_i
                eList[j].pb(i); // adjacency list for rec_j
            }
        }
    }
    int cnt = 0;
    REP(i,0,n-1) {
        if (!vis[i]) {
            cnt++;
            dfs(i);
        }
    }
    printf("%d\n", cnt);
}
bool isSeparate() {
    if (s < b || d < q || r < a || c < p) return true;
    if ((r==a && q==d) || (c==p && b==s) || (a==r && b==s) || (p==c && q==d)) return true;
    return false;
}
void dfs(int s) {
    if (vis[s]) return;
    vis[s] = true;
    REP(i,0,eList[s].size()-1) {
        if (!vis[eList[s][i]]) {
            dfs(eList[s][i]);
        }
    }
}
I've thought of a couple more algorithmic improvements.
Use a fast union/find data structure instead of building an adjacency-list representation of the graph. Then if one rectangle intersects another rectangle you can stop right then -- there's no need to continue testing it against all other rectangles seen so far. With this in place, problem instances in which most rectangles intersect most other rectangles will be solved very quickly.
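A minimal sketch of the union/find idea (class and method names are mine; the overlap test here treats touching rectangles as connected, which is simpler than the exact SPOJ corner-touching rules, and it doesn't include the early-exit optimization):

```java
public class RectBlocks {
    // Disjoint-set (union/find) with path halving.
    static int[] parent;

    static int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]]; // path halving keeps trees shallow
            x = parent[x];
        }
        return x;
    }

    static void union(int a, int b) {
        parent[find(a)] = find(b);
    }

    // rects[i] = {x1, y1, x2, y2}; touching counts as connected here.
    static boolean overlap(int[] a, int[] b) {
        return a[0] <= b[2] && b[0] <= a[2] && a[1] <= b[3] && b[1] <= a[3];
    }

    public static int countBlocks(int[][] rects) {
        int n = rects.length;
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        int blocks = n;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                if (find(i) != find(j) && overlap(rects[i], rects[j])) {
                    union(i, j);
                    blocks--; // each successful union merges two blocks
                }
        return blocks;
    }

    public static void main(String[] args) {
        int[][] rects = {{0, 0, 2, 2}, {1, 1, 3, 3}, {10, 10, 11, 11}};
        System.out.println(countBlocks(rects)); // two overlap, one is separate
    }
}
```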
There's still the need to efficiently handle problem instances in which most rectangles intersect few or no other rectangles. A couple of observations:
A rectangle can only overlap another rectangle if both their vertical and horizontal extents overlap.
If we have n non-overlapping rectangles centered at the grid points of some h*w grid, it must be that min(h, w) <= sqrt(n).
Suppose the problem instance has the form of the second bullet point above -- an h*w grid of non-overlapping rectangles, with h*w = n but h and w otherwise unknown. As you process each rectangle, insert its vertical extent into a data structure that enables fast point-in-interval queries, such as an interval tree or segment tree, and insert its horizontal extent into another such data structure. The obvious way of using this information -- by looking up all rectangles that overlap the current rectangle vertically, and looking up all rectangles that overlap it horizontally, and then intersecting these 2 lists -- doesn't give much speed advantage, because one of these lists could be very long. What you can do instead is to simply pick the shorter of these 2 lists and test every rectangle in it (as before, stopping as soon as an overlap is detected). This is fast, because we know that the shorter list can have at most sqrt(7000) rectangles in it.
I haven't proven that a grid of non-overlapping rectangles is a true worst case for this algorithm, but I'm confident the above approach will work quickly in any case.
I have a set of axis parallel 2d rectangles defined by their top left and bottom right hand corners(all in integer coordinates). Given a point query, how can you efficiently determine if it is in one of the rectangles? I just need a yes/no answer and don't need to worry about which rectangle it is in.
I can check if (x,y) is in ((x1, y1), (x2, y2)) by seeing if x is between x1 and x2 and y is between y1 and y2. I can do this separately for each rectangle which runs in linear time in the number of rectangles. But as I have a lot of rectangles and I will do a lot of point queries I would like something faster.
The answer depends a little bit on how many rectangles you have. The brute force method checks your coordinates against each rectangular pair in turn:
found = false
for each r in rectangles:
    if point.x > r.x1 && point.x < r.x2:
        if point.y > r.y1 && point.y < r.y2:
            found = true
            break
You can get more efficient by sorting the rectangles into regions, and looking at "bounding rectangles". You then do a binary search through a tree of ever-decreasing bounding rectangles. This takes a bit more work up front, but it makes the lookup O(ln(n)) rather than O(n) - for large collections of rectangles and many lookups, the performance improvement will be significant. You can see a version of this (which looks at intersection of a rectangle with a set of rectangles - but you easily adapt to "point within") in this earlier answer. More generally, look at the topic of quad trees which are exactly the kind of data structure you would need for a 2D problem like this.
A slightly less efficient (but simpler to implement) method would sort the rectangles by lower-left corner (for example); you then need to search only a subset of the rectangles.
If the coordinates are integer type, you could make a binary mask - then the lookup is a single operation (in your case this would require a 512MB lookup table). If your space is relatively sparsely populated (i.e. the probability of a "miss" is quite large) then you could consider using an undersampled bit map (e.g. using coordinates/8) - then map size drops to 8M, and if you have "no hit" you save yourself the expense of looking more closely. Of course you have to round down the left/bottom, and round up the top/right coordinates to make this work right.
Expanding a little bit with an example:
Imagine coordinates can be just 0 - 15 in x, and 0 - 7 in y. There are three rectangles (all [x1 y1 x2 y2]): [2 3 4 5], [3 4 6 7] and [7 1 10 5]. We can draw these in a matrix (I mark the bottom left hand corner with the number of the rectangle - note that 1 and 2 overlap):
...xxxx.........
...xxxx.........
..xxxxx.........
..x2xxxxxxx.....
..1xx..xxxx.....
.......xxxx.....
.......3xxx.....
................
You can turn this into an array of zeros and ones - so that "is there a rectangle at this point" is the same as "is this bit set". A single lookup will give you the answer. To save space you could downsample the array - if there is still no hit, you have your answer, but if there is a hit you would need to check "is this real" - so it saves less time, and savings depend on sparseness of your matrix (sparser = faster). Subsampled array would look like this (2x downsampling):
.oxx....
.xxooo..
.oooxo..
...ooo..
I use x to mark "if you hit this point, you are sure to be in a rectangle", and o to say "some of these are a rectangle". Many of the points are now "maybe", and less time is saved. If you did more severe downsampling you might consider having a two-bit mask: this would allow you to say "this entire block is filled with rectangles" (i.e. - no further processing needed: the x above) or "further processing needed" (like the o above). This soon starts to be more complicated than the Q-tree approach...
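A minimal sketch of the full-resolution bit mask using the example rectangles above (class and method names are mine, and I treat the rectangle bounds as inclusive; a real implementation would pick inclusive or exclusive bounds to match its data):

```java
import java.util.BitSet;

public class RectBitmask {
    final int width, height;
    final BitSet bits;

    RectBitmask(int width, int height) {
        this.width = width;
        this.height = height;
        this.bits = new BitSet(width * height);
    }

    // Mark every integer point of the rectangle, bounds inclusive.
    void add(int x1, int y1, int x2, int y2) {
        for (int y = y1; y <= y2; y++)
            for (int x = x1; x <= x2; x++)
                bits.set(y * width + x);
    }

    // Point lookup is then a single bit test.
    boolean contains(int x, int y) {
        return bits.get(y * width + x);
    }

    public static void main(String[] args) {
        RectBitmask mask = new RectBitmask(16, 8);
        mask.add(2, 3, 4, 5);   // the three example rectangles
        mask.add(3, 4, 6, 7);
        mask.add(7, 1, 10, 5);
        System.out.println(mask.contains(3, 4)); // inside the first two
        System.out.println(mask.contains(0, 0)); // empty corner
    }
}
```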
Bottom line: the more sorting / organizing of the rectangles you do up front, the faster you can do the lookup.
My favourite for a variety of 2D geometry queries is the sweep line algorithm. It's widely used in CAD software, which would be my wild guess for the purpose of your program.
Basically, you order all points and all polygon vertices (all 4 rectangle corners in your case) along X-axis, and advance along X-axis from one point to the next. In case of non-Manhattan geometries you would also introduce intermediate points, the segment intersections.
The data structure is a balanced tree of the points and polygon (rectangle) edge intersections with the vertical line at the current X-position, ordered in the Y-direction. If the structure is properly maintained, it's very easy to tell whether a point at the current X-position is contained in a rectangle or not: just examine the orientation of the edge intersections vertically adjacent to the point. If rectangles are allowed to overlap or have holes, it's just a bit more complicated, but still very fast.
The overall complexity for N points and M rectangles is O((N+M)*log(N+M)). One can actually prove that this is asymptotically optimal.
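A simplified offline sketch of the sweep (all names are mine; the active set is scanned linearly instead of being kept in a balanced tree, which is enough to show the event ordering, but gives up the O((N+M) log(N+M)) bound):

```java
import java.util.ArrayList;
import java.util.List;

public class SweepPointInRect {
    // Offline sweep: answers "is this point inside any rectangle?" per query.
    // Rectangles are {x1, y1, x2, y2}, bounds inclusive. At equal x we process
    // opens (0) before queries (1) before closes (2), so boundary points count.
    public static boolean[] covered(int[][] rects, int[][] points) {
        List<int[]> events = new ArrayList<>();
        for (int[] r : rects) {
            events.add(new int[]{r[0], 0, r[1], r[3]}); // open at x1
            events.add(new int[]{r[2], 2, r[1], r[3]}); // close after x2
        }
        for (int i = 0; i < points.length; i++)
            events.add(new int[]{points[i][0], 1, points[i][1], i}); // query
        events.sort((a, b) -> a[0] != b[0] ? a[0] - b[0] : a[1] - b[1]);

        boolean[] answer = new boolean[points.length];
        List<int[]> active = new ArrayList<>(); // active [y1, y2] intervals
        for (int[] e : events) {
            if (e[1] == 0) active.add(new int[]{e[2], e[3]});
            else if (e[1] == 2) active.removeIf(iv -> iv[0] == e[2] && iv[1] == e[3]);
            else {
                for (int[] iv : active)
                    if (iv[0] <= e[2] && e[2] <= iv[1]) { answer[e[3]] = true; break; }
            }
        }
        return answer;
    }

    public static void main(String[] args) {
        int[][] rects = {{2, 3, 4, 5}, {7, 1, 10, 5}};
        int[][] points = {{3, 4}, {6, 0}};
        System.out.println(java.util.Arrays.toString(covered(rects, points)));
    }
}
```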
Store the coordinate parts of your rectangles in a tree structure. For each left value, make an entry that points to the corresponding right values, which point to the corresponding top values, which point to the corresponding bottom values.
To search, check the x value of your point against the left values. If all left values fail to match, meaning they are greater than your x value, you know the point is outside any rectangle. Otherwise, check the x value against the right values of the corresponding left value. Again, if no right value matches, you're outside. Otherwise do the same with the top and bottom values. Once you find a matching bottom value, you know you are inside some rectangle and you are finished checking.
As I stated in my comment below, there is much room for optimization, for example keeping minimum left and top values and maximum right and bottom values for a quick check of whether you are outside all rectangles.
The following approach is in C# and needs adaption to your preferred language:
public class RectangleUnion
{
private readonly Dictionary<int, Dictionary<int, Dictionary<int, HashSet<int>>>> coordinates =
new Dictionary<int, Dictionary<int, Dictionary<int, HashSet<int>>>>();
public void Add(Rectangle rect)
{
Dictionary<int, Dictionary<int, HashSet<int>>> verticalMap;
if (coordinates.TryGetValue(rect.Left, out verticalMap))
AddVertical(rect, verticalMap);
else
coordinates.Add(rect.Left, CreateVerticalMap(rect));
}
public bool IsInUnion(Point point)
{
foreach (var left in coordinates)
{
if (point.X < left.Key) continue;
foreach (var right in left.Value)
{
if (right.Key < point.X) continue;
foreach (var top in right.Value)
{
if (point.Y < top.Key) continue;
foreach (var bottom in top.Value)
{
if (point.Y > bottom) continue;
return true;
}
}
}
}
return false;
}
private static void AddVertical(Rectangle rect,
IDictionary<int, Dictionary<int, HashSet<int>>> verticalMap)
{
Dictionary<int, HashSet<int>> bottomMap;
if (verticalMap.TryGetValue(rect.Right, out bottomMap))
AddBottom(rect, bottomMap);
else
verticalMap.Add(rect.Right, CreateBottomMap(rect));
}
private static void AddBottom(
Rectangle rect,
IDictionary<int, HashSet<int>> bottomMap)
{
HashSet<int> bottomList;
if (bottomMap.TryGetValue(rect.Top, out bottomList))
bottomList.Add(rect.Bottom);
else
bottomMap.Add(rect.Top, new HashSet<int> { rect.Bottom });
}
private static Dictionary<int, Dictionary<int, HashSet<int>>> CreateVerticalMap(
Rectangle rect)
{
var bottomMap = CreateBottomMap(rect);
return new Dictionary<int, Dictionary<int, HashSet<int>>>
{
{ rect.Right, bottomMap }
};
}
private static Dictionary<int, HashSet<int>> CreateBottomMap(Rectangle rect)
{
var bottomList = new HashSet<int> { rect.Bottom };
return new Dictionary<int, HashSet<int>>
{
{ rect.Top, bottomList }
};
}
}
It's not beautiful, but it should point you in the right direction.
I am looking for an algorithm to sort an unordered list of items into a tree structure, using the minimum number of "is child" comparison operations as possible.
A little background on my specific case, but I guess I am just looking for a generic sorting algorithm of a sort I have been unable to find (it is a hard search term to refine).
I have an unordered list of contours, which are simply lists of coordinates that describe closed polygons. I want to create a tree that represents the relationship between these contours, such that the outermost is the root, with each contour at the next level as children, and so forth. So a tree structure with zero-to-many children per node.
A key requirement of the algorithm is that tests to determine whether or not a contour is the child of another are kept to a minimum, as this operation is very expensive. Contours can (and often do) share many vertices, but should not intersect. These shared vertices usually arise where map limits are reached - picture a set of concentric semi circles against the straight edge of a map. The point-in-poly test is slow if I need to run through lots of point-on-lines before I get to a definitive answer.
Here's the algorithm I have come up with so far. It's pretty naive, no doubt, but it works. There are probably some heuristics that may help - a contour is only likely to be a child of another contour with a depth within a certain range, for example - but I want to nail the basic algorithm first. The first red flag is that it is exponential.
for each candidate_contour in all_contours
    for each contour in all_contours
        // note: "already contains" means "is direct or indirect child of"
        if contour == candidate_contour or contour already contains(candidate_contour)
            continue
        else
            list contours_to_check
            contours_to_check.add(candidate_contour)
            contour parent_contour = candidate_contour.parent
            while (parent_contour != null)
                contours_to_check.add(parent_contour)
                parent_contour = parent_contour.parent
            for each possible_move_candidate in contours_to_check (REVERSE ITERATION)
                if possible_move_candidate is within contour
                    // moving a contour moves the contour and all of its children
                    move possible_move_candidate to direct child of contour
                    break
So that works - or at least it seems to - but it gets very slow with a non-trivial number of contours (think - several hundred, to possibly several thousand).
Is there a fundamentally better way to do this, or indeed - are there known algorithms that deal with exactly this? As mentioned before - the key in my case is to keep the "is contour within" comparisons to a minimum.
Edit to add solution based on Jim's answer below - thanks Jim!
This is the first iteration, which produced a good (10x) improvement. See below for iteration 2.
This code, versus my original algorithm, is > 10x faster once the contour set becomes non-trivially big. See the image below, which is now sorted in a couple of seconds (vs. 30-odd seconds before) and rendered in order. I think there is room to improve further with some added heuristics - for example, now that the original list is sorted by area, each new candidate has to be a leaf node somewhere in the tree. The difficulty is determining which branches to traverse to test the existing leaves - if there are many branches/leaves, it is probably still quicker to cut the search space down by examining the branches at the top... something more to think about!
public static iso_area sort_iso_areas(List<iso_area> areas, iso_area root)
{
    if (areas.Count == 0)
        return null;

    areas.Sort(new iso_comparer_descending());

    foreach (iso_area area in areas)
    {
        if (root.children.Count == 0)
            root.children.Add(area);
        else
        {
            bool found_child = false;
            foreach (iso_area child in root.children)
            {
                // check if this iso_area is within the child
                // if it is, follow the children down to find the insertion position
                if (child.try_add_child(area))
                {
                    found_child = true;
                    break;
                }
            }
            if (!found_child)
                root.children.Add(area);
        }
    }
    return root;
}
// try and add the provided child to this area
// if it fits, try adding to a subsequent child
// keep trying until failure - then add to the previous child in which it fitted
bool try_add_child(iso_area area)
{
    if (within(area))
    {
        // do a recursive search through all children
        foreach (iso_area child in children)
        {
            if (child.try_add_child(area))
                return true;
        }
        area.move_to(this);
        return true;
    }
    else
        return false;
}
Iteration two - comparing against existing leaves only
Following on from my earlier thought that new contours could only fit into existing leaves, it struck me that this would in fact be much quicker, as the poly-in-poly test would fail at the first bounds check for all leaves other than the target leaf. The first solution involved traversing a branch to find the target, where, by definition, every poly along the way would pass the bounds check and involve a full poly-in-poly test until no further leaves were found.
Following Jim's comment and re-examination of the code - the second solution did not work, unfortunately. I'm wondering if there may still be some merit to looking at lower elements in the tree before the branches, as the poly-in-poly test should fail quickly, and you know that if you find a leaf that accepts the candidate, there are no more polys that need to be examined..
Iteration two revisited
Although it is not the case that contours can only fit into leaves, it is the case that they almost always do - and also that they will usually fit into a recent predecessor in the ordered list of contours. This final updated code is the fastest yet, and ditches the tree traversal completely. It simply walks backwards through the recent larger polygons and tries each - polys from other branches will likely fail the poly-in-poly test at the bounds check, and the first poly found that surrounds the candidate poly has to be the immediate parent, due to the prior sorting of the list. This code brings the sorting down into the millisecond range again and is about 5x faster than the tree traversal (significant speed improvements were also made to the poly-in-poly test which accounts for the rest of the speed-up). The root is now taken from the sorted list of areas. Note that I now supply a root in the list that I know encompasses all the contours (bounding box of all).
Thanks for the help - and Jim in particular - for helping me think about this problem. The key really was the original sorting of the contours into a list in which it was guaranteed that no contour could be a child of a later contour in the list.
public static iso_area sort_iso_areas(List<iso_area> areas)
{
    if (areas.Count == 0)
        return null;

    areas.Sort(new iso_comparer_descending());

    for (int i = 0; i < areas.Count; ++i)
    {
        for (int j = i - 1; j >= 0; --j)
        {
            if (areas[j].try_add_child(areas[i]))
                break;
        }
    }
    return areas[0];
}
My original attempt: 133 s
Iteration 1 (traverse branch to find leaf): 9s
Iteration 2 (walk backwards through contours in ascending size order): 25ms (with other pt-in-poly improvements also).
I did something similar a while back by first sorting by area.
If polygon B is contained within polygon A, then the bounding box for polygon A has to be larger than the bounding box for polygon B. More to the point, if you specify the bounding box as ((x1, y1), (x2, y2)), then:
A.x1 < B.x1
A.y1 < B.y1
A.x2 > B.x2
A.y2 > B.y2
(Those relationships might be <= and >= if polygons can share edges or vertices.)
So the first thing you should do is compute the bounding boxes and sort the polygons by bounding box area, descending (so the largest is first).
Create a structure that is essentially a polygon and a list of its children:
PolygonNode
{
Polygon poly
PolygonNode[] Children
}
So you start out with your polygons sorted by bounding box area, descending, and an initially empty list of PolygonNode structures:
Polygon[] sortedPolygons
PolygonNode[] theTree
Now, starting from the first member of sortedPolygons, which is the polygon with the largest area, check to see if it's a child of any of the top-level members of theTree. If it's not, add the polygon to theTree. The bounding boxes help here because you don't have to do the full polygon-in-polygon test if the bounding-box test fails.
If it is a child of a node, then check to see if it's a child of one of that node's children, and follow the child chain down until you find the insertion spot.
Repeat that for every polygon in sortedPolygons.
Worst case, that algorithm is O(n^2), which will happen if there are no parent/child relationships. But assuming that there are many nested parent/child relationships, the search space gets cut down very quickly.
You can probably speed it up somewhat by ordering the theTree list and the child nodes lists by position. You could then use a binary search to more quickly locate the potential parent for a polygon. Doing so complicates things a little bit, but it might be worthwhile if you have a lot of top-level polygons. I wouldn't add that optimization on the first cut, though. It's quite possible that the version I outlined using sequential search will be plenty fast enough.
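A minimal sketch of the sorted insertion described above, using bounding boxes only (a real version would fall back to the full polygon-in-polygon test whenever the bounding-box test passes; all names here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class BBoxTree {
    static class Node {
        final int x1, y1, x2, y2;
        final List<Node> children = new ArrayList<>();
        Node(int x1, int y1, int x2, int y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }
        long area() { return (long) (x2 - x1) * (y2 - y1); }
        boolean contains(Node o) {
            return x1 <= o.x1 && y1 <= o.y1 && x2 >= o.x2 && y2 >= o.y2;
        }
        // Follow the child chain down until no child contains the node,
        // then attach it at that spot.
        void insert(Node o) {
            for (Node c : children)
                if (c.contains(o)) { c.insert(o); return; }
            children.add(o);
        }
    }

    public static List<Node> buildForest(List<Node> nodes) {
        nodes.sort((a, b) -> Long.compare(b.area(), a.area())); // largest first
        List<Node> roots = new ArrayList<>();
        for (Node n : nodes) {
            Node parent = null;
            for (Node r : roots)
                if (r.contains(n)) { parent = r; break; }
            if (parent == null) roots.add(n);   // new top-level polygon
            else parent.insert(n);              // descend to the insertion spot
        }
        return roots;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>(List.of(
                new Node(0, 0, 10, 10), new Node(1, 1, 5, 5),
                new Node(2, 2, 3, 3), new Node(20, 20, 25, 25)));
        System.out.println(buildForest(nodes).size()); // two top-level polygons
    }
}
```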
Edit
Understanding the nature of the data helps. I didn't realize when I wrote my original response that, given the sorted list of polygons, the normal case is that p[i] is a child of p[i-1], which is a child of p[i-2], etc. Your comments indicate that it's not always the case, but it very often is.
Given that, perhaps you should make a simple modification to your implementation so that you save the last polygon and check it first rather than starting in with the tree. So your loop looks something like this:
iso_area last_area = null; // <============
foreach (iso_area area in areas)
{
    if (root.children.Count == 0)
        root.children.Add(area);
    else if (!last_area.try_add_child(area)) // <=======
    {
        bool found_child = false;
        foreach (iso_area child in root.children)
        {
            // check if this iso_area is within the child
            // if it is, follow the children down to find the insertion position
            if (child.try_add_child(area))
            {
                found_child = true;
                break;
            }
        }
        if (!found_child)
            root.children.Add(area);
    }
    last_area = area; // <============
}
return root;
If the data is as you said, then this optimization should help quite a bit because it eliminates a bunch of searching through the tree.
A recursive approach works well when dealing with trees. The following algorithm ought to be O(N log N) when your shapes are distributed fairly evenly. It becomes O(N²) in the worst case, if your shapes are all concentric in one long tunnel-like distribution.
boolean tryAddToNode(Node possibleParent, Node toAdd)
{
    if not toAdd.isChildOf(possibleParent)
        return false

    for each child in possibleParent.children
        if (tryAddToNode(child, toAdd))
            return true

    // not a child of any of my children, but
    // it is a child of me.
    possibleParent.children.add(toAdd)
    return true
}
I have a rectangular plane of integer dimension. Inside of this plane I have a set of non-intersecting rectangles (of integer dimension and at integer coordinates).
My question is how can I efficiently find the inverse of this set; that is the portions of the plane which are not contained in a sub-rectangle. Naturally, this collection of points forms a set of rectangles --- and it is these that I am interested in.
My current, naive, solution uses a boolean matrix (the size of the plane) and works by setting a point i,j to 0 if it is contained within a sub-rectangle and 1 otherwise. Then I iterate through each element of the matrix and if it is 1 (free) attempt to 'grow' a rectangle outwards from the point. Uniqueness is not a concern (any suitable set of rectangles is fine).
Are there any algorithms which can solve such a problem more effectively (i.e., without needing to resort to a boolean matrix)?
Yes, it's fairly straightforward. I've answered an almost identical question on SO before, but haven't been able to find it yet.
Anyway, essentially you can do this:
start with an output list containing a single output rect equal to the area of interest (some arbitrary bounding box which defines the area of interest and contains all the input rects)

for each input rect:
    if the input rect intersects any of the rects in the output list:
        delete the old output rect and generate up to four new output
        rects which represent the difference between the intersection
        and the original output rect
Optional final step: iterate through the output list looking for pairs of rects which can be merged to a single rect (i.e. pairs of rects which share a common edge can be combined into a single rect).
Alright! First implementation (Java), based on Paul's answer:
List<Rectangle> slice(Rectangle r, Rectangle mask)
{
    List<Rectangle> rects = new ArrayList<>();
    mask = mask.intersection(r);
    if (!mask.isEmpty())
    {
        rects.add(new Rectangle(r.x, r.y, r.width, mask.y - r.y));
        rects.add(new Rectangle(r.x, mask.y + mask.height, r.width, (r.y + r.height) - (mask.y + mask.height)));
        rects.add(new Rectangle(r.x, mask.y, mask.x - r.x, mask.height));
        rects.add(new Rectangle(mask.x + mask.width, mask.y, (r.x + r.width) - (mask.x + mask.width), mask.height));
        for (Iterator<Rectangle> iter = rects.iterator(); iter.hasNext();)
            if (iter.next().isEmpty())
                iter.remove();
    }
    else
        rects.add(r);
    return rects;
}
List<Rectangle> inverse(Rectangle base, List<Rectangle> rects)
{
    List<Rectangle> outputs = new ArrayList<>();
    outputs.add(base);
    for (Rectangle r : rects)
    {
        List<Rectangle> newOutputs = new ArrayList<>();
        for (Rectangle output : outputs)
            newOutputs.addAll(slice(output, r));
        outputs = newOutputs;
    }
    return outputs;
}
Possibly working example here
You should take a look at space-filling algorithms. Those algorithms try to fill up a given space with geometric figures, and it should not be too hard to modify such an algorithm to your needs.
Such an algorithm starts from scratch (empty space), so first you fill its internal data with the boxes you already have on the 2D plane. Then you let the algorithm do the rest - fill up the remaining space with other boxes. Those boxes form the list of inverted space chunks of your plane.
You keep those boxes in a list, and then checking whether a point is in the inverted plane is quite easy: you just traverse the list and check whether the point lies inside any box.
Here is a site with a bunch of algorithms which could be helpful.
I suspect you can get somewhere by ordering the rectangles by y-coordinate and taking a scan-line approach. I may or may not actually construct an implementation.
This is relatively simple because your rectangles are non-intersecting. The goal is basically a set of non-intersecting rectangles that fully cover the plane, some marked as original, and some marked as "inverse".
Think in terms of a top-down (or left-right or whatever) scan. You have a current "tide-line" position. Determine the position of the next horizontal line you will encounter that is not on the tide-line. This gives you the height of your next strip.
Between these tide-lines, you have a strip in which each vertical line reaches from one tide-line to the other (and perhaps beyond in both directions). You can sort the horizontal positions of these vertical lines, and use that to divide your strip into rectangles and identify them as either being (part of) an original rectangle or (part of) an inverse rectangle.
Progress to the end, and you get (probably too many too small) rectangles, and can pick the ones you want. You also have the option (with each step) of combining small rectangles from the current strip with a set of potentially-extendible rectangles from earlier.
You can do the same even when your original rectangles may intersect, but it's a little more fiddly.
Details left as an exercise for the reader ;-)
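For what it's worth, here is a sketch of the strip decomposition just described (all names are mine; coordinates are treated as half-open intervals, and because each strip merges sorted x-spans it also handles overlapping input rectangles):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class InverseRects {
    // Rectangles are {x1, y1, x2, y2} with half-open bounds [x1,x2) x [y1,y2).
    // Returns rectangles covering the part of the bounding area not covered
    // by any input rectangle, built strip by strip between horizontal edges.
    public static List<int[]> inverse(int[] bounds, int[][] rects) {
        TreeSet<Integer> ys = new TreeSet<>();
        ys.add(bounds[1]); ys.add(bounds[3]);
        for (int[] r : rects) { ys.add(r[1]); ys.add(r[3]); }

        List<int[]> out = new ArrayList<>();
        Integer y1 = ys.first();
        for (Integer y2 : ys.tailSet(y1, false)) {
            // Collect and sort the x-spans of rectangles covering this strip.
            List<int[]> spans = new ArrayList<>();
            for (int[] r : rects)
                if (r[1] <= y1 && r[3] >= y2) spans.add(new int[]{r[0], r[2]});
            spans.sort((a, b) -> a[0] - b[0]);

            // Emit the gaps between spans as inverse rectangles.
            int x = bounds[0];
            for (int[] s : spans) {
                if (s[0] > x) out.add(new int[]{x, y1, s[0], y2});
                x = Math.max(x, s[1]);
            }
            if (x < bounds[2]) out.add(new int[]{x, y1, bounds[2], y2});
            y1 = y2;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] bounds = {0, 0, 10, 10};
        int[][] rects = {{2, 2, 5, 5}};
        long area = 0;
        for (int[] r : inverse(bounds, rects)) area += (long) (r[2] - r[0]) * (r[3] - r[1]);
        System.out.println(area); // 100 total minus the 3x3 hole
    }
}
```

The optional merging pass mentioned earlier (combining rectangles that share an edge) would reduce the output count further; it is omitted here.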