Good afternoon.
My situation:
In two-dimensional space.
Input: a set of rectangles (overlapping rectangles too).
Rectangles coordinates are integer type.
There are not any constraints on rectangle-size and rectangle-location (only extent of integer).
No rectangles have width=0 or height=0.
I need to find: all rectangles that contain entered point (with integer coordinates).
Questions:
What is the efficient structure to keep rectangles?
What alghorithm is efficient in this case?
And what algorithm is efficient only for adding rectangles without removing?
Thanks :-).
R-Tree is the best data structure suitable for this use case. R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons. The information of all rectangles can be stored in tree form so searching will be easy
Wikipedia page, short ppt and the research paper will help you understand the concept.
In java you can use shape.contains
But in general, assuming a rectangle is defined by (x,y,width,height) you do
if (pt.x >= x && pt.x <= x + width && pt.y >= y && pt.y <= y + height) ...
If you have all your rectangles in a collection you can iterate over the collection and find the ones that contain the point
It seems that your set of rectangles may be dynamic ("...for adding rectangles..."). In this case - 2D Interval tree could be the solution.
Here is a simple solution.
Define a grid on your plane.
Each cell maintains two lists: the rectangles that completely cover this cell and the rectangles that partially cover this cell.
On each insertion, put the ID of the target rectangle into all the involved cell lists.
On each query, locate the cell that contains the target point, output the completely cover list and run a scan on the partially cover list.
A rectangle (left, top, right, bottom) contains a point (x, y) if left < x < right and top < y < bottom (assuming coordinates increase going downwards, which is the case with most hardware i've seen; if your coordinates increase going upwards, the more traditionally mathematical case, swap top and bottom). You're not going to get much more efficient than that for the test.
If you consider a rectangle as "containing" a point if it's on the border as well, then replace all the <s with <=.
As for what to do with the collection of rectangles...i dunno. I'd think a list sorted by the coordinates of the corners would do something, but i'm not really seeing much benefit to it...at most, you'd be cutting your list of stuff to check by half, on average (with the worst case still requiring checking everything). Half of a whole freaking lot can still be a whole lot. :)
Related
I am making a simple game and stumbled upon this problem. Assume several points in 2D space. What I want is to make points close to each other interact in some way.
Let me throw a picture here for better understanding of the problem:
Now, the problem isn't about computing the distance. I know how to do that.
At first I had around 10 points and I could simply check every combination, but as you can already assume, this is extremely inefficient with increasing number of points. What if I had a million of points in total, but all of them would be very distant to each other?
I'm trying to find a suitable data structure or a way to look at this problem, so every point can only mind their surrounding and not whole space. Are there any known algorithms for this? I don't exactly know how to name this problem so I can google exactly what I want.
If you don't know of such known algorighm, all ideas are very welcome.
This is a range searching problem. More specifically - the 2-d circular range reporting problem.
Quoting from "Solving Query-Retrieval Problems by Compacting Voronoi Diagrams" [Aggarwal, Hansen, Leighton, 1990]:
Input: A set P of n points in the Euclidean plane E²
Query: Find all points of P contained in a disk in E² with radius r centered at q.
The best results were obtained in "Optimal Halfspace Range Reporting in Three Dimensions" [Afshani, Chan, 2009]. Their method requires O(n) space data structure that supports queries in O(log n + k) worst-case time. The structure can be preprocessed by a randomized algorithm that runs in O(n log n) expected time. (n is the number of input points, and k in the number of output points).
The CGAL library supports circular range search queries. See here.
You're still going to have to iterate through every point, but there are two optimizations you can perform:
1) You can eliminate obvious points by checking if x1 < radius and if y1 < radius (like Brent already mentioned in another answer).
2) Instead of calculating the distance, you can calculate the square of the distance and compare it to the square of the allowed radius. This saves you from performing expensive square root calculations.
This is probably the best performance you're gonna get.
This looks like a nearest neighbor problem. You should be using the kd tree for storing the points.
https://en.wikipedia.org/wiki/K-d_tree
Space partitioning is what you want.. https://en.wikipedia.org/wiki/Quadtree
If you could get those points to be sorted by x and y values, then you could quickly pick out those points (binary search?) which are within a box of the central point: x +- r, y +- r. Once you have that subset of points, then you can use the distance formula to see if they are within the radius.
I assume you have a minimum and maximum X and Y coordinate? If so how about this.
Call our radius R, Xmax-Xmin X, and Ymax-Ymin Y.
Have a 2D matrix of [X/R, Y/R] of double-linked lists. Put each dot structure on the correct linked list.
To find dots you need to interact with, you only need check your cell plus your 8 neighbors.
Example: if X and Y are 100 each, and R is 1, then put a dot at 43.2, 77.1 in cell [43,77]. You'll check cells [42,76] [43,76] [44,76] [42,77] [43,77] [44,77] [42,78] [43,78] [44,78] for matches. Note that not all cells in your own box will match (for instance 43.9,77.9 is in the same list but more than 1 unit distant), and you'll always need to check all 8 neighbors.
As dots move (it sounds like they'd move?) you'd simply unlink them (fast and easy with a double-link list) and relink in their new location. Moving any dot is O(1). Moving them all is O(n).
If that array size gives too many cells, you can make bigger cells with the same algo and probably same code; just be prepared for fewer candidate dots to actually be close enough. For instance if R=1 and the map is a million times R by a million times R, you wouldn't be able to make a 2D array that big. Better perhaps to have each cell be 1000 units wide? As long as density was low, the same code as before would probably work: check each dot only against other dots in this cell plus the neighboring 8 cells. Just be prepared for more candidates failing to be within R.
If some cells will have a lot of dots, each cell having a linked list, perhaps the cell should have an red-black tree indexed by X coordinate? Even in the same cell the vast majority of other cell members will be too far away so just traverse the tree from X-R to X+R. Rather than loop over all dots, and go diving into each one's tree, perhaps you could instead iterate through the tree looking for X coords within R and if/when you find them calculate the distance. As you traverse one cell's tree from low to high X, you need only check the neighboring cell to the left's tree while in the first R entries.
You could also go to cells smaller than R. You'd have fewer candidates that fail to be close enough. For instance with R/2, you'd check 25 link lists instead of 9, but have on average (if randomly distributed) 25/36ths as many dots to check. That might be a minor gain.
Given a set of rectangles represented as tuples (xmin, xmax, ymin, ymax) where xmin and xmax are the left and right edges, and ymin and ymax are the bottom and top edges, respectively - is there any pair of overlapping rectangles in the set?
A straightforward approach is to compare every pair of rectangles for overlap, but this is O(n^2) - it should be possible to do better.
Update: xmin, xmax, ymin, ymax are integers. So a condition for rectangle 1 and rectangle 2 to overlap is xmin_2 <= xmax_1 AND xmax_2 >= xmin_1; similarly for the Y coordinates.
If one rectangle contains another, the pair is considered overlapping.
You can do it in O(N log N) approach the following way.
Firstly, "squeeze" your y coordinates. That is, sort all y coordinates (tops and bottoms) together in one array, and then replace coordinates in your rectangle description by its index in a sorted array. Now you have all y's being integers from 0 to 2n-1, and the answer to your problem did not change (in case you have equal y's, see below).
Now you can divide the plane into 2n-1 stripes, each unit height, and each rectangle spans completely several of them. Prepare an segment tree for these stripes. (See this link for segment tree overview.)
Then, sort all x-coordinates in question (both left and right boundaries) in the same array, keeping for each coordinate the information from which rectangle it comes and whether this is a left or right boundary.
Then go through this list, and as you go, maintain list of all the rectangles that are currently "active", that is, for which you have seen a left boundary but not right boundary yet.
More exactly, in your segment tree you need to keep for each stripe how many active rectangles cover it. When you encounter a left boundary, you need to add 1 for all stripes between a corresponding rectangle's bottom and top. When you encounter a right boundary, you need to subtract one. Both addition and subtraction can be done in O(log N) using the mass update (lazy propagation) of the segment tree.
And to actually check what you need, when you meet a left boundary, before adding 1, check, whether there is at least one stripe between bottom and top that has non-zero coverage. This can be done in O(log N) by performing a sum on interval query in segment tree. If the sum on this interval is greater than 0, then you have an intersection.
squeeze y's
sort all x's
t = segment tree on 2n-1 cells
for all x's
r = rectangle for which this x is
if this is left boundary
if t.sum(r.bottom, r.top-1)>0 // O(log N) request
you have occurence
t.add(r.bottom, r.top-1, 1) // O(log N) request
else
t.subtract(r.bottom, r.top-1) // O(log N) request
You should implement it carefully taking into account whether you consider a touch to be an intersection or not, and this will affect your treatment of equal numbers. If you consider touch an intersection, then all you need to do is, when sorting y's, make sure that of all points with equal coordinates all tops go after all bottoms, and similarly when you sort x's, make sure that of all equal x's all lefts go before all rights.
Why don't you try a plane sweep algorithm? Plane sweep is a design paradigm widely used in computational geometry, so it has the advantage that it is well studied and a lot of documetation is available online. Take a look at this. The line segment intersection problem should give you some ideas, also the area of union of rectangles.
Read about Bentley-Ottman algorithm for line segment intersection, the problem is very similar to yours and it has O((n+k)logn) where k is the number of intersections, nevertheless, since your rectangles sides are parallel to the x and y axis, it is way more simpler so you can modify Bentley-Ottman to run in O(nlogn +k) since you won't need to update the event heap, since all intersections can be detected once the rectangle is visited and won't modify the sweep line ordering, so no need to mantain the events. To retrieve all intersecting rectangles with the new rectangle I suggest using a range tree on the ymin and ymax for each rectangle, it will give you all points lying in the interval defined by the ymin and ymax of the new rectangle and thus the rectangles intersecting it.
If you need more details you should take a look at chapter two of M. de Berg, et. al Computational Geometry book. Also take a look at this paper, they show how to find all intersections between convex polygons in O(nlogn + k), it might prove simpler than my above suggestion since all data strcutures are explained there and your rectangles are convex, a very good thing in this case.
You can do better by building a new list of rectangles that do not overlap. From the set of rectangles, take the first one and add it to the list. It obviously does not overlap with any others because it is the only one in the list. Take the next one from the set and see if it overlaps with the first one in the list. If it does, return true; otherwise, add it to the list. Repeat for all rectangles in the set.
Each time, you are comparing rectangle r with the r-1 rectangles in the list. This can be done in O(n*(n-1)/2) or O((n^2-n)/2). You can even apply this algorithm to the original set without having to create a new list.
Given the coordinates of N rectangles (N<=100.000) in the grid L*C (L and C can range from 0 to 1.000.000.000) I want to know what is the maximum number of rectangle overlapping at any point in the grid.
So I figured I would use a sweeping algorithm, for each event (opening or ending of a rectangle) sorted by x value, I add or remove an interval to my structure.
I have to use a tree to maintain the maximum overlapping of the intervals, and be able to add and remove an interval.
I know how to do that when the values of the intervals (start and end) are ranging from 0 to 100.000, but it is impossible here since the dimensions of the plane are from 0 to 1.000.000.000. How can I implement such a tree?
If you know the coordinates of all the rectangles up-front, you can use "coordinate compression".
Since you only have 10^5 rectangles, that means you have at most 2*10^5 different x and y coordinates. You can therefore create a mapping from those coordinates to natural numbers from 1 to 2*10^5 (by simply sorting the coordinates). Then you can just use the normal tree that you already know for the new coordinates.
This would be enough to get the number of rectangles, but if you also need the point where they overlap, you should also maintain a reverse mapping so you can get back to the real coordinates of the rectangles. In the general case, the answer will be a rectangle, not just a single point.
Use an interval tree. Your case is a bit more complicated because you really need a weighted interval tree, where the weight is the number of open rectangles for that interval.
Input: a set of rectangles within the area (0, 0) to (1600, 1200).
Output: a point which none of the rectangles contains.
What's an efficient algorithm for this? The only two I can currently think of are:
Create a 1600x1200 array of booleans. Iterate through the area of each rectangle, marking those bits as True. Iterate at the end and find a False bit. Problem is that it wastes memory and can be slow.
Iterate randomly through points. For each point, iterate through the rectangles and see if any of them contain the point. Return the first point that none of the rectangles contain. Problem is that it is really slow for densely populated problem instances.
Why am I doing this? It's not for homework or for a programming competition, although I think that a more complicated version of this question was asked at one (each rectangle had a 'color', and you had to output the color of a few points they gave you). I'm just trying to programmatically disable the second monitor on Windows, and I'm running into problems with a more sane approach. So my goal is to find an unoccupied spot on the desktop, then simulate a right-click, then simulate all the clicks necessary to disable it from the display properties window.
For each rectangle, create a list of runs along the horizontal direction. For example a rectangle of 100x50 will generate 50 runs of 100. Write these with their left-most X coordinate and Y coordinate to a list or map.
Sort the list, Y first then X.
Go through the list. Overlapping runs should be adjacent, so you can merge them.
When you find the first run that doesn't stretch across the whole screen, you're done.
I would allocate an image with my favorite graphics library, and let it do rectangle drawing.
You can try a low res version first (scale down a factor 8), that will work if there is at least a 15x15 area. If it fails, you can try a high res.
Use Windows HRGNs (Region in .net). They were kind of invented for this. But that's not language agnostic no.
Finally you can do rectangle subtraction. Only problem is that you can get up to 4 rectangles each time you subtract one rect from another. If there are lots of small ones, this can get out of hand.
P.S.: Consider optimizing for maximized windows. Then you can tell there are no pixels visible without hit testing.
Sort all X-coordinates (start and ends of rectangles), plus 0 & 1600, remove duplicates. Denote this Xi (0 <= i <= n).
Sort all Y-coordinates (start and ends of rectangles), plus 0 & 1200, remove duplicates. Denote this Yj (0 <= j <= m).
Make a n * m grid with the given Xi and Yj from the previous points, this should be much smaller than the original 1600x1200 one (unless you have a thousand rectangles, in which case this idea doesn't apply). Each point in this grid maps to a rectangle in the original 1600 x 1200 image.
Paint rectangles in this grid: find the coordinates of the rectangles in the sets from the first steps, paint in the grid. Each rectangle will be on the form (Xi1, Yj1, Xi2, Yj2), so you paint in the small grid all points (x, y) such that i1 <= x < i2 && j1 <= y < j2.
Find the first unpainted cell in the grid, take any point from it, the center for example.
Note: Rectangles are assumed to be on the form: (x1, y1, x2, y2), representing all points (x, y) such that x1 <= x < x2 && y1 <= y < y2.
Nore2: The sets of Xi & Yj may be stored in a sorted array or tree for O(log n) access. If the number of rectangles is big.
If you know the minimum x and y dimensions of the rectangles, you can use the first approach (a 2D array of booleans) using fewer pixels.
Take into account that 1600x1200 is less than 2M pixels. Is that really so much memory? If you use a bitvector, you only need 235k.
You first idea is not so bad... you should just change the representation of the data.
You may be interessed in a sparse array of booleans.
A language dependant solution is to use the Area (Java).
If I had to do this myself, I'd probably go for the 2d array of booleans (particularly downscaled as jdv suggests, or using accelerated graphics routines) or the random point approach.
If you really wanted to do a more clever approach, though, you can just consider rectangles. Start with a rectangle with corners (0,0),(1600,1200) = (lx,ly),(rx,ry) and "subtract" the first window (wx1,wy1)(wx2,wy2).
This can generate at most 4 new "still available" rectangles if it is completely contained within the original free rectangle: (eg, all 4 corners of the new window are contained within the old one) they are (lx,ly)-(rx,wy1), (lx,wy1)-(wx1,wy2), (wx2,wy1)-(rx,wy2), and (lx,wy2)-(rx,ry). If just a corner of the window overlaps (only 1 corner is inside the free rectangle), it breaks it into two new rectangles; if a side (2 corners) juts in it breaks it into 3; and if there's no overlap, nothing changes. (If they're all axes aligned, you can't have 3 corners inside).
So then keep looping through the windows, testing for intersection and sub-dividing rectangles, until you have a list (if any) of all remaining free space in terms of rectangles.
This is probably going to be slower than any of the graphics-library powered approaches above, but it'd be more fun to write :)
Keep a list of rectangles that represent uncovered space. Initialize it to the entire area.
For each of the given rectangles
For each rectangle in uncovered space
If they intersect, divide the uncovered space into smaller rectangles around the covering rectangle, and add the smaller rectangles (if any) to your list of uncovered ones.
If your list of uncovered space still has any entries, they contain all points not covered by the given rectangles.
This doesn't depend on the number of pixels in your area, so it will work for large (or infinite) resolution. Each new rectangle in the uncovered list will have corners at unique intersections of pairs of other rectangles, so there will be at most O(n^2) in the list, giving a total runtime of O(n^3). You can make it more efficient by keeping your list of uncovered rectangles an a better structure to check each covering rectangle against.
This is a simple solution with a 1600+1200 space complexity only, it is similar in concept to creating a 1600x1200 matrix but without using a whole matrix:
Start with two boolean arrays W[1600] and H[1200] set to true.
Then for each visible window rectangle with coordinate ranges w1..w2 and h1..h2, mark W[w1..w2] and H[h1..h2] to false.
To check if a point with coordinates (w, h) falls in an empty space just check that
(W[w] && H[h]) == true
I am looking for pointers to the solution of the following problem: I have a set of rectangles, whose height is known and x-positions also and I want to pack them in the more compact form. With a little drawing (where all rectangles are of the same width, but the width may vary in real life), i would like, instead of.
-r1-
-r2--
-r3--
-r4-
-r5--
something like.
-r1- -r3--
-r2-- -r4-
-r5--
All hints will be appreciated. I am not necessarily looking for "the" best solution.
Your problem is a simpler variant, but you might get some tips reading about heuristics developed for the "binpacking" problem. There has been a lot written about this, but this page is a good start.
Topcoder had a competition to solve the 3D version of this problem. The winner discussed his approach here, it might be an interesting read for you.
Are the rectangles all of the same height? If they are, and the problem is just which row to put each rectangle in, then the problem boils down to a series of constraints over all pairs of rectangles (X,Y) of the form "rectangle X cannot be in the same row as rectangle Y" when rectangle X overlaps in the x-direction with rectangle Y.
A 'greedy' algorithm for this sorts the rectangles from left to right, then assigns each rectangle in turn to the lowest-numbered row in which it fits. Because the rectangles are being processed from left to right, one only needs to worry about whether the left hand edge of the current rectangle will overlap any other rectangles, which simplifies the overlap detection algorithm somewhat.
I can't prove that this is gives the optimal solution, but on the other hand can't think of any counterexamples offhand either. Anyone?
Something like this?
Sort your collection of rectangles by x-position
write a method that checks which rectangles are present on a certain interval of the x-axis
Collection<Rectangle> overlaps (int startx, int endx, Collection<Rectangle> rects){
...
}
loop over the collection of rectangles
Collection<Rectangle> toDraw;
Collection<Rectangle> drawn;
foreach (Rectangle r in toDraw){
Collection<Rectangle> overlapping = overlaps (r.x, r.x+r.width, drawn);
int y = 0;
foreach(Rectangle overlapRect in overlapping){
y += overlapRect.height;
}
drawRectangle(y, Rectangle);
drawn.add(r);
}
Put a tetris-like game into you website. Generate the blocks that fall and the size of the play area based on your paramters. Award points to players based on the compactness (less free space = more points) of their design. Get your website visitors to perform the work for you.
I had worked on a problem like this before. The most intuitive picture is probably one where the large rectangles are on the bottom, and the smaller ones are on top, kinda like putting them all in a container and shaking it so the heavy ones fall to the bottom. So to accomplish this, first sort your array in order of decreasing area (or width) -- we will process the large items first and build the picture ground up.
Now the problem is to assign y-coordinates to a set of rectangles whose x-coordinates are given, if I understand you correctly.
Iterate over your array of rectangles. For each rectangle, initialize the rectangle's y-coordinate to 0. Then loop by increasing this rectangle's y-coordinate until it does not intersect with any of the previously placed rectangles (you need to keep track of which rectangles have been previously placed). Commit to the y-coordinate you just found, and continue on to process the next rectangle.