Input:
A set of rectangles (overlapping rectangles too) and a set of point.
Coordinates are integer type.
Rectangle 's sides parallel to axis
Output:
All points inside any rectangles given
What is the efficient algorithm and data structure should I use ? Thanks.
You can use a sweep line algorithm: Sort the points by X coordinate. Introduce events when rectangles enter or leave the sweep line (the X coordinates of their left and right border). The rectangles currently intersecting the sweepline are a set of intervals when projected onto the sweep line, so they can be maintained using an interval tree or segment tree (the latter only after Y coordinate compression, but you can do that as a preprocessing step).
With that setup, for every point you just need to check whether it intersects one of the intervals maintained by your data structure.
Runtime: O((n+m) log (n+m))
2d segment tree (example here) is effective data structure to check if points are inside of any rectangle
The best idea I can come up with would be to check for every point (x,y) whether it is contained in any rectangle (l,t,w,h), yielding a runtime bound of O(nm) where n is the number of points and m is the number of rectangles.
Related
Problem
Given an occupancy grid, for example:
...................*
*...............*...
*..*.............*..
...........*........
....................
..*.......X.........
............*.*.*...
....*..........*....
...*........*.......
..............*.....
Where, * represents an occupied block, . represents a free block and X represents a point (or block) of interest, what is the most time-efficient algorithm to find the largest rectangle which includes X, but does not include any obstacles, i.e. any *?
For example, the solution to the provided grid would be:
.....######........*
*....######.....*...
*..*.######......*..
.....######*........
.....######.........
..*..#####X.........
.....######.*.*.*...
....*######....*....
...*.######.*.......
.....######...*.....
My Thoughts
Given we have a known starting point X, I can't help but think there must be a straightforwards solution to "snap" lines to the outer boundaries to create the largest rectangle.
My current thinking is to snap lines to the maximum position offsets (i.e. go to the next row or column until you encounter an obstacle) in a cyclic manner. E.g. you propagate a horizontal line from the point X down until there is a obstacle along that line, then you propagate a vertical line left until you encounter an obstacle, then a horizontal line up and a vertical line right. You repeat this starting at with one of the four moving lines to get four rectangles, and then you select the rectangle with the largest area. However, I do not know if this is optimal, nor the quickest approach.
This problem is a well-known one in Computational Geometry. A simplified version of this problem (without a query point) is briefly described here. The problem with query point can be formulated in the following way:
Let P be a set of n points in a fixed axis-parallel rectangle B in the plane. A P-empty rectangle (or just an empty rectangle for short) is any axis-parallel rectangle that is contained in
B and its interior does not contain any point of P. We consider the problem of preprocessing
P into a data structure so that, given a query point q, we can efficiently find the largest-area
P-empty rectangle containing q.
The paragraph above has been copied from this paper, where authors describe an algorithm and data structure for the set with N points in the plane, which allow to find a maximal empty rectangle for any query point in O(log^4(N)) time. Sorry to say, it's a theoretic paper, which doesn't contain any algorithm implementation details.
A possible approach could be to somehow (implicitly) rule out irrelevant occupied cells: those that are in the "shadow" of others with respect to the starting point:
0 1 X
01234567890123456789 →
0....................
1....................
2...*................
3...........*........
4....................
5..*.......X.........
6............*.......
7....*...............
8....................
9....................
↓ Y
Looking at this picture, you could state that
there are only 3 relevant xmin values for the rectangle: [3,4,5], each having an associated ymin and ymax, respectively [(3,6),(0,6),(0,9)]
there are only 3 relevant xmax values for the rectangle: [10,11,19], each having an associated ymin and ymax, respectively [(0,9),(4,9),(4,5)]
So the problem can be reduced to finding the rectangle with the highest area out of the 3x3 set of unique combinations of xmin and xmax values
If you take into account the preparation part of selecting relevant occupied cells, this has the complexity of O(occ_count), not taking into sorting if this would still be needed and with occ_count being the number of occupied cells.
Finding the best solution (in this case 3x3 combinations) would be O(min(C,R,occ_count)²). The min(C,R) includes that you could choose the 'transpose' the approach in case R<C, (which is actually true in this example) and that that the number of relevant xmins and xmaxs have the number of occupied cells as an upper limit.
Given a set of rectangles represented as tuples (xmin, xmax, ymin, ymax) where xmin and xmax are the left and right edges, and ymin and ymax are the bottom and top edges, respectively - is there any pair of overlapping rectangles in the set?
A straightforward approach is to compare every pair of rectangles for overlap, but this is O(n^2) - it should be possible to do better.
Update: xmin, xmax, ymin, ymax are integers. So a condition for rectangle 1 and rectangle 2 to overlap is xmin_2 <= xmax_1 AND xmax_2 >= xmin_1; similarly for the Y coordinates.
If one rectangle contains another, the pair is considered overlapping.
You can do it in O(N log N) approach the following way.
Firstly, "squeeze" your y coordinates. That is, sort all y coordinates (tops and bottoms) together in one array, and then replace coordinates in your rectangle description by its index in a sorted array. Now you have all y's being integers from 0 to 2n-1, and the answer to your problem did not change (in case you have equal y's, see below).
Now you can divide the plane into 2n-1 stripes, each unit height, and each rectangle spans completely several of them. Prepare an segment tree for these stripes. (See this link for segment tree overview.)
Then, sort all x-coordinates in question (both left and right boundaries) in the same array, keeping for each coordinate the information from which rectangle it comes and whether this is a left or right boundary.
Then go through this list, and as you go, maintain list of all the rectangles that are currently "active", that is, for which you have seen a left boundary but not right boundary yet.
More exactly, in your segment tree you need to keep for each stripe how many active rectangles cover it. When you encounter a left boundary, you need to add 1 for all stripes between a corresponding rectangle's bottom and top. When you encounter a right boundary, you need to subtract one. Both addition and subtraction can be done in O(log N) using the mass update (lazy propagation) of the segment tree.
And to actually check what you need, when you meet a left boundary, before adding 1, check, whether there is at least one stripe between bottom and top that has non-zero coverage. This can be done in O(log N) by performing a sum on interval query in segment tree. If the sum on this interval is greater than 0, then you have an intersection.
squeeze y's
sort all x's
t = segment tree on 2n-1 cells
for all x's
r = rectangle for which this x is
if this is left boundary
if t.sum(r.bottom, r.top-1)>0 // O(log N) request
you have occurence
t.add(r.bottom, r.top-1, 1) // O(log N) request
else
t.subtract(r.bottom, r.top-1) // O(log N) request
You should implement it carefully taking into account whether you consider a touch to be an intersection or not, and this will affect your treatment of equal numbers. If you consider touch an intersection, then all you need to do is, when sorting y's, make sure that of all points with equal coordinates all tops go after all bottoms, and similarly when you sort x's, make sure that of all equal x's all lefts go before all rights.
Why don't you try a plane sweep algorithm? Plane sweep is a design paradigm widely used in computational geometry, so it has the advantage that it is well studied and a lot of documetation is available online. Take a look at this. The line segment intersection problem should give you some ideas, also the area of union of rectangles.
Read about Bentley-Ottman algorithm for line segment intersection, the problem is very similar to yours and it has O((n+k)logn) where k is the number of intersections, nevertheless, since your rectangles sides are parallel to the x and y axis, it is way more simpler so you can modify Bentley-Ottman to run in O(nlogn +k) since you won't need to update the event heap, since all intersections can be detected once the rectangle is visited and won't modify the sweep line ordering, so no need to mantain the events. To retrieve all intersecting rectangles with the new rectangle I suggest using a range tree on the ymin and ymax for each rectangle, it will give you all points lying in the interval defined by the ymin and ymax of the new rectangle and thus the rectangles intersecting it.
If you need more details you should take a look at chapter two of M. de Berg, et. al Computational Geometry book. Also take a look at this paper, they show how to find all intersections between convex polygons in O(nlogn + k), it might prove simpler than my above suggestion since all data strcutures are explained there and your rectangles are convex, a very good thing in this case.
You can do better by building a new list of rectangles that do not overlap. From the set of rectangles, take the first one and add it to the list. It obviously does not overlap with any others because it is the only one in the list. Take the next one from the set and see if it overlaps with the first one in the list. If it does, return true; otherwise, add it to the list. Repeat for all rectangles in the set.
Each time, you are comparing rectangle r with the r-1 rectangles in the list. This can be done in O(n*(n-1)/2) or O((n^2-n)/2). You can even apply this algorithm to the original set without having to create a new list.
I have googled about this topic a bit and found this on http://www.geeksforgeeks.org/
Interval tree is mainly a geometric data structure and often used for windowing queries, for instance, to find all roads on a computerized map inside a rectangular viewport, or to find all visible elements inside a three-dimensional scene.
Now my question is actually in two parts:
How is interval tree used to find all roads on a computerized map?
What are some other example of practical applications of interval tree?
P.S: Brief explanations with reference to more reading materials on interval tree will be more than welcomed
In the windowing query, given a set of line segments and an axis-aligned rectangle, we have to find the intersections of the line segments with the rectangle. This can be solved by using Interval Trees in combination with Range Trees.
Range Trees are an efficient data structure for finding the set of points present within a Range/Rectangle. So they can be used to find all the line segments such that one of the end points of each line segment is present in the query Rectangle. These correspond to the blue line segments in the figure below.
Interval Trees can be used to find those segments that overlap with the window but whose endpoints are outside the window. These are the red segments in the figure.
Before solving this problem, imagine a more restricted version of the problem in which all line segments are horizontal or vertical. In this case any horizontal segment that intersects the rectangle should intersect the left (and right) vertical edge of the rectangle. If we think of the horizontal segments as intervals and the vertical edge of the rectangle as a point, the problem is to find all the intervals that contain the point. Thus we can solve this problem using interval trees. Similarly we can find all intersecting vertical line segments.
The general version of the problem where line segments aren't parallel to axis cannot be solved perfectly using interval trees. However we can use interval trees to eliminate the overwhelming majority of the line segments that don't overlap with the query rectangle. For each line segment in the input, we construct an axis-parallel rectangle whose diagonal is the line segment. We then construct (horizontal and vertical) interval trees using the sides of the rectangles. Given a query rectangle R, we can first find all rectangles that intersect R as before. The corresponding line segments have a high chance of intersecting with R and can be checked individually for actual intersection.
Maybe not directly answer your question but I think it might helpful:
Enclosing Interval Searching Problem:
Given a set S of n intervals and a query point, q, report all those intervals containing q.
Overlapping Interval Searching Problem:
Given a set S of n intervals and a query interval Q, report all those intervals in S overlapping Q.
Reference (also compare with other similar data structure like segement tree): http://www.iis.sinica.edu.tw/~dtlee/dtlee/CRCbook_chapter18.pdf
Given the coordinates of N rectangles (N<=100.000) in the grid L*C (L and C can range from 0 to 1.000.000.000) I want to know what is the maximum number of rectangle overlapping at any point in the grid.
So I figured I would use a sweeping algorithm, for each event (opening or ending of a rectangle) sorted by x value, I add or remove an interval to my structure.
I have to use a tree to maintain the maximum overlapping of the intervals, and be able to add and remove an interval.
I know how to do that when the values of the intervals (start and end) are ranging from 0 to 100.000, but it is impossible here since the dimensions of the plane are from 0 to 1.000.000.000. How can I implement such a tree?
If you know the coordinates of all the rectangles up-front, you can use "coordinate compression".
Since you only have 10^5 rectangles, that means you have at most 2*10^5 different x and y coordinates. You can therefore create a mapping from those coordinates to natural numbers from 1 to 2*10^5 (by simply sorting the coordinates). Then you can just use the normal tree that you already know for the new coordinates.
This would be enough to get the number of rectangles, but if you also need the point where they overlap, you should also maintain a reverse mapping so you can get back to the real coordinates of the rectangles. In the general case, the answer will be a rectangle, not just a single point.
Use an interval tree. Your case is a bit more complicated because you really need a weighted interval tree, where the weight is the number of open rectangles for that interval.
Say I have a vector polygon with holes. I need to flood fill it by drawing connected segments. Of course, since there are holes, I can't fill it using a single continous polyline: I'll need to interrupt my path sometimes, then move to an area which was skipped and start another polyline there.
My goal is to find a set of polylines needed to fill the whole polygon. Better if I can find the smallest set (that is, the way I can fill the polygon with the minimum number of interruptions).
Bonus question: how could I do that for partial density fills? Say, I don't want to fill at 100% density but I want a 50% (this will require that fill lines, supposing they're parallel each other and have a single-unit width, are put at a distance of two units).
I couldn't find a similar question here, although there are many related to flood-fill algorithms.
Any ideas or pointers?
Update: this picture from Wikipedia shows a good hypotetical flood path. I believe I could do that using a bitmap. However I've got a vector polygon. Should I rasterize it?
I'm assuming here that the distance between lines is 1 unit.
A crude implementation, with no guarantee to find the minimum number of polyline, is:
Start with an empty set of polylines.
Determine minx and maxx of the polygon.
Loop x from xmin to xmax, with a step of 1. Line L is the vertical line at x.
Intersect vertical line L with your polygon (quick algorithm, easy to find). That will give you a set of segments: {(x,y1)-(x,y2)}.
For all polylines, and all segments, merge segment + end of polylines (see note 1 below). When you merge a segment and a polyline, append a small stretch at the end of the polyline (to joint it to the segment), and the segment itself. For all segments that you can't merge using that, add a new polyline in the global set.
At the end, try to merge again polylines if possible (ends close together).
Optimal algorithm for merging new segments to existing polylines should be easy to find (hashing on y), or a brute force algorithm may suffice:
number of new segments per line scan should not be too high if your polygons do not have zillions of holes,
number of global polylines at every step should not be too large,
you compare only with the end segment of each polylines, not the whole of it.
Added note (1): To cover the case where your polygon has nearly-vertical edges, the merge process should not look only at y-delta, but allow a merge if any two y range overlaps (that means end of polyline y-range overlap segment y-range).