Find the rectangle with the smallest area that can hold another rectangle - algorithm

Assume that I have a set of rectangles (with different or same dimensions).
The task is to find (and remove) the rectangle from the set that is larger than or equal to a given rectangle.
It should also be the smallest rectangle in the set that can encompass the given rectangle.
This is easily solved in O(n) time by doing a linear search / update, but is it possible to achieve better results?
O(log n) would be optimal I'd assume.
Insert and removal must also be faster than O(n) for this to be of any use in my case.
Can any shortcuts be made by not finding the optimal rectangle, but rather relaxing the 2nd restriction to:
"It should also be one of the smallest rectangles that can encompass the given rectangle"?
I was thinking along the lines of using a Z-order curve (of the width/height) as a one-dimensional index and combining that with a tree.
Would that work? Or would there be too much waste?
Another approach would be to use a tree keyed on one axis, and then test the other axis linearly.
Anyone done something similar and can share their experience?

Here's an idea which is not fully elaborated yet:
Maybe you could use a fourfold-branched tree with 2-tuple values (height and width) each representing one rectangle.
One node (w, h) has 4 child-nodes:
(<w, <h) - contains rects which have smaller width and smaller height
(>=w, <h) - contains rects which have greater width and smaller height
(<w, >=h) - contains rects which have smaller width and greater height
(>=w, >=h) - contains rects which have greater width and greater height
When you descend at a (w, h) rect node to look for a container for your (w2, h2) rect there are 4 different cases now:
w2<w and h2<h - four options: (<w, <h), (>=w, <h), (<w, >=h), (>=w, >=h), since a rect in (<w, <h) can still be at least w2 wide and h2 high
w2>=w and h2<h - two options: (>=w, <h), (>=w, >=h)
w2<w and h2>=h - two options: (<w, >=h), (>=w, >=h)
w2>=w and h2>=h - one option: (>=w, >=h)
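You would have to descend into all possible branches, which is usually still better than O(n), although in the worst case (a very small query rect) most of the tree is visited.
Inserting is O(log n) as long as the tree stays reasonably balanced.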
Not sure about deleting and balancing yet. But I am almost certain there is a solution for that as well.
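The four-way tree described above can be sketched as follows. This is only a minimal illustration, not a finished data structure: the names `Node`, `insert` and `find_smallest_container` are made up for this example, "smallest" is assumed to mean smallest area, and there is no deletion or balancing. A child is skipped only when every rect in it is guaranteed to be too narrow or too low.

```python
class Node:
    """One stored rectangle (w, h) with four children."""
    def __init__(self, w, h):
        self.w, self.h = w, h
        # child index = (w >= self.w) + 2 * (h >= self.h)
        self.children = [None] * 4

def insert(root, w, h):
    """Insert a rectangle and return the (possibly new) root."""
    if root is None:
        return Node(w, h)
    node = root
    while True:
        i = (w >= node.w) + 2 * (h >= node.h)
        if node.children[i] is None:
            node.children[i] = Node(w, h)
            return root
        node = node.children[i]

def find_smallest_container(root, w2, h2):
    """Return the (w, h) of the smallest-area stored rect that holds (w2, h2)."""
    best = None
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        if node.w >= w2 and node.h >= h2:
            if best is None or node.w * node.h < best[0] * best[1]:
                best = (node.w, node.h)
        for i, child in enumerate(node.children):
            # A "<w" child can only help if the query is narrower than this node,
            # and likewise a "<h" child only if the query is lower than this node.
            wide_enough = (i & 1) or w2 < node.w
            tall_enough = (i & 2) or h2 < node.h
            if wide_enough and tall_enough:
                stack.append(child)
    return best

root = None
for w, h in [(5, 5), (3, 8), (6, 4), (4, 4), (10, 10)]:
    root = insert(root, w, h)
print(find_smallest_container(root, 4, 3))
```

Removing the found rectangle would need either a tombstone flag on the node or re-inserting its subtree, which is left out here.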

Related

Compute the number of points inside a circle with given radius for all points

My input will be a list of x and y coordinates, those indicate the pixels on the picture. I will have a given radius r. For each pixel, I need to compute how many other pixels are inside the circle within the radius r. And I have to do this for all the points I have.
I understand that the brute-force way to do this is to compare every point against all other points and check whether (x1-x2)^2 + (y1-y2)^2 <= r^2. The complexity would be O(n^2). I am wondering if there's any approach that can reduce the complexity to O(n log n) or O(n)?
The mathematical answer is the area of the circle: Area = PI * r**2, though this includes partial pixels. To get the count of full pixels only, you will need to traverse one axis (x) and get the height along the other axis (y) at each point (rounded down). Do this for a quarter of the circle, then multiply by 4. I'm assuming the entire circle is in the picture.
I think the complexity is O(n) based on the circle radius.
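As a rough sketch of that column-by-column count, assuming an integer radius and treating pixels as integer offsets from the centre (the function name `pixels_in_circle` is made up here), it walks the whole x range instead of one quarter so that the axes are not counted twice:

```python
from math import isqrt

def pixels_in_circle(r):
    """Count integer offsets (x, y) with x*x + y*y <= r*r, for integer r >= 0."""
    total = 0
    for x in range(-r, r + 1):
        # Largest integer y with x*x + y*y <= r*r (rounded down).
        half_height = isqrt(r * r - x * x)
        total += 2 * half_height + 1  # y runs from -half_height to +half_height
    return total

print(pixels_in_circle(2))
```

Note that this counts every pixel position inside the circle, not the subset of positions that actually appear in the input list.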
You can divide the plane into a grid with each cell having a width equal to the diameter of the circle and assign your points into bins for each cell in the grid. You can implement this as a hashmap/dictionary where the key is the top-left corner of the grid cell and the value is a list of points in the cell. Iterate through your list of points, work out which grid cell it's in and add to the appropriate list.
A circle with radius r can overlap at most 4 grid cells. Now you can iterate over the point list again and for each point you will only need to check the points in the grid cells overlapping the circle centred on the point.
This will reduce the number of checks you need to do, but it is still O(n^2) in the pathological case where all points are within r of each other. To further optimize the algorithm, store a quadtree in each grid cell. Each node in the quadtree (the top node being the grid cell) stores a count of the number of points in the node and has up to 4 sub-nodes (empty nodes are not stored), the leaf nodes containing individual points. To test the points in a node against your point, first check if the entire node is within r of the point; if so, you can count all the points without checking them individually. If not, recurse into the sub-nodes that overlap the circle until you reach individual points to test or you get down to sub-nodes that lie entirely within or outside the circle.
The performance of these algorithms will be highly dependent on the distribution of your points.
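The grid-binning part of this answer (without the per-cell quadtree) might look like the sketch below. It assumes r > 0, uses the cell indices rather than the top-left corner as dictionary key, and the name `count_neighbors` is made up for this example.

```python
from collections import defaultdict
from math import floor

def count_neighbors(points, r):
    """For each point, count the other points within distance r (r > 0)."""
    cell = 2 * r  # cell width equals the circle diameter
    grid = defaultdict(list)
    for x, y in points:
        grid[(floor(x / cell), floor(y / cell))].append((x, y))

    r2 = r * r
    counts = []
    for x, y in points:
        # The circle around (x, y) touches at most 2 cells per axis.
        cx_lo, cx_hi = floor((x - r) / cell), floor((x + r) / cell)
        cy_lo, cy_hi = floor((y - r) / cell), floor((y + r) / cell)
        n = 0
        for cx in range(cx_lo, cx_hi + 1):
            for cy in range(cy_lo, cy_hi + 1):
                for px, py in grid.get((cx, cy), ()):
                    if (px - x) ** 2 + (py - y) ** 2 <= r2:
                        n += 1
        counts.append(n - 1)  # do not count the point itself
    return counts

print(count_neighbors([(0, 0), (1, 0), (0, 1), (10, 10)], 1))
```

As the answer says, this is still quadratic when all points fall into the same few cells.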
If the problem is to count the points at maximum distance R to every point of a list of N points, you can use a kD-tree for acceleration.
With some care you can achieve a running time O(N Log N) for the setup of the tree, then hope for a query time of O(Log N + K) for counting, where K is the average number of neighbors.
This still makes a total of O(N Log N + KN). :-(

Given a set of rectangles, do any overlap?

Given a set of rectangles represented as tuples (xmin, xmax, ymin, ymax) where xmin and xmax are the left and right edges, and ymin and ymax are the bottom and top edges, respectively - is there any pair of overlapping rectangles in the set?
A straightforward approach is to compare every pair of rectangles for overlap, but this is O(n^2) - it should be possible to do better.
Update: xmin, xmax, ymin, ymax are integers. So a condition for rectangle 1 and rectangle 2 to overlap is xmin_2 <= xmax_1 AND xmax_2 >= xmin_1; similarly for the Y coordinates.
If one rectangle contains another, the pair is considered overlapping.
You can do it in O(N log N) the following way.
Firstly, "squeeze" your y coordinates. That is, sort all y coordinates (tops and bottoms) together in one array, and then replace coordinates in your rectangle description by its index in a sorted array. Now you have all y's being integers from 0 to 2n-1, and the answer to your problem did not change (in case you have equal y's, see below).
Now you can divide the plane into 2n-1 stripes, each of unit height, and each rectangle completely spans several of them. Prepare a segment tree for these stripes. (See this link for segment tree overview.)
Then, sort all x-coordinates in question (both left and right boundaries) in the same array, keeping for each coordinate the information from which rectangle it comes and whether this is a left or right boundary.
Then go through this list, and as you go, maintain a list of all the rectangles that are currently "active", that is, for which you have seen a left boundary but not a right boundary yet.
More exactly, in your segment tree you need to keep for each stripe how many active rectangles cover it. When you encounter a left boundary, you need to add 1 for all stripes between a corresponding rectangle's bottom and top. When you encounter a right boundary, you need to subtract one. Both addition and subtraction can be done in O(log N) using the mass update (lazy propagation) of the segment tree.
And to actually check what you need, when you meet a left boundary, before adding 1, check, whether there is at least one stripe between bottom and top that has non-zero coverage. This can be done in O(log N) by performing a sum on interval query in segment tree. If the sum on this interval is greater than 0, then you have an intersection.
squeeze y's
sort all x's
t = segment tree on 2n-1 cells
for all x's
    r = rectangle for which this x is
    if this is left boundary
        if t.sum(r.bottom, r.top-1) > 0   // O(log N) request
            you have an occurrence
        t.add(r.bottom, r.top-1, 1)       // O(log N) request
    else
        t.subtract(r.bottom, r.top-1, 1)  // O(log N) request
You should implement it carefully taking into account whether you consider a touch to be an intersection or not, and this will affect your treatment of equal numbers. If you consider touch an intersection, then all you need to do is, when sorting y's, make sure that of all points with equal coordinates all tops go after all bottoms, and similarly when you sort x's, make sure that of all equal x's all lefts go before all rights.
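Below is a sketch of this sweep in Python, with a few deliberate deviations that are my assumptions rather than part of the answer: touching rectangles count as overlapping (matching the `<=` condition in the question), the tree covers the compressed y endpoints themselves instead of the stripes between them, and it answers a range-maximum query instead of a range sum. The names `MaxAddTree` and `any_overlap` are invented for this example.

```python
class MaxAddTree:
    """Segment tree with range add and range max over indices 0..n-1."""
    def __init__(self, n):
        self.n = n
        self.mx = [0] * (4 * n)    # max of subtree, including this node's pending add
        self.lazy = [0] * (4 * n)  # add applied to the whole subtree

    def add(self, lo, hi, v, node=1, l=0, r=None):
        if r is None:
            r = self.n - 1
        if hi < l or r < lo:
            return
        if lo <= l and r <= hi:
            self.mx[node] += v
            self.lazy[node] += v
            return
        m = (l + r) // 2
        self.add(lo, hi, v, 2 * node, l, m)
        self.add(lo, hi, v, 2 * node + 1, m + 1, r)
        self.mx[node] = self.lazy[node] + max(self.mx[2 * node], self.mx[2 * node + 1])

    def query(self, lo, hi, node=1, l=0, r=None):
        if r is None:
            r = self.n - 1
        if hi < l or r < lo:
            return 0  # coverage counts are never negative, so 0 is neutral
        if lo <= l and r <= hi:
            return self.mx[node]
        m = (l + r) // 2
        return self.lazy[node] + max(self.query(lo, hi, 2 * node, l, m),
                                     self.query(lo, hi, 2 * node + 1, m + 1, r))

def any_overlap(rects):
    """rects: list of (xmin, xmax, ymin, ymax). True if any two overlap or touch."""
    if not rects:
        return False
    ys = sorted({y for _, _, y0, y1 in rects for y in (y0, y1)})
    index = {y: i for i, y in enumerate(ys)}  # "squeeze" the y coordinates
    events = []
    for k, (x0, x1, _, _) in enumerate(rects):
        events.append((x0, 0, k))  # 0 = left boundary, sorts before rights at equal x
        events.append((x1, 1, k))  # 1 = right boundary
    events.sort()
    tree = MaxAddTree(len(ys))
    for _, kind, k in events:
        lo, hi = index[rects[k][2]], index[rects[k][3]]
        if kind == 0:
            if tree.query(lo, hi) > 0:
                return True
            tree.add(lo, hi, 1)
        else:
            tree.add(lo, hi, -1)
    return False

print(any_overlap([(0, 2, 0, 2), (1, 3, 1, 3)]))
```

If touching should not count as overlapping, the event order at equal x and the handling of equal y values would both have to change, as the answer points out.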
Why don't you try a plane sweep algorithm? Plane sweep is a design paradigm widely used in computational geometry, so it has the advantage that it is well studied and a lot of documentation is available online. Take a look at this. The line segment intersection problem should give you some ideas, as should the area of a union of rectangles.
Read about the Bentley-Ottmann algorithm for line segment intersection. The problem is very similar to yours, and the algorithm runs in O((n+k) log n), where k is the number of intersections. Since your rectangles' sides are parallel to the x and y axes, your case is much simpler, so you can modify Bentley-Ottmann to run in O(n log n + k): all intersections can be detected once a rectangle is visited and they don't modify the sweep-line ordering, so there is no need to update the event heap. To retrieve all rectangles intersecting the new rectangle, I suggest using a range tree on the ymin and ymax of each rectangle; it will give you all points lying in the interval defined by the ymin and ymax of the new rectangle, and thus the rectangles intersecting it.
If you need more details you should take a look at chapter two of the Computational Geometry book by M. de Berg et al. Also take a look at this paper; they show how to find all intersections between convex polygons in O(n log n + k). It might prove simpler than my above suggestion, since all data structures are explained there and your rectangles are convex, a very good thing in this case.
You can do somewhat better by building a new list of rectangles that do not overlap. From the set of rectangles, take the first one and add it to the list. It obviously does not overlap with any others because it is the only one in the list. Take the next one from the set and see if it overlaps with any rectangle already in the list. If it does, return true; otherwise, add it to the list. Repeat for all rectangles in the set.
Each time, you are comparing rectangle r with the r-1 rectangles in the list, for a total of at most n*(n-1)/2 comparisons. That is about half the work of comparing every ordered pair, but it is still O(n^2). You can even apply this algorithm to the original set without having to create a new list.

Minimum number of rectangles in shape made from rectangles?

I'm not sure if there's an algorithm that can solve this.
A given number of rectangles are placed side by side horizontally from left to right to form a shape. You are given the width and height of each.
How would you determine the minimum number of rectangles needed to cover the whole shape?
i.e., how would you redraw this shape using as few rectangles as possible?
I can only think of trying to squeeze in as many big rectangles as I can, but that seems inefficient.
Any ideas?
Edit:
You are given a number n , and then n sizes:
2
1 3
2 5
The above would have two rectangles of sizes 1x3 and 2x5 next to each other.
I'm wondering what the least number of rectangles is that I would need to recreate that shape, given that rectangles cannot overlap.
Since your rectangles are well aligned, it makes the problem easier. You can simply create rectangles from the bottom up. Each time you do that, it creates new shapes to check. The good thing is, all your new shapes will also be base-aligned, and you can just repeat as necessary.
First, you want to find the minimum height rectangle. Make a rectangle that height, with the width as total width for the shape. Cut that much off the bottom of the shape.
You'll be left with multiple shapes. For each one, do the same thing.
Finding the minimum height rectangle should be O(n). Since you do that for each group, worst case is all different heights. Totals out to O(n^2).
For example:
In the image, the minimum for each shape is highlighted green. The resulting rectangle is blue, to the right. The total number of rectangles needed is the total number of blue ones in the image, 7.
Note that I'm explaining this as if these were physical rectangles. In code, you can completely do away with the width, since it doesn't matter in the least unless you want to output the rectangles rather than just counting how many it takes.
You can also reduce the "make a rectangle and cut it from the shape" to simply subtracting the height from each rectangle that makes up that shape/subshape. Each contiguous section of shapes with +ve height after doing so will make up a new subshape.
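The height-subtraction version described above could be sketched like this. It only counts rectangles, so widths are ignored; the input is assumed to be the list of column heights from left to right, and the name `count_rectangles` is made up for this example.

```python
def count_rectangles(heights):
    """Count the rectangles produced by repeatedly cutting off the lowest layer."""
    if not heights:
        return 0
    lowest = min(heights)
    count = 1  # one rectangle spanning the full width of this (sub)shape
    run = []   # current contiguous section that still has positive height
    for h in heights:
        remaining = h - lowest
        if remaining > 0:
            run.append(remaining)
        elif run:
            count += count_rectangles(run)
            run = []
    if run:
        count += count_rectangles(run)
    return count

# The example from the question: a 1x3 next to a 2x5, read as heights 3 and 5.
print(count_rectangles([3, 5]))
```

This follows the bottom-up cutting strategy of the answer; it makes no attempt to check whether a different cutting order would need fewer rectangles.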
If you look for an overview on algorithms for the general problem, Rectangular Decomposition of Binary Images (article by Tomas Suk, Cyril Höschl, and Jan Flusser) might be helpful. It compares different approaches: row methods, quadtree, largest inscribed block, transformation- and graph-based methods.
A juicy figure (from page 11) as an appetizer:
Figure 5: (a) The binary convolution kernel used in the experiment. (b) Its 10 blocks of GBD decomposition.

Rectangle packing with constraints

I want to pack a set of rectangles (example):
So that the total height is as low as possible with the constraint that the rectangles must end up in the same column they started in. The rectangles are allowed to "move" through each other to reach the final state, as long as they don't intersect at the end.
Our current algorithm is to process the rectangles from largest height to smallest height, and put them at the lowest y position that's available. Is there a more optimal algorithm?
EDIT: I don't necessarily need the optimal solution, any algorithm that generates a better solution than the current one is interesting. Also, the number of rectangles is around 50.
Suppose you have N rectangles. For each rectangle i, let [a_i, b_i] be the horizontal span, and let h_i be the height.
Your solution space looks like y_i, i = 1, ..., N, where the vertical span of rectangle i is [y_i, y_i + h_i].
Without loss of generality, we can constrain y_i >= 0. We then want to minimize the objective function max{y_i + h_i | i}.
The constraints you have for non-overlapping rectangles are:
y_i + h_i <= y_j
OR
y_j + h_j <= y_i
for all i != j such that [a_i, b_i] and [a_j, b_j] intersect
Figuring out which [a_i, b_i] intersect with each other is easy, so figuring out for which pairs of rectangles to form these constraints should be straightforward.
To get rid of the OR in our constraint, we can create binary dummy variables z_k for each constraint k and a "Big M" M that is sufficiently large and rewrite:
y_i + h_i <= y_j + (z_k)M
y_j + h_j <= y_i + (1-z_k)M
for all i != j such that [a_i, b_i] and [a_j, b_j] intersect
We can introduce a dummy variable H and add the constraints y_i + h_i <= H so that we can rewrite the objective function as minimizing H.
The resulting optimization problem is:
minimize H
with respect to: y_i, z_k, H
subject to:
(1) y_i + h_i <= y_j + (z_k)M
    y_j + h_j <= y_i + (1-z_k)M
        for all i != j such that [a_i, b_i] and [a_j, b_j] intersect
(2) y_i + h_i <= H for all i
(3) y_i >= 0 for all i
(4) z_k in {0, 1} for all constraints of type (1) k
This is a mixed-integer linear optimization problem. There are general solvers that exist for this type of problem that you can apply directly.
Typically, they will perform tricks like relaxing the binary constraint on z_k to the constraint that z_k be in [0,1] during the algorithm, which turns this into a linear programming problem, which can be solved very efficiently.
I would not advise trying to reinvent those solvers.
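For tiny instances, the role of the binary variables z_k can be checked without a solver by simply trying every combination. The sketch below is not a replacement for a MILP solver and is hopeless for around 50 rectangles, since it enumerates 2^K orderings for K conflicting pairs. It assumes rectangles are given as (a, b, h) tuples and that spans which only share an endpoint do not conflict; the function names are made up for this example.

```python
from itertools import product

def spans_intersect(r1, r2):
    """True if the horizontal spans [a, b] share more than a boundary point."""
    return r1[0] < r2[1] and r2[0] < r1[1]

def lowest_height(rects, below):
    """Push every rect down as far as the chosen orderings allow; None if cyclic."""
    n = len(rects)
    y = [0] * n
    state = [0] * n  # 0 = unvisited, 1 = in progress, 2 = done

    def place(v):
        if state[v] == 1:
            return False  # contradictory ordering (a cycle)
        if state[v] == 2:
            return True
        state[v] = 1
        for u in below[v]:
            if not place(u):
                return False
            y[v] = max(y[v], y[u] + rects[u][2])  # y_u + h_u <= y_v
        state[v] = 2
        return True

    for v in range(n):
        if not place(v):
            return None
    return max((y[v] + rects[v][2] for v in range(n)), default=0)

def min_height_bruteforce(rects):
    """Try both values of z_k for every conflicting pair and keep the smallest H."""
    n = len(rects)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)
             if spans_intersect(rects[i], rects[j])]
    best = None
    for z in product((0, 1), repeat=len(pairs)):
        below = [[] for _ in range(n)]
        for (i, j), zk in zip(pairs, z):
            if zk == 0:
                below[j].append(i)  # rect i sits below rect j
            else:
                below[i].append(j)  # rect j sits below rect i
        height = lowest_height(rects, below)
        if height is not None and (best is None or height < best):
            best = height
    return best

print(min_height_bruteforce([(0, 2, 3), (1, 3, 2), (2, 4, 3)]))
```

Each choice of z fixes which rectangle of a conflicting pair is below the other; the remaining problem is then a longest-path computation, which is what `lowest_height` does.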
Given that rectangles can only move vertically, there would appear to be only two solutions: moving all rectangles as far upward as you can until a collision occurs, or moving them all downwards until a collision occurs. I have a sneaking suspicion that these solutions are going to be equivalent*. I can't think that there's a much more sophisticated notion of packing when you're constrained to one dimension. Perhaps I'm missing something?
*If I've understood your constraint correctly, the minimal height is going to always be the number of filled cells in the column with the largest number of filled cells. This doesn't vary whether the translation is applied upwards or downwards.
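The column figure mentioned in this footnote is cheap to compute. The sketch below assumes integer columns, with a rectangle (a, b, h) occupying columns a through b-1, and the name `column_lower_bound` is made up here. Treat the result as a lower bound on the packing height; whether it can always be reached is the claim being discussed, not something this code verifies.

```python
def column_lower_bound(rects):
    """Largest total rectangle height stacked over any single column."""
    load = {}
    for a, b, h in rects:
        for col in range(a, b):
            load[col] = load.get(col, 0) + h
    return max(load.values(), default=0)

print(column_lower_bound([(0, 2, 3), (1, 3, 2), (2, 4, 3)]))
```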
In my humble opinion, the first step is to calculate, for each column, the least required height. Using your picture as an example, the first column requires a height of at least 10, which is contributed by the red, green and small blue rectangles. This is easily done by iterating through every given rectangle and adding its height to each column it occupies. The maximum over all these "column heights" is what I call the "pillar". In your picture, the "pillar" is at columns 8:10 with a height of 14, contributed by rectangles 1, 2, 4 and 6 (numbered from bottom to top). This means the minimum height of the packing is at least the height of the "pillar", since the "pillar" columns are solidly filled and can't be reduced further. Stacking these four rectangles up forms this picture: (the non-pillar rectangles are not shown)
Then the pillar divides the picture into two parts, one is the region to the left of pillar and another on the other side. Also, the "non-pillar" rectangles (R3,5,7,8) are separately positioned to the two regions as well. R3,R7 on the LHS and R5,R8 on the RHS.
Now consider the left side part first. I rearranged the pillar rectangles as shown it in the picture (fig.3):
With the rearranged pillar rectangle stacking order, though I don't have a rigorous proof, it is highly likely that no matter what shapes and how many rectangles are positioned on the LHS of the pillar, all the given rectangles can fit in the empty space on the LHS (the constraint here is that these rectangles can't form a higher solid pillar, otherwise step 1 would already have detected it and used it as the actual pillar). This arrangement gives the empty space on the LHS the best "space consistency", which means the empty spaces created by the pillar rectangles are stacked in ascending order from the bottom up. This "consistency" lets the empty spaces created by the individual pillar rectangles "work together" and contain rectangles that are higher than any single empty space created by a single pillar rectangle. For example, the green rectangle in the next picture fits by using the empty space created by the blue and purple rectangles together.
Assuming the statements above are true, the rectangles positioned on the LHS will never make a higher stack than the pillar. However, if these rectangles require any cooperation between the empty spaces to fit in on the LHS, then they actually limit the swapping possibilities for the pillar rectangles. Using fig.3 as an example, the green rectangle requires the purple and blue to be neighbors in order to fit, but to get the best space consistency on the RHS, the magenta has to go between the purple and blue. This means the green on the LHS makes it impossible to get the best consistency for the RHS, and consequently rectangles positioned on the RHS may not fit in the empty space, causing a stack with holes that exceeds the height set by the pillar. Sorry that I can't devise such a case here, but it sure makes a difference.
In conclusion:
step 1 is to find the pillar, one easy answer can be found here if every given rectangle is involved in the pillar -- the height of the pillar is the minimum packing height.
step 2 is to examine both side to the pillar.
case a: If one side has no free rectangle positioned, then the other side can be easily filled with the "best consistency" method and resulting minimum packing height is again the pillar height.
case b: If one side doesn't require free space consistency, then that side can be filled and the other side still can use "the best consistency". For example: (your original picture)
In this case, the empty space require for fitting in R3 is solely created by R6 and the same for R7 and R2. Thus swapping the stacking order of R6 and R2 with other pillar rectangle won't make R3, R7 unfit if R3, R7 follow the swapping. Which can result in a "best consistency" case for RHS:
Then RHS can be filled with the RHS positioned rectangles without exceeding the pillar height.
Whether a side requires consistency can be identified by comparing the height of the free rectangle to be fitted with the height of the pillar rectangle that creates the free space for it. If the free rectangle's height is no larger than the pillar rectangle's, then it doesn't require a second pillar rectangle to get involved, which means it doesn't require free space consistency.
case c: Both sides need free space consistency. This is where troubles kick in. Take fig.3 as example again. The green in fig.3 had the purple and blue combined. This means the green, purple and blue is considered as a whole to swap stacking order with other pillar rectangles to get the LHS's free rectangle the best fit. And within this whole, the blue and purple can swap as well.
If the RHS can't make the fit, resulting in a packing height larger than the pillar height, then it is required to repeat step two, but fitting the RHS first and trying to fit the LHS after that. Then the lower of the two packing heights is taken as the final result. The logic for this case is unclear, and quite possibly there is a better alternative.
I know this should not really be called as a proper solution but rather random and loose thoughts, but it obviously won't fit in the comments. Forgive me for my clumsy explanation and poor picture handling. Hope this helps.

Partition a rectangle into near-squares of given areas

I have a set of N positive numbers, and a rectangle of dimensions X and Y that I need to partition into N smaller rectangles such that:
the surface area of each smaller rectangle is proportional to its corresponding number in the given set
all space of big rectangle is occupied and there is no leftover space between smaller rectangles
each small rectangle should be shaped as close to square as feasible
the execution time should be reasonably small
I need directions on this. Do you know of such an algorithm described on the web? Do you have any ideas (pseudo-code is fine)?
Thanks.
What you describe sounds like a treemap:
Treemaps display hierarchical (tree-structured) data as a set of nested rectangles. Each branch of the tree is given a rectangle, which is then tiled with smaller rectangles representing sub-branches. A leaf node's rectangle has an area proportional to a specified dimension on the data.
That Wikipedia page links to a page by Ben Shneiderman, which gives a nice overview and links to Java implementations:
Then while puzzling about this in the faculty lounge, I had the Aha! experience of splitting the screen into rectangles in alternating horizontal and vertical directions as you traverse down the levels. This recursive algorithm seemed attractive, but it took me a few days to convince myself that it would always work and to write a six line algorithm.
Wikipedia also links to "Squarified Treemaps" by Mark Bruls, Kees Huizing and Jarke J. van Wijk (PDF), which presents one possible algorithm:
How can we tesselate a rectangle recursively into rectangles, such that their aspect-ratios (e.g. max(height/width, width/height)) approach 1 as close as possible? The number of all possible tesselations is very large. This problem falls in the category of NP-hard problems. However, for our application we do not need the optimal solution, a good solution that can be computed in short time is required.
You do not mention any recursion in the question, so your situation might be just one level of the treemap; but since the algorithms work on one level at a time, this should be no problem.
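For a single level, a greedy row-filling layout in the spirit of the squarified treemap idea can be sketched as below. This is my own simplified reading, not code from the paper: values are assumed to be positive, they are processed from largest to smallest, each row is laid along the shorter side of the remaining space, and a value is added to the current row only while that does not worsen the worst aspect ratio in the row. The names `worst_ratio` and `squarify` are made up for this example.

```python
def worst_ratio(areas, side):
    """Worst aspect ratio in a row of the given areas laid along `side`."""
    s = sum(areas)
    return max(max(side * side * a / (s * s), s * s / (side * side * a))
               for a in areas)

def squarify(values, x, y, w, h):
    """Return one (x, y, width, height) per value, in input order."""
    scale = w * h / sum(values)  # area per unit of value
    order = sorted(range(len(values)), key=lambda k: values[k], reverse=True)
    out = [None] * len(values)
    i = 0
    while i < len(order):
        side = min(w, h)
        row = [order[i]]
        areas = [values[order[i]] * scale]
        i += 1
        # Grow the row while the worst aspect ratio does not get worse.
        while i < len(order):
            candidate = areas + [values[order[i]] * scale]
            if worst_ratio(candidate, side) > worst_ratio(areas, side):
                break
            row.append(order[i])
            areas = candidate
            i += 1
        thickness = sum(areas) / side
        vertical = w >= h  # row becomes a vertical strip on the left
        pos = y if vertical else x
        for k, a in zip(row, areas):
            length = a / thickness
            out[k] = (x, pos, thickness, length) if vertical else (pos, y, length, thickness)
            pos += length
        if vertical:
            x, w = x + thickness, w - thickness
        else:
            y, h = y + thickness, h - thickness
    return out

for rect in squarify([6, 6, 4, 3, 2, 2, 1], 0, 0, 6, 4):
    print(rect)
```

The result fills the whole rectangle with areas proportional to the input values; how close the pieces get to squares depends on the values.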
I have been working on something similar. I'm prioritizing simplicity over getting as similar aspect ratios as possible. This should (in theory) work. Tested it on paper for some values of N between 1 and 10.
N = total number of rects to create,
Q = max(width, height) / min(width, height),
R = N / Q
If Q > N/2, split the rect in N parts along its longest side.
If Q <= N/2, split the rect in R (rounded int) parts along its shortest side.
Then split the subrects in N/R (rounded down int) parts along its shortest side.
Subtract the rounded down value from the result of the next subrects division. Repeat for all subrects or until the required number of rects are created.

Resources