I have a set of rectangles and an arbitrary shape in 2D space. The shape is not necessarily a polygon (it may be a circle), and the rectangles have different widths and heights. The task is to approximate the shape with rectangles as closely as possible. I can't change the rectangles' dimensions, but rotation is permitted.
It sounds very similar to the packing problem and the covering problem, but the area to cover is not rectangular...
I guess it's an NP-hard problem, and I'm pretty sure there are papers with good heuristics for it, but I don't know what to Google. Where should I start?
Update: One idea just came to mind, but I'm not sure if it's worth investigating. What if we treat the bounding shape as a physical mold filled with water, and each rectangle as a positively charged particle with a size? Drop the smallest rectangle in first, then drop the next one by size at a random point. If rectangles get too close, they repel each other. Keep adding rectangles until all are used. Could this method work?
I think you could look for packing and automatic layout generation algorithms. Automatic VLSI layout generation algorithms might involve similar problems, as do textile layout questions...
The paper Hegedüs: Algorithms for covering polygons by rectangles seems to address a similar problem. And since it is from 1982, it might be interesting to look at the papers which cite it. Additionally, this meeting seems to be discussing related research problems, so it might be a starting point for keywords or for names of people who do research in this area.
I don't know whether computational geometry research has algorithms for your specific problem, or whether those algorithms are easy/practical enough to implement. Here is how I would approach it if I had to solve it without being able to look up previous work. This is just a direction, by far not a solution...
Formulate it as an optimization problem. You have discrete variables for which rectangles you choose (yes or no) and continuous variables (location and orientation of the rectangles). Now you can set up two optimizations: a discrete one which picks the rectangles, and a continuous one which optimizes location and orientation once the rectangles are given. Interleave these two optimizations. Of course, the difficulty lies in formulating the optimizations and designing your error energy so that it does not get stuck in strange configurations (local minima). I'd try to cast the continuous part as a least-squares problem so that standard optimization libraries can be used.
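Here is a minimal, heavily simplified sketch of the continuous half of that idea, assuming (purely for illustration) that the target shape is a circle of radius R and that the discrete step has already chosen which rectangles to use. The residuals below (corners sticking outside the circle plus a crude centre-distance overlap proxy) are placeholders I made up, not a recommended error energy; scipy's least_squares is just one convenient solver.

    import numpy as np
    from scipy.optimize import least_squares

    R = 10.0                                  # radius of the assumed target circle
    rects = [(4, 2), (3, 3), (5, 1), (2, 2)]  # rectangle sizes (w, h) chosen by the discrete step

    def corners(x, y, a, w, h):
        """Corners of a w-by-h rectangle centred at (x, y), rotated by angle a (radians)."""
        c, s = np.cos(a), np.sin(a)
        local = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
        rot = np.array([[c, -s], [s, c]])
        return local @ rot.T + np.array([x, y])

    def residuals(params, chosen):
        """Made-up error energy: corners outside the circle plus a crude 'too close' proxy."""
        res = []
        ps = params.reshape(-1, 3)            # one (x, y, angle) triple per rectangle
        for (x, y, a), (w, h) in zip(ps, chosen):
            for cx, cy in corners(x, y, a, w, h):
                res.append(max(0.0, np.hypot(cx, cy) - R))
        for i in range(len(chosen)):
            for j in range(i + 1, len(chosen)):
                d = np.hypot(ps[i, 0] - ps[j, 0], ps[i, 1] - ps[j, 1])
                min_d = (min(chosen[i]) + min(chosen[j])) / 2.0
                res.append(max(0.0, min_d - d))
        return res

    x0 = np.random.uniform(-R / 2, R / 2, size=3 * len(rects))   # random initial layout
    fit = least_squares(residuals, x0, args=(rects,))
    print(fit.x.reshape(-1, 3))               # optimised (x, y, angle) per rectangle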
I think this problem is well suited to a genetic algorithm and/or an evolution strategy. I've solved a similar box-packing problem with the help of an evolution strategy of some kind; check it out on my blog.
If you take that approach, encode each box's parameters into the chromosome:
x coordinate
y coordinate
angle
Then try to minimize a fitness function like:
    y = w1 * box_intersection_area +
        w2 * box_area_out_of_shape +
        w3 * average_circle_radius_in_free_space
Choose the weights w1, w2, w3 to reflect the relative importance of each factor. When the genetic algorithm finds a partial solution, remove the boxes that still overlap each other or lie outside the shape, and you will have at least a legal (though not necessarily optimal) solution.
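As an illustration only, here is a rough sketch of such a fitness function using shapely (my choice, not the answer's; any polygon library would do). The third term from the formula above (average circle radius in free space) is replaced here by the uncovered area of the shape, just to keep the sketch short.

    from shapely.geometry import Point, box
    from shapely.affinity import rotate, translate
    from shapely.ops import unary_union

    shape = Point(0, 0).buffer(10)            # target shape: a circle of radius 10

    def decode(chromosome, sizes):
        """chromosome = [x, y, angle, x, y, angle, ...], one triple (angle in degrees) per box."""
        boxes = []
        for (x, y, a), (w, h) in zip(zip(*[iter(chromosome)] * 3), sizes):
            b = box(-w / 2.0, -h / 2.0, w / 2.0, h / 2.0)
            boxes.append(translate(rotate(b, a), x, y))
        return boxes

    def fitness(chromosome, sizes, w1=1.0, w2=1.0, w3=0.2):
        boxes = decode(chromosome, sizes)
        overlap = sum(boxes[i].intersection(boxes[j]).area
                      for i in range(len(boxes)) for j in range(i + 1, len(boxes)))
        outside = sum(b.difference(shape).area for b in boxes)
        uncovered = shape.difference(unary_union(boxes)).area
        return w1 * overlap + w2 * outside + w3 * uncovered

    print(fitness([0, 0, 0, 5, 5, 30], [(6, 3), (4, 4)]))   # two boxes, second offset and rotated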
Good luck with this interesting problem!
It is indeed NP-hard, and since it has high-tech applications, reasonably efficient approximate strategies are not even in patents, let alone published papers.
The best you can do with a limited budget is to start by limiting the problem. Assume that all rectangles are exactly the same. Assume that rectangles which are binary subdivisions of your standard rectangle are also allowed, since you can efficiently pre-pack them to fit your core division. For extra points, you can also define several fixed schemas for gluing core rectangles together to cover a few larger shapes with substantially different proportions. Assume that you can change the dimensions of your standard rectangle/cell as long as the rest (the pre-packing and gluing schemas) remains the same; this gives you parameters for deciding the approximate size of the core rectangle based on the rectangles you are given.
Now you can play with aspect ratios to estimate the error such a limited system could guarantee. For the first iterations, assume it can have 50% error with a simple subdivision schema, then change the schema to reduce the error without increasing the asymptotic complexity of the pre-packing. At the end of the day you are always just assigning the given rectangles to your pre-calculated and now fixed grid and binary subdivisions; you are not trying to do layout or backtracking at all, and you are always happy with the first approximate fit into the grid.
Work on defining classes of rectangles that pack well with your schema. That again keeps the whole process inverted: you are never trying to actually fit whatever you are given; you are defining what you need to be given in order to fit it well, and you write off the rest as error, since this is an approximation.
Then you can try to do a bit more, but not much more: any slip into backtracking or chasing arbitrarily small error and the problem becomes exponential.
If you are at a research facility and can get some supercomputer time, run a set of exhaustive searches with pathological mixes just to see what optimal packings look like, and to see whether you can derive a few more subdivision schemas and/or classes of rectangle sets.
That should be enough for the first two years of research :-)
Related
Consider an image like this:
By grouping pixels by color into distinct rectangles, different configurations can be achieved, for example:
The goal is to find one of the best configurations, i.e. a configuration with the fewest possible rectangles (rectangle sizes are not important).
Any idea how to design an efficient algorithm that can solve this problem?
EDIT:
I think the best answer is the one by @dshin, as they showed that this problem is NP-hard, so there probably isn't any efficient solution that can guarantee an optimal result.
The other answers provide reasonable compromises that yield an acceptable solution, though not always the optimal one.
Each connected colored region is a rectilinear polygon that can be considered independently, so your problem amounts to solving the minimum rectangle cover problem for rectilinear polygons. This is a well-studied problem with applications in fields such as VLSI.
For convex rectilinear polygons, there is an algorithm that finds the optimal solution in polynomial time, described in this 1984 thesis.
The non-convex case is NP-hard (reference), so an efficient optimal solution likely does not exist. But there are several algorithms which produce good empirical results. This 1990 publication describes three separate algorithms, each of which is guaranteed to use at most twice as many rectangles as the optimal solution. This 2016 publication describes an algorithm that uses the common IP + LP relaxation technique, which apparently produces better results on real-life problem instances, although it lacks theoretical guarantees. Unfortunately, both publications are behind paywalls, and I haven't been able to find free resources that describe the algorithms.
If you are just looking for something reasonable, and your problem instances are not pathological in nature, then the algorithms described in other answers are probably good enough.
I don't have a proof, but my feeling is that a greedy approach should solve this problem:
Start at the upper left (or whichever corner you prefer)
Expand the rectangle 1px to the right as long as the colors match
Expand the rectangle 1px downward as long as all colors in that row match
Line by line and column by column, find the next pixel that is not already part of a rectangle (perhaps keeping track of visited pixels in a second array) and repeat steps 2 and 3.
You can swap rows and columns, or scan up and to the left instead, and end up with different configurations, but from playing this through in my mind I think the number of rectangles should always be the same.
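For what it's worth, here is a direct sketch of that greedy expansion, assuming the image is given as a 2D list of colour values:

    def greedy_rectangles(grid):
        h, w = len(grid), len(grid[0])
        used = [[False] * w for _ in range(h)]
        rects = []                                   # (top, left, height, width, colour)
        for top in range(h):
            for left in range(w):
                if used[top][left]:
                    continue
                colour = grid[top][left]
                # Step 2: expand to the right while the colour matches and cells are free.
                right = left
                while right + 1 < w and not used[top][right + 1] and grid[top][right + 1] == colour:
                    right += 1
                # Step 3: expand downward while the entire row matches.
                bottom = top
                while bottom + 1 < h and all(not used[bottom + 1][c] and grid[bottom + 1][c] == colour
                                             for c in range(left, right + 1)):
                    bottom += 1
                for r in range(top, bottom + 1):
                    for c in range(left, right + 1):
                        used[r][c] = True
                rects.append((top, left, bottom - top + 1, right - left + 1, colour))
        return rects

    # Example with three colours; should come out as three rectangles.
    print(greedy_rectangles([[1, 1, 2],
                             [1, 1, 2],
                             [3, 3, 3]]))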
The idea here is based on the following links: Link 1 and Link 2.
In both cases, the largest possible rectangle is computed within a given polygon/shape. Check both links for details.
We can extend the idea above to the problem at hand.
Steps:
Filter the image by color (say red)
Find the largest possible rectangle in the red region, then mask it out.
Repeat, finding the next biggest rectangle, until all of the red region has been covered.
Repeat the above for every unique color.
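A sketch of steps 2 and 3, assuming the colour-filtered region is available as a boolean NumPy mask; the "largest rectangle in a binary mask" step below uses the standard histogram/stack technique, which is one possible way to implement what the linked articles describe:

    import numpy as np

    def largest_rectangle(mask):
        """Largest all-True axis-aligned rectangle in a boolean mask: (top, left, height, width)."""
        h, w = mask.shape
        heights = np.zeros(w, dtype=int)
        best, best_area = None, 0
        for row in range(h):
            heights = np.where(mask[row], heights + 1, 0)
            stack = []                               # pairs (start_column, height)
            for col in range(w + 1):
                cur = heights[col] if col < w else 0
                start = col
                while stack and stack[-1][1] >= cur:
                    s, hgt = stack.pop()
                    area = hgt * (col - s)
                    if area > best_area:
                        best_area = area
                        best = (row - hgt + 1, s, hgt, col - s)
                    start = s
                stack.append((start, cur))
        return best

    def cover_with_rectangles(mask):
        """Greedily cover a boolean mask: find the largest rectangle, mask it out, repeat."""
        mask = mask.copy()
        rects = []
        while mask.any():
            top, left, hgt, wid = largest_rectangle(mask)
            rects.append((top, left, hgt, wid))
            mask[top:top + hgt, left:left + wid] = False
        return rects

    # e.g. rects = cover_with_rectangles(red_mask), where red_mask would be your
    # colour-filtered boolean array (hypothetical variable); repeat per colour.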
Overview:
Given a collection of, say, 50 images with various widths and heights, how would one go about programmatically arranging them in an interesting* abstract way? (See image below.)
*By interesting I mean: no large gaps, and no easily distinguishable rows or columns (the negative space forms a lot of T-like intersections).
For my specific case, all images have a set max dimension of 150px, which could mean the height OR width is a max of 150px (could be 150px by 450px, or 378px by 150px).
This seems like it could be a classic programming challenge but I'm finding the topic hard to Google...
EDIT: Changed image to show that there is no restriction on how the overall arrangement must be (doesn't have to fit inside a set area)
If you are not opposed to a jQuery plugin, you can check this out: http://masonry.desandro.com/
Your problem is NP-hard.
This thread shows that even with a single type of n×m rectangle, it is NP-hard to decide whether a solution exists, so your more general problem is of course NP-hard as well (a single rectangle type is a special case of this problem).
You could try a backtracking solution if you are after an optimal solution, or a heuristic approach such as genetic algorithms or hill climbing, which will be faster but will usually find a non-optimal result.
I have built something similar to this (although it is probably not the most sophisticated solution). My approach was to use a quadtree to organize the rectangles I had placed on the canvas. I then basically went around the center point in a spiral, trying to place new rectangles, and using the quadtree to detect collisions. If I detected a collision, I would move the rectangle I was trying to place to the edge of the colliding rectangle that was furthest from the center, and repeat the collision-checking process.
Again, probably not the most sophisticated method, and it does tend to leave some larger gaps between rectangles (the borders between them are not uniform), but to my taste it gave good results.
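A much-simplified sketch of that idea: axis-aligned rectangles only, a plain list scan instead of the quadtree, and without the "jump to the far edge of the colliding rectangle" refinement, so placements just keep walking along the spiral until they stop colliding:

    import math

    def overlaps(a, b):
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

    def place_spiral(sizes, step=5.0):
        placed = []                                    # (x, y, w, h), (x, y) = lower-left corner
        for w, h in sizes:
            t = 0.0
            while True:
                r = step * t / (2 * math.pi)           # Archimedean spiral radius
                x = r * math.cos(t) - w / 2
                y = r * math.sin(t) - h / 2
                candidate = (x, y, w, h)
                if not any(overlaps(candidate, p) for p in placed):
                    placed.append(candidate)
                    break
                t += 0.1                               # walk a bit further along the spiral
        return placed

    print(place_spiral([(150, 100), (80, 150), (150, 150), (60, 60)]))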
Here is the problem:
I have many sets of points, and want to come up with a function that can take one set and rank the other sets based on their similarity to it. Scaling, translation, and rotation do not matter, and some points may be missing from any of the sets. The best match is the one that, when scaled and translated in the ideal way, has the least mean squared error between points (maybe with a cap on the penalty, or considering only the best fraction of points, to handle missing points).
I am trying to come up with a good way to do this, and am wondering if there are any well-known algorithms that can handle this type of problem. Just the name of something would be awesome! I lack a formal CS or math education, and am doing my best to teach myself.
A few things I have tried
The first thing that comes to mind is to normalize the points somehow, but I don't think this is helpful because the missing points may throw things off.
The best way I can think of is to estimate a starting point by translating to match the centroids, and scaling so that the largest distances from the centroid match across the sets. From there, do an A*-style search, scaling, rotating, and translating until I reach a maximum, and then compare the two sets. (I hope I am using the term A* correctly; I mean trying small translations, rotations, and scalings and selecting the move giving the best match.) I think this will find the global maximum most of the time, but it is not guaranteed to. I am looking for a better way that will always be correct.
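Here is a minimal sketch of that normalisation step (centroid translation plus scaling by the largest centroid distance), followed by a brute-force rotation sweep scored with nearest-neighbour squared error; this is only the coarse version of the idea, not the A*-style refinement:

    import numpy as np

    def normalise(points):
        p = np.array(points, dtype=float)
        p -= p.mean(axis=0)                           # translate centroid to origin
        scale = np.linalg.norm(p, axis=1).max()
        return p / scale if scale > 0 else p          # scale largest radius to 1

    def match_score(a, b, n_angles=360):
        """Lower is better: mean nearest-neighbour squared error over a rotation sweep."""
        a, b = normalise(a), normalise(b)
        best = float("inf")
        for ang in np.linspace(0, 2 * np.pi, n_angles, endpoint=False):
            c, s = np.cos(ang), np.sin(ang)
            rb = b @ np.array([[c, -s], [s, c]])
            d2 = ((a[:, None, :] - rb[None, :, :]) ** 2).sum(axis=2)
            best = min(best, d2.min(axis=1).mean())   # each point of a to its nearest in rb
        return best

    a = [(0, 0), (2, 0), (0, 2), (2, 2)]
    b = [(10, 10), (14, 10), (10, 14), (14, 14), (20, 20)]   # same square, shifted/scaled, extra point
    print(match_score(a, b))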
Thanks a ton for the help! It has been fun and interesting trying to figure this out so far, so I hope it is for you as well.
There's a very clever algorithm for identifying starfields. You find four points in a diamond shape, and then, using the two stars farthest apart, you define a coordinate system that locates the other two stars. This is scale- and rotation-invariant because the locations are relative to the first two stars. This forms a hash. You generate several of these hashes and use them to generate candidate matches. Once you have the candidates, you look for ones where multiple hashes have the correct relationships.
This is described in a paper and a presentation on http://astrometry.net/ .
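A toy sketch of the hashing idea (the real system uses a slightly different canonical frame and more careful ordering): the two most-distant points of a four-point "quad" define a local frame, and the coordinates of the remaining two points in that frame form a scale-, rotation-, and translation-invariant code.

    import numpy as np
    from itertools import combinations

    def quad_hash(points):
        """points: four (x, y) pairs. Returns a 4-tuple hash code."""
        pts = np.asarray(points, dtype=float)
        # Pick the pair A, B with the largest separation.
        i, j = max(combinations(range(4), 2),
                   key=lambda ij: np.linalg.norm(pts[ij[0]] - pts[ij[1]]))
        a, b = pts[i], pts[j]
        others = [pts[k] for k in range(4) if k not in (i, j)]
        # Express the other two points in a frame where A maps to (0, 0) and B to (1, 0).
        d = b - a
        rot = np.array([[d[0], d[1]], [-d[1], d[0]]]) / (d @ d)
        coords = [rot @ (p - a) for p in others]
        coords.sort(key=lambda c: c[0])               # one simple canonical ordering
        return tuple(np.round(np.concatenate(coords), 3))

    print(quad_hash([(0, 0), (10, 10), (3, 1), (6, 8)]))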
This paper may be useful: Shape Matching and Object Recognition Using Shape Contexts
Edit:
There are a couple of relatively simple methods to solve the problem:
Combine all possible pairs of points (one from each set) into nodes, connect these nodes wherever the distances within both sets match, and then solve the maximum clique problem on this graph (a sketch follows below). Since the maximum clique problem is NP-complete, the complexity is probably O(exp(n^2)), so if you have too many points, don't use this algorithm directly; use some approximation.
Use a generalised Hough transform to match the two sets of points. This approach has lower complexity (O(n^4)), but it is more involved, so I cannot explain it here.
You can find the details in computer vision books, for example "Machine vision: theory, algorithms, practicalities" by E. R. Davies (2005).
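A small sketch of the first method, using networkx for the clique search; it assumes the two sets have already been brought to a common scale (e.g. by normalisation), and tol is an arbitrary distance-matching tolerance:

    import numpy as np
    import networkx as nx

    def best_correspondence(a, b, tol=0.05):
        a, b = np.asarray(a, float), np.asarray(b, float)
        g = nx.Graph()
        nodes = [(i, j) for i in range(len(a)) for j in range(len(b))]
        g.add_nodes_from(nodes)
        for (i1, j1) in nodes:
            for (i2, j2) in nodes:
                if i1 < i2 and j1 != j2:
                    da = np.linalg.norm(a[i1] - a[i2])
                    db = np.linalg.norm(b[j1] - b[j2])
                    if abs(da - db) < tol:
                        g.add_edge((i1, j1), (i2, j2))
        # The largest clique is the largest mutually consistent set of point pairings.
        return max(nx.find_cliques(g), key=len)

    a = [(0, 0), (1, 0), (0, 1)]
    b = [(5, 5), (6, 5), (5, 6), (9, 9)]                  # same triangle plus one extra point
    print(best_correspondence(a, b))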
I have polygons that define the contours of counties in the UK. These shapes are very detailed (10k to 20k points each), which makes the related computations (is point X in polygon P?) quite computationally expensive.
Thus, I would like to "subsample" my polygons to obtain a similar shape but with fewer points. What are the different techniques for doing so?
The trivial one would be to take one point out of every N (thus subsampling by a factor of N), but this feels too "crude". I would rather do some averaging of points, or something of that flavor. Any pointers?
Two solutions spring to mind:
1) Since the map of the UK is reasonably squarish, you could choose to render a bitmap of the counties. Assign each a specific colour, and then render the borders with a 1- or 2-pixel-thick black line. This means you'll only have to perform the expensive interior/exterior calculation if a sample happens to lie on a border. The larger the bitmap, the less often this will happen.
2) Simplify the county outlines. You can use the recursive Ramer–Douglas–Peucker algorithm to simplify the boundaries; just make sure you cache the results. You may also have to apply it not to entire county boundaries but to shared boundaries only, to ensure there are no gaps. This might be quite tricky.
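For reference, a compact recursive Ramer–Douglas–Peucker implementation, assuming the boundary is a plain list of (x, y) tuples and epsilon is the maximum allowed deviation (closed rings should be split into open segments first):

    import math

    def rdp(points, epsilon):
        if len(points) < 3:
            return list(points)
        (x1, y1), (x2, y2) = points[0], points[-1]
        # Perpendicular distance of every intermediate point to the chord first->last.
        dx, dy = x2 - x1, y2 - y1
        norm = math.hypot(dx, dy) or 1.0
        dists = [abs(dy * (x - x1) - dx * (y - y1)) / norm for x, y in points[1:-1]]
        i = max(range(len(dists)), key=dists.__getitem__) + 1
        if dists[i - 1] > epsilon:
            # Keep the farthest point and recurse on both halves.
            return rdp(points[:i + 1], epsilon)[:-1] + rdp(points[i:], epsilon)
        return [points[0], points[-1]]

    print(rdp([(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)], 1.0))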
Here you can find a project dealing with exactly your issues. Although it works primarily with an area "filled" by points, you can set it up to work with a "perimeter"-type definition like yours.
It uses a k-nearest neighbors approach for calculating the region.
Here you can request a copy of the paper.
They apparently planned to offer an online service for requesting calculations, but I didn't test it, and it probably isn't running.
HTH!
Polygon triangulation should help here. You'll still have to check many polygons, but they are triangles now, so they are easier to check, and you can use some optimizations to narrow the search down to a small subset of triangles for a given region or point.
Since it seems you already have all the algorithms you need for general polygons, not only triangles, you can also merge triangles after triangulation if they are too small or if the triangle count gets too high.
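As a small illustration of why triangles help: a point-in-triangle test is just three sign checks, so after triangulation each candidate triangle is very cheap to test.

    def point_in_triangle(p, a, b, c):
        def cross(o, u, v):
            return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
        d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
        has_neg = d1 < 0 or d2 < 0 or d3 < 0
        has_pos = d1 > 0 or d2 > 0 or d3 > 0
        return not (has_neg and has_pos)      # all same sign (or on an edge) -> inside

    print(point_in_triangle((1, 1), (0, 0), (4, 0), (0, 4)))   # True
    print(point_in_triangle((5, 5), (0, 0), (4, 0), (0, 4)))   # False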
I've been searching far and wide across the seven internets, to no avail. The closest to what I need seems to be the cutting stock problem, only in 2D (which is disappointing, since Wikipedia doesn't provide any directions on how to solve that one). Another look-alike problem would be UV unwrapping; there are solutions there, but only ones you get from add-ons for various 3D software.
To cut a long story short, what I want is this: given a rectangle of known width and height, I have to find out how many shapes (polygons) of known sizes (which may be rotated at will) I can fit inside that rectangle.
For example, I could choose a T-shaped piece and pack it into the same rectangle either in an efficient way, resulting in 4 shapes per rectangle,
or by tiling the pieces based on their bounding boxes, in which case I could only fit 3.
But of course, this is only an example, and I don't think it would be much use to focus on this particular case. The only approaches I can think of right now either have backtracking-like complexity or solve only special cases of this problem. So... any ideas?
Anybody up for a game of Tetris (a subset of your problem)?
This is known as the packing problem. Without knowing ahead of time what kind of shapes you are likely to face, it can be very difficult, if not impossible, to come up with an algorithm that will give you the best answer. Unless your polygons are "nice" polygons (circles, squares, equilateral triangles, etc.), you will most likely have to settle for a heuristic that gives you an approximately best solution most of the time.
One general heuristic (though far from optimal, depending on the shape of the input polygon) is to simplify the problem by drawing a rectangle around the polygon, just big enough to cover it. (As an example, in the diagram below we draw a red rectangle around a blue polygon.)
Once we have done this, we can take that rectangle and try to fit as many copies of it into the large rectangle as possible. This simplifies the problem into a rectangle packing problem, which is easier to solve and to wrap your head around. An example of an algorithm for this is at the following link:
An Effective Recursive Partitioning Approach for the Packing of Identical Rectangles in a Rectangle.
Now obviously this heuristic is not optimal when the polygon in question is not close to being rectangle-shaped, but it does give you a minimum baseline to work with, especially if you don't have much knowledge of what your polygon will look like (or there is high variance in the possible shapes). Using this algorithm, it would fill up a large rectangle like so:
Here is the same image without the intermediate rectangles:
For these T-shaped polygons the heuristic is not the best it could be (in fact it may be almost a worst-case scenario for this proposed approximation), but it would work very well for other types of polygons.
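As a back-of-the-envelope version of that baseline, here is the naive bounding-box counting idea in code (the linked recursive-partitioning algorithm does considerably better; the T-shape coordinates below are made up for the example):

    def grid_count(container_w, container_h, box_w, box_h):
        """How many axis-aligned copies of a box fit in a simple grid layout, either orientation."""
        upright = (container_w // box_w) * (container_h // box_h)
        rotated = (container_w // box_h) * (container_h // box_w)
        return max(upright, rotated)

    def bounding_box(polygon):
        xs, ys = zip(*polygon)
        return max(xs) - min(xs), max(ys) - min(ys)

    # A T-shaped polygon (hypothetical coordinates) packed into a 100 x 60 container.
    t_shape = [(0, 0), (30, 0), (30, 10), (20, 10), (20, 30), (10, 30), (10, 10), (0, 10)]
    print(grid_count(100, 60, *bounding_box(t_shape)))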
Consider what the other answer said about placing the T's into a bounding rectangle, but instead of just leaving it as a rectangle, set each shape up as a nested list of True and False values describing its occupied cells, e.g. [[True, True, True], [False, True, False]] for your T shape. Then use a function to place the shapes on a grid. To optimize the results, create a tracker that pays attention to how many False cells in a new shape overlap with True cells already on the grid from previous shapes; the function places the shape at the position with the most such overlaps. Further modifications would be needed for better and better optimization, but that is the general premise you are looking for.
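A rough sketch of that premise: shapes are nested lists of booleans, a placement is rejected if it would overwrite an occupied cell, and candidate placements are scored by how many of the shape's False cells sit on already-filled grid cells:

    T_SHAPE = [[True, True, True],
               [False, True, False]]

    def fits(grid, shape, r, c):
        return all(not (cell and grid[r + i][c + j])
                   for i, row in enumerate(shape) for j, cell in enumerate(row))

    def score(grid, shape, r, c):
        return sum(1 for i, row in enumerate(shape) for j, cell in enumerate(row)
                   if not cell and grid[r + i][c + j])

    def place_best(grid, shape):
        h, w = len(grid), len(grid[0])
        sh, sw = len(shape), len(shape[0])
        candidates = [(r, c) for r in range(h - sh + 1) for c in range(w - sw + 1)
                      if fits(grid, shape, r, c)]
        if not candidates:
            return None
        r, c = max(candidates, key=lambda rc: score(grid, shape, *rc))
        for i, row in enumerate(shape):
            for j, cell in enumerate(row):
                if cell:
                    grid[r + i][c + j] = True
        return r, c

    grid = [[False] * 6 for _ in range(4)]
    for _ in range(4):
        print(place_best(grid, T_SHAPE))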