algorithm to recursively divide a polygon into in/out quadrants: what's it called and where's the code? - computational-geometry

I have a lot of points (hundreds of thousands) and I want to check which ones are inside a polygon. For a relatively small polygon (i.e., likely to contain only tens or hundreds of points) I can just use the bounding box of the polygon as an initial check, and then do a regular point-in-poly check for those points inside the box. But imagine a large (i.e., likely to contain thousands of my points), irregularly shaped polygon. Many points will pass the bounding box check, and furthermore the point-in-poly check will be more expensive because the larger polygon is made up of many more points. So I'd like to be able to filter most points in or out without having to do the full point-in-poly check.
So, I have a plan, and mainly I want to know if what I'm describing is a well-known algorithm, and if so what it's called and where I might find existing code for it. I don't believe what I'm describing is either a quad-tree or an r-tree, and I don't know how to search for it. I'm calling it a "rect tree" below.
The idea is, to handle these larger polygons:
Do a "rect tree" pre-process, where the depth of the rect tree varies by the size of the polygon (i.e., allow more depth for a larger polygon). The rect tree would divide the bounding box of the polygon into four quarters. It would check if each quarter-rect is fully inside the polygon, fully outside the polygon, or neither. In the case of neither it would recursively divide the subrects, continuing in this way until all rects were either fully inside or outside, or the max depth had been reached. So the idea is that (a) the pre-processing time to make this tree, even though it itself will do several point-in-polygon checks, is well worth it because that time is dwarfed by the number of points to be checked, and (b) the vast majority of points can be dealt with using simple bounding box checks (generally a few such checks as you descend the tree), and then a relatively small number would have to do the full point-in-polygon check (for when you reach a leaf node that is still "neither").
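Here's a rough Python sketch of the construction I'm imagining (all names are mine; the classify test is simplified in that it assumes a sub-rect never swallows the whole polygon, which holds everywhere below the root bounding box):

    INSIDE, OUTSIDE, PARTIAL = 0, 1, 2

    def point_in_poly(pt, poly):
        # standard even-odd (crossing-number) test
        x, y = pt
        inside = False
        for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
            if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
        return inside

    def _cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def _segs_intersect(p1, p2, q1, q2):
        # proper crossing only; ignores degenerate touching cases
        return (_cross(p1, p2, q1) * _cross(p1, p2, q2) < 0 and
                _cross(q1, q2, p1) * _cross(q1, q2, p2) < 0)

    def classify(rect, poly):
        x0, y0, x1, y1 = rect
        corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
        sides = list(zip(corners, corners[1:] + corners[:1]))
        for e1, e2 in zip(poly, poly[1:] + poly[:1]):
            if any(_segs_intersect(e1, e2, s1, s2) for s1, s2 in sides):
                return PARTIAL                  # a polygon edge crosses this rect
        hits = sum(point_in_poly(c, poly) for c in corners)
        return INSIDE if hits == 4 else OUTSIDE if hits == 0 else PARTIAL

    def build(rect, poly, depth, max_depth):
        state = classify(rect, poly)
        node = {"rect": rect, "state": state, "kids": None}
        if state == PARTIAL and depth < max_depth:
            x0, y0, x1, y1 = rect
            xm, ym = (x0 + x1) / 2.0, (y0 + y1) / 2.0
            node["kids"] = [build(r, poly, depth + 1, max_depth)
                            for r in ((x0, y0, xm, ym), (xm, y0, x1, ym),
                                      (x0, ym, xm, y1), (xm, ym, x1, y1))]
        return node

    def query(node, pt, poly):
        while node["kids"] is not None:         # descend to the quadrant holding pt
            x0, y0, x1, y1 = node["rect"]
            xm, ym = (x0 + x1) / 2.0, (y0 + y1) / 2.0
            node = node["kids"][(pt[0] >= xm) + 2 * (pt[1] >= ym)]
        if node["state"] == PARTIAL:            # ambiguous leaf: full test
            return point_in_poly(pt, poly)
        return node["state"] == INSIDE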
What's that algorithm called? And where is the code? It doesn't in fact seem so hard to write, but I figured I'd ask before jumping into coding.

I actually ended up using a related but different approach. I realized that essentially this tree structure I was building up was no more than the polygon drawn at a low resolution. For instance, if my tree went down to a depth of 8, really that was just like drawing my polygon on a bitmap with resolution 256x256 and then doing pixel hit tests against that polygon. So I extended that idea and used a fast graphics library (the CImg library). I draw the polygon on a black-and-white bitmap of size 4000x4000. Then I just check the points as pixels against that bitmap. The magic is that drawing that huge bitmap is really fast compared to the time it was taking me to construct the tree. So I get much higher resolution than I ever could have with my tree.
One issue is being able to detect points near the very edge of the polygon, which may be included or excluded incorrectly due to rounding/resolution issues, even at the 4000x4000 size. If you need to know precisely whether those points are in or out, you could draw a stroke around the polygon in another color, and if your pixel test hit that color, you'd do the full point in poly check. For my purposes the 4000x4000 resolution was good enough (I could tolerate incorrect inclusion/exclusion for some of my edge points).
So the fundamental trick of this solution is the idea that polygon drawing algorithms are just super fast compared to other ways you might "digitize" your polygon.
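For anyone who wants to try this without CImg, here is a rough equivalent in Python using Pillow (my choice of library and names, not the code I actually ran), including the edge-stroke fallback described above; exact_test is whatever exact point-in-polygon routine you already have:

    from PIL import Image, ImageDraw

    def make_hit_tester(poly, exact_test, size=4000):
        xs, ys = zip(*poly)
        x0, y0 = min(xs), min(ys)
        sx = (size - 1) / (max(xs) - x0)
        sy = (size - 1) / (max(ys) - y0)
        to_px = lambda p: ((p[0] - x0) * sx, (p[1] - y0) * sy)

        # 0 = outside, 255 = interior, 128 = the uncertain band along the edge
        img = Image.new("L", (size, size), 0)
        ImageDraw.Draw(img).polygon([to_px(p) for p in poly],
                                    fill=255, outline=128)
        pixels = img.load()

        def test(pt):
            px, py = to_px(pt)
            if not (0 <= px < size and 0 <= py < size):
                return False                    # outside the bounding box entirely
            v = pixels[int(px), int(py)]
            if v == 128:                        # on the stroked edge:
                return exact_test(pt, poly)     # fall back to the full check
            return v == 255
        return test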

Related

How to fill an area with a collection of rectangles and no space left

I have a collection of different sized rectangles that need to be placed on a 2D surface of a known dimension, with the condition that:
there is no overlap between the blocks,
the rectangles may not be rotated,
there is no area left blank (whole surface needs to be filled).
The rectangles are actually generated by breaking down the surface area, so they ought to fill up the area completely once you start putting everything together.
I figured there would be a dozen algorithms available for this problem, but the ones I find are best-effort algorithms, such as sprite generators, that do not have the precondition that the whole area needs to be (can be...) filled -- which is obviously not necessary when building sprites; however, it is in my case.
I am a bit lost here, either this problem isn't as simple as I thought, or I am searching on wrong keywords.
Some topics I have found but do not fully suit my needs:
What algorithm can be used for packing rectangles of different sizes into the smallest rectangle possible in a fairly optimal way? (in my case, the area is preset)
How to arrange N rectangles to cover minimum area (in my case, minimum area must equal zero)
Is there any algorithm out there that may suit my needs?
IMHO, the most natural solution is recursive, since the shape of the source area is not fixed: after removing a rectangle from it, we have the same task, only with a smaller area and one fewer rectangle.
I would start from the edges, because there the possible placements are already limited. So, simply go in a spiral, trying to put rectangles along the edge; if no rectangle fits, go back. That would be the simplest, and not so slow, brute-force method.
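A rough Python sketch of that recursion on a unit grid (my own formulation; it scans for the first empty cell rather than literally spiralling along the edge, but the backtracking is the same):

    def fill_exactly(W, H, rects):
        """Place every (w, h) in rects to tile the W x H surface, or return None."""
        grid = [[False] * W for _ in range(H)]

        def first_empty():
            for y in range(H):
                for x in range(W):
                    if not grid[y][x]:
                        return x, y
            return None

        def fits(x, y, w, h):
            return (x + w <= W and y + h <= H and
                    all(not grid[yy][xx]
                        for yy in range(y, y + h) for xx in range(x, x + w)))

        def mark(x, y, w, h, val):
            for yy in range(y, y + h):
                for xx in range(x, x + w):
                    grid[yy][xx] = val

        def solve(remaining, placed):
            spot = first_empty()
            if spot is None:                    # surface full: success iff no leftovers
                return placed if not remaining else None
            x, y = spot
            for i, (w, h) in enumerate(remaining):
                if fits(x, y, w, h):
                    mark(x, y, w, h, True)
                    result = solve(remaining[:i] + remaining[i + 1:],
                                   placed + [(x, y, w, h)])
                    if result is not None:
                        return result
                    mark(x, y, w, h, False)     # backtrack
            return None                         # nothing fits at this cell

        return solve(list(rects), [])

For example, fill_exactly(4, 3, [(2, 3), (2, 2), (2, 1)]) returns a list of (x, y, w, h) placements covering the whole 4 x 3 surface.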

How to find if a 3D object fits in another 3D object (the container)?

Given two 3d objects, how can I find if one fits inside the second (and find the location of the object in the container).
The object should be translated and rotated to fit the container - but not modified otherwise.
Additional complications:
The same situation - but look for the best fit solution, even if it's not a proper match (minimize the volume of the object that doesn't fit in the container)
Support for elastic objects - find the best fit while minimizing the "distortion" in the objects
This is a pretty general question - and I don't expect a complete solution.
Any pointers to relevant papers \ articles \ libraries \ tools would be useful
Here is one perhaps less than ideal method.
You could try fixing the position (in 3D space) of one shape and placing the other shape on top of it. Then create links that connect a point on one shape to a point on the other shape, and simulate what happens when the links are pulled equally tight, causing the shape that isn't fixed to rotate and translate until it settles.
If the fit is loose enough, you could use only 3 links (the bare minimum number of links for 3D) and try every possible combination. However, for tighter fits, you'll need more links, perhaps enough to place them on every point of the shape with fewer points. That means you'll need some method to determine how to place the links, which is not trivial.
This seems like quite a hard problem. A plausible approach is to have some heuristic suggest a transformation, and then check whether it is a good one. If the transformation moves the object only slightly outside the interior (e.g. on one side), make a slight adjustment to the transformation and test it again. If the object is far outside (e.g. on both sides of the same axis, or on all axes), make a new heuristic guess.
Just a general idea for a heuristic: rasterise both objects with the same pixel size (an octree of the object volume also works). Build a connectivity graph between the pixels, and check for subgraph isomorphism between the two graphs. If there is a subgraph, that position is a candidate for testing.
This approach also supports 90-degree rotations.
Some tests can be done on the graphs alone: if all volume neighbours of the subgraph are also in the larger graph, then the object is inside.
In general, this is a 'refined' bounding-box approach.
Another solution is to project an equal number of points onto both objects and do a least-squares best fit on the point sets. The point sets probably will not be ordered the same, so you iterate between the least-squares best fit and a reordering of the points so that the points on both objects end up in roughly the same order. The equation development for this involves a lot of algebra, but it is not conceptually complicated.
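Here is a rough NumPy sketch of that iterate-and-reorder loop (essentially an ICP-style scheme; the closed-form rotation fit is the Kabsch method, and all the details are mine). A and B are (N, 3) arrays of points sampled on the two objects:

    import numpy as np

    def best_rigid_fit(A, B):
        """Least-squares R, t minimising |R @ a + t - b| over matched rows."""
        ca, cb = A.mean(axis=0), B.mean(axis=0)
        U, _, Vt = np.linalg.svd((A - ca).T @ (B - cb))
        R = (U @ Vt).T
        if np.linalg.det(R) < 0:                # avoid reflections
            Vt[-1] *= -1
            R = (U @ Vt).T
        return R, cb - R @ ca

    def iterative_fit(A, B, iters=50):
        R, t = np.eye(3), np.zeros(3)
        for _ in range(iters):
            moved = A @ R.T + t
            # reorder: match each moved point of A to its nearest point of B
            d = ((moved[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            R, t = best_rigid_fit(A, B[d.argmin(axis=1)])
        return R, t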
Consider one polygon (triangle) in the target object. For this polygon, find the equivalent polygon in the other geometry (the source), i.e. the side lengths, the angles between the edges, and the area should all be the same. If there's just one match, find the rigid transform matrix that maps the vertices that way: X' = M*X. Since X' and X are known for all the points on the matched polygons, this should be doable with linear algebra.
If you want a one-to-one mapping between the vertices of the polygons, traverse the edges of the polygons in the same order, and build a lookup table that maps each vertex on one poly to a vertex on the other. If you have a half-edge data structure for your 3D object, that'll simplify this process a great deal.
If you find more than one matching polygon, traverse the source from each of the candidate matches, and keep matching their neighbouring polygons with the target polygons. Continue until all but one candidate breaks, after which you can apply the same steps as in the one-match case.
There are more rigorous solutions listed here, but I think the method above will work as well.
What a juicy problem! As is typical in computational geometry, this problem can be very complicated with a mismatched geometric abstraction (all kinds of if-else cases, etc.). But pick the right abstraction and the solution becomes trivial with few sub-cases.
Compute the Distance Transform of your shapes and voilà! Your solution is trivial.
Allow me to elaborate.
The distance map of a shape on a grid (pixels) encodes, for each pixel, the distance to the closest point on the shape's border. It can be computed in both directions, outwards or inwards into the shape. In this problem, the outward distance map suffices.
Step 1: Compute the distance map of both shapes D_S1, D_S2
Step 2: Subtract the distance maps. Diff = D_S1-D_S2
Step 3: If Diff has only positive values, then the shapes can be contained in each other (+ve => S1 bigger than S2; -ve => S2 bigger than S1).
If Diff has both positive and negative values, the shapes intersect.
There you have it. Enjoy!
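A rough 2D sketch of those three steps with SciPy, following the answer's sign convention (boolean masks in, True inside the shape; names are mine, and the same idea extends to 3D voxel grids):

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def compare_shapes(mask1, mask2):
        # outward distance map: each outside pixel's distance to the border
        d1 = distance_transform_edt(~mask1)
        d2 = distance_transform_edt(~mask2)
        diff = d1 - d2
        if (diff >= 0).all():
            return "all +ve: S1 bigger, S2 can be contained in S1"
        if (diff <= 0).all():
            return "all -ve: S2 bigger, S1 can be contained in S2"
        return "mixed signs: the shapes intersect"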

Packing arbitrary polygons within an arbitrary boundary

I was wondering if anybody could point me to the best algorithm/heuristic which will fit my particular polygon packing problem. I am given a single polygon as a boundary (convex or concave may also contain holes) and a single "fill" polygon (may also be convex or concave, does not contain holes) and I need to fill the boundary polygon with a specified number of fill polygons. (I'm working in 2D).
Many of the polygon packing heuristics I've found assume that the boundary and/or filling polygons will be rectangular and also that the filling polygons will be of different sizes. In my case, the filling polygons may be non-rectangular, but all will be exactly the same.
Maybe this is a particular type of packing problem? If somebody has a definition for this type of polygon packing I'll gladly google away, but so far I've not found anything which is similar enough to be of great use.
Thanks.
The question you ask is very hard. To put this in perspective, the (much) simpler case where you're packing the interior of your bounded polygon with non-overlapping disks is already hard, and disks are the simplest possible "packing shape" (with any other shape you have to consider orientation as well as size and center location).
In fact, I think it's an open problem in computational geometry to determine for an arbitrary integer N and arbitrary bounded polygonal region (in the Euclidean plane), what is the "optimal" (in the sense of covering the greatest percentage of the polygon interior) packing of N inscribed non-overlapping disks, where you are free to choose the radius and center location of each disk. I'm sure the "best" answer is known for certain special polygonal shapes (like rectangles, circles, and triangles), but for arbitrary shapes your best "heuristic" is probably:
Start your shape counter at N.
Add the largest "packing shape" you can fit completely inside the polygonal boundary without overlapping any other packing shapes.
Decrement your shape counter.
If your shape counter is > 0, go to step 2.
I say "probably" because "largest first" isn't always the best way to pack things into a confined space. You can dig into that particular flavor of craziness by reading about the bin packing problem and knapsack problem.
EDIT: Step 2 by itself is hard. A reasonable strategy would be to pick an arbitrary point on the interior of the polygon as the center and "inflate" the disk until it touches either the boundary or another disk (or both), and then "slide" the disk while continuing to inflate it so that it remains inside the boundary without overlapping any other disks until it is "trapped" - with at least 2 points of contact with the boundary and/or other disks. But it isn't easy to formalize this "sliding process". And even if you get the sliding process right, this strategy doesn't guarantee that you'll find the biggest "inscribable disk" - your "locally maximal" disk could be trapped in a "lobe" of the interior which is connected by a narrow "neck" of free space to a larger "lobe" where a larger disk would fit.
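To make that concrete, here is a naive sketch of steps 1 through 4 using Shapely, approximating the "largest inscribable disk" by sampling candidate centers on a grid instead of the inflate-and-slide process (grid step and names are mine, and the locally-maximal-disk caveat above still applies):

    import numpy as np
    from shapely.geometry import Point, Polygon

    def greedy_disk_pack(boundary, n_disks, grid_step):
        disks = []                              # list of (center, radius)
        minx, miny, maxx, maxy = boundary.bounds
        for _ in range(n_disks):
            best = None
            for x in np.arange(minx, maxx, grid_step):
                for y in np.arange(miny, maxy, grid_step):
                    c = Point(x, y)
                    if not boundary.contains(c):
                        continue
                    # max radius here: distance to the boundary (incl. holes),
                    # capped by the distance to each placed disk's surface
                    r = boundary.boundary.distance(c)
                    for pc, pr in disks:
                        r = min(r, c.distance(pc) - pr)
                    if r > 0 and (best is None or r > best[1]):
                        best = (c, r)
            if best is None:
                break                           # no room at this resolution
            disks.append(best)
        return disks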
Thanks for the replies, my requirements were such that I was able to further simplify the problem by not having to deal with orientation and I then even further simplified by only really worrying about the bounding box of the fill element. With these two simplifications the problem became much easier and I used a stripe like filling algorithm in conjunction with a spatial hash grid (since there were existing elements I was not allowed to fill over).
With this approach I simply divided the fill area into stripes and created a spatial hash grid to register existing elements within the fill area. I created a second spatial hash grid to register the fill area itself (since my stripes were not guaranteed to lie within the bounding area, this made checking whether a fill element was inside the fill area a little faster: I could just query the grid, and if all the cells where the fill element would be placed were full, I knew it was inside the fill area). After that, I iterated over each stripe and placed a fill element wherever the hash grids allowed. This is certainly not an optimal solution, but it ended up being all that was required for my particular situation, and pretty fast as well. I found the required information about creating a spatial hash grid from here. I got the idea for filling by stripes from this article.
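For reference, a minimal spatial hash grid of the kind mentioned above looks roughly like this (my own sketch, not the linked code): items are registered under every cell their bounding box overlaps, and a query returns candidates for an exact check.

    from collections import defaultdict

    class SpatialHashGrid:
        def __init__(self, cell_size):
            self.cell = cell_size
            self.cells = defaultdict(list)

        def _keys(self, x0, y0, x1, y1):
            for cx in range(int(x0 // self.cell), int(x1 // self.cell) + 1):
                for cy in range(int(y0 // self.cell), int(y1 // self.cell) + 1):
                    yield (cx, cy)

        def insert(self, item, bbox):
            for key in self._keys(*bbox):
                self.cells[key].append((item, bbox))

        def query(self, bbox):
            """Items whose bounding box overlaps bbox (coarse test only)."""
            x0, y0, x1, y1 = bbox
            seen, out = set(), []
            for key in self._keys(x0, y0, x1, y1):
                for item, (a0, b0, a1, b1) in self.cells[key]:
                    if (id(item) not in seen and
                            a0 <= x1 and x0 <= a1 and b0 <= y1 and y0 <= b1):
                        seen.add(id(item))
                        out.append(item)
            return out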
This type of problem is very complex to solve geometrically.
If you can accept a good solution instead of the 100% optimal solution, then you can solve it with a raster algorithm.
You draw (rasterize) the boundary polygon into one in-memory image and the fill polygon into another in-memory image. You can then search for a place where the fill polygon fits in the boundary polygon by overlaying the two images with various (X, Y) offsets for the fill polygon and checking the pixel values.
When you find a place where the fill polygon fits, you clear its pixels from the boundary polygon image and repeat until there are no more places where the fill polygon fits.
The keywords to google search for are: rasterization, overlay, algorithm
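A rough sketch of that overlay search with NumPy masks (my code; polygons are assumed to be in pixel coordinates, and rasterising via matplotlib's Path is just one convenient scan-fill):

    import numpy as np
    from matplotlib.path import Path

    def rasterize(poly, w, h):
        ys, xs = np.mgrid[0:h, 0:w]
        pts = np.column_stack([xs.ravel() + 0.5, ys.ravel() + 0.5])
        return Path(poly).contains_points(pts).reshape(h, w)

    def pack_raster(boundary_poly, fill_poly, w, h):
        free = rasterize(boundary_poly, w, h)       # True = still fillable
        piece = rasterize(fill_poly, w, h)
        ys, xs = np.nonzero(piece)                  # crop piece to its bbox
        piece = piece[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        ph, pw = piece.shape
        placements = []
        for y in range(h - ph + 1):
            for x in range(w - pw + 1):
                window = free[y:y + ph, x:x + pw]
                if (window | ~piece).all():         # every piece pixel is free
                    window[piece] = False           # claim those pixels
                    placements.append((x, y))
        return placements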
If your fill polygon is the shape of a jigsaw piece, many algorithms will miss the interlocking alignment. (I don't know what to suggest in that case)
One approach to the general problem that works well when the boundary is much larger than the fill pieces is to tile an infinite plane with the pieces in the best way you can, and then look for the optimum alignment of the boundary on this plane.

How to subsample a 2D polygon?

I have polygons that define the contours of counties in the UK. These shapes are very detailed (10k to 20k points each), which makes the related computations (is point X in polygon P?) quite computationally expensive.
Thus, I would like to "subsample" my polygons, to obtain a similar shape but with less points. What are the different techniques to do so?
The trivial one would be to take one every N points (thus subsampling by a factor N), but this feels too "crude". I would rather do some averaging of points, or something of that flavor. Any pointer?
Two solutions spring to mind:
1) since the map of the UK is reasonably squarish, you could choose to render a bitmap with the counties. Assign each a specific colour, and then render the borders with a 1 or 2 pixel thick black line. This means you'll only have to perform the expensive interior/exterior calculation if a sample happens to lie on the border. The larger the bitmap, the less often this will happen.
2) simplify the county outlines. You can use the Ramer–Douglas–Peucker algorithm to recursively simplify the boundaries. Just make sure you cache the results. You may also have to apply it not to entire county boundaries but to shared boundaries only, to ensure no gaps appear between neighbouring counties. This might be quite tricky.
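For reference, the Ramer–Douglas–Peucker simplification from option 2 fits in a few lines of Python (this sketch handles an open chain; split closed county rings into two chains first):

    import math

    def rdp(points, epsilon):
        if len(points) < 3:
            return list(points)
        (x1, y1), (x2, y2) = points[0], points[-1]
        dx, dy = x2 - x1, y2 - y1
        norm = math.hypot(dx, dy) or 1.0
        # perpendicular distance of every point from the chord
        dists = [abs(dy * (x - x1) - dx * (y - y1)) / norm for x, y in points]
        i = max(range(1, len(points) - 1), key=dists.__getitem__)
        if dists[i] <= epsilon:
            return [points[0], points[-1]]          # chord is close enough
        # keep the farthest point and simplify both halves
        return rdp(points[:i + 1], epsilon)[:-1] + rdp(points[i:], epsilon)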
Here you can find a project dealing exactly with your issue. Although it works primarily with an area "filled" by points, you can set it to work with a "perimeter"-type definition like yours.
It uses a k-nearest neighbors approach for calculating the region.
Here you can request a copy of the paper.
Seemingly they planned to offer an online service for requesting calculations, but I didn't test it, and probably it isn't running.
HTH!
Polygon triangulation should help here. You'll still have to check many polygons, but these are triangles now, so they are easier to check and you can use some optimizations to determine only a small subset of polygons to check for a given region or point.
As it seems you have all the algorithms you need for polygons, not only for triangles, you can also merge several triangles that are too small after triangulation or if triangle count gets too high.
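To illustrate why triangles are easier to check: point-in-triangle reduces to three cross-product sign tests (a quick sketch of mine; a, b, c are the triangle's vertices):

    def point_in_triangle(p, a, b, c):
        def side(q, r):                         # sign of (r - q) x (p - q)
            return (r[0] - q[0]) * (p[1] - q[1]) - (r[1] - q[1]) * (p[0] - q[0])
        d1, d2, d3 = side(a, b), side(b, c), side(c, a)
        has_neg = d1 < 0 or d2 < 0 or d3 < 0
        has_pos = d1 > 0 or d2 > 0 or d3 > 0
        return not (has_neg and has_pos)        # all one sign => inside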

Drawing a Topographical Map

I've been working on a visualization project for 2-dimensional continuous data. It's the kind of thing you could use to study elevation data or temperature patterns on a 2D map. At its core, it's really a way of flattening 3-dimensions into two-dimensions-plus-color. In my particular field of study, I'm not actually working with geographical elevation data, but it's a good metaphor, so I'll stick with it throughout this post.
Anyhow, at this point, I have a "continuous color" renderer that I'm very pleased with:
The gradient is the standard color-wheel, where red pixels indicate coordinates with high values, and violet pixels indicate low values.
The underlying data structure uses some very clever (if I do say so myself) interpolation algorithms to enable arbitrarily deep zooming into the details of the map.
At this point, I want to draw some topographical contour lines (using quadratic bezier curves), but I haven't been able to find any good literature describing efficient algorithms for finding those curves.
To give you an idea for what I'm thinking about, here's a poor-man's implementation (where the renderer just uses a black RGB value whenever it encounters a pixel that intersects a contour line):
There are several problems with this approach, though:
Areas of the graph with a steeper slope result in thinner (and often broken) topo lines. Ideally, all topo lines should be continuous.
Areas of the graph with a flatter slope result in wider topo lines (and often entire regions of blackness, especially at the outer perimeter of the rendering region).
So I'm looking at a vector-drawing approach for getting those nice, perfect 1-pixel-thick curves. The basic structure of the algorithm will have to include these steps:
At each discrete elevation where I want to draw a topo line, find a set of coordinates where the elevation at that coordinate is extremely close (given an arbitrary epsilon value) to the desired elevation.
Eliminate redundant points. For example, if three points are in a perfectly-straight line, then the center point is redundant, since it can be eliminated without changing the shape of the curve. Likewise, with bezier curves, it is often possible to eliminate certain anchor points by adjusting the position of adjacent control points.
Assemble the remaining points into a sequence, such that each segment between two points approximates an elevation-neutral trajectory, and such that no two line segments ever cross paths. Each point-sequence must either create a closed polygon, or must intersect the bounding box of the rendering region.
For each vertex, find a pair of control points such that the resultant curve exhibits a minimum error, with respect to the redundant points eliminated in step #2.
Ensure that all features of the topography visible at the current rendering scale are represented by appropriate topo lines. For example, if the data contains a spike with high altitude, but with extremely small diameter, the topo lines should still be drawn. Vertical features should only be ignored if their feature diameter is smaller than the overall rendering granularity of the image.
But even under those constraints, I can still think of several different heuristics for finding the lines:
Find the high-point within the rendering bounding-box. From that high point, travel downhill along several different trajectories. Any time the traversal line crosses an elevation threshold, add that point to an elevation-specific bucket. When the traversal path reaches a local minimum, change course and travel uphill.
Perform a high-resolution traversal along the rectangular bounding-box of the rendering region. At each elevation threshold (and at inflection points, wherever the slope reverses direction), add those points to an elevation-specific bucket. After finishing the boundary traversal, start tracing inward from the boundary points in those buckets.
Scan the entire rendering region, taking an elevation measurement at a sparse regular interval. For each measurement, use its proximity to an elevation threshold as a mechanism to decide whether or not to take an interpolated measurement of its neighbors. Using this technique would provide better guarantees of coverage across the whole rendering region, but it'd be difficult to assemble the resultant points into a sensible order for constructing paths.
So, those are some of my thoughts...
Before diving deep into an implementation, I wanted to see whether anyone else on StackOverflow has experience with this sort of problem and could provide pointers for an accurate and efficient implementation.
Edit:
I'm especially interested in the "Gradient" suggestion made by ellisbben. And my core data structure (ignoring some of the optimizing interpolation shortcuts) can be represented as the summation of a set of 2D gaussian functions, which is totally differentiable.
I suppose I'll need a data structure to represent a three-dimensional slope, and a function for calculating that slope vector at an arbitrary point. Off the top of my head, I don't know how to do that (though it seems like it ought to be easy), but if you have a link explaining the math, I'd be much obliged!
UPDATE:
Thanks to the excellent contributions by ellisbben and Azim, I can now calculate the contour angle for any arbitrary point in the field. Drawing the real topo lines will follow shortly!
Here are updated renderings, with and without the ghetto raster-based topo-renderer that I've been using. Each image includes a thousand random sample points, represented by red dots. The angle-of-contour at that point is represented by a white line. In certain cases, no slope could be measured at the given point (based on the granularity of interpolation), so the red dot occurs without a corresponding angle-of-contour line.
Enjoy!
(NOTE: These renderings use a different surface topography than the previous renderings -- since I randomly generate the data structures on each iteration, while I'm prototyping -- but the core rendering method is the same, so I'm sure you get the idea.)
Here's a fun fact: over on the right-hand-side of these renderings, you'll see a bunch of weird contour lines at perfect horizontal and vertical angles. These are artifacts of the interpolation process, which uses a grid of interpolators to reduce the number of computations (roughly five-fold) necessary to perform the core rendering operations. All of those weird contour lines occur on the boundary between two interpolator grid cells.
Luckily, those artifacts don't actually matter. Although the artifacts are detectable during slope calculation, the final renderer won't notice them, since it operates at a different bit depth.
UPDATE AGAIN:
Aaaaaaaand, as one final indulgence before I go to sleep, here's another pair of renderings, one in the old-school "continuous color" style, and one with 20,000 gradient samples. In this set of renderings, I've eliminated the red dot for point-samples, since it unnecessarily clutters the image.
Here, you can really see those interpolation artifacts that I referred to earlier, thanks to the grid-structure of the interpolator collection. I should emphasize that those artifacts will be completely invisible on the final contour rendering (since the difference in magnitude between any two adjacent interpolator cells is less than the bit depth of the rendered image).
Bon appetit!!
The gradient is a mathematical operator that may help you.
If you can turn your interpolation into a differentiable function, the gradient of the height will always point in the direction of steepest ascent. All curves of equal height are perpendicular to the gradient of height evaluated at that point.
Your idea about starting from the highest point is sensible, but might miss features if there is more than one local maximum.
I'd suggest
pick height values at which you will draw lines
create a bunch of points on a fine, regularly spaced grid, then walk each point in small steps in the gradient direction towards the nearest height at which you want to draw a line
create curves by stepping each point perpendicular to the gradient; eliminate excess points by killing a point when another curve comes too close to it. But to avoid destroying the centers of hourglass-like figures, you might need to check the angle between the oriented vectors perpendicular to the gradient for both of the points. (When I say oriented, I mean make sure that the angle between the gradient and the perpendicular vector you calculate is always 90 degrees in the same direction.)
In response to your comment to #erickson, and to answer the point about calculating the gradient of your function: instead of computing the derivatives of your 300-term function, you could do a numeric differentiation as follows.
Given a point [x, y] in your image, you can calculate the gradient (the direction of steepest ascent) as
    g = [ (f(x+dx, y) - f(x-dx, y)) / (2*dx),
          (f(x, y+dy) - f(x, y-dy)) / (2*dy) ]
where dx and dy are the spacing in your grid. The contour line will run perpendicular to the gradient. So, to get the contour direction c, we can multiply g = [v, w] by the rotation matrix A = [0 -1; 1 0], giving c = [-w, v].
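In code, the central difference and the 90-degree rotation look like this (f is your height function; dx and dy are the grid spacing):

    def gradient(f, x, y, dx, dy):
        gx = (f(x + dx, y) - f(x - dx, y)) / (2 * dx)
        gy = (f(x, y + dy) - f(x, y - dy)) / (2 * dy)
        return gx, gy

    def contour_direction(f, x, y, dx, dy):
        v, w = gradient(f, x, y, dx, dy)
        return -w, v                    # rotate the gradient 90 degrees: c = A*g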
Alternately, there is the marching squares algorithm which seems appropriate to your problem, although you may want to smooth the results if you use a coarse grid.
The topo curves you want to draw are isosurfaces of a scalar field over 2 dimensions. For isosurfaces in 3 dimensions, there is the marching cubes algorithm.
I've wanted something like this myself, but haven't found a vector-based solution.
A raster-based solution isn't that bad, though, especially if your data is raster-based. If your data is vector-based too (in other words, you have a 3D model of your surface), you should be able to do some real math to find the intersection curves with horizontal planes at varying elevations.
For a raster-based approach, I look at each pair of neighboring pixels. If one is above a contour level, and one is below, obviously a contour line runs between them. The trick I used to anti-alias the contour line is to mix the contour line color into both pixels, proportional to their closeness to the idealized contour line.
Maybe some examples will help. Suppose that the current pixel is at an "elevation" of 12 ft, a neighbor is at an elevation of 8 ft, and contour lines are every 10 ft. Then, there is a contour line half way between; paint the current pixel with the contour line color at 50% opacity. Another pixel is at 11 feet and has a neighbor at 6 feet. Color the current pixel at 80% opacity.
alpha = (contour - neighbor) / (current - neighbor)
Unfortunately, I don't have the code handy, and there might have been a bit more to it (I vaguely recall looking at diagonal neighbors too, and adjusting by sqrt(2) / 2). I hope this is enough to give you the gist.
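Here is the gist reconstructed as a rough Python sketch (my code, following the alpha formula above; only the right and down neighbours are checked so each pair is visited once, and both pixels of a crossing pair get painted):

    import numpy as np

    def contour_alpha(height, interval):
        """Per-pixel opacity (0..1) for the contour-line colour."""
        h, w = height.shape
        alpha = np.zeros((h, w))
        for y in range(h):
            for x in range(w):
                cur = height[y, x]
                for ny, nx in ((y, x + 1), (y + 1, x)):
                    if ny >= h or nx >= w:
                        continue
                    nb = height[ny, nx]
                    lo, hi = min(cur, nb), max(cur, nb)
                    level = np.ceil(lo / interval) * interval  # first level above lo
                    if lo < level <= hi:        # a contour runs between the pair
                        a = (level - nb) / (cur - nb)
                        alpha[y, x] = max(alpha[y, x], a)
                        alpha[ny, nx] = max(alpha[ny, nx], 1 - a)
        return alpha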
It occurred to me that what you're trying to do would be pretty easy to do in MATLAB, using the contour function. Doing things like making low-density approximations to your contours can probably be done with some fairly simple post-processing of the contours.
Fortunately, GNU Octave, a MATLAB clone, has implementations of the various contour plotting functions. You could look at that code for an algorithm and implementation that's almost certainly mathematically sound. Or, you might just be able to offload the processing to Octave. Check out the page on interfacing with other languages to see if that would be easier.
Disclosure: I haven't used Octave very much, and I haven't actually tested its contour plotting. However, from my experience with MATLAB, I can say that it will give you almost everything you're asking for in just a few lines of code, provided you get your data into MATLAB.
Also, congratulations on making a very Van Gogh-esque slope-field plot.
I always check places like http://mathworld.wolfram.com before going too deep on my own :)
Maybe their curves section would help? Or maybe the entry on maps.
Compare what you have rendered with a real-world topo map - they look identical to me! I wouldn't change a thing...
Write the data out as an HGT file (very simple digital elevation data format used by USGS) and use the free and open-source gdal_contour tool to create contours. That works very well for terrestrial maps, the constraint being that the data points are signed 16-bit numbers, which fits the earthly range of heights in metres very well, but may not be enough for your data, which I assume not to be a map of actual terrain - although you do mention terrain maps.
I recommend the CONREC approach:
Create an empty line segment list
Split your data into regular grid squares
For each grid square, split the square into 4 component triangles:
For each triangle, handle the cases (a through j):
If a line segment crosses one of the cases:
Calculate its endpoints
Store the line segment in the list
Draw each line segment in the line segment list
If the lines are too jagged, use a smaller grid. If the lines are smooth enough and the algorithm is taking too long, use a larger grid.
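A condensed sketch of that recipe in Python (the real CONREC enumerates the cases a through j per triangle; this version just linearly interpolates the two edge crossings, which is the same core idea):

    def contour_segments(z, level):
        """z: 2D grid of heights; returns contour segments as ((x1, y1), (x2, y2))."""
        segs = []
        rows, cols = len(z), len(z[0])

        def cross(p1, v1, p2, v2):              # where the level cuts edge p1-p2
            t = (level - v1) / (v2 - v1)
            return (p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]))

        for y in range(rows - 1):
            for x in range(cols - 1):
                corners = [((x, y), z[y][x]), ((x + 1, y), z[y][x + 1]),
                           ((x + 1, y + 1), z[y + 1][x + 1]), ((x, y + 1), z[y + 1][x])]
                center = ((x + 0.5, y + 0.5), sum(v for _, v in corners) / 4.0)
                for i in range(4):              # the 4 component triangles
                    tri = [corners[i], corners[(i + 1) % 4], center]
                    pts = [cross(pa, va, pb, vb)
                           for (pa, va), (pb, vb) in zip(tri, tri[1:] + tri[:1])
                           if (va < level) != (vb < level)]
                    if len(pts) == 2:           # the level cuts this triangle
                        segs.append((pts[0], pts[1]))
        return segs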
