Fast geospatial search geofencing irregular polygons - algorithm

I have possibly billions of multipoint (sometimes more than 20) polygonal regions (varying sizes , intersecting, nested) distributed throughout the span of a country. Though, the polygonal regions or 'fences' will not usually be too large and most of them will be negligible in size compared to the span . The points of the polygons are described in terms of it's GPS co-ordinates. I need a fast algorithm to check whether the current GPS location of the device falls within any of the enclosed 'fences'.
I have thought of using RAY CASTING ALGORITHM (point in polygon test) after sufficiently narrowing down on the candidates using some search algorithm. The question now being what method would serve as an efficient 2D SEARCH ALGORITHM keeping the following two considerations-
Speed of processing
Data structure should be compact as the storage space is limited (flash storage)
The regions are static and constant. Hence , the 2 dimensional data can be pre sorted to any requirement before being stored.
The unit is to be put on a vehicle , and hence the search need not be repeated for regions sufficiently far from the vehicle's proximity. It's acceptable to have an initial boot up time provided the successive real-time look-ups are extremely fast.
I initially thought of using KD trees to perform a boundary box query after reducing each 'fence' to a point ( approx. by using a point within it) followed by ray casting algorithm.
I have also been looking at hilbert curves and geo hash for the same.
Another method I'm considering is to use RECTANGLES that just enclose the fences. We choose these rectangles for each fence such that it is aligned to the GRID (sides aligned to the latitude and longitude).
That is , find out the max and min value of the latitudes and longitudes for a fence individually , let them be LAT 1, LAT 2, LONG 1 , LONG 2.
Now the points (LAT1, LONG1) , (LAT1, LONG2) , (LAT2, LONG1) , (LAT2, LONG2) are the co-ordinates of the rectangle (aligned to the GRID) that the fence must necessarily be contained in. (I do understand that it will not be a rectangle in the geometrical sense, nonetheless. )
Now use R tree to search for the RECTANGLES that the current gps location falls within. This will narrow the search down to very few results , each of which can be individually tested using RAY CASTING ALGORITHM.
I could also use a hybrid method by using K-D tree for the initial chunk and then applying R tree search.
I am more than open to any other method as well.
What do you think is the best way to approach this problem statement?
EDIT - Removed the size restriction on the polygonal fences.

It sounds like your set of polygons is static. If so, one of the fastest ways is likely to be binary space partitioning. Very briefly, building a BSP tree means picking a single edge from among your billions of polygons, splitting any polygons that cross this (infinite) line into two halves, partitioning the polygons into those on each side of the line, and recursively building a BSP tree on each side. At the bottommost level of the tree are convex polygons that are wholly inside some set of original polygons -- so e.g. if two original polygons A and B intersect but one doesn't fully contain the other, then this will produce at least 3 "basic" polygons at the lowest level, one for the region A but not B, one for B but not A, and one for A and B. (More may be needed if any of these are not convex.) Each of these can be tagged with the list of original polygons that it is part of.
In the ideal case where no polygons are split, finding the complete set of polygons that you are inside is O(log(total number of edges)) once the BSP tree has been built, since deciding which child to visit next in the tree is just an O(1) test of which side of a line the query point is on.
Building the BSP tree takes quite a bit of work up front, especially if you want to make sure it does as little polygon subdivision as possible. It also requires that your polygons are convex, but if that's not the case, you can easily make them so by triangulating them first.
If your set of polygons grows slowly over time, you can of course complement the above technique with regular point-in-polygon testing for the new polygons, occasionally rebuilding the BSP tree when some threshold is reached.

You can try a hierarchical point-in-polygon test for example kirkpatrick data structure but its very complicated. Personaly I would try rectangles, kd-tree, hilbert curves, quadtrees.

Related

Using a spatial index to find points within range of each other

I'm trying to find a spatial index structure suitable for a particular problem : using a union-find data structure, I want to connect\associate points that are within a certain range of each other.
I have a lot of points and I'm trying to optimize an existing solution by using a better spatial index.
Right now, I'm using a simple 2D grid indexing each square of width [threshold distance] of my point map, and I look for potential unions by searching for points in adjacent squares in the grid.
Then I compute the squared Euclidean distance to the adjacent cells combinations, which I compare to my squared threshold, and I use the union-find structure (optimized using path compression and etc.) to build groups of points.
Here is some illustration of the method. The single black points actually represent the set of points that belong to a cell of the grid, and the outgoing colored arrows represent the actual distance comparisons with the outside points.
(I'm also checking for potential connected points that belong to the same cells).
By using this pattern I make sure I'm not doing any distance comparison twice by using a proper "neighbor cell" pattern that doesn't overlap with already tested stuff when I iterate over the grid cells.
Issue is : this approach is not even close to being fast enough, and I'm trying to replace the "spatial grid index" method with something that could maybe be faster.
I've looked into quadtrees as a suitable spatial index for this problem, but I don't think it is suitable to solve it (I don't see any way of performing repeated "neighbours" checks for a particular cell more effectively using a quadtree), but maybe I'm wrong on that.
Therefore, I'm looking for a better algorithm\data structure to effectively index my points and query them for proximity.
Thanks in advance.
I have some comments:
1) I think your problem is equivalent to a "spatial join". A spatial join takes two sets of geometries, for example a set R of rectangles and a set P of points and finds for every rectangle all points in that rectangle. In Your case, R would be the rectangles (edge length = 2 * max distance) around each point and P the set of your points. Searching for spatial join may give you some useful references.
2) You may want to have a look at space filling curves. Space filling curves create a linear order for a set of spatial entities (points) with the property that points that a close in the linear ordering are usually also close in space (and vice versa). This may be useful when developing an algorithm.
3) Have look at OpenVDB. OpenVDB has a spatial index structure that is highly optimized to traverse 'voxel'-cells and their neighbors.
4) Have a look at the PH-Tree (disclaimer: this is my own project). The PH-Tree is a somewhat like a quadtree but uses low level bit operations to optimize navigation. It is also Z-ordered/Morten-ordered (see space filling curves above). You can create a window-query for each point which returns all points within that rectangle. To my knowledge, the PH-Tree is the fastest index structure for this kind of operation, especially if you typically have only 9 points in a rectangle. If you are interested in the code, the V13 implementation is probably the fastest, however the V16 should be much easier to understand and modify.
I tried on my rather old desktop machine, using about 1,000,000 points I can do about 200,000 window queries per second, so it should take about 5 second to find all neighbors for every point.
If you are using Java, my spatial index collection may also be useful.
A standard approach to this is the "sweep and prune" algorithm. Sort all the points by X coordinate, then iterate through them. As you do, maintain the lowest index of the point which is within the threshold distance (in X) of the current point. The points within that range are candidates for merging. You then do the same thing sorting by Y. Then you only need to check the Euclidean distance for those pairs which showed up in both the X and Y scans.
Note that with your current union-find approach, you can end up unioning points which are quite far from each other, if there are a bunch of nearby points "bridging" them. So your basic approach -- of unioning groups of points based on proximity -- can induce an arbitrary amount of distance error, not just the threshold distance.

How to compute the union polygon of two (or more) rectangles

For example we have two rectangles and they overlap. I want to get the exact range of the union of them. What is a good way to compute this?
These are the two overlapping rectangles. Suppose the cords of vertices are all known:
How can I compute the cords of the vertices of their union polygon? And what if I have more than two rectangles?
There exists a Line Sweep Algorithm to calculate area of union of n rectangles. Refer the link for details of the algorithm.
As said in article, there exist a boolean array implementation in O(N^2) time. Using the right data structure (balanced binary search tree), it can be reduced to O(NlogN) time.
Above algorithm can be extended to determine vertices as well.
Details:
Modify the event handling as follows:
When you add/remove the edge to the active set, note the starting point and ending point of the edge. If any point lies inside the already existing active set, then it doesn't constitute a vertex, otherwise it does.
This way you are able to find all the vertices of resultant polygon.
Note that above method can be extended to general polygon but it is more involved.
For a relatively simple and reliable way, you can work as follows:
sort all abscissas (of the vertical sides) and ordinates (of the horizontal sides) independently, and discard any duplicate.
this establishes mappings between the coordinates and integer indexes.
create a binary image of size NxN, filled with black.
for every rectangle, fill the image in white between the corresponding indexes.
then scan the image to find the corners, by contour tracing, and revert to the original coordinates.
This process isn't efficient as it takes time proportional to N² plus the sum of the (logical) areas of the rectangles, but it can be useful for a moderate amount of rectangles. It easily deals with coincidences.
In the case of two rectangles, there aren't so many different configurations possible and you can precompute all vertex sequences for the possible configuration (a small subset of the 2^9 possible images).
There is no need to explicitly create the image, just associate vertex sequences to the possible permutations of the input X and Y.
Look into binary space partitioning (BSP).
https://en.wikipedia.org/wiki/Binary_space_partitioning
If you had just two rectangles then a bit of hacking could yield some result, but for finding intersections and unions of multiple polygons you'll want to implement BSP.
Chapter 13 of Geometric Tools for Computer Graphics by Schneider and Eberly covers BSP. Be sure to download the errata for the book!
Eberly, one of the co-authors, has a wonderful website with PDFs and code samples for individual topics:
https://www.geometrictools.com/
http://www.geometrictools.com/Books/Books.html
Personally I believe this problem should be solved just as all other geometry problems are solved in engineering programs/languages, meshing.
So first convert your vertices into rectangular grids of fixed size, using for example:
MatLab meshgrid
Then go through all of your grid elements and remove any with duplicate edge elements. Now sum the number of remaining meshes and times it by the area of the mesh you have chosen.

Match housenumbers on buildings (special case of point-in-polygon-test)

Task with example
I'm working with geodata (country-size) from openstreetmap. Buildings are often polygons without housenumbers and a single point with the housenumber is placed within the polygon of the building. Buildings may have multiple housenumbers.
I want to match the housenumbers to the polygons of the buildings.
Simple solution
Foreach housenumber perform a point-in-polygon-test with each building-polygon.
Problem
Way too slow for about 50,000,000 buildings and 10,000,000 address-points.
Idea
Build and index for the building-polygons to accelerate the search for the surrounding polygon for each housenumber-point.
Question
What index or strategy would you recommend for this polygon-structure? The polygons never overlap and the area is sparsly covered.
This question is duplicated to gis.stackexchange.com. It was recommendet to post the question there.
Since it sounds like you have well-formed polygons to test against, I'd use a spatial hash with a AABB check, and then finally the full point-in-polygon test. Hopefully at that point you'll be averaging three or less point-in-polygon tests per address.
Break the area your data is over into a simple grid where a grid is a small multiple (2 to 4) of the median building size. (Maybe 100-200 meters?)
Compute the axis aligned bounding box of every polygon, add it (with its bounding box) to each grid location which the bounding box intersects. (It's pretty simple to figure out where an axis aligned bounding box overlaps regular axis aligned grid cells. I wouldn't store the grid in a simple 2D array -- I'd use a hash table that maps 2D integer grid coordinates, e.g. (1023, 301), to a list of polygons)
Then go through all your address points. Look up in your hash table what cell that point is in. Go through all the polygons in that cell and if the point is within any polygon's axis aligned bounding box do the full point-in-polygon test.
This has several advantages:
The data structures are simple -- no fancy libraries needed (other than handling polygons). With C++, your polygon library, and the std namespace this could be implemented in less than an hour.
Spatial structure isn't hierarchical -- when you're looking up the points you only have to do one O(1) lookup in the hash table.
And of course, the usual disadvantage of grids as a spatial structure:
Doesn't handle wildly varying sized polygons particularly well. However, I'm hoping since you're using map data the sizes are almost always within an order of magnitude, and probably much less.
Assuming you end up with N maximum polygons in each of grid and each polygon has P points and you've got B buildings and A addresses, you're looking at O(B*P + N*A). Since B and P are likely relatively small, especially on average, you could consider this O(B + N) -- pretty much linear.

Combining items into fixed sized groups for k-d tree

I'm using a k-d tree for spacial partitioning in a ray-tracer. I want to combine near-by primitives into fixed-sized groups so the data in each group can be deinterleaved and processed simultaneously using SIMD instructions. What is a good fast algorithm to find the (approximately) smallest fixed-sized groups?
Ideally it would augment the k-d tree building algorithm instead of adding a separate pass, but this is complicated by the fact that the primitives are normally so close together that most primitives will belong to more than one leaf node and I can't have groups with duplicate items because floating point precision errors would mess up shadows and reflections.
I figure I'm far from the first person to try this so a solution already exists, but the most relevant solutions I found from searching the internet for grouping objects deal with point data and variable-sized groups.
One popular way to group objects it to pick a random object, then add the "closest" k objects to a group with it. "Closest" is usually defined to mean giving the smallest surface area for the combined bounding box. You can also use the Hilbert curve distance; either the distance on the 3D curve between the objects' centers, or in 6D space where the bounding boxes become points:
(x_min, y_min, z_min, x_max, y_max, z_max)

How to find if a 3D object fits in another 3D object (the container)?

Given two 3d objects, how can I find if one fits inside the second (and find the location of the object in the container).
The object should be translated and rotated to fit the container - but not modified otherwise.
Additional complications:
The same situation - but look for the best fit solution, even if it's not a proper match (minimize the volume of the object that doesn't fit in the container)
Support for elastic objects - find the best fit while minimizing the "distortion" in the objects
This is a pretty general question - and I don't expect a complete solution.
Any pointers to relevant papers \ articles \ libraries \ tools would be useful
Here is one perhaps less than ideal method.
You could try fixing the position (in 3D space) of 1 shape. Placing the other shape on top of that shape. Then create links that connect one point in shape to a point in the other shape. Then simulate what happens when the links are pulled equally tight. Causing the point that isn't fixed to rotate and translate until it's stable.
If the fit is loose enough, you could use only 3 links (the bare minimum number of links for 3D) and try every possible combination. However, for tighter fit fits, you'll need more links, perhaps enough to place them on every point of the shape with the least number of points. Which means you'll some method to determine how to place the links, which is not trivial.
This seems like quite hard problem. Probable approach is to have some heuristic to suggest transformation and than check is it good one. If transformation moves object only slightly out of interior (e.g. on one part) than make slightly adjust to transformation and test it. If object is 'lot' out (e.g. on same/all axis on both sides) than make new heuristic guess.
Just an general idea for a heuristic. Make a rasterisation of an objects with same pixel size. It can be octree of an object volume. Make connectivity graph between pixels. Check subgraph isomorphism between graphs. If there is a subgraph than that position is for a testing.
This approach also supports 90deg rotation(s).
Some tests can be done even on graphs. If all volume neighbours of a subgraph are in larger graph, than object is in.
In general this is 'refined' boundary box approach.
Another solution is to project equal number of points on both objects and do a least squares best fit on the point sets. The point sets probably will not be ordered the same so iterating between the least squares best fit and a reordering of points so that the points on both objects are close to same order. The equation development for this is a lot of algebra but not conceptually complicated.
Consider one polygon(triangle) in the target object. For this polygon, find the equivalent polygon in the other geometry (source), ie. the length of the sides, angle between the edges, area should all be the same. If there's just one match, find the rigid transform matrix, that alters the vertices that way : X' = M*X. Since X' AND X are known for all the points on the matched polygons, this should be doable with linear algebra.
If you want a one-one mapping between the vertices of the polygon, traverse the edges of the polygons in the same order, and make a lookup table that maps each vertex one one poly to a vertex in another. If you have a half edge data structure of your 3d object that'll simplify this process a great deal.
If you find more than one matching polygon, traverse the source polygon from both the points, and keep matching their neighbouring polygons with the target polygons. Continue until one of them breaks, after which you can do the same steps as the one-match version.
There're more serious solutions that're listed here, but I think the method above will work as well.
What a juicy problem !. As is typical in computational geometry this problem
can be very complicated with a mismatched geometric abstraction. With all kinds of if-else cases etc.
But pick the right abstraction and the solution becomes trivial with few sub-cases.
Compute the Distance Transform of your shapes and Voilà! Your solution is trivial.
Allow me to elaborate.
The distance map of a shape on a grid (pixels) encodes the distance of the closest point on the
shape's border to that pixel. It can be computed in both directions outwards or inwards into the shape.
In this problem, the outward distance map suffices.
Step 1: Compute the distance map of both shapes D_S1, D_S2
Step 2: Subtract the distance maps. Diff = D_S1-D_S2
Step 3: if Diff has only positive values. Then your shapes can be contained in each other(+ve => S1 bigger than S2 -ve => S2 bigger than S1)
If the Diff has both positive and negative values, the shapes intersect.
There you have it. Enjoy !

Resources