Could you please help me understand the issue with H3 geospatial indexing?
import h3
geo_antarctic = {"type":"Polygon","coordinates":[[[-170.63764683701507,-85.05113000000047],[-170.63764683701507,-77.89462449499929],[-63.82520589349025,-66.39564184408599],[-49.69216225292467,-77.30460454007437],[-35.16653406678777,-77.89462449499929],[-9.255954059083527,-70.29658532122083],[40.994867774038596,-68.50197979740217],[89.56411960844528,-64.94027568501143],[163.48124599227498,-67.77106116580279],[172.90327508598565,-72.42721956336818],[165.83675326570284,-77.7288586062699],[178.18462781512582,-77.47601087207454],[178.57721236069702,-85.0171471646522],[-178.63764683701507,-85.05113000000047]]]}
idx = h3.polyfill(geo_antarctic, 3)
I'm expecting to get indices like these ones, which are located inside of the geojson polygon above:
83ef9efffffffff
83eea4fffffffff
83f125fffffffff
83f2a4fffffffff
But instead, h3.polyfill returns indices that are “flipped” by 90 degrees like these:
836682fffffffff
830e59fffffffff
83b294fffffffff
836733fffffffff
838f0bfffffffff
830372fffffffff
Everything works fine for other GeoJSON polygons that don't span Antarctica.
I’m using Python 3.10.7 and H3 3.7.4.
I would appreciate any hints.
Upd.
I used the geo_json_conformant=True parameter and it flipped the indices back. But it seems that not all resolution 3 indices were generated, and my expected indices are not in the list. In the image, generated indices are in blue and expected ones are in red.
Upd 2
Following the suggestion from #nrabinowitz, I triangulated the original polygon from the pole and then polyfilled the resulting "slices". It works perfectly: all the missing indices are in place.
import h3
import geojson
geoj = {"type":"Polygon","coordinates":[[[-178.34111242523068,-85.0207089708011],[-178.69267492523034,-77.91567194747755],[-162.52079992523068,-78.4905544838336],[-140.02079992523,-73.8248242864237],[-126.66142492523065,-73.12494935304983],[-103.10673742523004,-74.59011176731619],[-103.45829992523063,-71.07406105104535],[-83.06767492523001,-73.52840349816283],[-61.97392492523001,-64.32087770911836],[-57.052049925230655,-62.43108077917767],[-59.86454992522999,-74.77584672076205],[-39.12236242523063,-77.8418507294947],[-12.052049925230301,-70.61261893331015],[35.05732507477002,-68.52824009786191],[53.33857507476973,-65.51296841598038],[76.54170007476968,-68.39918525054024],[93.06513757477003,-64.77413134111099],[143.69013757477003,-66.08937000336596],[173.22138757477006,-70.72898413027124],[167.94795007477003,-76.26869800825351],[177.79170007476975,-77.23507678015689],[178.60169170931843,-84.94715491814792],[-178.34111242523068,-85.0207089708011]]]}
polygon_coords = geoj["coordinates"][0]
pole_coord = (0.0, -89.999)
all_indexes = set()
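# Build one triangle per polygon edge, with the pole as the third vertex.
for i in range(len(polygon_coords)-1):
    polygon = geojson.Polygon([[pole_coord, tuple(polygon_coords[i]), tuple(polygon_coords[i+1])]])
    # geo_json_conformant=True tells h3 the coordinates are in (lng, lat) order.
    idxes = h3.polyfill(dict(polygon), 5, geo_json_conformant=True)
    all_indexes.update(idxes)
# Write the combined set of indices to a CSV file, one per line.
with open("./absent_polr.csv", "w") as out:
    out.write("h3_idx\r\n")
    out.write("\r\n".join(all_indexes))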
H3 uses a Cartesian model for polygons used in polyfill, not a spherical model, so poles are slightly challenging. The other issue is that we assume the smaller polygon when some of the arcs are greater than 180 degrees, which is frequently the case very near the poles.
See the suggested workaround in this related issue:
I think the best workaround here is to slice up polygons that contain a pole. I think the simplest version of this, which ought to work, is to make triangles, one for each pair of vertexes, with the third vertex being the pole itself.
The other participant in that discussion glossed this as "slicing the polygon up like a pizza," which I thought was very descriptive.
There's a demo of this approach in this Observable notebook.
RediSearch looks promising after reading https://redislabs.com/blog/search-benchmarking-redisearch-vs-elasticsearch/.
We use elasticsearch currently. We rely heavily on its polygon query feature https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-polygon-query.html.
I couldn't find a polygon query in RediSearch. Is it there under a different name? Is anyone out there using RediSearch for polygon queries? How do you achieve that?
For now, the only option I see is to use the Geo filter to get points in different circles, and then find the intersection of those with my polygon in application code.
I'm in the exact same boat, and it seems like as of v1.4.8 geo filters are limited to geo radius filters. However, it does look like an issue was created to add support for geo polygon filters:
https://github.com/RedisLabsModules/RediSearch/issues/680
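Until polygon filters exist, the application-side step mentioned in the question (take the candidates returned by the radius filters, then keep only those inside the polygon) could look roughly like the sketch below. It is a plain ray-casting test in planar lon/lat coordinates, so it assumes a polygon that is small enough for that to be acceptable and that does not cross the antimeridian or contain a pole. The function names are made up for illustration and are not part of RediSearch.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of (lon, lat) vertices, open or closed."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray going east from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def filter_candidates(candidates, ring):
    """Keep only the candidate (lon, lat) points that fall inside the polygon."""
    return [p for p in candidates if point_in_polygon(p[0], p[1], ring)]


# Hypothetical data: a square polygon and four candidates from radius queries.
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
candidates = [(5.0, 5.0), (12.0, 5.0), (1.0, 9.0), (-1.0, -1.0)]
print(filter_candidates(candidates, square))  # [(5.0, 5.0), (1.0, 9.0)]
```

Points lying exactly on an edge may go either way with this test, so treat boundary cases with care.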
I'm trying to match a latitude and longitude pair to a road segment that has a start and end latitude and longitude. All of the formulas I've been able to find query by the closest match to a single latitude and longitude, but not to a pair. I suppose one option is to take the average, or center, of the segment, but this is not ideal. I'm querying this in SQLite since my data is in GeoPackage format, but if anyone even has a formula to use, I can translate it to SQLite.
Thanks!
Hmm, interesting. Conceptually, and without considering performance, would the following work? (Consider this untested pseudocode.)
SELECT
    MIN(ABS(lon_field - :lon1)) AS start_lon,
    MIN(ABS(lat_field - :lat1)) AS start_lat,
    MIN(ABS(lon_field - :lon2)) AS end_lon,
    MIN(ABS(lat_field - :lat2)) AS end_lat;
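As for a formula, one candidate is the point-to-segment distance: project the point onto the segment, clamp the projection to the two endpoints, and measure the distance to that closest point. Below is a minimal Python sketch of that idea, not a tested solution. It assumes a spherical Earth and a local equirectangular projection, which should be reasonable for short road segments but not for long ones or near the poles. The function names and the tuple layout of `segments` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def point_to_segment_m(lat, lon, lat1, lon1, lat2, lon2):
    """Approximate distance in meters from (lat, lon) to the segment between the two endpoints."""
    # Project onto a local plane whose origin is the query point.
    cos_lat = math.cos(math.radians(lat))

    def to_xy(la, lo):
        x = math.radians(lo - lon) * cos_lat * EARTH_RADIUS_M
        y = math.radians(la - lat) * EARTH_RADIUS_M
        return x, y

    ax, ay = to_xy(lat1, lon1)
    bx, by = to_xy(lat2, lon2)
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: start and end coincide
    else:
        # Fraction along the segment of the closest point, clamped to [0, 1].
        # The query point is the origin, so the vector from A to it is (-ax, -ay).
        t = max(0.0, min(1.0, (-ax * dx - ay * dy) / seg_len_sq))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(cx, cy)


def nearest_segment(lat, lon, segments):
    """segments: iterable of (segment_id, lat1, lon1, lat2, lon2) tuples."""
    return min(segments, key=lambda s: point_to_segment_m(lat, lon, *s[1:]))


# Hypothetical data: two parallel east-west segments.
segments = [("A", 0.0, 0.0, 0.0, 0.01), ("B", 0.01, 0.0, 0.01, 0.01)]
print(nearest_segment(0.001, 0.005, segments)[0])  # A
```

With millions of segments you would probably first narrow the candidates with a bounding-box condition in the query and only then apply this distance in application code.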
I have a data set of about 20 million coordinates. I want to be able to pass in a latitude, longitude, and distance in miles and return all coordinates that are within the mile range of my given coordinates. I need the response time to ideally be sub 50ms.
I have tried loading all coordinates into memory in a Go service which, on every request, loops through the data and uses the haversine formula to keep only the coordinates that are within the given distance in miles of my given coordinate.
This method sees the results return in around 2 seconds. What approach would be good to increase the speed of the results? I am open to any suggestions.
I am toying around with the idea of grouping all coordinates by degree and only filtering by the nearest to the given coordinates. Haven't had any luck improving the response times yet though. My data set is only a test one too as the real data could potentially be in the hundreds of millions.
I think that this is more of a data structure problem. One good way to store large sets of geospatial coordinates is with an R-tree. It provides O(log_M n) search on average, where M is the maximum number of entries per node. I have limited knowledge of Go, but I have used an R-tree to great effect for similarly sized datasets in a similar use case in a JS application. From a quick search, it appears that there are at least a couple of Go R-tree implementations out there.
The idea would be to have a "grid" that partitions the coordinates, so that when you need to do a lookup you can safely return all coordinates in cells that lie entirely within the distance, return none from cells that are too far away from the target, and only do a per-coordinate comparison in the cells that contain some coordinates within the distance and some outside it.
Simplified to 1D:
Coordinates are from 1 to 100
you partition into 10 blocks of 10
When somebody looks for all coordinates within distance 25 from 47
you return all coordinates in blocks [30,39], [40,49],[50,59],[60,69] and then after doing per coordinate analysis for blocks [20,29] and [70,79] you additionally return 22,23,24,25,26,27,28,29, 70,71,72.
Unfortunately I have no realistic way to estimate speedup of this approach so you would need to implement it and benchmark it by yourself.
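The 1D example above can be sketched in a few lines of Python. This is only an illustration of the bucketing idea with integer coordinates and a block width of 10; the names `build_grid` and `within` are made up here, and a real implementation would use 2D cells and a proper distance function such as haversine.

```python
from collections import defaultdict

BLOCK = 10  # block width; blocks are [0,9], [10,19], [20,29], ...


def build_grid(coords):
    """Group integer coordinates by the block they fall into."""
    grid = defaultdict(list)
    for c in coords:
        grid[c // BLOCK].append(c)
    return grid


def within(grid, target, dist):
    """Return all coordinates c with |c - target| <= dist."""
    lo, hi = target - dist, target + dist
    result = []
    # Blocks outside [lo // BLOCK, hi // BLOCK] are too far away and never visited.
    for b in range(lo // BLOCK, hi // BLOCK + 1):
        block_lo, block_hi = b * BLOCK, b * BLOCK + BLOCK - 1
        if lo <= block_lo and block_hi <= hi:
            # Block lies entirely inside the range: take it wholesale.
            result.extend(grid.get(b, []))
        else:
            # Boundary block: check each coordinate individually.
            result.extend(c for c in grid.get(b, []) if lo <= c <= hi)
    return result


grid = build_grid(range(1, 101))
hits = sorted(within(grid, 47, 25))
print(hits[0], hits[-1], len(hits))  # 22 72 51
```

Blocks [30,39] through [60,69] are returned without any per-coordinate work, and only [20,29] and [70,79] are filtered, which is where the saving comes from.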
MongoDB has various geographic searches. $geoNear will allow you to search for points within a specific distance from a point or within a shape.
https://docs.mongodb.com/manual/reference/operator/aggregation/geoNear/
PostGIS for Postgres has something similar, but I am not too familiar with it.
I am struggling with DWithin queries in geomesa.
I have ingested many geo points from OSM and want to make DWithin queries.
I have the following code for query:
val query = new Query("t1", ECQL.toFilter("DWITHIN(geo_point, POINT (14.453943 60.499611), 5000, meters)"))
I expect geomesa to answer with the points no farther than 5000 meters from POINT (14.453943 60.499611) (points 2,3,4 on the map).
But geomesa gives me more points than expected.
All the points in the map (1,2,3,4,5,6) are returned for this query.
It seems that geomesa can't properly filter out the points for DWithin query since it does not have support for geodesic distance checks.
So, is there any way to make DWITHIN query work correctly (in a geodesic manner) with geomesa?
Thanks!
GeoMesa uses the geotools dwithin filter function for such queries. Unfortunately, the function only supports native distances (i.e. degrees in WGS84).
Currently, your best bet is to use the geotools GeodeticCalculator class to create a polygon covering your query area and use that in an intersects filter. Alternatively, you could post-filter the results using the Geodetic Calculator.
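To illustrate both workarounds without depending on GeoTools, here is a rough Python sketch. It is not the GeodeticCalculator API: it assumes a spherical Earth rather than the WGS84 ellipsoid, so distances can be off by a fraction of a percent, and all function names are made up for the example. `circle_wkt` builds a WKT polygon approximating the query circle, and `haversine_m` can be used to post-filter returned points.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed spherical Earth (mean radius)


def destination(lon, lat, bearing_deg, distance_m):
    """Point reached from (lon, lat) after distance_m along bearing_deg on a sphere."""
    d = distance_m / EARTH_RADIUS_M
    theta = math.radians(bearing_deg)
    phi1, lam1 = math.radians(lat), math.radians(lon)
    phi2 = math.asin(
        math.sin(phi1) * math.cos(d) + math.cos(phi1) * math.sin(d) * math.cos(theta)
    )
    lam2 = lam1 + math.atan2(
        math.sin(theta) * math.sin(d) * math.cos(phi1),
        math.cos(d) - math.sin(phi1) * math.sin(phi2),
    )
    return math.degrees(lam2), math.degrees(phi2)


def circle_ring(lon, lat, radius_m, n=36):
    """Closed ring of n vertices lying at radius_m from the center."""
    ring = [destination(lon, lat, 360.0 * i / n, radius_m) for i in range(n)]
    ring.append(ring[0])  # close the ring
    return ring


def circle_wkt(lon, lat, radius_m, n=36):
    """WKT polygon approximating the circle, in lon lat order."""
    body = ", ".join(f"{x:.6f} {y:.6f}" for x, y in circle_ring(lon, lat, radius_m, n))
    return f"POLYGON (({body}))"


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters, usable for post-filtering results."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


wkt = circle_wkt(14.453943, 60.499611, 5000)
ecql = f"INTERSECTS(geo_point, {wkt})"
print(ecql[:40])
```

Because the polygon is inscribed in the circle, it slightly undercovers the true query area. Increasing `n` or padding the radius a little and then post-filtering with `haversine_m` would compensate for that.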
Going forward, I've created a ticket to handle this use case better in GeoMesa: https://geomesa.atlassian.net/browse/GEOMESA-2263