Elastic Search Geo Spatial search implementation - elasticsearch

I am trying to understand how Elasticsearch supports geospatial search internally.
For basic search it uses the inverted index, but how does that combine with additional search criteria, such as searching for a particular text within a certain radius?
I would like to understand the internals of how the index is stored and queried to support these queries.

Text & geo queries are executed separately from one another. Let's take a concrete example:
PUT restaurants
{
"mappings": {
"properties": {
"location": {
"type": "geo_point"
},
"menu": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
POST restaurants/_doc
{
"name": "rest1",
"location": {
"lat": 40.739812,
"lon": -74.006201
},
"menu": [
"european",
"french",
"pizza"
]
}
POST restaurants/_doc
{
"name": "rest2",
"location": {
"lat": 40.7403963,
"lon": -73.9950026
},
"menu": [
"pizza",
"kebab"
]
}
You'd then match a text field and apply a geo_distance filter:
GET restaurants/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"menu": "pizza"
}
},
{
"geo_distance": {
"distance": "0.5mi",
"location": {
"lat": 40.7388,
"lon": -73.9982
}
}
},
{
"function_score": {
"query": {
"match_all": {}
},
"boost_mode": "avg",
"functions": [
{
"gauss": {
"location": {
"origin": {
"lat": 40.7388,
"lon": -73.9982
},
"scale": "0.5mi"
}
}
}
]
}
}
]
}
}
}
Since the geo_distance query only yields a boolean result (it merely checks whether the location is within the given radius, so every match scores 1), you may want to apply a Gaussian function_score to boost the locations that are closer to the origin.
Finally, these scores can be overridden with a _geo_distance sort, which orders the results by proximity (while of course keeping the match query intact):
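To make the Gaussian boost a bit more tangible, here is a minimal Python sketch of the decay curve as the Elasticsearch docs describe it for the gauss function. It assumes the default decay of 0.5 and an offset of 0; the function name gauss_decay and the sample distances are made up for illustration, and this is not the actual Elasticsearch implementation.

```python
import math

def gauss_decay(distance_m, scale_m, offset_m=0.0, decay=0.5):
    """Multiplier in (0, 1] for a document `distance_m` metres from the origin."""
    # sigma^2 is chosen so that the multiplier equals `decay` at offset + scale.
    sigma_sq = -(scale_m ** 2) / (2.0 * math.log(decay))
    # Distances inside the offset are treated as if they were at the origin.
    effective = max(0.0, distance_m - offset_m)
    return math.exp(-(effective ** 2) / (2.0 * sigma_sq))

HALF_MILE_M = 0.5 * 1609.344  # the "0.5mi" scale from the query, in metres

# Hypothetical distances: at the origin, nearby, exactly at scale, twice the scale.
for d in (0.0, 200.0, HALF_MILE_M, 2 * HALF_MILE_M):
    print(f"{d:8.1f} m -> {gauss_decay(d, HALF_MILE_M):.3f}")
```

A document at the origin keeps the full multiplier, one exactly scale away gets the decay value, and anything farther drops off quickly.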
...
"query": {...},
"sort": [
{
"_geo_distance": {
"location": {
"lat": 40.7388,
"lon": -73.9982
},
"order": "asc"
}
}
]
}
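If you want to sanity-check what that sort should return, the ordering can be approximated outside Elasticsearch. The sketch below uses the haversine formula on a spherical Earth, which is an assumption of this example and may differ slightly from what Elasticsearch computes; haversine_m and EARTH_RADIUS_M are illustrative names, and the two restaurants are the ones indexed above.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; an assumption of this sketch

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

origin = (40.7388, -73.9982)
restaurants = [
    {"name": "rest1", "lat": 40.739812, "lon": -74.006201},
    {"name": "rest2", "lat": 40.7403963, "lon": -73.9950026},
]

# Ascending distance, like "order": "asc" in the _geo_distance sort.
ranked = sorted(restaurants, key=lambda r: haversine_m(*origin, r["lat"], r["lon"]))
for r in ranked:
    print(r["name"], round(haversine_m(*origin, r["lat"], r["lon"])), "m")
```

Both restaurants fall inside the 0.5mi radius used earlier, so the sort only changes their order, not which ones are returned.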

Related

Find coordinates in a polygon

How can I find polygons that are stored in an Elasticsearch index?
Simple mapping:
PUT /regions
{
"mappings": {
"properties": {
"location": {
"type": "geo_shape"
}
}
}
}
And a simple polygon:
PUT /regions/_doc/1
{
"location" : {
"type" : "polygon",
"coordinates" : [
[
[53.847332102970626,27.485155519098047],
[53.84626875748117,27.487134989351038],
[53.8449047241684,27.48501067981124],
[53.84612634308789,27.482945378869765],
[53.847411219859,27.48502677306532],
[53.847332102970626,27.485155519098047]
]
]
}
}
According to the documentation, the geo-polygon query can only find coordinates that lie within a polygon supplied in the request, but I need the opposite: to find indexed polygons using coordinates given in the query. I am on Elasticsearch 7.6.
Query:
{
"query": {
"match_all": {}
},
"filter": {
"geo_shape": {
"geometry": {
"shape": {
"coordinates": [
53.846415,
27.485756
],
"type": "point"
},
"relation": "whithin"
}
}
}
}
You were on the right path but your query was heavily malformed. Here's the fix:
{
"query": {
"bool": {
"filter": {
"geo_shape": {
"location": {
"shape": {
"coordinates": [
53.846415,
27.485756
],
"type": "point"
},
"relation": "intersects"
}
}
}
}
}
}
Notice how I used intersects instead of within. The reason is explained in this GIS StackExchange answer.
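As I understand it, the relation is evaluated from the indexed shape's point of view, and a polygon can never lie within a single point, which is why within returns nothing here. To illustrate what the intersects check boils down to for a point and a polygon, here is a small ray-casting sketch in Python. It is only a conceptual illustration, not how Elasticsearch evaluates geo_shape queries, and point_in_polygon is a made-up helper. The coordinates are copied from the question in the same order they were posted.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count how many polygon edges a ray from (x, y) crosses."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (yi > y) != (yj > y):
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
            if x < x_cross:
                inside = not inside
        j = i
    return inside

# Outer ring of the polygon from the question (closed: first == last).
ring = [
    (53.847332102970626, 27.485155519098047),
    (53.84626875748117, 27.487134989351038),
    (53.8449047241684, 27.48501067981124),
    (53.84612634308789, 27.482945378869765),
    (53.847411219859, 27.48502677306532),
    (53.847332102970626, 27.485155519098047),
]

print(point_in_polygon(53.846415, 27.485756, ring))  # the point from the query
print(point_in_polygon(53.85, 27.49, ring))          # a hypothetical far-away point
```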

Elasticsearch - Query to Determine All Unique IDs that are distance X away from a particular ID?

I have data in this format, generated from a random walk (to simulate people walking around): { location : { lat: someLat, lon: someLong }, id: uniqueId, date: date }. Given a user's unique ID, I am trying to write a query that finds how many other unique IDs came within X distance of that ID within a certain time range. Any hints on how to accomplish this?
My idea is to have a top-level filter aggregation with a nested geo query of some sort. I think the geo-distance query is the way to go, but I am not sure how to include it in the query below to get all of the unique IDs that come within X distance of the ID I am filtering on. The query below is my starting point: it filters all documents from now - 1 day to now where the document's user ID is the provided value. How would I check all other documents for their distances against the documents that match this query?
{
"aggs" : {
"range": {
"date_range": {
"field": "date",
"format": "MM-yyyy",
"ranges": [
{ "to": "now" },
{ "from": "now-1d" }
]
}
},
"locations" : {
"filter" : {
"term": { "id.keyword": "7a50ab18-886b-42a2-80ad-3d45112e3cfd" }
}
}
}
}
Your hunch is correct. All of this can be done using range & geo_distance filtering and _geo_distance sorting. You want to filter at the query level, though, not in the aggs:
GET walking/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"date": {
"gte": "now-1d"
}
}
}
],
"filter": [
{
"geo_distance": {
"distance": "20m",
"location": {
"lat": 48.20150179951008,
"lon": 16.39111876487732
}
}
}
]
}
},
"aggs": {
"rings_around_loc": {
"geo_distance": {
"field": "location",
"origin": {
"lat": 48.20150179951008,
"lon": 16.39111876487732
},
"unit": "m",
"keyed": true,
"ranges": [
{
"to": 10
},
{
"from": 10,
"to": 50
},
{
"from": 50
}
]
}
},
"locations": {
"value_count": {
"field": "id.keyword"
}
}
},
"sort": [
{
"_geo_distance": {
"location": {
"lat": 48.20150179951008,
"lon": 16.39111876487732
},
"order": "asc",
"unit": "m",
"mode": "min",
"distance_type": "arc",
"ignore_unmapped": true
}
}
]
}
Not sure what you need the range buckets for so I left them out.
Full steps to replicate:
PUT walking
{
"mappings": {
"properties": {
"date": {
"type": "date"
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"location": {
"type": "geo_point"
}
}
}
}
And then POST _bulk this random walk data
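One thing to keep in mind when reading the locations count: as far as I know, value_count counts field values across all matching documents, so someone who shows up in several matching documents is counted several times. If you need distinct people, a cardinality aggregation (approximate) plus excluding the queried user's own ID would be closer to the question. The Python sketch below only simulates that difference on a hypothetical list of IDs; it does not talk to Elasticsearch.

```python
TARGET_ID = "7a50ab18-886b-42a2-80ad-3d45112e3cfd"

# Hypothetical ids of the documents that passed the range and geo_distance filters.
matching_ids = [TARGET_ID, "user-b", "user-b", "user-c", TARGET_ID, "user-b"]

# What a value_count-style aggregation reports: one per matching value.
value_count = len(matching_ids)

# What the question asks for: distinct ids other than the user being queried.
distinct_others = {doc_id for doc_id in matching_ids if doc_id != TARGET_ID}

print("value_count:", value_count)
print("distinct other ids:", len(distinct_others), sorted(distinct_others))
```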

Elasticsearch geosearch with distance preference

Sorry for the noob question... I have Restaurant objects in Elasticsearch 2.3; each has a GeoPoint and a home delivery distance preference. In pseudocode:
restaurant: {
location: (x, y)
deliveryPreference: 10km
}
and a user:
user {
location: (a,b)
}
How would I issue a search for a user looking for all restaurants that can deliver in his area?
The solution involves using geo_shapes. You need to model the restaurant documents as follows:
PUT restaurants
{
"mappings": {
"restaurant": {
"properties": {
"name": {
"type": "text"
},
"location": {
"type": "geo_point"
},
"delivery_area": {
"type": "geo_shape",
"tree": "quadtree",
"precision": "1m"
}
}
}
}
}
You can then index your restaurants as follows:
POST /restaurants/restaurant/1
{
"name": "My Food place",
"location": [-45.0, 45.0], <-- lon, lat !!!
"delivery_area": {
"type": "circle",
"coordinates" : [-45.0, 45.0], <-- lon, lat !!!
"radius" : "10km"
}
}
Each restaurant will thus be associated with a circle shape centered on its location and with a proper radius.
Finally, when a user wants to know which restaurant can deliver at the location she is currently at, you can issue the following geo_shape query:
POST /restaurants/_search
{
"query":{
"bool": {
"filter": {
"geo_shape": {
"delivery_area": {
"shape": {
"type": "point",
"coordinates" : [<user_lon>, <user_lat>]
},
"relation": "contains"
}
}
}
}
}
}
In this query, we are retrieving restaurants whose delivery_area shape contains the point the user is currently located at.
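The reason a shape per restaurant is needed is that the radius differs from document to document, whereas a plain geo_distance query applies one radius to everything. A rough Python sketch of the same check is shown below, assuming a spherical Earth and haversine distances. The second restaurant, the user position and the helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def can_deliver(restaurant, user_lat, user_lon):
    """True if the user lies inside the restaurant's own delivery circle."""
    d = haversine_km(restaurant["lat"], restaurant["lon"], user_lat, user_lon)
    return d <= restaurant["radius_km"]

restaurants = [
    {"name": "My Food place", "lat": 45.0, "lon": -45.0, "radius_km": 10.0},
    {"name": "Hypothetical Diner", "lat": 45.2, "lon": -45.0, "radius_km": 5.0},
]

user_lat, user_lon = 45.05, -45.0  # hypothetical user position
deliverers = [r["name"] for r in restaurants if can_deliver(r, user_lat, user_lon)]
print(deliverers)
```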
Index the restaurant documents with the following mapping:
{
"mappings": {
"type_name": {
"properties": {
"name": {
"type": "text"
},
"location": {
"type": "text"
},
"location_geo": {
"type": "geo_point"
}
}
}
}
}
Then use a geo_distance filter query:
{
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": "10km",
"location_geo": {
"lat": 40,
"lon": -70
}
}
}
}
}
}

ElasticSearch 2 bucket level sorting

The mapping of the database is this:
{
"users": {
"mappings": {
"user": {
"properties": {
"credentials": {
"type": "nested",
"properties": {
"achievement_id": {
"type": "string"
},
"percentage_completion": {
"type": "integer"
}
}
},
"current_location": {
"type": "geo_point"
},
"locations": {
"type": "geo_point"
}
}
}
}
}
}
As you can see in the mapping, there are two geo_point fields: current_location and locations. I want to sort users by credentials.percentage_completion, which is a nested field. This works fine, for example with this query:
Example Query:
GET /users/user/_search?size=23
{
"sort": [
{
"credentials.percentage_completion": {
"order": "desc",
"missing": "_last"
}
},
"_score"
],
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"geo_distance": {
"distance": "100000000km",
"user.locations": {
"lat": 19.77,
"lon": 73
}
}
}
}
}
}
I want to change the sort order so that it works in buckets: first show all the people within a 100 km radius of user.current_location, sorted by credentials.percentage_completion, and then the rest of the users, again sorted by credentials.percentage_completion.
I tried putting a conditional in the sort and making it multilevel, but that does not work because only nested sorts can have filters, and only on the nested field's children.
I thought I could sort by _score and give more relevance to people who are under 1000 km, but geo-distance is a filter, and I can't find any way to assign relevance in a filter.
Is there anything I am missing here? Any help would be great.
Thanks
Finally solved it; posting it here so that others who land here can take a lead from it. The way to solve this is to give a constant relevance score to a particular query. Since geo-distance is a filter, I could not use it as a scoring query directly, but then I found the constant_score query, which allows wrapping a filter inside a query.
This is how the query looks:
GET /users/user/_search?size=23
{
"sort": [
"_score",
{
"credentials.percentage_completion": {
"order": "desc",
"missing": "_last"
}
}
],
"explain": true,
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"constant_score": {
"filter": {
"geo_distance": {
"distance": "100km",
"user.current_location": {
"lat": 19.77,
"lon": 73
}
}
},
"boost": 50
}
},
{
"constant_score": {
"filter": {
"geo_distance": {
"distance": "1000000km",
"user.locations": {
"lat": 19.77,
"lon": 73
}
}
},
"boost": 1
}
}
]
}
},
"filter": {
"geo_distance": {
"distance": "10000km",
"user.locations": {
"lat": 19.77,
"lon": 73
}
}
}
}
}
}
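The effect of the two constant_score clauses can be pictured as a two-level sort: a coarse score that only says "near" or "not near", then percentage completion within each group. Here is a small Python sketch of that idea. The users, the precomputed current_distance_km field and the score helper are all hypothetical, and the exact score values Elasticsearch produces may differ (for example because of normalisation); only the resulting order matters.

```python
NEAR_BOOST = 50.0  # boost of the 100km constant_score clause
BASE_BOOST = 1.0   # boost of the catch-all constant_score clause

def score(user, near_km=100.0):
    """Sum of the boosts of the should clauses this user matches."""
    total = BASE_BOOST  # assumed to match every user that passes the outer filter
    if user["current_distance_km"] <= near_km:
        total += NEAR_BOOST
    return total

# Hypothetical users with a precomputed distance to the search point.
users = [
    {"name": "u1", "current_distance_km": 20.0, "percentage_completion": 40},
    {"name": "u2", "current_distance_km": 500.0, "percentage_completion": 90},
    {"name": "u3", "current_distance_km": 80.0, "percentage_completion": 75},
    {"name": "u4", "current_distance_km": 3000.0, "percentage_completion": 10},
]

# Sort by score first (descending), then by percentage completion (descending).
ranked = sorted(users, key=lambda u: (-score(u), -u["percentage_completion"]))
print([u["name"] for u in ranked])
```

Nearby users come first regardless of their completion percentage, which is exactly the bucket behaviour described in the question.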

Array of locations in Elasticsearch spatial query

I am new to Elasticsearch and can't find a solution to my problem. Consider the following query:
GET banknew/_search/
{
"query": {
"match_all": {}
},
"filter": {
"geo_distance": {
"location": {
"lat": 8.722479,
"lon": 78.13047
},
"distance": "5km"
}
}
}
This gives me the result. The above query is for one location (one lat/lon pair), but I need the result for multiple locations (two or more lat/lon pairs). What I tried is:
GET banknew/_search/
{
"query": {
"match_all": {}
},
"filter": {
"geo_distance": {
"location": [{
"lat": 8.722479,
"lon": 78.13047
},{
"lat": 8.722479,
"lon": 78.13047
} ],
"distance": "5km"
}
}
}
I need the points within 5km of the first location and also of the second location.
But I am receiving the error "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed". Is this possible? Please guide me. Thanks in advance.
You could use another geo_distance filter and wrap both in a bool filter.
If you are searching for results within 5km of the first location OR the second location, add both to the should clause.
Try something like this:
GET banknew/_search/
{
"query": {
"match_all": {}
},
"filter": {
"bool": {
"should": [
{
"geo_distance": {
"distance": "5km",
"location": {
"lat": lat1,
"lon": lon1
}
}
},
{
"geo_distance": {
"distance": "5km",
"location": {
"lat": lat2,
"lon": lon2
}
}
}
],
"minimum_should_match": 1
}
}
}
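If the number of locations varies, the should clauses can be generated instead of written by hand. The following Python sketch just builds the request body as a plain dict and prints it as JSON; it uses no Elasticsearch client. The helper name any_within and the second coordinate pair are made up for illustration. The top-level filter key mirrors the older syntax used in this thread; newer Elasticsearch versions would expect the bool under query.bool.filter instead.

```python
import json

def any_within(origins, distance, field="location"):
    """Build a bool filter that matches documents near ANY of the given origins."""
    should = [
        {"geo_distance": {"distance": distance, field: {"lat": lat, "lon": lon}}}
        for lat, lon in origins
    ]
    return {"bool": {"should": should, "minimum_should_match": 1}}

# First pair is from the question; the second pair is a hypothetical extra location.
origins = [(8.722479, 78.13047), (8.7642, 78.1348)]

body = {
    "query": {"match_all": {}},
    "filter": any_within(origins, "5km"),
}
print(json.dumps(body, indent=2))
```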

Resources