Elasticsearch: filter documents based on a different radius for different geopoint fields

I have ES documents similar to this; each one has a location array and a type field.
{
"type": "A/B/C",
"locations1": [
{
"lat": 19.0179332,
"lon": 72.868069
},
{
"lat": 18.4421771,
"lon": 73.8585108
}
]
}
The type value determines the distance applicable to that document's locations.
Let's say the allowed query distance is 10km for type A, 100km for type B, and 1000km for type C.
Given a location L, I want to find all documents whose locations satisfy the distance criterion for that document's type, and the final result should be sorted by distance.
I am not able to figure out how to use a dynamic radius for this. Is it possible, or do I need to change my document structure to something like this?
EDIT:
I was also thinking of restructuring the document locations like this:
{
"locationsTypeA": [
{
"lat": 19.0179332,
"lon": 72.868069
},
{
"lat": 18.4421771,
"lon": 73.8585108
}
],
"locationsTypeB": [
{
"lat": 19.0179332,
"lon": 72.868069
},
{
"lat": 18.4421771,
"lon": 73.8585108
}
],
"locationsTypeC": [
{
"lat": 19.0179332,
"lon": 72.868069
},
{
"lat": 18.4421771,
"lon": 73.8585108
}
]
}
And then I can use this query:
{
"query": {
"bool": {
"should": [
{
"geo_distance": {
"distance": "10km",
"locationsTypeA": {
"lat": 12.5,
"lon": 18.2
}
}
},
{
"geo_distance": {
"distance": "100km",
"locationsTypeB": {
"lat": 12.5,
"lon": 18.2
}
}
},
{
"geo_distance": {
"distance": "1000km",
"locationsTypeC": {
"lat": 12.5,
"lon": 18.2
}
}
}
]
}
}
}

Using the first document structure and a mapping like this:
PUT geoindex
{
"mappings": {
"properties": {
"locations": {
"type": "geo_point"
}
}
}
}
Let's take a random point between Pune and Mumbai to be the origin relative to which we'll perform a scripted geo query using the arcDistance function:
GET geoindex/_search
{
"query": {
"bool": {
"must": [
{
"script": {
"script": {
"source": """
def type = doc['type.keyword'].value;
// default of 0 so that documents with an unknown type never match
def dynamic_distance = 0;
if (type == "A") {
dynamic_distance = 10e3;
} else if (type == "B") {
dynamic_distance = 100e3;
} else if (type == "C") {
dynamic_distance = 1000e3;
}
def distance_in_m = doc['locations'].arcDistance(
params.origin.lat,
params.origin.lon
);
return distance_in_m < dynamic_distance
""",
"params": {
"origin": {
"lat": 18.81531,
"lon": 73.49029
}
}
}
}
}
]
}
},
"sort": [
{
"_geo_distance": {
"locations": {
"lat": 18.81531,
"lon": 73.49029
},
"order": "asc"
}
}
]
}
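To check what such a query should return, the filtering rule can be reproduced outside Elasticsearch. The following is only a rough reference sketch: it assumes a document matches when its nearest location is inside the radius for its type, uses a plain haversine formula with an assumed Earth radius, and the helper names (haversine_m, nearest_distance_m, filter_and_sort) are made up for illustration. The script query above may treat multi-valued fields differently.

```python
from math import asin, cos, radians, sin, sqrt

# Mean Earth radius in meters (an assumption; Elasticsearch may use another constant).
EARTH_RADIUS_M = 6371008.8

# Allowed radius per document type, in meters (values from the question).
RADIUS_BY_TYPE_M = {"A": 10_000.0, "B": 100_000.0, "C": 1_000_000.0}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearest_distance_m(doc, origin):
    """Distance from the origin to the closest of the document's locations."""
    return min(
        haversine_m(origin["lat"], origin["lon"], loc["lat"], loc["lon"])
        for loc in doc["locations"]
    )


def filter_and_sort(docs, origin):
    """Keep docs whose nearest location is inside their type's radius, nearest first."""
    hits = []
    for doc in docs:
        radius = RADIUS_BY_TYPE_M.get(doc["type"])
        if radius is None:
            continue  # unknown type: never matches
        distance = nearest_distance_m(doc, origin)
        if distance < radius:
            hits.append((distance, doc))
    hits.sort(key=lambda pair: pair[0])
    return [doc for _, doc in hits]


# The two sample locations and the origin used in this thread.
LOCATIONS = [
    {"lat": 19.0179332, "lon": 72.868069},
    {"lat": 18.4421771, "lon": 73.8585108},
]
ORIGIN = {"lat": 18.81531, "lon": 73.49029}
docs = [{"type": t, "locations": LOCATIONS} for t in "ABC"]
print([d["type"] for d in filter_and_sort(docs, ORIGIN)])
```

With these sample points the nearest location is roughly 57 km from the origin, so the type A document (10km limit) is dropped while the type B and C documents are kept.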

I used a similar but less complex approach.
Here's the code:
{
query: {
bool: {
must: [
{
match: {
companyName: {
query: req.text
}
}
},
{
script: {
script: {
params: {
lat: parseFloat(req.lat),
lon: parseFloat(req.lon)
},
source: "doc['location'].arcDistance(params.lat, params.lon) / 1000 < doc['searchRadius'].value",
lang: "painless"
}
}
}
]
}
},
sort: [
{
_geo_distance: {
location: {
lat: parseFloat(req.lat),
lon: parseFloat(req.lon)
},
order: "asc",
unit:"km"
}
}
]
}

Related

Elastic Search Geo Spatial search implementation

I am trying to understand how Elasticsearch supports geospatial search internally.
For basic search it uses the inverted index, but how does that combine with additional search criteria, such as searching for a particular text within a certain radius?
I would like to understand the internals of how the index is stored and queried to support these queries.
Text and geo queries are executed independently of one another. Let's take a concrete example:
PUT restaurants
{
"mappings": {
"properties": {
"location": {
"type": "geo_point"
},
"menu": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
POST restaurants/_doc
{
"name": "rest1",
"location": {
"lat": 40.739812,
"lon": -74.006201
},
"menu": [
"european",
"french",
"pizza"
]
}
POST restaurants/_doc
{
"name": "rest2",
"location": {
"lat": 40.7403963,
"lon": -73.9950026
},
"menu": [
"pizza",
"kebab"
]
}
You'd then match a text field and apply a geo_distance filter:
GET restaurants/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"menu": "pizza"
}
},
{
"geo_distance": {
"distance": "0.5mi",
"location": {
"lat": 40.7388,
"lon": -73.9982
}
}
},
{
"function_score": {
"query": {
"match_all": {}
},
"boost_mode": "avg",
"functions": [
{
"gauss": {
"location": {
"origin": {
"lat": 40.7388,
"lon": -73.9982
},
"scale": "0.5mi"
}
}
}
]
}
}
]
}
}
}
Since the geo_distance query only yields a boolean result (score = 1; it only checks whether the location is within the given radius), you may want to apply a Gaussian function_score to boost the locations that are closer to the origin.
Finally, you can override these scores with a _geo_distance sort, which orders the results by proximity (while of course keeping the match query intact):
...
"query": {...},
"sort": [
{
"_geo_distance": {
"location": {
"lat": 40.7388,
"lon": -73.9982
},
"order": "asc"
}
}
]
}
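To get a feel for how the gauss function in the function_score above weights documents, here is a small sketch of the decay curve. It follows the Gaussian decay formula as described in the Elasticsearch function_score documentation, with the default decay of 0.5 and offset of 0 assumed; the function name gauss_decay is made up for illustration, and distances are plain numbers in the same unit as the scale.

```python
from math import exp, log


def gauss_decay(distance, scale, offset=0.0, decay=0.5):
    """Gaussian decay score: 1.0 at the origin, `decay` at `scale` past the offset."""
    # sigma^2 is chosen so that the score equals `decay` when distance == scale.
    sigma_sq = -(scale ** 2) / (2.0 * log(decay))
    effective = max(0.0, distance - offset)
    return exp(-(effective ** 2) / (2.0 * sigma_sq))


scale_mi = 0.5  # same scale as in the query above
for d in (0.0, 0.25, 0.5, 1.0):
    print(f"{d:.2f} mi -> {gauss_decay(d, scale_mi):.3f}")
```

A document exactly at the origin scores 1.0, one at 0.5mi scores 0.5, and the score drops off quickly beyond that, which is why the function favors nearby restaurants inside the geo_distance radius.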

Query geo distance on elasticsearch with dynamic distance

I am facing an issue with Elasticsearch.
I would like to use the geo distance feature to fetch all the items located at most N km from a given location.
Here is my DB schema:
{
"user_id": "abcde",
"pin" : {
"location" : {
"lat" : 40.12,
"lon" : -71.34
}
},
"is_active": true,
"action_zone": 50
}
I have this query which works pretty well:
{
"query": {
"bool" : {
"must" :
[{
"term": {
"is_active": true
}
}],
"filter" : {
"geo_distance" : {
"distance" : "200km",
"pin.location" : {
"lat" : 40,
"lon" : -70
}
}
}
}
}
}
Now, I would like to modify this query so that the distance (200km in my example) is replaced dynamically by the "action_zone" value of each item in the DB.
It would be great if someone could help me. :)
I found the solution using a script :D Thanks anyway!
{
"query": {
"bool" : {
"must" :
[{
"term": {
"is_active": true
}
},{
"script" : {
"script" : {
"params": {
"lat": 40.8,
"lon": -70.1
},
"source": "doc['pin.location'].arcDistance(params.lat, params.lon) / 1000 < doc['action_zone'].value",
"lang": "painless"
}
}
}]
}
}
}
Doc: https://www.elastic.co/guide/en/elasticsearch/reference/6.1/query-dsl-script-query.html
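If the request body is built from Python (the capitalized True in these snippets suggests a Python dict), a small helper can assemble it and let json.dumps emit valid JSON. This is a sketch only: the function name build_action_zone_query is hypothetical, the field names come from the schema in the question, action_zone is assumed to be in kilometers, and both clauses are placed in filter context here on the assumption that no scoring is needed (must works as well).

```python
import json


def build_action_zone_query(lat, lon):
    """Build a search body keeping active docs whose own action_zone (km) covers the origin."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Python True is serialized as JSON true by json.dumps.
                    {"term": {"is_active": True}},
                    {
                        "script": {
                            "script": {
                                "lang": "painless",
                                "params": {"lat": lat, "lon": lon},
                                # arcDistance returns meters; action_zone is assumed to be km.
                                "source": (
                                    "doc['pin.location'].arcDistance(params.lat, params.lon)"
                                    " / 1000 < doc['action_zone'].value"
                                ),
                            }
                        }
                    },
                ]
            }
        }
    }


body = build_action_zone_query(40.8, -70.1)
print(json.dumps(body, indent=2))
```

Passing the origin through params rather than formatting it into the script source keeps the script text identical across requests.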
Unfortunately, the geo_distance query doesn't allow scripting to specify a dynamic distance. What you could do, however, is use a terms aggregation on the action_zone field to bucket all your documents by action zone.
{
"query": {
"bool": {
"must": [
{
"term": {
"is_active": true
}
}
],
"filter": {
"geo_distance": {
"distance": "200km",
"pin.location": {
"lat": 40,
"lon": -70
}
}
}
}
},
"aggs": {
"zones": {
"terms": {
"field": "action_zone"
}
}
}
}
Otherwise, you could also use a range aggregation on the action_zone field with a few specific distances:
{
"query": {
"bool": {
"must": [
{
"term": {
"is_active": true
}
}
],
"filter": {
"geo_distance": {
"distance": "200km",
"pin.location": {
"lat": 40,
"lon": -70
}
}
}
}
},
"aggs": {
"zones": {
"range": {
"field": "action_zone",
"ranges": [
{
"to": 50
},
{
"from": 50,
"to": 100
},
{
"from": 100,
"to": 150
},
{
"from": 150
}
]
}
}
}
}

Elasticsearch geohash_grid returns 1 doc count but query returns a lot

I'm using Elasticsearch 5.1 with a geohash_grid aggregation as below:
{
"query": {
...
"geo_bounding_box":...
},
"aggs": {
"lochash": {
"geohash_grid": {
"field": "currentShopGeo",
"precision": 5
}
}
}
}
And here are the results from Elasticsearch:
{
....,
"aggregations": {
"lochash": {
"buckets": [
{
"key": "w3gvv",
"doc_count": 1 // only 1 doc_count
}
]
}
}
}
Then I decoded the geohash "w3gvv" and got the following bounding box:
{
"top_left": {
"lat": 10.8984375,
"lon": 106.7431640625
},
"bottom_right": {
"lat": 10.8544921875,
"lon": 106.787109375
}
}
However, when I use the bounding box above to search for the documents inside it, Elasticsearch returns 13 more items. Does anyone have any idea why?
I found a solution.
We can use geo_bounds to get the exact boundary of the clusters that Elasticsearch returns, as below:
"aggs": {
"lochash": {
"geohash_grid": {
"field": "currentShopGeo",
"precision": 5
},
"aggs": {
"cell": {
"geo_bounds": {
"field": "currentShopGeo"
}
}
}
}
}
The result should be:
{
"key": "w3gvv",
"doc_count": 1,
"cell": {
"bounds": {
"top_left": {
"lat": 10.860191588290036,
"lon": 106.75263083539903
},
"bottom_right": {
"lat": 10.860191588290036,
"lon": 106.75263083539903
}
}
}
}
The result shows exactly where the item is.
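The geohash decoding step from the question can be sketched as follows. This is a minimal decoder for the standard geohash base-32 encoding, written for illustration (the function name geohash_bounds is made up); it returns the cell of the hash as top_left and bottom_right corners in the same shape as the bounding box above.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_bounds(geohash):
    """Decode a geohash into its cell: (top_left, bottom_right) as lat/lon dicts."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    is_lon = True  # bits alternate between longitude and latitude, longitude first
    for ch in geohash:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):  # 5 bits per character, most significant first
            bit = (value >> shift) & 1
            if is_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            is_lon = not is_lon
    return (
        {"lat": lat_hi, "lon": lon_lo},  # top_left
        {"lat": lat_lo, "lon": lon_hi},  # bottom_right
    )


top_left, bottom_right = geohash_bounds("w3gvv")
print(top_left, bottom_right)
```

For "w3gvv" this reproduces the bounding box quoted in the question, and the single point reported by geo_bounds lies inside that cell.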

How to output in ElasticSearch distance for same location that chosen by geo_distance from multiple locations

I have multiple locations:
Document 1 -
"contact": [
{
"address": {
"geolocation": {
"lon": -73.5409,
"lat": 41.2512
}
}
}
]
Document 2 -
{ "contact": [
{
"address": {
"geolocation": {
"lon": -73.7055,
"lat": 40.6744
}
}
},
{
"address": [
{
"geolocation": {
"lon": -73.9325,
"lat": 40.7482
}
},
{
"geolocation": {
"lon": -87.9921,
"lat": 42.9959
}
},
{
"geolocation": {
"lon": -95.4563,
"lat": 29.8775
}
}
]
}
]
}
geo_distance finds both documents by closest location.
"geo_distance": {
"distance": "275mi",
"distance_type": "plane",
"contact.address.geolocation": {
"lat": 42,
"lon": -71
},
"unit": "mi"
}
}
But when I add a script field to output lat, lon, and distance
"script_fields": {
"distance_value": {
"script": "doc.containsKey('contact.address.geolocation') ? doc['contact.address.geolocation'].value ? doc['contact.address.geolocation'].arcDistanceInMiles(42.2882,-71.0474) : null : null"
},
"geolocation": {
"script": "doc.containsKey('contact.address.geolocation') ? doc['contact.address.geolocation'].value : null"
}
}
it outputs a random geolocation element from Document 2.
For document 1 it is 147 miles.
But for document 2 it is 1601 miles, because it takes a different location than the one used by the geo_distance filter.
How can I print the same value as in geo_distance? I want to show the distance to my point.
I've tried this script:
"script_fields": {
"distance_value": {
"script": "if (doc.containsKey('contact.address.geolocation')==false) return null; min = 40000; for(e in doc['contact.address.geolocation']){ c=0; if(e!=null) c = e.arcDistanceInMiles(42.2882,-71.0474); if(c<min) min=c;}; return min;"
}
}
It gives this error:
No signature of method: org.elasticsearch.common.geo.GeoPoint.arcDistanceInMiles() is applicable for argument types: (java.lang.Double, java.lang.Double)
Also, I don't think it will iterate over all geolocation fields.
I found only one way to output the same distance as in the filter: add a "sort" element:
"sort": [
"_score",
{
"_geo_distance": {
"contact.address.geolocation": [
-71,
42
],
"order": "asc",
"unit": "mi"
}
}
]

How to use elasticsearch distance query with more than one geopoint

Say I want to search for documents that are within 5 km of any of three geo points A, B, or C. Is it possible to do this in a single query, and if so, how?
Yes, you can use a bool/should query with three geo_distance queries.
POST /your_index/_search
{
"query": {
"bool": {
"should": [
{
"geo_distance": {
"distance": "5km",
"pin.location": {
"lat": 40,
"lon": -70
}
}
},
{
"geo_distance": {
"distance": "5km",
"pin.location": {
"lat": 41,
"lon": -71
}
}
},
{
"geo_distance": {
"distance": "5km",
"pin.location": {
"lat": 42,
"lon": -72
}
}
}
]
}
}
}
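When the number of points varies, the same bool/should body can be generated rather than written by hand. A minimal sketch, assuming the pin.location field from the query above; the function name build_any_point_query is hypothetical, and the explicit minimum_should_match is an addition that is not in the answer's query.

```python
def build_any_point_query(points, distance="5km", field="pin.location"):
    """Match documents within `distance` of at least one of the given (lat, lon) points."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"geo_distance": {"distance": distance, field: {"lat": lat, "lon": lon}}}
                    for lat, lon in points
                ],
                # Made explicit here so the intent survives if other clauses are added later.
                "minimum_should_match": 1,
            }
        }
    }


body = build_any_point_query([(40, -70), (41, -71), (42, -72)])
print(len(body["query"]["bool"]["should"]))
```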
