Elasticsearch Geo Distance query

I've got a list of places, each with a latitude and longitude stored in a field correctly mapped as geo_point.
I've also got a query that successfully returns results based on geo distance. It looks like this:
{
"filter": {
"geo_distance": {
"distance": "30mi",
"companies.locations": {
"lat": "51.8801595",
"lon": "0.577141"
}
}
},
"sort": {
"_geo_distance": {
"companies.locations": {
"lat": "51.8801595",
"lon": "0.577141"
},
"order": "asc",
"unit": "mi",
"mode": "min"
}
},
"from": 0,
"size": 500
}
This returns results within 30 miles of the latitude and longitude provided, and it works fine.
I'm struggling with the next step and hope someone can point me in the right direction.
Each place has a field called distance, which is an integer: the maximum distance (in miles) the place is willing to travel to a client. So if a place's distance is 20 but it is more than 20 miles from the client, it should be excluded from the results.
The results come back like this:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": null,
"hits": [
{
"_index": "test",
"_type": "places",
"_id": "AUtvK2OILrMWSKLclj9Z",
"_score": null,
"_source": {
"id": "1",
"name": "Chubby Company",
"summary": "",
"content": "",
"locations": [
{
"lat": 51.8200763,
"lon": 0.5264076
}
],
"address": [
{
"addr1": "xxxx",
"addr2": "",
"town": "MyTown",
"county": "Essex",
"postcode": "XX1 2XX",
"tel1": "01111 111111",
"tel2": "",
"email": null
}
],
"website": "",
"media": {
"logo": "",
"image": "",
"video": ""
},
"product_ids": [
"1",
"3",
"2"
],
"distance": "20"
},
"sort": [
0.031774582056958885
]
}
]
}
}
The sort value is the distance in miles, so the result above is 0.03 miles from the client.
I'm trying to compare this value against the record's distance field so that the record can be excluded from the results, but this is where I'm falling down.
I've tried different combinations of this:
"script": {
"script": "doc['distance'].value < doc['sort'].value"
}
which combined with the query looks like this:
{
"filter": {
"geo_distance": {
"distance": "30mi",
"companies.locations": {
"lat": "51.8801595",
"lon": "0.577141"
}
}
},
"sort": {
"_geo_distance": {
"companies.locations": {
"lat": "51.8801595",
"lon": "0.577141"
},
"order": "asc",
"unit": "mi",
"mode": "min"
}
},
"filtered": {
"filter": {
"script": {
"script": "doc['distance'].value < doc['sort'].value"
}
}
},
"from": 0,
"size": 500
}
But I get this error:
SearchPhaseExecutionException[Failed to execute phase [query], all
shards failed ... Parse Failure [No parser for element [filtered]
Any advice would be great.
UPDATE
Trying this also fails:
{
"filter": {
"geo_distance": {
"distance": "30mi",
"companies.locations": {
"lat": "51.8801595",
"lon": "0.577141"
}
},
"script": {
"script": "_source.distance < sort.value"
}
},
"sort": {
"_geo_distance": {
"companies.locations": {
"lat": "51.8801595",
"lon": "0.577141"
},
"order": "asc",
"unit": "mi",
"mode": "min"
}
},
"from": 0,
"size": 500
}
with
nested: ElasticsearchParseException[Expected field name but got START_OBJECT \"script\"]; }]","status":400}

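I had a similar problem: I wanted a conditional distance for each record, with the value stored in the data (searchRadius in my case). I ended up using the AndFilterBuilder and ScriptFilterBuilder classes in Java, so the AndFilterBuilder holds an array containing both a GeoDistanceFilterBuilder and a ScriptFilterBuilder. The geo_distance filter applies the overall search radius, and the script filter then keeps only records whose distance to the search point is less than their own searchRadius.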
{
"from": 0,
"size": 20,
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"and": {
"filters": [
{
"geo_distance": {
"location": [
-104.99230194091797,
39.74000930786133
],
"distance": "3000mi"
}
},
{
"script": {
"script": "doc['location'].arcDistanceInMiles(39.74000930786133, -104.99230194091797) < doc['searchRadius'].value"
}
}
]
}
}
}
},
"fields": "_source",
"script_fields": {
"distance": {
"script": "doc['location'].distance(39.74000930786133, -104.99230194091797)"
}
},
"sort": [
{
"_geo_distance": {
"location": [
-104.99230194091797,
39.74000930786133
],
"unit": "mi"
}
}
]
}
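The logic that the two combined filters express can be sketched outside Elasticsearch. The Python below is only an illustration, not the Elasticsearch implementation: haversine_miles and reachable_places are hypothetical helper names, the Earth radius is an approximate mean value, and the place is kept when its distance is less than or equal to its own radius (the question excludes places that are "more than" their radius away, while the script above uses a strict comparison).

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius (assumption)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def reachable_places(places, client_lat, client_lon, max_search_miles):
    """Keep places inside the overall search radius AND inside their own travel radius."""
    kept = []
    for place in places:
        # A place may have several locations; use the nearest one (like "mode": "min").
        nearest = min(
            haversine_miles(loc["lat"], loc["lon"], client_lat, client_lon)
            for loc in place["locations"]
        )
        # "distance" is stored as a string in the sample document, so convert it.
        own_radius = float(place["distance"])
        if nearest <= max_search_miles and nearest <= own_radius:
            kept.append((nearest, place))
    kept.sort(key=lambda pair: pair[0])  # nearest first
    return kept


if __name__ == "__main__":
    sample = {
        "name": "Chubby Company",
        "locations": [{"lat": 51.8200763, "lon": 0.5264076}],
        "distance": "20",
    }
    for miles, place in reachable_places([sample], 51.8801595, 0.577141, 30):
        print(place["name"], round(miles, 2))
```

Doing this inside Elasticsearch, as in the request above, avoids pulling every candidate back to the client; the sketch is mainly useful for checking expected results on a handful of documents.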

Related

Is there a way to return the geo distance when NOT sorting with _geo_distance?

I need to return the computed distance for the geo location in the results, but without sorting by it.
Currently I'm sorting by _geo_distance, but then the boost on the exists clause is ignored.
Here is my query:
"query": {
"bool": {
"filter": [
{
"match": {
"field 1": "value"
}
},
{
"match": {
"field2": "A"
}
},
{
"geo_distance": {
"distance": "20km",
"location": "34,-2.99"
}
}
], "should": [
{
"exists": {
"field": "field3",
"boost": 10000
}
}
]
}
},
"size": 500,
"sort": [
{
"_geo_distance": {
"location": {
"lat": 34,
"lon": 2.99
},
"order": "asc",
"unit": "km",
"mode": "min",
"distance_type": "arc",
"ignore_unmapped": "true"
}
}
]
I need documents in which field3 exists to get a higher ranking.
If I understand your problem correctly, this query should work:
"query": {
"bool": {
"filter": [
{
"match": {
"field 1": "value"
}
},
{
"match": {
"field2": "A"
}
},
{
"geo_distance": {
"distance": "20km",
"location": "34,-2.99"
}
}
]
}
},
"size": 500,
"sort": [
{
"_script": {
"type": "number",
"script": {
"source": "doc['field3'].size() > 0 ? 10000 : 0"
},
"order": "desc"
}
},
{
"_geo_distance": {
"location": {
"lat": 34,
"lon": 2.99
},
"order": "asc",
"unit": "km",
"mode": "min",
"distance_type": "arc",
"ignore_unmapped": "true"
}
}
]
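The script sort checks whether field3 exists and gives those documents a higher sort value; the _geo_distance sort then orders documents with the same script value by distance.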
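The intended two-level ordering (documents with field3 first, nearest first within each group) can be checked on a few sample hits with a small Python sketch. The hit layout and the distance_km key are assumptions made for this illustration; they are not what Elasticsearch returns.

```python
def rank_hits(hits):
    """Order hits so those with a non-empty 'field3' come first, then by distance."""

    def sort_key(hit):
        has_field3 = hit.get("field3") not in (None, "", [])
        # Group 0 (has field3) sorts before group 1 (no field3);
        # within a group, smaller distances come first.
        return (0 if has_field3 else 1, hit["distance_km"])

    return sorted(hits, key=sort_key)


if __name__ == "__main__":
    hits = [
        {"id": "a", "distance_km": 1.0},
        {"id": "b", "field3": "x", "distance_km": 5.0},
        {"id": "c", "field3": "y", "distance_km": 2.0},
    ]
    print([hit["id"] for hit in rank_hits(hits)])
```

In the Elasticsearch request, the same grouping depends on the direction of the script sort: the documents whose script value places them first in that direction form the leading group.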

Filter on different fields on array of objects

In Elasticsearch, say I have the document like this:
{
"id": "testId",
"inputs": [
{
"status": "STARTED",
"lastUpdatedTime": "2020-06-10T00:00:00.000Z"
},
{
"status": "STARTED",
"lastUpdatedTime": "2020-05-11T00:00:00.000Z"
},
{
"status": "ENDED",
"lastUpdatedTime": "2020-06-11T00:00:00.000Z"
}
]
}
Now I want to filter documents so that I only get those in which the entry with the highest lastUpdatedTime in the inputs array has status ENDED. For example, the query should return the document above, because 2020-06-11T00:00:00.000Z is later than both 2020-06-10T00:00:00.000Z and 2020-05-11T00:00:00.000Z, and that entry's status is ENDED. The document below, however, should not be returned:
{
"id": "testId2",
"inputs": [
{
"status": "STARTED",
"lastUpdatedTime": "2020-06-10T00:00:00.000Z"
},
{
"status": "STARTED",
"lastUpdatedTime": "2020-05-11T00:00:00.000Z"
},
{
"status": "ENDED",
"lastUpdatedTime": "2020-05-11T00:00:00.000Z"
}
]
}
This is because in this document a STARTED entry has the latest lastUpdatedTime. Is there an easy way to do this kind of filtering in Elasticsearch, or is it not possible?
The query below returns, for each matching document, the entry with status ENDED that has the highest lastUpdatedTime in the inputs array.
But this solves only one part of your problem: the same query will not give the desired result for your second case, where the highest lastUpdatedTime belongs to a STARTED entry.
Mapping:
{
"mappings": {
"properties": {
"inputs": {
"type": "nested"
},
"lastUpdatedTime": { "type": "date" }
}
}
}
Search Query:
{
"query": {
"nested": {
"path": "inputs",
"query": {
"match": {"inputs.status":"ENDED"}
},
"inner_hits": {
"sort": [
{
"lastUpdatedTime": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
Result:
"inner_hits": {
"inputs": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "-s2KpnIBXf9A6l_vBbmP",
"_nested": {
"field": "inputs",
"offset": 2
},
"_score": null,
"_source": {
"status": "ENDED",
"lastUpdatedTime": "2020-05-11T00:00:00.000Z"
},
"sort": [
-9223372036854775808
]
}
]
}
}
}
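Since the query above covers only part of the requirement, the remaining check can be applied to the returned documents on the client. The following Python is a minimal sketch under stated assumptions: latest_input_is_ended is a hypothetical helper, the timestamps are assumed to use exactly the format shown in the question, and when two entries share the latest timestamp the first one in the array decides.

```python
from datetime import datetime

# Timestamp layout used in the sample documents (assumption: all entries follow it).
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def latest_input_is_ended(doc):
    """Return True if the entry with the latest lastUpdatedTime has status ENDED."""
    inputs = doc.get("inputs", [])
    if not inputs:
        return False
    latest = max(
        inputs,
        key=lambda entry: datetime.strptime(entry["lastUpdatedTime"], TIME_FORMAT),
    )
    return latest["status"] == "ENDED"


if __name__ == "__main__":
    doc = {
        "id": "testId",
        "inputs": [
            {"status": "STARTED", "lastUpdatedTime": "2020-06-10T00:00:00.000Z"},
            {"status": "STARTED", "lastUpdatedTime": "2020-05-11T00:00:00.000Z"},
            {"status": "ENDED", "lastUpdatedTime": "2020-06-11T00:00:00.000Z"},
        ],
    }
    print(doc["id"], latest_input_is_ended(doc))
```

This post-filtering changes the number of hits per page, so it suits small result sets better than deep pagination.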

Elasticsearch - filter geo_distance query

I have a mapping type with two fields: location (geo_point) and type (short).
I want to list my places by geographic proximity, using this kind of query:
{
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": "20km",
"location": {
"lat": 48.856614,
"lon": 2.3522219
}
}
}
}
},
"aggs": {
"types": {
"terms": {
"field": "type"
}
}
},
"post_filter": [],
"page": 1,
"size": 50,
"sort": [
{
"_geo_distance": {
"location": {
"lat": 48.856614,
"lon": 2.3522219
},
"order": "asc",
"unit": "km",
"distance_type": "plane"
}
}
]
}
Is there any way to include only the first 2 places of a particular type (e.g. type=2)?
Add another clause to the filter, like this:
{
"query": {
"bool": {
"filter": [{
"geo_distance": {
"distance": "20km",
"location": {
"lat": 48.856614,
"lon": 2.3522219
}
}
},
{
"term": {"type":"2"}
}]
}
},
"aggs": {
"types": {
"terms": {
"field": "type"
}
}
},
"post_filter": [],
"page": 1,
"size": 2,
"sort": [
{
"_geo_distance": {
"location": {
"lat": 48.856614,
"lon": 2.3522219
},
"order": "asc",
"unit": "km",
"distance_type": "plane"
}
}
]
}
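The query above returns only type 2 places, limited to two. If the goal is instead to keep places of every type while allowing at most two of type 2, one possible reading of the question, the cap could be applied client-side to hits that are already sorted by distance. The sketch below assumes that reading; cap_type and the hit layout are hypothetical.

```python
def cap_type(hits, capped_type, limit):
    """Keep every hit, except that at most `limit` hits of `capped_type` survive.

    `hits` is assumed to be sorted by distance already, so the hits that
    survive for the capped type are the nearest ones.
    """
    kept = []
    seen = 0
    for hit in hits:
        if hit["type"] == capped_type:
            if seen >= limit:
                continue  # already have enough hits of this type
            seen += 1
        kept.append(hit)
    return kept


if __name__ == "__main__":
    sorted_hits = [
        {"name": "A", "type": 2},
        {"name": "B", "type": 1},
        {"name": "C", "type": 2},
        {"name": "D", "type": 2},
        {"name": "E", "type": 3},
    ]
    print([hit["name"] for hit in cap_type(sorted_hits, capped_type=2, limit=2)])
```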

Elastic Search: Aggregation sum on a particular field

I am new to Elasticsearch and would like some help.
I have about 2 million documents in Elasticsearch, and they look like this:
{
"_index": "flipkart",
"_type": "PSAD_ThirdParty",
"_id": "430001_MAM_2016-02-04",
"_version": 1,
"_score": 1,
"_source": {
"metrics": [
{
"id": "Metric1",
"value": 70
},
{
"id": "Metric2",
"value": 90
},
{
"id": "Metric3",
"value": 120
}
],
"primary": true,
"ticketId": 1,
"pliId": 206,
"bookedNumbers": 15000,
"ut": 1454567400000,
"startDate": 1451629800000,
"endDate": 1464589800000,
"tz": "EST"
}
}
I want to write an aggregation query that satisfies the conditions below:
1) First, query based on "_index", "_type" and "pliId".
2) Then compute the sum of metrics.value for entries where metrics.id = "Metric1".
In other words, I need to query records based on some fields and sum a particular metric's value, selected by metric id.
Could you please help me get my query right?
Your metrics field needs to be of type nested:
"metrics": {
"type": "nested",
"properties": {
"id": {
"type": "string",
"index": "not_analyzed"
}
}
}
If you want an exact, case-sensitive match on Metric1 (with the upper-case letter), then, as shown above, the id field needs to be not_analyzed.
Then, if you only want aggregations for metrics.id = "Metric1", you need something like this:
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"pliId": 206
}
}
]
}
}
}
},
"aggs": {
"by_metrics": {
"nested": {
"path": "metrics"
},
"aggs": {
"metric1_only": {
"filter": {
"bool": {
"must": [
{
"term": {
"metrics.id": {
"value": "Metric1"
}
}
}
]
}
},
"aggs": {
"by_metric_id": {
"terms": {
"field": "metrics.id"
},
"aggs": {
"total_delivery": {
"sum": {
"field": "metrics.value"
}
}
}
}
}
}
}
}
}
}
I created a new index:
Method: PUT
URL: http://localhost:9200/google/
Body:
{
"mappings": {
"PSAD_Primary": {
"properties": {
"metrics": {
"type": "nested",
"properties": {
"id": {
"type": "string",
"index": "not_analyzed"
},
"value": {
"type": "integer",
"index": "not_analyzed"
}
}
}
}
}
}
}
Then I inserted about 200 thousand documents and ran the query, and it worked.
Response:
{
"took": 34,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "google",
"_type": "PSAD_Primary",
"_id": "383701291_MAM_2016-01-06",
"_score": 1,
"_source": {
"metrics": [
{
"id": "Metric1",
"value": 70
},
{
"id": "Metric2",
"value": 90
},
{
"id": "Metric3",
"value": 120
}
],
"primary": true,
"ticketId": 1,
"pliId": 221244,
"bookedNumbers": 15000,
"ut": 1452061800000,
"startDate": 1451629800000,
"endDate": 1464589800000,
"tz": "EST"
}
}
]
},
"aggregations": {
"by_metrics": {
"doc_count": 3,
"metric1_only": {
"doc_count": 1,
"by_metric_id": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Metric1",
"doc_count": 1,
"total_delivery": {
"value": 70
}
}
]
}
}
}
}
}
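To sanity-check the total_delivery value on a few documents, the same computation can be written as a short Python sketch. sum_metric is a hypothetical helper and the documents are assumed to have the shape shown in the response above. It pairs each id with its own value inside one metrics entry, which is the pairing the nested mapping preserves.

```python
def sum_metric(docs, pli_id, metric_id):
    """Sum metrics.value over entries whose id matches, for docs with the given pliId."""
    total = 0
    for doc in docs:
        source = doc["_source"]
        if source["pliId"] != pli_id:
            continue  # corresponds to the term filter on pliId
        for metric in source["metrics"]:
            # id and value are read from the same entry, as with nested objects.
            if metric["id"] == metric_id:
                total += metric["value"]
    return total


if __name__ == "__main__":
    docs = [
        {
            "_source": {
                "pliId": 221244,
                "metrics": [
                    {"id": "Metric1", "value": 70},
                    {"id": "Metric2", "value": 90},
                    {"id": "Metric3", "value": 120},
                ],
            }
        }
    ]
    print(sum_metric(docs, pli_id=221244, metric_id="Metric1"))
```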

Elasticsearch with range and exists filter

I have an Elasticsearch query that gets every product within a set range. I would like to add a filter to select only documents that have the attribute "products". My attempts with must and exists always produced an error.
/zipcodes_at/zipcode/_search
{
"_source": [
"products"
],
"filter": {
"geo_distance": {
"distance": "100km",
"location": {
"lat": 48.232361,
"lon": 16.324659
}
}
},
"sort": [
{
"_geo_distance": {
"location": {
"lat": 48.232361,
"lon": 16.324695
},
"order": "asc",
"unit": "km",
"distance_type": "plane"
}
}
]
}
Try this:
POST /zipcodes_at/zipcode/_search
{
"_source": [
"products"
],
"query": {
"bool": {
"filter": [
{
"exists": {
"field": "products"
}
},
{
"geo_distance": {
"distance": "100km",
"location": {
"lat": 48.232361,
"lon": 16.324659
}
}
}
]
}
},
"sort": [
{
"_geo_distance": {
"location": {
"lat": 48.232361,
"lon": 16.324695
},
"order": "asc",
"unit": "km",
"distance_type": "plane"
}
}
]
}
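When the request is built in application code, the filter and the sort can share a single origin point so the two coordinate pairs cannot drift apart. The Python sketch below only assembles the request body shown above as a dictionary; build_search_body is a hypothetical helper and no Elasticsearch client is involved.

```python
import json


def build_search_body(lat, lon, distance, required_field, geo_field="location"):
    """Build a search body: exists + geo_distance filters, sorted by distance."""
    origin = {"lat": lat, "lon": lon}
    return {
        "_source": [required_field],
        "query": {
            "bool": {
                "filter": [
                    {"exists": {"field": required_field}},
                    {"geo_distance": {"distance": distance, geo_field: dict(origin)}},
                ]
            }
        },
        "sort": [
            {
                "_geo_distance": {
                    geo_field: dict(origin),  # same point as the filter
                    "order": "asc",
                    "unit": "km",
                    "distance_type": "plane",
                }
            }
        ],
    }


if __name__ == "__main__":
    body = build_search_body(48.232361, 16.324659, "100km", "products")
    print(json.dumps(body, indent=2))
```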
You should use a bool filter to combine the geo distance filter with the exists filter.
{
"_source": ["products"],
"query": {
"filtered": {
"filter": {
"bool": {
"must": [{
"exists": {
"field": "products"
}
}, {
"geo_distance_range": {
"from": 0,
"to": 100,
"distance_unit": "km",
"location": {
"lat": 48.232361,
"lon": 16.324659
}
}
}]
}
}
}
},
"sort": [{
"_geo_distance": {
"location": {
"lat": 48.232361,
"lon": 16.324695
},
"order": "asc",
"unit": "km",
"distance_type": "plane"
}
}]
}
