Elasticsearch - Increase distance if no results

Would it be possible to create a query that increases the distance when no results are found?
Here's an example of a query that returns a response whether or not anything matches:
{
"query": {
"bool": {
"filter": {
"geo_distance": {
"distance": "100km",
"location": {
"lat": 40,
"lon": -7
}
}
}
}
}
}
But I would like Elasticsearch to keep looking until at least one item is found.
I could increase the value programmatically and issue a new query with "distance": "200km" and so on until I find something, but I would like a single query that does that automatically.
Elasticsearch version: 6.2
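As far as I know there is no single geo_distance query that widens its own radius, so the programmatic loop described above is the usual fallback. Below is a minimal Python sketch of it. The `search` argument is a hypothetical callable (not part of any Elasticsearch client) that sends a request body and returns the parsed JSON response, and the list of radii is an arbitrary assumption. If only the closest document matters, the second helper builds a request that drops the radius and sorts by `_geo_distance` ascending with `size` 1.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def geo_distance_query(lat: float, lon: float, distance: str) -> Dict[str, Any]:
    """Build the same bool/filter/geo_distance body as in the question."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


def search_with_expanding_radius(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    lat: float,
    lon: float,
    distances: Iterable[str] = ("100km", "200km", "400km", "800km"),
) -> Optional[Dict[str, Any]]:
    """Try each radius in turn and return the first response that has hits.

    `search` is a hypothetical callable that sends the body to Elasticsearch
    and returns the parsed JSON response.
    """
    for distance in distances:
        response = search(geo_distance_query(lat, lon, distance))
        if response["hits"]["hits"]:
            return response
    return None  # nothing found, even at the largest radius


def nearest_query(lat: float, lon: float, size: int = 1) -> Dict[str, Any]:
    """Alternative: no radius at all, just ask for the closest documents."""
    return {
        "size": size,
        "query": {"match_all": {}},
        "sort": [
            {
                "_geo_distance": {
                    "location": {"lat": lat, "lon": lon},
                    "order": "asc",
                    "unit": "km",
                }
            }
        ],
    }
```

The loop costs one round trip per radius, while the sorted variant always needs exactly one request but gives up the upper bound on distance.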

Related

How to sum the size of documents within a time interval?

I'm attempting to estimate the sum of size of n documents across an index using below query :
GET /events/_search
{
"query": {
"bool":{
"must": [
{"range": {"ts": {"gte": "2022-10-10T00:00:00Z", "lt": "2022-10-21T00:00:00Z"}}}
]
}
},
"aggs": {
"total_size": {
"sum": {
"field": "doc['_source'].bytes"
}
}
}
}
This returns documents, but the value of the aggregation is 0:
"aggregations" : {
"total_size" : {
"value" : 0.0
}
}
How can I sum the size of documents within a time interval?
The best way to achieve what you want is to actually add another field that contains the real source size at indexing time.
However, if you want to run it once to see what it looks like, you can leverage runtime fields to compute this at search time; just know that it can put a heavy burden on your cluster. Since the Painless scripting language doesn't yet provide a way to transform the source document back into the same JSON you sent at indexing time, we can only approximate the value you're looking for by stringifying the _source HashMap. Note that the script below multiplies the character count by 8, so the emitted value approximates bits rather than bytes:
GET /events/_search
{
"runtime_mappings": {
"source.size": {
"type": "double",
"script": """
def size = params._source.toString().length() * 8;
emit(size);
"""
}
},
"query": {
"bool":{
"must": [
{"range": {"ts": {"gte": "2022-10-10T00:00:00Z", "lt": "2022-10-21T00:00:00Z"}}}
]
}
},
"aggs": {
"size": {
"sum": {
"field": "source.size"
}
}
}
}
Another way is to install the Mapper size plugin so that you can make use of the _size field computed at indexing time.
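The first suggestion above (storing the size in its own field at indexing time) could look roughly like the following Python sketch, run on each document before it is sent to Elasticsearch. The field name `source_size_bytes` is an assumption made for illustration, and the size is that of a compact UTF-8 JSON encoding, which may differ slightly from the bytes your client actually sends.

```python
import json
from typing import Any, Dict


def with_source_size(doc: Dict[str, Any], field: str = "source_size_bytes") -> Dict[str, Any]:
    """Return a copy of `doc` with its serialized size in bytes added.

    The size is measured on the compact UTF-8 JSON encoding of the document
    before the size field itself is added.
    """
    encoded = json.dumps(doc, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    enriched = dict(doc)  # shallow copy, so the caller's document is not modified
    enriched[field] = len(encoded)
    return enriched
```

With such a field mapped as a numeric type, the aggregation from the question becomes a plain sum on `source_size_bytes`, with no script involved.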

Searching in Elasticsearch with proximity (slop) zero and one

I have created the following index
PUT /proximity_example_2
{
"mappings":{
"properties":{
"doc_id": {
"type": "text"
},
"test_name":{
"type": "text"
}
}
}
}
Then indexed a document
POST proximity_example_2/_doc
{
"doc_id": "id1",
"test_name": "test proximity here"
}
Then queried with proximity 0, as follows:
GET proximity_example_2/_search
{
"query": {
"match_phrase": {
"test_name": {
"query": "proximity test",
"slop": 0.0
}
}
}
}
But I didn't get any results. Then I searched with proximity 1, and this time I didn't get any documents either.
But when I searched with a proximity greater than 1, I got results.
GET proximity_example_2/_search
{
"query": {
"match_phrase": {
"test_name": {
"query": "proximity test",
"slop": 2.0
}
}
}
}
GET proximity_example_2/_search
{
"query": {
"match_phrase": {
"test_name": {
"query": "proximity test",
"slop": 3.0
}
}
}
}
So does that mean that in Elasticsearch, when we search with proximity 1 or 0, the order of the search terms matters?
Thank you...
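A slop of 0 is equivalent to a normal phrase search: it is very restrictive and requires the search terms to appear in the document in exactly the same order. As you increase the slop, this restriction is relaxed and you get more search results. Order does matter here: your query "proximity test" has the two terms transposed relative to the indexed text "test proximity here", and swapping two adjacent terms costs two position moves, which is why nothing matches until the slop reaches 2. Beware that increasing the slop to too high a number defeats the purpose of a phrase search and you will get irrelevant results.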
You can read this and this detailed blog post that explains how it works internally
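To make the slop arithmetic concrete, here is a small Python sketch of a simplified model of sloppy phrase matching. It is not Lucene's actual implementation. It assumes whitespace tokenization, lowercasing, and that each query term is taken at its first position in the text. For every term it computes how far the term sits from where the phrase expects it, and the spread of those shifts is the smallest slop that would match.

```python
from typing import List, Optional


def required_slop(query: str, text: str) -> Optional[int]:
    """Smallest slop at which `query` would match `text` as a phrase.

    Simplified model: whitespace tokens, lowercase, first occurrence of each
    term only. Returns None if the query is empty or a term is missing.
    """
    doc_terms: List[str] = text.lower().split()
    query_terms: List[str] = query.lower().split()
    if not query_terms:
        return None
    shifts: List[int] = []
    for query_pos, term in enumerate(query_terms):
        if term not in doc_terms:
            return None
        # How far this term sits from where the phrase expects it.
        shifts.append(doc_terms.index(term) - query_pos)
    return max(shifts) - min(shifts)


doc = "test proximity here"
print(required_slop("test proximity", doc))  # in order and adjacent
print(required_slop("test here", doc))       # one word in between
print(required_slop("proximity test", doc))  # the two terms swapped
```

Under this model the swapped query needs a slop of 2, which matches the behaviour observed in the question.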

Elasticsearch custom geo distance filter

From an Elasticsearch query I'd like to retrieve all the points within a variable distance.
Let's say I have 2 shops: one is willing to deliver up to 3 km away and the other up to 5 km:
PUT /my_shops/_doc/1
{
"location": {
"lat": 40.12,
"lon": -71.34
},
"max_delivery_distance": 3000
}
PUT /my_shops/_doc/2
{
"location": {
"lat": 41.12,
"lon": -72.34
},
"max_delivery_distance": 5000
}
For a given location, I'd like to know which shops are able to deliver. That is, the query should return shop 1 if the given location is within 3 km of it, and shop 2 if the given location is within 5 km of it:
GET /my_shops/_search
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": {
"geo_distance": {
"distance": max_delivery_distance,
"location": {
"lat": 40,
"lon": -70
}
}
}
}
}
}
There's another way to solve this without scripting (a big performance hog!) and let ES sort it out using native geo shapes.
I would model each document as a circle, with a center location and a (delivery) radius. First, your index mapping should look like this:
PUT /my_shops
{
"mappings": {
"properties": {
"delivery_area": {
"type": "geo_shape",
"strategy": "recursive"
}
}
}
}
Then, your documents need to have the following form:
PUT /my_shops/_doc/1
{
"delivery_area" : {
"type" : "circle",
"coordinates" : [-71.34, 40.12],
"radius" : "3000m"
}
}
PUT /my_shops/_doc/2
{
"delivery_area" : {
"type" : "circle",
"coordinates" : [-72.34, 41.12],
"radius" : "5000m"
}
}
And finally the query simply becomes a geo_shape query looking at intersections between a delivery point and the delivery area of each shop.
GET /my_shops/_search
{
"query": {
"bool": {
"filter": {
"geo_shape": {
"delivery_area": {
"shape": {
"type": "point",
"coordinates": [ -70, 40 ]
},
"relation": "contains"
}
}
}
}
}
}
That's it! No scripting, just geo operations.
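If the shops already exist in the point-plus-distance form shown in the question, the conversion to circle documents can be sketched in Python as below. This assumes `max_delivery_distance` is expressed in meters, as in the sample documents. The function name is made up for illustration.

```python
from typing import Any, Dict


def to_delivery_circle(shop: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a point + max_delivery_distance document into a circle shape.

    Assumes `max_delivery_distance` is in meters, as in the sample documents.
    """
    location = shop["location"]
    return {
        "delivery_area": {
            "type": "circle",
            # Shape coordinates are ordered longitude first, then latitude.
            "coordinates": [location["lon"], location["lat"]],
            "radius": f"{shop['max_delivery_distance']}m",
        }
    }
```

Note the coordinate order: the shape uses [lon, lat], the reverse of how the lat/lon object is usually read.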
I think that you need to work with a script to use another field as a parameter. After some research I came to this answer:
GET my_shops/_search
{
"query": {
"script": {
"script": {
"params": {
"location": {
"lat": 40,
"lon": -70
}
},
"source": """
return doc['location'].arcDistance(params.location.lat, params.location.lon) <= doc['max_delivery_distance'].value"""
}
}
}
}
Basically, we exploit the fact that the classes related to geo points are whitelisted in Painless (https://github.com/elastic/elasticsearch/pull/40180/) and that scripts accept additional parameters (your fixed location).
According to the documentation, arcDistance returns the distance in meters. Since max_delivery_distance in your sample documents is also expressed in meters (3000 and 5000), the two values can be compared directly; if you stored the limit in kilometers instead, you would divide the arcDistance result by 1000 first.
Additional Note
I assume that location and max_delivery_distance are always defined for every document. If that is not the case, the script needs to handle the missing values.
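To sanity-check the units of such a comparison outside Elasticsearch, the same test can be reproduced client-side. The Python sketch below uses the haversine formula with a mean Earth radius. Both are assumptions of this example and not necessarily what arcDistance computes internally, so results may differ slightly. It also assumes max_delivery_distance is stored in meters, as in the sample documents.

```python
import math
from typing import Any, Dict

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (assumption of this sketch)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def can_deliver(shop: Dict[str, Any], lat: float, lon: float) -> bool:
    """True if (lat, lon) lies within the shop's delivery distance (meters)."""
    loc = shop["location"]
    return haversine_m(loc["lat"], loc["lon"], lat, lon) <= shop["max_delivery_distance"]
```

Both sides of the comparison are in meters here. Mixing kilometers on one side with meters on the other would make almost every shop look eligible.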
Reference
Another related question
https://github.com/elastic/elasticsearch/pull/40180/

Can Elasticsearch search by geo distance and other attributes at the same time?

In Elasticsearch, when I search by geo-distance to a point, can I at the same time filter by another attribute, such as a number being within a range, so that both filters need to be true for the result to come back?
Sure, use a bool query, where you can specify multiple clauses in the must and/or filter blocks. Be aware that clauses in the must block contribute to the relevance score and clauses in the filter block do not (read more about query and filter context).
For example, this query searches by geo-distance in the must block (query context) and filters on an age range in the filter block, which does not contribute to the score:
{
"query": {
"bool": {
"must": [
{
"geo_distance": {
"distance": "100km",
"pin.location": {
"lat": 38.889248,
"lon": -77.050636
}
}
}
],
"filter": [
{
"range": {
"age": {
"gte": 18,
"lte": 65
}
}
}
]
}
}
}
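If the request body is assembled in application code, a small builder keeps the two conditions together. The Python sketch below is a variant of the query above that places both clauses in the filter block, on the assumption that neither needs to influence the score. The function name and default field names are illustrative only.

```python
from typing import Any, Dict


def geo_and_range_query(
    lat: float,
    lon: float,
    distance: str,
    geo_field: str = "pin.location",
    range_field: str = "age",
    gte: int = 18,
    lte: int = 65,
) -> Dict[str, Any]:
    """Build a bool query where both conditions must hold, without scoring."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "geo_distance": {
                            "distance": distance,
                            geo_field: {"lat": lat, "lon": lon},
                        }
                    },
                    {"range": {range_field: {"gte": gte, "lte": lte}}},
                ]
            }
        }
    }
```

All clauses in a filter list must match, so a document is returned only when it is inside the radius and inside the range.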

Accessing nested property in Elasticsearch distance script.

My index in Elasticsearch has the following mapping:
"couchbaseDocument": {
"properties": {
"doc": {
"properties": {
"properties": {
"properties": {
"location": {
"type": "geo_point"
The source document is as follows:
{"properties" : {"location":"43.706596,-79.4030464"}}
I am trying to use the distance script to calculate the distance based on geo-points. I found this post, "Return distance in elasticsearch results?", to help me out. I am trying to get all results, filter by a 1 km radius, get the distance, and sort on the geo_point. The query is constructed as follows:
{
"query": {
"match_all": {}
},
"filter": {
"geo_distance": {
"distance": "1km",
"doc.properties.location": {
"lat": 43.710323,
"lon": -79.395284
}
}
},
"script_fields": {
"distancePLANE": {
"params": {
"lat": 43.710323,
"lon": -79.395284
},
"script": "doc[properties]['location'].distanceInKm(lat, lon)"
},
"distanceARC" :{
"params": {
"lat": 43.710323,
"lon": -79.395284
},
"script": "doc[properties]['location'].arcDistanceInKm(lat,lon)"
}
},
"sort": [
{
"_geo_distance":{
"doc.properties.location": [-79.395284,43.710323],
"order": "desc",
"unit": "km"
}
}
],
"track_scores": true
}
I get the following error with status 500:
"PropertyAccessException[[Error: could not access: properties; in class: org.elasticsearch.search.lookup.DocLookup]\n[Near : {... doc[properties]['location'].distan ....}]\n ^\n[Line: 1, Column: 5]]"
I tried rewriting the query in this way:
..."script": "doc['properties']['location'].arcDistanceInKm(lat,lon)"...
Then I get this error:
"CompileException[[Error: No field found for [properties] in mapping with types [couchbaseDocument]]\n[Near : {... doc['properties']['location']. ....}]\n ^\n[Line: 1, Column: 1]]; nested: ElasticSearchIllegalArgumentException[No field found for [properties] in mapping with types [couchbaseDocument]]; "
When I remove the script part from the query altogether, the sorting and filtering work just fine. Is there a different way to access nested fields when using scripts? Any insights would be really appreciated!
Thank you!
Managed to get it done with
"script" : "doc.last_location.distance(41.12, -71.34)"
Don't know why but doc['last_location'] does not seem to work at all!
As mentioned in my comment, when you sort by _geo_distance, the sort value that is returned with each hit is the actual distance, so there is no need for a separate computation. Details here: http://elasticsearch-users.115913.n3.nabble.com/search-by-distance-and-getting-the-actual-distance-td3317140.html#a3936224
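Reading the distance back from the response can be sketched in Python as follows. In current Elasticsearch responses each hit carries its sort values in a `sort` array. The sketch assumes `_geo_distance` is the first (or only) sort criterion, so the distance is the first element, expressed in the unit given in the sort clause.

```python
from typing import Any, Dict, List, Tuple


def distances_from_hits(response: Dict[str, Any]) -> List[Tuple[str, float]]:
    """Pair each hit's _id with its first sort value.

    Assumes the request sorted by _geo_distance first, so that value is the
    distance in the unit requested in the sort clause.
    """
    return [(hit["_id"], hit["sort"][0]) for hit in response["hits"]["hits"]]
```

This avoids the script_fields computation entirely when the distance is only needed for display.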

Resources