function_score: treat missing field as perfect hit - elasticsearch

I need to boost documents by location (closer is better). locations is a nested type.
This works fine, except that Elasticsearch does not return documents in which locations is missing. If the field is missing, Elasticsearch should treat the document as a perfect hit. Any idea how to achieve this?
My query:
{
"sort": [
{
"_score": "desc"
}
],
"query": {
"function_score": {
"query": {
"bool": {
"must_not": [],
"should": [
{
"nested": {
"path": "locations",
"query": {
"function_score": {
"score_mode": "sum",
"functions": [
{
"gauss": {
"locations.coordinates": {
"origin": {
"lat": "50.1078852",
"lon": "15.0385376"
},
"scale": "100km",
"offset": "20km",
"decay": "0.5"
}
}
}
]
}
}
}
}
]
}
}
}
}
}
BTW: I'm using Elasticsearch 5.0

Add one more function to the functions section with a high weight (like 1000) that applies only to documents with missing locations, something like this:
{
"filter": {
"bool": {
"must_not": {
"exists": {
"field": "locations"
}
}
}
},
"weight": 1000
}
So records with missing locations will come first because of the high weight.
The syntax may differ slightly. More information on these queries:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html
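As a rough illustration of where such a function sits, the sketch below builds the function entry as a Python dict and attaches it to an outer function_score. The helper names and the match_all base query are assumptions made for this example, not part of the original answer. For a nested field such as locations, the exists check may also need to be wrapped in a nested query, which is not shown here.

```python
import json


def missing_field_function(field, weight):
    """Score function that applies `weight` only to documents lacking `field`."""
    return {
        "filter": {"bool": {"must_not": {"exists": {"field": field}}}},
        "weight": weight,
    }


def with_missing_boost(base_query, field, weight=1000):
    """Wrap `base_query` in a function_score that boosts docs without `field`."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": [missing_field_function(field, weight)],
            }
        }
    }


if __name__ == "__main__":
    # match_all is only a placeholder base query for this sketch.
    body = with_missing_boost({"match_all": {}}, "locations")
    print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into a search request; the real base query would replace the placeholder.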

Related

Elasticsearch Boost near location, boost if no location is available

I have a location field of type geo_point.
I want to implement the following conditions:
If there is a location, boost the document more the closer it is.
If there is no location, boost by 5.
Ultimately, documents with a location should be shown in order of distance, but documents without a location should be boosted so that they are not pushed to the end.
Below is my query. I managed to rank documents by proximity, but I don't know how to boost the ones that have no location.
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"should": {
"distance_feature": {
"field": "location",
"pivot": "1000m",
"boost": 8,
"origin": {
"lat": 33.489009,
"lon": 133.022831
}
}
},
"filter": [
{
"terms" : {
"state": ["AVAILABLE"]
}
}
]
}
}
}
You could try adding a second should clause that matches documents without a location and carries the boost:
{
"query": {
"bool": {
"must": {
"match_all": {}
},
"minimum_should_match": 1,
"should": [
{
"distance_feature": {
"field": "location",
"pivot": "1000m",
"boost": 8,
"origin": {
"lat": 33.489009,
"lon": 133.022831
}
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "location"
}
},
"boost": 5
}
}
],
"filter": [
{
"terms": {
"state": [
"AVAILABLE"
]
}
}
]
}
}
}
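A clause inside must_not only excludes documents and does not contribute to the score, so one way to give documents without a location a fixed score is to wrap the negated exists check in constant_score. The sketch below builds such a should clause in Python. The helper names and defaults are illustrative assumptions, and the resulting ranking has not been verified against a live cluster.

```python
def no_location_clause(field="location", boost=5):
    """Constant-score clause matching documents that have no value for `field`."""
    return {
        "constant_score": {
            "filter": {"bool": {"must_not": {"exists": {"field": field}}}},
            "boost": boost,
        }
    }


def near_or_missing(origin, field="location", pivot="1000m"):
    """Bool query: rank by proximity, but also match docs without a location."""
    return {
        "bool": {
            "must": {"match_all": {}},
            "should": [
                {
                    "distance_feature": {
                        "field": field,
                        "pivot": pivot,
                        "boost": 8,
                        "origin": origin,
                    }
                },
                no_location_clause(field),
            ],
            "minimum_should_match": 1,
            "filter": [{"terms": {"state": ["AVAILABLE"]}}],
        }
    }


if __name__ == "__main__":
    query = near_or_missing({"lat": 33.489009, "lon": 133.022831})
    print(query["bool"]["should"][1])
```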

how to add filters to elastic query when using function_score?

Here is my current elastic query:
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"must": [{
"multi_match": {
"query": "ocean",
"fields": [],
"fuzziness": "AUTO"
}}],
"must_not": [{
"exists": {
"field": "parentId"
}
}]
}
},
"functions" : [
{
"gauss": {
"createTime": {
"origin": "2020-07-09T23:50:00",
"scale": "365d",
"decay": 0.3
}
}
}
]
}
}
}
How do I properly add filters to this? Perhaps using function_score makes this different. I would like to add a hard filter, for example to show only results with uploadUser: 'Mr. Bean', while keeping the scoring in place for the results that pass the filter.
I tried using filter in various places, and also must, but I get either no results or all the results.
I'm using Elasticsearch 7. Thanks for your help.
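Put the hard filter in the filter clause of the bool query inside function_score. Filter clauses restrict the result set without affecting the score, so the multi_match relevance and the gauss decay still apply to the documents that pass.
Refer to the official Elasticsearch documentation to learn more about the function score query.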
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"term": {
"uploadUser": "Mr. Bean"
}
},
"must": [
{
"multi_match": {
"query": "ocean",
"fields": [
],
"fuzziness": "AUTO"
}
}
],
"must_not": [
{
"exists": {
"field": "parentId"
}
}
]
}
},
"functions": [
{
"gauss": {
"createTime": {
"origin": "2020-07-09T23:50:00",
"scale": "365d",
"decay": 0.3
}
}
}
]
}
}
}
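The same change can be applied programmatically. The sketch below, with a hypothetical helper name, adds a filter clause to the bool query inside an existing function_score body without touching the scoring parts. It assumes the body has the shape shown above, and that uploadUser is mapped so that a term query matches the exact value (for example as a keyword field).

```python
import copy


def add_hard_filter(body, clause):
    """Return a copy of `body` with `clause` added to the inner bool filter."""
    result = copy.deepcopy(body)
    bool_query = result["query"]["function_score"]["query"]["bool"]
    existing = bool_query.get("filter", [])
    if isinstance(existing, dict):
        # A single filter object is allowed; normalise it to a list.
        existing = [existing]
    bool_query["filter"] = existing + [clause]
    return result


if __name__ == "__main__":
    body = {
        "query": {
            "function_score": {
                "query": {"bool": {"must": [{"match": {"title": "ocean"}}]}},
                "functions": [
                    {"gauss": {"createTime": {"origin": "2020-07-09T23:50:00",
                                              "scale": "365d", "decay": 0.3}}}
                ],
            }
        }
    }
    filtered = add_hard_filter(body, {"term": {"uploadUser": "Mr. Bean"}})
    print(filtered["query"]["function_score"]["query"]["bool"]["filter"])
```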

Elasticsearch No query registered for [exists]

After running the query below, I get the exception No query registered for [exists]. Please help.
{
"query": {
"function_score": {
"query": {
"bool": {
"must": {
"match": {
"_all": {
"query": "cardio new york"
}
}
}
}
},
"functions": [
{
"gauss": {
"geo_location": {
"origin": {
"lat": 40.7127,
"lon": -74.0059
},
"scale": "100km",
"offset": "0km",
"decay": 0.9
}
}
},
{
"gauss": {
"startdate": {
"origin": "now",
"scale": "30d",
"offset": "30d"
}
},
"weight": 0.5
}
],
"filter": {
"query": {
"bool": {
"must": {
"match": {
"_all": {
"query": "cardio new york"
}
}
},
"should": {
"exists": {
"fields": [
"venue",
"geo_location"
]
}
}
}
}
}
}
}
}
I am trying to filter the search results after the function_score by combining it with a bool match query.
In versions before 2.0, exists is a filter, not a query, so you cannot use it directly in a bool query. Instead, I would use a bool filter and wrap only the match in a query filter, like this:
"filter": {
"bool": {
"must": [{
"query": {
"match": {
"_all": {
"query": "cardio new york"
}
}
}
}, {
"exists": {
"fields": [
"venue",
"geo_location"
]
}
}]
}
}
This depends on your version. Before version 2.0 you would use the exists filter, as demonstrated in imotov's answer.
From 2.0 onward, the exists filter has been replaced by the exists query (see the documentation for the current version).
The newer exists query would look like the following:
{
"index": "foobars",
"type": "foo",
"body": {
"query": {
"bool": {
"must":
{"exists" : { "field" : "bar" }}
}
},
"from": 0,
"size": 20
}
}
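The exists query takes a single field, so checking several fields, such as venue and geo_location in the question, means one exists clause per field. A minimal sketch in Python, with a helper name chosen only for this example:

```python
def exists_all(fields):
    """Bool filter requiring every field in `fields` to have a value."""
    return {"bool": {"filter": [{"exists": {"field": name}} for name in fields]}}


if __name__ == "__main__":
    print(exists_all(["venue", "geo_location"]))
```

Placing the clauses under filter rather than must keeps them out of scoring; swapping filter for should would instead require only some of the fields to be present, depending on minimum_should_match.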

ElasticSearch function_score query with filters

I am trying to create a search that returns locations within about 500m of a given geo point.
I need to filter the results of this search based on whether location is empty in the source or not.
I tried things like this:
"filtered" : {
"filter": {
"missing" : { "field" : "location" }
}
}
This is the search JSON I got:
{"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"match": {"fieldA": "value"}
},
{
"match": { "fieldB": "value"}
}
]
}
},
"functions": [{
"gauss": {
"location": {
"origin": "'.$lat.','.$lon.'",
"scale": "500m",
"offset": "0km",
"decay": 0.33
}
}
}]
}
}
}
I tried putting the filter in different places in the query, but it didn't work, so I know I'm doing something fundamentally wrong with how the query is structured. In the future I want to add more scoring logic and other filters, but I can't find a good example of such queries.
What should I do to make it work?
You can use a filtered query with geo_distance to filter the results first. You can also use "weight" in function_score to explicitly boost the distance over the "match" queries:
{
"query": {
"function_score": {
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"fieldA": "value"
}
},
{
"match": {
"fieldB": "value"
}
}
]
}
},
"filter": {
"geo_distance" : {
"distance" : "500m",
"location" : "'.$lat.','.$lon.'"
}
}
}
},
"functions": [
{
"gauss": {
"location": {
"origin": "'.$lat.','.$lon.'",
"scale": "500m",
"offset": "0km",
"decay": 0.33
}
},
"weight": 3
}
]
}
}
}
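Note that the filtered query and the missing filter used in this thread were deprecated in Elasticsearch 2.x and removed in 5.0. On newer versions the equivalent is a bool query with a filter clause, and a must_not exists in place of missing. The sketch below builds that form in Python; the function names are assumptions made for this example, and the field names are taken from the question.

```python
import json


def missing(field):
    """Replacement for the old `missing` filter: docs with no value for `field`."""
    return {"bool": {"must_not": {"exists": {"field": field}}}}


def geo_within(field, lat, lon, distance="500m"):
    """geo_distance filter around the given point."""
    return {"geo_distance": {"distance": distance, field: {"lat": lat, "lon": lon}}}


def build_body(lat, lon, filters):
    """function_score body using bool/filter instead of the filtered query."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": [
                            {"match": {"fieldA": "value"}},
                            {"match": {"fieldB": "value"}},
                        ],
                        # Filter clauses restrict results without affecting the score.
                        "filter": list(filters),
                    }
                },
                "functions": [
                    {
                        "gauss": {
                            "location": {
                                "origin": {"lat": lat, "lon": lon},
                                "scale": "500m",
                                "offset": "0km",
                                "decay": 0.33,
                            }
                        },
                        "weight": 3,
                    }
                ],
            }
        }
    }


if __name__ == "__main__":
    body = build_body(50.1, 15.0, [geo_within("location", 50.1, 15.0)])
    print(json.dumps(body, indent=2))
```

Passing [missing("location")] as the filters instead would restrict the results to documents without a location, which is what the missing filter in the question was aiming for.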

Elasticsearch: query produced is invalid

I'm getting a little frustrated with Elasticsearch. I have read the documentation, but I can't seem to get beyond a "The query produced is invalid" response. What I am trying to do is use Elasticsearch to find imperfect duplicates in a rather large geospatial dataset. I want to match on name (boosted) and address, filter the results to a small geographic box, and then reduce the relevance score of matches that are located further from my reference point. Can someone please help? I think I understand the individual elements of a query; my main problem is putting them together in a way that produces something valid.
$query = new \Elastica\Query\Builder('{
"function_score": {
"functions": [
{
"gauss": {
"location": {
"origin": "'.$latitude.', '.$longitude.'",
"scale": "2km"
}
}
}
],
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "'.$name.'",
"boost": 4
}
}
},
{
"match": {
"address": {
"query": "'.$address.'",
"boost": 1
}
}
}
]
}
},
"filter": {
"geo_distance": {
"distance": "2km",
"location": {
"lat": "'.$latitude.'",
"lon": "'.$longitude.'"
}
}
}
}
}
}
}');
You should wrap the whole query in a "query" clause:
$query = new \Elastica\Query\Builder('{
"query": {
"function_score": {
"functions": [
{
"gauss": {
"location": {
"origin": "'.$latitude.', '.$longitude.'",
"scale": "2km"
}
}
}
],
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "'.$name.'",
"boost": 4
}
}
},
{
"match": {
"address": {
"query": "'.$address.'",
"boost": 1
}
}
}
]
}
},
"filter": {
"geo_distance": {
"distance": "2km",
"location": {
"lat": "'.$latitude.'",
"lon": "'.$longitude.'"
}
}
}
}
}
}
}
}');
To get more feedback when building queries, it is a good habit to print Elasticsearch's response to the query.
Try the query in Sense:
POST test/_search
{
"function_score": {
"functions": [
{
"gauss": {
"location": {
"origin": "11, 12",
"scale": "2km"
}
}
}
],
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"match": {
"name": {
"query": "name",
"boost": 4
}
}
},
{
"match": {
"address": {
"query": "address",
"boost": 1
}
}
}
]
}
},
"filter": {
"geo_distance": {
"distance": "2km",
"location": {
"lat": 11,
"lon": 12
}
}
}
}
}
}
}
You'll get the response:
SearchParseException[[test][4]: from[-1],size[-1]: Parse Failure [No parser for element [function_score]]]; }]
This tells you that Elasticsearch doesn't recognize "function_score" at the top level of the request body. If you read the function_score page in the Elasticsearch documentation in detail and look at the example, you'll find what's missing: the surrounding "query" clause.
PS: Elastica also provides an alternative syntax for creating the query body with the ElasticaQueryBuilder, which spares you from writing the JSON.
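The fix can also be expressed as a small guard that wraps a bare query in the top-level "query" key before it is sent. This is only a sketch in Python with a hypothetical helper name; it is not part of Elastica or of any Elasticsearch client.

```python
def ensure_query_wrapper(body):
    """Wrap a bare query such as {"function_score": ...} in a top-level "query" key."""
    if "query" in body:
        # Already a full request body; leave it unchanged.
        return body
    return {"query": body}


if __name__ == "__main__":
    bare = {"function_score": {"query": {"match_all": {}}, "functions": []}}
    print(ensure_query_wrapper(bare))
```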
