Elasticsearch: how can I query by either multi_match or functions?

I pass the following three parameters to the query:
query - either a place name, a description, or empty,
lat - either the latitude of a place or empty,
lon - either the longitude of a place or empty.
Based on these parameters, I want to retrieve a list of items ranked by query score, then adjust the ranking by the distance between each result and lat, lon.
I currently have the following query to get the items based on query and distance:
{
"query": {
"function_score": {
"query": {
"bool": {
"must": [{
"multi_match" : {
"query": "Lippo",
"fields": [ "name^6", "city^5", "country^4", "position^3", "address_line^2", "description"]
}
}]
}
},
"functions": [
{
"gauss": {
"position": {
"origin": "-6.184652, 106.7518749",
"offset": "2km",
"scale": "10km",
"decay": 0.33
}
}
}
]
}
}
}
The problem is that if query is empty, there are no results at all. What I want is for the results to be based on either the query or the distance.
Is there any way to achieve this? Any suggestion is appreciated.

Setting the zero_terms_query option of multi_match to all should allow you to get results when query is empty.
Example:
{
"query": {
"function_score": {
"query": {
"bool": {
"must": [{
"multi_match" : {
"query": "Lippo",
"fields": [ "name^6", "city^5", "country^4", "position^3", "address_line^2", "description"],
"zero_terms_query" : "all"
}
}]
}
},
"functions": [
{
"gauss": {
"position": {
"origin": "-6.184652, 106.7518749",
"offset": "2km",
"scale": "10km",
"decay": 0.33
}
}
}
]
}
}
}
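The same fallback can also be handled on the client side by building the request conditionally. The following is a minimal Python sketch, not a definitive implementation: build_search_body is a hypothetical helper (it is not part of any Elasticsearch client), the field names are copied from the question, and the origin is passed as a lat/lon object instead of a string. It falls back to match_all when query is empty and only adds the gauss function when both lat and lon are given.

```python
def build_search_body(query=None, lat=None, lon=None):
    """Hypothetical helper: build the search request body as a plain dict."""
    if query:
        # Text part: same multi_match as in the question.
        text_query = {
            "multi_match": {
                "query": query,
                "fields": ["name^6", "city^5", "country^4", "position^3",
                           "address_line^2", "description"],
            }
        }
    else:
        # Empty query: match every document so distance alone can rank them.
        text_query = {"match_all": {}}

    if lat is None or lon is None:
        # No coordinates: plain text query, no distance scoring.
        return {"query": text_query}

    return {
        "query": {
            "function_score": {
                "query": text_query,
                "functions": [{
                    "gauss": {
                        "position": {
                            "origin": {"lat": lat, "lon": lon},
                            "offset": "2km",
                            "scale": "10km",
                            "decay": 0.33,
                        }
                    }
                }],
            }
        }
    }
```

The returned dict can be serialized with json.dumps and sent as the request body of a _search call.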

Related

How to add filters to an Elasticsearch query when using function_score?

Here is my current elastic query:
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"must": [{
"multi_match": {
"query": "ocean",
"fields": [],
"fuzziness": "AUTO"
}}],
"must_not": [{
"exists": {
"field": "parentId"
}
}]
}
},
"functions" : [
{
"gauss": {
"createTime": {
"origin": "2020-07-09T23:50:00",
"scale": "365d",
"decay": 0.3
}
}
}
]
}
}
}
How do I properly add filters to this? I suspect that using function_score makes this different. I would like to add a hard filter, for example, only show results with uploadUser: 'Mr. Bean', but still keep the scoring in place for the results that pass this filter.
I tried using filter in various places, and also must, but I get either no results or all the results.
I'm using Elasticsearch 7. Thanks for your help.
You can try the search query below, which adds a filter clause to the bool query. Filter clauses restrict the results without contributing to the score.
Refer to the official Elasticsearch documentation to learn more about the function score query.
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"term": {
"uploadUser": "Mr. Bean"
}
},
"must": [
{
"multi_match": {
"query": "ocean",
"fields": [
],
"fuzziness": "AUTO"
}
}
],
"must_not": [
{
"exists": {
"field": "parentId"
}
}
]
}
},
"functions": [
{
"gauss": {
"createTime": {
"origin": "2020-07-09T23:50:00",
"scale": "365d",
"decay": 0.3
}
}
}
]
}
}
}
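If the request body is assembled in application code, the filter can be attached programmatically. This is a minimal Python sketch under stated assumptions: add_term_filter is a hypothetical helper, and it expects the body to have the function_score / bool shape shown above. Note also that a term filter is not analyzed, so if uploadUser is mapped as text, the filter may need to target a keyword sub-field (for example uploadUser.keyword, assuming the default dynamic mapping).

```python
import copy


def add_term_filter(body, field, value):
    """Hypothetical helper: return a copy of body with a term filter added
    to the bool query wrapped by function_score."""
    new_body = copy.deepcopy(body)  # leave the caller's dict untouched
    bool_query = new_body["query"]["function_score"]["query"]["bool"]

    filters = bool_query.get("filter", [])
    if isinstance(filters, dict):
        # A single filter object is allowed; normalize it to a list.
        filters = [filters]
    filters.append({"term": {field: value}})
    bool_query["filter"] = filters
    return new_body


base = {
    "query": {
        "function_score": {
            "query": {"bool": {"must": [{"multi_match": {"query": "ocean", "fields": []}}]}},
            "functions": [],
        }
    }
}
filtered = add_term_filter(base, "uploadUser", "Mr. Bean")
```

Because the filter sits inside the query that function_score wraps, the functions are only evaluated for documents that pass it.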

function_score: treat missing field as perfect hit

What I need to do is boost documents by location (closer is better). locations is a nested type.
This works fine, except that Elasticsearch does not return documents if locations is missing from the document. If the field is missing, Elasticsearch should treat the document as a perfect hit. Any idea how to achieve this?
My query:
{
"sort": [
{
"_score": "desc"
}
],
"query": {
"function_score": {
"query": {
"bool": {
"must_not": [],
"should": [
{
"nested": {
"path": "locations",
"query": {
"function_score": {
"score_mode": "sum",
"functions": [
{
"gauss": {
"locations.coordinates": {
"origin": {
"lat": "50.1078852",
"lon": "15.0385376"
},
"scale": "100km",
"offset": "20km",
"decay": "0.5"
}
}
}
]
}
}
}
}
]
}
}
}
}
}
BTW: I'm using Elasticsearch 5.0.
Add one more function to the functions section with a high weight (like 1000) for documents with missing locations, something like this:
{
"filter": {
"bool": {
"must_not": {
"exists": {
"field": "locations"
}
}
}
},
"weight": 1000
}
Records with missing locations will then come first because of the high weight.
The syntax may differ slightly. More information on these queries:
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-exists-query.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

ElasticSearch function_score query with filters

I am trying to create a search that returns locations within about 500m of a certain geo point.
I need to filter the results of this search based on whether location is empty in the source or not.
I tried things like this:
"filtered" : {
"filter": {
"missing" : { "field" : "location" }
}
}
This is the search JSON I got:
{"query": {
"function_score": {
"query": {
"bool": {
"must": [
{
"match": {"fieldA": "value"}
},
{
"match": { "fieldB": "value"}
}
]
}
},
"functions": [{
"gauss": {
"location": {
"origin": "'.$lat.','.$lon.'",
"scale": "500m",
"offset": "0km",
"decay": 0.33
}
}
}]
}
}
}
I tried putting the filter in different places in the query, but it didn't work, so I know I'm doing something fundamentally wrong with how the query is structured. In the future I want to add more scoring logic and other filters, but I can't find a good example of such queries.
What should I do to make it work?
You can use a filtered query with geo_distance to filter the results first (note that the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, where a bool query with a filter clause takes its place). You can also use "weight" in function_score to explicitly boost the distance over the "match" queries:
{
"query": {
"function_score": {
"query": {
"filtered": {
"query": {
"bool": {
"must": [
{
"match": {
"fieldA": "value"
}
},
{
"match": {
"fieldB": "value"
}
}
]
}
},
"filter": {
"geo_distance" : {
"distance" : "500m",
"location" : "'.$lat.','.$lon.'"
}
}
}
},
"functions": [
{
"gauss": {
"location": {
"origin": "'.$lat.','.$lon.'",
"scale": "500m",
"offset": "0km",
"decay": 0.33
}
},
"weight": "3"
}
]
}
}
}
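To get a feel for how scale, offset and decay shape the score, the gauss decay can be reproduced outside Elasticsearch. The sketch below is a plain Python version of the formula given in the function score documentation, assuming distances are already expressed in meters; gauss_decay is a hypothetical name used only for illustration.

```python
import math


def gauss_decay(distance, scale, offset=0.0, decay=0.5):
    """Gaussian decay as described in the function_score documentation.

    distance, scale and offset must use the same unit (meters here).
    """
    # sigma^2 is chosen so that the score equals `decay` at offset + scale.
    sigma_sq = -scale ** 2 / (2.0 * math.log(decay))
    # Inside the offset radius there is no decay at all.
    effective = max(0.0, distance - offset)
    return math.exp(-effective ** 2 / (2.0 * sigma_sq))


# With the values from the query above (scale 500m, offset 0, decay 0.33):
at_origin = gauss_decay(0, 500, 0, 0.33)    # full score at the origin
at_scale = gauss_decay(500, 500, 0, 0.33)   # equals the decay value at 500m
```

With a "weight" of 3, Elasticsearch multiplies the function's value by 3 before it is combined with the query score.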

Elasticsearch query: Multiply final score using nested object and function score

I have documents with some data and a specific omit list in them (see mapping and example data).
I would like to write an ES query that does the following:
Calculate a "basic" score for the documents (Query 1):
{
"explain": true,
"query": {
"bool": {
"should": [
{
"constant_score": {
"filter": {
"term": {
"type": "TYPE1"
}
}
}
},
{
"function_score": {
"linear": {
"number": {
"origin": 30,
"scale": 20
}
}
}
}
]
}
}
}
Finally, multiply the score according to the omit percent of a specific id (in the example I used the omit value for "omit.id": "A"). As a demonstration, Query 2 calculates this multiplier:
{
"query": {
"nested": {
"path": "omit",
"query": {
"function_score": {
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"term": {
"omit.id": "A"
}
}
}
},
"functions": [
{
"linear": {
"omit.percent": {
"origin": 0,
"scale": 50,
"offset": 0,
"decay": 0.5
}
}
}
],
"score_mode": "multiply"
}
}
}
}
}
In trying to achieve this final multiplication, I faced the following problems:
If I calculate the linear function score inside a nested query, then (according to my interpretation) I cannot use any other field in the function_score query.
I cannot multiply the calculated score with any other function_score that is encapsulated in a nested query.
I would appreciate any advice on resolving this issue.
Note that maybe I should get rid of the nested type and use key-value pairs instead. For example:
{
"omit": {
"A": {
"percent": 10
},
"B": {
"percent": 100
}
}
}
Unfortunately, there would be a lot of keys, which would result in a huge (continuously growing) mapping, so I would prefer not to use this option.
In the end I figured out a possible solution based on a "non-nested" approach. The complete script can be found here.
I modified the omit list as described in the question:
"omit": {
"A": {
"percent": 10
},
"B": {
"percent": 100
}
}
In addition, I set the enabled flag to false so that these elements are not added to the mapping:
"omit": {
"type" : "object",
"enabled" : false
}
The last trick was to use script_score as the function_score's function, because only there could I read the value of percent, via the _source.omit.A.percent script:
{
"query": {
"function_score": {
"query": {
...
},
"script_score": {
"lang": "groovy",
"script": "if (_source.omit.A){(100-_source.omit.A.percent)/100} else {1}"
},
"score_mode": "multiply"
}
}
}
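The arithmetic performed by the script can be checked outside Elasticsearch. Here is a small Python sketch of the same multiplier, assuming the document source is available as a dict; omit_multiplier is a hypothetical name and is not part of the original solution.

```python
def omit_multiplier(source, omit_id):
    """Mirror of the script: (100 - percent) / 100 if the id is present, else 1."""
    entry = source.get("omit", {}).get(omit_id)
    if not entry:
        # No omit entry for this id: leave the score unchanged.
        return 1.0
    return (100 - entry["percent"]) / 100


doc = {"omit": {"A": {"percent": 10}, "B": {"percent": 100}}}
multiplier_a = omit_multiplier(doc, "A")  # 10 percent omitted
multiplier_b = omit_multiplier(doc, "B")  # fully omitted
multiplier_c = omit_multiplier(doc, "C")  # id not in the omit list
```

An omit percent of 100 therefore drives the final score to zero, while a missing id leaves it untouched.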

Decay filter function for a no-limit value with ElasticSearch

I have the following documents (at least 1 000 000) in an ElasticSearch index:
{"title":"toto", "views":132, "likes":23, "date" : "2014-09-01..." ...}
Here title is indexed with a language analyzer, views and likes are integer fields ranging from 0 upward with no limit, and date is a date field.
I want to search by title and boost documents that are recent and have high views and likes counts.
I am using a decay function for the date (with today as the origin), and it works as expected, but I don't know how to boost on the views and likes fields, since they have no maximum to use as an origin.
Here is my search query:
POST /threads/_search
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "air france",
"type": "phrase",
"fields": [
"title^4",
"desc"
]
}
},
"functions": [
{
"exp": {
"date": {
"origin": "2014/09/29 13:00:00",
"scale": "12h",
"offset":"6h",
"decay":0.5
}
}
}
]
}
}
}
You could try "field_value_factor", as described in the function score documentation. You'd need to test and assess the results, modify the "factor" and the boost you are giving to "title", then test again and see whether it gets closer to what you need. Also, you can add ?explain to the search request to see how ES computes the _score. Something like this:
POST /threads/_search?explain
{
"query": {
"function_score": {
"query": {
"multi_match": {
"query": "air france",
"type": "phrase",
"fields": [
"title^8",
"desc"
]
}
},
"functions": [
{
"exp": {
"date": {
"origin": "2014/09/29 13:00:00",
"scale": "12h",
"offset":"6h",
"decay":0.5
}
}
},
{
"field_value_factor": {
"field": "views",
"modifier": "log2p",
"factor": 0.1
}
},
{
"field_value_factor": {
"field": "likes",
"modifier": "log2p",
"factor": 0.1
}
}
]
}
}
}
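When tuning "factor", it can help to see how quickly the log2p modifier flattens large counts. The following Python sketch reproduces the calculation as I understand it from the documentation (the field value is multiplied by factor, then 2 is added and the base-10 logarithm is taken); field_value_factor_log2p is a hypothetical name used for illustration only.

```python
import math


def field_value_factor_log2p(value, factor=1.0):
    """Approximate field_value_factor with modifier log2p: log10(factor * value + 2)."""
    return math.log10(factor * value + 2)


# Compare a few view counts with factor 0.1, as in the query above.
for views in (0, 132, 10_000, 1_000_000):
    print(views, round(field_value_factor_log2p(views, 0.1), 3))
```

Because of the logarithm, a document with a million views gets only a few times the boost of one with a hundred, which keeps unbounded counters from drowning out the text relevance.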
