Combine Elasticsearch bool query with range boost
I have a complex bool query, shown below. For this example I use a bogus search term, dgbdrtgndgfndrtgb, which should not match anything.
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"boost_mode": "replace",
"query": {
"filtered": {
"filter": {
# ...
},
"query": {
"bool": {
"should": [
{
"match": {
"name.suggest_ngrams": {
"query": "dgbdrtgndgfndrtgb",
"fuzziness": "AUTO",
"prefix_length": 1,
"operator": "AND",
"boost": 10
}
}
},
{
"multi_match": {
"query": "dgbdrtgndgfndrtgb",
"fields": [
"name.untouched_lowercase"
],
"boost": 5
}
},
{
"query_string": {
"fields": [
"name.suggest"
],
"query": "dgbdrtgndgfndrtgb*",
"boost": 10
}
},
{
"query_string": {
"fields": [
"name.suggest"
],
"query": "dgbdrtgndgfndrtgb",
"boost": 10
}
},
{
"match": {
"first_word": {
"query": "dgbdrtgndgfndrtgb",
"operator": "AND",
"boost": 10
}
}
},
{
"match": {
"name": {
"query": "dgbdrtgndgfndrtgb",
"operator": "AND",
"boost": 5
}
}
}
]
}
}
}
}
}
}
}
This works well. Now, for any of those matches, I want to add a boost where the name field has fewer than 2 words. In other words, boost single-word matches or sort them to the top of the result set.
So I tried adding a range boost like this:
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"boost_mode": "replace",
"query": {
"filtered": {
"filter": {
# ...
},
"query": {
"bool": {
"should": [
{
"match": {
"name.suggest_ngrams": {
"query": "dgbdrtgndgfndrtgb",
"fuzziness": "AUTO",
"prefix_length": 1,
"operator": "AND",
"boost": 10
}
}
},
{
"multi_match": {
"query": "dgbdrtgndgfndrtgb",
"fields": [
"name.untouched_lowercase"
],
"boost": 5
}
},
{
"query_string": {
"fields": [
"name.suggest"
],
"query": "dgbdrtgndgfndrtgb*",
"boost": 10
}
},
{
"query_string": {
"fields": [
"name.suggest"
],
"query": "dgbdrtgndgfndrtgb",
"boost": 10
}
},
{
"match": {
"first_word": {
"query": "dgbdrtgndgfndrtgb",
"operator": "AND",
"boost": 10
}
}
},
{
"match": {
"name": {
"query": "dgbdrtgndgfndrtgb",
"operator": "AND",
"boost": 5
}
}
},
{
"range": {
"name.word_count": {
"lt": 2,
"boost": 40
}
}
}
]
}
}
}
}
}
}
}
This sorts things like I want, but it also returns single-word matches which do not match the search term dgbdrtgndgfndrtgb.
Is there a way to only boost single-word matches, which also match the search term? I've tried lowering the boost value, which breaks the desired sorting when using a valid (found) search term.
It seems like there should be a way to AND the entire bool query with the range boost. I've tried various permutations to achieve this with no luck and the docs are less than helpful.
One caveat: I cannot use scripting as the index is hosted on AWS which doesn't support it.
Any advice is appreciated.
After sleeping on the problem, it hit me that it is just Boolean logic. I came up with a solution that works perfectly: I wrapped the working query logic in a must clause and put the range boost in a should clause. Because the outer bool query now has a must clause, the should clause is optional and only affects scoring.
{
"from": 0,
"size": 10,
"query": {
"function_score": {
"boost_mode": "replace",
"query": {
"filtered": {
"filter": {
# ...
},
"query": {
"bool": {
"must": {
"bool": {
"should": [
{
"match": {
"name.suggest_ngrams": {
"query": "dgbdrtgndgfndrtgb",
"fuzziness": "AUTO",
"prefix_length": 1,
"operator": "AND",
"boost": 10
}
}
},
{
"multi_match": {
"query": "dgbdrtgndgfndrtgb",
"fields": [
"name.untouched_lowercase"
],
"boost": 5
}
},
{
"query_string": {
"fields": [
"name.suggest"
],
"query": "dgbdrtgndgfndrtgb*",
"boost": 10
}
},
{
"query_string": {
"fields": [
"name.suggest"
],
"query": "dgbdrtgndgfndrtgb",
"boost": 10
}
},
{
"match": {
"first_word": {
"query": "dgbdrtgndgfndrtgb",
"operator": "AND",
"boost": 10
}
}
},
{
"match": {
"name": {
"query": "dgbdrtgndgfndrtgb",
"operator": "AND",
"boost": 5
}
}
}
]
}
},
"should": {
"range": {
"name.word_count": {
"lt": 2,
"boost": 40
}
}
}
}
}
}
}
}
}
}
Yay!
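The restructuring above can be captured in a small helper. This is a minimal sketch, assuming the query is assembled as a Python dictionary before being sent to Elasticsearch; the name boost_matches_only and its parameters are hypothetical and not part of any client library, and only two of the original match clauses are repeated to keep it short.

```python
import json


def boost_matches_only(match_clauses, boost_clause):
    """Require at least one match clause; let boost_clause only add score.

    Hypothetical helper: the original should-group moves into "must",
    and the boost becomes the outer "should".
    """
    return {
        "bool": {
            # The nested bool has only "should" clauses, so a document has
            # to satisfy at least one of them to pass the outer "must".
            "must": {"bool": {"should": list(match_clauses)}},
            # With "must" present, this clause is optional: it raises the
            # score of documents that already matched, but cannot admit a
            # document on its own.
            "should": boost_clause,
        }
    }


matches = [
    {"match": {"name": {"query": "dgbdrtgndgfndrtgb", "operator": "AND", "boost": 5}}},
    {"match": {"first_word": {"query": "dgbdrtgndgfndrtgb", "operator": "AND", "boost": 10}}},
]
single_word = {"range": {"name.word_count": {"lt": 2, "boost": 40}}}

query = boost_matches_only(matches, single_word)
print(json.dumps(query, indent=2))
```

The earlier attempt failed because the range clause sat in the same should list as the match clauses, where matching any single clause is enough for a document to be returned.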
Related
I have a complicated query which works fine. The problem is that I want to add a condition (filter) to it to filter the results. I need exactly the results I currently get, filtered on the field called "field7".
"query": {
"bool": {
"should": [
{
"match_bool_prefix": {
"field1": {
"query": "test",
"fuzziness": "auto",
"boost": 1
}
}
},
{
"match": {
"field2": {
"query": "test",
"boost": 10
}
}
},
{
"exists": {
"field": "field3",
"boost": 15
}
},
{
"exists": {
"field": "field4",
"boost": 10
}
},
{
"match_phrase_prefix": {
"field5": {
"query": ""
}
}
}
],
"must": [
{
"bool": {
"filter": [
{
"match": {
"field6": "A"
}
},
{"terms": { "field7": [3,4,5]}}
]
}
}
],
"minimum_should_match": 3
}
},
"size": 20
I have to use "minimum_should_match": 3 to meet my requirements (if I remove it, I get unrelated results), but when I use it together with the filter, the query returns nothing. Is there any suggestion for how to get the current results and filter them on field7?
@Paris I believe you can put the field7 condition in a filter clause, since you want to filter the result set produced by the should and must clauses. So basically this should suffice:
"query": {
"bool": {
"should": [
{
"match_bool_prefix": {
"field1": {
"query": "test",
"fuzziness": "auto",
"boost": 1
}
}
},
{
"match": {
"field2": {
"query": "test",
"boost": 10
}
}
},
{
"exists": {
"field": "field3",
"boost": 15
}
},
{
"exists": {
"field": "field4",
"boost": 10
}
},
{
"match_phrase_prefix": {
"field5": {
"query": ""
}
}
}
],
"must": [
{"match": {"field6": "A"}}
],
"filter": [
{"terms": {"field7": [3, 4, 5]}}
]
}
},
"size": 20
I have documents with different type_id values in an ES index, and I want to give different type_id values different scores so that results of some types rank higher.
My query is:
{
"query":{
"bool":{
"must":[
{"terms":{"type_id":[9,10]}}
],
"should":[
{"match":{ "display_name":{"query":"keyword","boost":10}}},
{"match":{ "description":{"query":"keyword","boost":2}}}
]
}
}
}
I want documents with type_id 9 to score higher than those with type_id 10 when display_name and description are the same.
Please guide me on this problem.
Thanks.
You can group your queries as shown below and use boost to give more weight to certain ids.
{
"query": {
"bool": {
"must": [
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"term": {
"type_id": {
"value": 9,
"boost": 2
}
}
},
{
"term": {
"type_id": {
"value": 10,
"boost": 1
}
}
}
]
}
},
{
"bool": {
"minimum_should_match": 1,
"should": [
{
"match": {
"display_name": {
"query": "keyword",
"boost": 10
}
}
},
{
"match": {
"description": {
"query": "keyword",
"boost": 2
}
}
}
]
}
}
]
}
}
}
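The grouped query above repeats one boosted term clause per type_id. Here is a minimal sketch of generating that structure, assuming the query is assembled as a Python dictionary; the helper names boosted_terms and any_of are hypothetical and only illustrate the pattern.

```python
def boosted_terms(field, boosts):
    """Build one boosted term clause per value (hypothetical helper)."""
    return [
        {"term": {field: {"value": value, "boost": boost}}}
        for value, boost in boosts.items()
    ]


def any_of(clauses):
    """Wrap clauses in a bool that requires at least one of them to match."""
    return {"bool": {"minimum_should_match": 1, "should": clauses}}


query = {
    "query": {
        "bool": {
            "must": [
                # Which type matched decides how much this group contributes.
                any_of(boosted_terms("type_id", {9: 2, 10: 1})),
                # The keyword has to match at least one of the text fields.
                any_of([
                    {"match": {"display_name": {"query": "keyword", "boost": 10}}},
                    {"match": {"description": {"query": "keyword", "boost": 2}}},
                ]),
            ]
        }
    }
}
```

Adding another type is then a matter of adding one entry to the mapping, for example {9: 2, 10: 1, 11: 1}.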
Edit: For the query in the comment, you can use function_score:
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"type_id": {
"value": 9
}
}
}
],
"minimum_should_match": 1,
"should": [
{
"match": {
"display_name": {
"query": "keyword"
}
}
},
{
"match": {
"description": {
"query": "keyword"
}
}
}
]
}
},
"boost": "5"
}
},
{
"function_score": {
"query": {
"bool": {
"must": [
{
"term": {
"type_id": {
"value": 10
}
}
}
],
"minimum_should_match": 1,
"should": [
{
"match": {
"display_name": {
"query": "keyword"
}
}
},
{
"match": {
"description": {
"query": "keyword"
}
}
}
]
}
},
"boost": "4"
}
}
]
}
}
}
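The two function_score clauses above differ only in the type_id and the boost. Here is a minimal sketch that builds them from a mapping, assuming Python dictionaries are used to assemble the request; the function names are hypothetical, and the boosts are written as numbers rather than the quoted strings used in the answer.

```python
def keyword_for_type(type_id, keyword, boost):
    """One function_score clause: keyword match restricted to a type, then boosted."""
    return {
        "function_score": {
            "query": {
                "bool": {
                    "must": [{"term": {"type_id": {"value": type_id}}}],
                    "minimum_should_match": 1,
                    "should": [
                        {"match": {"display_name": {"query": keyword}}},
                        {"match": {"description": {"query": keyword}}},
                    ],
                }
            },
            # Applies to the whole clause, so it scales every match of this type.
            "boost": boost,
        }
    }


def per_type_query(keyword, boosts):
    """Combine one clause per type; a document needs to match at least one."""
    clauses = [keyword_for_type(t, keyword, b) for t, b in boosts.items()]
    return {"query": {"bool": {"minimum_should_match": 1, "should": clauses}}}


query = per_type_query("keyword", {9: 5, 10: 4})
```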
I have an Elasticsearch bool query with multiple should clauses, combining several match queries on several fields with different boost values.
Say I have the following fields:
productName / currency / country / identifierNumber
I want to filter my results conditionally, with this logic:
If the hits of the bool query come from a match query on the productName field, then filter by currency.
If the hits of the bool query come from a match query on the identifierNumber field, then do not filter by currency.
UPDATE
{
"query": {
"function_score": {
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "parameter",
"fields": [
"productName.test^8",
"productName.raw^4",
"_all^2",
"Zone^10",
"category^12",
"class^71"
],
"fuzziness": "1",
"prefix_length": 1
}
},
{
"match": {
"productName.test": {
"query": "parameter",
"operator": "and",
"fuzziness": "1",
"prefix_length": 3,
"boost": 1000
}
}
},
{
"match": {
"productName.raw": {
"query": "parameter",
"operator": "or",
"fuzziness": "AUTO",
"prefix_length": 3,
"boost": 10
}
}
},
{
"match": {
"identifierNumber": {
"query": "parameter",
"boost": 3000
}
}
},
{
"term": {
"tic": {
"value": "parameter",
"boost": 30000
}
}
},
{
"match": {
"_all": {
"query": "parameter",
"operator": "or",
"fuzziness": "1",
"prefix_length": 2,
"boost": 10
}
}
}
]
}
},
"functions": [
{
"field_value_factor": {
"field": "productvalue"
}
}
],
"boost_mode": "multiply"
}
},
"size": 30,
"highlight": {
"fields": {
"*": {}
},
"require_field_match": false
},
"post_filter": {
"bool": {
"must": [
{
"match": {
"countryName": "France"
}
}]
}
}
}
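One way the conditional logic described above might be expressed is to attach the currency filter only to the productName branch, instead of filtering the whole result set. This is only a sketch under several assumptions: the bool query accepts a filter clause (Elasticsearch 2.0 or later), currency is an exact-value field suitable for a term query, the value "EUR" is a made-up example, and the boosts, fuzziness and other clauses of the real query are left out.

```python
def conditional_currency_query(text, currency):
    """Filter by currency only for hits that come from the productName match."""
    return {
        "bool": {
            "should": [
                {
                    # Branch 1: productName hits are kept only if the
                    # currency filter also passes.
                    "bool": {
                        "must": {"match": {"productName": {"query": text}}},
                        "filter": {"term": {"currency": currency}},
                    }
                },
                # Branch 2: identifierNumber hits are never filtered by currency.
                {"match": {"identifierNumber": {"query": text}}},
            ],
            "minimum_should_match": 1,
        }
    }


query = conditional_currency_query("parameter", "EUR")
```

A document that matches both fields is still returned through the second branch, even if its currency differs; whether that is acceptable depends on the requirements.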
I have inherited an Elasticsearch query that I am trying to modify. The query I have at the moment is:
{
"fields": [
],
"from": 0,
"size": 51,
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"data.*"
],
"default_operator": "AND",
"query": "*Search term*"
}
},
"filter": [
{
"terms": {
"type": [
"typeOne",
"typeTwo",
"typeThree"
]
}
}
]
}
}
}
Now what I have been trying to do is boost one of these terms over the other 2 in the results but have not been able to get it to work. I have tried adding a "boost" value but this has oddly given me the opposite effect - it disables any type that is given a boost.
I tried the following as the "filter" object:
"filter": [
{
"bool": {
"should": [
{
"term": {
"type": "typeOne"
}
},
{
"term": {
"type": "typeTwo"
}
},
{
"term": {
"type": "typeThree",
"boost": 2
}
}
]
}
}
]
But as I said before, instead of boosting "typeThree" it removes all "typeThree" from the results.
Can anyone help me boost a specific term type?
There are multiple ways to structure the query to achieve this. One approach is to use function_score. It would look something along these lines:
Example:
"query": {
"function_score": {
"functions": [
{
"filter": {
"term": {
"type": "typeThree"
}
},
"weight": 2
}
],
"score_mode": "sum",
"boost_mode": "sum",
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"data.*"
],
"default_operator": "AND",
"query": "*search term*"
}
},
"filter": [
{
"terms": {
"type": [
"typeOne",
"typeTwo",
"typeThree"
]
}
}
]
}
}
}
}
You can enable explain to see how this affects the scoring.
While keety's answer was 98% of the way there, it took a bit of extra googling to put it all together. The problem is that "weight" doesn't work here; instead, you must use "boost_factor". The final query looks like this:
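Explain is switched on with a top-level flag in the search request body; each hit then carries an _explanation section that breaks down its score. Here is a minimal sketch, assuming the request body is held as a Python dictionary; the helper name with_explain is hypothetical, and the body shown is a shortened version of the query above.

```python
import copy


def with_explain(search_body):
    """Return a copy of a search body with per-hit score explanations enabled."""
    body = copy.deepcopy(search_body)  # leave the caller's dictionary untouched
    body["explain"] = True
    return body


search_body = {
    "query": {
        "function_score": {
            "functions": [
                {"filter": {"term": {"type": "typeThree"}}, "weight": 2}
            ],
            "score_mode": "sum",
            "boost_mode": "sum",
        }
    }
}

debug_body = with_explain(search_body)
```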
{
"fields": [
],
"from": 0,
"size": 51,
"query": {
"function_score": {
"functions": [
{
"filter": {
"term": {
"type": "typeOne"
}
},
"boost_factor": 1.2
},
{
"filter": {
"term": {
"type": "typeTwo"
}
},
"boost_factor": 1.1
},
{
"filter": {
"term": {
"type": "typeThree"
}
},
"boost_factor": 1
}
],
"score_mode": "sum",
"boost_mode": "sum",
"query": {
"filtered": {
"query": {
"query_string": {
"fields": [
"data.*"
],
"default_operator": "AND",
"query": "*search term*"
}
},
"filter": [
{
"terms": {
"type": [
"typeOne",
"typeTwo",
"typeThree"
]
}
}
]
}
}
}
}
}
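To see what the factors in the final query do to the ranking, the combination can be worked through by hand. The following is a toy model, not Elasticsearch's implementation: it assumes exactly one function filter matches each document (which holds here because the terms filter only lets the three types through), and the query score of 2.0 is a made-up value.

```python
# Factors taken from the query above.
FACTORS = {"typeOne": 1.2, "typeTwo": 1.1, "typeThree": 1}


def final_score(query_score, doc_type, factors=FACTORS):
    """Toy model of score_mode "sum" combined with boost_mode "sum"."""
    # score_mode "sum": add up the matching functions; only one matches here.
    function_score = factors[doc_type]
    # boost_mode "sum": add the function score to the query score.
    return query_score + function_score


for doc_type in FACTORS:
    print(doc_type, final_score(2.0, doc_type))
```

Under this model the factors act as additive offsets rather than multipliers, so for equal query scores typeOne ranks above typeTwo, which ranks above typeThree, but the gap stays fixed no matter how large the query score is.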
My goal is to write a query which rescores documents based on the value of a field in the document. To achieve this I was using a rescore query and then sorting the results. However, an explain on the query shows me that the documents are sorted by the previously computed score and not the new one.
I saw the following, which explains that I can't use rescore and sort together.
"Sometimes we want to show results, where the ordering of the first documents on the page is affected by the additional rules. Unfortunately this cannot be achieved by the rescore functionality. The first idea points to window_size parameter, but this parameter in fact is not connected with the first documents on the result list but with number of results returned on every shard. In addition window_size cannot be less than page size. (If it is less, ElasticSearch silently use page size). Also, one very important thing – rescoring cannot be combined with sorting, because sorting is done after changes introduced by rescoring."
http://elasticsearchserverbook.com/elasticsearch-0-90-using-rescore/
My query is:
{
"query": {
"filtered": {
"query": {
"bool": {
"should": [
{
"constant_score": {
"query": {
"match": {
"question": {
"query": "diabetes"
}
}
},
"boost": 1
}
},
{
"dis_max": {
"queries": [
{
"constant_score": {
"query": {
"match": {
"question": {
"query": "diabetes"
}
}
},
"boost": 0.01
}
},
{
"constant_score": {
"query": {
"match": {
"answer_text": {
"query": "diabetes"
}
}
},
"boost": 0.0001
}
}
]
}
},
{
"dis_max": {
"queries": [
{
"constant_score": {
"query": {
"match_phrase": {
"question_phrase": {
"query": "what is diabetes",
"slop": 0
}
}
},
"boost": 100
}
},
{
"constant_score": {
"query": {
"match_phrase": {
"question_phrase": {
"query": "what is diabetes",
"slop": 1
}
}
},
"boost": 50
}
},
{
"constant_score": {
"query": {
"match_phrase": {
"question_phrase": {
"query": "what is diabetes",
"slop": 2
}
}
},
"boost": 33
}
},
{
"constant_score": {
"query": {
"match_phrase": {
"question_phrase": {
"query": "what is diabetes",
"slop": 3
}
}
},
"boost": 25
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "question_group_four",
"query": "what__is__diabetes"
}
},
"boost": 0.1
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "question_group_five",
"query": "what__is__diabetes"
}
},
"boost": 0.15
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_no_synonyms_20",
"query": "what__is__diabetes"
}
},
"boost": 35
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_no_synonyms_15",
"query": "what__is__diabetes"
}
},
"boost": 25
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_no_synonyms_10",
"query": "what__is__diabetes"
}
},
"boost": 15
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_20",
"query": "what__is__diabetes"
}
},
"boost": 28
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_15",
"query": "what__is__diabetes"
}
},
"boost": 16
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_10",
"query": "what__is__diabetes"
}
},
"boost": 13
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_05",
"query": "what__is__diabetes"
}
},
"boost": 4
}
}
]
}
},
{
"dis_max": {
"queries": [
{
"constant_score": {
"query": {
"query_string": {
"default_field": "question_group_four",
"query": "diabetes"
}
},
"boost": 0.1
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "question_group_five",
"query": "diabetes"
}
},
"boost": 0.15
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_no_synonyms_20",
"query": "diabetes"
}
},
"boost": 35
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_no_synonyms_15",
"query": "diabetes"
}
},
"boost": 25
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_no_synonyms_10",
"query": "diabetes"
}
},
"boost": 15
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_20",
"query": "diabetes"
}
},
"boost": 28
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_15",
"query": "diabetes"
}
},
"boost": 16
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_10",
"query": "diabetes"
}
},
"boost": 13
}
},
{
"constant_score": {
"query": {
"query_string": {
"default_field": "concept_words_05",
"query": "diabetes"
}
},
"boost": 4
}
}
]
}
}
],
"disable_coord": true
}
},
"filter": {
"and": [
{
"term": {
"posted_by_expert": false
}
},
{
"term": {
"tip_question": false
}
},
{
"term": {
"show_in_work_queue": true
}
},
{
"range": {
"verified_answers_count": {
"gt": 0
}
}
}
]
}
}
},
"rescore": {
"window_size": 100,
"query": {
"rescore_query": {
"function_score": {
"functions": [
{
"script_score": {
"script": "_score * _source.concierge_boost"
}
}
]
}
}
}
},
"sort": [
"_score",
{
"count_words_with_high_concepts": {
"order": "asc"
}
},
{
"popularity": {
"order": "desc"
}
},
{
"length": {
"order": "asc"
}
}
],
"fields": [],
"size": 10,
"from": 0
}
Any help is highly appreciated!
Indeed, this is not possible. It has been discussed, and the decision was that it is not worth implementing at the moment. The discussion on GitHub reveals the difficulty: documents need to be sorted, the top 100 (in your case) chosen, then the rescore applied, and then they are sorted again. I suggest reading the comments in that GitHub issue, especially the ones from simonw. The issue is still open, but it doesn't seem it will be implemented soon, if at all.
Regarding sorting after another level of scoring: I understand the need to rescore only a few documents, but it seems that is not possible. What if you wrap your query in another function_score, where you define a script_score function to compute the final score? Something like this:
{
"query": {
"function_score": {
"query": {
.......
},
"functions": [
{
"script_score": {
"script": "doc['concierge_boost'].value"
}
}
]
}
},
"sort": [
"_score",
{
"count_words_with_high_concepts": {
"order": "asc"
}
},
{
"popularity": {
"order": "desc"
}
},
{
"length": {
"order": "asc"
}
}
],
"fields": [],
"size": 10,
"from": 0
}
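The wrapping suggested above can be sketched as a helper that takes any existing query and multiplies its score by a document field. This is a minimal sketch, assuming the request is assembled as a Python dictionary and that scripting is enabled on the cluster; the function name is hypothetical, and the inner match query is a shortened stand-in for the full bool query from the question.

```python
def multiply_score_by_field(inner_query, field):
    """Wrap a query so its score is multiplied by the value of a document field."""
    return {
        "function_score": {
            "query": inner_query,
            "functions": [
                {"script_score": {"script": "doc['%s'].value" % field}}
            ],
            # "multiply" is the default boost_mode; it is spelled out so the
            # final score is visibly the query score times the field value.
            "boost_mode": "multiply",
        }
    }


inner = {"match": {"question": {"query": "diabetes"}}}
body = {
    "query": multiply_score_by_field(inner, "concierge_boost"),
    # Sorting now sees the combined score, because it is computed in the
    # query phase rather than in a rescore step.
    "sort": ["_score", {"popularity": {"order": "desc"}}],
    "size": 10,
    "from": 0,
}
```

Unlike rescore, this computes the script for every matching document rather than only the top window, which is the trade-off the answer alludes to.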