Function_score, multi_match, script_score, and filter in Elasticsearch

I'm having trouble adding a filter to my existing multi_match query, which is embedded inside a function_score query.
Ideally, I'd like to filter by "term" : { "lang" : "en" } so that I only get back documents in English.
I've tried reordering the clauses and wrapping my query in bool, but I can't get the filter to work alongside the other functions I'm using.
My query:
GET /my_index/_search/
{
"query": {
"function_score": {
"query": {
"bool": {
"filter": {
"term": {
"lang": "en"
}
},
"multi_match": {
"query": "Sample Query here",
"type": "most_fields",
"fields": [
"body",
"title",
"permalink",
"name"
]
}
}
},
"script_score": {
"script": {
"source": "_score + 10"
}
}
}
}
}
Error code:
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[bool] query does not support [multi_match]",
"line": 11,
"col": 19
}
],
"type": "parsing_exception",
"reason": "[bool] query does not support [multi_match]",
"line": 11,
"col": 19
},
"status": 400
}
I'm using the latest version of Elasticsearch (I believe 6.2)

Try wrapping your multi_match in a must clause, like so:
"must": {
"multi_match": ...
}
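The error message points at the cause: a bool query does not accept a query type such as multi_match as a direct key. Queries have to sit inside one of its occurrence clauses: must, filter, should, or must_not.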
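The same rule can be checked before a request is sent. This is a minimal Python sketch, assuming the request body is built as a plain dict; `check_bool_clauses` and `filtered_function_score` are hypothetical helper names, not part of any Elasticsearch client, and the allowed-key set is an illustrative assumption that may not be exhaustive.

```python
# Occurrence clauses of a bool query, plus two common parameters.
# Illustrative assumption: this set may not cover every accepted key.
BOOL_KEYS = {"must", "filter", "should", "must_not", "minimum_should_match", "boost"}


def check_bool_clauses(bool_body):
    """Return the keys of a bool query body that are not recognised clauses."""
    return sorted(set(bool_body) - BOOL_KEYS)


def filtered_function_score(text, fields, lang, script_source):
    """Build a function_score body whose inner query is bool(filter + must)."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        # filter: restricts the result set, does not affect _score
                        "filter": {"term": {"lang": lang}},
                        # must: the scoring full-text clause
                        "must": {
                            "multi_match": {
                                "query": text,
                                "type": "most_fields",
                                "fields": fields,
                            }
                        },
                    }
                },
                "script_score": {"script": {"source": script_source}},
            }
        }
    }


# The shape from the question: multi_match placed directly under bool.
broken = {"filter": {"term": {"lang": "en"}}, "multi_match": {"query": "x"}}

body = filtered_function_score(
    "Sample Query Here", ["body", "title", "permalink", "name"], "en", "_score + 10"
)
fixed = body["query"]["function_score"]["query"]["bool"]

print(check_bool_clauses(broken))  # reports the misplaced multi_match key
print(check_bool_clauses(fixed))   # reports nothing
```

A check like this only catches misplaced keys; Elasticsearch itself remains the authority on what a given version accepts.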

Final Solution:
GET /my_index/_search/
{
"query": {
"function_score": {
"query": {
"bool" : {
"filter": {
"term": {
"lang": "en"
}
},
"must" : {
"multi_match" : {
"query": "Sample Query Here",
"type": "most_fields",
"fields": [ "body", "title", "permalink", "name"]
}
}
}
},
"script_score" : {
"script" : {
"source": "_score + 10"
}
}
}
}
}

Related

How to filter on nested document length by script in Elasticsearch

I am trying to filter documents that have at least a given amount of items in a nested field, but I keep getting the following exception:
"caused_by" : {
"type" : "illegal_argument_exception",
"reason" : "No field found for [items] in mapping"
}
Here's example code to reproduce:
PUT store
{
"mappings": {
"properties": {
"subject": {
"type": "keyword"
},
"items": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"count": {
"type": "integer"
}
}
}
}
}
}
POST store/_bulk?refresh=true
{"create":{"_index":"store","_id":"1"}}
{"type":"appliance","items":[{"name":"Color TV"}]}
{"create":{"_index":"store","_id":"2"}}
{"type":"vehicle","items":[{"name":"Car"},{"name":"Bicycle"}]}
{"create":{"_index":"store","_id":"3"}}
{"type":"instrument","items":[{"name":"Guitar"},{"name":"Piano"},{"name":"Drums"}]}
GET store/_search
{
"query": {
"bool": {
"filter": [
{
"script": {
"script": {
"source": "doc['items'].size() > 1"
}
}
}
]
}
}
}
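Please note that this is only a simplified version of the filter script I really want, but if I can get past this error, I will probably be able to solve my task as well.
Any help would be appreciated.

I ended up solving it with a custom score approach. The script filter fails because a nested field has no doc values of its own (its objects are indexed as separate hidden documents), so doc['items'] cannot be resolved. Reading the array from _source inside a script_score works, and min_score then drops the documents that score 0: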
GET store/_search
{
"min_score": 0.1,
"query": {
"function_score": {
"query": {
"match_all": {}
},
"functions": [
{
"script_score": {
"script": {
"source": "params['_source']['items'].length > 1 ? 1 : 0"
}
}
}
]
}
}
}
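To see what the query above computes, the following Python sketch reproduces the rule locally on the three sample documents. It is only a simulation of the `script_score` expression and the `min_score` cut-off and does not talk to Elasticsearch; `script_score` and `simulate` are hypothetical helper names. It relies on the `match_all` query scoring 1.0 and on the default multiply modes of `function_score`, so the final score equals the script value.

```python
# Sample documents from the bulk request above (only the nested field matters here).
DOCS = {
    "1": {"items": [{"name": "Color TV"}]},
    "2": {"items": [{"name": "Car"}, {"name": "Bicycle"}]},
    "3": {"items": [{"name": "Guitar"}, {"name": "Piano"}, {"name": "Drums"}]},
}


def script_score(source, threshold=1):
    """Mirror of: params['_source']['items'].length > 1 ? 1 : 0"""
    return 1 if len(source.get("items", [])) > threshold else 0


def simulate(docs, min_score=0.1, threshold=1):
    """Return the ids whose simulated score reaches min_score."""
    return [
        doc_id
        for doc_id, source in docs.items()
        if script_score(source, threshold) >= min_score
    ]


print(simulate(DOCS))               # documents with more than one item
print(simulate(DOCS, threshold=2))  # documents with more than two items
```

Because the script reads `_source` for every candidate document, this approach is likely to be slower than a filter on an indexed field.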

full-text and knn_vector hybrid search for elastic

I am currently working on a search engine and have started to implement semantic search. I use the Open Distro version of Elasticsearch, and my mapping currently looks like this:
{
"settings": {
"index": {
"knn": true,
"knn.space_type": "cosinesimil"
}
},
"mappings": {
"properties": {
"title": {
"type" : "text"
},
"data": {
"type" : "text"
},
"title_embeddings": {
"type": "knn_vector",
"dimension": 600
},
"data_embeddings": {
"type": "knn_vector",
"dimension": 600
}
}
}
}
For a basic knn_vector search I use this:
{
"size": size,
"query": {
"script_score": {
"query": {
"match_all": { }
},
"script": {
"source": "cosineSimilarity(params.query_value, doc[params.field1]) + cosineSimilarity(params.query_value, doc[params.field2])",
"params": {
"field1": "title_embeddings",
"field2": "data_embeddings",
"query_value": query_vec
}
}
}
}
}
And I've managed to get a kind of hybrid search with this:
{
"size": size,
"query": {
"function_score": {
"query": {
"multi_match": {
"query": query,
"fields": ["data", "title"]
}
},
"script_score": {
"script": {
"source": "cosineSimilarity(params.query_value, doc[params.field1]) + cosineSimilarity(params.query_value, doc[params.field2])",
"params": {
"field1": "title_embeddings",
"field2": "data_embeddings",
"query_value": query_vec
}
}
}
}
}
}
The problem is that a document is not returned if it does not contain the query word. For example, with the first (vector-only) query, when I search for "trump" (which is not in my dataset) I still get documents about social networks and politics. I don't get these results with the hybrid search.
I have tried this:
{
"size": size,
"query": {
"function_score": {
"query": {
"match_all": { }
},
"functions": [
{
"filter" : {
"multi_match": {
"query": query,
"fields": ["data", "title"]
}
},
"weight": 1
},
{
"script_score" : {
"script" : {
"source": "cosineSimilarity(params.query_value, doc[params.field1]) + cosineSimilarity(params.query_value, doc[params.field2])",
"params": {
"field1": "title_embeddings",
"field2": "data_embeddings",
"query_value": query_vec
}
}
},
"weight": 4
}
],
"score_mode": "sum",
"boost_mode": "sum"
}
}
}
But the multi_match part gives a constant score to every document that matches, whereas I want it to rank documents the way a normal full-text query does. Any idea how to do this? Or should I use another strategy? Thank you in advance.

After some help from Archit Saxena, here is the solution to my problem. The match_all clause in should makes every document match, while multi_match only adds to the score of documents that contain the query words:
{
"size": size,
"query": {
"function_score": {
"query": {
"bool": {
"should" : [
{
"multi_match" : {
"query": query,
"fields": ["data", "title"]
}
},
{
"match_all": { }
}
],
"minimum_should_match" : 0
}
},
"functions": [
{
"script_score" : {
"script" : {
"source": "cosineSimilarity(params.query_value, doc[params.field1]) + cosineSimilarity(params.query_value, doc[params.field2])",
"params": {
"field1": "title_embeddings",
"field2": "data_embeddings",
"query_value": query_vec
}
}
},
"weight": 20
}
],
"score_mode": "sum",
"boost_mode": "sum"
}
}
}
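The way the pieces of this query add up can be sketched in plain Python. This is an approximation under stated assumptions, not the engine's implementation: `text_score` stands in for whatever relevance score Lucene assigns to the `multi_match` clause, `cosine` is an ordinary cosine similarity rather than the plugin's function, and `hybrid_score` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Plain cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def hybrid_score(text_score, query_vec, title_vec, data_vec, weight=20.0):
    """Approximate the final _score of the query above.

    text_score: score of the multi_match clause, or None when the document
                does not contain the query words (hypothetical input).
    """
    # bool/should: match_all always matches and contributes 1.0;
    # multi_match adds its score only when it matches.
    query_part = 1.0 + (text_score or 0.0)
    # One function: weight * script value (score_mode "sum" over a single function).
    function_part = weight * (cosine(query_vec, title_vec) + cosine(query_vec, data_vec))
    # boost_mode "sum": query score + function score.
    return query_part + function_part


query_vec = [1.0, 0.0]
title_vec = [1.0, 0.0]   # same direction as the query
data_vec = [0.0, 1.0]    # orthogonal to the query

no_text_match = hybrid_score(None, query_vec, title_vec, data_vec)
with_text_match = hybrid_score(3.5, query_vec, title_vec, data_vec)
print(no_text_match, with_text_match)
```

The sketch makes the trade-off visible: the `weight` of 20 decides how much the vector part dominates the text part, so it probably needs tuning against real relevance scores.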

How to compare two fields in the same document in ElasticSearch?

I am using Elasticsearch 7.2.0.
I have documents in an index with three fields: id, field1, and field2.
I want to query and return those documents where field1 > 20 AND field1 > field2.
Following is the data that the index has -
Following is the query that I'm trying -
GET /test/_search
{
"query": {
"function_score": {
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"range" : {
"field1" : {
"gte" : 20
}
}
},
{
"script": {
"source": "doc['field1'].value > doc['field2'].value",
"params": {
}
}}
]
}
}
}
}
}
}
}
Following is the error -
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[script] query does not support [source]",
"line": 18,
"col": 29
}
],
"type": "parsing_exception",
"reason": "[script] query does not support [source]",
"line": 18,
"col": 29
},
"status": 400
}

Your query is almost correct. You just need to wrap the script definition inside another script object: the outer script key denotes the script query, and the inner one defines the script itself:
{
"query": {
"function_score": {
"query": {
"constant_score": {
"filter": {
"bool": {
"must": [
{
"range": {
"field1": {
"gte": 20
}
}
},
{
"script": {
"script": {
"source": "doc['field1'].value > doc['field2'].value",
"params": {}
}
}
}
]
}
}
}
}
}
}
}
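The double `script` nesting is easy to get wrong by hand, so it can help to generate it. Below is a minimal Python sketch with hypothetical helper names (`script_filter`, `field1_gt_field2_query`, `matches`). It simplifies the query above to a plain bool/filter, and the sample documents are made up for illustration because the original index data is not shown.

```python
def script_filter(source, params=None):
    """Wrap Painless source in the two-level structure a script query expects."""
    return {
        "script": {          # outer key: the query type
            "script": {      # inner key: the script definition
                "source": source,
                "params": params or {},
            }
        }
    }


def field1_gt_field2_query(minimum):
    """Range condition plus field-to-field comparison, both as filters."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"field1": {"gte": minimum}}},
                    script_filter("doc['field1'].value > doc['field2'].value"),
                ]
            }
        }
    }


def matches(doc, minimum):
    """Local emulation of the two conditions, for sanity-checking expectations."""
    return doc["field1"] >= minimum and doc["field1"] > doc["field2"]


# Made-up sample documents (assumption: not the data from the question).
SAMPLE = [
    {"id": 1, "field1": 25, "field2": 10},
    {"id": 2, "field1": 30, "field2": 40},
    {"id": 3, "field1": 15, "field2": 5},
]

body = field1_gt_field2_query(20)
print([d["id"] for d in SAMPLE if matches(d, 20)])
```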

Elasticsearch has_child query with term and function_score, parsing_exception

I'm sending a POST request to Elasticsearch with the following body:
{
"query": {
"has_child" : {
"type" : "sometype",
"score_mode" : "sum",
"query" : {
"term" : {
"somefield" : "somevalue"
},
"function_score" : {
"script_score": {"script": "1"}
}
},
"inner_hits": {}
}
}
}
}
I get a response saying the query is malformed:
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[term] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 10,
"col": 17
}
],
"type": "parsing_exception",
"reason": "[term] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 10,
"col": 17
},
"status": 400
}
I read the documentation at this link: https://www.elastic.co/guide/en/elasticsearch/reference/5.4/query-dsl-has-child-query.html
Elasticsearch version: 5.4

A query object can hold only one query type, so term and function_score cannot sit side by side in it. Wrap them in a bool/must query, like this:
{
"query": {
"has_child": {
"type": "sometype",
"score_mode": "sum",
"query": {
"bool": {
"must": [
{
"term": {
"somefield": "somevalue"
}
},
{
"function_score": {
"script_score": {
"script": "1"
}
}
}
]
}
},
"inner_hits": {}
}
}
}
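The "one query type per object" rule can be enforced when the body is assembled in code. This is a small Python sketch; `all_of` is a hypothetical helper, not an Elasticsearch client function.

```python
def all_of(*queries):
    """Combine query clauses so that each one stays in its own JSON object."""
    for query in queries:
        if len(query) != 1:
            raise ValueError(
                f"expected exactly one query type per object, got {sorted(query)}"
            )
    if len(queries) == 1:
        return queries[0]
    return {"bool": {"must": list(queries)}}


term = {"term": {"somefield": "somevalue"}}
scored = {"function_score": {"script_score": {"script": "1"}}}

has_child_body = {
    "query": {
        "has_child": {
            "type": "sometype",
            "score_mode": "sum",
            "query": all_of(term, scored),
            "inner_hits": {},
        }
    }
}

# The shape from the question: two query types merged into one object.
try:
    all_of({**term, **scored})
    rejected = False
except ValueError:
    rejected = True

print(rejected)
```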

elastic search: use terms in nested field

I have a lot of documents with data that looks like this:
"paymentMethods": [
{
"id": 194,
"name": "Wire",
"logo": "wire.gif"
}, {
"id": 399,
"name": "Paper Check",
"logo": "papercheck.gif"
}
]
Mapping:
"paymentMethods": {
"type": "nested",
"properties": {
"id": {
"type": "long"
},
"logo": {
"type": "string",
"index": "not_analyzed"
},
"name": {
"type": "string",
"index": "not_analyzed"
}
}
}
I am trying to get all the documents that have both paymentMethods.id 399 and 194.
This query works for me with a single id:
"query": {
"filtered": {
"filter": {
"bool": {
"must": [{
"nested": {
"path": "paymentMethods",
"query": {
"bool": {
"must": [
{
"term": {
"paymentMethods.id": 399
}
}
]
}
}
}
}]
}
},
"query": {
"match_all": {}
}
}
}
The problem is that I need all the documents with both id 399 and id 194,
so I tried this:
"must" : [
{ "terms":{"paymentMethods.id" : [399,194]} }
]
but the result is an OR, and I want an AND.
I also tried this one, but it doesn't work at all:
"must" : [{
"term": {
"paymentMethods.id": 399
}
}, {
"term": {
"paymentMethods.id": 194
}
}]
Any suggestions on how I can get the documents with both paymentMethods.id 399 and 194?
Thanks.

Well, after some digging around I found the problem: each condition needs its own nested query inside the bool clause (must, should, must_not), i.e.
{"query": {
"bool": {
"must": [
{
"nested": {
"path": "paymentMethods",
"query": {
"term" : { "paymentMethods.id":399 }
}
}
},
{
"nested": {
"path": "paymentMethods",
"query": {
"term" : { "paymentMethods.id":194 }
}
}
}
]
}
}}
Add one nested clause per required id.
Before, I tried to search with "terms", which returns documents with ANY of the given terms, so I got documents with 194 or 399.
Two term clauses inside a single nested query fail because each nested object is matched on its own, and no single payment method has both ids.
The code above queries the hidden nested documents twice, once for 194 and once for 399, and returns the intersection of both queries: all the documents with both 194 and 399.
(Of course, the second query does not run on all the documents again but runs on the results of the previous filter.)
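The difference between the three attempts can be illustrated without a cluster. The Python sketch below builds one nested clause per id and emulates the matching semantics on two made-up documents; all function names are hypothetical, and the emulation is a simplified model of how nested objects are matched, not Elasticsearch code.

```python
def nested_all_ids(path, field, ids):
    """One nested clause per id, combined with bool/must (AND across nested objects)."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"nested": {"path": path, "query": {"term": {f"{path}.{field}": i}}}}
                    for i in ids
                ]
            }
        }
    }


def matches_terms(doc, ids):
    """terms inside one nested query: some nested object has any of the ids (OR)."""
    return any(pm["id"] in ids for pm in doc["paymentMethods"])


def matches_single_nested_must(doc, ids):
    """Several term clauses in ONE nested query: a single object must carry every id."""
    return any(all(pm["id"] == i for i in ids) for pm in doc["paymentMethods"])


def matches_nested_per_id(doc, ids):
    """One nested query per id: for each id, some nested object carries it (AND)."""
    return all(any(pm["id"] == i for pm in doc["paymentMethods"]) for i in ids)


# Made-up documents for illustration.
both = {"paymentMethods": [{"id": 194, "name": "Wire"}, {"id": 399, "name": "Paper Check"}]}
only_wire = {"paymentMethods": [{"id": 194, "name": "Wire"}]}
wanted = [399, 194]

body = nested_all_ids("paymentMethods", "id", wanted)
print(matches_terms(only_wire, wanted))         # OR semantics lets this document through
print(matches_single_nested_must(both, wanted)) # no single object has both ids
print(matches_nested_per_id(both, wanted), matches_nested_per_id(only_wire, wanted))
```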
