Minimum should match on filtered query - elasticsearch

Is it possible to have a query like this
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and"
}
}
}
}
With the "minimum_should_match": "2" statement?
I know that I can use a simple query (I've tried, it works) but I don't need the score to be computed. My goal is just to filter documents which contains 2 of the values.
Does the score generally heavily impact the time needed to retrieves document?
Using this query:
"query": {
"filtered": {
"filter": {
"terms": {
"names": [
"Anna",
"Mark",
"Joe"
],
"execution" : "and",
"minimum_should_match": "2"
}
}
}
}
I got this error:
QueryParsingException[[my_db] [terms] filter does not support [minimum_should_match]]

Minimum should match is not a parameter for the terms filter. If that is the functionality you are looking for, I might rewrite your query like this, to use the bool query wrapped in a query filter:
{
"filter": {
"query": {
"bool": {
"should": [
{
"term": {
"names": "Anna"
}
},
{
"term": {
"names": "Mark"
}
},
{
"term": {
"name": "Joe"
}
}
],
"minimum_should_match": 2
}
}
}
}
You will get documents matching preferably exactly all three, but the query will also match document with exactly two of the three terms. The must is an implicit and. We also do not compute score, as we have executed the query as a filter.

Related

Elasticsearch search query: nested query with OR-gates & AND-gates

I have docs as follow:
{
"name": "...",
"country": "...",
}
I need to find. either one of the following criteria:
name=John AND country=US
name=Andy AND country=UK
How should be write this nested query?
Assuming the default fields mapping is defined, you can use boolean queries as follows:
{
"query": {
"bool": {
"should": [
{
"bool": {
"filter": [
{
"term": {
"name.keyword": "John"
}
},
{
"term": {
"country.keyword": "US"
}
}
]
}
},
{
"bool": {
"filter": [
{
"term": {
"name.keyword": "Andy"
}
},
{
"term": {
"country.keyword": "UK"
}
}
]
}
}
]
}
}
}
You should use must instead of filter if you want the query to contribute to the score.
must
The clause (query) must appear in matching documents and will
contribute to the score.
filter
The clause (query) must appear in matching documents. However unlike
must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.

Elastic Search query with should returning 10.000 results but nothing matches

So I have an index of about 60GB data and basically I want to make a query to retrieve 1 specific product based off its reference number.
here is my query:
GET myindex/_search
{
"_source": [
"product.ref",
"product.urls.*",
"product.i18ns.*.title",
"product_sale_elements.quantity",
"product_sale_elements.prices.*.price",
"product_sale_elements.listen_price.*",
"product.images.image_url",
"product.image_count",
"product.images.visible",
"product.images.position"
],
"size": "6",
"from": "0",
"query": {
"function_score": {
"functions": [
{
"field_value_factor": {
"field": "product.sales_count",
"missing": 0,
"modifier": "log1p"
}
},
{
"field_value_factor": {
"field": "product.image_count",
"missing": 0,
"modifier": "log1p"
}
},
{
"field_value_factor": {
"field": "featureCount",
"missing": 0,
"modifier": "log1p"
}
}
],
"query": {
"bool": {
"filter": [
{
"term": {
"product.is_visible": true
}
}
],
"should": [
{
"query_string": {
"default_field": "product.ref",
"query": "13141000",
"boost": 2
}
}
]
}
}
}
},
"aggs": {
"by_categories": {
"terms": {
"field": "categories.i18ns.de_DE.title.raw",
"size": 100
}
}
}
}
My question therefore is, why does this query give me back 10k results whereas I just wanted the 1 single product with that reference number.
If I do:
GET my-index/_search
{
"query": {
"match": {
"product.ref": "13141000"
}
}
}
it matches correctly. How is should different then a normal match query?
If you have must or filter clauses, as you do, then anything than matches must or filter does not have to match your should clause, since it's considered "optional"
You can either move query_string within your should clause to filter or set minimum_should_match to 1 like this
...
"should": [
{
"query_string": {
"default_field": "product.ref",
"query": "13141000",
"boost": 2
}
}
],
"minimum_should_match" : 1,
...
Must - The condition must match.
Should - If the condition matches, then it will improve the score in a non-filter context. (If minimum_should_match is not declared explicitly)
As you can see, must is similar to filter but also provides scoring. Filter will not be providing any scoring.
You can put this clause inside a new must clause:
{
"query_string": {
"default_field": "product.ref",
"query": "13141000",
"boost": 2
}
}
Boost will not effect scoring if you put the above inside the filter clause.
Read more about bool queries here

How can we use exists query in tandem with the search query?

I have a scenario in Elasticsearch where my indexed docs are like this :-
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123"}
{"id":1,"name":"xyz", "address": "xyz123", "note": "imp"}
Here the requirement stress that we have to do a term match query and then provide relevance score to them which is a straight forward thing but the additional aspect here is if any doc found in search result has note field then it should be given higher relevance. How can we achieve it with DSL query? Using exists we can check which docs contain notes but how to integrate with match query in ES query. Have tried lot of ways but none worked.
With ES 5, you could boost your exists query to give a higher score to documents with a note field. For example,
{
"query": {
"bool": {
"must": {
"match": {
"name": {
"query": "your term"
}
}
},
"should": {
"exists": {
"field": "note",
"boost": 4
}
}
}
}
}
With ES 2, you could try a boosted filtered subset
{
"query": {
"function_score": {
"query": {
"match": { "name": "your term" }
},
"functions": [
{
"filter": { "exists" : { "field" : "note" }},
"weight": 4
}
],
"score_mode": "sum"
}
}
}
I believe that you are looking for boosting query feature
https://www.elastic.co/guide/en/elasticsearch/reference/5.1/query-dsl-boosting-query.html
{
"query": {
"boosting": {
"positive": {
<put yours original query here>
},
"negative": {
"filtered": {
"filter": {
"exists": {
"field": "note"
}
}
}
},
"negative_boost": 4
}
}
}

Elastic Search : Match Query not working in Nested Bool Filters

I am able to get data for the following elastic search query :
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"term": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
However, If I query using "match" - I get error message with 400 status response
{
"query": {
"filtered": {
"query": [],
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match": {
"gender": "malE"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
]
}
}
}
}
}
Is match query not supported in nested bool filters ?
Since the term query looks for the exact term in the field’s inverted index and I want to query gender data as case_insensitive field - Which approach shall I try ?
Settings of the index :
{
"settings": {
"index": {
"analysis": {
"analyzer": {
"analyzer_keyword": {
"tokenizer": "keyword",
"filter": "lowercase"
}
}
}
}
}
}
Mapping for field Gender:
{"type":"string","analyzer":"analyzer_keyword"}
The reason you're getting an error 400 is because there is no match filter, only match queries, even though there are both term queries and term filters.
Your query can be as simple as this, i.e. no need for a filtered query, simply put your term and match queries into a bool/should:
{
"query": {
"bool": {
"should": [
{
"match": {
"gender": "male"
}
},
{
"term": {
"sentiment": "positive"
}
}
]
}
}
}
This answer is for ElasticSearch 7.x. As I understand from the question, you would like to use a match query for the gender field and a term query for the sentiment field. The mappings for each of these field should look like below:
"sentiment": {
"type": "keyword"
},
"gender": {
"type": "text"
}
The corresponding search API would be:
"query": {
"bool": {
"must": [
{
"terms": {
"sentiment": [
"very positive", "positive"
]
}
},
{
"match": {
"gender": "malE"
}
}
]
}
}
This search API returns all the documents where gender is "Male"/"MALE"/"mALe" etc. So, you may have indexed the gender field holding "mALe", but, the match query for "gender": "malE" will still be able to retrieve it. In the latest version of ElasticSearch, if the query is a match type, the value (which is "gender": "malE") will be automatically lower cased internally before search begins. But, it should not be that tough for a client of the API to pass a lowercase to the match query at the onset itself. Coming to the sentiment field, since, its a keyword field, you can search for values that contain spaces too like very positive.

Elasticsearch multi term filter

I'm quite new to Elasticsearch, so here's my question.
I wanna do a search query with elasticsearch and wanna filter with multiple terms.
If I want to search for a user 'tom', then I would like to have all the matches where the user 'isActive = 1', 'isPrivate = 0' and 'isOwner = 1'.
Here's my search query
"query":{
"filtered": {
"query": {
"query_string": {
"query":"*tom*",
"default_operator": "OR",
"fields": ["username"]
}
},
"filter": {
"term": {
"isActive": "1",
"isPrivate": "0",
"isOwner": "1"
}
}
}
}
When I use 2 terms, it works like a charm, but when i use 3 terms it doesn't.
Thanks for the help!!
You should use bool filter to AND all your terms:
"query":{
"filtered": {
"query": {
"query_string": {
"query":"*tom*",
"default_operator": "OR",
"fields": ["username"]
}
},
"filter": {
"bool" : {
"must" : [
{"term" : { "isActive" : "1" } },
{"term" : { "isPrivate" : "0" } },
{"term" : { "isOwner" : "1" } }
]
}
}
}
}
For version 2.x+ you can use bool query instead of filtered query with some simple replacement: https://www.elastic.co/guide/en/elasticsearch/reference/7.4/query-dsl-filtered-query.html
As one of the comments says, the syntax has changed in recent ES versions. If you are using Elasticsearch 6.+, and you want to use a wildcard and a sequence of terms in your query (such as in the question), you can use something like this:
GET your_index/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"your_field_name_1": {
"value": "tom*"
}
}
},
{
"term": {
"your_field_name_2": {
"value": "US"
}
}
},
{
"term": {
"your_field_name_3": {
"value": "Michigan"
}
}
},
{
"term": {
"your_field_name_4": {
"value": "0"
}
}
}
]
}
}
}
Also, from the documentation about wildcard queries:
Note that this query can be slow, as it needs to iterate over many
terms. In order to prevent extremely slow wildcard queries, a wildcard
term should not start with one of the wildcards * or ?.
I hope this helps.

Resources