Bool filter and SHOULD and MUST combinations - elasticsearch

I am a little confused about the usage of SHOULD and MUST in bool queries. When you have several filters in SHOULD and MUST clauses, can they be placed at the same level, or should they be nested?
Below is a simplified version of my data and the two queries that I tested, the first one failing and the second one working. In practice, I have many filters in MUST and SHOULD.
I am starting to believe that if one wants to combine several SHOULD and MUST filters, the outer one must always be SHOULD. Is this a correct assumption? And if I wanted to use a MUST_NOT, where should it be placed in this context?
My data:
_index,_type,_id,_score,_source.id,_source.type,_source.valueType,_source.sentence,_source.location
"test","var","0","1","0","study","text","Lorem text is jumbled","spain"
"test","var","1","1","1","study","text","bla bla bla","spain"
"test","var","2","1","2","schema","decimal","ipsum","germany"
"test","var","3","1","3","study","integer","lorem","france"
Here is the failing query:
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": {
            "terms": {
              "location": ["germany"]
            }
          },
          "should": {
            "terms": {
              "valueType": ["integer"]
            }
          }
        }
      }
    }
  }
}
Here is my WORKING query returning IDs 2 and 3:
{
  "query": {
    "bool": {
      "should": [
        {
          "terms": {
            "location": ["germany"]
          }
        },
        {
          "bool": {
            "must": [
              {
                "terms": {
                  "valueType": ["integer"]
                }
              }
            ]
          }
        }
      ]
    }
  }
}
Many thanks.

First you need to understand what the clauses of a compound filter mean:
must clauses are required (AND)
should clauses are optional (OR)
So in the first query you put Cond 1 in must (AND), which means every document in the result set has to match it. Cond 2 is in should (OR), so a document in the result set may or may not match it.
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": {
            ....... Cond 1
          },
          "should": {
            ....... Cond 2
          }
        }
      }
    }
  }
}
Your second query works because both conditions are in should, so a document is returned if it matches Cond 1 OR Cond 2.
{
  "query": {
    "bool": {
      "should": [ // OR
        {
          ...... Cond 1
        },
        {
          ...... Cond 2
        }
      ]
    }
  }
}
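To make the AND/OR reading concrete, here is a small sketch in plain Python over the sample rows from the question. It is only a toy model of the boolean logic, not Elasticsearch itself, and the helper names (`terms`, `ids_matching`) are made up for illustration.

```python
# Toy model of the answer's reading: must = AND, should = OR.
# Plain Python over the question's sample rows, not Elasticsearch.
docs = [
    {"id": "0", "type": "study", "valueType": "text", "location": "spain"},
    {"id": "1", "type": "study", "valueType": "text", "location": "spain"},
    {"id": "2", "type": "schema", "valueType": "decimal", "location": "germany"},
    {"id": "3", "type": "study", "valueType": "integer", "location": "france"},
]


def terms(field, values):
    """Hypothetical helper: true when the field holds any of the values."""
    return lambda doc: doc[field] in values


cond1 = terms("location", ["germany"])
cond2 = terms("valueType", ["integer"])


def ids_matching(predicate):
    """Return the ids of the sample rows accepted by the predicate."""
    return [doc["id"] for doc in docs if predicate(doc)]


# Both conditions required (AND): no sample row is both germany and integer.
both_required = ids_matching(lambda d: all(c(d) for c in (cond1, cond2)))

# Either condition is enough (OR): the shape of the working query.
either = ids_matching(lambda d: any(c(d) for c in (cond1, cond2)))

print(both_required)  # []
print(either)         # ['2', '3']
```

The OR case reproduces the IDs 2 and 3 that the working query returns. A must_not clause would fit this model as an extra `not cond(d)` test applied to every candidate.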

Related

Elasticsearch combine term and range query on nested key/value data

I have ES documents structured in a flat data structure using the nested data type, as they accept arbitrary JSON that we don't control, and we need to avoid a mapping explosion. Here's an example document:
{
  "doc_flat": [
    {
      "key": "timestamp",
      "type": "date",
      "key_type": "timestamp.date",
      "value_date": [
        "2023-01-20T12:00:00Z"
      ]
    },
    {
      "key": "status",
      "type": "string",
      "key_type": "status.string",
      "value_string": [
        "warning"
      ]
    },
    ... more arbitrary fields ...
  ]
}
I've figured out how to query this nested data set to find matches on this arbitrary nested data, using a query such as:
{
  "query": {
    "nested": {
      "path": "doc_flat",
      "query": {
        "bool": {
          "must": [
            {"term": {"doc_flat.key": "status"}},
            {"term": {"doc_flat.value_string": "warning"}}
          ]
        }
      }
    }
  }
}
And I figured out how to find documents matching a particular date range:
{
  "query": {
    "nested": {
      "path": "doc_flat",
      "query": {
        "bool": {
          "must": [
            {"term": {"doc_flat.key": "timestamp"}},
            {
              "range": {
                "doc_flat.value_date": {
                  "gte": "2023-01-20T00:00:00Z",
                  "lte": "2023-01-21T00:00:00Z"
                }
              }
            }
          ]
        }
      }
    }
  }
}
But I'm struggling to combine these two queries in order to search for documents that have nested documents matching both of these conditions:
a doc_flat.key of status, and a doc_flat.value_string of warning
a doc_flat.key of timestamp, and a doc_flat.value_date in a range
Obviously I can't just shove the second set of query filters into the same must array, because then no documents will match: a single nested document cannot have both keys. I think I need to go "one level higher" in my query and wrap it in another bool query? But I can't get my head around how that would look.
Have you tried two nested queries inside a bool query?
{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "doc_flat",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "doc_flat.key": "timestamp"
                    }
                  },
                  {
                    "range": {
                      "doc_flat.value_date": {
                        "gte": "2023-01-20T00:00:00Z",
                        "lte": "2023-01-21T00:00:00Z"
                      }
                    }
                  }
                ]
              }
            }
          }
        }
      ],
      "must": [
        {
          "nested": {
            "path": "doc_flat",
            "query": {
              "bool": {
                "must": [
                  {
                    "term": {
                      "doc_flat.key": "status"
                    }
                  },
                  {
                    "term": {
                      "doc_flat.value_string": "warning"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
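The reason two separate nested clauses are needed can be sketched in plain Python. This is only a toy model that assumes each nested clause is checked against one nested object at a time; the function names are hypothetical and the data is the example document from the question.

```python
# Toy model: a nested clause tests each nested object on its own.
doc_flat = [
    {"key": "timestamp", "value_date": ["2023-01-20T12:00:00Z"]},
    {"key": "status", "value_string": ["warning"]},
]


def is_status_warning(obj):
    """Hypothetical stand-in for the status/warning term pair."""
    return obj["key"] == "status" and "warning" in obj.get("value_string", [])


def is_timestamp_in_range(obj):
    """Hypothetical stand-in for the timestamp term plus range clause."""
    # ISO-8601 UTC strings in the same format compare correctly as text.
    return obj["key"] == "timestamp" and any(
        "2023-01-20T00:00:00Z" <= value <= "2023-01-21T00:00:00Z"
        for value in obj.get("value_date", [])
    )


# One nested clause holding all four conditions:
# a single nested object would have to satisfy everything at once.
one_clause = any(
    is_status_warning(obj) and is_timestamp_in_range(obj) for obj in doc_flat
)

# Two nested clauses: each one may be satisfied by a different nested object.
two_clauses = any(is_status_warning(obj) for obj in doc_flat) and any(
    is_timestamp_in_range(obj) for obj in doc_flat
)

print(one_clause, two_clauses)  # False True
```

In the query above, the outer filter and must arrays play the role of the `and` between the two `any(...)` calls.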

Elasticsearch constant_score wrapped inside must does not return expected result

I have the following ES query:
{
  "query": {
    "bool": {
      "should": [
        {
          "constant_score": {
            "boost": 5,
            "filter": {
              "bool": {
                "must": [
                  {
                    "ids": {
                      "values": ["winnerAthlete-A"]
                    }
                  },
                  {
                    "dis_max": {
                      "queries": [
                        {
                          "bool": {
                            "filter": {
                              "term": {"isAthlete": true}
                            }
                          }
                        },
                        {
                          "bool": {
                            "filter": {
                              "term": {"isWinner": true}
                            }
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "constant_score": {
            "boost": 4,
            "filter": {
              "bool": {
                "must": [
                  {
                    "ids": {
                      "values": ["winnerAthlete-B"]
                    }
                  },
                  {
                    "dis_max": {
                      "queries": [
                        {
                          "bool": {
                            "filter": {
                              "term": {"isAthlete": true}
                            }
                          }
                        },
                        {
                          "bool": {
                            "filter": {
                              "term": {"isWinner": true}
                            }
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
It returns the result I expect: the 2 documents winnerAthlete-A and winnerAthlete-B, with a score of 5.0 for winnerAthlete-A and a score of 4.0 for winnerAthlete-B.
Now, when I turn the outermost should of the query into a must, the query does not match any document, whereas I would expect the exact same result. I can't wrap my head around why. I have tried using the ES _explain keyword to understand why this query doesn't match when using must, but it didn't help me.
Any idea why this query rewritten with a must does not return anything, whereas the should version returns the expected result?
Should works like "OR". It returns a document that matches any of the clauses.
Must works like "AND". A document has to satisfy all of the clauses.
Your query does not return any result because no single document has both the ID winnerAthlete-A and the ID winnerAthlete-B.
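The same point as a small sketch in plain Python. It is a toy model only: it assumes both documents pass the inner dis_max part (the question says the should version returns both), so each top-level clause is reduced to its ids check.

```python
# Toy model of the top-level bool: one clause per document ID.
doc_ids = ["winnerAthlete-A", "winnerAthlete-B"]

clauses = [
    lambda doc_id: doc_id == "winnerAthlete-A",  # first constant_score clause
    lambda doc_id: doc_id == "winnerAthlete-B",  # second constant_score clause
]

# should: a document is kept if ANY clause matches it.
should_hits = [d for d in doc_ids if any(clause(d) for clause in clauses)]

# must: a document is kept only if ALL clauses match it.
must_hits = [d for d in doc_ids if all(clause(d) for clause in clauses)]

print(should_hits)  # ['winnerAthlete-A', 'winnerAthlete-B']
print(must_hits)    # []
```

A document has exactly one ID, so the `all(...)` test can never pass for two different IDs.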

Complex Nested Query Where to Place Bool Match

I have a complex index within Elastic that I need to query by 3 parameters.
Thanks to this answered question I am able to query by 2 of the 3 parameters; however, the 3rd parameter is not at the same nested level as the other two.
The schema looks like this...
The following query works for 2 of the 3 parameters...
But the 3rd parameter is at a different level than the other two, so this query does not return the expected document.
Given that the bool match query for "boundedContexts.aggregateRoot.aggregateType.name" is at a different nested level, how would I write this query so that it queries on that field?
This works...
{
  "query": {
    "nested": {
      "path": "boundedContexts",
      "query": {
        "nested": {
          "path": "boundedContexts.aggregateRoots",
          "query": {
            "bool": {
              "must": [
                { "match": { "boundedContexts.aggregateRoots.aggregateType.name": "Aggregate" } },
                {
                  "nested": {
                    "path": "boundedContexts.aggregateRoots.modelMetaData",
                    "query": {
                      "bool": {
                        "must": [
                          { "match": { "boundedContexts.aggregateRoots.modelMetaData.modelReferenceId": "4e7c5c0e-93a7-4bf6-9705-cf1327760e21" } },
                          { "match": { "boundedContexts.aggregateRoots.modelMetaData.modelType.name": "AggregateRoot" } }
                        ]
                      }
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  },
  "size": 1,
  "sort": [
    {
      "generatedDate": {
        "order": "desc"
      }
    }
  ]
}

How do I write a search query that performs multiple tasks in Elasticsearch?

I have read the Elasticsearch documentation. I also took a course. My question is: how do I write one query to handle all my tasks? I learn by example, and the documentation doesn't have many examples. I wrote what I think may be how to accomplish this, but I'm not sure I'm doing it correctly.
The ... is where I would put a match query of some sort.
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": {
              ...
            },
            "should": {
              ...
            }
          }
        },
        {
          "bool": {
            "query_string": {
              ...
            }
          }
        },
        {
          "bool": {
            ...
          }
        },
        {
          "bool": {
            "must": {
              ...
            },
            "should": {
              ...
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
Is this how I would do it?
A bool query takes must, filter, should and must_not clauses directly, so you don't have to wrap every part in another bool. Inside each of them you can of course write another bool query.
You are right about minimum_should_match: it goes in the same bool, right next to the should part. Your query should look like this:
{
  "query": {
    "bool": {
      "should": [
        { "query_string": ... },
        { "terms": ... },
        { "bool": ... },
        {
          "bool": {
            "must": [
              { "query_string": ... },
              { "bool": .... }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
There are good examples here:
https://www.compose.com/articles/elasticsearch-query-time-strategies-and-techniques-for-relevance-part-i/
https://hdmetor.github.io/how-to-combine-queries-in-es/
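For what minimum_should_match does to the should list, a rough sketch in plain Python may help. It is a toy model, not Elasticsearch: it only counts how many should clauses a document satisfies and compares that to the threshold. The document, its fields and the clauses are all invented for the example.

```python
def bool_should_matches(doc, should_clauses, minimum_should_match=1):
    """Toy model: true when enough should clauses match the document."""
    matched = sum(1 for clause in should_clauses if clause(doc))
    return matched >= minimum_should_match


# Hypothetical document and clauses, for illustration only.
doc = {"title": "quick brown fox", "tags": ["animal"]}
clauses = [
    lambda d: "fox" in d["title"],   # matches
    lambda d: "plant" in d["tags"],  # does not match
    lambda d: "dog" in d["title"],   # does not match
]

one_needed = bool_should_matches(doc, clauses, minimum_should_match=1)
two_needed = bool_should_matches(doc, clauses, minimum_should_match=2)

print(one_needed)  # True
print(two_needed)  # False
```

Each entry in the should array, whether a query_string, a terms query or a nested bool, counts as one clause in this model.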

Elasticsearch Array

I have the following values in my documents.
"ReturnCode": [ "0", "0" ]
"ReturnCode": [ "0", "1" ]
If I search for 0,0 it should return the 1st document, and if I search for 0,1 it should return the 2nd document. I am trying the following query, but it does not give the correct result. The result must match all array elements.
GET test/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "terms": { "ReturnCode": ["0", "1"] }
            }
          ]
        }
      }
    }
  }
}
Thanks
A terms query is an OR query: it matches a document if the field contains any of the listed values.
GET test/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "term": { "ReturnCode": "0" }
            },
            {
              "term": { "ReturnCode": "1" }
            }
          ]
        }
      }
    }
  }
}
You need to create individual term queries inside the must clause, as above.
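The difference between the two queries can be sketched in plain Python over the two ReturnCode arrays from the question. This is a toy model of the OR/AND logic only; the document names `first` and `second` and the helper names are made up.

```python
# Toy model: terms = any listed value present, must of term = all present.
docs = {
    "first": ["0", "0"],
    "second": ["0", "1"],
}


def terms_matches(values, wanted):
    """OR: at least one wanted value appears in the field."""
    return any(w in values for w in wanted)


def must_term_matches(values, wanted):
    """AND: every wanted value appears in the field."""
    return all(w in values for w in wanted)


terms_hits = [name for name, codes in docs.items()
              if terms_matches(codes, ["0", "1"])]
must_hits = [name for name, codes in docs.items()
             if must_term_matches(codes, ["0", "1"])]

print(terms_hits)  # ['first', 'second']
print(must_hits)   # ['second']
```

Note that this model only checks whether a value is present, not how many times, so asking for 0 and 0 would still accept both arrays.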
