Elasticsearch query behaviour - elasticsearch

I have two different queries against my Elasticsearch index. The difference is that the first one puts both search criteria into a single bool should query, while the second splits them into two separate bool should queries. The first returns the expected response, but the second doesn't match any document, even though there are documents that satisfy both criteria. If I refactor the second one so that the two split bool should queries are wrapped in an outer bool should query, it returns the expected response, just like query 1.
The question is: why doesn't query 2 return the same response as queries 1 and 3? Am I missing something?
EDIT: provided example data
EDIT: my problem is solved. It was just a spelling mistake in my code while building the range query, and I didn't notice it. But maybe the explanation in the answer here will help somebody else.
1.
GET _search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
},
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
2.
GET _search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"minimum_should_match": "1"
}
},
{
"bool": {
"should": [
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
3.
GET _search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"minimum_should_match": "1"
}
},
{
"bool": {
"should": [
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
Example data:
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"streetNr": 90,
"geographicAddress": {
"city": "Berlin"
}
}
},
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"streetNr": 10,
"geographicAddress": {
"city": "Berlin"
}
}
}

Please refer to the official Elasticsearch documentation on the bool query for a detailed understanding of the various clauses.
The structure of your first search query is like -
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{},
{}
],
"minimum_should_match": 1
}
}
]
}
}
}
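Here the filter clause wraps a single bool query whose should clause holds both conditions. Because "minimum_should_match": 1 is set on that bool query, at least one of the two should clauses must match, so a document passes the filter if it satisfies either condition.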
The structure of your second search query is like -
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": {},
"minimum_should_match": "1"
}
},
{
"bool": {
"should": {},
"minimum_should_match": "1"
}
}
]
}
}
}
Here you have added "minimum_should_match": "1" to each bool query, and each should clause contains only one condition, so each bool query effectively behaves like a must clause. The filter clause encloses both bool queries, and every entry in filter must match, so you only get a result when both should clauses match.
The structure of your third search query is like -
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"bool": {
"should": [
{}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{}
],
"minimum_should_match": 1
}
}
],
"minimum_should_match": 1
}
}
]
}
}
}
Here you have nested several bool should clauses. The outer bool should clause wraps two inner bool should clauses, and "minimum_should_match": 1 is set on the outer one. So although a filter clause is present, a document is returned as long as at least one of the inner bool should clauses matches.
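The three structures above can be compared with a small local model. This is only a sketch in plain Python, not Elasticsearch itself: the `matches` helper, the predicate functions, and the three sample documents are hypothetical, and `minimum_should_match` is handled only as an integer (percentages are ignored).

```python
# Simplified model of bool/filter/should semantics (an illustration, not Elasticsearch).
# A query is either a Python predicate or a dict of the form {"bool": {...}}.

def matches(query, doc):
    """Return True if doc satisfies the simplified query."""
    if callable(query):
        return query(doc)
    bool_q = query["bool"]
    filters = bool_q.get("filter", [])
    # Every entry in filter must match (logical AND).
    if not all(matches(clause, doc) for clause in filters):
        return False
    should = bool_q.get("should", [])
    if not should:
        return True
    # Default: 1 if the bool has only should clauses, 0 if filter clauses are present.
    default_msm = 0 if filters else 1
    required = int(bool_q.get("minimum_should_match", default_msm))
    hits = sum(1 for clause in should if matches(clause, doc))
    return hits >= required

def in_range(doc):
    return 1 <= doc["streetNr"] <= 100

def in_berlin(doc):
    return doc["geographicAddress"]["city"] == "Berlin"

def only(clause):
    """A bool query with a single should clause and minimum_should_match 1."""
    return {"bool": {"should": [clause], "minimum_should_match": "1"}}

# Hypothetical sample documents.
docs = [
    {"id": 1, "streetNr": 90, "geographicAddress": {"city": "Berlin"}},
    {"id": 2, "streetNr": 0, "geographicAddress": {"city": "Berlin"}},
    {"id": 3, "streetNr": 50, "geographicAddress": {"city": "Hamburg"}},
]

query_1 = {"bool": {"filter": [
    {"bool": {"should": [in_range, in_berlin], "minimum_should_match": "1"}}]}}
query_2 = {"bool": {"filter": [only(in_range), only(in_berlin)]}}
query_3 = {"bool": {"filter": [
    {"bool": {"should": [only(in_range), only(in_berlin)],
              "minimum_should_match": "1"}}]}}

def matched_ids(query):
    return [doc["id"] for doc in docs if matches(query, doc)]

print(matched_ids(query_1))  # [1, 2, 3]: either condition is enough
print(matched_ids(query_2))  # [1]: both conditions are required
print(matched_ids(query_3))  # [1, 2, 3]: either inner bool is enough
```

In this model, structures 1 and 3 behave like an OR of the two conditions, while structure 2 behaves like an AND.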
Adding a working example with index data, search query, and search result
Index Data:
{
"streetNr":0,
"geographicAddress":{
"city":"Berlin"
}
}
{
"streetNr":90,
"geographicAddress":{
"city":"Berlin"
}
}
Search Query (the second query from your question, with the outer filter replaced by should):
{
"query": {
"bool": {
"should": [ <-- note this
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"minimum_should_match": "1"
}
},
{
"bool": {
"should": [
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"streetNr": 90,
"geographicAddress": {
"city": "Berlin"
}
}
},
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"streetNr": 0,
"geographicAddress": {
"city": "Berlin"
}
}
}
]

With should and minimum_should_match=1, you are saying that a document should be returned if at least one of the criteria matches, which is what queries 1 and 3 do. In the second query, however, you have put the two criteria into filter as separate entries, so Elasticsearch returns only the documents for which both criteria hold. That is why your second query behaves like a must, whereas your other queries behave like a should.

Related

ElasticSearch: Query nested array for empty and specific value in single query

Documents structure -
{
"hits": [
{
"_type": "_doc",
"_id": "ef0a2c44179a513476b080cc2a585d95",
"_source": {
"DIVISION_NUMBER": 44,
"MATCHES": [
{
"MATCH_STATUS": "APPROVED",
"UPDATED_ON": 1599171303000
}
]
}
},
{
"_type": "_doc",
"_id": "ef0a2c44179a513476b080cc2a585d95",
"_source": {
"DIVISION_NUMBER": 44,
"MATCHES": [ ]
}
}
]
}
Question - MATCHES is a nested array; inside it there is a text field MATCH_STATUS that can take values such as "APPROVED" or "REJECTED".
I want to find ALL documents whose MATCH_STATUS is, say, "APPROVED" or "RECOMMENDED", as well as documents that have no data in MATCHES (empty array "MATCHES": [ ]). Please note that I want this in a single query.
I am able to do this in two separate queries like this -
GET all matches with status = RECOMMENDED, APPROVED
"must": [
{
"nested": {
"path": "MATCHES",
"query": {
"terms": {
"MATCHES.MATCH_STATUS.keyword": [
"APPROVED",
"RECOMMENDED"
]
}
}
}
}
]
GET all matches having empty array "MATCHES" : [ ]
{
"size": 5000,
"query": {
"bool": {
"filter": [],
"must_not": [
{
"nested": {
"path": "MATCHES",
"query": {
"exists": {
"field": "MATCHES"
}
}
}
}
]
}
},
"from": 0
}
You can combine both queries using a should clause.
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"nested": {
"path": "MATCHES",
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"terms": {
"MATCHES.MATCH_STATUS.keyword": [
"APPROVED",
"RECOMMENDED"
]
}
}
]
}
}
}
},
{
"bool": {
"must_not": [
{
"nested": {
"path": "MATCHES",
"query": {
"bool": {
"filter": {
"exists": {
"field": "MATCHES"
}
}
}
}
}
}
]
}
}
]
}
}
}
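The logic of the combined query can be written as a plain Python predicate. This is a rough sketch of the intended selection rule only, not of how Elasticsearch evaluates nested queries; the function name and the sample documents are hypothetical.

```python
# Selection rule of the combined query: a document is kept if at least one
# nested MATCHES entry has a wanted status, or if MATCHES is empty or absent.

WANTED = {"APPROVED", "RECOMMENDED"}

def wanted_or_empty(doc, wanted=WANTED):
    entries = doc.get("MATCHES") or []
    has_wanted = any(entry.get("MATCH_STATUS") in wanted for entry in entries)
    return has_wanted or not entries

# Hypothetical sample documents.
docs = [
    {"id": "a", "MATCHES": [{"MATCH_STATUS": "APPROVED"}]},
    {"id": "b", "MATCHES": []},
    {"id": "c", "MATCHES": [{"MATCH_STATUS": "REJECTED"}]},
    {"id": "d", "MATCHES": [{"MATCH_STATUS": "REJECTED"},
                            {"MATCH_STATUS": "RECOMMENDED"}]},
]

selected = [doc["id"] for doc in docs if wanted_or_empty(doc)]
print(selected)  # ['a', 'b', 'd']
```

Document "c" is dropped because it has nested entries but none with a wanted status, which is the case the must_not branch alone would not distinguish.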
Update: to answer your comment.
The missing aggregation does not currently support nested fields; there is an open issue for this.
To get the count of documents with empty matches, you can use a filter aggregation with the nested query wrapped in the must_not clause of a bool query.
{
"aggs": {
"missing_matches_agg": {
"filter": {
"bool": {
"must_not": {
"nested": {
"query": {
"match_all": {}
},
"path": "MATCHES"
}
}
}
}
}
}
}
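A minimal sketch of how the request body and the resulting count might be handled from Python. The helper names and the sample response are hypothetical, and adding "size": 0 (to skip returning hits) is my own assumption rather than part of the answer; a filter aggregation reports its count under doc_count.

```python
# Build the aggregation request body and read the count from a response dict.

def build_empty_matches_body(path="MATCHES", agg_name="missing_matches_agg"):
    return {
        "size": 0,  # assumption: only the aggregation result is wanted, no hits
        "aggs": {
            agg_name: {
                "filter": {
                    "bool": {
                        "must_not": {
                            "nested": {"path": path, "query": {"match_all": {}}}
                        }
                    }
                }
            }
        },
    }

def empty_matches_count(response, agg_name="missing_matches_agg"):
    """Extract the document count of the filter aggregation."""
    return response["aggregations"][agg_name]["doc_count"]

# Hypothetical response fragment, for illustration only.
sample_response = {"aggregations": {"missing_matches_agg": {"doc_count": 1}}}

body = build_empty_matches_body()
print(empty_matches_count(sample_response))
```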

How to search on multiple fields of array in elasticsearch

I have an index in Elasticsearch called professor.
Across different fields I need an AND condition;
across the values of the same field (an array) I need an OR condition.
1. I need to search for subject being Physics or Accounting (OR over an array of values)
AND
2. I need to search for type being Permanent or Guest (OR over an array of values)
AND
3. I need to search for Location being NY (AND condition)
test = [{'id':1,'name': 'A','subject': ['Maths','Accounting'],'type':'Contract', 'Location':'NY'},
{ 'id':2,'name': 'AB','subject': ['Physics','Engineering'],'type':'Permanent','Location':'NY'},
{'id':3,'name': 'ABC','subject': ['Maths','Engineering'],'type':'Permanent','Location':'NY'},
{'id':4,'name':'ABCD','subject': ['Physics','Engineering'],'type':['Contract','Guest'],'Location':'NY'}]
My query is below. It covers the 3rd condition; how do I add the 1st and 2nd?
content_search = es.search(index="professor", body={
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": [
{
"term": {
"Location.keyword": "NY"
}
}
]
}
}
})
content_search['hits']['hits']
Expected output is [{ 'id':2,'name': 'AB','subject': ['Physics','Engineering'],'type':'Permanent','Location':'NY'},{'id':4,'name':'ABCD','subject': ['Physics','Engineering'],'type':['Contract','Guest'],'Location':'NY'}]
The filter clause (query) must appear in matching documents. However
unlike must the score of the query will be ignored. Filter clauses are
executed in filter context, meaning that scoring is ignored and
clauses are considered for caching.
Please go through the Elasticsearch documentation on bool queries to get a detailed understanding of it.
Adding a working example with index data(same as that in question), search query, and search result
Search Query:
{
"query": {
"bool": {
"must": {
"match": {
"Location.keyword": "NY"
}
},
"filter": [
{
"bool": {
"should": [
{
"match": {
"subject.keyword": "Accounting"
}
},
{
"match": {
"subject.keyword": "Physics"
}
}
]
}
},
{
"bool": {
"should": [
{
"match": {
"type.keyword": "Permanent"
}
},
{
"match": {
"type.keyword": "Guest"
}
}
]
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64370980",
"_type": "_doc",
"_id": "2",
"_score": 0.10536051,
"_source": {
"id": 2,
"name": "AB",
"subject": [
"Physics",
"Engineering"
],
"type": "Permanent",
"Location": "NY"
}
},
{
"_index": "stof_64370980",
"_type": "_doc",
"_id": "4",
"_score": 0.10536051,
"_source": {
"id": 4,
"name": "ABCD",
"subject": [
"Physics",
"Engineering"
],
"type": [
"Contract",
"Guest"
],
"Location": "NY"
}
}
]
Another Search Query:
You can also use the terms query, which returns documents that contain
one or more exact terms in a provided field. The terms query is the
same as the term query, except you can search for multiple values.
{
"query": {
"bool": {
"must": [
{
"terms": {
"subject.keyword": [
"Physics",
"Accounting"
]
}
},
{
"terms": {
"type.keyword": [
"Guest",
"Permanent"
]
}
},
{
"match": {
"Location.keyword": "NY"
}
}
]
}
}
}
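The terms-based query above can also be generated from a description of the conditions. The following is a sketch under assumptions: `build_query` and `matches_locally` are hypothetical helpers, the clauses are placed in filter (no scoring) rather than must, and the local check only imitates exact-value matching on the question's test data.

```python
# Generate "OR within a field, AND across fields" from a dict of conditions,
# and check the expected result locally against the question's test data.

def as_list(value):
    return value if isinstance(value, list) else [value]

def build_query(any_of, exact):
    """any_of: field -> accepted values (OR); exact: field -> required value."""
    clauses = [{"terms": {field + ".keyword": list(values)}}
               for field, values in any_of.items()]
    clauses += [{"term": {field + ".keyword": value}}
                for field, value in exact.items()]
    return {"query": {"bool": {"filter": clauses}}}

def matches_locally(doc, any_of, exact):
    any_ok = all(set(as_list(doc[field])) & set(values)
                 for field, values in any_of.items())
    exact_ok = all(value in as_list(doc[field])
                   for field, value in exact.items())
    return any_ok and exact_ok

test = [
    {"id": 1, "name": "A", "subject": ["Maths", "Accounting"], "type": "Contract", "Location": "NY"},
    {"id": 2, "name": "AB", "subject": ["Physics", "Engineering"], "type": "Permanent", "Location": "NY"},
    {"id": 3, "name": "ABC", "subject": ["Maths", "Engineering"], "type": "Permanent", "Location": "NY"},
    {"id": 4, "name": "ABCD", "subject": ["Physics", "Engineering"], "type": ["Contract", "Guest"], "Location": "NY"},
]

any_of = {"subject": ["Physics", "Accounting"], "type": ["Guest", "Permanent"]}
exact = {"Location": "NY"}

body = build_query(any_of, exact)
matched = [doc["id"] for doc in test if matches_locally(doc, any_of, exact)]
print(matched)  # [2, 4]
```

The resulting `body` could then be passed to es.search as in the question.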
Update 1:
{
"query": {
"bool": {
"must": [
{
"terms": {
"subject.keyword": [
"Physics",
"Accounting"
]
}
},
{
"terms": {
"type.keyword": [
"Guest",
"Permanent"
]
}
},
{
"match": {
"Location.keyword": "NY"
}
},
{
"query_string": {
"query": "ABCD"
}
}
]
}
}
}

How to search array of fields in elasticsearch

I have an index in Elasticsearch called professor.
Across different fields I need an AND condition;
across the values of the same field (an array) I need an OR condition.
I need to search for subject being Physics or Accounting (OR over an array of values).
I need to search for type being Permanent (AND condition).
I need to search for Location being NY (AND condition).
There is a chance that type also arrives as a list, e.g. {'type':['Contract','Guest']}.
test = [{'id':1,'name': 'A','subject': ['Maths','Accounting'],'type':'Contract', 'Location':'NY'},
{ 'id':2,'name': 'AB','subject': ['Physics','Engineering'],'type':'Permanent','Location':'NY'},
{'id':3,'name': 'ABC','subject': ['Maths','Engineering'],'type':'Permanent','Location':'NY'}]
My query is below. It covers the 3rd condition; how do I add the 1st and 2nd?
content_search = es.search(index="professor", body={
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": [
{
"term": {
"Location.keyword": "NY"
}
}
]
}
}
})
content_search['hits']['hits']
Expected output is [{ 'id':2,'name': 'AB','subject': ['Physics','Engineering'],'type':'Permanent','Location':'NY'}]
You need to use the bool query to wrap all your conditions.
Adding a working example with index data(same as that in question), search query, and search result
Search Query:
{
"query": {
"bool": {
"must": [
{
"match": {
"type.keyword": "Permanent"
}
},
{
"match": {
"Location.keyword": "NY"
}
}
],
"should": [
{
"match": {
"subject.keyword": "Accounting"
}
},
{
"match": {
"subject.keyword": "Physics"
}
}
],
"minimum_should_match": 1,
"boost": 1.0
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64370980",
"_type": "_doc",
"_id": "2",
"_score": 1.8365774,
"_source": {
"id": 2,
"name": "AB",
"subject": [
"Physics",
"Engineering"
],
"type": "Permanent",
"Location": "NY"
}
}
]
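The query above sets minimum_should_match next to must. The sketch below models why that matters: when a bool query has must (or filter) clauses, its should clauses are optional by default. The `bool_match` and `has` helpers are hypothetical and only imitate exact-value matching on the question's test data.

```python
# Model of must + should + minimum_should_match on the question's test data.

def as_list(value):
    return value if isinstance(value, list) else [value]

def has(field, value):
    """Predicate: the field contains the exact value (scalar or list)."""
    return lambda doc: value in as_list(doc[field])

def bool_match(doc, must=(), should=(), minimum_should_match=None):
    if minimum_should_match is None:
        # Default is 0 when must clauses are present, otherwise 1.
        minimum_should_match = 0 if must else 1
    if not all(clause(doc) for clause in must):
        return False
    if not should:
        return True
    return sum(1 for clause in should if clause(doc)) >= minimum_should_match

test = [
    {"id": 1, "name": "A", "subject": ["Maths", "Accounting"], "type": "Contract", "Location": "NY"},
    {"id": 2, "name": "AB", "subject": ["Physics", "Engineering"], "type": "Permanent", "Location": "NY"},
    {"id": 3, "name": "ABC", "subject": ["Maths", "Engineering"], "type": "Permanent", "Location": "NY"},
]

must = [has("type", "Permanent"), has("Location", "NY")]
should = [has("subject", "Accounting"), has("subject", "Physics")]

with_msm = [d["id"] for d in test if bool_match(d, must, should, minimum_should_match=1)]
without_msm = [d["id"] for d in test if bool_match(d, must, should)]
print(with_msm)     # [2]
print(without_msm)  # [2, 3]: should only affects scoring here, not matching
```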

How to write a conditional in a search query?

I am searching among documents in a particular district. Documents have various statuses. The aim is to return all documents, except when a document's status code is ABCD; such documents should only be returned if their ID is greater than 100. I have tried writing multiple queries, including the one below, which returns only the ABCD documents with ID greater than 100, and none of the other documents. What is wrong here? How can I get the non-ABCD documents as well?
{
"_source": true,
"from": 0,
"size": 50,
"sort": [
{
"firstStamp": "DESC"
}
],
"query": {
"bool": {
"must": [
{
"term": {
"districtId": "3755"
}
},
{
"bool": {
"must": [
{
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
},
{
"bool": {
"must": {
"script": {
"script": "doc['id'].value > 100"
}
}
}
}
]
}
}
]
}
}
}
Since you have not provided an index mapping, I am assuming from your
search query that documentStatus is an object field. As far as I
understand, your aim is to return all documents, except that
documents with status code ABCD should only be returned if their ID
is greater than 100.
Adding a working example with index data, search query, and search result
Index Data:
{
"id":200,
"documentStatus":{
"code":"DEF"
}
}
{
"id":200,
"documentStatus":{
"code":"ABCD"
}
}
{
"id":100,
"documentStatus":{
"code":"ABCD"
}
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
},
{
"bool": {
"must": {
"script": {
"script": "doc['id'].value > 100"
}
}
}
}
]
}
},
{
"bool": {
"must_not": {
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64351595",
"_type": "_doc",
"_id": "2",
"_score": 2.0,
"_source": {
"id": 200,
"documentStatus": {
"code": "ABCD"
}
}
},
{
"_index": "stof_64351595",
"_type": "_doc",
"_id": "3",
"_score": 0.0,
"_source": {
"id": 200,
"documentStatus": {
"code": "DEF"
}
}
}
]
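The condition can also be stated as "status is not ABCD, OR id is greater than 100", combined with the district filter from the question. The sketch below builds such a request body and checks the rule locally. It rests on assumptions: `build_query` and `should_return` are hypothetical helpers, the range query presumes that id is mapped as a numeric field, and the districtId values on the sample documents are made up.

```python
# "All documents in the district, except ABCD documents with id <= 100."

def build_query(district_id, status="ABCD", min_id=100):
    return {
        "query": {
            "bool": {
                "filter": [{"term": {"districtId": district_id}}],
                "should": [
                    {"bool": {"must_not": [
                        {"term": {"documentStatus.code.keyword": status}}]}},
                    # Assumes id is a numeric field.
                    {"range": {"id": {"gt": min_id}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }

def should_return(doc, district_id, status="ABCD", min_id=100):
    """Local version of the same rule, for checking expectations."""
    in_district = doc["districtId"] == district_id
    return in_district and (doc["documentStatus"]["code"] != status
                            or doc["id"] > min_id)

# Index data from the answer, with a hypothetical districtId added.
docs = [
    {"id": 200, "districtId": "3755", "documentStatus": {"code": "DEF"}},
    {"id": 200, "districtId": "3755", "documentStatus": {"code": "ABCD"}},
    {"id": 100, "districtId": "3755", "documentStatus": {"code": "ABCD"}},
]

body = build_query("3755")
returned = [(d["id"], d["documentStatus"]["code"])
            for d in docs if should_return(d, "3755")]
print(returned)  # [(200, 'DEF'), (200, 'ABCD')]
```

This is logically the same selection as the should of "ABCD and id > 100" and "not ABCD" used above, just written with one fewer level of nesting.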
You need to use must_not in your query if you want to get documents that don't have status code = ABCD. So your query would be something like this:
{
"from": 0,
"size": 50,
"sort": [
{
"firstStamp": "DESC"
}
],
"query": {
"bool": {
"must": [
{
"term": {
"districtId": "3755"
}
},
{
"range": {
"id": {
"gt": 100
}
}
}
],
"must_not": [
{
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
}
]
}
}
}

Elasticsearch - Query document missing an array value

I would like to query my Elasticsearch index to retrieve the documents that don't contain a specific value in an array. For instance, if my query is:
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 10,
"sort": [],
"facets": {}
}
And the dataset :
{
"took": 1,
"hits": {
"total": 1,
"hits": [
{
"_index": "product__1434374235336",
"_type": "product",
"_id": "AU33Xeny0K4pKlL-a7sr",
"_source": {
"interdictions": ["S0P","SK3"],
"code": "foo"
}
},
{
"_index": "product__1434374235336",
"_type": "product",
"_id": "AU33Xeny0K4pKlL-a7sr",
"_source": {
"interdictions": ["S0P","S2V","SK3"],
"code": "bar"
}
}
]
}
}
The objective is to exclude every product that contains the "S2V" interdiction. I initially thought of using a missing filter:
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not": [],
"should": []
}
},
"filter": {
"missing": {
"terms": {
"interdictions": [
"S2V"
]
}
}
},
"from": 0,
"size": 10,
"sort": [],
"facets": {}
}
But Elasticsearch fails to parse the query: QueryParsingException[[product__1434374235336] [missing] filter does not support [interdictions]]. I then tried with a must_not:
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not" : {
"terms" : {
"interdictions" : ["S2V"]
}
}
}
},
"from": 0,
"size": 10
}
But the output is incorrect, since it still returns a product with the S2V interdiction.
So, what is the correct way to do this?
Thanks!
Try this (with a lowercase value in the terms query):
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not": {
"terms": {
"interdictions": [
"s2v"
]
}
}
}
},
"from": 0,
"size": 10
}
Most probably you have an analyzer (maybe the default standard one) that lowercases the terms, so in the ES index the values are indexed as s2v, sk3, etc. The terms query doesn't analyze the input value; it uses it as is (in your case with uppercase letters), so it will never match.
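The effect can be reproduced with a toy model. This is a sketch only: `standard_like_analyze` is a hypothetical, heavily simplified stand-in for the standard analyzer (it just splits on non-alphanumeric characters and lowercases), and the terms lookup is modelled as an exact comparison against the indexed tokens.

```python
import re

def standard_like_analyze(text):
    """Rough stand-in for analysis at index time: tokenize and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

def index_terms(values):
    terms = set()
    for value in values:
        terms.update(standard_like_analyze(value))
    return terms

def terms_query_matches(indexed_terms, query_terms):
    # The query terms are NOT analyzed: they are compared as given.
    return any(term in indexed_terms for term in query_terms)

products = {"foo": ["S0P", "SK3"], "bar": ["S0P", "S2V", "SK3"]}
indexed = {code: index_terms(values) for code, values in products.items()}

def products_without(query_terms):
    """Model of must_not + terms: keep products the terms lookup does not match."""
    return sorted(code for code, terms in indexed.items()
                  if not terms_query_matches(terms, query_terms))

print(products_without(["S2V"]))  # ['bar', 'foo']: uppercase never matches, nothing is excluded
print(products_without(["s2v"]))  # ['foo']: lowercase matches the indexed token
```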
