I would like to query my Elasticsearch index to retrieve the documents that don't contain a specific value in an array. For instance, if my query is:
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 10,
"sort": [],
"facets": {}
}
And the dataset:
{
"took": 1,
"hits": {
"total": 1,
"hits": [
{
"_index": "product__1434374235336",
"_type": "product",
"_id": "AU33Xeny0K4pKlL-a7sr",
"_source": {
"interdictions": ["S0P","SK3"],
"code": "foo"
}
},
{
"_index": "product__1434374235336",
"_type": "product",
"_id": "AU33Xeny0K4pKlL-a7sr",
"_source": {
"interdictions": ["S0P","S2V","SK3"],
"code": "bar"
}
}
]
}
}
The objective is to exclude every product that contains the "S2V" interdiction. I initially thought of using a missing filter:
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not": [],
"should": []
}
},
"filter": {
"missing": {
"terms": {
"interdictions": [
"S2V"
]
}
}
},
"from": 0,
"size": 10,
"sort": [],
"facets": {}
}
But Elasticsearch fails to parse the query: QueryParsingException[[product__1434374235336] [missing] filter does not support [interdictions]]. I then tried with a must_not:
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not" : {
"terms" : {
"interdictions" : ["S2V"]
}
}
}
},
"from": 0,
"size": 10
}
But the output is incorrect, since it returns a product with the S2V interdiction.
So, what is the correct way to do this?
Thanks!
Try this (with a lowercase value in the terms query):
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"must_not": {
"terms": {
"interdictions": [
"s2v"
]
}
}
}
},
"from": 0,
"size": 10
}
Most probably, the field has an analyzer (perhaps the default standard one) that lowercases the terms, so the values are stored in the index as s2v, sk3, etc. The terms query does not analyze its input; it uses the value as is (in your case, with uppercase letters), so it never matches.
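To make the mismatch concrete, here is a minimal Python sketch. It does not call Elasticsearch: `standard_analyze` is a rough, hypothetical stand-in for the standard analyzer (split on non-alphanumeric characters, then lowercase), and `terms_query_matches` mimics the fact that a terms query compares its input verbatim.

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]

def index_field(values):
    """Collect the terms that would land in the inverted index for an array field."""
    terms = set()
    for value in values:
        terms.update(standard_analyze(value))
    return terms

def terms_query_matches(indexed_terms, query_values):
    """A terms query does not analyze its input; it looks up each value as given."""
    return any(value in indexed_terms for value in query_values)

indexed = index_field(["S0P", "S2V", "SK3"])
print(sorted(indexed))                        # the lowercased terms
print(terms_query_matches(indexed, ["S2V"]))  # uppercase input does not match
print(terms_query_matches(indexed, ["s2v"]))  # lowercase input matches
```

If the field were mapped as not analyzed (or queried through a keyword sub-field), the original uppercase value would be the one to use instead.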
Documents structure -
{
"hits": [
{
"_type": "_doc",
"_id": "ef0a2c44179a513476b080cc2a585d95",
"_source": {
"DIVISION_NUMBER": 44,
"MATCHES": [
{
"MATCH_STATUS": "APPROVED",
"UPDATED_ON": 1599171303000
}
]
}
},
{
"_type": "_doc",
"_id": "ef0a2c44179a513476b080cc2a585d95",
"_source": {
"DIVISION_NUMBER": 44,
"MATCHES": [ ]
}
}
]
}
Question - MATCHES is a nested array. Inside it there is a text field, MATCH_STATUS, that can hold values such as "APPROVED" or "REJECTED".
I want to find ALL documents whose MATCH_STATUS is, say, "APPROVED" or "RECOMMENDED", as well as documents where MATCHES has no data (empty array "MATCHES": [ ]). Please note that I want this in a single query.
I am able to do this in two separate queries like this -
GET all matches with status = RECOMMENDED, APPROVED
"must": [
{
"nested": {
"path": "MATCHES",
"query": {
"terms": {
"MATCHES.MATCH_STATUS.keyword": [
"APPROVED",
"RECOMMENDED"
]
}
}
}
}
]
GET all matches having empty array "MATCHES" : [ ]
{
"size": 5000,
"query": {
"bool": {
"filter": [],
"must_not": [
{
"nested": {
"path": "MATCHES",
"query": {
"exists": {
"field": "MATCHES"
}
}
}
}
]
}
},
"from": 0
}
You can combine both queries using a should clause.
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"nested": {
"path": "MATCHES",
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"terms": {
"MATCHES.MATCH_STATUS.keyword": [
"APPROVED",
"RECOMMENDED"
]
}
}
]
}
}
}
},
{
"bool": {
"must_not": [
{
"nested": {
"path": "MATCHES",
"query": {
"bool": {
"filter": {
"exists": {
"field": "MATCHES"
}
}
}
}
}
}
]
}
}
]
}
}
}
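The intent of the combined query can be sketched in plain Python. This is only an illustration of the OR logic, not Elasticsearch code; the helper names are hypothetical, and the third document (status REJECTED) is invented to show a non-matching case.

```python
def has_status(doc, statuses):
    """Mirrors the nested terms branch: some MATCHES entry has a wanted status."""
    return any(match.get("MATCH_STATUS") in statuses for match in doc.get("MATCHES", []))

def has_no_matches(doc):
    """Mirrors the must_not/nested/exists branch: MATCHES is empty or absent."""
    return len(doc.get("MATCHES", [])) == 0

def combined(doc, statuses):
    """should with minimum_should_match 1: either branch is enough."""
    return has_status(doc, statuses) or has_no_matches(doc)

docs = [
    {"DIVISION_NUMBER": 44, "MATCHES": [{"MATCH_STATUS": "APPROVED", "UPDATED_ON": 1599171303000}]},
    {"DIVISION_NUMBER": 44, "MATCHES": []},
    {"DIVISION_NUMBER": 44, "MATCHES": [{"MATCH_STATUS": "REJECTED"}]},  # hypothetical
]

wanted = {"APPROVED", "RECOMMENDED"}
selected = [doc for doc in docs if combined(doc, wanted)]
print(len(selected))  # the first two documents are selected
```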
Update: To answer your comment.
The missing aggregation does not support nested fields for now; there is an open issue for this.
To get the count of empty matches, you can use a filter aggregation with the nested query wrapped in the must_not clause of a bool query.
{
"aggs": {
"missing_matches_agg": {
"filter": {
"bool": {
"must_not": {
"nested": {
"query": {
"match_all": {}
},
"path": "MATCHES"
}
}
}
}
}
}
}
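As a rough sketch of how the count could be read on the client side, the Python below builds the same aggregation body and extracts `doc_count` from a response. Two things here are assumptions rather than part of the answer: the added `"size": 0` (to skip returning hits) and the `sample_response` dictionary, which is a hand-written fragment shaped like a filter aggregation result.

```python
import json

def build_empty_matches_count_request():
    """Request body for counting documents that have no nested MATCHES entries."""
    return {
        "size": 0,  # assumption: only the aggregation is needed, not the hits
        "aggs": {
            "missing_matches_agg": {
                "filter": {
                    "bool": {
                        "must_not": {
                            "nested": {"path": "MATCHES", "query": {"match_all": {}}}
                        }
                    }
                }
            }
        },
    }

def read_empty_matches_count(response):
    """A filter aggregation reports its bucket size under doc_count."""
    return response["aggregations"]["missing_matches_agg"]["doc_count"]

# Hypothetical response fragment, written by hand for illustration.
sample_response = {"aggregations": {"missing_matches_agg": {"doc_count": 1}}}

print(json.dumps(build_empty_matches_count_request(), indent=2))
print(read_empty_matches_count(sample_response))
```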
I've got two different queries against my Elasticsearch. The difference between them is that the first one puts the two search criteria into one bool should query, while the second splits them into two separate bool should queries. The first one returns the expected response, but the second one doesn't match any document, even though there are documents that satisfy both criteria. If I refactor the second one so that the two split bool should queries are wrapped in an outer bool should query, it returns the expected response, just like query 1.
The question is: why doesn't query 2 return the same response as queries 1 and 3? Am I missing something?
EDIT: provided example data
EDIT: my problem is solved. It was just a spelling mistake I made while building the range query in my code, which I did not notice. Maybe the explanation in the answer here will still help somebody else.
1.
GET _search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
},
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
2.
GET _search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"minimum_should_match": "1"
}
},
{
"bool": {
"should": [
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
3.
GET _search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"minimum_should_match": "1"
}
},
{
"bool": {
"should": [
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
Example data:
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"streetNr": 90,
"geographicAddress": {
"city": "Berlin"
}
}
},
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"streetNr": 10,
"geographicAddress": {
"city": "Berlin"
}
}
}
Please refer to the official ES documentation on the bool query for a detailed understanding of its clauses.
The structure of your first search query is like -
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{},
{}
],
"minimum_should_match": 1
}
}
]
}
}
}
The filter clause wraps a bool query with two should clauses, and "minimum_should_match": 1 on that bool query means that at least one of the should clauses must match.
The structure of your second search query is like -
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": {},
"minimum_should_match": "1"
}
},
{
"bool": {
"should": {},
"minimum_should_match": "1"
}
}
]
}
}
}
Here, because you have added "minimum_should_match": "1" to each bool query and each one contains only a single should clause, each effectively acts like a must clause. The filter clause encloses both bool queries, so you get a result only when both should clauses match.
The structure of your third search query is like -
{
"query": {
"bool": {
"filter": [
{
"bool": {
"should": [
{
"bool": {
"should": [
{}
],
"minimum_should_match": 1
}
},
{
"bool": {
"should": [
{}
],
"minimum_should_match": 1
}
}
],
"minimum_should_match": 1
}
}
]
}
}
}
Here you have nested bool should clauses: the outer bool should clause wraps two more bool should clauses, and the outer one also carries "minimum_should_match": 1. So even though the filter clause is present, a document is returned as soon as one of the inner bool should clauses matches.
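The difference between the three structures can be checked with a small Python sketch. It is not Elasticsearch code: `matches` is a hypothetical evaluator that covers only filter, should, and minimum_should_match, and the two lambdas stand in for the range and match clauses. The third document (Hamburg) is invented for illustration.

```python
def matches(query, doc):
    """Evaluate a tiny subset of bool semantics against one document."""
    if callable(query):  # leaf predicate standing in for range/match
        return query(doc)
    body = query["bool"]
    filters = body.get("filter", [])
    should = body.get("should", [])
    if not all(matches(clause, doc) for clause in filters):
        return False  # every filter clause is mandatory
    if not should:
        return True
    default_min = 0 if filters else 1  # default when minimum_should_match is absent
    needed = int(body.get("minimum_should_match", default_min))
    return sum(1 for clause in should if matches(clause, doc)) >= needed

in_range = lambda doc: 1 <= doc["streetNr"] <= 100
in_berlin = lambda doc: doc["city"] == "Berlin"

def one_should(clause):
    """A bool query with a single should clause and minimum_should_match 1."""
    return {"bool": {"should": [clause], "minimum_should_match": "1"}}

query_1 = {"bool": {"filter": [{"bool": {"should": [in_range, in_berlin],
                                         "minimum_should_match": "1"}}]}}
query_2 = {"bool": {"filter": [one_should(in_range), one_should(in_berlin)]}}
query_3 = {"bool": {"filter": [{"bool": {"should": [one_should(in_range), one_should(in_berlin)],
                                         "minimum_should_match": "1"}}]}}

docs = [
    {"id": 1, "streetNr": 0, "city": "Berlin"},
    {"id": 2, "streetNr": 90, "city": "Berlin"},
    {"id": 3, "streetNr": 50, "city": "Hamburg"},  # hypothetical
]

for name, query in [("query 1", query_1), ("query 2", query_2), ("query 3", query_3)]:
    print(name, [doc["id"] for doc in docs if matches(query, doc)])
```

Queries 1 and 3 behave like an OR of the two criteria, while query 2 behaves like an AND, which is why only the document satisfying both criteria survives it.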
Adding a working example with index data, search query, and search result
Index Data:
{
"streetNr":0,
"geographicAddress":{
"city":"Berlin"
}
}
{
"streetNr":90,
"geographicAddress":{
"city":"Berlin"
}
}
Search Query: (the second search query from your question, with the outer filter replaced by should)
{
"query": {
"bool": {
"should": [ <-- note this
{
"bool": {
"should": [
{
"range": {
"streetNr": {
"from": "1",
"to": "100",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"minimum_should_match": "1"
}
},
{
"bool": {
"should": [
{
"match": {
"geographicAddress.city": {
"query": "Berlin"
}
}
}
],
"minimum_should_match": "1"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "1",
"_score": 0.0,
"_source": {
"streetNr": 90,
"geographicAddress": {
"city": "Berlin"
}
}
},
{
"_index": "stof_64371064",
"_type": "_doc",
"_id": "2",
"_score": 0.0,
"_source": {
"streetNr": 0,
"geographicAddress": {
"city": "Berlin"
}
}
}
]
With should and minimum_should_match=1, you are saying that a document is returned if any one of the criteria matches, as in queries 1 and 3. In the second query, however, you placed two separate criteria inside filter, so Elasticsearch returns only the documents for which both criteria hold. That is why your second query behaves like a must, in contrast to the should in your other queries.
The query is below:
{
"from" : 0,
"size" : 100,
"query": {
"match_all": {}
}
}
I need to filter the match_all results down to documents where name is test.
I tried with:
{
"from" : 0,
"size" : 100,
"query": {
"match_all": {}
},
"filter": [ "term": { "name": "test" }}]
}
I got the error 'Unknown key for a START_ARRAY in [filter].'
You will need to wrap your query in a bool query. Try this search query:
{
"from":0,
"size":10,
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": [
{
"term": {
"grocery_name": "elastic"
}
}
]
}
}
}
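If the request body is built in code, the same wrapping can be done with a small helper. The sketch below is plain Python that only constructs the dictionary; the function name `match_all_with_term_filter` is hypothetical, and the field and value are taken from the question.

```python
import json

def match_all_with_term_filter(field, value, offset=0, size=100):
    """Build a body where the term filter lives inside query.bool.filter."""
    return {
        "from": offset,
        "size": size,
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": [{"term": {field: value}}],
            }
        },
    }

body = match_all_with_term_filter("name", "test")
print(json.dumps(body, indent=2))
```

Note that there is no top-level "filter" key in the result; the filter sits inside the bool query, which avoids the parsing error from the question.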
Update 1:
As requested in the comment by @Nons:
Search Query:
The term query returns documents that contain an exact term in a provided field.
{
"from":0,
"size":10,
"query": {
"bool": {
"must": {
"match_all": {}
},
"filter": [
{
"term": {
"parentName.keyword": "Developer" <-- note this
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64275684",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"id": "1",
"name": "A",
"parentName": "Developer",
"Data": [
{
"id": "455",
"name": "Google",
"lastUpdatedDate": "2020-09-10",
"parent_id": "1"
}
],
"Function": [
{
"id": "1",
"name": "Major"
}
]
}
}
]
You can also use a match query, where the provided text is analyzed before matching.
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": {
"match": {
"parentName": "developer"
}
}
}
}
}
I would recommend using the Chrome ElasticSearch Head plugin. It makes it very easy to test and run searches against Elasticsearch (its functionality is similar to MySQL Workbench).
Please find an example of the plugin's usage below (a combination of a condition and an aggregation).
I am searching among documents in a particular district. Documents have various statuses. The aim is to return all documents, except when document's status code is ABCD - such documents should only be returned if their ID is greater than 100. I have tried writing multiple queries, including the one below, which returns only the ABCD documents with ID greater than 100, and none of the other documents. What is wrong here? How can I get the non-ABCD documents as well?
"_source": true,
"from": 0,
"size": 50,
"sort": [
{
"firstStamp": "DESC"
}
],
"query": {
"bool": {
"must": [
{
"term": {
"districtId": "3755"
}
},
{
"bool": {
"must": [
{
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
},
{
"bool": {
"must": {
"script": {
"script": "doc['id'].value > 100"
}
}
}
}
]
}
}
]
}
}
}
Since you have not shared the index mapping, I am assuming from your search query that documentStatus is of the object field data type. As far as I understand, your aim is to return all documents, except that documents with status code ABCD should only be returned if their ID is greater than 100.
Adding a working example with index data, search query, and search result
Index Data:
{
"id":200,
"documentStatus":{
"code":"DEF"
}
}
{
"id":200,
"documentStatus":{
"code":"ABCD"
}
}
{
"id":100,
"documentStatus":{
"code":"ABCD"
}
}
Search Query:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
},
{
"bool": {
"must": {
"script": {
"script": "doc['id'].value > 100"
}
}
}
}
]
}
},
{
"bool": {
"must_not": {
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64351595",
"_type": "_doc",
"_id": "2",
"_score": 2.0,
"_source": {
"id": 200,
"documentStatus": {
"code": "ABCD"
}
}
},
{
"_index": "stof_64351595",
"_type": "_doc",
"_id": "3",
"_score": 0.0,
"_source": {
"id": 200,
"documentStatus": {
"code": "DEF"
}
}
}
]
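The logic of the two should branches can be written out as a plain Python predicate. This is only a sketch of the boolean condition, not Elasticsearch code; `should_return` is a hypothetical name, and the documents are the three from the index data above.

```python
def should_return(doc):
    """Branch 1: ABCD and id > 100. Branch 2: any status other than ABCD."""
    is_abcd = doc["documentStatus"]["code"] == "ABCD"
    abcd_with_large_id = is_abcd and doc["id"] > 100
    return abcd_with_large_id or not is_abcd

docs = [
    {"id": 200, "documentStatus": {"code": "DEF"}},
    {"id": 200, "documentStatus": {"code": "ABCD"}},
    {"id": 100, "documentStatus": {"code": "ABCD"}},
]

results = [should_return(doc) for doc in docs]
print(results)  # only the ABCD document with id 100 is excluded
```

The original query from the question contained only the first branch, which is why the non-ABCD documents never came back.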
You need to use must_not in your query if you want the documents that don't have status code ABCD. So your query would be something like this:
"from": 0,
"size": 50,
"sort": [
{
"firstStamp": "DESC"
}
],
{
"query": {
"bool": {
"must": [
{
"term": {
"districtId": "3755"
}
},
{
"range": {
"id": {
"gt": 100
}
}
}
],
"must_not": [
{
"terms": {
"documentStatus.code.keyword": [
"ABCD"
]
}
}
]
}
}
}
The key string will be like
"india,singapore" (without the quotes).
How can I split it and search by each keyword?
The expected result is to match documents whose country is india or singapore.
So far I have tried:
{
"_source": "country_name",
"query": {
"bool": {
"must": [
{
"term": {
"country_name.keyword": "india,singapore"
}
}
],
"must_not": [],
"should": []
}
},
"from": 0,
"size": 10,
"sort": [],
"aggs": {}
}
But it only shows content that matches the exact key string "india,singapore".
You can use a terms query in place of the term query, like below:
{
"_source": "country_name",
"query": {
"bool": {
"must": [
{
"terms": {
"country_name.keyword": ["india","singapore"]
}
}
]
}
},
"from": 0,
"size": 10
}
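The splitting itself has to happen on the client side before the query is sent. A minimal Python sketch, assuming the key string arrives as one comma-separated value (the helper name `terms_query_from_csv` is hypothetical):

```python
import json

def terms_query_from_csv(field, raw):
    """Split a comma-separated key string and build a terms query from the parts."""
    values = [part.strip() for part in raw.split(",") if part.strip()]
    return {
        "_source": "country_name",
        "query": {"bool": {"must": [{"terms": {field: values}}]}},
        "from": 0,
        "size": 10,
    }

body = terms_query_from_csv("country_name.keyword", "india,singapore")
print(json.dumps(body, indent=2))
```

Stripping whitespace and dropping empty parts keeps inputs such as "india, singapore," from producing terms that can never match.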