Elasticsearch: How to write an 'OR' clause in filter context? - elasticsearch

I'm looking for syntax or an example compatible with ES version 6.7.
I have read the docs, but I don't see any examples for this and the explanation isn't clear enough to me. I have tried writing a query according to the docs, but I keep getting syntax errors. I have already seen the questions below on SO, but they don't help me:
Filter context for should in bool query (Elasticsearch)
It doesn't have any example.
Multiple OR filter in Elasticsearch
I get a syntax error
"type": "parsing_exception",
"reason": "no [query] registered for [filtered]",
"line": 1,
"col": 31
Maybe it's for a different version of ES.
All I need is a simple example with two 'or'ed conditions (mine is one range and one term but I guess that shouldn't matter much), both I would like to have in filter context (I don't care about scores, nor text search).
If you really need it, I can show my attempts (need to remove some 'sensitive'(duh) parts from it before posting), but they give parsing/syntax errors so I don't think there is any sense in them. I am aware that questions which don't show any efforts are considered bad for SO but I don't see any logic in showing attempts that aren't even parsed successfully, and any example would help me understand the syntax.

You need to wrap your should clause in a bool query and place that bool query inside the filter clause of the outer bool query.
{
"query":{
"bool":{
"filter":[{
"bool":{
"should":[
{ // Query 1 },
{ // Query 2 }
]
}
}]
}
}
}
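For the case described in the question (one range condition OR one term condition, with no scoring), the pattern above might be filled in roughly as follows. This is only a sketch in Python that builds the request body as a dict and prints it as JSON. The field names `created_at` and `status`, their values, and the helper name `or_filter` are hypothetical placeholders. `minimum_should_match` is set explicitly so that the "at least one must match" intent does not depend on version-specific defaults.

```python
import json


def or_filter(*clauses):
    """Wrap clauses so that at least one must match, evaluated in filter context."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "bool": {
                            "should": list(clauses),
                            # Explicit: at least one of the should clauses must match.
                            "minimum_should_match": 1,
                        }
                    }
                ]
            }
        }
    }


# Hypothetical field names and values, for illustration only.
body = or_filter(
    {"range": {"created_at": {"gte": "2019-01-01", "lt": "2019-02-01"}}},
    {"term": {"status": "archived"}},
)

# The printed JSON is what would be sent as the search request body.
print(json.dumps(body, indent=2))
```

Because everything sits under `filter`, no scores are computed for these clauses.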

I had a similar scenario (even the range and match filters), with one more nested level: two conditions to be 'or'ed (as in your case) and another condition to be logically 'and'ed with the result. As @Pierre-Nicolas Mougel suggested in another answer, I nested the bool clauses, with one more level around the should clause.
{
"_source": [
"my_field"
],
"query": {
"bool": {
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"range": {
"start": {
"gt": "1558878457851",
"lt": "1557998559147"
}
}
},
{
"range": {
"stop": {
"gt": "1558898457851",
"lt": "1558899559147"
}
}
}
]
}
},
{
"match": {
"my_id": "<My_Id>"
}
}
],
"must_not": []
}
}
}
},
"from": 0,
"size": -1,
"sort": [],
"aggs": {}
}
I read in the docs that minimum_should_match can also be used to control how many of the should clauses must match. This might help you if this query doesn't work.

Related

How does the flow work in Elasticsearch queries?

I have written a query which has a couple of conditions, as shown below.
GET /agreement/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "T-0668",
"fields": [
"agreecondition.agreementId",
"agreecondition.conditionContractId"
]
}
},
{
"range": {
"agreecondition.validFrom": {
"gte": "02/18/2019"
}
}
},
{
"range": {
"agreecondition.validTo": {
"lte": "03/07/2019"
}
}
}
],
"filter": [
{
"terms": {
"agreecondition.promotionId.keyword": [
"x",
"y"
]
}
}
]
}
}
}
My question is: how does the flow work?
For example, does ES first get the results for the must clause's multi_match, then apply the range conditions to that output, and finally apply the filter on top of the output of the range conditions?
I just want clarity on this; if my assumption is wrong, I need to rewrite the query.
You can check the official Elasticsearch blog post on query execution order to understand this in detail, but you might not get all the details you are looking for, because of a limitation Elastic mentions at the end of the post:
Q: How can I check which query/filter got executed first?
A: We don't really expose this information, which is very internal. However if you
check the output of the profile API, you can count how many times
nextDoc/advance have been called on the one hand, and matches on the
other hand. Query nodes that have the higher counts have been run
first.
Note: the Profile API will be very handy for you, as the blog post also suggests.
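As a rough illustration of the approach quoted above: adding `"profile": true` to the search body makes the response include a per-query breakdown. The Python sketch below walks such a response and ranks the query nodes by `next_doc_count + advance_count`. The sample response is hand-written and heavily trimmed, with made-up values, and the helper names `iter_query_nodes` and `rank_by_iteration` are my own.

```python
def iter_query_nodes(profile_response):
    """Yield every query node (including nested children) from a profile response."""
    for shard in profile_response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            stack = list(search.get("query", []))
            while stack:
                node = stack.pop()
                yield node
                stack.extend(node.get("children", []))


def rank_by_iteration(profile_response):
    """Return (description, count) pairs, highest nextDoc/advance count first."""
    ranked = []
    for node in iter_query_nodes(profile_response):
        breakdown = node.get("breakdown", {})
        count = breakdown.get("next_doc_count", 0) + breakdown.get("advance_count", 0)
        ranked.append((node.get("description", ""), count))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hand-written, trimmed sample with made-up numbers; real responses carry more fields.
sample = {
    "profile": {
        "shards": [
            {
                "searches": [
                    {
                        "query": [
                            {
                                "type": "BooleanQuery",
                                "description": "bool",
                                "breakdown": {"next_doc_count": 30, "advance_count": 0},
                                "children": [
                                    {
                                        "type": "TermQuery",
                                        "description": "term",
                                        "breakdown": {"next_doc_count": 120, "advance_count": 0},
                                    },
                                    {
                                        "type": "PointRangeQuery",
                                        "description": "range",
                                        "breakdown": {"next_doc_count": 0, "advance_count": 35},
                                    },
                                ],
                            }
                        ]
                    }
                ]
            }
        ]
    }
}

ranked = rank_by_iteration(sample)
print(ranked)
```

Note that this sketch lists nodes per shard without aggregating them, so on a multi-shard index the same clause appears once per shard.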

ElasticSearch query with MUST and SHOULD

I have this query to get data from an AWS Elasticsearch instance, v6.2:
{
"query": {
"bool": {
"must": [
{
"term": {"logLevel": "error"}
},
{
"bool": {
"should": [
{
"match": {"EventCategory": "Home Management"}
}
]
}
}
],
"filter": [{
"range": { "timestamp": { "gte": 155254550880 }}
}
]
}
},
"size": 10,
"from": 0
}
My data has multiple EventCategories, for example 'Home Management' and 'User Account Management'. The problem is that the match inside should returns all the data, because the word 'Management' is in both categories. If I use term instead of match, it doesn't return anything at all, even when the given value is exactly the same as in the document.
I need to get data when any of the given categories is matched, together with the rest of the filters.
EDIT:
None, one, or more than one EventCategory may be passed to the should clause.
I'm not sure why you added a should within a must. Do you expect to have more than one should cases? It looks a bit odd.
As for your question, you can't use the term query on an analysed field, but only on keyword typed fields. If your EventCategory field has the default mapping, you can run the term query against the default non-analysed multi-field of EventCategory as follows:
...
{
"term": { "EventCategory.keyword": "Home Management" }
}
...
Furthermore, if you just want to filter documents in or out without caring about their relevance, I'd recommend moving all the conditions into the filter block, to speed up your query and make better use of the cache.
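A minimal sketch of that recommendation, written in Python so the body can be built for none, one, or several categories. It assumes EventCategory has the default dynamic mapping (so an `EventCategory.keyword` sub-field exists) and that passing no category should mean "no category restriction"; both are assumptions, as is the helper name `build_query`.

```python
import json


def build_query(categories, since_ms):
    """Build a filter-only search body; `categories` may be empty."""
    filters = [
        {"term": {"logLevel": "error"}},
        {"range": {"timestamp": {"gte": since_ms}}},
    ]
    if categories:
        # `terms` matches documents whose field equals any of the listed values,
        # so it acts as an OR over the exact category names.
        filters.append({"terms": {"EventCategory.keyword": list(categories)}})
    return {"query": {"bool": {"filter": filters}}, "size": 10, "from": 0}


body = build_query(["Home Management", "User Account Management"], 155254550880)
print(json.dumps(body, indent=2))
```

Since the category names are compared exactly against the keyword sub-field, 'Home Management' no longer matches documents that merely contain the word 'Management'.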
The query below should work.
I've just removed should and created two must clauses, one each for home and management. Note that the query is meant for text datatypes.
{
"query":{
"bool":{
"must":[
{
"term":{
"logLevel":"error"
}
},
{
"match":{
"EventCategory":"home"
}
},
{
"match":{
"EventCategory":"management"
}
}
],
"filter":[
{
"range":{
"timestamp":{
"gte":155254550880
}
}
}
]
}
},
"size":10,
"from":0
}
Hope it helps!

Elasticsearch - filter conditions order

Can you tell me please if the conditions in Elasticsearch filter are evaluated in the order as they are in the request json or if Elasticsearch will make some optimization in it?
I have a query like:
{
"sort": {
"publishDate": "desc"
},
"query": {
"bool": {
"filter": [
{
"range": {
"publishDate": {
"lte": "2018-10-26",
"gt": "2018-08-31"
}
}
},
{
"terms": {
"ico": [
31322832,
34444444
]
}
}
]
}
}
}
and I think the optimal order of filters when evaluating is terms first and range next. So what happens in Elasticsearch? Will the filters be evaluated in request order, or will they be optimized? Also, does anybody know how this works in Elasticsearch 2?
Thanks.
Check out this article about the execution order of filters and queries; it is really great. I hope it helps you: ES execution order

ElasticSearch Query, match a certain term and count given a date range

I feel like this shouldn't be as difficult as it's turning out to be. I've been attempting to use the:
index/_search
and
index/_count
endpoints, using query, bool, must, filter, etc. It seems that no matter how I construct it, I cannot use range and date together with the match filter. The Elasticsearch documentation doesn't seem to show complex queries like this, so I'm not exactly sure how to construct it. The main query I've been manipulating is:
{
"query":{
"bool":{
"must":{
"range":{
"date":{
"gte":"now-1d/d",
"lt" :"now/d"
}
},
"match":{
"KEY":"VALUE"
}
}
}
}
}
I either get "no query registered for date" or "unknown key for a start_object in match". I've been all over Stack Overflow and can't seem to find an answer to this; it seems like it should be quite a simple query to make against a data store such as this. What am I missing here?
must can take an array of conditions if you want to combine them. Try this format:
{
"query": {
"bool": {
"must": [
{
"range": {
"date": {
"gte": "now-1d/d",
"lt": "now/d"
}
}
},
{
"match": { "KEY": "VALUE" }
}
]
}
}
}
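The difference from the failing query is that each element of the must array is its own object holding exactly one query type, whereas the original put range and match as two keys of a single object. A small Python sketch of a builder that enforces this shape (the helper name `bool_must` is hypothetical; KEY and VALUE are the placeholders from the question):

```python
import json


def bool_must(*clauses):
    """Combine clauses under bool.must, one query type per clause object."""
    for clause in clauses:
        if not isinstance(clause, dict) or len(clause) != 1:
            raise ValueError("each clause must be an object with exactly one query type")
    return {"query": {"bool": {"must": list(clauses)}}}


body = bool_must(
    {"range": {"date": {"gte": "now-1d/d", "lt": "now/d"}}},
    {"match": {"KEY": "VALUE"}},
)
print(json.dumps(body, indent=2))
```

Since the body contains only a `query` key, the same body should be usable with both the `_search` and `_count` endpoints.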

Can _score from different queries be compared?

In my application, I issue multiple queries, each to a different index. Then I merge the results from these queries and sort them by the _score attribute, in order to rank them according to their relevance. But I wonder if this makes sense at all, since the results came from different queries.
I guess my question is: can _scores from different queries be compared?
Instead of issuing multiple queries, it would be a good idea to combine them into a single query.
You can use the indices query to do index-specific operations.
So something like:
{
"bool": {
"should": [
{
"indices": {
"indices": [
"index1"
],
"query": {
"term": {
"tag": "wow"
}
}
}
},
{
"indices": {
"indices": [
"index2"
],
"query": {
"term": {
"name": "laptop"
}
}
}
}
]
}
}
Once this is done, the results will be sorted based on the _score.
Hope that helps.
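On versions where the indices query is not available (as far as I know it was deprecated in 5.0 in favour of querying the `_index` field), a similar shape can be sketched by pairing each per-index condition with a term filter on `_index`. The Python helper below is only an illustration under that assumption; the name `per_index_should` is hypothetical, and the index names and conditions are the ones from the example above.

```python
import json


def per_index_should(conditions):
    """conditions: mapping of index name -> query clause to apply on that index."""
    should = []
    for index_name, clause in conditions.items():
        should.append(
            {
                "bool": {
                    # The index restriction sits in filter, so it does not affect _score.
                    "filter": [{"term": {"_index": index_name}}],
                    "must": [clause],
                }
            }
        )
    return {"query": {"bool": {"should": should}}}


body = per_index_should(
    {
        "index1": {"term": {"tag": "wow"}},
        "index2": {"term": {"name": "laptop"}},
    }
)
print(json.dumps(body, indent=2))
```

The request would then be sent to both indices at once, for example by listing them comma-separated in the search path.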
