How to combine Boolean AND with Boolean OR in an Elasticsearch query?

Query: Get the employee named "Mahesh" whose id is "200", whose joining datetime is in a given date range, and whose epf status is either 'NOK' or 'WRN'. (Possible values of epf_status are {OK, NOK, WRN, CANCELLED}.)
I have written the following query, but it also matches documents whose epf_status is OK or CANCELLED; it must only match when epf_status is either 'NOK' or 'WRN'. What do I need to change to make it work as required?
GET myindex01/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"empname": { "query": "Mahesh", "operator": "AND" }
}
},
{
"match": {
"empid": { "query": "200", "operator": "AND" }
}
},
{
"range": {
"joining_datetime": {
"gte": "2020-01-01T00:00:00",
"lte": "2022-06-24T23:59:59"
}
}
}
],
"should": [
{ "match": { "epf_status": "NOK" } },
{ "match": { "epf_status": "WRN" } }
]
}
}
}
SAMPLE DATA:
{"Mahesh","200","2022-04-01","OK"}
{"Mahesh","200","2022-04-01","NOK"}
{"Mahesh","200","2022-04-01","WRN"}
{"Mahesh","200","2022-04-01","CANCELLED"}
REQUIRED OUTPUT:
{"Mahesh","200","2022-04-01","NOK"}
{"Mahesh","200","2022-04-01","WRN"}

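TL;DR
In a bool query that also has must clauses, the should clauses are optional by default: they only influence the score, which is why OK and CANCELLED still match. Set "minimum_should_match": 1 so that at least one should clause has to match.
You can also use the terms query in place of the two match clauses. From the documentation:
Returns documents that contain one or more exact terms in a provided field.
Because terms matches exact values, this assumes epf_status is mapped as keyword (or that you target a keyword sub-field).
To solve
GET myindex01/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"empname": { "query": "Mahesh", "operator": "AND" }
}
},
{
"match": {
"empid": { "query": "200", "operator": "AND" }
}
},
{
"range": {
"joining_datetime": {
"gte": "2020-01-01T00:00:00",
"lte": "2022-06-24T23:59:59"
}
}
}
],
"should": [
{ "terms": { "epf_status": ["NOK", "WRN"] } }
],
"minimum_should_match": 1
}
}
}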
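As a rough illustration of why the question's query also returns OK and CANCELLED, here is a toy Python model of how a bool query decides whether a document matches. It is only a sketch: the clauses are plain predicates rather than real Elasticsearch queries, and the names bool_matches, statuses and the joined key are made up for this example. The default it encodes is the documented one: minimum_should_match is 1 only when there are should clauses and no must or filter clauses, otherwise 0.

```python
# Toy model of bool-query matching. Clauses are plain Python predicates,
# not real Elasticsearch queries; all names here are illustrative.
SAMPLE = [
    {"empname": "Mahesh", "empid": "200", "joined": "2022-04-01", "epf_status": s}
    for s in ("OK", "NOK", "WRN", "CANCELLED")
]


def bool_matches(doc, must, should, minimum_should_match=None):
    if minimum_should_match is None:
        # Default: 1 only when there are should clauses and no must clauses.
        minimum_should_match = 1 if (should and not must) else 0
    if not all(clause(doc) for clause in must):
        return False
    hits = sum(1 for clause in should if clause(doc))
    return hits >= minimum_should_match


must = [
    lambda d: d["empname"] == "Mahesh",
    lambda d: d["empid"] == "200",
    lambda d: "2020-01-01" <= d["joined"] <= "2022-06-24",
]
should = [
    lambda d: d["epf_status"] == "NOK",
    lambda d: d["epf_status"] == "WRN",
]


def statuses(minimum_should_match=None):
    """Return the epf_status of every sample document that matches."""
    return [
        d["epf_status"]
        for d in SAMPLE
        if bool_matches(d, must, should, minimum_should_match)
    ]


print(statuses())   # should is optional here, so all four statuses match
print(statuses(1))  # at least one should clause is required: NOK and WRN
```

With the default, the should clauses only affect ranking; requiring one of them turns the pair into a real OR condition on top of the must clauses.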

Related

With Elasticsearch, how to use an OR instead of AND within filter->terms query?

I have this following query with elastic:
{
"query": {
"bool": {
"filter": [{
"terms": {
"participants.group": ["group1","group2"]
}
}, {
"range": {
"recordDate": {
"gte": "2020-05-14 00:00:00.000",
"lte": "2020-07-22 20:30:56.566"
}
}
}]
}
}
}
Currently, this finds records with participants with group "group1" and "group2".
How can I change the query so it finds records with participants from "group1" or "group2"?
Is it possible to do it without changing the structure of the query?
I'm assuming that the field participants.group is of keyword type and not text type.
Assuming that, the query you have already behaves like an OR: it roughly translates to (group1) or (group2) or (group1 and group2).
If you want to exclude records that contain both groups, modify the query as below and add a must_not clause:
POST my_filter_index/_search
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"range": {
"recordDate": {
"gte": "2020-05-14 00:00:00.000",
"lte": "2020-07-22 20:30:56.566"
}
}
}
],
"should": [
{
"terms": {
"participants.group": ["group1", "group2"]
}
}
],
"minimum_should_match": 1
}
}
],
"must_not": [
{
"bool": {
"must": [
{
"term": {
"participants.group": "group1"
}
},
{
"term": {
"participants.group": "group2"
}
}
]
}
}
]
}
}
}
Let me know if that works!
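To make the logic of that answer concrete, here is a small Python sketch of the matching rules on a multi-valued field. It is a toy model, not Elasticsearch code: the function names and the sample records are invented for illustration. It shows that terms already behaves like an OR over the listed values, and that the must_not clause removes the records that carry both groups.

```python
def terms(values, wanted):
    """Toy 'terms' check: true if the field holds at least one wanted value."""
    return any(v in wanted for v in values)


def either_but_not_both(groups):
    """Mirror of the answer: terms on both groups, minus records with both."""
    has_any = terms(groups, {"group1", "group2"})
    has_both = "group1" in groups and "group2" in groups
    return has_any and not has_both


# Hypothetical records: each value is the list stored in participants.group.
records = {
    "r1": ["group1"],
    "r2": ["group2", "group9"],
    "r3": ["group1", "group2"],
    "r4": ["group9"],
}

plain_terms = [name for name, g in records.items() if terms(g, {"group1", "group2"})]
exclusive = [name for name, g in records.items() if either_but_not_both(g)]

print(plain_terms)  # records with group1 or group2, including those with both
print(exclusive)    # same, but without the records that have both groups
```

If records with both groups are acceptable, the original terms filter already does what the question asks and the must_not part can be dropped.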

combine two queries of elasticsearch?

I have "date_created_tranx" and "phone_number_cust" fields. A few entries of date_created_tranx are null. I want to find a particular phone number where date_created_tranx is either within a date range or null.
a = {
"query": {
"bool": {
"must": [
{
"range": {
"date_created_tranx": {
"gte": "2019-12-01",
"lte": "2020-05-07"
}
}
},
{
"regexp": {
"phone_number_cust": ".*702625.*"
}
}
]
}
}
}
b = {
"query": {
"bool": {
"must": [{
"regexp": {
"phone_number_cust": ".*702625.*"
}
}],
"must_not": [{
"exists": {
"field": "date_created_tranx"
}
}
]
}
}
}
How can I combine these?
I cannot run them as two separate searches because the result is paginated.
I am totally new to Elasticsearch. Any leads will be helpful.
I tried
doc2 = {
"query" :{
"bool" : {
"must":[
a,
b
]
}
}
}
It throws
Error: RequestError: RequestError(400, 'parsing_exception', 'no [query] registered for [query]')
The query you're looking for is the one below.
We have a constraint on the phone number, and we also check that date_created_tranx is either within bounds or does not exist (i.e. is null).
{
"query": {
"bool": {
"minimum_should_match": 1,
"should": [
{
"range": {
"date_created_tranx": {
"gte": "2019-12-01",
"lte": "2020-05-07"
}
}
},
{
"bool": {
"must_not": {
"exists": {
"field": "date_created_tranx"
}
}
}
}
],
"filter": [
{
"regexp": {
"phone_number_cust": ".*702625.*"
}
}
]
}
}
}
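The parsing_exception in the question comes from nesting the whole request bodies, each with its own top-level "query" key, inside must; a bool clause expects the inner query object instead. If you would rather build the combined request from the existing a and b dicts in Python, a minimal sketch could look like this. The helper name combine_any is made up for this example; it is logically equivalent to the query above, just not factored.

```python
a = {
    "query": {
        "bool": {
            "must": [
                {"range": {"date_created_tranx": {"gte": "2019-12-01", "lte": "2020-05-07"}}},
                {"regexp": {"phone_number_cust": ".*702625.*"}},
            ]
        }
    }
}
b = {
    "query": {
        "bool": {
            "must": [{"regexp": {"phone_number_cust": ".*702625.*"}}],
            "must_not": [{"exists": {"field": "date_created_tranx"}}],
        }
    }
}


def combine_any(*bodies):
    """Build one request body that matches when any of the given bodies would."""
    return {
        "query": {
            "bool": {
                # Unwrap each body: the clause must be the inner query object,
                # not the {"query": ...} wrapper.
                "should": [body["query"] for body in bodies],
                "minimum_should_match": 1,
            }
        }
    }


doc2 = combine_any(a, b)
print(doc2["query"]["bool"]["minimum_should_match"])  # 1
```

Since doc2 is a single request, pagination with from and size applies to the combined result set.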

Find distinct/unique people who have no birthday or a birthday earlier than 3/1/1963

We have some employees and need to find those whose birthday we haven't entered or who were born before 3/1/1963:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must_not": [{ "exists": { "field": "birthday" } }]
}
},
{
"bool": {
"filter": [{ "range": {"birthday": { "lte": 19630301 }} }]
}
}
]
}
}
}
We now need to get distinct names: we only want 1 Jason or 1 Susan, etc. How do we apply a distinct filter to the "name" field while still filtering on the birthday as above? I've tried:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must_not": [
{
"exists": {
"field": "birthday"
}
}
]
}
},
{
"bool": {
"filter": [
{
"range": {
"birthday": {
"lte": 19630301
}
}
}
]
}
}
]
}
},
"aggs": {
"uniq_gender": {
"terms": {
"field": "name"
}
}
},
"from": 0,
"size": 25
}
but I just get results with duplicate Jasons and Susans. At the bottom, the aggregation shows me that there are 10 Susans and 12 Jasons. I'm not sure how to get unique ones.
EDIT:
My mapping is very simple. The name field doesn't need to be keyword; it can be text or anything else, as it is just a field that gets returned in the query.
{
"mappings": {
"birthdays": {
"properties": {
"name": {
"type": "keyword"
},
"birthday": {
"type": "date",
"format": "basic_date"
}
}
}
}
}
Without knowing your mapping, I'm guessing that your name field is not analyzed and can be used in a terms aggregation properly.
I suggest you use a filter aggregation:
{
"aggs": {
"filtered_employes": {
"filter": {
"bool": {
"must": [
{
"bool": {
"must_not": [
{
"exists": {
"field": "birthday"
}
}
]
}
},
{
"range": {
"birthday": {
"lte": 19630301
}
}
}
]
}
},
"aggs": {
"filtered_employes_by_name": {
"terms": {
"field": "name"
}
}
}
}
}
}
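On the other hand, note that the filter above combines the two conditions with must, so the aggregation only counts employees with (missing birthday) and (born before the date). No document can satisfy both at once, so to keep the either/or logic of your original query, use should in the aggregation filter instead.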
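With this approach the distinct names come from the aggregation buckets rather than from the hits, which is why the hits themselves still contain duplicates. Here is a short Python sketch of reading them out of a response. The helper name distinct_names is invented for this example, and sample_response is hand-written to mirror the counts mentioned in the question, not output from a real cluster.

```python
def distinct_names(response):
    """Return one entry per bucket of the nested terms aggregation."""
    buckets = (
        response["aggregations"]["filtered_employes"]
        ["filtered_employes_by_name"]["buckets"]
    )
    return [bucket["key"] for bucket in buckets]


# Hand-written stand-in for a search response (only the relevant part).
sample_response = {
    "aggregations": {
        "filtered_employes": {
            "doc_count": 22,
            "filtered_employes_by_name": {
                "buckets": [
                    {"key": "Jason", "doc_count": 12},
                    {"key": "Susan", "doc_count": 10},
                ]
            },
        }
    }
}

print(distinct_names(sample_response))  # one name per bucket
```

Keep in mind that a terms aggregation returns only the top 10 buckets by default; the size parameter inside the terms block raises that limit. If only the buckets are needed, sending "size": 0 at the top level of the request skips the hits entirely.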

ElasticSearch should/must clause not working as expected

Below is my elastic query
GET _search
{
"query": {
"bool": {
"must": {
"match": {
"marriages.marriage_year": "1630"
}
},
"should": {
"match": {
"first_name": {
"query": "mary",
"fuzziness": "2"
}
}
},
"must": {
"range": {
"marriages.marriage_year": {
"gt": "1620",
"lte": "1740"
}
}
}
}
}
}
It is returning data with marriages.marriage_year = "1630", with Mary as first_name scoring highest. I also want to include documents with marriages.marriage_year between 1620 and 1740, which are not shown in the results. It is showing data only for marriage_year 1630.
That's because you have two bool/must clauses and the second one gets eliminated when the JSON query is parsed. Rewrite it like this instead and it will work:
{
"query": {
"bool": {
"must": [
{
"match": {
"marriages.marriage_year": "1630"
}
},
{
"range": {
"marriages.marriage_year": {
"gt": "1620",
"lte": "1740"
}
}
}
],
"should": {
"match": {
"first_name": {
"query": "mary",
"fuzziness": "2"
}
}
}
}
}
}
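Repeating a key such as must inside one JSON object is fragile in general, because what happens to the duplicates depends on the parser. As a small illustration (using Python's standard json module, not Elasticsearch's own parser), the sketch below shows that Python silently keeps the last duplicate, and how a hypothetical reject_duplicates hook can catch the problem before the query is sent. Other parsers may keep the first value or reject the document.

```python
import json

# Shortened version of the query above, with "must" repeated in one object.
raw = (
    '{"bool": {'
    '"must": {"match": {"year": "1630"}}, '
    '"must": {"range": {"year": {"gt": "1620", "lte": "1740"}}}'
    "}}"
)

parsed = json.loads(raw)
# Python's json module keeps only the last value for a repeated key.
print(list(parsed["bool"]["must"]))  # ['range']


def reject_duplicates(pairs):
    """object_pairs_hook that fails loudly when an object repeats a key."""
    keys = [key for key, _ in pairs]
    dupes = sorted({key for key in keys if keys.count(key) > 1})
    if dupes:
        raise ValueError(f"duplicate keys: {dupes}")
    return dict(pairs)


try:
    json.loads(raw, object_pairs_hook=reject_duplicates)
except ValueError as err:
    print(err)  # duplicate keys: ['must']
```

Putting both clauses in a single must array, as in the rewritten query, avoids the ambiguity altogether.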
UPDATE
Then you need to do it differently: keep only the range query in bool/must and move the match into the bool/should section:
{
"query": {
"bool": {
"must": [
{
"range": {
"marriages.marriage_year": {
"gt": "1620",
"lte": "1740"
}
}
}
],
"should": [
{
"match": {
"first_name": {
"query": "mary",
"fuzziness": "2"
}
}
},
{
"match": {
"marriages.marriage_year": "1630"
}
}
]
}
}
}

Elasticsearch return exact match first then other matches

I have some PageDocuments which I would like to search based on the title, excluding PageDocuments with a path starting with some particular text. This field is analyzed. I would like some fuzziness to help users with spelling mistakes. I need to be able to do partial matches, so that "some" would match "some text" and "this is some text".
If I use the following query, I don't get an exact match back as the first result because of tf-idf.
{
"size": 20,
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "myterm",
"fuzziness": 1
}
}
}
],
"must_not": [
{
"wildcard": {
"path": {
"value": "/test/*"
}
}
}
]
}
}
}
So then I added a not_analyzed version of the title field at title.not_analyzed and tried adding a function score to increase the weighting of an exact match using term.
{
"query": {
"function_score": {
"functions": [
{
"weight": 2,
"filter": {
"fquery": {
"query": {
"term": {
"title.not_analyzed": {
"value": "myterm"
}
}
}
}
}
}
],
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "myterm",
"fuzziness": 1
}
}
}
],
"must_not": [
{
"wildcard": {
"path": {
"value": "/path/*"
}
}
}
]
}
},
"boost_mode": "multiply"
}
}
}
But this gives me the same results. How can I get the exact matches returned first?
We found a solution to this by adding a combination of should and boost.
{
"size": 20,
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "myterm",
"fuzziness": 1
}
}
}
],
"must_not": [
{
"wildcard": {
"path": {
"value": "/path/*"
}
}
}
],
"should": [
{
"term": {
"title": {
"value": "myterm",
"boost": 10
}
}
}
]
}
}
}
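This works because a should clause next to a must clause does not restrict which documents match; it only adds to the score of the documents where it does match, so an exact term hit is lifted above the fuzzy ones. If the request is built from Python, a parameterised sketch might look like the following. The function name exact_first_query and its parameters are made up for this example; the structure is the same as the query above.

```python
def exact_first_query(text, excluded_path_pattern, boost=10, size=20):
    """Fuzzy match on title, with a boosted exact term to rank exact hits first."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Decides which documents match at all.
                "must": [{"match": {"title": {"query": text, "fuzziness": 1}}}],
                # Removes documents under the excluded path.
                "must_not": [{"wildcard": {"path": {"value": excluded_path_pattern}}}],
                # Optional clause: only raises the score of exact term hits.
                "should": [{"term": {"title": {"value": text, "boost": boost}}}],
            }
        },
    }


body = exact_first_query("myterm", "/path/*")
print(body["query"]["bool"]["should"][0]["term"]["title"]["boost"])  # 10
```

Note that term is not analyzed, so on an analyzed title field the value has to equal an indexed token (typically lowercase) for the boost to apply.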
