How do I combine elasticsearch filter queries with OR using the Java API - elasticsearch

I have these fields:
country (can be "DK", "US", "UK" and so on)
media ("book", "ebook", "cd")
state ("active", "inactive")
I would like to search for all documents that have country="DK" AND ((media="book" AND state="inactive") OR (media="ebook" AND state="active"))
I am creating a BoolQueryBuilder like this:
BoolQueryBuilder bqb = QueryBuilders
.boolQuery()
.must(QueryBuilders.termsQuery("country", "DK"));
bqb.filter(QueryBuilders.termsQuery("media", "book"));
bqb.filter(QueryBuilders.termsQuery("state", "inactive"));
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(bqb)
.build();
From what I understand from the Stack Overflow question "elasticsearch bool query combine must with OR", I should create a query that looks like this:
{
"query": {
"bool": {
"must": [
{
"term": {"country": "DK"}
},
{
"bool": {
"should": [
{
"bool": {
"must": [
{"term": {"media": "book"}},
{"term": {"state": "inactive"}}
]
}
},
{
"bool": {
"must": [
{"term": {"media": "ebook"}},
{"term": {"state": "active"}}
]
}
}
]
}
}
]
}
}
}
Is this correct?
How do I do this with the Java API?

After some trial and error this seems to work:
BoolQueryBuilder bqb = QueryBuilders
.boolQuery()
.must(QueryBuilders.termsQuery("country", "DK"));
BoolQueryBuilder query1 = QueryBuilders.boolQuery();
query1.filter(QueryBuilders.termsQuery("media", "book"));
query1.filter(QueryBuilders.termsQuery("state", "inactive"));
BoolQueryBuilder query2 = QueryBuilders.boolQuery();
query2.filter(QueryBuilders.termsQuery("media", "ebook"));
query2.filter(QueryBuilders.termsQuery("state", "active"));
bqb.filter(QueryBuilders.boolQuery().should(query1).should(query2));
NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
.withQuery(bqb)
.build();
And it generated this query:
{
"bool" : {
"must" : [
{
"terms" : {
"country" : [
"DK"
],
"boost" : 1.0
}
}
],
"filter" : [
{
"bool" : {
"should" : [
{
"bool" : {
"filter" : [
{
"terms" : {
"media" : [
"book"
],
"boost" : 1.0
}
},
{
"terms" : {
"state" : [
"inactive"
],
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
{
"bool" : {
"filter" : [
{
"terms" : {
"media" : [
"ebook"
],
"boost" : 1.0
}
},
{
"terms" : {
"state" : [
"active"
],
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
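To double-check that the nested builders express the intended condition, the same logic can be modelled in plain Java with predicates. This is only a sketch of the boolean structure, not Elasticsearch API code; the class name MediaFilterModel and the helper term() are made up for illustration.

```java
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of the filter logic; no Elasticsearch classes are involved.
final class MediaFilterModel {

    // Hypothetical helper: true when the document's field equals the given value.
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    // country = DK AND ((media = book AND state = inactive) OR (media = ebook AND state = active))
    static final Predicate<Map<String, String>> QUERY =
            term("country", "DK").and(
                    term("media", "book").and(term("state", "inactive"))
                            .or(term("media", "ebook").and(term("state", "active"))));

    public static void main(String[] args) {
        // Matches the first OR branch: prints true
        System.out.println(QUERY.test(Map.of("country", "DK", "media", "book", "state", "inactive")));
        // Right country, but neither branch matches: prints false
        System.out.println(QUERY.test(Map.of("country", "DK", "media", "ebook", "state", "inactive")));
        // A branch matches, but the country does not: prints false
        System.out.println(QUERY.test(Map.of("country", "US", "media", "ebook", "state", "active")));
    }
}
```

The outer and() plays the role of the top-level must/filter, the or() plays the role of the should pair, and each inner and() corresponds to query1 and query2.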

Related

Elastic Search combination of with Multiple Range, Term filters with And and Or operators

I have a filter with multiple date range filters combined with AND and OR operators. I need results that satisfy either both date range filters or any one of them.
{
"query":{
"bool" : {
"must" : [
{
"match_phrase_prefix" : {
"searchField" : {
"query" : "Adam",
"slop" : 0,
"max_expansions" : 50,
"boost" : 1.0
}
}
}
],
"filter" : [
{
"term" : {
"srvcType" : {
"value" : "FullTime",
"boost" : 1.0
}
}
},
{"range" : { "or": {"startDt": {"from" : "2010-05-16","to" : "2022-02-18","include_lower": true,"include_upper" : true,"boost" : 1.0}} }},
{"range" : { "or": {"endDt": {"from" : "2015-05-16","to" : "2022-02-18","include_lower" : true,"include_upper" : true,"boost" : 1.0}}}}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
}
When I ran the query above, I got a parsing_exception: query does not support StartDt.
{
"query":{
"bool" : {
"must" : [
{
"match_phrase_prefix" : {
"searchField" : {
"query" : "Adam",
"slop" : 0,
"max_expansions" : 50,
"boost" : 1.0
}
}
}
],
"filter" : [
{
"term" : {
"srvcType" : {
"value" : "FullTime",
"boost" : 1.0
}
}
},
{"range" : {"startDt": {"from" : "2010-05-16","to" : "2022-02-18","include_lower": true,"include_upper" : true,"boost" : 1.0}} },
{"range" : {"endDt": {"from" : "2015-05-16","to" : "2022-02-18","include_lower" : true,"include_upper" : true,"boost" : 1.0}}}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
}
If you need AND semantics for your date range filters, you can leave both range queries in the bool/filter array.
However, if you need OR semantics you can use the bool/should query, like below:
{
"query": {
"bool": {
"must": [
{
"match_phrase_prefix": {
"searchField": {
"query": "Adam",
"slop": 0,
"max_expansions": 50,
"boost": 1
}
}
}
],
"filter": [
{
"term": {
"srvcType": {
"value": "FullTime",
"boost": 1
}
}
}
],
"minimum_should_match": 1,
"should": [
{
"range": {
"startDt": {
"from": "2010-05-16",
"to": "2022-02-18",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
},
{
"range": {
"endDt": {
"from": "2015-05-16",
"to": "2022-02-18",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
}
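As a rough illustration of what "should" plus "minimum_should_match": 1 means for the two date ranges, here is a plain-Java model. It covers only the should part of the query above (the must and filter clauses are left out), and the names ShouldRangeModel, Doc, inRange and shouldMatches are hypothetical.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of the should clauses; no Elasticsearch classes are involved.
final class ShouldRangeModel {

    // Hypothetical document holding only the two date fields from the question.
    static final class Doc {
        final LocalDate startDt;
        final LocalDate endDt;

        Doc(String startDt, String endDt) {
            this.startDt = LocalDate.parse(startDt);
            this.endDt = LocalDate.parse(endDt);
        }
    }

    // Inclusive on both ends, like "include_lower": true and "include_upper": true.
    static boolean inRange(LocalDate value, String from, String to) {
        return !value.isBefore(LocalDate.parse(from)) && !value.isAfter(LocalDate.parse(to));
    }

    // The two range clauses from the should array.
    static final List<Predicate<Doc>> SHOULD = List.of(
            doc -> inRange(doc.startDt, "2010-05-16", "2022-02-18"),
            doc -> inRange(doc.endDt, "2015-05-16", "2022-02-18"));

    // A document passes when at least minimumShouldMatch of the should clauses match.
    static boolean shouldMatches(Doc doc, int minimumShouldMatch) {
        long hits = SHOULD.stream().filter(clause -> clause.test(doc)).count();
        return hits >= minimumShouldMatch;
    }

    public static void main(String[] args) {
        Doc onlyStartInRange = new Doc("2012-01-01", "2014-01-01");
        System.out.println(shouldMatches(onlyStartInRange, 1)); // OR semantics: prints true
        System.out.println(shouldMatches(onlyStartInRange, 2)); // AND semantics: prints false
    }
}
```

With a threshold of 1 the two ranges behave like an OR; raising it to the number of clauses gives the same result as keeping both ranges in the filter array.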

Why does this query cause 'too many clauses'?

I have a query with only a few 'shoulds' and 'filters', but one of the filters has a terms query with ~20,000 terms in it. Our max_terms_count is 200k but this is complaining about 'clauses'.
Caused by: org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=too_many_clauses, reason=too_many_clauses: maxClauseCount is set to 1024]
I've written queries containing terms queries with far more terms than this. Why is this query causing a 'too many clauses' error? How can I rewrite this query to get the same result without the error?
{
"query" : {
"bool" : {
"filter" : [
{
"nested" : {
"query" : {
"range" : {
"dateField" : {
"from" : "2019-12-03T21:34:30.653Z",
"to" : "2020-12-02T21:34:30.653Z",
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
},
"path" : "observed_feeds",
"ignore_unmapped" : false,
"score_mode" : "none",
"boost" : 1.0
}
}
],
"should" : [
{
"bool" : {
"filter" : [
{
"terms" : {
"ipAddressField" : [
"123.123.123.123",
"124.124.124.124",
... like 20,000 of these
],
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"adjust_pure_negative" : true,
"minimum_should_match" : "1",
"boost" : 1.0
}
}
}
Edit: one note - The reason I'm wrapping the terms query in a should -> bool is because there are times where we need to have multiple terms queries OR'd together. This happened to not be one of them.
You are hitting this with the terms query because the should clause sits outside the filter clause and therefore contributes to score calculation. That is why these terms are subject to max_clause_count. If you don't need scoring for that part, you can rewrite your query as below:
{
"query": {
"bool": {
"filter": [
{
"nested": {
"query": {
"range": {
"dateField": {
"from": "2019-12-03T21:34:30.653Z",
"to": "2020-12-02T21:34:30.653Z",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
},
"path": "observed_feeds",
"ignore_unmapped": false,
"score_mode": "none",
"boost": 1
}
},
{
"bool": {
"should": [
{
"bool": {
"filter": [
{
"terms": {
"ipAddressField": [
"123.123.123.123",
"124.124.124.124",
... like 20,000 of these
],
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
]
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
}

How to build an AND condition between should and must in an Elasticsearch bool query

Here is the sample USER document
{
"id" : "1234567",
"userId" : "testuser01",
"firstName" : "firstname",
"lastName" : "lastname",
"orgId" : "567890",
"phoneNumber" : "1234567890"
}
I want to build a search query that returns all users who belong to a particular orgId AND match the search text entered by the user in any of the fields (userId, firstName, etc.).
For example, if the search text is "first", I want all records that belong to the given orgId AND have a field containing "first".
The query I am trying is:
{
"query" : {
"bool" : {
"must" : [
{
"term" : {
"orgId.keyword" : {
"value" : "567890",
"boost" : 1.0
}
}
}
],
"should" : [
{
"simple_query_string" : {
"query" : "first*",
"fields" : [
"lastName^1.0"
],
"flags" : -1,
"default_operator" : "or",
"lenient" : false,
"analyze_wildcard" : true,
"boost" : 1.0
}
},
{
"simple_query_string" : {
"query" : "first*",
"fields" : [
"userId^1.0"
],
"flags" : -1,
"default_operator" : "or",
"lenient" : false,
"analyze_wildcard" : true,
"boost" : 1.0
}
},
{
"simple_query_string" : {
"query" : "first*",
"fields" : [
"orgId^1.0"
],
"flags" : -1,
"default_operator" : "or",
"lenient" : false,
"analyze_wildcard" : true,
"boost" : 1.0
}
},
{
"simple_query_string" : {
"query" : "first*",
"fields" : [
"firstName^1.0"
],
"flags" : -1,
"default_operator" : "or",
"lenient" : false,
"analyze_wildcard" : true,
"boost" : 1.0
}
},
{
"simple_query_string" : {
"query" : "first*",
"fields" : [
"phoneNumber^1.0"
],
"flags" : -1,
"default_operator" : "or",
"lenient" : false,
"analyze_wildcard" : true,
"boost" : 1.0
}
},
{
"simple_query_string" : {
"query" : "first*",
"fields" : [
"id^1.0"
],
"flags" : -1,
"default_operator" : "or",
"lenient" : false,
"analyze_wildcard" : true,
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"sort" : [
{
"userId.keyword" : {
"order" : "asc"
}
}
]
}
The issue I am facing is that I want an AND condition between MUST and SHOULD.
You don't need a separate simple_query_string query for each field. Instead, you can list all the fields in a single simple_query_string query and put it in must next to the orgId term, as below:
{
"query": {
"bool": {
"must": [
{
"term": {
"orgId.keyword": {
"value": "567890",
"boost": 1
}
}
},
{
"simple_query_string": {
"query": "first*",
"fields": [
"lastName^1.0",
"userId^1.0",
"orgId^1.0",
"firstName^1.0",
"phoneNumber^1.0",
"id^1.0"
]
}
}
]
}
},
"sort": [
{
"userId.keyword": {
"order": "asc"
}
}
]
}
Also, to answer the question in the title,
How to build an AND condition between should and must in an Elasticsearch bool query?
here is a sample query:
{
"query": {
"bool": {
"must": [
{
"term": {
"field1": "someval"
}
},
{
"bool": {
"should": [
{
"terms": {
"field2": [
"v1",
"v2"
]
}
},
{
"query_string": {
"query": "this AND that OR thus"
}
}
]
}
}
]
}
}
}
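Why the should group has to be nested inside must can be sketched with a small plain-Java model. It assumes the default described in the bool query documentation: minimum_should_match is 1 when a bool has should clauses but no must or filter clauses, and 0 otherwise. The class and method names here are hypothetical, and each clause is reduced to a boolean saying whether it matched a given document.

```java
import java.util.List;

// Plain-Java model of bool matching; no Elasticsearch classes are involved.
final class BoolDefaultsModel {

    // Assumed default: should clauses are required only when there is no must or filter.
    static int defaultMinimumShouldMatch(int mustOrFilterCount, int shouldCount) {
        return (shouldCount > 0 && mustOrFilterCount == 0) ? 1 : 0;
    }

    // Each list entry says whether that clause matched the document.
    static boolean boolMatches(List<Boolean> must, List<Boolean> should) {
        int minimum = defaultMinimumShouldMatch(must.size(), should.size());
        boolean allMustMatch = must.stream().allMatch(hit -> hit);
        long shouldHits = should.stream().filter(hit -> hit).count();
        return allMustMatch && shouldHits >= minimum;
    }

    public static void main(String[] args) {
        boolean orgIdMatches = true;                        // the must clause matched
        List<Boolean> textClauses = List.of(false, false);  // no should clause matched

        // Flat layout (must and should side by side): should is optional, prints true
        System.out.println(boolMatches(List.of(orgIdMatches), textClauses));

        // Nested layout (should wrapped in its own bool inside must): prints false
        boolean nestedShould = boolMatches(List.of(), textClauses);
        System.out.println(boolMatches(List.of(orgIdMatches, nestedShould), List.of()));
    }
}
```

In the flat layout a document with the right orgId is returned even when none of the text clauses match; wrapping the should clauses in their own bool makes at least one of them mandatory.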

How to combine "must" and "should" in Elasticsearch query?

I need to "translate" this pseudo-SQL query into Elasticsearch query DSL:
select * from invoice where invoiceType = 'REGULAR' and receiver = 'CUSTOMER' and (invoiceStatus = 'DISPATCHED' or invoiceStatus = 'PAYED')
I have this:
{
"query": {
"bool": {
"must": [
{ "match": { "invoiceType": "REGULAR" }},
{ "match": { "receiver": "CUSTOMER" }},
{ "bool" : {
"should": [
{"match" : {"invoiceStatus": "DISPATCHED"}},
{"match" : {"invoiceStatus": "PAYED"}}
]
}
}
]
}
}
}
That query returns 0 results, but I know there are many documents that match what I'm searching for. AFAIK, must is like 'AND' and should is like 'OR'. What am I missing?
I'm not sure whether this will work for you, but it is worth a try. Note that I changed match to term.
GET /invoice/_search
{
"query" : {
"constant_score" : {
"filter" : {
"bool" : {
"must" : [
{ "term" : {"invoiceType" : "REGULAR"}},
{ "term": { "receiver": "CUSTOMER" }},
{ "bool" : {
"should" : [
{"terms": {"invoiceStatus": ["DISPATCHED","PAYED"]}}
]
}}
]
}
}
}
}
}
Or, with two separate term queries:
GET /invoice/_search
{
"query" : {
"constant_score" : {
"filter" : {
"bool" : {
"must" : [
{ "term" : {"invoiceType" : "REGULAR"}},
{ "term": { "receiver": "CUSTOMER" }},
{ "bool" : {
"should" : [
{"term": {"invoiceStatus": "DISPATCHED"}},
{"term": {"invoiceStatus": "PAYED"}}
]
}}
]
}
}
}
}
}
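For filtering purposes the two variants above select the same documents: a terms query over a list of values behaves like a should group of single-value term queries. A minimal plain-Java sketch of that equivalence, with hypothetical names and no Elasticsearch classes:

```java
import java.util.List;
import java.util.Set;

// Plain-Java model comparing one terms clause with a should group of term clauses.
final class TermsVsShouldModel {

    // Models {"terms": {"invoiceStatus": [...]}}: the value must be one of the listed values.
    static boolean termsMatches(String value, Set<String> allowed) {
        return allowed.contains(value);
    }

    // Models a should group of term clauses: at least one exact comparison must succeed.
    static boolean shouldOfTermsMatches(String value, List<String> allowed) {
        return allowed.stream().anyMatch(value::equals);
    }

    public static void main(String[] args) {
        List<String> statuses = List.of("DISPATCHED", "PAYED");
        for (String status : List.of("DISPATCHED", "PAYED", "DRAFT")) {
            // Both columns agree for every status.
            System.out.println(status + " " + termsMatches(status, Set.copyOf(statuses))
                    + " " + shouldOfTermsMatches(status, statuses));
        }
    }
}
```

Both models do exact, case-sensitive comparisons, which mirrors the switch from match to term in the answer.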

Elasticsearch bool query formation with multiple must clause

I have a query like the following -
{
"query": {
"bool": {
"must": {
"bool" : { "should": [
{ "match": { "camp_id": "Elasticsearch" }},
{ "match": { "camp_id": "Solr" }} ] }
},
"must": {
"bool" : { "should": [
{ "match": { "ad_id": "Elastic" }},
{ "match": { "ad_id": "dummy" }} ] }
},
"must_not": { "match": {"authors": "radu gheorge" }},
.....
.....
}
}
}
In short, (camp_id = 'elasticsearch' or camp_id = 'solr') AND (ad_id = 'elasticsearch' or ad_id = 'solr') ....
After a good amount of research, I wrote the following Java code:
final SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
final BoolQueryBuilder finalBoolQuery = new BoolQueryBuilder();
BoolQueryBuilder campaignBoolQuery = null;
if (campaignIds != null) {
campaignBoolQuery = QueryBuilders.boolQuery();
for (int campaignId : campaignIds) {
campaignBoolQuery.should(QueryBuilders.matchQuery("camp_id", campaignId));
}
}
BoolQueryBuilder creativeBoolQuery = null;
if (creativeIds != null) {
creativeBoolQuery = QueryBuilders.boolQuery();
for (int creativeId : creativeIds) {
creativeBoolQuery.should(QueryBuilders.matchQuery("ad_id", creativeId));
}
}
finalBoolQuery.must(campaignBoolQuery);
finalBoolQuery.must(creativeBoolQuery);
searchSourceBuilder.query(finalBoolQuery).size(9999);
System.out.println(searchSourceBuilder.toString());
With the above code, I expected that I would have 1 must clause for 'camp_id' and another 1 for 'ad_id' but following is what I got -
{
"size" : 9999,
"query" : {
"bool" : {
"must" : [
{
"bool" : {
"should" : [
{
"match" : {
"camp_id" : {
"query" : 1,
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"boost" : 1.0
}
}
},
{
"match" : {
"camp_id" : {
"query" : 2,
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"boost" : 1.0
}
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
{
"bool" : {
"should" : [
{
"match" : {
"ad_id" : {
"query" : 1,
"operator" : "OR",
"prefix_length" : 0,
"max_expansions" : 50,
"fuzzy_transpositions" : true,
"lenient" : false,
"zero_terms_query" : "NONE",
"boost" : 1.0
}
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
}
There is only one must clause, which wraps both camp_id and ad_id. Can someone please point out what I am missing? I am using Elasticsearch 5.5.0 and Jest 2.4.0 as my Java client.
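Your sample query's outer bool contains two must keys, but it should contain a single must clause that holds an array of queries. With duplicate keys in hand-written JSON, the second must is likely to overwrite the first (or the request may be rejected, depending on the parser). In the Java API, calling must() twice appends both queries to the same must array, so the generated output above, with one must array containing the camp_id and ad_id bool queries, already expresses (camp_id ...) AND (ad_id ...).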
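As a rough plain-Java analogy for the difference between repeating a key in a JSON object and adding entries to one array: a map keeps a single value per key, while a list keeps every entry that is added. The names below are hypothetical and the strings merely stand in for the two should groups.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java analogy; no Elasticsearch or JSON parser classes are involved.
final class DuplicateMustModel {

    // Models an object with two "must" keys stored in a map: the second put replaces the first.
    static Map<String, String> twoMustKeys() {
        Map<String, String> bool = new LinkedHashMap<>();
        bool.put("must", "camp_id should group");
        bool.put("must", "ad_id should group");
        return bool;
    }

    // Models a single "must" array: every added clause is kept.
    static List<String> oneMustArray() {
        List<String> must = new ArrayList<>();
        must.add("camp_id should group");
        must.add("ad_id should group");
        return must;
    }

    public static void main(String[] args) {
        System.out.println(twoMustKeys());   // prints {must=ad_id should group}
        System.out.println(oneMustArray());  // prints [camp_id should group, ad_id should group]
    }
}
```

A must array with two entries is therefore the shape that keeps both conditions.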
