Aggregate results based on 3 different Elasticsearch queries - elasticsearch

I have 3 different search queries coming from different sources. I want to combine them into a single query that returns the union of their results (an OR operation across the queries).
For example:
Query 1:
query: {
bool: {
filter: [
{ terms: { 'tags.keyword': ['apple', 'banana'] }},
{ terms: { 'language.keyword': ['en'] }},
]
}
}
Query 2:
query: {
bool: {
filter: [
{ terms: { 'tags.keyword': ['orange', 'mango'] }},
{ terms: { 'language.keyword': ['it'] }},
{ terms: { 'source.keyword': ['Royal Garden'] }},
]
}
}
Query 3:
query: {
bool: {
filter: [
{ terms: { 'owner.keyword': ['Dan Chunmun'] }},
{ terms: { 'language.keyword': ['en'] }},
{ terms: { 'source.keyword': ['Royal Garden'] }},
]
}
}
I want the search result to be:
Result = Query 1 OR Query 2 OR Query 3 (Union of all 3 queries)
I looked at the question "How to combine multiple bool queries in elasticsearch", but it does not explain how to merge the queries.
I tried using a should clause but have not been able to get the expected result so far.
I tried combining the bool parts of the queries above like this:
// collect the query part of each search body
const boolTerms: any[] = [];
Queries.forEach(q => {
boolTerms.push(q.query);
});
// combined query
filter : {
bool: {
should: boolTerms
}
}

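There are two ways to combine the queries.
The first is the query string query, which is designed for this kind of use case: each clause is written in string form and the clauses are joined with boolean operators.
Let A stand for the clause tags.keyword: ['apple', 'banana'], and use one letter for each of the other clauses in the same way.
You can then combine them like this (the letters are placeholders for the real clauses, and the operators AND and OR must be written in uppercase):
{
"query": {
"query_string": {
"query": "(A AND B) OR (D AND E AND F) OR (G AND H AND E)"
}
}
}
Here A and B are the clauses of Query 1, and the other groups are the clauses of Query 2 and Query 3.
However, the query string query is a full-text query, so analyzers are applied to the query terms. For your case a bool query is the better choice, since it can combine term-level queries as well.
Here is an example based on the first two queries; Query 3 can be added as another entry in should in the same way: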
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"terms": {
"tags.keyword": ["apple", "banana"]
}
},
{
"terms": {
"language.keyword": ["en"]
}
}
]
}
},
{
"bool": {
"must": [
{
"terms": {
"tags.keyword": ["orange", "mango"]
}
},
{
"terms": {
"language.keyword": ["it"]
}
}
]
}
}
]
}
}
}
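The same combination can be built programmatically when the individual queries arrive from different sources. The following is a minimal Python sketch, assuming each incoming search body is a dict with a top-level "query" key as in the question. The helper name combine_with_or is hypothetical and is not part of any Elasticsearch client.

```python
from typing import Any, Dict, List


def combine_with_or(bodies: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return one search body that matches the union of the given bodies.

    Each element of `bodies` is assumed to look like {"query": {...}}.
    """
    # Take the complete query of each body (including its "bool" wrapper),
    # so every entry in "should" is itself a valid query clause.
    clauses = [body["query"] for body in bodies]
    return {
        "query": {
            "bool": {
                "should": clauses,
                # Stated explicitly: at least one sub-query has to match.
                "minimum_should_match": 1,
            }
        }
    }


query_1 = {"query": {"bool": {"filter": [
    {"terms": {"tags.keyword": ["apple", "banana"]}},
    {"terms": {"language.keyword": ["en"]}},
]}}}
query_2 = {"query": {"bool": {"filter": [
    {"terms": {"tags.keyword": ["orange", "mango"]}},
    {"terms": {"language.keyword": ["it"]}},
    {"terms": {"source.keyword": ["Royal Garden"]}},
]}}}

combined = combine_with_or([query_1, query_2])
```

Each original query is kept intact as one should entry, so the AND logic inside every query is preserved while the entries themselves are ORed.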

Related

ElasticSearch / OpenSearch term search with logical OR

I have been scratching my head for a while looking at the OpenSearch documentation and Stack Overflow questions. How can I do something like this:
Select documents WHERE studentId in [1234, 5678] OR applicationId in [2468, 1357].
As long as studentId exactly matches one of the supplied values, or applicationId exactly matches one of the supplied values, then that document should be included in the response.
When I want to search for multiple values for a single field and get an exact match the following works:
{
"must":[
{
"terms": {
"studentId":["1234", "5678"]
}
}
]
}
This will find me exact matches on studentId in [1234, 5678].
If I try to add the condition to also look for (logical or) applicationId in [2468, 1357] then the following will not work:
{
"must":[
{
"terms": {
"studentId":["1234", "5678"]
}
},
{
"terms": {
"applicationId":["2468", "1357"]
}
}
]
}
because this will do a logical and on the two queries. I want logical or.
I cannot use should because this returns irrelevant results. The following does not work for me:
{
"should":[
{
"terms": {
"studentId":["1234", "5678"]
}
},
{
"terms": {
"applicationId":["2468", "1357"]
}
}
]
}
This seems to return all results, ranked by relevance. I find that the returned results do not actually match, despite the fact that this is a terms search.
Can you try the following query? It gives each condition its own bool clause inside a top-level should:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"terms": {
"studentId":["1234", "5678"]
}
}
]
}
},
{
"bool": {
"must": [
{
"terms": {
"applicationId":["2468", "1357"]
}
}
]
}
}
]
}
}
}
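The same shape can be generated for any number of fields. Below is a minimal Python sketch, assuming the fields hold exact-match (keyword or numeric) values; terms_any_of is a hypothetical helper name. It sets minimum_should_match explicitly, because a bool query that also contains must or filter clauses defaults to 0 required should matches, which would make the should clauses optional.

```python
from typing import Any, Dict, List


def terms_any_of(values_by_field: Dict[str, List[str]]) -> Dict[str, Any]:
    """Build `field1 IN (...) OR field2 IN (...)` as a bool/should query."""
    should = [
        {"terms": {field: values}}
        for field, values in values_by_field.items()
    ]
    # minimum_should_match=1 keeps the OR mandatory even if must/filter
    # clauses are added to this bool query later.
    return {"query": {"bool": {"should": should, "minimum_should_match": 1}}}


body = terms_any_of({
    "studentId": ["1234", "5678"],
    "applicationId": ["2468", "1357"],
})
```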

How To Combine Multiple Queries In ElasticSearch

I am having trouble combining Elasticsearch queries correctly. In SQL my query would look something like this:
Select * from ABPs where (PId = 10 and PUId = 1130) or (PId = 30 and PUId = 2000) or (PlayerID = '12345')
I can achieve each of these by themselves and get correct results.
Query A) (PId = 10 and PUId = 1130)
translates to
{
"query": {
"bool": {
"must": [
{
"term": {
"PId": "10"
}
},
{
"term": {
"PUId": "1130"
}
}
]
}
}
}
Query B) (PId = 30 and PUId = 2000)
translates the same way as above, just with different values
Query C) (PlayerID = '12345')
translates to
{
"query": {
"match": {
"PlayerUuid": "62fe0832-7881-477c-88bb-9cbccdbfb3c3"
}
}
}
I have been trying to figure out how to get all of these into the same ES search query and am not having any luck. I was hoping someone with more extensive ES experience could give me a hand.
You can use a bool query with should (logical OR) and must (logical AND) clauses.
Below is the ES query representation of the clause Select * from ABPs where (PId = 10 and PUId = 1130) or (PId = 30 and PUId = 2000) or (PlayerID = '12345')
POST <your_index_name>/_search
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"term": {
"PId": "10"
}
},
{
"term": {
"PUId": {
"value": "1130"
}
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"PId": "30"
}
},
{
"term": {
"PUId": "2000"
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"PlayerId": "12345"
}
}
]
}
}
]
}
}
}
Note that I'm assuming the fields PId, PUId and PlayerId are all of type keyword.
Wrap all your queries into a should-clause of a bool-query which you put in the filter-clause of another top-level bool-query.
Pseudo-code (as I’m typing on a cell phone):
"bool": {
"filter": {
"bool": {
"should": [
{query1},
{query2},
{query3}
]
}
}
}
A bool query made up of only should clauses requires at least one of the queries in the should clause to match (minimum_should_match defaults to 1 in such a scenario).
Update with the actual query (additional explanation):
POST <your_index_name>/_search
{
"query": {
"bool": {
"filter": {
"bool": {
"should": [
{"bool": {"must": [ {"term": {"PId": "10"}},{"term": {"PUId": "1130"}} ]}},
{"bool": {"must": [ {"term": {"PId": "30"}},{"term": {"PUId": "2000"}} ]}},
{"term": {"PlayerId": "12345"}}
]
}
}
}
}
}
The example above wraps your actual bool query in a filter clause of another top-level bool query. This follows best practice and gives better performance: whenever you don't care about the score, especially with exact-match queries, you should put them into filter clauses. For those queries Elasticsearch does not calculate a score and can therefore potentially cache the results of the query.
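For a varying number of (A and B) or (C and D) groups, the filter-wrapped query can be generated instead of written by hand. This is a minimal Python sketch, assuming every group is a non-empty mapping of keyword field to exact value; or_of_and_groups is a hypothetical helper name.

```python
from typing import Any, Dict, List


def or_of_and_groups(groups: List[Dict[str, str]]) -> Dict[str, Any]:
    """Each group is ANDed internally; the groups are ORed together.

    Every group is assumed to contain at least one field.
    """
    should = []
    for group in groups:
        terms = [{"term": {field: value}} for field, value in group.items()]
        # A single condition needs no bool wrapper of its own.
        should.append(terms[0] if len(terms) == 1 else {"bool": {"must": terms}})
    # The outer filter clause runs the whole OR without scoring.
    return {"query": {"bool": {"filter": {"bool": {"should": should}}}}}


body = or_of_and_groups([
    {"PId": "10", "PUId": "1130"},
    {"PId": "30", "PUId": "2000"},
    {"PlayerId": "12345"},
])
```

Because the inner bool query contains only should clauses, at least one group has to match for a document to pass the filter.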

Elasticsearch: Full match of query value from beginning to end of field

I have a problem with querying for a full match of a field value. title and gender are fields of the indexed docs.
query: {
query_string: {
query: "box AND gender:\"women\"",
default_field: "title"
}
}
I use double quotes to match the full query for gender. But a doc with gender "men,women" and title 'box' will also be in the results. I know that Elasticsearch does not support the regexp characters ^ and $ for the beginning and end of the string, so I couldn't use /^women$/.
What do I need to do if I want docs matching only the 'women' gender, not 'men,women'?
For exact searches you should use a term query rather than a full-text query like query_string. To get all documents where gender is exactly women, you can do it like so:
GET your-index/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"gender.keyword": {
"value": "women"
}
}
}
]
}
}
}
Please be aware that this query assumes that the gender field also has a keyword sub-field (gender.keyword).
To complete the query, you would add another must clause to get all documents that have box in the title field and women as the value of the gender field.
GET your-index/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"gender.keyword": {
"value": "women"
}
}
},
{
"match": {
"title": "box"
}
}
]
}
}
}
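The reason the original query also returned "men,women" can be illustrated with a toy model of how the two field types index a value. This is a rough Python sketch, assuming gender is a text field with a keyword sub-field; the regular expression only approximates what an analyzer does and is not the actual Elasticsearch standard analyzer.

```python
import re
from typing import List


def analyzed_tokens(value: str) -> List[str]:
    """Rough stand-in for a text field: lowercase, split on non-word characters."""
    return re.findall(r"\w+", value.lower())


def keyword_tokens(value: str) -> List[str]:
    """A keyword field indexes the whole value as one unmodified token."""
    return [value]


stored = "men,women"
# Full-text lookup: "women" is one of the indexed tokens, so the doc matches.
matches_text = "women" in analyzed_tokens(stored)
# Exact lookup on the keyword field: only the full string "men,women" is indexed.
matches_keyword = "women" in keyword_tokens(stored)
```

Under this model the text field matches and the keyword field does not, which is the behavior the term query on gender.keyword relies on.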
Thank you apt-get_install_skill. Keyword did the job, with a few additions.
In summary, this is the solution:
query: {
bool: {
must: {
query_string: {
query: "box",
default_field: "title"
}
},
filter: {
bool: {
should: [
{term: {"gender.keyword": "women"}}
]
}
}
}
}
I use should as an array so that I can search for multiple genders if I need to. For example, some docs have a unisex gender, such as 'women,men'.
Example with multiple genders:
query: {
bool: {
must: {
query_string: {
query: "box",
default_field: "title"
}
},
filter: {
bool: {
should: [
{term: {"gender.keyword": "women"}},
{term: {"gender.keyword": "kids"}}
// for example, it could also be gender 'girls'
]
}
}
}
}

ElasticSearch NEST combining AND with OR queries

Problem
How do you write NEST code to generate an Elasticsearch query for this simple boolean logic?
term1 && (term2 || term3 || term4)
Pseudocode for the logic I am implementing with NEST (5.2) to query Elasticsearch (5.2):
// additional requirements
( truckOemName = "HYSTER" && truckModelName = "S40FT" && partCategoryCode = "RECO" && partID != "")
//Section I can't get working correctly
AND (
( SerialRangeInclusiveFrom <= "F187V-6785D" AND SerialRangeInclusiveTo >= "F187V-6060D" )
OR
( SerialRangeInclusiveFrom = "" || SerialRangeInclusiveTo = "" )
)
Interpretation of Related Documentation
The "Combining queries with || or should clauses" in Writing Bool Queries mentions
The bool query does not quite follow the same boolean logic you expect from a programming language. term1 && (term2 || term3 || term4) does not become
bool
|___must
| |___term1
|
|___should
|___term2
|___term3
|___term4
you could get back results that only contain term1
which is exactly what I think is happening.
But their explanation of how to solve this is beyond my understanding of how to apply it with NEST. Is the answer one of these?
Add parentheses to force evaluation order (I already do)
Use a boost factor? (what?)
Code
Here's the NEST code
var searchDescriptor = new SearchDescriptor<ElasticPart>();
var terms = new List<Func<QueryContainerDescriptor<ElasticPart>, QueryContainer>>
{
s =>
(s.TermRange(r => r.Field(f => f.SerialRangeInclusiveFrom)
.LessThanOrEquals(dataSearchParameters.SerialRangeEnd))
&&
s.TermRange(r => r.Field(f => f.SerialRangeInclusiveTo)
.GreaterThanOrEquals(dataSearchParameters.SerialRangeStart)))
//None of the data that matches these ORs returns with the query this code generates, below.
||
(!s.Exists(exists => exists.Field(f => f.SerialRangeInclusiveFrom))
||
!s.Exists(exists => exists.Field(f => f.SerialRangeInclusiveTo))
)
};
//Terms is the piece in question
searchDescriptor.Query(s => s.Bool(bq => bq.Filter(terms))
&& !s.Terms(term => term.Field(x => x.OemID)
.Terms(RulesHelper.GetOemExclusionList(exclusions))));
searchDescriptor.Aggregations(a => a
.Terms(aggPartInformation, t => t.Script(s => s.Inline(script)).Size(50000))
);
searchDescriptor.Type(string.Empty);
searchDescriptor.Size(0);
var searchResponse = ElasticClient.Search<ElasticPart>(searchDescriptor);
Here's the ES JSON query it generates
{
"query":{
"bool":{
"must":[
{
"term":{ "truckOemName": { "value":"HYSTER" }}
},
{
"term":{ "truckModelName": { "value":"S40FT" }}
},
{
"term":{ "partCategoryCode": { "value":"RECO" }}
},
{
"bool":{
"should":[
{
"bool":{
"must":[
{
"range":{ "serialRangeInclusiveFrom": { "lte":"F187V-6785D" }}
},
{
"range":{ "serialRangeInclusiveTo": { "gte":"F187V-6060D" }}
}
]
}
},
{
"bool":{
"must_not":[
{
"exists":{ "field":"serialRangeInclusiveFrom" }
}
]
}
},
{
"bool":{
"must_not":[
{
"exists":{ "field":"serialRangeInclusiveTo" }
}
]
}
}
]
}
},
{
"exists":{
"field":"partID"
}
}
]
}
}
}
Here's the query we'd like it to generate that seems to work.
{
"query": {
"bool": {
"must": [
{
"bool": {
"must": [
{
"term": { "truckOemName": { "value": "HYSTER" }}
},
{
"term": {"truckModelName": { "value": "S40FT" }}
},
{
"term": {"partCategoryCode": { "value": "RECO" }}
},
{
"exists": { "field": "partID" }
}
],
"should": [
{
"bool": {
"must": [
{
"range": { "serialRangeInclusiveFrom": {"lte": "F187V-6785D"}}
},
{
"range": {"serialRangeInclusiveTo": {"gte": "F187V-6060D"}}
}
]
}
},
{
"bool": {
"must_not": [
{
"exists": {"field": "serialRangeInclusiveFrom"}
},
{
"exists": { "field": "serialRangeInclusiveTo"}
}
]
}
}
]
}
}
]
}
}
}
Documentation
Combining Filters
Bool Query
Writing Bool Queries
With overloaded operators for bool queries, it is not possible to express a must clause combined with a should clause i.e.
term1 && (term2 || term3 || term4)
becomes
bool
|___must
|___term1
|___bool
|___should
|___term2
|___term3
|___term4
which is a bool query with two must clauses where the second must clause is a bool query where there has to be a match for at least one of the should clauses. NEST combines the queries like this because it matches the expectation for boolean logic within .NET.
If it did become
bool
|___must
| |___term1
|
|___should
|___term2
|___term3
|___term4
a document is considered a match if it satisfies only the must clause. The should clauses in this case act as a boost i.e. if a document matches one or more of the should clauses in addition to the must clause, then it will have a higher relevancy score, assuming that term2, term3 and term4 are queries that calculate a relevancy score.
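The difference between the two shapes can be checked with a toy matcher. This is a minimal Python sketch, not Elasticsearch or NEST code: a document is modeled as the set of term names it contains, {"term": "term1"} is a simplified stand-in for a real term query, and the default for minimum_should_match follows the documented rule (1 when the bool query has no must or filter clause, otherwise 0).

```python
from typing import Any, Dict, Set

Query = Dict[str, Any]


def matches(query: Query, doc_terms: Set[str]) -> bool:
    """Toy matcher: a document is just the set of term names it contains."""
    if "term" in query:
        return query["term"] in doc_terms
    clauses = query["bool"]
    must = clauses.get("must", []) + clauses.get("filter", [])
    should = clauses.get("should", [])
    must_not = clauses.get("must_not", [])
    # Default: should is required only when there is no must or filter clause.
    default_min = 1 if should and not must else 0
    minimum = clauses.get("minimum_should_match", default_min)
    return (
        all(matches(q, doc_terms) for q in must)
        and not any(matches(q, doc_terms) for q in must_not)
        and sum(matches(q, doc_terms) for q in should) >= minimum
    )


any_of_2_3_4 = [{"term": "term2"}, {"term": "term3"}, {"term": "term4"}]

# Form produced by the overloaded operators: should nested inside must.
nested = {"bool": {"must": [{"term": "term1"}, {"bool": {"should": any_of_2_3_4}}]}}
# Flat form: must and should side by side in the same bool query.
flat = {"bool": {"must": [{"term": "term1"}], "should": any_of_2_3_4}}

only_term1 = {"term1"}
nested_result = matches(nested, only_term1)  # False: the inner should is required
flat_result = matches(flat, only_term1)      # True: should only affects scoring
```

A document containing only term1 is rejected by the nested form and accepted by the flat form, which is why the nested form is the one that expresses term1 && (term2 || term3 || term4).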
On this basis, the query that you would like to generate expresses that for a document to be considered a match, it must match all of the 4 queries in the must clause
"must": [
{
"term": { "truckOemName": { "value": "HYSTER" }}
},
{
"term": {"truckModelName": { "value": "S40FT" }}
},
{
"term": {"partCategoryCode": { "value": "RECO" }}
},
{
"exists": { "field": "partID" }
}
],
then, for documents matching the must clauses, if
it has a serialRangeInclusiveFrom less than or equal to "F187V-6785D" and a serialRangeInclusiveTo greater than or equal to "F187V-6060D"
or
it has neither a serialRangeInclusiveFrom nor a serialRangeInclusiveTo field
then boost that document's relevancy score. The crucial point is that
If a document matches the must clauses but does not match any
of the should clauses, it will still be a match for the query (but
have a lower relevancy score).
If that is the intent, this query can be constructed using the longer form of the Bool query

In Elasticsearch how to use multiple term filters when number of terms are not fixed they can vary?

I know that multiple term filters should be combined with a bool filter, but my problem is that I don't know in advance how many terms there will be. For example, I want to filter results on any of the strings ("aa", "bb", "cc", "dd", "ee") with OR, so a result should match if it contains any of them. Sometimes this array has 10 entries, sometimes 15 or 20. How can I handle a variable number of terms in the filter? My code is given below.
var stores = docs.stores; // **THIS IS MY ARRAY OF STRINGS**
client.search({
index: 'merchants',
type: shop_type,
body: {
query: {
filtered: {
filter: {
bool: {
must: [
{
// term: { 'jeb_no': stores }, // HERE HOW TO FILTER ALL ARRAY STRINGS WITH OR CONDITION
}
]
}
}
}
}, script_fields : {
"area": {
"script" : "doc['address.area2']+doc['address.area1']"
}
}
}
})
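I think this will do: use terms instead of term. The terms query accepts an array of values and matches a document if the field equals any of them, so the length of the array does not matter.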
{
"query": {
"bool": {
"must": [
{
"terms": {
"jeb_no": stores
}
}
]
}
}
}
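Because terms takes the whole list, the query body has the same shape regardless of how many values are supplied. A minimal Python sketch, assuming jeb_no is an exact-match field and that scoring is not needed, so the clause is placed in filter rather than must; stores_filter is a hypothetical helper name.

```python
from typing import Any, Dict, List


def stores_filter(stores: List[str]) -> Dict[str, Any]:
    """Match documents whose jeb_no equals any of the given values."""
    return {
        "query": {
            "bool": {
                # terms takes the whole list; its length does not change the query shape.
                "filter": [{"terms": {"jeb_no": list(stores)}}]
            }
        }
    }


body = stores_filter(["aa", "bb", "cc", "dd", "ee"])
```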
