Elasticsearch bool query: combine must with OR

I am currently trying to migrate a solr-based application to elasticsearch.
I have this lucene query:
((
name:(+foo +bar)
OR info:(+foo +bar)
)) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)
As far as I understand this is a combination of must clauses combined with boolean OR:
Get all documents containing (foo AND bar in name) OR (foo AND bar in info). After that filter results by condition state=1 and boost documents that have an image.
I have been trying to use a bool query with must but I am failing to get boolean OR into must clauses. Here is what I have:
GET /test/object/_search
{
"from": 0,
"size": 20,
"sort": {
"_score": "desc"
},
"query": {
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"match": {
"name": "bar"
}
}
],
"must_not": [],
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
}
}
}
As you can see, must conditions for info are missing.
** UPDATE **
I have updated my elasticsearch query and got rid of that function score. My base problem still exists.

OR is spelled should
AND is spelled must
NOR is spelled must_not
Example:
You want to see all the items that are (round AND (red OR blue)):
{
"query": {
"bool": {
"must": [
{
"term": {"shape": "round"}
},
{
"bool": {
"should": [
{"term": {"color": "red"}},
{"term": {"color": "blue"}}
]
}
}
]
}
}
}
You can also do more complex versions of OR. For example, if you want to match at least 3 out of 5, you can specify 5 options under "should" and set "minimum_should_match" to 3.
Thanks to Glen Thompson and Sebastialonso for finding where my nesting wasn't quite right before.
Thanks also to Fatmajk for pointing out that you may need "match" instead of "term" in Elasticsearch version 6, for example when the field is an analyzed text field.
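To make the must/should/must_not semantics concrete, here is a minimal sketch in Python that evaluates a tiny subset of the DSL (only "term" and "bool") against plain dicts. The function name `matches` and the sample documents are made up for illustration; this is not how Elasticsearch is implemented, and it only handles integer values of minimum_should_match.

```python
def matches(query, doc):
    """Evaluate a tiny subset of the bool/term DSL against one flat dict."""
    kind, body = next(iter(query.items()))
    if kind == "term":
        field, value = next(iter(body.items()))
        return doc.get(field) == value
    if kind == "bool":
        required = body.get("must", []) + body.get("filter", [])
        optional = body.get("should", [])
        excluded = body.get("must_not", [])
        # Default: one should clause has to match only when there is no must/filter.
        default_min = 1 if optional and not required else 0
        needed = body.get("minimum_should_match", default_min)
        hits = sum(1 for clause in optional if matches(clause, doc))
        return (
            all(matches(clause, doc) for clause in required)
            and not any(matches(clause, doc) for clause in excluded)
            and hits >= needed
        )
    raise ValueError(f"unsupported query type: {kind}")


# round AND (red OR blue), same shape as the JSON example above
round_and_red_or_blue = {
    "bool": {
        "must": [
            {"term": {"shape": "round"}},
            {"bool": {"should": [
                {"term": {"color": "red"}},
                {"term": {"color": "blue"}},
            ]}},
        ]
    }
}

# "at least 2 of 3" using minimum_should_match
two_of_three = {
    "bool": {
        "should": [
            {"term": {"a": 1}},
            {"term": {"b": 1}},
            {"term": {"c": 1}},
        ],
        "minimum_should_match": 2,
    }
}

print(matches(round_and_red_or_blue, {"shape": "round", "color": "red"}))    # True
print(matches(round_and_red_or_blue, {"shape": "round", "color": "green"}))  # False
print(matches(two_of_three, {"a": 1, "b": 1, "c": 0}))                       # True
```

The nested bool has no must or filter of its own, so one of its should clauses has to match. That is what turns it into an OR inside the outer AND.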

I finally managed to create a query that does exactly what I wanted:
A filtered nested boolean query.
I am not sure why this is not documented. Maybe someone here can tell me?
Here is the query:
GET /test/object/_search
{
"from": 0,
"size": 20,
"sort": {
"_score": "desc"
},
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"state": 1
}
}
]
}
},
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": {
"name": "foo"
}
},
{
"match": {
"name": "bar"
}
}
],
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
}
},
{
"bool": {
"must": [
{
"match": {
"info": "foo"
}
},
{
"match": {
"info": "bar"
}
}
],
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
}
}
],
"minimum_should_match": 1
}
}
}
}
}
In pseudo-SQL:
SELECT * FROM /test/object
WHERE
((name=foo AND name=bar) OR (info=foo AND info=bar))
AND state=1
Please keep in mind that how name=foo is handled internally depends on your document field analysis and mappings. The behavior can vary from fuzzy to strict.
"minimum_should_match": 1 says that at least one of the should clauses must match.
The following should clause means that any document in the result set with has_image:1 is boosted by a factor of 100, which changes the result ordering:
"should": [
{
"match": {
"has_image": {
"query": 1,
"boost": 100
}
}
}
]
Have fun guys :)
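For anyone on a newer version: the "filtered" query was deprecated in Elasticsearch 2.0 and removed in 5.0. Below is a sketch of the same logic with a plain bool query and a "filter" clause, built as a Python dict only so the structure can be checked without a cluster. The helper name `field_has_all` is made up for this example, and I have not run the result against the original index.

```python
import json


def field_has_all(field, words):
    """bool/must over one match clause per word, i.e. field:(+foo +bar)."""
    return {"bool": {"must": [{"match": {field: word}} for word in words]}}


words = ["foo", "bar"]

body = {
    "from": 0,
    "size": 20,
    "query": {
        "bool": {
            # (name has foo AND bar) OR (info has foo AND bar)
            "must": [
                {
                    "bool": {
                        "should": [
                            field_has_all("name", words),
                            field_has_all("info", words),
                        ],
                        "minimum_should_match": 1,
                    }
                }
            ],
            # state = 1, applied without affecting the score
            "filter": [{"term": {"state": 1}}],
            # optional clause: only boosts documents that have an image
            "should": [{"match": {"has_image": {"query": 1, "boost": 100}}}],
        }
    },
}

print(json.dumps(body, indent=2))
```

Because the outer bool has a must clause, its should clause is optional and only influences scoring, so the has_image boost no longer has to be repeated in each branch.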

This is how you can nest multiple bool queries inside one outer bool query, shown here as typed into Kibana:
bool indicates that we are using a boolean query
must is for AND
should is for OR
GET my_index/my_type/_search
{
"query" : {
"bool": { //bool indicates we are using boolean operator
"must" : [ //must is for **AND**
{
"match" : {
"description" : "some text"
}
},
{
"match" :{
"type" : "some Type"
}
},
{
"bool" : { //here its a nested boolean query
"should" : [ //should is for **OR**
{
"match" : {
// your query goes here
}
},
{
"match" : {}
}
]
}
}
]
}
}
}
This is how you can nest a query in Elasticsearch.
A "bool" query accepts more clause types, such as:
filter
must_not

I recently had to solve this problem too, and after a LOT of trial and error I came up with this (in PHP, but maps directly to the DSL):
'query' => [
'bool' => [
'should' => [
['prefix' => ['name_first' => $query]],
['prefix' => ['name_last' => $query]],
['prefix' => ['phone' => $query]],
['prefix' => ['email' => $query]],
[
'multi_match' => [
'query' => $query,
'type' => 'cross_fields',
'operator' => 'and',
'fields' => ['name_first', 'name_last']
]
]
],
'minimum_should_match' => 1,
'filter' => [
['term' => ['state' => 'active']],
['term' => ['company_id' => $companyId]]
]
]
]
Which maps to something like this in SQL:
SELECT * from <index>
WHERE (
name_first LIKE '<query>%' OR
name_last LIKE '<query>%' OR
phone LIKE '<query>%' OR
email LIKE '<query>%'
)
AND state = 'active'
AND company_id = <companyId>
The key in all this is the minimum_should_match setting. Without it, the should clauses become optional as soon as a filter is present, so the filter alone decides which documents match.
Hope this helps someone!
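Here is the same body as a small Python function, in case the PHP array syntax gets in the way. It is only a sketch: the function name `build_contact_search` is made up, and the field names are taken from the PHP snippet above.

```python
import json


def build_contact_search(text, company_id):
    """Prefix-OR over several fields, restricted by two filters."""
    prefix_fields = ["name_first", "name_last", "phone", "email"]
    should = [{"prefix": {field: text}} for field in prefix_fields]
    should.append(
        {
            "multi_match": {
                "query": text,
                "type": "cross_fields",
                "operator": "and",
                "fields": ["name_first", "name_last"],
            }
        }
    )
    return {
        "query": {
            "bool": {
                "should": should,
                # Needed because the filter below makes should optional by default.
                "minimum_should_match": 1,
                "filter": [
                    {"term": {"state": "active"}},
                    {"term": {"company_id": company_id}},
                ],
            }
        }
    }


print(json.dumps(build_contact_search("jo", 42), indent=2))
```

If you drop minimum_should_match here, every active document of the company matches, and the should clauses only affect ordering.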

If you were using Solr's default or Lucene query parser, you can pretty much always put it into a query string query:
POST test/_search
{
"query": {
"query_string": {
"query": "(( name:(+foo +bar) OR info:(+foo +bar) )) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)"
}
}
}
That said, you may want to use a boolean query, like the one you already posted, or even a combination of the two.

$filterQuery = $this->queryFactory->create(QueryInterface::TYPE_BOOL, ['must' => $queries,'should'=>$queriesGeo]);
In must, add the array of query conditions that should be combined with AND; in should, add the conditions that should be combined with OR.
You can check this: https://github.com/Smile-SA/elasticsuite/issues/972

Related

ElasticSearch lucene query with subclauses conversion to ES syntax

I've been trying to convert a lucene style query to ES query syntax but I'm getting stuck on sub-clauses. e.g.
(title:history^10 or series:history) and (NOT(language:eng) OR language:eng^5) and (isfree eq 'true' OR (isfree eq 'false' AND owned eq 'abc^5'))
This states: "get me a match for history in 'title' or 'series', but boost the title match, AND where the language doesn't have to be English, but if it is then boost it, AND where the match is free, or where it isn't free then make sure it's owned by customer abc".
I feel this is a tricky query but it seems to work correctly. Converting the clauses to ES syntax is confusing me as I don't really have the concept of brackets. I think I need to use bool queries... I have the following, which I know doesn't apply the criteria correctly: it says you should have (language:eng OR isFree eq 'true' OR owned:abc). I can't seem to make the mental leap to build the must/should with NOTs in it.
Help please?
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "history",
"fields": [
"title^10.0",
"series"
]
}
}
],
"should": [
{
"term": {
"language": {
"value": "eng",
"boost": 5
}
}
},
{
"term": {
"isFree": {
"value": true
}
}
},
{
"term": {
"owned": {
"value": "abc",
"boost": 5
}
}
}
]
}
},
Your query is almost correct, the only thing that wasn't translated correctly was this part of the query:
(isfree eq 'true' OR (isfree eq 'false' AND owned eq 'abc^5'))
If I understand your post correctly, this basically says: boost the 'owned' field by a factor of five when its value is 'abc' and the item is not free. To implement this, you need an additional bool query that:
Filters results by isFree: false
Boosts documents whose owned field matches abc
"bool": {
"filter": [
{
"term": {
"isFree": {
"value": false
}
}
}
],
"must": [
{
"term": {
"owned": {
"value": "abc",
"boost": 5
}
}
}
]
}
Since this is not intended to limit the result set, only to boost results that meet these criteria, the bool query above should be placed inside your parent bool's should section. The final query looks like:
POST /myindex/_search
{
"explain": true,
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "history",
"fields": [
"title^10",
"series"
]
}
}
],
"should": [
{
"term": {
"language": {
"value": "eng",
"boost": 5
}
}
},
{
"bool": {
"filter": [
{
"term": {
"isFree": {
"value": false
}
}
}
],
"must": [
{
"term": {
"owned": {
"value": "abc",
"boost": 5
}
}
}
]
}
}
]
}
}
}
Note: Using should or must yields the same results for that inner bool. I honestly am not sure which is better, so I arbitrarily used must.
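If the third clause of the Lucene expression is meant to restrict the result set (free, or not free but owned by abc) rather than only boost, one possible stricter translation is sketched below. This is my reading of the expression, not something confirmed in the thread, and it is built as a Python dict so the structure can be checked without a cluster. The variable names are made up.

```python
import json

# (isfree = true) OR (isfree = false AND owned = abc, boosted by 5)
access_clause = {
    "bool": {
        "should": [
            {"term": {"isFree": True}},
            {
                "bool": {
                    "filter": [{"term": {"isFree": False}}],
                    "must": [{"term": {"owned": {"value": "abc", "boost": 5}}}],
                }
            },
        ],
        "minimum_should_match": 1,
    }
}

body = {
    "query": {
        "bool": {
            "must": [
                # title:history^10 OR series:history
                {"multi_match": {"query": "history", "fields": ["title^10", "series"]}},
                # required: free, or owned by abc
                access_clause,
            ],
            # (NOT language:eng OR language:eng^5) is always true, so it only boosts
            "should": [{"term": {"language": {"value": "eng", "boost": 5}}}],
        }
    }
}

print(json.dumps(body, indent=2))
```

The difference from the query above is that documents that are neither free nor owned by abc are excluded instead of just ranked lower.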

elastic search dsl: filter is not working as it should be

I am doing a POST with below obj as body parameter which is supposed to get results and filter them according to their category:
{
"size": 100,
"query": {
"bool": {
"filter": [{"term": {"some_category": "SOME CATEGORY Value"}}
],
"should": [
{"match": {"field_1": "Some value"}},
{"match": {"field_2": "Some value"}}
]
}
}
}
If I remove the filter, the query works, but then the results are not filtered by category as expected.
Can anyone tell me where I am going wrong or tell me the query which I should be using?
This syntax is suggested in the docs as well but still, it's not working.
Here is the link which directed me to this type of query.
I think you are applying term query on 'text' field 'some_category'. You should query on the 'keyword' type of the field 'some_category' - 'some_category.keyword'
{
"size": 100,
"query": {
"bool": {
"filter": [
{
"term": {
"some_category.keyword": "SOME CATEGORY Value"
}
}
],
"should": [
{
"match": {
"field_1": "Some value"
}
},
{
"match": {
"field_2": "Some value"
}
}
]
}
}
}
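To see why the term query misses on the text field, here is a rough Python illustration. The function `rough_standard_analyze` is a made-up, heavily simplified stand-in for the standard analyzer (lowercase, split on anything that is not a letter or digit); the real analyzer does more, but the effect on this value is the same idea.

```python
import re


def rough_standard_analyze(text):
    """Crude approximation: lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


value = "SOME CATEGORY Value"

# What a `text` field stores in the index: individual lowercased tokens.
text_tokens = rough_standard_analyze(value)
# What a `keyword` sub-field stores: the whole string, unchanged.
keyword_token = value

print(text_tokens)             # ['some', 'category', 'value']
print(value in text_tokens)    # False: the term query finds no such token
print(value == keyword_token)  # True: the exact value exists in the keyword field
```

A term query does not analyze its input, so it looks for the literal string "SOME CATEGORY Value", which only exists in the keyword field. This assumes the index has a some_category.keyword sub-field, which is what dynamic mapping creates for strings by default.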

ElasticSearch query_string with filter failed to get the results

I have the following Elasticsearch query (it's usually bigger, but I stripped it down to the part that causes the issue):
'query' => [
'bool' => [
'must' => [
'query_string' => [
'query' => $keywords,
'fields' => ['title']
]
],
'filter' => [
'term' => [
'school_id' => 1
]
]
]
]
If I remove the filter it works fine, but what I want is to restrict the search to one specific school id.
Why don't you, instead of filtering the data, just query for what you need in the first place?
Filtering gives a binary result, in the sense that it tells you whether a document's school_id field is 1 or not. If you just want to get a result, there are other ways to do it as well.
In your case I believe you just "jumped" over the must and the bool, and this is why your query failed.
So you have two options. The first is to fix yours as follows:
GET /_search
{
"query": {
"bool": {
"must": {
"query_string": {
"default_field": "title",
"query": "your keywords"
}
},
"filter": {
"bool": {
"must": [
{
"term": {
"school_id": "1"
}
}
]
}
}
}
}
}
Or, if you want the school_id condition to contribute to the score as well, you can use this one:
GET /_search
{
"query": {
"bool": {
"must": [
{
"query_string": {
"default_field": "title",
"query": "your keywords"
}
},
{
"match": {
"school_id": {
"query": "1"
}
}
}
]
}
}
}
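For comparison, here is a sketch of the original PHP body with must and filter written as lists of complete clauses, built as a Python dict. The function name `build_school_search` is made up, and I am assuming school_id is indexed as a number or keyword so that a term filter applies.

```python
import json


def build_school_search(keywords, school_id):
    """query_string on title, restricted to one school via a non-scoring filter."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": keywords, "fields": ["title"]}}
                ],
                "filter": [
                    {"term": {"school_id": school_id}}
                ],
            }
        }
    }


print(json.dumps(build_school_search("algebra OR geometry", 1), indent=2))
```

If this still returns nothing while the unfiltered query works, the mapping of school_id would be the next thing I would check.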

Search from multiple nested level fields in elasticsearch

I want to search on multiple nested fields, with a query like:
select * from product where brand='brand1' and category='category1'.
In Elasticsearch I have two nested mappings: one is category and the other is brand.
If I query only brand or only category, it returns the right result, but how do I use both in the following query?
$params = [
'index' => 'my_index',
'type' => 'product',
'body' => [
"query"=>[
"filtered"=>[
"filter"=>[
"bool"=>[
"must"=>[
"bool"=>[
"must"=>[
[
"query"=>[
"match"=>[
"brand"=>[
"query"=>"brand1",
"type"=>"phrase"
]
]
]
],
[
"query"=>[
"match"=>[
"category"=>[
"query"=>"category1",
"type"=>"phrase"
]
]
]
]
]
]
]
]
]
]
]
]
];
With the above query I get 0 results.
You can try the query below; it should return the expected results:
GET /product/ur_type/_search
{
"from": 0,
"size": 200,
"query": {
"filtered": {
"filter": {
"bool": {
"must": {
"bool": {
"must": [
{
"query": {
"match": {
"brand": {
"query": "brand1",
"type": "phrase"
}
}
}
},
{
"query": {
"match": {
"category": {
"query": "category1",
"type": "phrase"
}
}
}
}
]
}
}
}
}
}
}
}
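If brand and category are really mapped with type "nested", another thing to try is the nested query, with one clause per path. The sketch below is built as a Python dict and rests on assumptions: the sub-field name `name` is hypothetical (the question only shows `brand` and `category`), and `match_phrase` is used in place of match with "type": "phrase", which is how newer versions spell it.

```python
import json


def nested_phrase(path, field, text):
    """One nested clause: phrase match on path.field inside the nested objects."""
    return {
        "nested": {
            "path": path,
            "query": {"match_phrase": {f"{path}.{field}": text}},
        }
    }


body = {
    "query": {
        "bool": {
            # both conditions must hold: brand = brand1 AND category = category1
            "filter": [
                nested_phrase("brand", "name", "brand1"),        # "name" is hypothetical
                nested_phrase("category", "name", "category1"),  # "name" is hypothetical
            ]
        }
    }
}

print(json.dumps(body, indent=2))
```

Each nested clause is evaluated against its own path, and the surrounding bool combines the two with AND.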

Can we use multiple terms condition in elasticsearch filters

Is it possible to use multiple terms condition for specific fields in bool filter?
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"terms": {
"events": [
"abc",
"def",
"ghi",
"jkl"
]
},
"terms" : {
"users" : [
"user_1",
"user_2",
"user_3"
]
}
}
]
}
}
}
}
The first terms filter works fine, but I am not able to use the second one. Please correct me if I am doing anything wrong in the above query.
You were almost there, you just forgot one pair of braces. Here's the correct query:
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"terms": {
"events": [
"abc",
"def",
"ghi",
"jkl"
]
}
},
{
"terms": {
"users": [
"user_1",
"user_2",
"user_3"
]
}
}
]
}
}
}
}
}
This will evaluate both conditions:
Your event must be one of abc/def/ghi/jkl
User must be either user_1/user_2/user_3
Basically, each terms query/filter needs to be wrapped in its own braces, and they need to be siblings.
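A quick way to see why two "terms" keys in the same object cannot both survive: a JSON object is a map, so the duplicate key collapses into one entry. The sketch below uses Python's json module, which keeps the last value. Elasticsearch's own parser may keep a different one or reject the body, so treat this only as an illustration of the problem. The variable names are made up.

```python
import json

# Two "terms" keys inside ONE object, as in the question.
broken = '{"terms": {"events": ["abc", "def"]}, "terms": {"users": ["user_1"]}}'
parsed = json.loads(broken)
print(parsed)  # {'terms': {'users': ['user_1']}}

# Each terms clause in its own object, as siblings in the must array.
fixed_must = [
    {"terms": {"events": ["abc", "def", "ghi", "jkl"]}},
    {"terms": {"users": ["user_1", "user_2", "user_3"]}},
]
print(len(fixed_must))  # 2
```

With the clauses as separate array elements, both conditions reach the server and are combined with AND.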
