Can we use multiple terms condition in elasticsearch filters - elasticsearch

Is it possible to use multiple terms condition for specific fields in bool filter?
query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"terms": {
"events": [
"abc",
"def",
"ghi",
"jkl"
]
},
"terms" : {
"users" : [
"user_1",
"user_2",
"user_3"
]
}
}
]
}
}
}
}
First terms filter is working fine, but i am not able to use second terms, Please correct if i am doing anything wrong with the above query.

You were almost there, you forgot one brace. Here's correct query:
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"terms": {
"events": [
"abc",
"def",
"ghi",
"jkl"
]
}
},
{
"terms": {
"users": [
"user_1",
"user_2",
"user_3"
]
}
}
]
}
}
}
}
}
This will evaluate both conditions:
Your event must be one of abc/def/ghi/jkl
User must be either user_1/user_2/user_3
Basicly each terms query/filter needs to be wrapped up in its' own braces and they need to be siblings.

Related

Multiple AND and OR condition in elasticsearch

I having trouble with elasticsearch query.
Data Structrue:
[{agent : "abc", origin: "US"}, {agent : "abc", origin: "US"}
I'm not able to find multiple agent name (OR condition) and (AND Condition) multiple Origin (OR condition)
You can use a combination of bool/must/should clause along with the terms query
{
"query": {
"bool": {
"must": [
{
"terms": {
"agent": [
"abc",
"abc"
]
}
},
{
"terms": {
"origin": [
"US",
"US"
]
}
}
]
}
}
}
Since terms already has OR semantics, you don't need to wrap them in bool/should queries. The following query should do what you expect:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"agent": [
"agent1",
"agent2"
]
}
},
{
"terms": {
"origin": [
"US",
"CA"
]
}
}
]
}
}
}

ElasticSearch lucene query with subclauses conversion to ES syntax

I've been trying to convert a lucene style query to ES query syntax but I'm getting stuck on sub-clauses. e.g.
(title:history^10 or series:history) and (NOT(language:eng) OR language:eng^5) and (isfree eq 'true' OR (isfree eq 'false' AND owned eq 'abc^5'))
This states that "get me a match for history in 'title' or 'series' but boost the title match AND where the language doesn't have to be english, but if if is then boost it AND where the match is free or where it isn't free then make sure it's owned by customer abc".
I feel this is a tricky query but it seems to work correctly. Converting the clauses to ES syntax is confusing me as I don't really have the concept of brackets. I think I need to use bool queries... I have the following which I know doesn't apply the criteria correctly - it says you should have (language:eng OR isFree eq 'true' OR owned:abc). I can't seem to make the mental leap to build the must/should with NOT's in it.
Help please?
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "history",
"fields": [
"title^10.0",
"series"
]
}
}
],
"should": [
{
"term": {
"language": {
"value": "eng",
"boost": 5
}
}
},
{
"term": {
"isFree": {
"value": true
}
}
},
{
"term": {
"owned": {
"value": "abc",
"boost": 5
}
}
}
]
}
},
Your query is almost correct, the only thing that wasn't translated correctly was this part of the query:
(isfree eq 'true' OR (isfree eq 'false' AND owned eq 'abc^5'))
If I understand your post correctly, this is basically saying boost the 'owned' field by a factor of five when it's value is 'abc' and the price is free. To implement this, you need to use an additional bool query that:
Filters results by isFree: true
Boosts the owned field of any documents matching abc
"bool": {
"filter": [
{
"term": {
"isFree": {
"value": false
}
}
}
],
"must": [
{
"term": {
"owned": {
"value": "abc",
"boost": 5
}
}
}
]
}
Since this is not intended to limit the result set and only boost results that meet this criteria, the bool query above should be placed inside your parent bool's should section. The final query looks like:
POST /myindex/_search
{
"explain": true,
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "history",
"fields": [
"title^10",
"series"
]
}
}
],
"should": [
{
"term": {
"language": {
"value": "eng",
"boost": 5
}
}
},
{
"bool": {
"filter": [
{
"term": {
"isFree": {
"value": false
}
}
}
],
"must": [
{
"term": {
"owned": {
"value": "abc",
"boost": 5
}
}
}
]
}
}
]
}
}
}
Note: Using should and must yield the same results for that inner bool, I honestly am not sure which would be better to use so I just arbitrarily used must.

How to filter a specific value within a dictionary?

Let's say I have this dictionary:
{
"name": "Jorje",
"surname": "Costali",
"extra_information": {
"real_name": "mamino",
"fake_name": "bambino",
"age": "43",
"gang": "gang34"
}
}
How can I query to get all entries that have "extra_information.gang":"gang34" ? I would like to know how to filter after exact term or having a match.
I have tried:
{
"size": 20,
"query": {
"bool": {
"filter": [
{
"terms": {
"extra_information.gang": [
"gang34"
]
}
}
]
}
}
}
but it does not return any entries.
I have tried:
GET _search
{
"query": {
"bool": {
"must": [
{
"match": {
"extra_information.gang" : "gang34"
}
}
]
}
}
}
and works, but I want to make it into a filter, not a simple match query.
Did you try to use .keyword? like:
"terms": {
"extra_information.gang.keyword": [
"gang34"
]
}
I tried what you wrote on my nested dictionary document, it works like this to me.

ElasticSearch multimatch substring search

I have to combine two filters to match requirements:
- a specific list of values in r.status field
- one of the multiple text fields contains the value.
Result query (with using Nest, but it doesn't matter) looks like:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"term": {
"isActive": {
"value": true
}
}
},
{
"nested": {
"query": {
"bool": {
"must": [
{
"terms": {
"r.status": [
"VALUE_1",
"VALUE_2",
"VALUE_3"
]
}
},
{
"bool": {
"should": [
{
"match": {
"r.g.firstName": {
"type": "phrase",
"query": "SUBSTRING_VALUE"
}
}
},
{
"match": {
"r.g.lastName": {
"type": "phrase",
"query": "SUBSTRING_VALUE"
}
}
}
]
}
}
]
}
},
"path": "r"
}
}
]
}
}
]
}
}
}
Also tried with multi_match query:
{
"query": {
"bool": {
"filter": [
{
"bool": {
"must": [
{
"term": {
"isActive": {
"value": true
}
}
},
{
"nested": {
"query": {
"bool": {
"must": [
{
"terms": {
"r.status": [
"VALUE_1",
"VALUE_2",
"VALUE_3"
]
}
},
{
"multi_match": {
"query": "SUBSTRING_VALUE",
"fields": [
"r.g.firstName",
"r.g.lastName"
]
}
}
]
}
},
"path": "r"
}
}
]
}
}
]
}
}
}
FirstName and LastName are configured in index mappings as text:
"firstName": {
"type": "text"
},
"lastName": {
"type": "text"
}
Elastic gives a lot of full-text search options: multi_match, phrase, wildcards etc. But all of them fail in my case looking a sub-string in my text fields. (terms query and isActive one work well, I just tried to run only them).
What options do I have also or maybe where I made a mistake?
UPD: Combined wildcards worked for me, but such query looks ugly. Looking for a more elegant solution.
The elasticsearch way is to use ngram tokenizer.
The ngram analyzer will split your terms with a sliding window. For example, the input "Hello World" will generate the following terms:
Hel
Hell
Hello
ell
ello
...
Wor
World
orl
...
You can configure the minimum and maximum size of the sliding window (in the example the minimum size is 3). Once the sub terms are generated you can use a match query an the subfield.
Another point, it is weird to use must within a filter. If you are interested in the score, you should always use must otherwise use filter. Read this article for a good understanding.

How to do nested AND and OR filters in ElasticSearch?

My filters are grouped together into categories.
I would like to retrieve documents where a document can match any filter in a category, but if two (or more) categories are set, then the document must match any of the filters in ALL categories.
If written in pseudo-SQL it would be:
SELECT * FROM Documents WHERE (CategoryA = 'A') AND (CategoryB = 'B' OR CategoryB = 'C')
I've tried Nested filters like so:
{
"sort": [{
"orderDate": "desc"
}],
"size": 25,
"query": {
"match_all": {}
},
"filter": {
"and": [{
"nested": {
"path":"hits._source",
"filter": {
"or": [{
"term": {
"progress": "incomplete"
}
}, {
"term": {
"progress": "completed"
}
}]
}
}
}, {
"nested": {
"path":"hits._source",
"filter": {
"or": [{
"term": {
"paid": "yes"
}
}, {
"term": {
"paid": "no"
}
}]
}
}
}]
}
}
But evidently I don't quite understand the ES syntax. Is this on the right track or do I need to use another filter?
This should be it (translated from given pseudo-SQL)
{
"sort": [
{
"orderDate": "desc"
}
],
"size": 25,
"query":
{
"filtered":
{
"filter":
{
"and":
[
{ "term": { "CategoryA":"A" } },
{
"or":
[
{ "term": { "CategoryB":"B" } },
{ "term": { "CategoryB":"C" } }
]
}
]
}
}
}
}
I realize you're not mentioning facets but just for the sake of completeness:
You could also use a filter as the basis (like you did) instead of a filtered query (like I did). The resulting json is almost identical with the difference being:
a filtered query will filter both the main results as well as facets
a filter will only filter the main results NOT the facets.
Lastly, Nested filters (which you tried using) don't relate to 'nesting filters' like you seemed to believe, but related to filtering on nested-documents (parent-child)
Although I have not understand completely your structure this might be what you need.
You have to think tree-wise. You create a bool where you must (=and) fulfill the embedded bools. Each embedded checks if the field does not exist or else (using should here instead of must) the field must (terms here) be one of the values in the list.
Not sure if there is a better way, and do not know the performance.
{
"sort": [
{
"orderDate": "desc"
}
],
"size": 25,
"query": {
"query": { #
"match_all": {} # These three lines are not necessary
}, #
"filtered": {
"filter": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"not": {
"exists": {
"field": "progress"
}
}
},
{
"terms": {
"progress": [
"incomplete",
"complete"
]
}
}
]
}
},
{
"bool": {
"should": [
{
"not": {
"exists": {
"field": "paid"
}
}
},
{
"terms": {
"paid": [
"yes",
"no"
]
}
}
]
}
}
]
}
}
}
}
}

Resources