Complex Search with Elasticsearch

I've always had problems forming search requests in ES.
Now I have to update my code from v2 to v6, and a lot has changed.
Current problem: how can I formulate a complex search that combines OR and AND?
Right now I have:
{
"index":"<index>",
"type":"<type>",
"body":{
"from":0,
"size":20,
"query":{
"bool":{
"must":{
"or":{
"match":{
"CUSTNR":"24508"
},
"and":{
"prefix":[
{
"ORDERNR":"128"
}
]
}
}
}
}
}
}
}
and I receive the message: {"type":"parsing_exception","reason":"no [query] registered for [or]"
How should I write this?

The and/or queries no longer exist in Elasticsearch 6; they were deprecated in 2.0 and removed by 5.0. Boolean logic is expressed through the clauses of the bool query instead:
AND is must
OR is should
NOT is must_not
So if you want all customers with CUSTNR = 24508 AND ORDERNR = (128 OR 129), you can build the query as below, where bool indicates a boolean combination of clauses:
{
"query": {
"bool": {
"must": [
{
"term": {"CUSTNR": 24508}
},
{
"bool": {
"should": [
{"term": {"ORDERNR": 128}},
{"term": {"ORDERNR": 129}}
]
}
}
]
}
}
}
Here is a sample complex query that shows how to combine must, should, sort, etc.:
https://gist.github.com/sany2k8/ec0145738babca643f0a369e55546c85
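The mapping above can be sketched in Python by building the request body from plain dicts. The helper names bool_and, bool_or and bool_not are hypothetical and not part of any Elasticsearch client; the sketch also assumes CUSTNR and ORDERNR are indexed as keyword fields, and it swaps one term clause for a prefix clause to mirror the original request.

```python
import json

# Hypothetical helpers; they only assemble dicts in the bool query shape.
def bool_and(*clauses):
    """AND: every clause must match."""
    return {"bool": {"must": list(clauses)}}

def bool_or(*clauses):
    """OR: at least one clause must match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}

def bool_not(*clauses):
    """NOT: none of the clauses may match."""
    return {"bool": {"must_not": list(clauses)}}

# CUSTNR = 24508 AND (ORDERNR starts with "128" OR ORDERNR = "129")
body = {
    "from": 0,
    "size": 20,
    "query": bool_and(
        {"term": {"CUSTNR": "24508"}},
        bool_or(
            {"prefix": {"ORDERNR": "128"}},
            {"term": {"ORDERNR": "129"}},
        ),
    ),
}

print(json.dumps(body, indent=2))
```

Setting minimum_should_match explicitly keeps the OR group mandatory even if someone later adds must or filter clauses next to it.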

Related

ElasticSearch / OpenSearch term search with logical OR

I have been scratching my head for a while looking at the OpenSearch documentation and Stack Overflow questions. How can I do something like this:
Select documents WHERE studentId in [1234, 5678] OR applicationId in [2468, 1357].
As long as studentId exactly matches one of the supplied values, or applicationId exactly matches one of the supplied values, then that document should be included in the response.
When I want to search for multiple values for a single field and get an exact match the following works:
{
"must":[
{
"terms": {
"studentId":["1234", "5678"]
}
}
]
}
This will find me exact matches on studentId in [1234, 5678].
If I try to add the condition to also look for (logical or) applicationId in [2468, 1357] then the following will not work:
{
"must":[
{
"terms": {
"studentId":["1234", "5678"]
}
},
{
"terms": {
"applicationId":["2468", "1357"]
}
}
]
}
because this performs a logical AND on the two queries. I want a logical OR.
I cannot use should because this returns irrelevant results. The following does not work for me:
{
"should":[
{
"terms": {
"studentId":["1234", "5678"]
}
},
{
"terms": {
"applicationId":["2468", "1357"]
}
}
]
}
This seems to return all results, ranked by relevance. I find that the returned results do not actually match, despite the fact that this is a terms search.
Try the following query, which wraps each terms clause in its own bool query and combines the two with should:
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"terms": {
"studentId":["1234", "5678"]
}
}
]
}
},
{
"bool": {
"must": [
{
"terms": {
"applicationId":["2468", "1357"]
}
}
]
}
}
]
}
}
}
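One possible explanation for should "returning everything" is the documented default for minimum_should_match: it is 1 when the bool query has only should clauses, but 0 as soon as a must or filter clause sits next to them, which makes the should clauses optional. The question does not show the surrounding query, so this is an assumption. The sketch below is a toy in-memory evaluator, not Elasticsearch itself, and the tenant field is hypothetical.

```python
def matches(query, doc):
    """Toy evaluator for a small subset of the query DSL (bool and terms only)."""
    (kind, spec), = query.items()
    if kind == "terms":
        (field, values), = spec.items()
        return doc.get(field) in values
    if kind == "bool":
        required = spec.get("must", []) + spec.get("filter", [])
        should = spec.get("should", [])
        # Default: 1 if there are should clauses and no must/filter, otherwise 0.
        default_msm = 1 if should and not required else 0
        msm = spec.get("minimum_should_match", default_msm)
        hits = sum(matches(q, doc) for q in should)
        return all(matches(q, doc) for q in required) and hits >= msm
    raise ValueError(f"unsupported query type: {kind}")

either_id = [
    {"terms": {"studentId": ["1234", "5678"]}},
    {"terms": {"applicationId": ["2468", "1357"]}},
]
tenant = {"terms": {"tenant": ["acme"]}}  # hypothetical extra filter

optional_should = {"bool": {"filter": [tenant], "should": either_id}}
required_should = {
    "bool": {"filter": [tenant], "should": either_id, "minimum_should_match": 1}
}

docs = [
    {"tenant": "acme", "studentId": "1234", "applicationId": "0000"},
    {"tenant": "acme", "studentId": "9999", "applicationId": "2468"},
    {"tenant": "acme", "studentId": "9999", "applicationId": "0000"},
]

print([matches(optional_should, d) for d in docs])  # [True, True, True]
print([matches(required_should, d) for d in docs])  # [True, True, False]
```

If the real query does have a must or filter clause beside should, adding "minimum_should_match": 1 to that bool should restore the OR behaviour.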

Is it possible to access a query term in a script field?

I would like to construct an elasticsearch query in which I can search for a term and on-the-fly compute a new field for each found document, which is calculated based on some existing fields as well as the query term. Is this possible?
For example, let's say in my ES query I am searching for documents which have the keyword "amsterdam" in the "text" field.
"filter": [
{
"match_phrase": {
"text": {
"query": "amsterdam"
}
}
}]
Now I would also like to have a script field in my query, which computes some value based on other fields as well as the query.
So far, I have only found how to access the other fields of a document, using doc['someOtherField'], for example:
"script_fields" : {
"new_field" : {
"script" : {
"lang": "painless",
"source": "if (doc['citizens'].value > 10000) { return 'large'; } return 'small';"
}
}
}
How can I integrate the query term, e.g. if I wanted to add to the if statement "if the query term starts with a-e"?
You're on the right track, but script_fields are primarily used to post-process your documents' attributes. They won't help you filter any docs because they're run after the query phase.
With that being said, you can use scripts to filter your documents through script queries. Before you do that, though, you should explore alternatives.
In other words, scripts should be used when all other mechanisms and techniques have been exhausted.
Back to your example. I see three possibilities off the top of my head.
Match phrase prefix queries as a group of bool-should subqueries:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"bool": {
"should": [
{
"match_phrase_prefix": {
"text_field": "a"
}
},
{
"match_phrase_prefix": {
"text_field": "b"
}
},
{
"match_phrase_prefix": {
"text_field": "c"
}
},
... till the letter "e"
]
}
}
]
}
}
}
A regexp query:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"regexp": {
"text_field": "[a-e].+"
}
}
]
}
}
}
Script queries using .charAt comparisons:
POST your-index/_search
{
"query": {
"bool": {
"must": [
{
"script": {
"script": {
"source": """
char c = doc['text_field.keyword'].value.charAt(0);
return c >= params.gte.charAt(0) && c <= params.lte.charAt(0);
""",
"params": {
"gte": "a",
"lte": "e"
}
}
}
}
]
}
}
}
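For the original question of reading the query term inside script_fields: a script cannot see the query, but the client can pass the same term to the script through params. Below is a minimal sketch in Python, assuming the text and citizens fields from the question; the function name and the rule "large only if the term starts with a-e" are purely illustrative, and the Painless source reuses the charAt pattern from the script query above without having been run against a cluster.

```python
import json

def search_with_term_param(term):
    """Build a search body that hands the same term to the filter and to the script."""
    return {
        "query": {
            "bool": {"filter": [{"match_phrase": {"text": {"query": term}}}]}
        },
        "script_fields": {
            "new_field": {
                "script": {
                    "lang": "painless",
                    "source": (
                        "char c = params.term.charAt(0); "
                        "boolean early = c >= params.gte.charAt(0) "
                        "&& c <= params.lte.charAt(0); "
                        "if (early && doc['citizens'].value > 10000) "
                        "{ return 'large'; } "
                        "return 'small';"
                    ),
                    # The query term travels to the script as an ordinary parameter.
                    "params": {"term": term.lower(), "gte": "a", "lte": "e"},
                }
            }
        },
    }

body = search_with_term_param("Amsterdam")
print(json.dumps(body, indent=2))
```

Keeping the term in params rather than concatenating it into the source string means the script text stays constant across requests.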
If you're relatively new to ES and would love to see real-world examples, check out my recently released Elasticsearch Handbook. One chapter is dedicated to scripting and as it turns out, you can achieve a lot with scripts (if of course executed properly).

ElasticSearch query with MUST and SHOULD

I have this query to get data from an AWS Elasticsearch instance (v6.2):
{
"query": {
"bool": {
"must": [
{
"term": {"logLevel": "error"}
},
{
"bool": {
"should": [
{
"match": {"EventCategory": "Home Management"}
}
]
}
}
],
"filter": [{
"range": { "timestamp": { "gte": 155254550880 }}
}
]
}
},
"size": 10,
"from": 0
}
My data has multiple EventCategory values, for example 'Home Management' and 'User Account Management'. The problem is that the match inside should returns all the data, because the word 'Management' appears in both categories. If I use term instead of match, it doesn't return anything at all, even when the given value is exactly the same as in the document.
I need to get the data when any of the given categories matches, together with the rest of the filters.
EDIT:
None, one, or more than one EventCategory may be passed to the should clause.
I'm not sure why you added a should within a must. Do you expect to have more than one should cases? It looks a bit odd.
As for your question, you can't use the term query on an analysed field, but only on keyword typed fields. If your EventCategory field has the default mapping, you can run the term query against the default non-analysed multi-field of EventCategory as follows:
...
{
"term": { "EventCategory.keyword": "Home Management" }
}
...
Furthermore, if you just want to filter documents in or out without caring about their relevance, I'd recommend moving all the conditions into the filter block to speed up your query and make better use of the cache.
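Putting both suggestions together, here is a minimal Python sketch that builds the request for none, one, or several categories. It assumes the default dynamic mapping, so that EventCategory.keyword exists, and that logLevel can be matched with term as in the question; the function name build_log_query is hypothetical.

```python
def build_log_query(categories, since_ms, size=10, offset=0):
    """Build a filter-only search body for error logs in the given categories."""
    filters = [
        {"term": {"logLevel": "error"}},
        {"range": {"timestamp": {"gte": since_ms}}},
    ]
    # terms matches any of the exact values; skip the clause when none are given.
    if categories:
        filters.append({"terms": {"EventCategory.keyword": list(categories)}})
    return {
        "query": {"bool": {"filter": filters}},
        "size": size,
        "from": offset,
    }

no_category = build_log_query([], 155254550880)
two_categories = build_log_query(
    ["Home Management", "User Account Management"], 155254550880
)
```

A single terms clause acts as an OR over exact values, so no nested should is needed for this case.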
The query below should work.
I've removed should and created two must clauses, one for each of the words home and management. Note that the query is meant for text datatypes.
{
"query":{
"bool":{
"must":[
{
"term":{
"logLevel":"error"
}
},
{
"match":{
"EventCategory":"home"
}
},
{
"match":{
"EventCategory":"management"
}
}
],
"filter":[
{
"range":{
"timestamp":{
"gte":155254550880
}
}
}
]
}
},
"size":10,
"from":0
}
Hope it helps!

ElasticSearch Query, match a certain term and count given a date range

I feel like this shouldn't be as difficult as it's turning out to be. I've been attempting to use the:
index/_search
and
index/_count
endpoints, using query, bool, must, filter, etc. It seems that no matter how I construct it, I cannot combine a range on a date with a match. The Elasticsearch documentation doesn't seem to show complex queries like this, so I'm not exactly sure how to construct it. The main query I've been manipulating is:
{
"query":{
"bool":{
"must":{
"range":{
"date":{
"gte":"now-1d/d",
"lt" :"now/d"
}
},
"match":{
"KEY":"VALUE"
}
}
}
}
}
I either get "no query registered for date" or "unknown key for a start_object in match". I've been all over Stack Overflow and can't seem to find an answer to this; it seems like it should be quite a simple query to make against a data store such as this. What am I missing here?
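must can take an array of conditions if you want to combine them. In your version, range and match are two keys of a single object, but each clause needs to be its own object. Try this format: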
{
"query": {
"bool": {
"must": [
{
"range": {
"date": {
"gte": "now-1d/d",
"lt": "now/d"
}
}
},
{
"match": { "KEY": "VALUE" }
}
]
}
}
}
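The one-query-type-per-object rule can be sketched in Python as a small pre-flight check before sending the body. The helper check_bool_clauses is hypothetical, and moving the range into filter is a variation on the query above, on the assumption that the date range does not need to contribute to scoring.

```python
def check_bool_clauses(clauses):
    """Raise if any clause object holds more than one query type."""
    for clause in clauses:
        if len(clause) != 1:
            raise ValueError(
                f"expected one query type per clause, got {sorted(clause)}"
            )
    return clauses

date_range = {"range": {"date": {"gte": "now-1d/d", "lt": "now/d"}}}
key_match = {"match": {"KEY": "VALUE"}}

# Shape from the question: both query types share a single object.
broken = [{**date_range, **key_match}]

# Working shape: one object per clause.
body = {
    "query": {
        "bool": {
            "must": check_bool_clauses([key_match]),
            "filter": check_bool_clauses([date_range]),
        }
    }
}

try:
    check_bool_clauses(broken)
except ValueError as err:
    print(err)
```

Because the body contains only a query, the same dict can be posted to either index/_search or index/_count.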

Elasticsearch should query without computing relevance (_score)

I'm creating filtering queries which operate on two fields. I would like to avoid having Elasticsearch compute relevance. How can I achieve an OR statement without moving to query context?
My simplified model has two boolean fields:
{
is_opened,
is_send
}
I'd like to prepare a query with the logic:
(is_opened == true AND is_send == true) OR (is_opened == false)
In other words, I want to exclude documents with the fields:
is_opened == true AND is_send == false
My query looks like this:
GET documents/default/_search
{
"query": {
"bool": {
"should": [
{
"bool":{
"must":[
{"term": {"is_opened":true}},
{"term": {"is_send":true}}
]
}
},
{
"bool":{
"must":[
{"term": {"is_opened":false}}
]
}
}
]
}
}
}
Logically it works as I expected, but Elasticsearch computes relevance.
I don't need it because in the end I sort the results by another field, so this is a place to optimize the queries.
I ask because "Frequently used filters will be cached automatically by Elasticsearch, to speed up performance."
My results have the _score field computed, so I think the above query is executed in query context and Elasticsearch won't cache it automatically.
In the future I would like to create queries which operate on status fields, where the logic would be more complicated, so I still need to know how to prevent _score from being computed.
I noticed that changing should to filter stops _score from being computed, but it behaves like the must operator. Is it possible to change the filter behavior?
Is it possible to use a query other than should?
How can I force Elasticsearch to stop computing _score?
Simply wrap your query inside the constant_score query:
GET documents/default/_search
{
"query": {
"constant_score": {
"filter": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"term": {
"is_opened": true
}
},
{
"term": {
"is_send": true
}
}
]
}
},
{
"bool": {
"must": [
{
"term": {
"is_opened": false
}
}
]
}
}
]
}
}
}
}
}
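As an alternative sketch, the same OR can be kept in filter context by nesting the should group inside the filter clause of an outer bool query. The Python helpers below (term, all_of, any_of) are hypothetical names that only assemble dicts, and the created_at sort field is an assumption standing in for "another field".

```python
from itertools import product

def term(field, value):
    return {"term": {field: value}}

def all_of(*clauses):
    """AND without scoring: every clause goes into filter."""
    return {"bool": {"filter": list(clauses)}}

def any_of(*clauses):
    """OR: at least one clause has to match."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}

# (is_opened AND is_send) OR (NOT is_opened)
condition = any_of(
    all_of(term("is_opened", True), term("is_send", True)),
    term("is_opened", False),
)

body = {
    # Everything under filter runs in filter context.
    "query": {"bool": {"filter": [condition]}},
    "sort": [{"created_at": "desc"}],  # hypothetical sort field
}

# Sanity check in plain Python: the condition is the complement of
# "is_opened AND NOT is_send", as stated in the question.
for opened, send in product([True, False], repeat=2):
    wanted = (opened and send) or (not opened)
    excluded = opened and not send
    assert wanted == (not excluded)
```

The should group still expresses the OR; only its position under filter changes whether a score is computed for it.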

Resources