How to reduce multiple conditions in ES - elasticsearch

In the query below, I use match_phrase many times. How can I reduce the number of match_phrase clauses? In production, the response from ES for this query is very slow.
GET /logs*/_search
{
"from":0,
"query":{
"bool":{
"filter":[
{
"range":{
"#timestamp":{
"gte":"2020-02-10T11:13:19.7684961Z",
"lte":"2020-02-11T11:13:19.7684961Z"
}
}
}
],
"must":[
{
"bool":{
"must_not":[
{
"match_phrase":{
"message":{
"query":"System32"
}
}
},
{
"match_phrase":{
"message":{
"query":"212.118.14.45"
}
}
},
{
"match_phrase":{
"message":{
"query":" stopped state."
}
}
},
{
"match_phrase":{
"message":{
"query":" running state"
}
}
},
{
"match_phrase":{
"message":{
"query":" Share Name: \\\\*\\DLO-EBackup"
}
}
}
.
.
.
etc.,
.
.
.
.
.
{
"match_phrase":{
"message":{
"query":"WFO15Installation"
}
}
},
{
"match_phrase":{
"message":{
"query":"Windows\\SysWOW64"
}
}
},
{
"match_phrase":{
"message":{
"query":"Bitvise"
}
}
}
]
}
}
]
}
},
"size":10,
"sort":[
{
"#timestamp":{
"order":"desc"
}
}
]
}
Thank You!

To begin with, you could move the must_not block inside the filter clause to skip score calculation and benefit from some caching. Something like:
"query":{
"bool":{
"filter":[{
"range":{
"#timestamp":{
"gte":"2020-02-10T11:13:19.7684961Z",
"lte":"2020-02-11T11:13:19.7684961Z"
}
}
},
{
"bool": {
"must_not":[{
"match_phrase":{
"message":{
"query":"System32"
}
}
},
{
"match_phrase":{
"message":{
"query":"212.118.14.45"
}
}
},
...
]
}
}],
...
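If the long list of clauses is also awkward to maintain by hand, the request body can be generated from a plain list of phrases. This is only a sketch: the function name build_exclusion_query and its parameters are made up for illustration, and the resulting dict would still have to be sent as the search body by whatever client you use.

```python
def build_exclusion_query(phrases, start, end, timestamp_field="#timestamp"):
    """Return a search body that drops every document matching one of `phrases`.

    The must_not clauses live inside the filter list, as suggested above.
    """
    must_not = [{"match_phrase": {"message": {"query": p}}} for p in phrases]
    return {
        "from": 0,
        "size": 10,
        "query": {
            "bool": {
                "filter": [
                    {"range": {timestamp_field: {"gte": start, "lte": end}}},
                    {"bool": {"must_not": must_not}},
                ]
            }
        },
        "sort": [{timestamp_field: {"order": "desc"}}],
    }


# Only three of the phrases from the question, for brevity.
body = build_exclusion_query(
    ["System32", "212.118.14.45", "Bitvise"],
    "2020-02-10T11:13:19.7684961Z",
    "2020-02-11T11:13:19.7684961Z",
)
```

This does not reduce the work Elasticsearch has to do; it only keeps the phrase list in one place.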
However, as someone already mentioned in the comments, you should optimise your data for search before indexing your documents into Elasticsearch. A better solution than having so many filters in your query would be to process your data and apply those filters at ingestion time, for example by using the ingest APIs (see the Elastic documentation) or Logstash. For example, you could evaluate the must_not conditions at index time and store the result in a boolean field (e.g., ignore) on every document, so that you can filter on that field at query time with a query like this:
"query":{
"bool":{
"filter":[{
"range":{
"#timestamp":{
"gte":"2020-02-10T11:13:19.7684961Z",
"lte":"2020-02-11T11:13:19.7684961Z"
}
}
},
{
"match": {
"ignore": false
}
},
...
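As a rough illustration of the index-time idea, the flag could be computed in the application that ships the logs, before the documents are sent to Elasticsearch. This is a sketch under assumptions: tag_document and EXCLUDED_PHRASES are hypothetical names, and a case-insensitive substring test only approximates match_phrase, which works on analyzed tokens.

```python
# Hypothetical phrase list; in practice it would hold every phrase
# from the must_not block of the original query.
EXCLUDED_PHRASES = ["System32", "212.118.14.45", "Bitvise"]


def tag_document(doc, phrases=EXCLUDED_PHRASES):
    """Return a copy of `doc` with a boolean `ignore` field.

    `ignore` is True when the message contains any excluded phrase.
    """
    message = doc.get("message", "").lower()
    ignore = any(p.lower() in message for p in phrases)
    return {**doc, "ignore": ignore}


docs = [
    {"message": "Service entered the running state"},
    {"message": "C:\\Windows\\System32\\svchost.exe started"},
]
tagged = [tag_document(d) for d in docs]
```

An ingest pipeline or a Logstash filter would do the same job on the server side; the point is that the expensive comparison runs once per document instead of once per query.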

Related

Elasticsearch need AND query instead OR

I'm trying to search posts with certain prefixes (212, 215) in a certain node (663).
This query matches posts that have either prefix (OR), but I need a query that requires both prefixes (AND). How can I do that? The query is generated by a CMS:
{
"query":{
"bool":{
"filter":[
{
"term":{
"node":663
}
},
{
"terms":{
"prefix":[
"215",
"212"
]
}
},
{
"bool":{
"should":[
{
"type":{
"value":"post"
}
},
{
"type":{
"value":"thread"
}
}
]
}
}
],
"must":{
"match_all":{
}
}
}
},
"sort":[
{
"date":"desc"
}
],
"size":8000,
"docvalue_fields":[
"discussion_id",
"user",
"date"
],
"_source":false
}
If you're looking for docs that have a list of values for prefix containing both 212 and 215, you should use separate queries:
{
"query":{
"bool":{
"filter":[
...
{"match":{"prefix":"212"}},
{"match":{"prefix":"215"}},
...
],
...
}
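The same pattern works for any number of values: one clause per value inside filter, since every filter clause has to match. A small sketch (require_all_prefixes is a made-up helper name, and only the node and prefix parts of the original query are reproduced):

```python
def require_all_prefixes(prefixes, node):
    """Build a query where every prefix in `prefixes` must be present."""
    filters = [{"term": {"node": node}}]
    # One match clause per value; filter clauses are ANDed together.
    filters += [{"match": {"prefix": p}} for p in prefixes]
    return {"query": {"bool": {"filter": filters}}}


query = require_all_prefixes(["212", "215"], node=663)
```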

Elasticsearch query returning far less number of records

I am running the following Elasticsearch query from a Groovy script. Thousands of records meet these criteria, but I get only 10 records back.
{
"query":{
"bool":{
"must":[
{
"match_all":{
}
},
{
"range":{
"#Timestamp":{
"gte":1417511269270,
"lte":1575277669270,
"format":"epoch_millis"
}
}
},
{
"match_phrase":{
"field1.keyword":{
"query":"value1"
}
}
},
{
"match_phrase":{
"field2.keyword":{
"query":"value2"
}
}
},
{
"range":{
"#Timestamp":{
"gte":"2001-03-01",
"lt":"2019-10-30"
}
}
}
],
"filter":[
],
"should":[
],
"must_not":[
]
}
}
}
What am I missing in my query?
You are missing a size parameter, which means it defaults to 10 results.
e.g. add this to your query object:
"size": 100
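Keep in mind that size cannot grow without limit: by default from + size may not exceed 10,000 (the index.max_result_window setting). A minimal sketch of a helper that sets size and checks that limit up front; with_size and MAX_RESULT_WINDOW are illustrative names, and the constant assumes the index setting has not been changed.

```python
MAX_RESULT_WINDOW = 10_000  # default of index.max_result_window (assumed unchanged)


def with_size(body, size):
    """Return a copy of the search body with an explicit `size`."""
    offset = body.get("from", 0)
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds the result window")
    return {**body, "size": size}


base = {"query": {"match_all": {}}}
sized = with_size(base, 100)
```

For result sets beyond that window, a different mechanism such as search_after is needed rather than a bigger size.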

how to get elasticsearch documents in custom order

Is there any way to get documents from Elasticsearch in a custom order?
The documents are indexed like this:
{
"productId":1
},
{
"productId":2
},
{
"productId":3
}
I need to get the documents in the same order as the term queries.
I tried this query, but it does not work:
{
"from":0,
"size":20,
"query":{
"bool":{
"should":[
{
"term":{
"productId":{
"value":3,
"boost":1.0
}
}
},
{
"term":{
"productId":{
"value":1,
"boost":1.0
}
}
},
{
"term":{
"productId":{
"value":2,
"boost":1.0
}
}
}
],
"adjust_pure_negative":true,
"boost":1.0
}
},
"version":true
}
As you can see, the term queries are in the order productId:3, productId:1, productId:2.
I expect the results to be sorted in that same order:
{
"productId":3
},
{
"productId":1
},
{
"productId":2
}
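Every should clause in that query carries the same boost of 1.0, so presumably nothing distinguishes the documents for sorting. One possible workaround, assuming it is acceptable to reorder on the client, is to sort the returned hits by the position of their productId in the requested list. reorder_hits is a hypothetical helper, and the hits are assumed to have the usual _source layout:

```python
def reorder_hits(hits, wanted_ids, field="productId"):
    """Sort hits so they follow the order of `wanted_ids`."""
    position = {pid: i for i, pid in enumerate(wanted_ids)}
    # Hits whose id was not requested are dropped.
    found = [h for h in hits if h["_source"][field] in position]
    return sorted(found, key=lambda h: position[h["_source"][field]])


hits = [
    {"_source": {"productId": 1}},
    {"_source": {"productId": 2}},
    {"_source": {"productId": 3}},
]
ordered = reorder_hits(hits, [3, 1, 2])
```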

Elasticsearch Filtered Bool Query

I am running into some serious problems with a custom search. All I want is a wildcard search in three fields, with the result filtered by another field. In Elastica it results in this query:
{
"bool":{
"should":[
{
"wildcard":{
"ean":"*180g*"
}
},
{
"wildcard":{
"titel":"*180g*"
}
},
{
"wildcard":{
"interpret":"*180g*"
}
}
],
"filter":[
{
"term":{
"genre":{
"value":"Rock",
"boost":1
}
}
}
]
}
}
I can't find an error, but Elasticsearch does not give me the filtered results I expect. What happens is that Elasticsearch returns ALL items matching the filter term, whether or not any of the should clauses match. When I add the filter as "must", I get the same results. What is wrong here?
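You need to add "minimum_should_match": 1 to your bool query. When a bool query also contains a filter or must clause, the should clauses are optional by default and only affect scoring.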
{
"bool":{
"minimum_should_match": 1,
"should":[
{
"wildcard":{
"ean":"*180g*"
}
},
{
"wildcard":{
"titel":"*180g*"
}
},
{
"wildcard":{
"interpret":"*180g*"
}
}
],
"filter":[
{
"term":{
"genre":{
"value":"Rock",
"boost":1
}
}
}
]
}
}
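To see why the setting changes the result, here is a toy model of how a bool query with a filter decides whether a document matches. It is a simplification for illustration only, not how Elasticsearch is implemented; bool_matches is a made-up function.

```python
def bool_matches(filter_ok, should_hits, minimum_should_match=0):
    """Toy model of a bool query that has a filter clause.

    filter_ok: whether the document passes the filter.
    should_hits: one boolean per should clause.
    minimum_should_match: 0 mirrors the default when a filter or must
    clause is present.
    """
    return filter_ok and sum(should_hits) >= minimum_should_match


# A "Rock" document where none of the three wildcards match.
no_wildcard_hit = [False, False, False]
returned_by_default = bool_matches(True, no_wildcard_hit)
returned_with_minimum = bool_matches(True, no_wildcard_hit, minimum_should_match=1)
```

With the default the document is returned although no wildcard matched, which is the behaviour described in the question; with a minimum of 1 it is excluded.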

Query based on Fields existing in different Indices in Elasticsearch

I've got the following query
{
"from":0,
"size":50000,
"_source":[
"T121",
"timestamp"
],
"sort":{
"timestamp":{
"order":"asc"
}
},
"query":{
"bool":{
"must":{
"range":{
"timestamp":{
"gte":"2017-01-17 11:44:41.347",
"lte":"2017-02-18 11:44:47.878"
}
}
},
"must":{
"exists":{
"field":"T121"
}
}
}
}
}
http://172.22.23.169:9200/index1,index2,Index3/_search?pretty
With this URL I want to query a number of indices in Elasticsearch and return only those documents where a specific field exists.
Is it possible to put a list of fields in the "exists" clause, so that a document is returned
if "field1" OR "field2" OR "field3" exists in it and skipped otherwise, or do I have to script such a case?
To search across all indices, use http://172.22.23.169:9200/_search?pretty
To search across selected indices, add the following clause to the "bool" query:
"must": {
"terms": {
"_index": [
"index1",
"index2"
]
}
}
To OR multiple "exists" conditions, use a should clause with several exists queries and set "minimum_should_match" to control how many of them must match.
{
"from":0,
"size":50000,
"_source":[
"T121",
"timestamp"
],
"sort":{
"timestamp":{
"order":"asc"
}
},
"query":{
"bool":{
"must":{
"range":{
"timestamp":{
"gte":"2017-01-17 11:44:41.347",
"lte":"2017-02-18 11:44:47.878"
}
}
},
"should":[
{
"exists":{
"field":"field1"
}
},
{
"exists":{
"field":"field2"
}
},
{
"exists":{
"field":"field3"
}
}
],
"minimum_should_match":1
}
}
}
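If the list of candidate fields varies, the should block can be generated. A sketch, assuming the timestamp range from the question; any_field_exists is a hypothetical helper, and it sets minimum_should_match to 1 so that at least one of the fields has to exist:

```python
def any_field_exists(fields, gte, lte, timestamp_field="timestamp"):
    """Build a query matching documents where at least one field exists."""
    return {
        "query": {
            "bool": {
                "must": {"range": {timestamp_field: {"gte": gte, "lte": lte}}},
                "should": [{"exists": {"field": f}} for f in fields],
                # Without this, the should clauses would be optional
                # because a must clause is present.
                "minimum_should_match": 1,
            }
        }
    }


exists_query = any_field_exists(
    ["field1", "field2", "field3"],
    "2017-01-17 11:44:41.347",
    "2017-02-18 11:44:47.878",
)
```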
