RestHighLevelClient Elastic Search Java Client takes more time for Search Query - elasticsearch

Search queries sent through RestHighLevelClient take longer and perform badly, while the same query sent directly to the Elasticsearch REST API takes less time.
How can we optimise the performance of this search request?
This is the code snippet:
SearchRequest searchRequest = new SearchRequest(new String[] {indexName}, searchSourceBuilder);
// Ask Elasticsearch to compute the exact total hit count.
searchSourceBuilder.trackTotalHits(true);
searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
Query created using BoolQueryBuilder:
{
"size": 100,
"query": {
"bool": {
"filter": [
{
"range": {
"date": {
"from": "2017-08-01T00:00:00.000Z",
"to": "2017-08-30T00:00:00.000Z",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"date": {
"order": "desc"
}
},
{
"id": {
"order": "desc"
}
}
]
}
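A possible cause, which the question does not confirm: the Java code calls trackTotalHits(true), but the JSON body above has no track_total_hits key, so a REST call made with that body may not be doing the same work. In Elasticsearch 7.x the default stops counting at 10,000 hits, whereas true forces an exact count of every match. The Python sketch below builds both variants of the body so the two paths can be timed like for like. The helper name build_search_body and the use of gte/lte in place of from/to are illustrative assumptions, not part of the original code.

```python
import json


def build_search_body(start, end, track_total_hits=None, size=100):
    """Build the date-range search body from the question.

    track_total_hits=None omits the key (server default applies),
    True asks for an exact count, an int caps the count at that value.
    """
    body = {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"range": {"date": {"gte": start, "lte": end}}}
                ]
            }
        },
        "sort": [{"date": {"order": "desc"}}, {"id": {"order": "desc"}}],
    }
    if track_total_hits is not None:
        body["track_total_hits"] = track_total_hits
    return body


if __name__ == "__main__":
    # Same body the Java client would send once trackTotalHits(true) is applied.
    exact = build_search_body(
        "2017-08-01T00:00:00.000Z",
        "2017-08-30T00:00:00.000Z",
        track_total_hits=True,
    )
    print(json.dumps(exact, indent=2))
```

If an exact total is not needed, the 7.x SearchSourceBuilder also has trackTotalHitsUpTo(int), which may be worth trying on the Java side in place of trackTotalHits(true).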

Related

ElasticSearch : how to sort ES document based on document version

The following is my ES query. I want to sort my documents in descending order of "_version", but I'm not sure how to do it.
{
"query":
{
"bool":
{
"must":
[
{
"terms":
{
"streamingSegmentId":
[
"00003319-b7fa-3409-806a-fa3bb5d2be26"
],
"boost": 1
}
},
{
"range":
{
"streamingSegmentStartTime":
{
"from": 1644480000000,
"to": 1647476658447,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"version": true,
"_source":
{
"includes":
[
"errorCount",
"benefitId",
"streamingSegmentStopTime",
"fanoutPublishTimestamp",
"sessionUpdateTime",
"contentSegmentUpdateTime"
],
"excludes":
[]
},
"sort":
[
{
"streamingSegmentStartTime":
{
"order": "asc"
}
},
{
"_version":
{
"order": "desc"
}
}
]
}
I was not able to sort on _version, so I added a timestamp field called fanoutPublishTimestamp and sort my documents in descending order of time. Below is my updated query, which uses collapse to fetch only the document with the latest timestamp. The problem I now face is that collapse cannot be used with search_after, which I use to add pagination support to my ES query.
I'm using AWS Elasticsearch, which runs ES 7.10, and only ES 8.1 supports collapse together with search_after. Please let me know if anybody has a better solution to this issue.
GET /sessions/_search
{
"size": 2,
"query": {
"bool": {
"must": [
{
"terms": {
"benefitId": [
"PRIME"
],
"boost": 1
}
},
{
"range": {
"streamingSegmentStartTime": {
"from": 1647821557000,
"to": 1647825157000,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"_source": {
"includes": [
"deviceTypeId",
"timeline"
],
"excludes": []
},
"sort": [
{
"streamingSegmentStartTime": {
"order": "asc"
}
},
{
"fanoutPublishTimestamp": {
"order": "desc"
}
}
],
"search_after": [
"1647821557001",
"1647829603837"
],
"collapse": {
"field": "streamingSegmentId"
}
}
As you didn't provide sample documents or explain what "not working" means, I am assuming the problem comes from the two sort parameters you are using. If you sort on _version alone, it works (tested locally on my sample documents).
Most likely your other sort criterion, streamingSegmentStartTime, causes some documents with a higher _version to come later in the response. Try removing it and see whether you get the expected result.
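For the collapse and search_after conflict on 7.10, one possible workaround, which nobody in the thread proposed, is to drop collapse from the request, keep paging with search_after, and collapse on the client. Because the sort puts the preferred document first, keeping only the first hit seen for each streamingSegmentId mimics what collapse would return. The Python sketch below assumes hits arrive as plain dictionaries in the shape of the search response and that streamingSegmentId is added to the _source includes; collapse_page and next_search_after are hypothetical helper names.

```python
def collapse_page(hits, seen, field="streamingSegmentId"):
    """Return the hits whose collapse key has not been seen yet.

    hits: list of hit dicts in sort order, each with "_source" and "sort".
    seen: set of keys already returned; updated in place across pages.
    """
    kept = []
    for hit in hits:
        key = hit["_source"][field]
        if key in seen:
            continue  # a better-sorted document for this key was already kept
        seen.add(key)
        kept.append(hit)
    return kept


def next_search_after(hits):
    """Sort values of the last raw hit, to send as search_after for the next page."""
    return hits[-1]["sort"] if hits else None


if __name__ == "__main__":
    seen = set()
    page = [
        {"_source": {"streamingSegmentId": "a"}, "sort": [1, 9]},
        {"_source": {"streamingSegmentId": "a"}, "sort": [1, 5]},
        {"_source": {"streamingSegmentId": "b"}, "sort": [2, 7]},
    ]
    print([hit["sort"] for hit in collapse_page(page, seen)])
    print(next_search_after(page))
```

The trade-offs are that pages come back with uneven sizes after deduplication and that the seen set grows with the number of distinct segment IDs, so this only suits result sets of moderate size.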

Elastic - Filter after selecting top 5 hits

I'm using the alerting feature in Kibana and I want to check whether the last 5 consecutive values of a field exceed a threshold x. However, if I use a filter in my Elasticsearch query, it is applied before the top N aggregation.
Is there a way in which I can apply the filter after or check if the last consecutive values exceed a threshold using some other selector or method? I don't want to check this in the trigger condition in painless because that will return all the documents in the ctx and not just the ones which exceeded the threshold which I want to display in my alert message.
I've been stuck on this for a while, and I have only seen blog posts saying that sub-aggregation is not possible on top N, so any help or workaround would be much appreciated.
This is my query:
{
"size": 500,
"query": {
"bool": {
"filter": [
{
"match_all": {
"boost": 1
}
},
{
"match_phrase": {
"client.id": {
"query": "42",
"slop": 0,
"zero_terms_query": "NONE",
"boost": 1
}
}
},
{
"range": {
"@timestamp": {
"from": "{{period_end}}||-10m",
"to": "{{period_end}}",
"include_lower": true,
"include_upper": true,
"format": "epoch_millis",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"aggs": {
"2": {
"terms": {
"field": "component.name",
"order": {
"_key": "desc"
},
"size": 50
},
"aggs": {
"3": {
"terms": {
"field": "client.name.keyword",
"order": {
"_key": "desc"
},
"size": 5
},
"aggs": {
"1": {
"top_hits": {
"docvalue_fields": [
{
"field": "gc.oldgen.used",
"format": "use_field_mapping"
}
],
"_source": "gc.oldgen.used",
"size": 5,
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}
}
Did you try using a filter sub-aggregation?
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html
Alternatively, you can use a pipeline aggregation to manipulate your aggregation results:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html
By the way, a term query on the client ID looks more appropriate than match_phrase.
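If the response can be inspected outside the query itself, the check "all of the last 5 values exceed x" becomes a small filter over the top_hits buckets. The Python sketch below only illustrates that logic and is not something the alerting feature runs. It assumes the response shape produced by the query above (aggregations named "2", "3" and "1", with values returned under "fields" because of docvalue_fields); the function name and the threshold are hypothetical.

```python
FIELD = "gc.oldgen.used"


def buckets_over_threshold(response, threshold, n=5):
    """Return (component, client, values) for buckets whose latest n values all exceed threshold."""
    matches = []
    for component in response["aggregations"]["2"]["buckets"]:
        for client in component["3"]["buckets"]:
            hits = client["1"]["hits"]["hits"]
            # docvalue_fields come back as single-element lists under "fields"
            values = [hit["fields"][FIELD][0] for hit in hits]
            # require a full window of n values, every one above the threshold
            if len(values) == n and all(v > threshold for v in values):
                matches.append((component["key"], client["key"], values))
    return matches


if __name__ == "__main__":
    def top_hits(values):
        return {"hits": {"hits": [{"fields": {FIELD: [v]}} for v in values]}}

    sample = {
        "aggregations": {
            "2": {
                "buckets": [
                    {
                        "key": "component-a",
                        "3": {
                            "buckets": [
                                {"key": "client-1", "1": top_hits([9, 8, 7, 6, 6])},
                                {"key": "client-2", "1": top_hits([9, 8, 7, 6, 2])},
                            ]
                        },
                    }
                ]
            }
        }
    }
    print(buckets_over_threshold(sample, threshold=5))
```

Only buckets that pass the check are returned, so an alert message built from this result would list just the offending component and client pairs.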

Elastic search query is not executed

Hi, I am using the Elasticsearch engine to search for items that are placed in buildings. When I run this query, the returned items are not sorted, even if I change the sort direction. My first impression is that the sort block is not executed at all. Is there something wrong with the query?
{
"from": 0,
"size": 20,
"query": {
"bool": {
"filter": [
{
"terms": {
"buildingsUuid": [
"9caff147-d019-416a-a167-f02bab7334fd"
],
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"itemId": {
"order": "desc"
}
}
]
}

Getting SearchPhaseExecutionException using ElasticSearch Java Client

I am using a filtered query with sort. When I run the query using the browser plugin, it runs fine. But when I use the Java client that ships with Elasticsearch, I get this error:
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed
to execute phase [dfs], all shards failed; shardFailures
Here is the query that's being run:
{
"from": 0,
"size": 50,
"query": {
"filtered": {
"query": {
"bool": {
"must": {
"bool": {
"should": [
{
"match": {
"_all": {
"query": "Happy Pharrel Williams",
"type": "boolean"
}
}
},
{
"flt": {
"fields": [
"name",
"artists",
"genre",
"albumName"
],
"like_text": "Happy Pharrel Williams"
}
}
]
}
}
}
},
"filter": {
"bool": {
"must": {
"or": {
"filters": [
{
"range": {
"releaseInfo.us": {
"from": null,
"to": "2015-07-22T23:16:12.852Z",
"include_lower": true,
"include_upper": true
}
}
},
{
"and": {
"filters": [
{
"missing": {
"field": "releaseInfo.us"
}
},
{
"range": {
"releaseInfo.WW": {
"from": null,
"to": "2015-07-22T23:16:12.851Z",
"include_lower": true,
"include_upper": true
}
}
}
]
}
}
]
}
}
}
}
}
},
"fields": [],
"sort": [
{
"popularity.US": {
"order": "asc",
"missing": 999
}
},
{
"_score": {}
}
] }
I understand that the error sounds as if the field I am sorting on is missing in some of the indices. But I have provided the "missing" option in my sort, and the query runs just fine from the ES head browser plugin.
Do you see anything wrong with the query structure, or something else with the Java client?
I was getting the exception because I was sorting on a field that didn't exist in a number of the indexed documents. I re-indexed all the documents and it worked.
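A related point that may explain why "missing" did not help: that option only supplies a value for documents that lack the field, while an index with no mapping at all for the sort field can still fail the shard. Assuming the failures were of that kind, which the truncated error above does not show, the sort option unmapped_type is a possible alternative to re-indexing. The Python sketch below builds such a sort clause; the helper name sort_clause and the choice of "long" as the type are assumptions, and whether the asker's Elasticsearch version supports the option is not known.

```python
import json


def sort_clause(field, order="asc", missing=None, unmapped_type=None):
    """Build one sort entry; optional keys are only added when given."""
    options = {"order": order}
    if missing is not None:
        # value used for documents that have no value for the field
        options["missing"] = missing
    if unmapped_type is not None:
        # type to assume on indices that have no mapping for the field
        options["unmapped_type"] = unmapped_type
    return {field: options}


if __name__ == "__main__":
    sort = [
        sort_clause("popularity.US", missing=999, unmapped_type="long"),
        {"_score": {}},
    ]
    print(json.dumps(sort, indent=2))
```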

How do I limit an ElasticSearch API count by date?

I'm trying to count the number of query matches over a given time range, hitting the URL /{index}/_count with the body indicated below.
I'm new to Query DSL, so it's quite possible I'm overlooking something obvious. However, the straightforward application of a count to an existing query doesn't work. I don't see anything in the docs that indicates a count query should receive special treatment.
I've tried adding a range and aggregations to the query, but I keep getting the following error or some variant:
indices:data/read/count[s]]]; nested:
QueryParsingException[[graylog2_NN] request does not support [{label}]]
Limit query by timestamp:
{
"query": {
"term": { "level":3 },
"range": {
"timestamp": {
"from": "2015-06-16 15:10:09.322",
"to": "2015-06-16 16:10:09.322",
"include_lower": true,
"include_upper": true
}
}
}
}
Use an aggregation:
{
"query": {
"term": { "level":3 }
},
"aggs": {
"range": {
"date_range": {
field: "_timestamp",
"ranges": {
{ "to": "now-1d" },
{ "from": "now-2d" },
}
}
}
}
}
I've also tried plugging in the query exported from the UI (bug icon on an individual stream display), no joy there either (one hour's worth of matches):
{
"from": 0,
"size": 100,
"query": {
"match_all": {}
},
"post_filter": {
"bool": {
"must": [
{
"range": {
"timestamp": {
"from": "2015-06-16 15:10:09.322",
"to": "2015-06-16 16:10:09.322",
"include_lower": true,
"include_upper": true
}
}
},
{
"query": {
"query_string": {
"query": "streams:5568c9dbe4b0b31b781bf105"
}
}
}
]
}
},
"sort": [
{
"timestamp": {
"order": "desc"
}
}
],
"highlight": {
"require_field_match": false,
"fields": {
"*": {
"fragment_size": 0,
"number_of_fragments": 0
}
}
}
}
I've found a query that both matches and lines up pretty closely with numbers I get from the UI ("Search in the last 1 day"):
{
"query": {
"filtered": {
"query": {
"term": { "level":3 }
},
"filter": {
"range": { "timestamp": { "gte": "now-1d" } }
}
}
}
}
Try the following query, which uses a bool query. I use a different timestamp format, which is the default in Elasticsearch. Try that format first; if it doesn't work, change the timestamp format to match yours.
{
"query": {
"bool" : {
"must" : [
{
"term": { "level":3 }
},
{
"range": {
"timestamp": {
"from": "2015-06-16T15:10:09",
"to": "2015-06-16T16:10:09"
}
}
}
]
}
}
}
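Two details may explain the failed attempts above. The _count endpoint accepts only a query in its body, so keys such as aggs, from, size, sort or highlight are rejected with "request does not support [...]". And a query object can hold only one clause at its top level, so term and range have to be combined inside a bool (or, on 1.x, a filtered) query. The Python sketch below builds such a body with both conditions in filter context, which requires Elasticsearch 2.0 or later; on 1.x the filtered form shown earlier remains the one to use. The helper name count_body is hypothetical.

```python
import json


def count_body(level, gte, lte=None):
    """Body for /{index}/_count: level AND timestamp range, both in filter context."""
    time_range = {"gte": gte}
    if lte is not None:
        time_range["lte"] = lte
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"level": level}},
                    {"range": {"timestamp": time_range}},
                ]
            }
        }
    }


if __name__ == "__main__":
    # Count level-3 messages from the last day, as in the asker's final query.
    print(json.dumps(count_body(3, "now-1d"), indent=2))
```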
