Elasticsearch: how to sort ES documents based on document version

Following is my ES query. I want to sort my documents in descending order based on "_version", but I'm not sure how to do it.
{
"query":
{
"bool":
{
"must":
[
{
"terms":
{
"streamingSegmentId":
[
"00003319-b7fa-3409-806a-fa3bb5d2be26"
],
"boost": 1
}
},
{
"range":
{
"streamingSegmentStartTime":
{
"from": 1644480000000,
"to": 1647476658447,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"version": true,
"_source":
{
"includes":
[
"errorCount",
"benefitId",
"streamingSegmentStopTime",
"fanoutPublishTimestamp",
"sessionUpdateTime",
"contentSegmentUpdateTime"
],
"excludes":
[]
},
"sort":
[
{
"streamingSegmentStartTime":
{
"order": "asc"
}
},
{
"_version":
{
"order": "desc"
}
}
]
}
I was not able to sort on _version, so I added a timestamp field called fanoutPublishTimestamp and now sort my documents on it in descending order. Below is my updated query; I'm using collapse to fetch only the document with the latest timestamp. The problem I now face is that collapse cannot be used together with search_after, which I use to add pagination support to my ES query.
I'm using AWS Elasticsearch, which runs ES 7.10, and only ES 8.1 supports collapse with search_after. Please let me know if anybody has a better solution to this issue.
GET /sessions/_search
{
"size": 2,
"query": {
"bool": {
"must": [
{
"terms": {
"benefitId": [
"PRIME"
],
"boost": 1
}
},
{
"range": {
"streamingSegmentStartTime": {
"from": 1647821557000,
"to": 1647825157000,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"_source": {
"includes": [
"deviceTypeId",
"timeline"
],
"excludes": []
},
"sort": [
{
"streamingSegmentStartTime": {
"order": "asc"
}
},
{
"fanoutPublishTimestamp": {
"order": "desc"
}
}
],
"search_after": [
"1647821557001",
"1647829603837"
],
"collapse": {
"field": "streamingSegmentId"
}
}

As you didn't provide sample documents or explain what "not working" means, I am assuming the problem comes from the two sort parameters you are using. If you sort on _version alone, it works (tested locally on my sample documents).
Most likely your other sort criterion, streamingSegmentStartTime, is causing some documents with a higher _version to appear later in the response. Try removing it and see whether you get the expected result.
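
To illustrate the point about the two sort keys, here is a small Python sketch with made-up documents (the IDs and values are hypothetical, and it assumes _version is accepted as a sort key, as reported above). It mimics the sort from the question: streamingSegmentStartTime ascending, with _version descending acting only as a tie-breaker.

```python
# Hypothetical documents; only the two sort fields matter here.
docs = [
    {"id": "a", "streamingSegmentStartTime": 100, "_version": 1},
    {"id": "b", "streamingSegmentStartTime": 200, "_version": 7},
    {"id": "c", "streamingSegmentStartTime": 100, "_version": 3},
]

def two_key_sort(items):
    # Primary key ascending; secondary key descending (hence the negation).
    return sorted(items, key=lambda d: (d["streamingSegmentStartTime"], -d["_version"]))

def version_only_sort(items):
    # Single key: highest _version first.
    return sorted(items, key=lambda d: d["_version"], reverse=True)

two_key_order = [d["id"] for d in two_key_sort(docs)]
version_order = [d["id"] for d in version_only_sort(docs)]
print(two_key_order)  # ['c', 'a', 'b']
print(version_order)  # ['b', 'c', 'a']
```

With both keys, document "b" has the highest version but comes last because its start time is the largest; _version only orders documents that share the same start time.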

Related

AWS Elasticsearch (v7.10.2): cannot use `collapse` in conjunction with `search_after`

I'm using AWS Elasticsearch version 7.10.2 and getting the error "cannot use `collapse` in conjunction with `search_after`" on the following query. I cannot upgrade the ES version, since according to the AWS console this is the highest supported version. Please let me know whether there is an alternative query that achieves the same result.
I want to use search_after for pagination support and collapse to pick the latest version of each document.
streamingSegmentId: the unique ID of each document.
streamingSegmentStartTime: my query can return many unique documents, and I want to sort the entire result set in ascending order of streamingSegmentStartTime.
fanoutPublishTimestamp: the timestamp at which the document was published. Since I want the latest version of each document, I sort all the documents that belong to the same streamingSegmentId in descending order and use collapse to pick the top one.
GET /sessions/_search
{
"size": 2,
"query": {
"bool": {
"must": [
{
"terms": {
"benefitId": [
"PRIME"
],
"boost": 1
}
},
{
"range": {
"streamingSegmentStartTime": {
"from": 1647821557000,
"to": 1647825157000,
"include_lower": true,
"include_upper": false,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"_source": {
"includes": [
"deviceTypeId",
"timeline"
],
"excludes": []
},
"sort": [
{
"streamingSegmentStartTime": {
"order": "asc"
}
},
{
"fanoutPublishTimestamp": {
"order": "desc"
}
}
],
"search_after": [
6749348300022,
6749348300048
],
"collapse": {
"field": "streamingSegmentId"
}
}
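
One possible workaround on 7.10, sketched in Python under stated assumptions: drop collapse, keep paging with search_after, and deduplicate on the client. Because the sort puts the newest fanoutPublishTimestamp first among equal streamingSegmentStartTime values, the first hit seen for each streamingSegmentId is its latest version. This assumes that all versions of a segment share the same start time and that _source includes streamingSegmentId. `fetch_page` is a hypothetical callable standing in for the real search call, and `fake_fetch` is an in-memory stand-in used only to make the sketch runnable.

```python
from typing import Callable, Iterator, List, Optional

def latest_per_segment(
    fetch_page: Callable[[Optional[list]], List[dict]],
) -> Iterator[dict]:
    """Yield the first hit seen for each streamingSegmentId across all pages."""
    seen = set()
    after = None  # search_after value; None requests the first page
    while True:
        hits = fetch_page(after)
        if not hits:
            return
        for hit in hits:
            segment_id = hit["_source"]["streamingSegmentId"]
            if segment_id not in seen:
                seen.add(segment_id)
                yield hit
        # The sort values of the last hit become the next search_after.
        after = hits[-1]["sort"]

# In-memory stand-in for the index, already in the query's sort order.
_ALL = [
    {"_source": {"streamingSegmentId": "s1"}, "sort": [100, 30]},
    {"_source": {"streamingSegmentId": "s1"}, "sort": [100, 20]},
    {"_source": {"streamingSegmentId": "s2"}, "sort": [150, 40]},
    {"_source": {"streamingSegmentId": "s2"}, "sort": [150, 10]},
]

def fake_fetch(after, size=2):
    # Hypothetical replacement for a search request with "search_after": after.
    if after is None:
        start = 0
    else:
        start = next(i for i, h in enumerate(_ALL) if h["sort"] == after) + 1
    return _ALL[start:start + size]

result = [(h["_source"]["streamingSegmentId"], h["sort"][1])
          for h in latest_per_segment(fake_fetch)]
print(result)  # [('s1', 30), ('s2', 40)]
```

The `seen` set grows with the number of distinct segments; if versions of a segment are always adjacent in the sort order, remembering only the previous ID would be enough. Pages may also come back with fewer unique documents than `size`, so the caller decides when it has collected enough.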

RestHighLevelClient Elastic Search Java Client takes more time for Search Query

Search queries sent through RestHighLevelClient take longer and perform poorly, while the same query sent directly to the Elasticsearch REST API takes less time.
How can we optimise the performance of this search request?
This is the code snippet:
SearchRequest searchRequest = new SearchRequest(new String[] {indexName}, searchSourceBuilder);
// Request an exact total hit count for the query
searchSourceBuilder.trackTotalHits(true);
searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
Query created using BoolQueryBuilder:
{
"size": 100,
"query": {
"bool": {
"filter": [
{
"range": {
"date": {
"from": "2017-08-01T00:00:00.000Z",
"to": "2017-08-30T00:00:00.000Z",
"include_lower": true,
"include_upper": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"date": {
"order": "desc"
}
},
{
"id": {
"order": "desc"
}
}
]
}
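
One difference worth ruling out, assuming Elasticsearch 7.x: `trackTotalHits(true)` asks for an exact hit count, whereas a search sent without that option counts hits only up to 10,000 by default. If the direct REST call omits `track_total_hits`, the two requests are not equivalent. The Python sketch below builds both request bodies so they can be compared; the helper name `build_body` is made up for illustration, and the range uses `gte`/`lte`, which is equivalent to the inclusive `from`/`to` above.

```python
def build_body(track_total_hits=None):
    """Return the search body from the question, optionally with track_total_hits."""
    body = {
        "size": 100,
        "query": {"bool": {"filter": [{"range": {"date": {
            "gte": "2017-08-01T00:00:00.000Z",
            "lte": "2017-08-30T00:00:00.000Z",
        }}}]}},
        "sort": [{"date": {"order": "desc"}}, {"id": {"order": "desc"}}],
    }
    if track_total_hits is not None:
        body["track_total_hits"] = track_total_hits
    return body

java_like = build_body(track_total_hits=True)  # mirrors trackTotalHits(true)
rest_default = build_body()                     # option omitted
only_in_java = sorted(set(java_like) - set(rest_default))
print(only_in_java)  # ['track_total_hits']
```

Sending both bodies to the REST API and comparing the `took` values in the responses would show whether the exact count accounts for the gap, or whether the time is spent in the client instead.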

Elastic - Filter after selecting top 5 hits

I'm using the alerting feature in Kibana and I want to check whether the last 5 consecutive values of a field exceed a threshold x. However, if I use a filter in my Elasticsearch query, it gets applied before the top N aggregation.
Is there a way to apply the filter afterwards, or to check whether the last consecutive values exceed a threshold using some other selector or method? I don't want to check this in the trigger condition in Painless, because that would return all the documents in the ctx and not just the ones that exceeded the threshold, which are the ones I want to display in my alert message.
I've been stuck on this for a while and have only seen blog posts saying that sub-aggregations are not possible on top N, so any help or workaround would be much appreciated.
This is my query:
{
"size": 500,
"query": {
"bool": {
"filter": [
{
"match_all": {
"boost": 1
}
},
{
"match_phrase": {
"client.id": {
"query": "42",
"slop": 0,
"zero_terms_query": "NONE",
"boost": 1
}
}
},
{
"range": {
"@timestamp": {
"from": "{{period_end}}||-10m",
"to": "{{period_end}}",
"include_lower": true,
"include_upper": true,
"format": "epoch_millis",
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"aggs": {
"2": {
"terms": {
"field": "component.name",
"order": {
"_key": "desc"
},
"size": 50
},
"aggs": {
"3": {
"terms": {
"field": "client.name.keyword",
"order": {
"_key": "desc"
},
"size": 5
},
"aggs": {
"1": {
"top_hits": {
"docvalue_fields": [
{
"field": "gc.oldgen.used",
"format": "use_field_mapping"
}
],
"_source": "gc.oldgen.used",
"size": 5,
"sort": [
{
"@timestamp": {
"order": "desc"
}
}
]
}
}
}
}
}
}
}
}
}
Have you tried using a filter sub-aggregation?
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-filter-aggregation.html
Alternatively, you can use a pipeline aggregation to manipulate your aggregation results:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline.html
By the way, a term query on the client ID looks more appropriate.
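
The check the question describes (do the last 5 values in each bucket all exceed the threshold?) can be sketched in Python against the aggregation response. This is only an illustration of the logic, not an alerting trigger. It assumes the response follows the aggregation names in the query ("2", "3", "1") and that each top hit carries its value under `fields`, which is where docvalue_fields results are returned. The function name and the sample response are invented.

```python
def buckets_over_threshold(aggs, threshold, field="gc.oldgen.used", n=5):
    """Return (component, client) pairs whose last n values all exceed threshold."""
    matches = []
    for comp in aggs["2"]["buckets"]:
        for client in comp["3"]["buckets"]:
            hits = client["1"]["hits"]["hits"]
            values = [h["fields"][field][0] for h in hits]
            # Require a full window so short series never trigger.
            if len(values) == n and all(v > threshold for v in values):
                matches.append((comp["key"], client["key"]))
    return matches

def _hits(values):
    # Build a fake top_hits result holding one docvalue per hit.
    return {"hits": {"hits": [{"fields": {"gc.oldgen.used": [v]}} for v in values]}}

sample = {"2": {"buckets": [{"key": "api", "3": {"buckets": [
    {"key": "acme", "1": _hits([9, 8, 9, 7, 8])},
    {"key": "beta", "1": _hits([9, 2, 9, 7, 8])},
]}}]}}

over = buckets_over_threshold(sample, threshold=5)
print(over)  # [('api', 'acme')]
```

Only the buckets returned by the function would then be listed in the alert message.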

Elastic search query is not executed

Hi, I am using Elasticsearch to search for items that are placed in buildings. When I run this query, the returned items are not sorted, even if I change the sort direction. My first impression is that the sort block is not executed at all. Is there something wrong with the query?
{
"from": 0,
"size": 20,
"query": {
"bool": {
"filter": [
{
"terms": {
"buildingsUuid": [
"9caff147-d019-416a-a167-f02bab7334fd"
],
"boost": 1
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
},
"sort": [
{
"itemId": {
"order": "desc"
}
}
]
}
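
One possible explanation, assuming itemId is mapped as a keyword (string) field rather than a numeric one: the sort does run, but it compares values as strings, so the order looks wrong for numeric IDs. The Python sketch below uses made-up IDs to show the difference.

```python
item_ids = ["9", "10", "2", "100"]  # made-up values

# Lexicographic comparison, as for a string-typed field.
as_strings = sorted(item_ids, reverse=True)
# Numeric comparison, as for a numeric field.
as_numbers = sorted(item_ids, key=int, reverse=True)

print(as_strings)  # ['9', '2', '100', '10']
print(as_numbers)  # ['100', '10', '9', '2']
```

Checking the index mapping for itemId would confirm or rule this out.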

Getting SearchPhaseExecutionException using ElasticSearch Java Client

I am using a filtered query with a sort. When I run the query using the browser plugin, it runs fine, but when I use the Java client that ships with Elasticsearch, I get this error:
org.elasticsearch.action.search.SearchPhaseExecutionException: Failed
to execute phase [dfs], all shards failed; shardFailures
Here is the query that's being run:
{
"from": 0,
"size": 50,
"query": {
"filtered": {
"query": {
"bool": {
"must": {
"bool": {
"should": [
{
"match": {
"_all": {
"query": "Happy Pharrel Williams",
"type": "boolean"
}
}
},
{
"flt": {
"fields": [
"name",
"artists",
"genre",
"albumName"
],
"like_text": "Happy Pharrel Williams"
}
}
]
}
}
}
},
"filter": {
"bool": {
"must": {
"or": {
"filters": [
{
"range": {
"releaseInfo.us": {
"from": null,
"to": "2015-07-22T23:16:12.852Z",
"include_lower": true,
"include_upper": true
}
}
},
{
"and": {
"filters": [
{
"missing": {
"field": "releaseInfo.us"
}
},
{
"range": {
"releaseInfo.WW": {
"from": null,
"to": "2015-07-22T23:16:12.851Z",
"include_lower": true,
"include_upper": true
}
}
}
]
}
}
]
}
}
}
}
}
},
"fields": [],
"sort": [
{
"popularity.US": {
"order": "asc",
"missing": 999
}
},
{
"_score": {}
}
]
}
I understand that the error suggests the field I am sorting on is missing in some of the indices. But I have provided the "missing" option in my sort, and the query runs just fine when I run it from the ES head browser plugin.
Do you see anything wrong with the query structure, or is it something to do with the Java client?
I was getting the exception because I was sorting on a field that didn't exist in a certain number of indexed documents. I re-indexed all the documents and it worked.
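
For reference, the "missing" option in a sort tells Elasticsearch which value to use for documents that have no value for the field. A rough Python analogy with made-up documents:

```python
docs = [
    {"name": "x", "popularity": 5},
    {"name": "y"},                   # no popularity value
    {"name": "z", "popularity": 1},
]

# Ascending sort where a missing value is treated as 999.
ordered = sorted(docs, key=lambda d: d.get("popularity", 999))
names = [d["name"] for d in ordered]
print(names)  # ['z', 'x', 'y']
```

Note that "missing" only covers documents without a value. When the field has no mapping at all in one of the searched indices, the sort option `unmapped_type` is the usual remedy, which may be relevant to a failure like the one described here.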
