how to sort the records by another field for the top_hit in ES

I have data such as:
Id name startTime(timestamp)
1 c 1510000000000
2 c 1500000000000
3 a 1510000000000
4 a 1500000000000
5 b 1500662700000
I want to get the record with the max startTime for each name, and then sort the results by name.
The result should be:
Id name startTime(timestamp)
3 a 1510000000000
5 b 1500662700000
1 c 1510000000000
Currently, I can get the max startTime record for each name, but I don't know how to sort the results by name.
Here is my query:
GET index/default/_search
{
"aggs": {
"group": {
"terms": {
"field": "name"
},
"aggs": {
"tops": {
"top_hits": {
"sort": [
{
"startTime": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
},
"size": 0
}

As I understand it, in addition to the top_hits sort, you want the name buckets themselves to be sorted by name.
Have a look at the order option of the terms aggregation. All you have to do is add an order by key under the terms aggregation; the "order" clause below does the trick. (On Elasticsearch 6.0 and later the key is written "_key"; "_term" is the older, deprecated spelling.)
Here is my suggestion:
{
"aggs": {
"group": {
"terms": {
"field": "name",
"order": {
"_term": "asc"
}
},
"aggs": {
"tops": {
"top_hits": {
"sort": [
{
"startTime": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
},
"size": 0
}
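
To make the intended result concrete, here is a minimal Python sketch that emulates the query client-side on the sample rows from the question. It is not Elasticsearch code; the names RECORDS and latest_per_name are made up for illustration. Sorting by name stands in for the terms "order" clause, and max() stands in for top_hits with size 1 sorted by startTime descending.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: Id, name, startTime.
RECORDS = [
    {"Id": 1, "name": "c", "startTime": 1510000000000},
    {"Id": 2, "name": "c", "startTime": 1500000000000},
    {"Id": 3, "name": "a", "startTime": 1510000000000},
    {"Id": 4, "name": "a", "startTime": 1500000000000},
    {"Id": 5, "name": "b", "startTime": 1500662700000},
]


def latest_per_name(records):
    """Emulate terms (ordered by key asc) + top_hits (startTime desc, size 1)."""
    # Sorting by name yields the groups in ascending key order,
    # which is what the "order" clause of the terms aggregation asks for.
    by_name = sorted(records, key=itemgetter("name"))
    result = []
    for _, group in groupby(by_name, key=itemgetter("name")):
        # max() picks the single newest row of the group, like top_hits size 1.
        result.append(max(group, key=itemgetter("startTime")))
    return result


for row in latest_per_name(RECORDS):
    print(row["Id"], row["name"], row["startTime"])
```

On the sample rows this selects the newest row of a, then b, then c (Ids 3, 5 and 1).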

Related

How to get the last Elasticsearch document for each unique value of a field?

I have a data structure in Elasticsearch that looks like:
{
"name": "abc",
"date": "2022-10-08T21:30:40.000Z",
"rank": 3
}
I want to get, for each unique name, the rank of the document (or the whole document) with the most recent date.
I currently have this:
"aggs": {
"group-by-name": {
"terms": {
"field": "name"
},
"aggs": {
"max-date": {
"max": {
"field": "date"
}
}
}
}
}
How can I get the rank (or the whole document) for each result, if possible in a single request?

You can use either of the options below:
Collapse
"collapse": {
"field": "name"
},
"sort": [
{
"date": {
"order": "desc"
}
}
]
Top hits aggregation
{
"aggs": {
"group-by-name": {
"terms": {
"field": "name",
"size": 100
},
"aggs": {
"top_doc": {
"top_hits": {
"sort": [
{
"date": {
"order": "desc"
}
}
],
"size": 1
}
}
}
}
}
}
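
For the top hits variant, the rank still has to be pulled out of each bucket on the client. The following Python sketch shows one way to do that, assuming the response has the usual aggregations / buckets / hits.hits / _source nesting. The function name latest_rank_by_name and the SAMPLE_RESPONSE fragment are hypothetical; the fragment is hand-written for illustration, not real cluster output.

```python
def latest_rank_by_name(response):
    """Map each bucket key (name) to the rank of its single top hit."""
    buckets = response["aggregations"]["group-by-name"]["buckets"]
    ranks = {}
    for bucket in buckets:
        hits = bucket["top_doc"]["hits"]["hits"]
        if hits:  # defensive: skip a bucket that carries no hit
            ranks[bucket["key"]] = hits[0]["_source"]["rank"]
    return ranks


# Hand-written response fragment, for illustration only.
SAMPLE_RESPONSE = {
    "aggregations": {
        "group-by-name": {
            "buckets": [
                {
                    "key": "abc",
                    "doc_count": 2,
                    "top_doc": {"hits": {"hits": [
                        {"_source": {"name": "abc",
                                     "date": "2022-10-08T21:30:40.000Z",
                                     "rank": 3}}
                    ]}},
                },
                {
                    "key": "xyz",
                    "doc_count": 1,
                    "top_doc": {"hits": {"hits": [
                        {"_source": {"name": "xyz",
                                     "date": "2022-10-07T10:00:00.000Z",
                                     "rank": 7}}
                    ]}},
                },
            ]
        }
    }
}

print(latest_rank_by_name(SAMPLE_RESPONSE))
```

Returning hits[0]["_source"] instead of only the rank gives the whole document per name.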

How to mention from and size for the first level of elastic search aggregation in nested aggregation?

I have written a query that buckets documents by id and sorts the hits in each bucket by timestamp. This works fine, but how can I make it return only the buckets at positions 100 to 200 of the aggregation_by_id aggregation?
{
"query": {
"match_all": {}
},
"size": 0,
"aggregations": {
"aggregation_by_id": {
"terms": {
"field": "id.keyword",
"size" : 200
},
"aggs": {
"sort_timestamp": {
"top_hits": {
"sort": [{
"timestamp": {
"order": "desc",
"unmapped_type": "long"
}
}],
"size": 1
}
}
}
}
}
}

How to take more fields when grouping

I'm trying to group data and also return all the fields of each group.
GET /testnews/default/_search
{
"size": 10,
"from":50,
"query":{
"multi_match": {
"query": "serenay",
"fields": ["Data.Title", "Data.Description", "Data.Tags.Title", "Data.MentionTitle", "Data.Program.title", "Data.Program.description", "Data.Program.original_title"]
}
},
"sort":[{
"Data.CreatedAt": {
"order": "desc"
},
"Data.ViewCount": {
"order": "desc"
}
}],
"aggs": {
"group_by_state": {
"terms": {
"field": "Data.Program.title.keyword"
}
}
}
}
But when I run it, the grouped result contains only the program title and the count, like this:
{
"key": "Kocamın Ailesi",
"doc_count": 3
}
But I want it to look like this:
{
"key": "Kocamın Ailesi",
"description": "blabla",
"image": "blabla.jpg",
"date": "YYYY-mm-dd",
"doc_count": 3
}
just like this SQL:
select * from x group by field

Regarding the SQL example, to get the behaviour of
select a, b, count(*) from x group by a, b
you can aggregate on a, then b like this:
"aggs": {
"group_by_a": {
"terms": {
"field": "a"
},
"aggs": {
"group_by_b": {
"terms": {
"field":"b"
}
}
}
}
}
But I don't think that is what you're looking for?
If you want the full documents in aggregations you can use the "top_hits" aggregation to select the top n hits within each aggregation:
{
"aggs": {
"group_by_state": {
"terms": {
"field": "Data.Program.title.keyword"
},
"aggs": {
"state_top_hits": {
"top_hits": {
"sort": [
{ "Data.CreatedAt": { "order": "desc" } },
{ "Data.ViewCount": { "order": "desc" } }
],
"_source": {
"includes": [ "key", "description", "image", "date" ]
},
"size": 10 //Will show top 10 hits within keyword agg ordered according to the sort
}
}
}
}
}
}
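
The top_hits result is nested inside each bucket, so getting rows shaped like the ones asked for in the question takes a small client-side step. The Python sketch below flattens each bucket into one row from its first hit. The function name buckets_to_rows is hypothetical, and the flat description / image / date fields in SAMPLE_RESPONSE are an assumption taken from the desired output in the question; real documents in this index appear to nest their fields under Data.

```python
def buckets_to_rows(response, agg="group_by_state", hits_agg="state_top_hits"):
    """Flatten each terms bucket into one row built from its first top hit."""
    rows = []
    for bucket in response["aggregations"][agg]["buckets"]:
        hits = bucket[hits_agg]["hits"]["hits"]
        source = hits[0]["_source"] if hits else {}
        row = {"key": bucket["key"]}
        row.update(source)  # copy the selected _source fields into the row
        row["doc_count"] = bucket["doc_count"]
        rows.append(row)
    return rows


# Hand-written response fragment with assumed flat field names.
SAMPLE_RESPONSE = {
    "aggregations": {
        "group_by_state": {
            "buckets": [
                {
                    "key": "Kocamın Ailesi",
                    "doc_count": 3,
                    "state_top_hits": {"hits": {"hits": [
                        {"_source": {"description": "blabla",
                                     "image": "blabla.jpg",
                                     "date": "YYYY-mm-dd"}}
                    ]}},
                }
            ]
        }
    }
}

for row in buckets_to_rows(SAMPLE_RESPONSE):
    print(row)
```

Only the first hit per bucket is used here; with "size": 10 the remaining hits are still available in the same list.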

how to bucket empty and non empty fields in nested aggregation in elasticsearch?

I have the following set of nested sub-aggregations in Elasticsearch (field2 is a sub-aggregation of field1, and field3 is a sub-aggregation of field2).
It turns out, however, that the terms aggregation for field3 does not bucket documents that don't have field3.
My understanding is that I have to use a missing sub-aggregation to bucket those documents, in addition to the terms aggregation for field3.
But I am not sure how to add it to the query below so that both are bucketed.
{
"size": 0,
"aggregations": {
"f1": {
"terms": {
"field": "field1",
"size": 0,
"order": {
"_count": "asc"
},
"include": [
"123"
]
},
"aggregations": {
"field2": {
"terms": {
"field": "f2",
"size": 0,
"order": {
"_count": "asc"
},
"include": [
"tr"
]
},
"aggregations": {
"field3": {
"terms": {
"field": "f3",
"order": {
"_count": "asc"
},
"size": 0
},
"aggregations": {
"aggTopHits": {
"top_hits": {
"size": 1
}
}
}
}
}
}
}
}
}
}

In version 2.1.2 and later, you can use the missing parameter of the terms aggregation, which lets you specify a default value for documents that lack the field; here the default is "n/a". (FYI, the missing parameter has been available since 2.0, but a bug prevented it from working on sub-aggregations, which is how you would use it here.)
...
"aggregations": {
"field3": {
"terms": {
"field": "f3",
"order": {
"_count": "asc"
},
"size": 0,
"missing": "n/a"
},
"aggregations": {
"aggTopHits": {
"top_hits": {
"size": 1
}
}
}
}
}
However, if you are working with a pre-2.x ES cluster, you can use the missing aggregation at the same depth as your field3 aggregation to bucket the documents that are missing "f3" like this:
...
"aggregations": {
"field3": {
"terms": {
"field": "f3",
"order": {
"_count": "asc"
},
"size": 0
},
"aggregations": {
"aggTopHits": {
"top_hits": {
"size": 1
}
}
}
},
"missing_field3": {
"missing" : {
"field": "f3"
},
"aggregations": {
"aggTopMissingHit": {
"top_hits": {
"size": 1
}
}
}
}
}
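
With the separate missing aggregation, the documents without f3 come back next to the field3 buckets rather than inside them, so the client has to combine the two. A minimal Python sketch of that merge, assuming the standard response nesting: the helper names first_source and f3_buckets_with_missing, the "n/a" label, and the SAMPLE_FIELD2_BUCKET fragment are all made up for illustration.

```python
def first_source(top_hits_result):
    """Return the _source of the first hit of a top_hits result, or None."""
    hits = top_hits_result["hits"]["hits"]
    return hits[0]["_source"] if hits else None


def f3_buckets_with_missing(field2_bucket, missing_label="n/a"):
    """Combine the field3 terms buckets with the missing_field3 bucket."""
    rows = [
        {"key": b["key"], "doc_count": b["doc_count"],
         "top": first_source(b["aggTopHits"])}
        for b in field2_bucket["field3"]["buckets"]
    ]
    missing = field2_bucket["missing_field3"]
    if missing["doc_count"] > 0:
        # The missing aggregation is a single bucket, so give it a synthetic key.
        rows.append({"key": missing_label, "doc_count": missing["doc_count"],
                     "top": first_source(missing["aggTopMissingHit"])})
    return rows


# Hand-written fragment of one field2 bucket, for illustration only.
SAMPLE_FIELD2_BUCKET = {
    "key": "tr",
    "doc_count": 3,
    "field3": {"buckets": [
        {"key": "x", "doc_count": 2,
         "aggTopHits": {"hits": {"hits": [{"_source": {"f2": "tr", "f3": "x"}}]}}},
    ]},
    "missing_field3": {
        "doc_count": 1,
        "aggTopMissingHit": {"hits": {"hits": [{"_source": {"f2": "tr"}}]}},
    },
}

for row in f3_buckets_with_missing(SAMPLE_FIELD2_BUCKET):
    print(row["key"], row["doc_count"], row["top"])
```

The result has the same shape the missing parameter would produce directly on 2.1.2 and later, which keeps downstream code identical for both cluster versions.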

Union of sorted sized queries in Elasticsearch

I have docs in Elasticsearch like:
{
"key1":1,
"key2":2,
"key3":3
}
I would like to make a query that returns 30 docs which are the union of the:
the 10 docs with the highest values in key1 +
the 10 docs with the highest values in key2 +
the 10 docs with the highest values in key3
I have two ideas:
Using DisMaxQuery, but I couldn't get sorting to work. I probably missed something.
Using MultiSearch, but I would like to get a single result object.
Any suggestions would be helpful!

Another idea would be to add three terms aggregations on key1, key2 and key3, each sorted by a max sub-aggregation (in order to get the highest value for each key), and to give each of them a top_hits sub-aggregation as well. You might get fewer than 10 docs per key; if that's a problem, you can increase the size of the terms aggregations to 2 or 3 and then filter out the unneeded top hits on the client side.
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"topkey1": {
"terms": {
"field": "key1",
"size": 1,
"order": {
"max_key1": "desc"
}
},
"aggs": {
"max_key1": {
"max": {
"field": "key1"
}
},
"key1_tophits": {
"top_hits": {
"size": 10
}
}
}
},
"topkey2": {
"terms": {
"field": "key2",
"size": 1,
"order": {
"max_key2": "desc"
}
},
"aggs": {
"max_key2": {
"max": {
"field": "key2"
}
},
"key2_tophits": {
"top_hits": {
"size": 10
}
}
}
},
"topkey3": {
"terms": {
"field": "key3",
"size": 1,
"order": {
"max_key3": "desc"
}
},
"aggs": {
"max_key3": {
"max": {
"field": "key3"
}
},
"key3_tophits": {
"top_hits": {
"size": 10
}
}
}
}
}
}
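
The query returns the hits in separate aggregations, so the union itself is built on the client. Here is a hedged Python sketch of that step: it walks every terms aggregation in the response, picks up any sub-aggregation result that carries a "hits" key, and drops documents already seen under another key. The function name union_of_top_hits and the SAMPLE_RESPONSE fragment are hypothetical.

```python
def union_of_top_hits(response):
    """Merge the top hits of all terms aggregations, de-duplicated by _id."""
    seen = set()
    merged = []
    for agg in response["aggregations"].values():
        for bucket in agg.get("buckets", []):
            for sub in bucket.values():
                # Only top_hits results have a "hits" key; max results do not.
                if isinstance(sub, dict) and "hits" in sub:
                    for hit in sub["hits"]["hits"]:
                        if hit["_id"] not in seen:
                            seen.add(hit["_id"])
                            merged.append(hit)
    return merged


# Hand-written response fragment, for illustration only.
SAMPLE_RESPONSE = {
    "aggregations": {
        "topkey1": {"buckets": [{
            "key": 9, "doc_count": 2, "max_key1": {"value": 9.0},
            "key1_tophits": {"hits": {"hits": [
                {"_id": "1", "_source": {"key1": 9, "key2": 1, "key3": 1}},
                {"_id": "2", "_source": {"key1": 9, "key2": 8, "key3": 2}},
            ]}},
        }]},
        "topkey2": {"buckets": [{
            "key": 8, "doc_count": 2, "max_key2": {"value": 8.0},
            "key2_tophits": {"hits": {"hits": [
                {"_id": "2", "_source": {"key1": 9, "key2": 8, "key3": 2}},
                {"_id": "3", "_source": {"key1": 1, "key2": 8, "key3": 3}},
            ]}},
        }]},
    }
}

print([hit["_id"] for hit in union_of_top_hits(SAMPLE_RESPONSE)])
```

Because duplicates are dropped, the merged list can hold fewer than 30 documents when the same document ranks high for more than one key.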
