Ordering Aggregation Buckets by Score

Is it possible to order the aggregation bucket by score?
"aggs": {
  "UnitAggregationBucket": {
    "terms": {
      "field": "unitId",
      "size": 10,
      /* "order": order by max score documents per bucket */
    }
  }
}
I have seen this document, which explains that the default order is by doc_count, but I cannot find out whether it is possible to order the buckets by score and, if so, how.

Yes, it is possible. Add a max sub-aggregation on _score and order the terms buckets by it:
{
  "size": 0,
  "query": {
    ...
  },
  "aggs": {
    "UnitAggregationBucket": {
      "terms": {
        "field": "unitId",
        "size": 10,
        "order": {
          "score": "desc"
        }
      },
      "aggs": {
        "score": {
          "max": {
            "script": "_score"
          }
        }
      }
    }
  }
}
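The same request body can also be assembled programmatically. The following is a minimal Python sketch, not part of the original answer: the helper name score_ordered_terms and the example match query are assumptions made for illustration. It only builds the dictionary; sending it to a cluster is left out.

```python
import json


def score_ordered_terms(field, query, size=10):
    """Build a search body whose terms buckets are ordered by max _score.

    The helper name is hypothetical; `field`, `query` and `size` come from
    the caller.
    """
    return {
        "size": 0,  # return aggregations only, no hits
        "query": query,
        "aggs": {
            "UnitAggregationBucket": {
                "terms": {
                    "field": field,
                    "size": size,
                    # refers to the "score" sub-aggregation defined below
                    "order": {"score": "desc"},
                },
                "aggs": {
                    # highest _score of any document in the bucket
                    "score": {"max": {"script": "_score"}}
                },
            }
        },
    }


if __name__ == "__main__":
    # The match query is a placeholder assumption.
    body = score_ordered_terms("unitId", {"match": {"title": "example"}})
    print(json.dumps(body, indent=2))
```

The key used in order has to match the name of the sub-aggregation ("score" here); renaming one without the other makes the request fail.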

Related

How to define percentage of result items with specific field in Elasticsearch query?

I have a search query that returns all items matching users that have type manager or lead.
{
  "from": 0,
  "size": 20,
  "query": {
    "bool": {
      "should": [
        {
          "terms": {
            "type": ["manager", "lead"]
          }
        }
      ]
    }
  }
}
Is there a way to define what percentage of the results should be of type "manager"?
In other words, I want the results to have 80% of users with type manager and 20% with type lead.

I suggest using a bucket_script aggregation, which reads its inputs through buckets_path. As far as I know, this aggregation has to run as a sub-aggregation of a multi-bucket aggregation such as a histogram. Since you have a suitable field in your mapping, I think this query should work for you:
{
  "size": 0,
  "aggs": {
    "NAME": {
      "date_histogram": {
        "field": "my_datetime",
        "interval": "month"
      },
      "aggs": {
        "role_type": {
          "terms": {
            "field": "type",
            "size": 10
          },
          "aggs": {
            "count": {
              "value_count": {
                "field": "_id"
              }
            }
          }
        },
        "role_1_ratio": {
          "bucket_script": {
            "buckets_path": {
              "role_1": "role_type['manager']>count",
              "role_2": "role_type['lead']>count"
            },
            "script": "params.role_1 / (params.role_1+params.role_2)*100"
          }
        },
        "role_2_ratio": {
          "bucket_script": {
            "buckets_path": {
              "role_1": "role_type['manager']>count",
              "role_2": "role_type['lead']>count"
            },
            "script": "params.role_2 / (params.role_1+params.role_2)*100"
          }
        }
      }
    }
  }
}
Please let me know if it does not work for you.
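To make the two bucket_script expressions concrete, their arithmetic can be reproduced outside Elasticsearch. This is a small Python sketch with a hypothetical function name and made-up counts. Note that the aggregation reports the share of each type per histogram bucket; it does not change which hits the search returns.

```python
def role_ratios(manager_count, lead_count):
    """Mirror the two bucket_script formulas: each role's share in percent."""
    total = manager_count + lead_count
    if total == 0:
        # Neither type occurs in this bucket; avoid dividing by zero.
        return None
    return {
        "role_1_ratio": manager_count / total * 100,
        "role_2_ratio": lead_count / total * 100,
    }


if __name__ == "__main__":
    # Made-up counts for a single monthly bucket.
    print(role_ratios(8, 2))
```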

How to mention from and size for the first level of elastic search aggregation in nested aggregation?

I have written a query that buckets documents by id and sorts the hits inside each bucket. This works fine. But how can I make it return the buckets from position 100 to 200 of the aggregation_by_id aggregation?
{
  "query": {
    "match_all": {}
  },
  "size": 0,
  "aggregations": {
    "aggregation_by_id": {
      "terms": {
        "field": "id.keyword",
        "size": 200
      },
      "aggs": {
        "sort_timestamp": {
          "top_hits": {
            "sort": [{
              "timestamp": {
                "order": "desc",
                "unmapped_type": "long"
              }
            }],
            "size": 1
          }
        }
      }
    }
  }
}

Paging the top_hits aggregation in ElasticSearch

Right now I'm doing a top_hits aggregation in Elasticsearch that groups my data by a field, sorts the groups by a date, and chooses the top 1.
I need to page the results of this aggregation so that I can pass in a pageSize and a pageNumber, but I don't know how.
In addition to this, I also need the total results of this aggregation so we can show it in a table in our web interface.
The aggregation looks like this:
POST my_index/_search
{
  "size": 0,
  "aggs": {
    "top_artifacts": {
      "terms": {
        "field": "artifactId.keyword"
      },
      "aggs": {
        "top_artifacts_hits": {
          "top_hits": {
            "size": 1,
            "sort": [{
              "date": {
                "order": "desc"
              }
            }]
          }
        }
      }
    }
  }
}

If I understand what you want, you should be able to paginate with a composite aggregation. You can still pass a size parameter for the page size, but instead of a from offset you pass the key of the last bucket of the previous page in after.
POST my_index/_search
{
  "size": 0,
  "aggs": {
    "top_artifacts": {
      "composite": {
        "sources": [
          {
            "artifact": {
              "terms": {
                "field": "artifactId.keyword"
              }
            }
          }
        ],
        "size": 1, // OPTIONAL SIZE (How many buckets)
        "after": {
          "artifact": "FOO_BAZ" // Buckets after this bucket key
        }
      },
      "aggs": {
        "hits": {
          "top_hits": {
            "size": 1,
            "sort": [
              {
                "timestamp": {
                  "order": "desc"
                }
              }
            ]
          }
        }
      }
    }
  }
}
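Paging through the composite aggregation then becomes a loop that feeds each response's after_key back into the next request. Below is a minimal Python sketch of that loop. The names iter_composite_buckets, fake_search and ARTIFACT_KEYS are assumptions for illustration, and fake_search is only an in-memory stand-in so the example runs without a cluster; in practice search would wrap a real client call.

```python
import copy


def iter_composite_buckets(search, body, agg_name):
    """Yield every bucket of a composite aggregation, page by page.

    `search` is any callable that takes a request body (dict) and returns
    the response as a dict.
    """
    body = copy.deepcopy(body)  # do not mutate the caller's request
    composite = body["aggs"][agg_name]["composite"]
    while True:
        result = search(body)["aggregations"][agg_name]
        buckets = result["buckets"]
        if not buckets:
            return  # an empty page means there is nothing left
        yield from buckets
        after_key = result.get("after_key")
        if after_key is None:
            return
        composite["after"] = after_key  # continue after the last bucket seen


# Made-up bucket keys, already in the order the aggregation would return.
ARTIFACT_KEYS = ["A", "B", "C", "D", "E"]


def fake_search(body):
    """In-memory stand-in for Elasticsearch, used only to exercise the loop."""
    composite = body["aggs"]["top_artifacts"]["composite"]
    size = composite.get("size", 10)
    after = composite.get("after", {}).get("artifact")
    remaining = [k for k in ARTIFACT_KEYS if after is None or k > after]
    page = remaining[:size]
    agg = {"buckets": [{"key": {"artifact": k}, "doc_count": 1} for k in page]}
    if page:
        agg["after_key"] = {"artifact": page[-1]}
    return {"aggregations": {"top_artifacts": agg}}


if __name__ == "__main__":
    request = {
        "size": 0,
        "aggs": {
            "top_artifacts": {
                "composite": {
                    "size": 2,
                    "sources": [
                        {"artifact": {"terms": {"field": "artifactId.keyword"}}}
                    ],
                }
            }
        },
    }
    for bucket in iter_composite_buckets(fake_search, request, "top_artifacts"):
        print(bucket["key"]["artifact"])
```

Because pages are addressed by key rather than by offset, jumping straight to an arbitrary pageNumber means walking the preceding pages first.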

How do I aggregate over top_hits results in elasticsearch

Here are example documents:
{
  "player": "Jim",
  "score": 5,
  "timestamp": 1459492890000
}
{
  "player": "Jim",
  "score": 7,
  "timestamp": 1459492895000
}
{
  "player": "Dave",
  "score": 9,
  "timestamp": 1459492894000
}
{
  "player": "Dave",
  "score": 4,
  "timestamp": 1459492898000
}
I want to get the latest score for each player and then get the average of all those scores. So the answer would be 5.5: Jim's latest score is 7 and Dave's latest score is 4, and the average of those two is 5.5.
The only way I found to get the "latest" document of a player was to use the top_hits aggregation. However, it does not seem that I am able to do another aggregation after I get the latest document.
This is the best I came up with:
{
  "aggs": {
    "last_score": {
      "terms": { "field": "player" },
      "aggs": {
        "last_score_hits": {
          "top_hits": {
            "sort": [ { "timestamp": { "order": "desc" } } ],
            "size": 1
          },
          "aggs": {
            "avg_score": {
              "avg": { "field": "score" }
            }
          }
        }
      }
    }
  }
}
However, this gives me this error:
Aggregator [last_score_hits] of type [top_hits] cannot accept sub-aggregations
If there is another way to accomplish this search without using top_hits as well, then I would be all for it.

You're trying to put avg_score as a sub-aggregation of last_score_hits.
To make it work, you have to put avg_score as a sub-aggregation of last_score instead, next to last_score_hits. See the example below:
{
  "aggs": {
    "last_score": {
      "terms": {
        "field": "player"
      },
      "aggs": {
        "last_score_hits": {
          "top_hits": {
            "sort": [
              {
                "timestamp": {
                  "order": "desc"
                }
              }
            ],
            "size": 1
          }
        },
        "avg_score": {
          "avg": {
            "field": "score"
          }
        }
      }
    }
  }
}
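The query above avoids the error, but avg_score, sitting next to top_hits, averages all scores in each player's bucket rather than only the latest one. The Python sketch below replays both computations client-side on the example documents from the question, so the two results can be compared. The function names are illustrative assumptions, and this is only one way to check the expected value, not an Elasticsearch query.

```python
from statistics import mean

# Example documents from the question.
DOCS = [
    {"player": "Jim", "score": 5, "timestamp": 1459492890000},
    {"player": "Jim", "score": 7, "timestamp": 1459492895000},
    {"player": "Dave", "score": 9, "timestamp": 1459492894000},
    {"player": "Dave", "score": 4, "timestamp": 1459492898000},
]


def per_player_average(docs):
    """What an avg placed beside top_hits computes: mean of all scores per player."""
    scores = {}
    for doc in docs:
        scores.setdefault(doc["player"], []).append(doc["score"])
    return {player: mean(values) for player, values in scores.items()}


def average_of_latest(docs):
    """What the question asks for: mean of each player's most recent score."""
    latest = {}
    for doc in docs:
        current = latest.get(doc["player"])
        if current is None or doc["timestamp"] > current["timestamp"]:
            latest[doc["player"]] = doc
    return mean(doc["score"] for doc in latest.values())


if __name__ == "__main__":
    print(per_player_average(DOCS))
    print(average_of_latest(DOCS))  # 5.5, the value expected in the question
```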

You can have other aggregations at the same level as top_hits, but you cannot put any sub-aggregation under top_hits. Elasticsearch does not support it; there is a GitHub issue about this.
You can add a sibling aggregation like this:
"aggs": {
  "top_hits_agg": {
    "top_hits": {
      "size": 10,
      "_source": {
        "includes": ["score"]
      }
    }
  },
  "avg_agg": {
    "avg": {
      "field": "score"
    }
  }
}

Union of sorted sized queries in Elasticsearch

I have docs in Elasticsearch like:
{
  "key1": 1,
  "key2": 2,
  "key3": 3
}
I would like to make a query that returns 30 docs, which are the union of:
the 10 docs with the highest values in key1 +
the 10 docs with the highest values in key2 +
the 10 docs with the highest values in key3
I have two ideas:
Using DisMaxQuery, but I couldn't use sorting. I probably missed something.
Using MultiSearch, but I would like to get a single result object.
Any suggestions would be helpful!

Another idea would be to add three terms aggregations on key1, key2 and key3, each sorted by a max sub-aggregation (in order to get the highest value for each key), and to give each of them a top_hits sub-aggregation. You might get fewer than 10 docs per key; if that's a problem, you can increase the size of the terms aggregations to 2 or 3 and then filter out the unneeded top hits on the client side.
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "topkey1": {
      "terms": {
        "field": "key1",
        "size": 1,
        "order": {
          "max_key1": "desc"
        }
      },
      "aggs": {
        "max_key1": {
          "max": {
            "field": "key1"
          }
        },
        "key1_tophits": {
          "top_hits": {
            "size": 10
          }
        }
      }
    },
    "topkey2": {
      "terms": {
        "field": "key2",
        "size": 1,
        "order": {
          "max_key2": "desc"
        }
      },
      "aggs": {
        "max_key2": {
          "max": {
            "field": "key2"
          }
        },
        "key2_tophits": {
          "top_hits": {
            "size": 10
          }
        }
      }
    },
    "topkey3": {
      "terms": {
        "field": "key3",
        "size": 1,
        "order": {
          "max_key3": "desc"
        }
      },
      "aggs": {
        "max_key3": {
          "max": {
            "field": "key3"
          }
        },
        "key3_tophits": {
          "top_hits": {
            "size": 10
          }
        }
      }
    }
  }
}
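The client-side filtering mentioned above could look roughly like the following Python sketch. It walks the terms buckets of one aggregation in the order returned, keeps the first hits up to a limit, and then merges the per-key lists by _id. The function names and the hand-written SAMPLE response are assumptions for illustration; only the response shape (aggregations, buckets, hits.hits) follows what Elasticsearch returns.

```python
def collect_top_hits(response, agg_name, tophits_name, limit=10):
    """Walk the terms buckets in order and keep the first `limit` hits."""
    hits = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        for hit in bucket[tophits_name]["hits"]["hits"]:
            if len(hits) == limit:
                return hits
            hits.append(hit)
    return hits


def union_by_id(*hit_lists):
    """Merge hit lists, keeping each document once (first occurrence wins)."""
    seen = {}
    for hits in hit_lists:
        for hit in hits:
            seen.setdefault(hit["_id"], hit)
    return list(seen.values())


# Hand-written response fragment with made-up ids, for demonstration only.
SAMPLE = {
    "aggregations": {
        "topkey1": {
            "buckets": [
                {"key": 9, "key1_tophits": {"hits": {"hits": [{"_id": "a"}, {"_id": "b"}]}}},
                {"key": 8, "key1_tophits": {"hits": {"hits": [{"_id": "c"}]}}},
            ]
        },
        "topkey2": {
            "buckets": [
                {"key": 7, "key2_tophits": {"hits": {"hits": [{"_id": "b"}, {"_id": "d"}]}}},
            ]
        },
    }
}

if __name__ == "__main__":
    top1 = collect_top_hits(SAMPLE, "topkey1", "key1_tophits", limit=3)
    top2 = collect_top_hits(SAMPLE, "topkey2", "key2_tophits", limit=10)
    print([hit["_id"] for hit in union_by_id(top1, top2)])
```

A document that ranks highly for more than one key appears only once in the merged list, so the union can contain fewer than 30 docs.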
