How to get maximum value and id using Max aggregation by country in Elasticsearch - elasticsearch

I am getting the maximum value per country, but I also want the id of the document that holds that maximum. I have tried several approaches but cannot work out how to fetch it.
{
"aggs" : {
"country_groups" : {
"terms" : { "field" : "country.keyword",
"size":30000
},
"aggs":{
"max_price":{
"max": { "field" : "video_count"}
}
}
}
}
}

A max metric aggregation on video_count returns only the maximum number, not the document it belongs to.
To get the id together with the maximum, use a top_hits sub-aggregation inside each country bucket, sorted by video_count in descending order with size 1 (see max_price_and_id in the query below).
The _source filter limits each returned hit to channel_id and video_count.
{
"aggs": {
"country_groups": {
"terms": {
"field": "country.keyword",
"size": 30000
},
"aggs": {
"max_price_and_id": {
"top_hits": {
"size": 1,
"sort": {
"video_count": "desc"
},
"_source": ["channel_id", "video_count"]
}
}
}
}
}
}
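As a minimal sketch of how the response to the query above could be read on the client side, the Python below pulls the top hit out of each country bucket. The response layout follows the usual aggregations > buckets > hits structure; the sample data (keys, ids, counts) is made up for illustration, and the function name max_per_country is hypothetical.

```python
def max_per_country(response):
    """Map each country to (channel_id, video_count) taken from its top hit."""
    result = {}
    for bucket in response["aggregations"]["country_groups"]["buckets"]:
        hits = bucket["max_price_and_id"]["hits"]["hits"]
        if not hits:
            continue  # no document in this bucket carried a hit
        source = hits[0]["_source"]
        result[bucket["key"]] = (source["channel_id"], source["video_count"])
    return result


# Made-up response fragment, shaped like the output of the query above.
sample_response = {
    "aggregations": {"country_groups": {"buckets": [
        {"key": "US", "doc_count": 3, "max_price_and_id": {"hits": {"hits": [
            {"_id": "a1", "_source": {"channel_id": "UC123", "video_count": 950}}
        ]}}},
        {"key": "DE", "doc_count": 2, "max_price_and_id": {"hits": {"hits": [
            {"_id": "b7", "_source": {"channel_id": "UC456", "video_count": 410}}
        ]}}},
    ]}}
}

print(max_per_country(sample_response))
```

If the document _id is needed rather than channel_id, it is available on each hit as hits[0]["_id"].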

Related

How to do proportions in Elastic search query

I have a field in my data that has four unique values across all records. I need to aggregate the records by each unique value and find the proportion of each value in the data, that is, (number of records with that value) / (total number of records). Is there a way to do this with Elasticsearch dashboards? I have used a terms aggregation to group by the field and applied a value_count metric aggregation to get the doc_count value, but I am not able to use a bucket script to do the division. I am getting the error "buckets_path must reference either a number value or a single value numeric metric aggregation, got: [StringTerms] at aggregation [latest_version]".
Below is my code:
{
"size": 0,
"aggs": {
"BAR": {
"date_histogram": {
"field": "timestamp",
"calendar_interval": "day"
},
"aggs": {
"latest_version": {
"filter": {
"match_phrase": {
"log": "main_filter"
}
},
"aggs": {
"latest_version_count": {
"terms": {
"field": "field_name"
},
"aggs": {
"version_count": {
"value_count": {
"field": "field_name"
}
}
}
},
"sum_buckets": {
"sum_bucket": {
"buckets_path": "latest_version_count>_count"
}
}
}
},
"BAR-percentage": {
"bucket_script": {
"buckets_path": {
"eachVersionCount": "latest_version>latest_version_count",
"totalVersionCount": "latest_version>sum_buckets"
},
"script": "params.eachVersionCount/params.totalVersionCount"
}
}
}
}
}
}
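The error message indicates that a buckets_path entry resolved to the terms aggregation itself (a multi-bucket result) rather than to a single number. One possible workaround, sketched here under the assumption that the division can be done on the client instead of in a bucket_script, is to compute the proportions from the doc_count of each terms bucket. The function name and the sample buckets are hypothetical.

```python
def bucket_proportions(buckets):
    """Return {bucket key: doc_count / total doc_count} for a list of terms buckets."""
    total = sum(bucket["doc_count"] for bucket in buckets)
    if total == 0:
        return {}  # avoid dividing by zero when no documents matched
    return {bucket["key"]: bucket["doc_count"] / total for bucket in buckets}


# Made-up buckets, shaped like the "buckets" array of latest_version_count.
sample_buckets = [
    {"key": "v1", "doc_count": 4},
    {"key": "v2", "doc_count": 2},
    {"key": "v3", "doc_count": 1},
    {"key": "v4", "doc_count": 1},
]

print(bucket_proportions(sample_buckets))
```

With a date_histogram on top, as in the query above, the same function would be applied once per day bucket.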

Sort aggregation buckets by shared field values

I would like to group documents based on a group field G. I use the "field aggregation" strategy described in the Elastic documentation (called the "field collapse example" there) to sort the buckets by the maximal score of the contained documents, like this:
{
"query": {
"match": {
"body": "elections"
}
},
"aggs": {
"top_sites": {
"terms": {
"field": "domain",
"order": {
"top_hit": "desc"
}
},
"aggs": {
"top_tags_hits": {
"top_hits": {}
},
"top_hit" : {
"max": {
"script": {
"source": "_score"
}
}
}
}
}
}
}
This query also includes the top hits in each bucket.
If the maximal score is not unique for the buckets, I would like to specify a second order column. From the application context I know that inside a bucket all documents share the same value for a field F. Therefore, this field should be employed as the second order column.
How can I realize this in Elasticsearch? Is there a way to make a field from the top_hits sub-aggregation usable in the enclosing aggregation?
Any ideas? Many thanks!
It seems you can. The documentation page for the terms aggregation lists all the available ordering strategies.
It also includes an example of sorting buckets by multiple criteria:
Multiple criteria can be used to order the buckets by providing an
array of order criteria such as the following:
GET /_search
{
"aggs" : {
"countries" : {
"terms" : {
"field" : "artist.country",
"order" : [ { "rock>playback_stats.avg" : "desc" }, { "_count" : "desc" } ]
},
"aggs" : {
"rock" : {
"filter" : { "term" : { "genre" : "rock" }},
"aggs" : {
"playback_stats" : { "stats" : { "field" : "play_count" }}
}
}
}
}
}
}
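Applied to the question, one way this could look is sketched below in Python: a second single-value metric sub-aggregation on the shared field F is added, and its name is listed as the second entry of the order array. This assumes F is a numeric or date field (so a max aggregation can be computed on it); the names tie_break and build_top_sites_query are hypothetical.

```python
import json


def build_top_sites_query(tie_break_field):
    """Build a search body that orders buckets by max score, then by a shared field."""
    return {
        "query": {"match": {"body": "elections"}},
        "aggs": {
            "top_sites": {
                "terms": {
                    "field": "domain",
                    # First criterion: best score in the bucket; second: value of F.
                    "order": [{"top_hit": "desc"}, {"tie_break": "asc"}],
                },
                "aggs": {
                    "top_tags_hits": {"top_hits": {}},
                    "top_hit": {"max": {"script": {"source": "_score"}}},
                    # All documents in a bucket share F, so max simply yields that value.
                    "tie_break": {"max": {"field": tie_break_field}},
                },
            }
        },
    }


body = build_top_sites_query("F")
print(json.dumps(body, indent=2))
```

Because every document in a bucket carries the same value of F, min, max, or avg would all return that value; max is used here only as one convenient choice.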

Get all documents from elastic search with a field having same value

Say I have documents of type Order with a field bulkOrderId. The bulkOrderId identifies a group (bulk) of orders issued at once, so all orders in a group share the same id, like this:
Order {
"bulkOrderId": "bulkOrder:12345678"
}
The id is unique and is generated using UUID.
How do I find groups of orders with the same bulkOrderId from elasticsearch when the bulkOrderId is not known? Is it possible?
You can achieve that using a terms aggregation and a top_hits sub-aggregation, like this:
{
"query": {
"match_all": {}
},
"aggs": {
"bulks": {
"terms": {
"field": "bulkOrderId",
"size": 10
},
"aggs": {
"orders": {
"top_hits": {
"size": 10
}
}
}
}
}
}
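A minimal sketch of reading the result on the client side: the Python below turns the aggregation response into a mapping from each bulkOrderId to its orders. The sample response and the orderId field are made up for illustration, and the function name is hypothetical.

```python
def orders_by_bulk_id(response):
    """Group the top_hits documents of each terms bucket under its bulkOrderId."""
    groups = {}
    for bucket in response["aggregations"]["bulks"]["buckets"]:
        hits = bucket["orders"]["hits"]["hits"]
        groups[bucket["key"]] = [hit["_source"] for hit in hits]
    return groups


# Made-up response fragment, shaped like the output of the query above.
sample_response = {
    "aggregations": {"bulks": {"buckets": [
        {"key": "bulkOrder:12345678", "doc_count": 2, "orders": {"hits": {"hits": [
            {"_source": {"orderId": "o-1", "bulkOrderId": "bulkOrder:12345678"}},
            {"_source": {"orderId": "o-2", "bulkOrderId": "bulkOrder:12345678"}},
        ]}}},
    ]}}
}

for bulk_id, orders in orders_by_bulk_id(sample_response).items():
    print(bulk_id, [order["orderId"] for order in orders])
```

Note that the two size values in the query cap the result at 10 groups of at most 10 orders each, so larger data sets would need bigger values or a paging strategy.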

Elastic Search Aggregation and Details

I am trying to get the teacher's name as well in this query.
With the query below I can loop over the teachers and get, for each one, the number of classes she teaches and the amount of money she earns per year.
But I can't get the full details from this query; I want to display the teacher's name too.
Here is my current query:
{
"aggs": {
"teacher": {
"terms": {
"field": "teacher_id",
"size": 10
},
"aggs": {
"academic_year": {
"date_histogram": {
"field": "acc_year",
"interval": "year"
},
"aggs": {
"income": {
"stats": {
"field": "teacher_hourly_fee"
}
}
}
}
}
}
},
"size": 0
}
The most straightforward approach may be to combine the teacher ID and name into a single generated term using a script:
{
"aggs" : {
"teacher" : {
"terms" : {
"script" : "_source.teacher_id + '-' + _source.teacher_name",
"size": 10
}
}
}
}
Adjust script particulars per your actual schema.
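With this approach each bucket key comes back as a single string, so the client has to split it again. A small Python sketch, assuming the '-' separator from the script above and that teacher_id itself never contains a hyphen (the function name is hypothetical):

```python
def split_teacher_key(key):
    """Split a combined 'id-name' bucket key at the first hyphen."""
    teacher_id, _, teacher_name = key.partition("-")
    return teacher_id, teacher_name


# The name may contain hyphens; only the first one is treated as the separator.
print(split_teacher_key("42-Jane Smith-Jones"))
```

If ids can contain hyphens, a separator that cannot occur in either field would be a safer choice in the script.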

Calculating sum of nested fields with date_histogram aggregation in Elasticsearch

I'm having trouble getting the sum of a nested field in Elasticsearch using a date_histogram, and I'm hoping somebody can lend me a hand.
I have a mapping that looks like this:
"client" : {
// various irrelevant stuff here...
"associated_transactions" : {
"type" : "nested",
"include_in_parent" : true,
"properties" : {
"amount" : {
"type" : "double"
},
"effective_at" : {
"type" : "date",
"format" : "dateOptionalTime"
}
}
}
}
I'm trying to get a date_histogram that shows total revenue by month across all clients, i.e. a time series showing the sum of associated_transactions.amount in a histogram determined by associated_transactions.effective_at. I tried running this query:
{
"query": {
// ...
},
"aggregations": {
"revenue": {
"date_histogram": {
"interval": "month",
"min_doc_count": 0,
"field": "associated_transactions.effective_at"
},
"aggs": {
"monthly_revenue": {
"sum": {
"field": "associated_transactions.amount"
}
}
}
}
}
}
But the sum it's giving me isn't right. It seems that what ES is doing is finding all clients who have any transaction in a given month, then summing all of the transactions (from any time) for those clients. That is, it's a sum of the amount spent in the lifetime of a client who made a purchase in a given month, not the sum of purchases in a given month.
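The effect described above can be reproduced with a small plain-Python simulation (not Elasticsearch code). It assumes that, with include_in_parent, the transaction dates and amounts are treated as two independent multi-valued fields on the client document, so a client lands in every month it has a transaction in and contributes all of its amounts there. The sample clients and function names are made up.

```python
from collections import defaultdict

# Hypothetical sample data: each client holds (effective_at, amount) pairs.
clients = {
    "client_a": [("2014-01-10", 10.0), ("2014-02-05", 20.0)],
    "client_b": [("2014-01-20", 5.0)],
}


def month_of(date_string):
    """Return the 'YYYY-MM' prefix of an ISO date string."""
    return date_string[:7]


def flattened_monthly_sum(clients):
    """Mimic the observed result: each month gets the client's lifetime total."""
    totals = defaultdict(float)
    for transactions in clients.values():
        lifetime_total = sum(amount for _, amount in transactions)
        for month in {month_of(date) for date, _ in transactions}:
            totals[month] += lifetime_total
    return dict(totals)


def per_transaction_monthly_sum(clients):
    """The intended result: each amount counts only in its own month."""
    totals = defaultdict(float)
    for transactions in clients.values():
        for date, amount in transactions:
            totals[month_of(date)] += amount
    return dict(totals)


print(flattened_monthly_sum(clients))
print(per_transaction_monthly_sum(clients))
```

In this sample, client_a's February purchase inflates January and its January purchase inflates February, which matches the symptom described.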
Is there any way to get the data I'm looking for, or is this a limitation in how ES handles nested fields?
Thanks very much in advance for your help!
David
Try this?
{
"query": {
// ...
},
"aggregations": {
"revenue": {
"date_histogram": {
"interval": "month",
"min_doc_count": 0,
"field": "associated_transactions.effective_at",
"aggs": {
"monthly_revenue": {
"sum": {
"field": "associated_transactions.amount"
}
}
}
}
}
}
}
i.e. move the "aggs" key into the "date_histogram" field.
I stumbled upon this question while trying to solve a similar problem in my own Elasticsearch setup.
It seems that Elasticsearch currently looks at the position of an aggregation in the JSON request body, not at how its objects and fields are nested. So you should not put your sum aggregation inside "date_histogram"; place its "aggs" key outside, at the same level of the JSON tree as "date_histogram".
This worked for me:
{
"size": 0,
"aggs": {
"histogram_aggregation": {
"date_histogram": {
"field": "date_vield",
"calendar_interval": "day"
},
"aggs": {
"views": {
"sum": {
"field": "the_vield_i_want_to_sum"
}
}
}
}
},
"query": {
// ... some query
}
}
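The mistake is placing the sum aggregation inside the date_histogram aggregation, as the previous answer suggests; the "aggs" key in the OP's original query is already at the correct level.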
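To make the placement explicit, here is a minimal Python sketch that builds such a request body with the sub-aggregation as a sibling of the date_histogram key rather than a child of it. The function name and its parameters are hypothetical; the aggregation names follow the query above.

```python
import json


def date_histogram_with_sum(date_field, sum_field, interval="day"):
    """Build a search body whose sum sub-aggregation sits next to date_histogram."""
    return {
        "size": 0,
        "aggs": {
            "histogram_aggregation": {
                # The bucket definition only describes how to bucket documents.
                "date_histogram": {"field": date_field, "calendar_interval": interval},
                # Sub-aggregations go beside it, under their own "aggs" key.
                "aggs": {"views": {"sum": {"field": sum_field}}},
            }
        },
    }


body = date_histogram_with_sum("timestamp", "view_count")
print(json.dumps(body, indent=2))
```

The field names timestamp and view_count are placeholders; substitute the fields from your own mapping.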
