how to get derivative aggregations on the simple count - elasticsearch

Using ES 2.3.3
I want to use a Derivative Aggregation, but the metric it should be calculated from is not something like avg or sum; it is just the raw doc_count of each bucket of the parent histogram (sales_per_month).
I got it to work like this, by using a stats aggregation:
"aggs" : {
  "sales_per_month" : {
    "date_histogram" : {
      "field" : "date",
      "interval" : "month"
    },
    "aggs": {
      "sales": {
        "stats": {
          "field": "price"
        }
      },
      "sales_deriv": {
        "derivative": {
          "buckets_path": "sales.count"
        }
      }
    }
  }
}
Is this really the way to do this or am I missing a simpler way?

I don't think it can get any simpler than that. It looks good, simple and elegant.

There is no need to define a nested stats aggregation just to reference the count. Each bucket has an implicit _count property, which corresponds to doc_count, and you can use it in buckets_path.
Because the derivative in your example sits inside its parent aggregation, you only need to reference _count (you are already in the context of the sales_per_month aggregation).
In your specific case you would use it as:
"aggs": {
  "sales_per_month": {
    "date_histogram": {
      "field": "date",
      "interval": "month"
    },
    "aggs": {
      "sales_deriv": {
        "derivative": {
          "buckets_path": "_count"
        }
      }
    }
  }
}
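To see what a derivative over _count produces, the calculation can be mimicked on the client side. The following is only a minimal Python sketch, not Elasticsearch code: first_derivative is a hypothetical helper and the monthly counts are made-up values. It assumes, as the derivative aggregation does, that the first bucket has no value because it has no predecessor.

```python
from typing import List, Optional


def first_derivative(doc_counts: List[int]) -> List[Optional[int]]:
    """Return the bucket-to-bucket change in doc_count.

    The first bucket has no predecessor, so its entry is None.
    """
    result: List[Optional[int]] = []
    previous: Optional[int] = None
    for count in doc_counts:
        result.append(None if previous is None else count - previous)
        previous = count
    return result


# Hypothetical doc_count values for three consecutive monthly buckets.
monthly_counts = [3, 2, 5]
print(first_derivative(monthly_counts))
```

Each entry is simply the current bucket's doc_count minus the previous one, which is what "buckets_path": "_count" asks the derivative to work on.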

Related

How to also display the values within the bucket that considered during aggregation?

I need to aggregate records based on created_date, so that each created date has its own group of records. Could someone tell me how to display the created date along with each set of results?
"aggs": {
  "by_created_date": {
    "terms": {
      "field": "createddate"
    },
    _source["createddate"] //Something like this, so that I can see which date it has used.
    "aggs": {
      ....
    }, //Also may need to use some aggregation on this level.
  },
}
"aggs": {
  "by_created_date": {
    "terms": {
      "field": "createddate.keyword",
      "size": 1000
    },
    "aggs": {
      "bucket": {
        "terms": {
          "field": "field_name",
          "size": 10
        }
      }
    }
  }
}
The terms aggregation groups documents by the value of a field, and each returned bucket carries that value (here, the created date) in its key.
For nested grouping, add a sub-aggregation as in the code above.
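The created date of each group can be read from the key of each bucket in the response. Below is a minimal Python sketch of that lookup; the helper name created_dates_with_counts and the sample response are hypothetical, and the response is assumed to have the usual aggregations > by_created_date > buckets layout with the aggregation names used above.

```python
from typing import Any, Dict, List, Tuple


def created_dates_with_counts(response: Dict[str, Any]) -> List[Tuple[str, int]]:
    """Pair every bucket key (the created date) with its doc_count."""
    buckets = response["aggregations"]["by_created_date"]["buckets"]
    return [(bucket["key"], bucket["doc_count"]) for bucket in buckets]


# Hypothetical, trimmed-down search response.
sample_response = {
    "aggregations": {
        "by_created_date": {
            "buckets": [
                {"key": "2017-01-01", "doc_count": 4, "bucket": {"buckets": []}},
                {"key": "2017-01-02", "doc_count": 1, "bucket": {"buckets": []}},
            ]
        }
    }
}

for created_date, doc_count in created_dates_with_counts(sample_response):
    print(created_date, doc_count)
```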

elasticsearch Need average per week of some value

I have simple data:
sales, date_of_sales
What I need is the average per week, i.e. sum(sales) / number of weeks.
Please help.
What I have so far is:
{
  "size": 0,
  "aggs": {
    "WeekAggergation": {
      "date_histogram": {
        "field": "date_of_sales",
        "interval": "week"
      }
    },
    "TotalSales": {
      "sum": {
        "field": "sales"
      }
    },
    "myValue": {
      "bucket_script": {
        "buckets_path": {
          "myGP": "TotalSales",
          "myCount": "WeekAggergation._bucket_count"
        },
        "script": "params.myGP/params.myCount"
      }
    }
  }
}
I get the error
Invalid pipeline aggregation named [myValue] of type [bucket_script].
Only sibling pipeline aggregations are allowed at the top level.
I think this may help:
{
  "size": 0,
  "aggs": {
    "WeekAggergation": {
      "date_histogram": {
        "field": "date_of_sale",
        "interval": "week",
        "format": "yyyy-MM-dd"
      },
      "aggs": {
        "TotalSales": {
          "sum": {
            "field": "sales"
          }
        },
        "AvgSales": {
          "avg": {
            "field": "sales"
          }
        }
      }
    },
    "avg_all_weekly_sales": {
      "avg_bucket": {
        "buckets_path": "WeekAggergation>TotalSales"
      }
    }
  }
}
Note the TotalSales aggregation is now a nested aggregation under the weekly histogram aggregation (I believe there was a typo in the code provided - the simple schema provided indicated the field name of date_of_sale and the aggregation provided uses the plural form date_of_sales). This provides you a total of all sales in the weekly bucket.
Additionally, AvgSales provides a similar nested aggregation under the weekly histogram aggregation so you can see the average of all sales specific to that week.
Finally, the pipeline aggregation avg_all_weekly_sales gives the average of the weekly sales, based on the TotalSales value of each bucket and the number of non-empty buckets. If you want to include empty buckets, add the gap_policy parameter like so:
...
"avg_all_weekly_sales": {
  "avg_bucket": {
    "buckets_path": "WeekAggergation>TotalSales",
    "gap_policy": "insert_zeros"
  }
}
...
(see: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-pipeline-avg-bucket-aggregation.html).
This pipeline aggregation may or may not be what you are actually looking for, so please check the math to make sure the result is what you expect, but it should provide the correct output based on the original script.
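To check the math, the difference between skipping empty weeks and counting them as zero can be reproduced by hand. This is a minimal Python sketch of the arithmetic only, not of Elasticsearch itself: the avg_bucket function here is a hypothetical stand-in, None marks an empty weekly bucket, and the weekly totals are made-up numbers.

```python
from typing import List, Optional


def avg_bucket(values: List[Optional[float]], gap_policy: str = "skip") -> Optional[float]:
    """Average per-bucket values; None marks an empty bucket.

    "skip" ignores empty buckets, "insert_zeros" counts them as 0.
    """
    if gap_policy == "skip":
        usable = [v for v in values if v is not None]
    elif gap_policy == "insert_zeros":
        usable = [0.0 if v is None else v for v in values]
    else:
        raise ValueError(f"unsupported gap_policy: {gap_policy}")
    if not usable:
        return None  # nothing to average
    return sum(usable) / len(usable)


# Hypothetical weekly TotalSales values; the middle week had no sales.
weekly_totals = [100.0, None, 50.0]
print(avg_bucket(weekly_totals))                  # divides by 2 weeks
print(avg_bucket(weekly_totals, "insert_zeros"))  # divides by 3 weeks
```

With "skip" the total of 150 is divided by the two weeks that had sales; with "insert_zeros" it is divided by all three weeks, which is closer to the sum(sales) / number of weeks asked for in the question.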

ElasticSearch range in sum aggregation

I'm a new user of Elasticsearch and I would like to apply a range to a sum aggregation.
So far I have:
{
  "query": {},
  "aggs": {
    "group_by_trainset" : {
      "terms": {
        "field": "trainset",
        "order": { "sum_compteur": "desc" }
      },
      "aggs": {
        "sum_compteur": {
          "sum": {
            "field": "compteur"
          }
        }
      }
    }
  }
}
This gives me the first 10 results.
I would like pagination, or is that not possible with aggregations in Elasticsearch? I am trying to return the next 10 results.
In other words, I want to display the 10 results whose "sum_compteur" is lower than the lowest "sum_compteur" of the first 10 results, and I don't know how.
Thanks for your help!
The aggregation results stay the same from one request to the next as long as the input parameters do not change.
To control how many buckets are returned, set size on the terms aggregation; size and order are options of terms, not of sum:
"aggs": {
  "group_by_trainset": {
    "terms": {
      "field": "trainset",
      "size": 1000,
      "order": { "sum_compteur": "desc" }
    },
    "aggs": {
      "sum_compteur": {
        "sum": { "field": "compteur" }
      }
    }
  }
}
Here 1000 is the number of buckets you need.
The "order" setting sorts the buckets, and you can then paginate over the returned bucket array on the client side.
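Paginating over the returned bucket array can be done with plain list slicing. The following is a minimal Python sketch under the assumption that all needed buckets were fetched in one request and are already sorted; page_of_buckets is a hypothetical helper and the bucket data is made up.

```python
from typing import Any, Dict, List


def page_of_buckets(
    buckets: List[Dict[str, Any]], page: int, page_size: int = 10
) -> List[Dict[str, Any]]:
    """Return one page of an already sorted bucket list (pages start at 0)."""
    start = page * page_size
    return buckets[start:start + page_size]


# Hypothetical buckets, already ordered by sum_compteur descending.
buckets = [
    {"key": f"trainset-{i}", "sum_compteur": {"value": float(100 - i)}}
    for i in range(25)
]

second_page = page_of_buckets(buckets, page=1)
print([bucket["key"] for bucket in second_page])
```

Because the list is sorted in descending order, every bucket on the second page has a sum_compteur lower than the lowest value on the first page, which is what the question asks for.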

elastic search by day aggregation, sum of two properties

I'm trying to aggregate on the sum of two fields, but can't seem to get the syntax right.
Let's say I have the following aggregation:
{
  "aggregations": {
    "byDay": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "1d"
      },
      "aggregations": {
        "sum_a": {
          "sum": {
            "field": "a"
          }
        },
        "sum_b": {
          "sum": {
            "field": "b"
          }
        },
        "sum_a_and_b": {
          /* what goes here? */
        }
      }
    }
  }
}
What I really want is an aggregation that is the sum of fields a and b.
It seems like something that would be simple, but I've hit a brick wall trying to get it right. Online examples have either been too simple (summing only one field) or tried to do much more than this, so I've not found them helpful.
Try a terms aggregation that generates the terms using a script:
"aggs": {
  "sum_a_and_b": {
    "terms": {
      "script": "doc['a'].value + doc['b'].value"
    }
  }
}
To enable dynamic scripting, add the following to your config file (elasticsearch.yml by default):
script.aggs: true # enable just for aggregations
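Another way to fill the sum_a_and_b placeholder from the question is to add the two per-day sums with a bucket_script pipeline aggregation. The Python sketch below only builds the request body as a dict; it is an assumption-laden illustration, not the approach from the answer above. The helper name daily_sum_of_two_fields is hypothetical, the script assumes a version where pipeline scripts read their inputs through params (as the bucket_script elsewhere on this page does), and "interval" is kept as written in the question even though other versions may expect a different parameter name.

```python
import json
from typing import Any, Dict


def daily_sum_of_two_fields(timestamp_field: str, field_a: str, field_b: str) -> Dict[str, Any]:
    """Build a per-day histogram whose buckets carry sum_a, sum_b and their total."""
    return {
        "aggregations": {
            "byDay": {
                "date_histogram": {"field": timestamp_field, "interval": "1d"},
                "aggregations": {
                    "sum_a": {"sum": {"field": field_a}},
                    "sum_b": {"sum": {"field": field_b}},
                    # Pipeline aggregation: adds the two sibling sums per bucket.
                    "sum_a_and_b": {
                        "bucket_script": {
                            "buckets_path": {"a": "sum_a", "b": "sum_b"},
                            "script": "params.a + params.b",
                        }
                    },
                },
            }
        }
    }


# The timestamp field name is an assumption; use the one from your mapping.
body = daily_sum_of_two_fields("@timestamp", "a", "b")
print(json.dumps(body, indent=2))
```

Since the sum of (a + b) over a bucket equals sum_a + sum_b, this keeps the two existing metrics and derives the third from them instead of running a script per document.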

Getting cardinality of multiple fields?

How can I get count of all unique combinations of values of 2 fields that are present in documents of my database, i.e. achieve the same functionality as the "cardinality" aggregation provides, but for more than 1 field?
You can use a script to achieve this. Assuming the character '#' is not present in any value of either field (you can use anything else as the separator), the query you're looking for is shown below. Mind you, scripting comes with a performance hit.
{
  "aggs" : {
    "multi_field_cardinality" : {
      "cardinality" : {
        "script": "doc['<field1>'].value + '#' + doc['<field2>'].value"
      }
    }
  }
}
Read more about it here.
A better solution is to use nested aggregations and then count the resulting buckets.
"aggs": {
  "Group1": {
    "terms": {
      "field": "Field1",
      "size": 0
    },
    "aggs": {
      "Group2": {
        "terms": {
          "field": "Field2",
          "size": 0
        }
      }
    }
  }
}
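Counting the resulting buckets has to happen on the client side. Here is a minimal Python sketch of that step, assuming the response uses the Group1 and Group2 names from the query above and that every bucket was returned (no truncation by size); count_unique_combinations and the sample response are hypothetical.

```python
from typing import Any, Dict


def count_unique_combinations(response: Dict[str, Any]) -> int:
    """Count (Field1, Field2) pairs by summing the inner bucket counts."""
    outer_buckets = response["aggregations"]["Group1"]["buckets"]
    return sum(len(bucket["Group2"]["buckets"]) for bucket in outer_buckets)


# Hypothetical response: "x" occurs with "1" and "2", "y" only with "1".
sample_response = {
    "aggregations": {
        "Group1": {
            "buckets": [
                {
                    "key": "x",
                    "doc_count": 5,
                    "Group2": {"buckets": [
                        {"key": "1", "doc_count": 3},
                        {"key": "2", "doc_count": 2},
                    ]},
                },
                {
                    "key": "y",
                    "doc_count": 1,
                    "Group2": {"buckets": [{"key": "1", "doc_count": 1}]},
                },
            ]
        }
    }
}

print(count_unique_combinations(sample_response))
```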
