Kibana - Calculating duration between events - elasticsearch

I am pushing events directly into the Elasticsearch REST API in the following format:
Timestamp
RequestId
EventName
I would like to bucket by RequestId and then subtract the minimum Timestamp from the maximum to calculate the duration between events in Kibana.
I can bucket them quite easily in Kibana, but it's not obvious how to calculate the duration. I have been changing the JSON input to try to get it to render something sensible, without luck.
I have managed to achieve what I want using the Elasticsearch API directly:
{
"size": 0,
"query": { },
"aggs": {
"requests_field": {
"terms": {
"field": "requestId",
"size": 5
},
"aggs": {
"min_date": {
"min": {
"field": "timeStamp"
}
},
"max_date": {
"max": {
"field": "timeStamp"
}
},
"duration" : {
"bucket_script" : {
"buckets_path" : {
"minDate" : "min_date",
"maxDate" : "max_date"
},
"script" : "maxDate-minDate"
}
}
}
}
}
}
How can I "visualise" this in Kibana as a simple line graph?
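For reference, here is a minimal Python sketch of the per-request computation the query above performs, done client-side. It assumes each event is a dict with `requestId` and `timeStamp` keys and that timestamps are epoch milliseconds (which is how Elasticsearch reports min/max values on date fields); the function name and the sample events are hypothetical.

```python
def durations_by_request(events):
    """Return {requestId: max(timeStamp) - min(timeStamp)} in milliseconds."""
    bounds = {}
    for event in events:
        rid, ts = event["requestId"], event["timeStamp"]
        lo, hi = bounds.get(rid, (ts, ts))
        bounds[rid] = (min(lo, ts), max(hi, ts))
    return {rid: hi - lo for rid, (lo, hi) in bounds.items()}


# Hypothetical sample events; timeStamp values are epoch milliseconds.
events = [
    {"requestId": "a", "timeStamp": 1000, "eventName": "start"},
    {"requestId": "a", "timeStamp": 1450, "eventName": "end"},
    {"requestId": "b", "timeStamp": 2000, "eventName": "start"},
    {"requestId": "b", "timeStamp": 2300, "eventName": "middle"},
    {"requestId": "b", "timeStamp": 2900, "eventName": "end"},
]
durations = durations_by_request(events)
```

The resulting mapping has one duration per request, which is the same shape as the `duration` value the `bucket_script` emits in each `requests_field` bucket.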

Related

How to get maximum value and id using Max aggregation by country in Elasticsearch

I am getting the maximum value by country, but I also want the id of the document that holds that maximum. I have tried many ways but can't work out how to fetch it.
{
"aggs" : {
"country_groups" : {
"terms" : { "field" : "country.keyword",
"size":30000
},
"aggs":{
"max_price":{
"max": { "field" : "video_count"}
}
}
}
}
}
Depending on the type of your id field (numeric or string), you have two ways of doing it.
If you look at the query below, if your id is numeric you can do the same as you did with video_count, i.e. using the max metric aggregation (see max_id_num).
However, if your id field is a string, you can leverage the top_hits aggregation and sort it in descending order (see max_id_str).
{
"aggs": {
"country_groups": {
"terms": {
"field": "country.keyword",
"size": 30000
},
"aggs": {
"max_price_and_id": {
"top_hits": {
"size": 1,
"sort": {
"video_count": "desc"
},
"_source": ["channel_id", "video_count"]
}
}
}
}
}
}
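As a rough illustration of how the result of the `top_hits` query above could be consumed, the Python sketch below walks the response and picks the single hit kept in each country bucket. The aggregation names (`country_groups`, `max_price_and_id`) and the `_source` fields come from the query; the function name and the sample response are hypothetical.

```python
def top_channel_per_country(response):
    """Map each country key to the _source of its highest-video_count document."""
    result = {}
    for bucket in response["aggregations"]["country_groups"]["buckets"]:
        hits = bucket["max_price_and_id"]["hits"]["hits"]
        if hits:  # size is 1, so at most one hit per bucket
            result[bucket["key"]] = hits[0]["_source"]
    return result


def _bucket(country, channel_id, video_count):
    # Hypothetical helper that mimics the shape of one terms bucket.
    source = {"channel_id": channel_id, "video_count": video_count}
    return {
        "key": country,
        "doc_count": 1,
        "max_price_and_id": {"hits": {"hits": [{"_source": source}]}},
    }


response = {
    "aggregations": {
        "country_groups": {
            "buckets": [_bucket("US", "c1", 120), _bucket("FR", "c7", 45)]
        }
    }
}
top = top_channel_per_country(response)
```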

Elasticsearch derivative of a deep metric

I have a web crawler that collects data and stores snapshots several times a day. My query has some aggregations that group the snapshots together per day and return the last snapshot of each day using top_hits.
The documents look like this:
"_source": {
"taken_at": "2016-02-01T11:27:09.184-03:00",
... ,
"my_metric": 113
}
I'd like to be able to calculate the derivative of a certain metric, say my_metric, of the documents returned by top_hits (i.e., the derivative of the last snapshots of each day's my_metric).
Here's what I have so far:
{
"aggs": {
"filtered_snapshots": {
"filter": {
// ...
},
"aggs" : {
"grouped_data": {
"date_histogram": {
"field": "taken_at",
"interval": "day",
"format": "YYYY-MM-dd",
"order": { "_key" : "asc" }
},
"aggs": {
"resource_by_date": {
"terms": { "field": "remote_id" },
"aggs": {
"latest_snapshots": {
"top_hits": {
"sort": { "taken_at": { "order": "asc" }},
"size" : 1
}
}
}
},
"my_metric_deriv": {
"derivative": {
"buckets_path": "resource_by_date>latest_snapshots>my_metric"
}
}
}
}
}
}
}
}
I get a "No aggregation [my_metric] found for path ..." error with the query above.
Am I using the wrong buckets_path? I've read through the buckets_path and derivative documentation and haven't found much that could help.
The documentation mentions briefly "deep metrics", stating that they can be limited in some ways, which I couldn't quite understand. I'm not sure how or if the limitations affect my case.
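One possible workaround, sketched here under assumptions rather than offered as a confirmed fix: `buckets_path` is meant to point at a numeric metric, and as far as I know `top_hits` does not expose document fields that way, so the difference between consecutive days can instead be computed client-side from the response of the query above (without the `my_metric_deriv` part). The function name and the sample response are hypothetical.

```python
def last_snapshot_derivatives(response, metric="my_metric"):
    """Return (day, remote_id, delta) rows, where delta is the change in
    `metric` since that resource's previous daily snapshot."""
    previous = {}
    rows = []
    days = response["aggregations"]["filtered_snapshots"]["grouped_data"]["buckets"]
    for day in days:  # the query orders the histogram by _key ascending
        for resource in day["resource_by_date"]["buckets"]:
            hits = resource["latest_snapshots"]["hits"]["hits"]
            if not hits:
                continue
            value = hits[0]["_source"][metric]
            rid = resource["key"]
            if rid in previous:
                rows.append((day["key_as_string"], rid, value - previous[rid]))
            previous[rid] = value
    return rows


def _day(date, remote_id, value):
    # Hypothetical helper that mimics the shape of one date_histogram bucket.
    hit = {"_source": {"my_metric": value}}
    resource = {"key": remote_id, "latest_snapshots": {"hits": {"hits": [hit]}}}
    return {"key_as_string": date, "resource_by_date": {"buckets": [resource]}}


response = {
    "aggregations": {
        "filtered_snapshots": {
            "grouped_data": {
                "buckets": [
                    _day("2016-02-01", "r1", 113),
                    _day("2016-02-02", "r1", 120),
                    _day("2016-02-03", "r1", 118),
                ]
            }
        }
    }
}
rows = last_snapshot_derivatives(response)
```

The first day of each resource yields no row because there is no earlier value to subtract, which mirrors how a derivative has no value for the first bucket.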

How to calculate average X per Y in Elasticsearch?

Let's say I have a list of events, like 'pageview'. I want to calculate average pageviews per session.
My document looks like this
{
sessionID: 'xxx',
action: 'pageview'
}
What I tried was to aggregate by sessionID first and then apply an avg sub-aggregation, but the result is not what I expected.
I'm very new to Elasticsearch. What would be the logic to generate such an aggregation in ES?
Thanks
You've started correctly by aggregating on the sessionID field. Then you need a filter sub-aggregation on the action field to match only pageview actions. Your aggregation query would look like this:
{
"size": 0,
"aggs": {
"sessions": {
"terms": {
"field": "sessionID"
},
"aggs": {
"pageviews": {
"filter": {
"term": {
"action": "pageview"
}
}
}
}
}
}
}
This is going to give you the total doc_count for each of your sessions and in each session bucket you'll get the total doc_count for pageview actions within that session.
The average can then easily be calculated with
// Each terms aggregation returns its results under a `buckets` array.
response.aggregations.sessions.buckets.forEach(function(session) {
  var actionsInSession = session.doc_count;
  var pageviewActions = session.pageviews.doc_count;
  var avg = pageviewActions / actionsInSession;
  // do something with the average value
});
UPDATE
If you're using (or willing to use) ES 2.0, you can get ES to calculate those averages for you using pipeline aggregations.
{
"size": 0,
"aggs": {
"sessions": {
"terms": {
"field": "sessionID"
},
"aggs": {
"total": {
"value_count": {
"field": "sessionID"
}
},
"pageviews": {
"filter": {
"term": {
"action": "pageview"
}
},
"aggs": {
"cnt": {
"value_count": {
"field": "action"
}
}
}
},
"avg": {
"bucket_script": {
"buckets_path": {
"total": "total",
"pageviews": "pageviews>cnt"
},
"script": "pageviews / total"
}
}
}
}
}
}
In each sessionID bucket, you'll get an avg value: the ratio of pageview actions to the total number of actions in that session.
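If what you need is a single figure for average pageviews per session, one way to derive it from the same response is sketched below in Python. It assumes the terms aggregation returned every session (by default only the top buckets are returned, so `size` may need raising); the function name and sample response are hypothetical.

```python
def average_pageviews_per_session(response):
    """Total pageview count divided by the number of session buckets."""
    buckets = response["aggregations"]["sessions"]["buckets"]
    if not buckets:
        return 0.0
    total_pageviews = sum(b["pageviews"]["doc_count"] for b in buckets)
    return total_pageviews / len(buckets)


# Hypothetical response with three sessions.
response = {
    "aggregations": {
        "sessions": {
            "buckets": [
                {"key": "s1", "doc_count": 5, "pageviews": {"doc_count": 2}},
                {"key": "s2", "doc_count": 4, "pageviews": {"doc_count": 4}},
                {"key": "s3", "doc_count": 9, "pageviews": {"doc_count": 3}},
            ]
        }
    }
}
average = average_pageviews_per_session(response)
```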

Elastic Search Aggregation and Details

I am trying to get the teacher's name in this query as well.
With it I can loop over the teachers and get the number of classes each one works for, as well as the amount of money each earns per year.
But I can't get the full details from this query: I want to display the teacher's name too.
Here is my current query:
{
"aggs": {
"teacher": {
"terms": {
"field": "teacher_id",
"size": 10
},
"aggs": {
"academic_year": {
"date_histogram": {
"field": "acc_year",
"interval": "year"
},
"aggs": {
"income": {
"stats": {
"field": "teacher_hourly_fee"
}
}
}
}
}
}
},
"size": 0
}
The most straightforward approach may be to combine the teacher ID and name into a generated term using a script:
{
"aggs" : {
"teacher" : {
"terms" : {
"script" : "_source.teacher_id + '-' + _source.teacher_name",
"size": 10
}
}
}
}
Adjust script particulars per your actual schema.
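On the client side the combined key then has to be split back apart. A minimal Python sketch, assuming the `'-'` separator from the script above and that teacher_id itself contains no hyphen; the function name and sample buckets are hypothetical.

```python
def split_teacher_keys(buckets, separator="-"):
    """Split each combined 'id-name' bucket key at the first separator."""
    teachers = []
    for bucket in buckets:
        teacher_id, _, name = str(bucket["key"]).partition(separator)
        teachers.append(
            {
                "teacher_id": teacher_id,
                "teacher_name": name,
                "doc_count": bucket["doc_count"],
            }
        )
    return teachers


# Hypothetical buckets as returned under aggregations.teacher.buckets.
buckets = [
    {"key": "17-Ann Lee", "doc_count": 4},
    {"key": "23-Jean-Luc Moreau", "doc_count": 2},
]
teachers = split_teacher_keys(buckets)
```

Splitting only at the first separator keeps hyphenated names intact, which is why `partition` is used rather than a plain `split`.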

Calculating sum of nested fields with date_histogram aggregation in Elasticsearch

I'm having trouble getting the sum of a nested field in Elasticsearch using a date_histogram, and I'm hoping somebody can lend me a hand.
I have a mapping that looks like this:
"client" : {
// various irrelevant stuff here...
"associated_transactions" : {
"type" : "nested",
"include_in_parent" : true,
"properties" : {
"amount" : {
"type" : "double"
},
"effective_at" : {
"type" : "date",
"format" : "dateOptionalTime"
}
}
}
}
I'm trying to get a date_histogram that shows total revenue by month across all clients, i.e. a time series showing the sum of associated_transactions.amount in a histogram determined by associated_transactions.effective_at. I tried running this query:
{
"query": {
// ...
},
"aggregations": {
"revenue": {
"date_histogram": {
"interval": "month",
"min_doc_count": 0,
"field": "associated_transactions.effective_at"
},
"aggs": {
"monthly_revenue": {
"sum": {
"field": "associated_transactions.amount"
}
}
}
}
}
}
But the sum it's giving me isn't right. It seems that what ES is doing is finding all clients who have any transaction in a given month, then summing all of the transactions (from any time) for those clients. That is, it's a sum of the amount spent in the lifetime of a client who made a purchase in a given month, not the sum of purchases in a given month.
Is there any way to get the data I'm looking for, or is this a limitation in how ES handles nested fields?
Thanks very much in advance for your help!
David
Try this?
{
"query": {
// ...
},
"aggregations": {
"revenue": {
"date_histogram": {
"interval": "month",
"min_doc_count": 0,
"field": "associated_transactions.effective_at",
"aggs": {
"monthly_revenue": {
"sum": {
"field": "associated_transactions.amount"
}
}
}
}
}
}
}
i.e. move the "aggs" key into the "date_histogram" field.
I stumbled upon this question while trying to solve a similar problem in my own ES setup.
It seems that Elasticsearch currently looks at the position of an aggregation in the JSON request body tree, not at the inheritance of its objects and fields. So you should not put your sum aggregation inside "date_histogram"; place it outside, at the same level of the JSON tree.
This worked for me:
{
"size": 0,
"aggs": {
"histogram_aggregation": {
"date_histogram": {
"field": "date_field",
"calendar_interval": "day"
},
"aggs": {
"views": {
"sum": {
"field": "the_field_i_want_to_sum"
}
}
}
}
},
"query": {
// some query
}
}
The OP made the mistake of placing the sum aggregation inside the date_histogram aggregation.
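A different angle that may be worth trying for the original question: because associated_transactions is mapped as `nested`, wrapping the histogram in a `nested` aggregation should make both the date bucketing and the sum operate on individual transactions rather than on whole client documents. The Python sketch below only builds the request body; it has not been run against the asker's index, the aggregation name `transactions` and the function name are hypothetical, and `interval` is kept as in the question (newer Elasticsearch versions use `calendar_interval`).

```python
import json


def monthly_revenue_body(path="associated_transactions"):
    """Build a request body that sums amounts per month inside a nested scope."""
    return {
        "size": 0,
        "aggs": {
            "transactions": {
                # Switch the aggregation context to the nested documents.
                "nested": {"path": path},
                "aggs": {
                    "revenue": {
                        "date_histogram": {
                            "field": f"{path}.effective_at",
                            "interval": "month",  # as in the question
                            "min_doc_count": 0,
                        },
                        "aggs": {
                            "monthly_revenue": {
                                "sum": {"field": f"{path}.amount"}
                            }
                        },
                    }
                },
            }
        },
    }


body = monthly_revenue_body()
print(json.dumps(body, indent=2))
```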
