Dynamic time zone offset in Elasticsearch aggregation?

I'm aggregating documents that each have a timestamp. The timestamp is UTC, but the documents each also have a local time zone ("timezone": "America/Los_Angeles") that can be different across documents.
I'm trying to do a date_histogram aggregation based on local time, not UTC or a fixed time zone (e.g., using the option "time_zone": "America/Los_Angeles").
How can I convert each document's timestamp to its own local time zone before the aggregation?
Here's the simple aggregation:
{
  "aggs": {
    "date": {
      "date_histogram": {
        "field": "created_timestamp",
        "interval": "day"
      }
    }
  }
}

I'm not sure I fully understand it, but it seems like the time_zone property is meant for that:
The zone value accepts either a numeric value for the hours offset, for example: "time_zone" : -2. It also accepts a format of hours and minutes, like "time_zone" : "-02:30". Another option is to provide a time zone accepted as one of the values listed here.
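For a single, fixed zone, that option would look like the sketch below (using the field name from the question). Note that it applies one zone to every document, so by itself it doesn't solve the per-document case:
{
  "aggs": {
    "date": {
      "date_histogram": {
        "field": "created_timestamp",
        "interval": "day",
        "time_zone": "America/Los_Angeles"
      }
    }
  }
}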

If you store another field that's the local time without timezone information, it should work.
Take every timestamp you have (which is in UTC), convert it to a date in the local timezone (this will contain the timezone information). Now simply drop the timezone information from this datetime. Now you can perform actions on this new field.
Suppose you start with this time in UTC:
'2016-07-17T01:33:52.412Z'
Now, suppose you're in PDT; you can convert it to:
'2016-07-16T18:33:52.412-07:00'
Now, replace the -07:00 offset with Z (i.e., store the local wall-clock time as if it were UTC), so you end up with:
'2016-07-16T18:33:52.412Z'
Now you can operate on this field.
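One way to populate such a field is an ingest pipeline with a script processor that reads each document's own timezone. This is only a sketch of the idea: the pipeline name, the field names (created_timestamp, timezone, created_timestamp_local), and the assumption that the timestamp is an ISO-8601 string are mine, not from the answer above:
PUT _ingest/pipeline/local_time
{
  "description": "Copy the UTC timestamp into a zone-less local wall-clock field (sketch)",
  "processors": [
    {
      "script": {
        "source": """
          // Parse the UTC timestamp, shift it into the document's own zone,
          // then keep only the zone-less local date-time.
          ZonedDateTime utc = ZonedDateTime.parse(ctx.created_timestamp);
          ZonedDateTime local = utc.withZoneSameInstant(ZoneId.of(ctx.timezone));
          ctx.created_timestamp_local = local.toLocalDateTime().toString();
        """
      }
    }
  ]
}
Documents indexed through this pipeline end up with a value such as 2016-07-16T18:33:52.412 in created_timestamp_local, which the date_histogram from the question can then bucket directly.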

Related

Kibana nano seconds showing zeros

Kibana is not showing nanoseconds; it is showing zeros.
The timestamp is actually available in nanoseconds.
How can I sort the data in Kibana with nanosecond precision?
date data type stores dates in millisecond resolution. The date_nanos data type stores dates in nanosecond resolution, which limits its range of dates from roughly 1970 to 2262, as dates are still stored as a long representing nanoseconds since the epoch.
Queries on nanoseconds are internally converted to range queries on this long representation, and the result of aggregations and stored fields is converted back to a string depending on the date format that is associated with the field.
Date formats can be customized, but if no format is specified then it uses the default. As an example, you can map the date field like this:
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "date": {
        "type": "date_nanos"
      }
    }
  }
}
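As a sketch of how the nanosecond precision then behaves (using a hypothetical index my-index-000002 and the built-in strict_date_optional_time_nanos format), you can index a nanosecond-precision value and sort on it:
PUT my-index-000002
{
  "mappings": {
    "properties": {
      "date": {
        "type": "date_nanos",
        "format": "strict_date_optional_time_nanos"
      }
    }
  }
}

PUT my-index-000002/_doc/1?refresh
{
  "date": "2019-10-12T07:20:50.123456789Z"
}

GET my-index-000002/_search
{
  "sort": [
    { "date": "asc" }
  ]
}
For a date_nanos field the sort values come back as longs counting nanoseconds since the epoch, which is what gives Kibana (or any client) enough precision to order documents correctly.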

Will the Elasticsearch date_histogram check whether a date inside the interval exists or not? And if so, what happens? Is there any error handling for this?

So far I am working with the ES date histogram to get monthly results, and my query looks like this:
{
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "1M",
        "offset": Cutoff
      }
    }
  }
}
and the return looks like this:
date
1 10978.521 2020-11-20 5995.69
2 11177.911 2020-12-20 199.39
3 11177.911 2021-01-20 0.00
So my question is:
What if the date "20" does not exist? Is there any error handling from ES?
thanks
Jeff
Since it's a monthly date histogram, each bucket must have a date key. That date key is the date of the beginning of the monthly bucket. For instance, 2020-11-20 is the key and the starting date of the bucket starting on that date. In that bucket, you will find all documents whose date is between 2020-11-20 and 2020-12-20.
Same thing for the last bucket, which starts on 2021-01-20: it will contain all documents starting on that date and going through 2021-02-20. It doesn't matter whether you have documents whose date field falls exactly on those bucket key dates; the keys are just interval bounds.
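To make that concrete, here is an illustrative (made-up) slice of a monthly date_histogram response. A bucket with no matching documents is simply returned with a doc_count of 0 (the default min_doc_count for date_histogram) rather than producing an error:
"buckets": [
  {
    "key_as_string": "2020-11-20T00:00:00.000Z",
    "key": 1605830400000,
    "doc_count": 3
  },
  {
    "key_as_string": "2020-12-20T00:00:00.000Z",
    "key": 1608422400000,
    "doc_count": 0
  }
]
The key is the bucket's start as epoch milliseconds; key_as_string is the same instant rendered with the field's date format.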

How does Elasticsearch calculate the current date when using the 'now' keyword?

For example, consider the following Elasticsearch query:
GET /my_docs/_search
{
  "query": {
    "range": {
      "doc_creation_date": {
        "gte": "2007-07-18T10:15:13",
        "lt": "now"
      }
    }
  }
}
So my question is:
When Elasticsearch replaces the word 'now' in the above query with an actual date, does it just use the date of the server it's currently running on, or is something else going on there?
The reason I am asking is that I live in a place where the timezone changes depending on the time of year, so between around March and October we are at UTC
Thanks
now is resolved to the Unix timestamp of the server in milliseconds.
The Unix timestamp is an epoch date defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970 [https://en.wikipedia.org/wiki/Unix_time]
This means that all queries will be run against the UTC time zone unless otherwise specified.
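If the worry is local-time boundaries, note that the range query accepts a time_zone parameter. It does not change what now resolves to (that is always the current UTC instant), but it does affect date-math rounding such as now/d. A sketch reusing the field from the question, with an arbitrary offset:
GET /my_docs/_search
{
  "query": {
    "range": {
      "doc_creation_date": {
        "gte": "now/d",
        "lt": "now",
        "time_zone": "+03:00"
      }
    }
  }
}
With "time_zone": "+03:00", now/d rounds down to the start of the day in that zone instead of the start of the UTC day.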

Kibana and fixed time spans

Is it possible to set a fixed timespan for a saved visualization or a saved search in Kibana 4?
Scenario:
I want to create one dashboard with 2 visualizations with different time spans.
A metric counting unique users within 10 min (last 10 minutes)
A metric counting today's unique users (from 00:00 until now)
Note that changing the time span on the dashboard does not affect the visualizations. Possible?
You could add a date range query to the saved search you base each visualisation on. Eg, if your timestamp field is called timestamp:
timestamp:[now-6M/M TO now]
where the time range runs from six months ago, rounded down to the start of the month, up to now.
Because Kibana also now supports JSON-based query DSL, you could also achieve the same thing by entering this into the search box instead:
{
  "range": {
    "timestamp": {
      "gte": "now-6M/M",
      "lte": "now"
    }
  }
}
For more on date range queries see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html#ranges-on-dates
However, changing the dashboard timescale will override this if it's a subset: if you use the above 6-month range in the saved search but a 3-month range on the dashboard, you'll filter down to 3 months of data.

Histogram on the basis of facet counts

I am currently working on a project in which I am storing user activity logs in Elasticsearch. The user field in the log looks like {"user": "abc#yahoo.com"}. I have a timestamp field for each activity that records when the activity happened. Can I generate a date histogram based on the number of users in a particular time period, i.e. where each histogram entry shows the number of users at that time? I can implement this by obtaining facet counts, but I need to get counts for various intervals and ranges with a minimum number of queries. Please guide me in this regard. Thanks.
Add a facet to your query something like the following:
{"facets": {
"daily_volume": {
"date_histogram": {
"size": 100,
"field": "created_at",
"interval": "day"
"order": "time"
}
}
}
This returns a nice set of ordered data for the number of items per day.
I then feed this to a Google Chart (the ColumnChart works nicely for histograms), converting the returned timestamp integer into a Date type understood correctly by the JavaScript charts API.
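Facets were removed in later Elasticsearch versions; on anything recent, a roughly equivalent request is a date_histogram aggregation with a cardinality sub-aggregation, which reports the unique-user count per day directly. This is a sketch with the assumed field names user and created_at:
{
  "size": 0,
  "aggs": {
    "daily_volume": {
      "date_histogram": {
        "field": "created_at",
        "calendar_interval": "day"
      },
      "aggs": {
        "unique_users": {
          "cardinality": {
            "field": "user"
          }
        }
      }
    }
  }
}
Keep in mind that cardinality is an approximate count (HyperLogLog++), which is usually fine for charting.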
