Need an aggregation on a date field in Elasticsearch that gets the count for one day of each of the last six months (i.e. the count for the 1st of each of the last six months) - elasticsearch

I have an index in Elasticsearch that contains a date field "createdDate". I need the count of documents dated the 1st of each month over the last six months (e.g. for September: the counts for 1st August, 1st July, 1st June, 1st May, 1st April and 1st March).
It would be a great help if someone could look into this.
Thanks..

Try date histogram aggregation.
{
  "aggs" : {
    "monthly_cont" : {
      "date_histogram" : {
        "field" : "createdDate",
        "interval" : "month"
      }
    }
  }
}
See the Elasticsearch date histogram aggregation documentation for details.
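Note that a monthly date histogram counts every document in each month, not only those dated the 1st. A minimal sketch of one possible alternative, assuming a `filters` aggregation with one one-day range per first-of-month date, is shown below. The helper names `first_days_of_last_months` and `build_first_of_month_aggs` and the aggregation name `first_of_month` are hypothetical; only the field name `createdDate` comes from the question.

```python
from datetime import date, timedelta


def first_days_of_last_months(today: date, months: int = 6) -> list:
    """Return the 1st of each of the `months` months before today's month, newest first."""
    year, month = today.year, today.month
    firsts = []
    for _ in range(months):
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
        firsts.append(date(year, month, 1))
    return firsts


def build_first_of_month_aggs(today: date, field: str = "createdDate") -> dict:
    """Build a request body with one named filter bucket per first-of-month day."""
    filters = {}
    for day in first_days_of_last_months(today):
        next_day = day + timedelta(days=1)
        # Each bucket matches documents from 00:00 on the 1st up to (not including) the 2nd.
        filters[day.isoformat()] = {
            "range": {field: {"gte": day.isoformat(), "lt": next_day.isoformat()}}
        }
    # size 0: only the aggregation counts are needed, not the hits themselves.
    return {"size": 0, "aggs": {"first_of_month": {"filters": {"filters": filters}}}}
```

Calling `build_first_of_month_aggs(date(2023, 9, 15))` produces buckets keyed 2023-08-01 down to 2023-03-01, which matches the September example in the question. The dates are interpreted in UTC unless the range query is given a time zone.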

Related

Does the Elasticsearch date_histogram check whether the date inside the interval exists? If it does not exist, what happens, and is there any error handling for this?

I am using the ES date histogram to get monthly results, and my query looks like this:
{
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "1M",
        "offset": Cutoff
      }
    }
  }
}
and the return is like
date
1 10978.521 2020-11-20 5995.69
2 11177.911 2020-12-20 199.39
3 11177.911 2021-01-20 0.00
So my question is:
what if the date "20" does not exist? Is there any error handling in ES for that?
thanks
Jeff
Since it's a monthly date histogram, each bucket must have a date key. That date key is the date at which the monthly bucket begins. For instance, 2020-11-20 is the key of the bucket starting on that date. In that bucket, you will find all documents whose date is between 2020-11-20 (inclusive) and 2020-12-20 (exclusive).
The same goes for the last bucket, which starts on 2021-01-20: it will contain all documents from that date up to 2021-02-20. It doesn't matter whether you have documents whose date field falls exactly on those bucket key dates; the keys are just interval bounds.
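The bucket bounds described above can be illustrated with a small Python sketch. It only mirrors the explanation in this answer and is not how Elasticsearch computes buckets internally; the function name `monthly_bucket_key` is hypothetical, and the start day of 20 is an assumption taken from the keys in the question's output.

```python
from datetime import date


def monthly_bucket_key(doc_date: date, start_day: int = 20) -> date:
    """Return the key (start date) of the monthly bucket that contains doc_date.

    Buckets run from start_day of one month (inclusive) to start_day of the
    next month (exclusive). This sketch assumes start_day <= 28.
    """
    if doc_date.day >= start_day:
        # On or after the start day: the bucket began this month.
        return doc_date.replace(day=start_day)
    # Before the start day: the bucket began in the previous month.
    if doc_date.month == 1:
        return date(doc_date.year - 1, 12, start_day)
    return date(doc_date.year, doc_date.month - 1, start_day)
```

A document dated 2020-12-05 lands in the bucket keyed 2020-11-20 even if no document is dated exactly 2020-11-20.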

What does now/d mean in Elasticsearch?

What exactly do now-1d/d and now/d mean in Elasticsearch? Below is an example query:
GET /_search
{
  "query": {
    "range" : {
      "timestamp" : {
        "gte" : "now-1d/d",
        "lt" : "now/d"
      }
    }
  }
}
It takes the current timestamp (the time when your query reaches Elasticsearch), subtracts one day, and rounds the result down to the start of that day (the /d part). The upper bound now/d is the start of the current day, so the query returns the documents from the whole of yesterday.
These types of queries are useful when you don't want to specify the exact time and want to get the data of the last 1 day, 3 days, 7 days, 1 month, etc.
As mentioned in the official documentation of the range query:
now is always the current system time in UTC.
An example taken from the official date math documentation:
Assuming now is 2001-01-01 12:00:00, some examples are:
now+1h: now in milliseconds plus one hour. Resolves to: 2001-01-01 13:00:00
now-1h: now in milliseconds minus one hour. Resolves to: 2001-01-01 11:00:00
now-1h/d: now in milliseconds minus one hour, rounded down to UTC 00:00. Resolves to: 2001-01-01 00:00:00
2001.02.01||+1M/d: 2001-02-01 in milliseconds plus one month. Resolves to: 2001-03-01 00:00:00
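As a rough illustration of how the two bounds in the example query resolve, here is a minimal Python sketch. It is not Elasticsearch's date math parser; the helper names `floor_to_day` and `yesterday_range` are hypothetical, and it only covers the `gte` and `lt` bounds used in the query, both of which round down.

```python
from datetime import datetime, timedelta


def floor_to_day(ts: datetime) -> datetime:
    """Round a timestamp down to 00:00:00 of the same day (the "/d" step)."""
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)


def yesterday_range(now: datetime) -> tuple:
    """Resolve gte "now-1d/d" and lt "now/d" for a given value of now."""
    gte = floor_to_day(now - timedelta(days=1))  # now-1d/d: start of yesterday
    lt = floor_to_day(now)                       # now/d: start of today
    return gte, lt
```

With now taken as 2001-01-01 12:00:00, as in the documentation example, the range is 2000-12-31 00:00:00 (inclusive) to 2001-01-01 00:00:00 (exclusive).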

filter weekends from date histogram with Elasticsearch

I have a specific use case that I'm struggling with. To give you more context, I have an index in Elasticsearch that has data related to working days only, so for a specific month I only have working days without weekends or holidays.
The problem is that data is only produced on days when there is activity, so if, for example, no activity took place on monday 2018-12-09, I will not have a record for that day.
I want a date histogram that ignores only the weekends, so that the final set of data includes working days without activity and I can count them at the end.
So, for example, a query for this month (December) looks like this:
{
  "aggs" : {
    "sales_over_time" : {
      "date_histogram" : {
        "field" : "date",
        "interval" : "month"
      }
    }
  }
}
I expect to have buckets, ignoring all the weekends.
If you guys have any ideas of how I should handle this issue, please give me your opinion.
Thank you
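One possible approach, offered only as a sketch and not taken from this thread: request daily buckets with `min_doc_count` set to 0 and `extended_bounds` covering the month, so that days without documents still appear as empty buckets, then drop the weekend buckets on the client. The function name `weekday_buckets`, the variable `DAILY_QUERY` and the December 2018 bounds are assumptions for illustration.

```python
from datetime import datetime, timezone

# Daily histogram that also returns empty days inside the given bounds.
# Newer Elasticsearch versions use "calendar_interval" instead of "interval".
DAILY_QUERY = {
    "size": 0,
    "aggs": {
        "sales_over_time": {
            "date_histogram": {
                "field": "date",
                "interval": "day",
                "min_doc_count": 0,
                "extended_bounds": {"min": "2018-12-01", "max": "2018-12-31"},
            }
        }
    },
}


def weekday_buckets(buckets: list) -> list:
    """Keep only buckets whose key falls on Monday to Friday (UTC)."""
    kept = []
    for bucket in buckets:
        # date_histogram bucket keys are epoch milliseconds.
        day = datetime.fromtimestamp(bucket["key"] / 1000, tz=timezone.utc)
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            kept.append(bucket)
    return kept
```

The remaining buckets with a `doc_count` of 0 are then the working days without activity. Holidays are not handled by this sketch.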

ElasticSearch 2.4 date range histogram using the difference between two date fields

I haven't been able to find anything regarding this for ES 2.* in regards to the problem here or in the docs, so sorry if this is a duplicate.
What I am trying to do is create an aggregation in an ElasticSearch query that will allow me to create buckets based on the difference in a record between 2 date fields.
I.e. If I had data in ES for a shop, I might like to see the time difference between a purchase_date field and shipped_date field.
So in that instance I'd want to create an aggregate that had buckets to give me the hits for when shipped_date - purchase_date is < 1 day, 1-2 days, 3-4 days or 5+ days.
Ideally I was hoping this was possible in an ES query. Is that the case or would the best approach be to process the results into my own array based on the time difference for each hit?
I was able to achieve this by using the built-in expression language, which is enabled by default in ES 2.4. The functionality I wanted was to group my results by the difference between EndDate and DateProcessed in increments of 15 days. The relevant part of the query is:
{
...,
"aggs": {
"reason": {
"date_histogram": {
"min_doc_count": 1,
"interval": "1296000000ms", // 15 days
"format": "epoch_millis",
"script": {
"lang": "expression",
"inline": "doc['DateProcessed'] > doc['EndDate'] ? doc['DateProcessed'] - doc['EndDate'] : -1"
}
}
...
}
}
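The arithmetic behind this query can be sketched in Python. This only illustrates the bucketing rule (the script value floored to a multiple of the interval) and is not the Elasticsearch implementation; the function name `diff_bucket_key` and the constant `INTERVAL_MS` are hypothetical.

```python
INTERVAL_MS = 15 * 24 * 60 * 60 * 1000  # 15 days = 1296000000 ms


def diff_bucket_key(date_processed_ms: int, end_date_ms: int) -> int:
    """Return the bucket key (in ms) for one document's date difference."""
    # Same rule as the expression script: only a positive gap is used,
    # anything else is mapped to -1.
    if date_processed_ms > end_date_ms:
        value = date_processed_ms - end_date_ms
    else:
        value = -1
    # Floor the value to a multiple of the interval. In this sketch the -1
    # sentinel therefore lands in the bucket keyed -INTERVAL_MS.
    return (value // INTERVAL_MS) * INTERVAL_MS
```

A difference of 3 days falls in the bucket keyed 0 (0 to 15 days), and a difference of 20 days falls in the bucket keyed 1296000000 (15 to 30 days).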

Kibana and fixed time spans

Is it possible to set a fixed timespan for a saved visualization or a saved search in Kibana 4?
Scenario:
I want to create one dashboard with 2 visualizations with different time spans.
A metric counting unique users within the last 10 minutes
A metric counting today's unique users (from 00:00 until now)
Note that changing the time span on the dashboard does not affect the visualizations. Possible?
You could add a date range query to the saved search you base each visualisation on. E.g., if your timestamp field is called timestamp:
timestamp:[now-6M/M TO now]
where the time range runs from six months ago, rounded down to the start of that month, to now.
Because Kibana also now supports JSON-based query DSL, you could also achieve the same thing by entering this into the search box instead:
{
  "range" : {
    "timestamp" : {
      "gte": "now-6M/M",
      "lte": "now"
    }
  }
}
For more on date range queries see https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html#ranges-on-dates
However, changing the dashboard timescale will override this if the dashboard range is a subset of the saved search range. So if you use the above 6-month range in the saved search, but a 3-month range in the dashboard, you'll filter to 3 months of data.
