Elasticsearch: how to compute the average recurring time of an event?

I'm facing the following issue. Say you have an Elasticsearch index modeling a supermarket, with one document for each time a customer buys something:
customer1, [list of bought items], timestamp
customer2, [list of bought items], timestamp
customer3, [list of bought items], timestamp
customer2, [list of bought items], timestamp
A customer may come to the shop once in his life, every week, or every day, so I record a timestamp indicating when he bought something.
I would like to compute the "average recurring time" between a customer's purchases. E.g. if a customer buys on 1 Sep, 7 Sep and 30 Sep, his average recurring time is:
first interval: 6 days
second interval: 23 days
average recurring time: (6+23)/2 = 14.5 days.
Do you know an aggregation that can help me compute this statistic? The problem is that I only have the timestamp of each purchase, not the differences between consecutive purchases.
Thanks all.

Unfortunately I don't think that this is possible.
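For reference, the statistic itself is easy to compute once a customer's timestamps are available on the client side. The following is a minimal Python sketch, not an Elasticsearch aggregation; the function name and the sample dates (with an arbitrarily chosen year) are assumptions for illustration. Note that date subtraction gives a 6-day gap between 1 Sep and 7 Sep.

```python
from datetime import datetime

def average_recurring_days(timestamps):
    """Mean gap, in days, between consecutive purchases of one customer."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return None  # a single visit has no interval to average
    # Gap between each purchase and the next one, in days.
    gaps = [(later - earlier).total_seconds() / 86400
            for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical sample data: the dates from the question, year chosen arbitrarily.
visits = [datetime(2015, 9, 1), datetime(2015, 9, 7), datetime(2015, 9, 30)]
print(average_recurring_days(visits))  # gaps of 6 and 23 days
```

Because consecutive gaps telescope, the same value equals (last - first) / (n - 1), so only the earliest timestamp, the latest timestamp, and the number of purchases per customer are needed.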

Related

Fix time zone in countries that add and subtract one hour in summer or winter

I have this problem: a system has to deliver a message at a certain date in the future, like a deadline. The system is in a country whose time zone is -4 hours, and the country that receives the message is in time zone -3 hours; the countries are Chile and Brazil. The problem is that both countries add one hour when they enter summer time, but on different dates, so for a period of about 4 months they have the same offset. To deliver the message at the right time, I have to add one hour to the date while the offsets are -4 and -3, but when both are at -3 hours I don't have to do anything.
For this task I want to use the function time.LoadLocation("America/Sao_Paulo"), but the documentation doesn't mention whether it adjusts the offset when these countries change their time.
Does anyone know if it adjusts the time zone, or know another function that can work in this situation?
Areas that observe daylight savings time don’t change the time zone, they change the time standard. For instance, Los Angeles observes a given time eight hours before UTC for much of the year, but seven hours before UTC during the summer; yet Los Angeles is always “Pacific time”.
If you set the time zone to "America/Sao_Paulo", times will be measured relative to Sao Paulo’s then-active time standard.
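To make the point concrete, here is a minimal sketch in Python rather than Go, using the standard-library `zoneinfo` module, which accepts the same IANA zone names as `time.LoadLocation`. The helper name `utc_offset_hours` is hypothetical, and the sketch assumes time zone data is installed on the system.

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # standard library since Python 3.9; needs tz data

def utc_offset_hours(zone_name, year, month, day):
    """UTC offset, in hours, that the named zone observes at noon on the given date."""
    local_noon = datetime(year, month, day, 12, tzinfo=ZoneInfo(zone_name))
    return local_noon.utcoffset().total_seconds() / 3600
```

Calling `utc_offset_hours("America/Los_Angeles", 2018, 1, 15)` and the same function for 15 July 2018 returns offsets one hour apart, although the zone name never changes. Comparing the offsets of two zones on the delivery date shows whether an hour has to be added.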

DAX Query (using FILTER and MAX functions): calculate Total Sales for the last running 30 days

I am new to DAX and came across the measure below:
30 Day Running Total = CALCULATE([Total Sales],
FILTER (ALL (Dates), Dates[Date] > MAX(Dates[Date]) - 30 && Dates[Date] <= MAX(Dates[Date])))
It calculates Total Sales for the last 30 days in a cumulative way, for data from 1st January 2018 to 30th December 2021, but I am not able to understand it.
My understanding is below; please let me know where I am going in the wrong direction.
1. FILTER ( ALL(Dates) -> removes all filters, i.e. takes the dates from minimum to maximum of the complete table, between 1st January 2018 and 30th December 2021.
2. Dates[Date]>MAX(Dates[Date]) -30 -> "takes Total Sales from the current row in the table minus 30 days".
For example, if the DAX calculation is on 30th January 2018, then it considers all the total sales from 1st January 2018 till 30th January 2018.
3. Then why do we need to mention another filter, Dates[Date] <= MAX(Dates[Date] )?
Thanks in advance for your time
Regards
Sumit Malik
Sumit, your main concern seems to be point (3):
why do we need to mention another filter Dates[Date] <= MAX(Dates[Date] )?
Your doubt is reasonable: if the data is clean, you do not need that upper-bound filter, because in theory, when considering sales from the last 30 days, there should be no sales after today.
Unfortunately, data is often dirty and there might be sales dated in the future. Defining an upper bound is therefore a best practice to avoid this kind of dirty-data issue. Remember that in software engineering you program for the worst-case scenario, so defining an upper bound does no harm :)
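To see which dates each condition admits, the filter can be mimicked outside DAX. This is a rough Python sketch, not DAX: the list `calendar` stands in for ALL(Dates), `current_max` stands in for MAX(Dates[Date]) in the current context, and the function name is hypothetical.

```python
from datetime import date, timedelta

def window_dates(all_dates, current_max, days=30, use_upper_bound=True):
    """Dates kept by: d > current_max - days [and d <= current_max]."""
    lower = current_max - timedelta(days=days)
    return [d for d in all_dates
            if d > lower and (d <= current_max or not use_upper_bound)]

# Hypothetical calendar covering the range named in the question.
start = date(2018, 1, 1)
calendar = [start + timedelta(days=i)
            for i in range((date(2021, 12, 30) - start).days + 1)]

both = window_dates(calendar, date(2018, 1, 30))
lower_only = window_dates(calendar, date(2018, 1, 30), use_upper_bound=False)
print(both[0], both[-1], len(both))
print(lower_only[0], lower_only[-1], len(lower_only))
```

In this sketch, with the current date at 30th January 2018, both conditions together keep 1st January to 30th January 2018, while the lower bound alone keeps every calendar date up to 30th December 2021; whether those later dates carry sales depends on the data.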

How do I divide a dollar amount by a time stamp

I want to figure out how many dollars & cents were made or lost per minute. So for instance, if I have $10 and it took me 10 minutes (00:10:00) to acquire it, I want to know how much I made per minute:
Dollars & cents made per minute = $10 / 0:10:00
How do I implement this in google sheets?
Try it like this:
=A2/1440/TIMEVALUE(B2)
Alternatively, convert a duration in cell A1 to minutes with =HOUR(A1)*60+MINUTE(A1)+SECOND(A1)/60 and divide the dollar amount by the result.
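The arithmetic behind both formulas is the same: express the duration in minutes, then divide. A minimal Python sketch of that calculation follows; the function name and the "H:MM:SS" text format are assumptions for illustration.

```python
def dollars_per_minute(amount, duration):
    """Divide a dollar amount by an 'H:MM:SS' duration expressed in minutes."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    total_minutes = hours * 60 + minutes + seconds / 60
    return amount / total_minutes

# The example from the question: $10 earned in 00:10:00.
print(dollars_per_minute(10, "0:10:00"))
```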

Date histogram every half a month

How to write an interval that groups by every half a month? Rather than 1M I want something like 1/2M to group by from the first to the 16th and from the 16th to the end of the month, every month. Is there a way to do so?
I don't want to end up doing an interval on each day and then calculate manually my results as it's not clean and it would be resource hungry, is there a simple way to do so using setInterval? (in Elasticsearch or Elastica I don't care, I just want the algorithm behind it, thanks!)
$date_grp_agg = new \Elastica\Aggregation\DateHistogram('date');
$date_grp_agg->setField('date')->setFormat("MM-yy")->setInterval('1M'); // This one
Unfortunately, neither 0.5M (half month) nor 2w (2 weeks) are supported, but you could try to use a number of days, i.e. 15d.
$date_grp_agg->setField('date')->setFormat("MM-yy")->setInterval('15d');
Granted, it will not fit months perfectly, i.e. it won't start on the 1st and end on the last day of the month, but it can get you close to the kind of interval you're looking for.
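The mismatch can be illustrated outside Elasticsearch. The Python sketch below contrasts a calendar half-month label with the start of a fixed 15-day bucket. It rests on two assumptions: the halves split at the 16th (days 1-15 and day 16 to month end), and fixed-width buckets are aligned to the Unix epoch. All names are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)

def half_month_key(d):
    """Calendar half-month label: days 1-15 are H1, day 16 to month end is H2."""
    half = 1 if d.day <= 15 else 2
    return f"{d.year:04d}-{d.month:02d}-H{half}"

def fixed_bucket_start(d, interval_days=15):
    """Start of the fixed-width bucket holding d (assumes epoch-aligned buckets)."""
    days_since_epoch = (d - EPOCH).days
    return EPOCH + timedelta(days=days_since_epoch // interval_days * interval_days)

# Hypothetical document dates.
dates = [date(2015, 2, 3), date(2015, 2, 15), date(2015, 2, 16),
         date(2015, 2, 28), date(2015, 3, 1)]
counts = Counter(half_month_key(d) for d in dates)
for d in dates:
    print(d, half_month_key(d), fixed_bucket_start(d))
```

Under these assumptions the fixed buckets start on whatever day the 15-day grid happens to land on, so they drift relative to the 1st and 16th from one month to the next.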

Summing times in Google sheets

I have a sheet where I record my working hours (this is more for me to remind me to stop working than anything else). For every day, I have three possible shifts - early, normal & late, and I have a formula which will sum up any times put into these columns and give me the daily total hours.
To summarise the duration of time spent working in a day, I use the following formula: =(C41-B41)+(E41-D41)+12+(G41-F41) which is:
early end time minus early start time
normal end time minus normal start time PLUS 12 hours
late end time minus late start time
Which gives me output like this:
What I cannot seem to achieve is, the ability to sum the daily totals into something which shows me the total hours worked over 1-week. If I attempt to sum the daily totals together for the example image shown, I get some wild figure such as 1487:25:00 when formatting as 'Duration' or 23:25:00 when formatted as 'Time'!
All my cells where I record the hours worked are formatted as 'Time'
When using arithmetic operations on date values in Google Sheets, it's important to remember that the internal representation of a date is numeric: the number of days since December 30, 1899, with the time of day stored as a fraction of a day.
What follows from that, is that if you want to add 12 hours to a time duration, you should not write "+12" because that will in fact add 12 days. Instead add "+12/24". In other words, try the following formula instead of the one you are using now:
=(C41-B41)+(E41-D41)+(12/24+G41-F41)
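The effect of the two constants can be checked with a small Python sketch that treats a number as a spreadsheet-style serial value, where 1.0 means one day. The helper name and the 09:00 to 18:00 shift are hypothetical.

```python
from datetime import timedelta

def as_duration(day_fraction):
    """Interpret a spreadsheet-style serial number (1.0 == one day) as a duration."""
    return timedelta(days=day_fraction)

# Hypothetical shift from 09:00 to 18:00, stored as fractions of a day.
start, end = 9 / 24, 18 / 24
worked = end - start

print(as_duration(worked + 12))       # "+12" adds twelve days
print(as_duration(worked + 12 / 24))  # "+12/24" adds twelve hours
```

Summing a week of daily totals that each carry twelve extra days is what produces a figure in the hundreds or thousands of hours.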

Resources