Calculate date pattern and predict - algorithm

I have a list of dates that indicate when a job finished running successfully. The list can contain millions of dates:
YYYY-MM-DD HH:MM:SS
2016-01-01 05:00:00
2016-01-02 05:00:00
2016-01-05 13:00:00
2016-01-06 13:00:00
2016-01-09 05:00:00
2016-01-10 05:00:00
Occasionally the job fails, which delays the process by a few hours to a few days:
2016-01-13 14:00:00
2016-01-15 14:00:00
2016-01-19 06:00:00
2016-01-20 06:00:00
The pattern for this list is clearly alternating gaps of 1 and 3 days.
My question is: how can I work out the pattern of an arbitrary list of dates, ignore the delays, and estimate the next date on which the job will finish running?
I need to produce a reasonably accurate estimate of when the job will next finish, based on the gaps that occur most often and ignoring delays where possible.
Any help will be appreciated!
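One possible approach, sketched below in Python, is to reduce the list to whole-day gaps between consecutive runs and then ask which gap most often followed the latest gap. This is only a rough sketch under simple assumptions: the function name predict_next_run is hypothetical, the time of day is ignored, and a delay is only "ignored" in the sense that rare gaps are outvoted by frequent ones.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

def predict_next_run(timestamps):
    """Guess the next finish date from the gap that most often followed the latest gap."""
    runs = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps)
    # Whole-day gaps between consecutive runs; the time of day is ignored here.
    gaps = [(b.date() - a.date()).days for a, b in zip(runs, runs[1:])]
    # For each gap length, count which gap length came right after it.
    followers = defaultdict(Counter)
    for prev, nxt in zip(gaps, gaps[1:]):
        followers[prev][nxt] += 1
    last_gap = gaps[-1]
    if followers[last_gap]:
        next_gap = followers[last_gap].most_common(1)[0][0]
    else:
        # The latest gap was never seen before (likely a delay):
        # fall back to the overall most common gap.
        next_gap = Counter(gaps).most_common(1)[0][0]
    return runs[-1].date() + timedelta(days=next_gap)

# The dates from the question, including the delayed runs.
sample = [
    "2016-01-01 05:00:00", "2016-01-02 05:00:00", "2016-01-05 13:00:00",
    "2016-01-06 13:00:00", "2016-01-09 05:00:00", "2016-01-10 05:00:00",
    "2016-01-13 14:00:00", "2016-01-15 14:00:00", "2016-01-19 06:00:00",
    "2016-01-20 06:00:00",
]
prediction = predict_next_run(sample)
print(prediction)
```

For the sample list the day gaps are 1, 3, 1, 3, 1, 3, 2, 4, 1. The latest gap is 1 and a gap of 1 was always followed by a gap of 3, so the sketch predicts 2016-01-23. For millions of dates the same counting can be done in a single pass.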

Related

Day of week for a given date

I have a date such as 2019-07-03 10:25:43 and I want the day of the week for it.
I tried this pattern, but I got Thursday:
%{YEAR:date_year}-%{MONTHNUM:date_month}-%{MONTHDAY:date_mday} %{HOUR:time_hour}:?%{MINUTE:time_minute}:?%{SECOND:time_second}
The expected output for that date is Wednesday, but it shows Thursday.
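As a quick cross-check of the expected value, the weekday can be computed independently of the pattern above. The short Python sketch below only confirms what the correct answer is; it does not explain where the Thursday comes from, and the variable names are illustrative.

```python
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

parsed = datetime.strptime("2019-07-03 10:25:43", "%Y-%m-%d %H:%M:%S")
# weekday() returns 0 for Monday through 6 for Sunday.
day_name = WEEKDAYS[parsed.weekday()]
print(day_name)  # Wednesday
```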

whereBetween in Laravel not returning rows for the end date of the range?

$tasks->whereBetween('start_datetime', [$request['start_date_report'], $request['end_date_report']]);
This does not return the rows for $request['end_date_report'].
This problem is usually not specific to Laravel or any other framework; it lies in the logic of the query.
The likely cause is:
You have a datetime column.
Your data includes a time part, for example 2017-12-01 08:00:00 and 2017-12-02 09:00:00.
Your query does not include the time part: between '2017-12-01' and '2017-12-02'.
This returns only one row, 2017-12-01 08:00:00, because the database fills in the missing time part as 00:00:00, so the query is equivalent to between '2017-12-01 00:00:00' and '2017-12-02 00:00:00'.
2017-12-02 09:00:00 is later than the upper limit of 2017-12-02 00:00:00, so it is not included in the result.
To include 2017-12-02 09:00:00 in the result, add one day to the end date (without a time part, of course), or make sure you compare only the date part.
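The effect can be reproduced outside the database. The Python sketch below mimics the two comparisons with plain datetime values; it is an illustration of the logic only, not Laravel or SQL code, and it uses an exclusive upper bound at midnight of the following day so that a row at exactly that midnight is not picked up.

```python
from datetime import date, datetime, time, timedelta

rows = [datetime(2017, 12, 1, 8, 0, 0), datetime(2017, 12, 2, 9, 0, 0)]
start_date, end_date = date(2017, 12, 1), date(2017, 12, 2)

# What the original query effectively does: both bounds become midnight.
lower = datetime.combine(start_date, time.min)
midnight_upper = datetime.combine(end_date, time.min)
original = [r for r in rows if lower <= r <= midnight_upper]

# Half-open range: everything before midnight of the day after end_date.
exclusive_upper = datetime.combine(end_date + timedelta(days=1), time.min)
fixed = [r for r in rows if lower <= r < exclusive_upper]

print(len(original), len(fixed))  # 1 2
```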

Hive - time difference in minutes is negative

I need to get time difference in minutes for my analysis in Hive query.
I am using unix_timestamp() to convert the dates to seconds, subtracting to get the difference in seconds, and then dividing by 60 to get minutes.
My issue is that the difference (more recent date minus older date) comes out negative.
Here are my query and results:
Hive query and result screenshot
processed_ts        create_ts           processed_unix_timestamp  create_unix_timestamp  minute_diff
2017-03-12 3:01:06  2017-03-12 2:58:36  1489312865                1489316315             -57.5
2017-03-12 3:01:36  2017-03-12 2:59:06  1489312895                1489316345             -57.5
2017-03-12 3:02:12  2017-03-12 2:59:42  1489312932                1489316382             -57.5
Any help is much appreciated.
USA & Canada Start DST on March 12
Published 17-Feb-2017
Most of the United States, Canada, and Mexico's northern border cities
will begin Daylight Saving Time (DST) on Sunday, March 12, 2017.
People in areas that observe DST will spring forward 1 hour from 02:00
(2 am) to 03:00 (3 am), local time.
Standard time will resume on
Sunday, November 5, 2017.
https://www.timeanddate.com/news/time/usa-canada-start-dst-2017.html
select timestamp '2017-03-12 02:58:36'
2017-03-12 03:58:36
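The numbers in the question are consistent with this. The Python sketch below takes the first row's Unix timestamps and renders them at a fixed UTC-7 offset. That offset is an assumption inferred from the posted values (the question does not state the session time zone). The create_ts value of 2:58:36 falls in the skipped hour, and its stored timestamp corresponds to roughly 03:58, which is later than the processed time.

```python
from datetime import datetime, timedelta, timezone

# First row of the table in the question.
processed_unix, create_unix = 1489312865, 1489316315

# Seconds divided by 60 gives minutes.
diff_minutes = (processed_unix - create_unix) / 60

# Assumed offset: UTC-7 (Pacific Daylight Time), inferred from the data.
pdt = timezone(timedelta(hours=-7), "PDT")
processed_local = datetime.fromtimestamp(processed_unix, tz=pdt)
create_local = datetime.fromtimestamp(create_unix, tz=pdt)

print(diff_minutes)                        # -57.5
print(processed_local.strftime("%H:%M:%S"))  # 03:01:05
print(create_local.strftime("%H:%M:%S"))     # 03:58:35
```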

I am working on a Cucumber project and I need to verify the time zone shown in the browser. Here is my requirement

Requirement: We are currently displaying the time zone of the current date, not of the date we received. All test cases will fail once the time zone changes to PST. For every date, we need to display the time zone that applies to that date, not the current time zone of the system. For example:
Daylight Saving Time (United States) 2016 began at 2:00 AM on Sunday, March 13 and ends at 2:00 AM on Sunday, November 6.
03/13/2016 01:59:59 - PST
03/13/2016 02:00:00 - PDT
04/21/2016 00:00:00 - PDT
11/06/2016 01:59:59 - PDT
11/06/2016 02:00:00 - PST
Current solution:
Currently we take Date/Time value from Database and add PDT offset to match the value in UI:
For example:
db_hash["STATUS_STATE_UPDATED_DATE"] = db_hash["STATUS_STATE_UPDATED_DATE"] + Time.zone_offset("PDT")
This is done in all step defs where we are verifying UI data with database.
Also we are appending string "PDT" to database value to match the date/time displayed in UI.
db_hash["STATUS_STATE_UPDATED_DATE"] = db_hash["STATUS_STATE_UPDATED_DATE"].gsub("-","/").gsub("+0000","PDT").strip
We will have to fix this in all the step defs as per the new requirement.
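The per-date lookup described in the requirement can be sketched with a time zone database instead of a hard-coded "PDT". The example below is in Python rather than the Ruby used in the step definitions and is only an illustration of the idea. The zone ID America/Los_Angeles and the helper name zone_label are assumptions, and the code needs the IANA time zone data to be available on the system.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumed zone for the UI; the question only mentions PST and PDT.
PACIFIC = ZoneInfo("America/Los_Angeles")

def zone_label(text):
    """Return the abbreviation (PST or PDT) in effect at the given local time."""
    local = datetime.strptime(text, "%m/%d/%Y %H:%M:%S").replace(tzinfo=PACIFIC)
    return local.tzname()

for sample in ("03/13/2016 01:59:59", "04/21/2016 00:00:00", "11/06/2016 02:00:00"):
    print(sample, "-", zone_label(sample))
```

Local times inside the skipped hour on March 13 and the repeated hour on November 6 are ambiguous or nonexistent, so how they are labelled depends on the library's conventions and should be tested separately.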

Does to_utc_timestamp take into account daylight saving?

I'm trying to convert EST datetime to UTC in a Hive query, but can't see daylight saving taken into account. Do you know how to account for daylight saving in Hive?
For example:
TO_UTC_TIMESTAMP('2014-12-31 00:00:00', 'EST') gives 2014-12-31 05:00:00 i.e. 5 hour difference
TO_UTC_TIMESTAMP('2014-06-30 00:00:00', 'EST') gives 2014-06-30 05:00:00, also 5 hour difference
I'm expecting the June query to give a 4 hour difference.
In June the East Coast observes EDT (Eastern Daylight Time), but Hive doesn't understand EDT at all:
TO_UTC_TIMESTAMP('2014-12-31 00:00:00', 'EDT') gives 2014-12-31 00:00:00 i.e. no difference
Any ideas?
Thanks,
Ilmari
(Running Hadoop 1.0.3 on AWS Elastic MapReduce)
Here is an open ticket from the Hive project that addresses this issue.
https://issues.apache.org/jira/browse/HIVE-12194
See 2nd comment:
Ben Breakstone added a comment - 16/Oct/15 16:54
It's worth noting the daylight saving time version of US three-letter codes like "PDT" are not included in /lib/zi/ for the Oracle JDK. New identifiers like "PST8PDT" appear to work as expected.
See http://www.oracle.com/technetwork/articles/javase/alertfurtherinfo-139131.html
Perhaps as Ben Breakstone suggests new identifiers will work?
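The difference between a fixed abbreviation and a region identifier can be illustrated outside Hive. The Python sketch below converts the two timestamps from the question using the region ID America/New_York, which carries daylight saving rules. It is not Hive code, the helper name to_utc is hypothetical, and whether a given Hive version accepts the same identifier has not been verified here. It also needs the IANA time zone data to be installed.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(local_text, zone_id):
    """Interpret local_text in zone_id and return the UTC wall-clock time as text."""
    local = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    local = local.replace(tzinfo=ZoneInfo(zone_id))
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

winter = to_utc("2014-12-31 00:00:00", "America/New_York")
summer = to_utc("2014-06-30 00:00:00", "America/New_York")
print(winter)  # 2014-12-31 05:00:00 (5 hour difference)
print(summer)  # 2014-06-30 04:00:00 (4 hour difference)
```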

Resources