Awk and calculating start time from end time and duration

I have a file with date, end time and duration in decimal format and I need to calculate the start time. The file looks like:
20140101;1212;1.5
20140102;1515;1.58
20140103;1759;.69
20140104;1100;12.5
...
The duration 1.5 against the time 12:12 means one and a half hours, so the start time would be 12:12 - 1:30 = 10:42. The last row crosses midnight: 11:00 - 12.5 h = 11:00 - 12:30 = 22:30 on the previous day. Is there an easy way to calculate such time differences in Awk, or is it the good ol' split-multiply-subtract-and-handle-the-day-break-yourself all over again?
Since the values are in hours and minutes, only the minutes matter and the seconds can be discarded; for example, duration 1.58 means 1:34 and the leftover 0.8 minutes (48 seconds) can be discarded.
I'm on GNU Awk 4.1.3
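To make the stated rounding rule concrete, here is a tiny Python sketch (illustration only, not part of the question):
def to_hhmm(decimal_hours):
    minutes = int(decimal_hours * 60)   # 1.58 h -> 94 whole minutes, 0.8 min dropped
    return '%d:%02d' % (minutes // 60, minutes % 60)

print(to_hhmm(1.58))  # 1:34
print(to_hhmm(1.5))   # 1:30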

Since you are using gawk, take advantage of its native time functions:
gawk -F\; '{
    tmst = sprintf("%s %s %s %s %s 00",
                   substr($1,1,4),
                   substr($1,5,2),
                   substr($1,7,2),
                   substr($2,1,2),
                   substr($2,3,2))
    t1 = mktime(tmst)
    seconds = sprintf("%f", $3) + 0
    seconds *= 60 * 60
    difference = strftime("%H%M", t1 - seconds)
    print $0 FS difference
}' file
Results:
20140101;1212;1.5;1042
20140102;1515;1.58;1340
20140103;1759;.69;1717
20140104;1100;12.5;2230
Check: https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html
Explanation:
tmst = sprintf(...) : builds a date string from the fields of the file that conforms to the datespec the mktime function expects: YYYY MM DD HH MM SS [DST].
t1 = mktime(tmst) : turns the datespec into a timestamp that gawk can handle (the number of seconds elapsed since 1 January 1970).
seconds = sprintf("%f",$3)+0 : converts the third field to a float.
seconds *= 60*60 : converts hours (as a float) to seconds.
difference = strftime("%H%M", t1-seconds) : formats the difference in a human manner, as hours and minutes.
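For comparison, the same epoch-seconds technique sketched with Python's time module (one hypothetical input line; like the gawk version, mktime() and localtime() work in the system's local timezone, so a DST boundary inside the interval can shift a result):
import time

line = '20140102;1515;1.58'
d, t, dur = line.split(';')
# Build a struct_time and convert to epoch seconds, as mktime() does in gawk
end = time.mktime(time.strptime(d + ' ' + t, '%Y%m%d %H%M'))
start = end - float(dur) * 3600           # duration in hours -> seconds
print(time.strftime('%H%M', time.localtime(start)))  # 1340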

I highly recommend using a programming language that supports datetime calculations, because the details can get tricky, e.g. around daylight saving shifts. You can use Python, for example:
start_times.py:
import csv
from datetime import datetime, timedelta

with open('input.txt', 'rb') as csvfile:
    reader = csv.reader(csvfile, delimiter=';', quotechar='|')
    for row in reader:
        end_day = row[0]
        end_time = row[1]
        # Create a datetime object
        end = datetime.strptime(end_day + end_time, "%Y%m%d%H%M")
        # Translate duration into minutes
        duration = float(row[2]) * 60
        # Calculate start time
        start = end - timedelta(minutes=duration)
        # Column 3 is the start day (can differ from the end day!)
        row.append(start.strftime("%Y%m%d"))
        # Column 4 is the start time
        row.append(start.strftime("%H%M"))
        print ';'.join(row)
Run:
python start_times.py
Output:
20140101;1212;1.5;20140101;1042
20140102;1515;1.58;20140102;1340
20140103;1759;.69;20140103;1717
20140104;1100;12.5;20140103;2230 <-- you see, the day matters!
The above example uses the system's timezone. If the input data refers to a different timezone, Python's datetime module allows you to specify it.
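The script above targets Python 2 (print statement, 'rb' file mode). As a hedged sketch of the timezone remark: with Python 3.9+ you could pin the parsed datetimes to a specific zone via the standard-library zoneinfo module (the zone name here is just an example):
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+

end = datetime.strptime('201401041100', '%Y%m%d%H%M')
end = end.replace(tzinfo=ZoneInfo('Europe/Berlin'))  # example zone, not from the question
start = end - timedelta(hours=12.5)
print(start.strftime('%Y%m%d %H%M'))  # 20140103 2230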

I would do something like this:
awk 'BEGIN{FS=OFS=";"}
{ h=substr($2,1,2); m=substr($2,3,2); mins=h*60 + m; diff=mins - $3*60;
  print $0, int(diff/60) ":" int(diff%60)
}' file
That is, convert everything to minutes and then back to hours/minutes.
Test
$ awk 'BEGIN{FS=OFS=";"}{h=substr($2,1,2); m=substr($2,3,2); mins=h*60 + m; diff=mins - $3*60; print $0, int(diff/60) ":" int(diff%60)}' a
20140101;1212;1.5;10:42
20140102;1515;1.58;13:40
20140103;1759;.69;17:17
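Two caveats with this minutes arithmetic: diff goes negative when the duration crosses midnight (the 12.5 h row), and int(diff%60) prints single-digit minutes without a leading zero. A minimal Python sketch, assuming the same field layout, with both handled:
for end_hhmm, dur in [('1759', .69), ('1100', 12.5)]:
    mins = int(end_hhmm[:2]) * 60 + int(end_hhmm[2:])  # end time in minutes
    diff = (mins - dur * 60) % 1440    # % 1440 wraps a negative diff past midnight
    print('%02d:%02d' % (diff // 60, diff % 60))       # 17:17, then 22:30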

Related

Tibco Spotfire - time in seconds & milliseconds in Real, convert to a time of day

I have a list of times in a decimal format of seconds, and I know what time the series started. I would like to convert it to a time of day with the offset of the start time applied. There must be a simple way to do this that I am missing!
Sample source data:
\Name of source file : 260521-11_58
\Recording from 26.05.2021 11:58
\Channels : 1
\Scan rate : 101 ms = 0.101 sec
\Variable 1: n1(rpm)
\Internal identifier: 63
\Information1:
\Information2:
\Information3:
\Information4:
0.00000 3722.35645
0.10100 3751.06445
0.20200 1868.33350
0.30300 1868.36487
0.40400 3722.39355
0.50500 3722.51831
0.60600 3722.50464
0.70700 3722.32446
0.80800 3722.34277
0.90900 3722.47729
1.01000 3722.74048
1.11100 3722.66650
1.21200 3722.39355
1.31300 3751.02710
1.41400 1868.27539
1.51500 3722.49097
1.61600 3750.93286
1.71700 1868.30334
1.81800 3722.29224
The start time & date is 26.05.2021 11:58, and the left-hand column is the elapsed time in seconds, with the column name [Time]. So I just want to convert the decimal/real value to a time or timespan and add the start time to it.
I have tried lots of ways that are really hacky and ultimately flawed; the below works, but it simply ignores the milliseconds.
TimeSpan(0,0,0,Integer(Floor([Time])),[Time] - Integer(Floor([Time])))
The last part works to get the milli/microseconds on its own, but not as part of the above.
Your formula isn't really ignoring the milliseconds: you are passing the decimal part of your time (in seconds) straight in as milliseconds, so the value being returned is too small to show up in the format mask.
You need to convert the seconds to milliseconds, so something like this should work
TimeSpan(0,0,0,Integer(Floor([Time])),([Time] - Integer(Floor([Time]))) * 1000)
To add it to the time, this would work
DateAdd(Date("26-May-2021"),TimeSpan(0,0,0,Integer([Time]),([Time] - Integer([Time])) * 1000))
You will need to set the column format to
dd-MMM-yyyy HH:mm:ss:fff
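The Spotfire expressions above are authoritative; as an illustration of the underlying whole/fractional split, a short Python sketch:
import math

t = 1.81800                          # a value from the [Time] column, in seconds
whole  = math.floor(t)               # whole seconds -> 1
millis = round((t - whole) * 1000)   # scale the fraction to milliseconds -> 818
print(whole, millis)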

MetaTrader 4 / MQL4 time is off by -5 hours, but only when using epoch time

I'm working with MetaTrader4 running in WINE in Ubuntu 16.04. I have a simple inline function that saves the time and various other info to a file using this line:
FileWrite(data_filehandle, "sys_time:" + (string)TimeLocal() + "." + StringFormat("%06lu", usec_instance) + ", sym:" + (string)Symbol() + ", tick_time:" + (string)last_tick.time + ", ask:" + (string)last_tick.ask + ", bid:" + (string)last_tick.bid);
Using the directive:
#property strict
will cause it to output the time in a Date Time format.
Removing that directive will cause it to output time in epoch format.
When it uses Date Time format ( by using '#property strict' ), the time is correct.
It outputs:
sys_time:2020.01.21 07:38:02.994394, sym:EURUSD, tick_time:2020.01.21 14:38:03, ask:1.1104, bid:1.1103
This matches my system time correctly.
Now if I remove '#property strict' to switch to epoch time,
It outputs:
sys_time:1579592538.630395, sym:EURUSD, tick_time:1579617738, ask:1.1105, bid:1.11041
my local time is:
$ date '+%s'
1579610544
$ date '+%Z %z'
EST -0500
my time: 1579610544 - MT4 TimeLocal: 1579592538 = 18006 seconds (which is 5 hours and 6 seconds behind me)
Any idea on what could be causing this?
I might be slightly less confused if it were +5 hours because that would be GMT.
But it's -5 hours which is Hawaii.
Also... why is the time correct in one format, but not in the other format?
Additional info
I ran some more tests using additional MQL4 functions. I had them constantly pump out their results to my text file. I then quickly whipped together a BASH script to check out the results. I found the following:
Using this code in MT4
FileWrite(data_filehandle, "TimeDaylightSavings(): " + (string)TimeDaylightSavings());
FileWrite(data_filehandle,"TimeLocal(): "+(string)TimeLocal());
FileWrite(data_filehandle,"TimeGMTOffset(): "+(string)TimeGMTOffset());
FileWrite(data_filehandle,"TimeGMT(): "+(string)TimeGMT()+"\n\n");
Gave this output to my text file ( one fresh record about every second ):
TimeDaylightSavings(): 0
TimeLocal(): 1579601184
TimeGMTOffset(): 18000
TimeGMT(): 1579619184
I whipped up this BASH script to scan and check the results in real time:
#!/bin/bash
IFS=$'\n'$'\b';
while true
do
    my_time=$(date);
    my_epoc=$(date '+%s');
    my_record="$( cat EURUSD_price_data.txt | dos2unix | tail -5 )";
    mt4_time_local=$( echo "$my_record" | grep -w 'TimeLocal' );
    echo "Reading line: $mt4_time_local";
    mt4_time_local=$(echo $mt4_time_local | awk '{print $2}' );
    echo "My time: $my_time -- My epoc: $my_epoc -- MT4_TimeLocal epoc: $mt4_time_local -- Difference: $(( $my_epoc - $mt4_time_local ))";
    mt4_time_GMT=$( echo "$my_record" | grep -w 'TimeGMT' );
    echo "Reading line: $mt4_time_GMT";
    mt4_time_GMT=$(echo $mt4_time_GMT | awk '{print $2}' );
    echo "My time: $my_time -- My epoc: $my_epoc -- MT4_TimeGMT epoc: $mt4_time_GMT -- Difference: $(( $my_epoc - $mt4_time_GMT ))";
    echo "";
    sleep 1;
done
and got this result:
Reading line: TimeLocal(): 1579601184
My time: Tue Jan 21 10:06:25 EST 2020 -- My epoc: 1579619185 -- MT4_TimeLocal epoc: 1579601184 -- Difference: 18001
Reading line: TimeGMT(): 1579619184
My time: Tue Jan 21 10:06:25 EST 2020 -- My epoc: 1579619185 -- MT4_TimeGMT epoc: 1579619184 -- Difference: 1
Now if I add '#property strict' to switch back to Date Time format I get:
TimeDaylightSavings(): 0
TimeLocal(): 2020.01.21 10:23:56
TimeGMTOffset(): 18000
TimeGMT(): 2020.01.21 15:23:56
My system time:
$ date
Tue Jan 21 10:23:57 EST 2020
Conclusion
For some reason, when producing epoch time, TimeLocal() gives the wrong time (Hawaiian time, of all things), yet surprisingly TimeGMT() gives the correct time, even though I am in the EST timezone.
Using the exact same code and setup, when getting the time in Date Time format (using the '#property strict' directive), the situation is reversed: TimeLocal() gives the correct time and TimeGMT() gives the wrong one (but at least it is correct GMT time).
Is this a bug in MT4, or is there something going on behind the scenes that I haven't fully understood yet?
Q : Any idea on what could be causing this?
The #property strict directive is a compile-time switch that changes many details of how otherwise syntactically correct MQL4 code gets interpreted, in either the { "old" | "new" } fashion (not re-reading this part of the documentation after each MT4 IDE update may, and will, surprise you, so better re-read it after every update).
"Old" MQL4 used an int32 for the internal datetime storage; "new" MQL4 uses an int64, so any roll-overs are much farther away.
FileWrite( data_filehandle, "sys_time:"
           + (string)TimeLocal()                      // localhost-dependent
           + "."
           + StringFormat( "%06lu", usec_instance )
           + ", sym:"
           + (string)Symbol()
           + ", tick_time:"
           + (string)last_tick.time                   // Fx-QUOTE-dependent
           + ", ask:"
           + (string)last_tick.ask
           + ", bid:"
           + (string)last_tick.bid
           );
See TimeGMT() and TimeGMTOffset() for other built-in options.
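To put the int32 remark in perspective, a quick Python check of where a signed 32-bit epoch counter tops out (the well-known year-2038 limit):
from datetime import datetime, timezone

# Largest value a signed 32-bit epoch counter can hold
print(datetime.fromtimestamp(2**31 - 1, tz=timezone.utc))
# -> 2038-01-19 03:14:07+00:00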
After a lot of reading, thinking and testing, I have found the answer (though I don't know what they were thinking when they designed it this way).
The issue is right here:
TimeDaylightSavings(): 0
TimeLocal(): 1579601184 <-- should be the same as TimeGMT(). Epoch time does not respect timezones
TimeGMTOffset(): 18000
TimeGMT(): 1579619184
Unix epoch time counts the seconds elapsed since Jan 1st 1970 00:00 UTC.
It does not respect time zones; it is supposed to be the same for everyone on Earth at all times.
You can also see from the original problem that any time I use the
date '+%s'
command, my system gives me a correct Unix epoch that matches the output of TimeGMT() from MT4. So the problem is not my system.
However, above we can see that TimeLocal() is not treating the epoch the way it should: it adjusts the Unix epoch in an attempt to compensate for my time zone. That behavior is probably how it manages to produce the correct time when displayed in Date Time format. The problem is that, when asked to produce the time in epoch format, it still performs the time-zone conversion, which violates the very meaning of Unix epoch time.
So the solution (as far as I can tell) is to simply use TimeGMT() any time I want a correct Unix epoch. It seems pretty crazy that I have to specifically avoid TimeLocal() in certain cases... but I guess that's just the way it is?
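The numbers logged above line up exactly with that reading; a small Python check (values taken from the question):
# Values logged by MT4 above
time_local = 1579601184   # TimeLocal() printed as epoch
time_gmt   = 1579619184   # TimeGMT(), matches `date '+%s'` on the host
gmt_offset = 18000        # TimeGMTOffset(), 5 hours in seconds

print(time_gmt - time_local == gmt_offset)  # True: the "local epoch" is GMT minus the offset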

Ruby Date Time Subtraction

I am trying to calculate the exact duration a process took from a log file result. After parsing the log, I arrived at the following stage:
my_array = ["Some_xyz_process", "Start", "2018-07-12", "12:59:53,397", "End", "2018-07-12", "12:59:55,913"]
How can I subtract the start date and time from the end date and time in order to retrieve the exact duration the process took?
my_array = ["Some_xyz_process",
"Start", "2018-07-12", "12:59:53,397",
"End", "2018-07-12", "12:59:55,913"]
require 'date'
fmt = '%Y-%m-%d%H:%M:%S,%L'
is = my_array.index('Start')
#=> 1
ie = my_array.index('End')
#=> 4
DateTime.strptime(my_array[ie+1] + my_array[ie+2], fmt).to_time -
DateTime.strptime(my_array[is+1] + my_array[is+2], fmt).to_time
#=> 2.516 (seconds)
See DateTime::strptime and DateTime#strftime (the latter for the format directives). As long as the date and time formats are known, I always prefer strptime to parse. Here's an example of why:
DateTime.parse 'Theresa May has had a bad week over Brexit'
#=> #<DateTime: 2018-05-01T00:00:00+00:00 ((2458240j,0s,0n),+0s,2299161j)>
You can concatenate the date and time fields and use Time.parse to convert them to a Time object, then take the difference in seconds:
Time.parse('2018-07-12 12:59:55,913').to_i - Time.parse('2018-07-12 12:59:53,397').to_i
Note that to_i truncates to whole seconds; use to_f if the fractional part should be kept.
Hope this helps
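For comparison, the same millisecond-precise subtraction sketched in Python (strptime's %f directive accepts the fractional digits after the comma):
from datetime import datetime

fmt = '%Y-%m-%d %H:%M:%S,%f'   # %f parses 1-6 fractional-second digits
start = datetime.strptime('2018-07-12 12:59:53,397', fmt)
end   = datetime.strptime('2018-07-12 12:59:55,913', fmt)
print((end - start).total_seconds())   # 2.516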

Smaller variation between times of different days

I have been working on an algorithm that selects a set of date/time objects with a certain characteristic, but with no success.
The data to be used are in a list of lists of date/time objects, e.g.:
lstDays[i][j], i <= day chooser, j <= time chooser
What is the problem? I need a set of nearest date/time objects, where each time in the set comes from a different day.
For example: [2012-09-09 12:00,2012-09-10 12:00, 2012-09-11 12:00]
This example set is the best possible one because its spread is minimized to zero.
Important
Trying to contextualize this: I want to observe whether a phenomenon occurs at the same time on different days. If not, I want to evaluate whether the distance between the times is reasonable for my study.
I would like a generic algorithm for any number of days and times. The algorithm should return every set of datetime objects together with its time spread:
[2012-09-09 12:00,2012-09-10 12:00, 2012-09-11 12:00], 0
[2012-09-09 13:00,2012-09-10 13:00, 2012-09-11 13:05], 5
and so on.
:: "0", because the diff between all times on the first line from datetime objects is zero seconds.
:: "5", because the diff between all times on the second line from datetime objects is five seconds.
Edit: Code here
for i in range(len(lstDays)):
    for j in range(len(lstDays[i])):
        print lstDays[i][j]
Output:
2013-07-18 11:16:00
2013-07-18 12:02:00
2013-07-18 12:39:00
2013-07-18 13:14:00
2013-07-18 13:50:00
2013-07-19 11:30:00
2013-07-19 12:00:00
2013-07-19 12:46:00
2013-07-19 13:19:00
2013-07-22 11:36:00
2013-07-22 12:21:00
2013-07-22 12:48:00
2013-07-22 13:26:00
2013-07-23 11:18:00
2013-07-23 11:48:00
2013-07-23 12:30:00
2013-07-23 13:12:00
2013-07-24 11:18:00
2013-07-24 11:42:00
2013-07-24 12:20:00
2013-07-24 12:52:00
2013-07-24 13:29:00
Note: lstDays[i][j] is a datetime object.
lstDays = [ [/*datetime objects from day i*/], [/*datetime objects from day i+1*/], [/*datetime objects from day i+2*/], ... ]
And I am not worried about performance, a priori.
Hope that you can help me! (:
Generate a histogram:
hours = [0] * 24
for obj in objects:  # whatever your objects are
    # assuming obj.date_time looks like '2013-07-18 10:55:00'
    hour = obj.date_time[11:13]  # the hour sits at positions 11-12
    hours[int(hour)] += 1
for hour in xrange(24):
    print '%02d: %d' % (hour, hours[hour])
You can always compute the times into a list, then estimate the differences and group those objects that fall below a limit, all packed into a dictionary with the difference as the value and the timestamps as keys. If this is not exactly what you need, it should be easy to select whatever result you want from it.
import numpy

# Time-of-day values, e.g. taken from datetime objects via .time()
times_list = [object1.time(), object2.time(), ..., objectN.time()]

def secs(t):
    # datetime.time objects do not support subtraction, so compare them
    # as seconds since midnight; times from different days then line up
    return t.hour * 3600 + t.minute * 60 + t.second

limit = 5  # limit of five seconds
groups = {}
for time in times_list:
    delta_times = numpy.asarray([secs(tt) - secs(time) for tt in times_list])
    whr = numpy.where(abs(delta_times) < limit)[0]
    similar = [str(times_list[ii]) for ii in whr]
    if len(similar) > 1:
        similar.sort()
        max_time = numpy.max(delta_times[whr])  # max? median? mean?
        groups[tuple(similar)] = max_time
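Hypothetically, with the lstDays structure from the question, times_list could be built by flattening the per-day lists (names taken from the question; .time() strips the date so only the hour-of-day matters):
times_list = [dt.time() for day in lstDays for dt in day]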

Faster date formatting in R?

I often need to convert (long) character strings into the date class in R. I notice that this step seems quite slow.
Example:
date <- c("5/31/2013 23:30", "5/31/2013 23:35", "5/31/2013 23:40", "5/31/2013 23:45", "5/31/2013 23:50", "5/31/2013 23:55")
Date <- as.POSIXct(date, format="%m/%d/%Y %H:%M")
This isn't a huge problem, but I wonder if I'm overlooking an easy route to increased efficiency. Any tips for speeding this up? Thanks.
Since I wrote this before it was pointed out that this is a duplicate, I'll add it as an answer anyway. Basically, package fasttime can help you IF you have dates AFTER 1970-01-01 00:00:00, AND they are GMT, AND they are in the order year, month, day, hour, minute, second. If you can rewrite your dates into this format, then fastPOSIXct will be quick:
# data
date <- c( "2013/5/31 23:30" , "2013/5/31 23:35" , "2013/5/31 23:40" , "2013/5/31 23:45" )
require(fasttime)
# fasttime function
dates.ft <- fastPOSIXct( date , tz = "GMT" )
# base function
dates <- as.POSIXct( date , format= "%Y/%m/%d %H:%M")
# rough comparison
require(microbenchmark)
microbenchmark( fastPOSIXct( date , tz = "GMT" ) , as.POSIXct( date , format= "%Y/%m/%d %H:%M") , times = 100L )
# Unit: microseconds
#                                          expr     min      lq  median       uq     max neval
#                 fastPOSIXct(date, tz = "GMT")  19.598  21.699  24.148  25.5485 215.927   100
#  as.POSIXct(date, format = "%Y/%m/%d %H:%M") 160.633 163.433 168.332 181.9800 278.220   100
But the question would be: is it quicker to transform your dates into a format fasttime can accept, or to just use as.POSIXct, or to buy a faster computer?!
