I have been developing a script on my Linux box for quite some time and wanted to run it on my Mac as well.
I thought that the commands on the Mac were the same as the commands on Linux, but today I realized I was wrong. I knew that fewer commands existed on the Mac, but I assumed that the ones which did exist behaved the same way.
This problem is specifically about the date command.
When I run the command on my Linux machine with the format specifier for nanoseconds, I get the expected result, but when I run it on my Mac, that option is not supported.
Linux-Machine> date +%N
55555555555 #Current time in nanoseconds
Mac-Machine> date +%N
N
How do I go about getting the current time in nanoseconds as a bash command on the Mac?
Worst case, I write a small C program that calls a system function and then invoke it from my script.
Any help is much appreciated!
This is because OS X and Linux use two different sets of tools. Linux uses the GNU version of the date command (hence, GNU/Linux). Remember that Linux is Linux and OS X is BSD-flavoured Unix; they're different.
You can install GNU date, which is included in the "coreutils" package from MacPorts. It is installed on your system as gdate, so you can either use it under that name or link date to the new gdate binary; your choice.
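For instance, a minimal sketch (assuming MacPorts is installed; the Homebrew coreutils package provides gdate in the same way):
sudo port install coreutils    # MacPorts; with Homebrew: brew install coreutils
gdate +%N                      # nanoseconds within the current second
gdate +%s%N                    # seconds since the epoch concatenated with nanoseconds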
man date indicates that the macOS date command offers nothing finer than one-second resolution. I would recommend using another language instead (Python 2):
$ python -c 'import time; print repr(time.time())'
1332334298.898616
For Python 3, use:
$ python -c 'import time; print(repr(time.time()))'
There are "Linux specifications" but they do not regulate the behavior of the date command much. What you have is really the opposite -- Linux (or more specifically the GNU user-space tools) has a large number of extensions which are not compatible with Unix by any reasonable definition.
There are a number of standards that do regulate these things. The one you should be looking at is POSIX, which requires
date [-u] [+format]
and nothing more to be supported by adhering implementations. (There are other standards like XPG and SUS which you might want to look at as well, but at the very least, you should require and expect POSIX these days ... finally.)
The POSIX document contains a number of examples, but there is nothing for date conversion, which is nevertheless a practical problem that many scripts turn to date for. And for your concrete problem, POSIX has nothing for reporting times with sub-second accuracy.
Anyway, griping that *BSD isn't Linux isn't really helpful here; you just have to understand what the differences are, and code defensively. If your requirements are complex or unusual, perhaps turn to a scripting language like Perl or Python which perform these types of date formatting operations more or less out of the box in a standard installation (though neither Perl nor Python has a quick and elegant way to do date conversion out of the box, either; solutions tend to be somewhat tortured).
In practical terms, you can compare the macOS date man page and the Linux one and try to reconcile your requirements.
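For example, a format that sticks to POSIX conversion specifiers behaves the same under both GNU and BSD/macOS date (a sketch; substitute whichever specifiers your script actually needs):
date -u '+%Y-%m-%d %H:%M:%S'   # only POSIX-specified conversions; no %N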
For your practical requirement, macOS date does not support any format string with nanosecond accuracy, nor are you likely to get useful results at that scale when merely executing the command takes a significant number of nanoseconds. I would settle for millisecond-level accuracy (even that will be thrown off in the final digits by the execution time) and multiply to get a number on the nanosecond scale.
nanoseconds () {
    python -c 'import time; print(int(time.time()*1000*1000*1000))'
}
(Notice the parentheses around the argument to print() for Python 3.) You will notice that Python does report a value at nanosecond accuracy (the last digits are often not zeros), though by the time you have run time.time() the value will obviously no longer be correct.
To get an idea of the error rate,
bash#macos-high-sierra$ python3
Python 3.5.1 (default, Dec 26 2015, 18:08:53)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> import timeit
>>> def nanoseconds ():
... return int(time.time()*1000*1000*1000)
...
>>> timeit.timeit(nanoseconds, number=10000)
0.0066173350023746025
>>> timeit.timeit('int(time.time()*1000*1000*1000)', number=10000)
0.00557799199668807
Starting Python and printing the value will realistically add a few orders of magnitude of overhead on top of that, but I haven't attempted to quantify it. (The output from timeit is in seconds.)
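Putting the pieces together, one possible portable variant of the function above (a sketch: it prefers coreutils' gdate when installed, e.g. from MacPorts or Homebrew, and falls back to the Python one-liner otherwise):
nanoseconds () {
    # Sketch: use GNU date when available, otherwise fall back to Python.
    if command -v gdate >/dev/null 2>&1; then
        gdate +%s%N                 # epoch seconds concatenated with nanoseconds
    else
        python -c 'import time; print(int(time.time()*1000*1000*1000))'
    fi
}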
Related
I have a function that converts an int to a date, which is then fed into datediff to find how many days have passed since an event. One of our tests passes on PySpark on Windows and in our Azure DevOps pipeline, but fails when run on PySpark in WSL Ubuntu. We've narrowed it down to the to_date() function producing different results on the two platforms, but don't understand why.
import pyspark.sql.functions as F
import datetime
def from_int_to_date(int_date: int) -> datetime.datetime:
    """
    Convert an integer in YYYYMMDD format into a datetime object
    """
    return datetime.datetime.strptime(str(int_date), "%Y%m%d")
If I calculate F.to_date(F.lit(from_int_to_date(20190401))) I get Column<b"to_date(TIMESTAMP '2019-04-01 00:00:00')"> on Windows and Column<b"to_date(TIMESTAMP('2019-03-31 23:00:00.0'))"> on the version running under WSL.
I am based in the UK, and we made our clock change for summer at the end of March 2019, so I can understand why the time goes back an hour; the problem doesn't occur with an input int of 20190331. I'm just trying to understand why the behaviour of to_date() differs between the two systems and what we should do to mitigate this (and any other differences), as ideally our code would be platform agnostic.
Set the time zone for the Spark driver with the configuration spark.sql.session.timeZone so you won't depend on the system's time zone.
spark.conf.set("spark.sql.session.timeZone", "Europe/London")
This option can also be set when the Spark session is created.
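If the job is launched with spark-submit, the same property can be supplied on the command line so every environment starts its session with the same time zone (a sketch; my_job.py is a placeholder name):
spark-submit --conf spark.sql.session.timeZone=Europe/London my_job.py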
I have a question: I want to time some calculations using the Unix time command, and I have found that Maple (I use version 16) on Ubuntu 12.04 LTS (and some other machines I tested, including Macs) behaves oddly.
calling
time maple < testCalc.txt
where testCalc.txt contains the following code:
with(DETools):
DFactor(mult(x^5*d^5 + 6*x*d +1,x^5*d^5 + x^2*d^2 +7,[d,x]),[d,x]);
results in the following output:
memory used=65.5MB, alloc=72.9MB, time=0.69
memory used=199.6MB, alloc=149.9MB, time=1.84
memory used=312.4MB, alloc=149.9MB, time=2.97
memory used=592.3MB, alloc=312.4MB, time=5.63
memory used=854.7MB, alloc=312.4MB, time=9.80
["The Result (long)"]
memory used=1132.9MB, alloc=312.4MB, time=13.06
But the three additional lines from time say
real 0m47.872s
user 0m0.016s
sys 0m0.000s
Clearly, the user and sys times are wrong, as Maple spent 13 seconds according to its own measurement.
It seems to me that Maple uses the same source as the time command and resets the timer each time it reads it, so that the Unix time command only captures the time since Maple's last call to that source.
This is very inconvenient, and I would like to "forbid" Maple from doing that. Does anyone know how? Is there some flag for invoking Maple that stops it from taking its own timestamps?
Thanks in advance for an answer.
Albert
Ugly hack coming up.
As I said in my comment, the problem is that Maple is launching a sub-process to do all of the computations. So, I created a shell script called "mserver" in my bin that looks like
#!/bin/sh
/usr/bin/time "REPLACE WITH PATH TO MSERVER ON YOUR MACHINE/mserver" "$@" 2> log
I then invoked Maple as
maple --kernel-binary=/Users/me/bin/mserver
At the end of a run, the file log contains the correct "time" output for the computation.
Edit: I should point out that if the Maple protocols use stderr for anything, then this will eventually cause Maple to break. I haven't seen any sign of it but I've only just played with this now.
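To tie this back to the original invocation, usage might look roughly like this (a sketch; the paths are placeholders for your own setup):
chmod +x ~/bin/mserver                            # make the wrapper executable
maple --kernel-binary=$HOME/bin/mserver < testCalc.txt
cat log                                           # /usr/bin/time's report, written to ./log in the working directory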
Under most Unix-like systems, you can use the "time" command to execute a program and tell you how much space and time it used. Does anybody know of anything comparable for Windows?
(No, I don't particularly want to spend 6 months learning the Win32 API just for this...)
From the command line (low resolution, possibly inaccurate): echo %date% %time%
Programmatically: QueryPerformanceCounter. http://msdn.microsoft.com/en-us/library/ms644904(v=vs.85).aspx
If you want something on the order of millisecond accuracy (comparable to what the Linux/Unix time command gives you), then timeGetTime() is what you need. It returns the number of milliseconds since the system was booted; include mmsystem.h and link against winmm.lib. However, all this gives you is a time value, so you would either need to put a system() call between two readings, or dump the start time out to a file on the first run and read it back on the second.
More pragmatic solutions, which may be more useful depending on your circumstances:
Write a batch script to call the program you wish to benchmark and wrap it so that it writes to a file:
echo "start" >> log.txt
do_my_stuff.exe
echo "stop" >> log.txt
and then use a tool such as the excellent LogExpert to look at the timestamps.
Install the Cygwin tools and use the time command that comes with them. If you only need to do this on your own machine, and the benchmark program doesn't require complex setup (command-line parameters, environment variables, etc.), then this may be the easiest approach.
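For example, assuming the Cygwin tools are on your PATH and reusing the do_my_stuff.exe name from the batch example above:
time ./do_my_stuff.exe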
I use the 'time' utility on Windows too. It comes with MinGW + MSYS.