I have been developing a script on my Linux box for quite some time and wanted to run it on my Mac as well.
I assumed the commands on the Mac were the same as those on Linux, but today I learned that assumption was wrong. I knew the Mac had fewer commands and options, but I thought the ones that did exist were implemented the same way.
This problem specifically concerns the date command.
When I run the command on my Linux machine with the format specifier for nanoseconds, I get the expected result, but on my Mac that specifier is not supported.
Linux-Machine> date +%N
55555555555 # nanoseconds field of the current time
Mac-Machine> date +%N
N
How do I go about getting the current time in nanoseconds as a bash command on the Mac?
Worst case, I write a small C program that calls the appropriate system function and invoke it from my script.
Any help is much appreciated!
This is because macOS and Linux ship two different sets of tools. Linux uses the GNU version of the date command (hence, GNU/Linux), while macOS uses the BSD version. They are different.
You can install the GNU date command, which is included in the coreutils package from MacPorts. It will be installed on your system as gdate; you can either call gdate directly or symlink date to it.
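For example (a quick sketch, assuming the MacPorts port name; Homebrew's coreutils package works the same way):
sudo port install coreutils
gdate +%s%N   # seconds plus the nanoseconds field, GNU-style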
man date indicates that the stock BSD date doesn't go below one-second resolution. I would recommend reaching for another language (Python 2):
$ python -c 'import time; print repr(time.time())'
1332334298.898616
For Python 3, use:
$ python -c 'import time; print(repr(time.time()))'
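On Python 3.7 and newer, time.time_ns() returns integer nanoseconds directly and sidesteps the float rounding:
$ python3 -c 'import time; print(time.time_ns())'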
There are "Linux specifications" but they do not regulate the behavior of the date command much. What you have is really the opposite -- Linux (or more specifically the GNU user-space tools) has a large number of extensions which are not compatible with Unix by any reasonable definition.
There is a large number of standards which do regulate these things. The one you should be looking at is POSIX which requires
date [-u] [+format]
and nothing more to be supported by adhering implementations. (There are other standards like XPG and SUS which you might want to look at as well, but at the very least, you should require and expect POSIX these days ... finally.)
The POSIX document contains a number of examples, but it has nothing for date conversion, which is nevertheless a practical problem that many scripts turn to date for. Nor, for your concrete problem, does POSIX offer anything for reporting times with sub-second accuracy.
Anyway, griping that *BSD isn't Linux isn't really helpful here; you just have to understand what the differences are and code defensively. If your requirements are complex or unusual, perhaps turn to a scripting language like Perl or Python, which handle these kinds of date formatting tasks more or less out of the box in a standard installation (though neither has a quick and elegant way to do date conversion out of the box either; solutions tend to be somewhat tortured).
In practical terms, you can compare the macOS date man page with the Linux one and try to reconcile your requirements.
For your practical requirement, macOS date does not support any format string with nanosecond precision, but you are also unlikely to get meaningful results at that scale when simply executing the command takes a significant number of nanoseconds. I would settle for millisecond-level accuracy (even that will be thrown off in the final digits by execution time) and multiply to get a number on the nanosecond scale.
nanoseconds () {
    python -c 'import time; print(int(time.time()*1000*1000*1000))'
}
(Notice the parentheses around the argument to print(), for Python 3 compatibility.) You will notice that Python does report a value at nanosecond precision (the last digits are usually not zeros), though by the time time.time() has returned, the value is of course already stale.
To get an idea of the error rate,
bash#macos-high-sierra$ python3
Python 3.5.1 (default, Dec 26 2015, 18:08:53)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import time
>>> import timeit
>>> def nanoseconds ():
... return int(time.time()*1000*1000*1000)
...
>>> timeit.timeit(nanoseconds, number=10000)
0.0066173350023746025
>>> timeit.timeit('int(time.time()*1000*1000*1000)', number=10000)
0.00557799199668807
The overhead of starting Python and printing the value will realistically add a few orders of magnitude, but I haven't attempted to quantify that. (The output from timeit is in seconds.)
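If even the interpreter startup matters, the "worst case" from the question (a small C helper) is genuinely short. A minimal sketch using clock_gettime(), which macOS has supported since 10.12 Sierra; compile with cc -O2 -o now_ns now_ns.c:

#include <stdio.h>
#include <time.h>

int main(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_REALTIME, &ts) != 0) {
        perror("clock_gettime");
        return 1;
    }
    /* Combine seconds and the zero-padded nanoseconds field
       into one nanosecond count. */
    printf("%lld%09ld\n", (long long)ts.tv_sec, ts.tv_nsec);
    return 0;
}

Calling now_ns from the script avoids the interpreter startup entirely, though the fork/exec of the process itself still costs time on the microsecond scale.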
Related
I have a function that converts an int to a date, which is then fed into datediff to find how many days ago an event happened. One of our tests passes on PySpark on Windows and in our Azure DevOps pipeline, but fails when run on PySpark under WSL Ubuntu. We've narrowed it down to the to_date() function producing different results on the two platforms, but we don't understand why.
import pyspark.sql.functions as F
import datetime
def from_int_to_date(int_date: int) -> datetime.datetime:
    """
    Convert an integer in YYYYMMDD format into a datetime object
    """
    return datetime.datetime.strptime(str(int_date), "%Y%m%d")
If I calculate F.to_date(F.lit(from_int_to_date(20190401))) I get Column<b"to_date(TIMESTAMP '2019-04-01 00:00:00')"> on Windows and Column<b"to_date(TIMESTAMP('2019-03-31 23:00:00.0'))"> on the version running under WSL.
I am based in the UK, where the clocks changed to summer time at the end of March 2019, so I can understand why the time shifts by an hour; indeed, the problem doesn't occur with an input of 20190331. I'm just trying to understand why the behaviour of to_date() differs between the two systems and what we should do to mitigate this (and any other differences), as ideally our code would be platform-agnostic.
Set the time zone on the Spark driver with the configuration spark.sql.session.timeZone, so that you don't depend on the system's time zone.
spark.conf.set("spark.sql.session.timeZone", "Europe/London")
This option can also be set when the Spark session is created, as in the sketch below.
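For example, when building the session (a minimal PySpark sketch; the app name is arbitrary):

from pyspark.sql import SparkSession

# Pin the session time zone so SQL timestamp semantics do not depend
# on the host operating system's zone.
spark = (SparkSession.builder
         .appName("tz-example")
         .config("spark.sql.session.timeZone", "Europe/London")
         .getOrCreate())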
Under most Unix-like systems, you can use the "time" command to execute a program and tell you how much space and time it used. Does anybody know of anything comparable for Windows?
(No, I don't particularly want to spend 6 months learning the Win32 API just for this...)
From the command line (low resolution, possibly inaccurate): echo %date% %time%
Programmatically: QueryPerformanceCounter. http://msdn.microsoft.com/en-us/library/ms644904(v=vs.85).aspx
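A minimal C sketch of that approach, timing a child process started with system() (using the do_my_stuff.exe placeholder from the batch example below; error handling omitted):

#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

int main(void)
{
    LARGE_INTEGER freq, start, stop;

    QueryPerformanceFrequency(&freq);   /* counter ticks per second */
    QueryPerformanceCounter(&start);
    system("do_my_stuff.exe");          /* program being benchmarked */
    QueryPerformanceCounter(&stop);

    printf("elapsed: %.6f s\n",
           (double)(stop.QuadPart - start.QuadPart) / (double)freq.QuadPart);
    return 0;
}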
If you want something of the order of millisecond accuracy (comparable to what the Linux/Unix time would give you), then timeGetTime() is what you need. It returns the number of milliseconds since the system was booted. Include mmsystem.h and link against winmm.lib. However, this only gives you a time value: you'd either need to wrap a system() call between two readings, or dump the start time to a file on the first call and read it back on the second.
More pragmatic solutions, which may be more useful depending on your circumstances:
Write a batch script that calls the program you wish to benchmark and wraps it so that it writes timestamps to a file:
echo start %time% >> log.txt
do_my_stuff.exe
echo stop %time% >> log.txt
and then use a tool such as the excellent LogExpert to inspect the timestamps.
Install the Cygwin tools and use the time command that comes with them. If you only need to do this on your own machine, and the benchmark program doesn't require complex setup (command-line parameters, environment variables, etc.), this may be the easiest approach.
I use the 'time' utility on Windows too. It comes with MinGW + MSYS.
I have a program (jhead) that compiles with very few tweaks for both Windows and generic Unix variants. From time to time, Windows users ask if it can be modified to also set the "creation date/time" of the files, but I don't see a way to do this with the POSIX API. What I'm currently doing is:
{
    /* utime() and struct utimbuf are declared in <utime.h> */
    struct utimbuf mtime;
    mtime.actime = NewUnixTime;   /* last access time */
    mtime.modtime = NewUnixTime;  /* last modification time */
    utime(FileName, &mtime);
}
Ideally, struct utimbuf would just have a creation time field, but it doesn't. It strikes me that it would take a lot of Windows-specific, non-portable code to change the creation time under Windows. Is there another POSIX way of doing this? Any suggestions?
POSIX only recognizes three different file times:
atime (access time): The last time the file was read
mtime (modification time): The last time the file was written
ctime (attribute change time): The last time the file's metadata was modified
Any other file times that may exist in the underlying OS require OS-specific API calls in order to be modified.
And don't worry about creating non-portable code; only these times really exist under most *nix variants.
The Win32 API for this isn't really all that bad, as Windows APIs go: https://msdn.microsoft.com/en-us/library/windows/desktop/ms724933%28v=vs.85%29.aspx . The trickiest part is working out how many seconds Windows thinks elapsed between 1 January 1601 and 1 January 1970; the rest is straightforward.
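For reference, that offset is 11644473600 seconds, and FILETIME counts 100-nanosecond intervals since 1 January 1601. A hedged C sketch of the conversion (the function name is mine, not jhead's; error handling trimmed):

#include <windows.h>
#include <time.h>

/* Set a file's creation time from a Unix time_t. */
static int set_creation_time(const char *path, time_t unix_time)
{
    /* The Unix epoch is 11644473600 s after the FILETIME epoch
       (1601-01-01), and FILETIME ticks are 100 ns each. */
    ULONGLONG ticks = ((ULONGLONG)unix_time + 11644473600ULL) * 10000000ULL;
    FILETIME ft;
    HANDLE h;
    BOOL ok;

    ft.dwLowDateTime  = (DWORD)ticks;
    ft.dwHighDateTime = (DWORD)(ticks >> 32);

    h = CreateFileA(path, FILE_WRITE_ATTRIBUTES, 0, NULL,
                    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;
    /* Second argument is the creation time; pass NULL to leave the
       access and modification times untouched. */
    ok = SetFileTime(h, &ft, NULL, NULL);
    CloseHandle(h);
    return ok ? 0 : -1;
}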