How to represent dates before the epoch as a UNIX timestamp

It occurred to me that I'm not aware of a mechanism to store dates before January 1, 1970 as Unix timestamps. Since that date is the Unix "epoch", this isn't much of a surprise.
But, even though it's not designed for that, I still wish to store dates in the far past in Unix format. I need this for reasons.
So my question is: how would one go about making Unix timestamps contain "invalid" but still working dates? Would storing a negative number of seconds work? Can we even store negative numbers of seconds in a Unix timestamp? I mean, isn't it unsigned?
Also, if I'm correct, I could only store dates as far back as 1901-12-13 20:45:52. Could this be extended any further back in history by any means?

Unix time is traditionally a 32-bit whole number of seconds from the first moment of 1970 in UTC, the epoch being 1 January 1970 00:00:00 UTC. That gives a range of about 136 years, with roughly half on either side of the epoch. Negative numbers are earlier, zero is the epoch, and positive numbers are later. For a signed 32-bit integer, the values range from 1901-12-13 20:45:52 UTC to 2038-01-19 03:14:07 UTC.
This is not written in stone. Well, it is written, but in a bunch of different stones. Older ones say 32-bit, newer ones 64-bit. Some specifications say that the meaning is "implementation-defined". Some Unix systems use an unsigned integer, which extends only into the future past the epoch, but the usual practice has been a signed number. Some use a float rather than an integer. For details, see the Wikipedia article on Unix time, and this Question.
So, basically, your question has no single answer. You have to know the context: your programming language (standard C, other C, Java, etc.), environment (POSIX-compliant or not), particular software library, database store, or application.
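For example, in a language whose date-time library models time as seconds from the 1970 epoch, a pre-1970 instant is simply a negative count. A minimal Python sketch (the names here are only for illustration):
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# The range of a signed 32-bit count of seconds:
print(EPOCH + timedelta(seconds=-(2**31)))   # 1901-12-13 20:45:52+00:00
print(EPOCH + timedelta(seconds=2**31 - 1))  # 2038-01-19 03:14:07+00:00

# A pre-1970 instant is just a negative number of seconds:
moon_landing = datetime(1969, 7, 20, 20, 17, tzinfo=timezone.utc)
print((moon_landing - EPOCH).total_seconds())  # -14182980.0
# .timestamp() returns the same value where the platform accepts negative counts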
Avoid Count-From-Epoch
Add to this lack of specificity the fact that a couple dozen other epochs have been used by various software systems, some extremely popular and common. Examples include January 1, 1601 for NTFS file system & COBOL, January 1, 1980 for various FAT file systems, January 1, 2001 for Apple Cocoa, and January 0, 1900 for Excel & Lotus 1-2-3 spreadsheets.
Further add the fact that different granularities of count have been used. Besides whole seconds, some systems use milliseconds, microseconds, or nanoseconds.
I recommend against tracking date-time as a count-from-epoch. Instead use specific data types where available in your programming language or database.
ISO 8601
When data types are not available, or when exchanging data, follow the ISO 8601 standard which defines sensible string formats for various kinds of date-time values.
Date
2015-07-29
A date-time with an offset from UTC (Z is zero/"Zulu", meaning UTC); note the padding zero on the offset in the second example
2015-07-29T14:59:08Z
2001-02-13T12:34:56.123+05:30
Week (with or without day of week)
2015-W31
2015-W31-3
Ordinal date (day-of-year)
2015-210
Interval
"2007-03-01T13:00:00Z/2008-05-11T15:30:00Z"
Duration (format of PnYnMnDTnHnMnS)
P3Y6M4DT12H30M5S = "period of three years, six months, four days, twelve hours, thirty minutes, and five seconds"
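As a quick illustration, several of the formats above can be produced directly by a date-time library; a Python sketch (any language with a decent date-time API can do the same):
from datetime import datetime, timezone

dt = datetime(2015, 7, 29, 14, 59, 8, tzinfo=timezone.utc)

print(dt.date().isoformat())       # 2015-07-29
print(dt.isoformat())              # 2015-07-29T14:59:08+00:00 (offset form of Z)
print(dt.strftime("%Y-%j"))        # 2015-210 (ordinal date)

year, week, weekday = dt.isocalendar()
print(f"{year}-W{week:02d}-{weekday}")  # 2015-W31-3 (ISO week date)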
Search StackOverflow.com for many more Questions and Answers on these topics.

Related

Is there a one to one and onto relation between ISO-8601 UTC and Unix Timestamp?

Time Formats
A point in time is often represented either as Unix time or as a human-readable ISO 8601 string in UTC.
For example:
Unix Time
Seconds since the epoch (a Unix timestamp), expressed in seconds or milliseconds:
1529325705
1529325705000
ISO 8601 Date
2018-06-18T15:41:45+00:00
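Converting between the two representations is mechanical in most languages; a quick Python sketch using the ISO 8601 example above:
from datetime import datetime, timezone

# ISO 8601 (UTC) -> Unix seconds and milliseconds
dt = datetime.fromisoformat("2018-06-18T15:41:45+00:00")
seconds = int(dt.timestamp())   # 1529336505
millis = seconds * 1000         # 1529336505000

# Unix seconds -> ISO 8601 in UTC
print(datetime.fromtimestamp(seconds, tz=timezone.utc).isoformat())
# 2018-06-18T15:41:45+00:00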
My question
Is there a one-to-one and onto relationship between the two? In other words, is there a point in time with a single representation in one format, and more than one, or zero, representations in the other?
Yes, it is possible to find such a point in time. From the Wikipedia article on Unix time:
Every day is treated as if it contains exactly 86400 seconds, so leap seconds are not applied to seconds since the Epoch.
That means that the leap seconds themselves cannot be represented in Unix time.
For example, the most recent leap second (at the time of writing) occurred at the end of 2016, so 2016-12-31T23:59:60+00:00 is a valid ISO 8601 timestamp. However, the Unix timestamp for the second before it, 23:59:59, is 1483228799, and the second after it, 00:00:00 on January 1, 2017, is 1483228800, so there is no Unix timestamp that represents the leap second.
In practice, this is probably not a problem for you; there have been only 27 leap seconds since they were introduced in 1972.
It might be worth mentioning that most software implementations of ISO 8601 do not take leap seconds into account either, but will do something else if asked to parse "2016-12-31T23:59:60+00:00". The System.DateTime class in .NET throws an exception, while it is also conceivable that a library would return 2017-01-01 00:00:00.
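As an illustration of how differently libraries behave, here is what two Python standard-library routines do with that same leap-second timestamp (Python is used only as an example; other languages differ in their own ways):
import calendar
import time
from datetime import datetime

# datetime rejects the leap second outright...
try:
    datetime.fromisoformat("2016-12-31T23:59:60+00:00")
except ValueError as e:
    print("datetime:", e)

# ...while time.strptime accepts tm_sec == 60, and calendar.timegm then
# silently maps it onto the same number as 2017-01-01T00:00:00Z.
st = time.strptime("2016-12-31T23:59:60", "%Y-%m-%dT%H:%M:%S")
print(calendar.timegm(st))   # 1483228800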
No. There is a nice correspondence between the two, but the relationship is one-to-many, and strictly speaking there may not even be an exact Unix millisecond for a given ISO date-time string. Some issues are:
There are some freedoms in the ISO 8601 format, so the same Unix millisecond may be written in several ways, even when we require the time to be in UTC (offset zero).
Seconds and fractions of a second are optional, and there may be a varying number of decimals on the seconds. So a millisecond value of 1 529 381 160 000 could, for example, be written as any of the following (all shown in the sketch after this list):
2018-06-19T04:06:00.000000000Z
2018-06-19T04:06:00.00Z
2018-06-19T04:06:00Z
2018-06-19T04:06Z
An offset of 0 would normally be written as Z, but may also be written as in the question, +00:00. I think the forms +00 and +0000 are OK too (forms with a minus are not).
Since there may be more than three decimals on the seconds in ISO 8601, an exact Unix millisecond may not exist for a given string. So you will have to accept truncation (or rounding) to convert to Unix time. Of course, the error will be greater still if you convert to Unix seconds rather than milliseconds.
As Thomas Lycken noted, leap seconds can be represented in ISO 8601, but not in Unix time.
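To see the many-to-one mapping concretely, here is a small sketch (Python 3.11 or newer, whose fromisoformat accepts all of these variants; earlier versions and other languages need their own parsers):
from datetime import datetime

forms = [
    "2018-06-19T04:06:00.000000000Z",
    "2018-06-19T04:06:00.00Z",
    "2018-06-19T04:06:00Z",
    "2018-06-19T04:06Z",
    "2018-06-19T04:06:00+00:00",
]

for s in forms:
    dt = datetime.fromisoformat(s)          # requires Python 3.11+ for these variants
    print(s, "->", int(dt.timestamp() * 1000))
# every line prints ... -> 1529381160000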
In other words, is there a point in time with a single representation in one format, and more than one, or zero, representations in the other?
No. The wall-clock time you observe depends on your geographic location (i.e. your time zone), whereas the UNIX timestamp is a way to track time as a running total of seconds. This count starts at the Unix epoch, January 1st, 1970, in UTC.
From Unix TimeStamp:
It should also be pointed out that this point in time technically does not change no matter where you are located on the globe.

Unix time and leap seconds

Regarding Unix (POSIX) time, Wikipedia says:
Due to its handling of leap seconds, it is neither a linear representation of time nor a true representation of UTC.
But the Unix date command does not actually seem to be aware of them:
$ date -d @867715199 --utc
Mon Jun 30 23:59:59 UTC 1997
$ date -d @867715200 --utc
Tue Jul 1 00:00:00 UTC 1997
While there should be a leap second there at Mon Jun 30 23:59:60 UTC 1997.
Does this mean that only the date command ignores leap seconds, while the concept of Unix time doesn't?
The number of seconds per day is fixed with Unix timestamps:
The Unix time number is zero at the Unix epoch, and increases by exactly 86400 per day since the epoch.
So it cannot represent leap seconds. The OS will slow down the clock to accommodate this. Leap seconds simply do not exist as far as Unix timestamps are concerned.
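A minimal sketch of what "exactly 86400 per day" means in practice (Python, with timezone-aware datetimes so no local time zone is involved):
from datetime import datetime, timezone

# A leap second was inserted at the end of 1997-06-30, yet in Unix time
# the two midnights are still exactly 86400 apart.
before = datetime(1997, 6, 30, tzinfo=timezone.utc).timestamp()
after = datetime(1997, 7, 1, tzinfo=timezone.utc).timestamp()
print(after - before)   # 86400.0
print(int(after))       # 867715200, as in the date commands above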
Unix time is easy to work with, but some timestamps are not real times, and some timestamps are not unique times.
That is, there are some duplicate timestamps representing two different seconds in time, because in Unix time the sixtieth second might have to repeat itself (as there can't be a sixty-first second). Theoretically, there could also be gaps in the future, because the sixtieth second doesn't have to exist, although no leap seconds that skip a second have been issued so far.
Rationale for Unix time: it's defined so that it's easy to work with. Adding support for leap seconds to the standard libraries is very tricky. For example, say you want to represent 1 Jan 2050 in a database. No one on Earth knows how many seconds away that date is in UTC! The date can't be stored as a UTC timestamp, because the IERS doesn't know how many leap seconds we'll have to add over the next decades (they're as good as random). So how can a programmer do date arithmetic when the length of time that will elapse between any two future dates isn't known until a year or two before? Unix time is simple: we know the timestamp of 1 Jan 2050 already (namely, 80 years times the number of seconds in a year). UTC is extremely hard to work with all year round, whereas Unix time is only hard to work with in the instant a leap second occurs.
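For example (a Python sketch; the printed value is well-defined today only because POSIX time ignores leap seconds):
from datetime import datetime, timezone

# The Unix timestamp of a far-future date is already fixed today;
# no leap-second table is consulted anywhere.
print(int(datetime(2050, 1, 1, tzinfo=timezone.utc).timestamp()))   # 2524608000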
For what it's worth, I've never met a programmer who agrees with leap seconds. They should clearly be abolished.
There is a lot of discussion here and elsewhere about leap seconds, but it isn't a complicated issue, because it doesn't have anything to do with UTC, or GMT, or UT1, or TAI, or any other time standard. POSIX (Unix) time is, by definition, that which is specified by the IEEE Std 1003.1 "POSIX" standard, available here.
The standard is unambiguous: POSIX time does not include leap seconds.
Coordinated Universal Time (UTC) includes leap seconds. However, in POSIX time (seconds since the Epoch), leap seconds are ignored (not applied) to provide an easy and compatible method of computing time differences. Broken-down POSIX time is therefore not necessarily UTC, despite its appearance.
The standard goes into significant detail unambiguously stating that POSIX time does not include leap seconds, in particular:
It is a practical impossibility to mandate that a conforming implementation must have a fixed relationship to any particular official clock (consider isolated systems, or systems performing "reruns" by setting the clock to some arbitrary time).
Since leap seconds are decided by committee, it is not just a "bad idea" to include leap seconds in POSIX time, it is impossible given that the standard allows for conforming implementations which do not have network access.
Elsewhere in this question, @Pacerier has said that POSIX time does include leap seconds, and that each POSIX time may correspond to more than one UTC time. While this is certainly one possible interpretation of a POSIX timestamp, it is by no means specified by the standard. His arguments largely amount to weasel words that do not apply to the standard, which defines POSIX time.
Now, things get complicated. As specified by the standard, POSIX time may not be equivalent to UTC time:
Broken-down POSIX time is therefore not necessarily UTC, despite its appearance.
However, in practice, it is. In order to understand the issue, you have to understand time standards. GMT and UT1 are based on the astronomical position of the Earth in the universe. TAI is based on the actual amount of time that passes in the universe as measured by physical (atomic) reactions. In TAI, each second is an "SI second", all of which are exactly the same length. In UTC, each second is an SI second, but leap seconds are added as necessary to readjust the clock back to within 0.9 seconds of GMT/UT1. The GMT and UT1 time standards are defined by empirical measurements of the Earth's position and movement in the universe, and these empirical measurements cannot be predicted by any means (neither scientific theory nor approximation). As such, leap seconds are also unpredictable.
Now, the POSIX standard also specifies that the intention is for all POSIX timestamps to be interoperable (mean the same thing) in different implementations. One solution is for everyone to agree that each POSIX second is one SI second, in which case POSIX time is equivalent to TAI (with the specified epoch), and nobody need contact anyone except for their atomic clock. We didn't do that, however, probably because we wanted POSIX timestamps to be UTC timestamps.
Using an apparent loophole in the POSIX standard, implementations intentionally slow down or speed up seconds -- so that POSIX time no longer uses SI seconds -- in order to remain in sync with UTC time. Reading the standard it is clear this was not what was intended, because this cannot be done with isolated systems, which therefore cannot interoperate with other machines (their timestamps, without leap seconds, mean something different for other machines, with leap seconds). Read:
[...] it is important that the interpretation of time names and seconds since the Epoch values be consistent across conforming systems; that is, it is important that all conforming systems interpret "536457599 seconds since the Epoch" as 59 seconds, 59 minutes, 23 hours 31 December 1986, regardless of the accuracy of the system's idea of the current time. The expression is given to ensure a consistent interpretation, not to attempt to specify the calendar. [...] This unspecified second is nominally equal to an International System (SI) second in duration.
The "loophole" allowing this behavior:
Note that as a practical consequence of this, the length of a second as measured by some external standard is not specified.
So, implementations abuse this freedom by intentionally changing it to something which cannot, by definition, be interoperable among isolated or nonparticipating systems. Alternatively, the implementation may simply repeat POSIX times as if no time had passed. See this Unix StackExchange answer for details on all modern implementations.
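That mandated interpretation is easy to check against any leap-second-ignoring implementation; for example, in Python:
from datetime import datetime, timezone

# The standard's example: 536457599 seconds since the Epoch must be
# interpreted as 1986-12-31 23:59:59, regardless of leap seconds.
print(datetime.fromtimestamp(536457599, tz=timezone.utc))
# 1986-12-31 23:59:59+00:00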
Phew, that was confusing alright... A real brain teaser!
Since both of the other answers contain lots of misleading information, I'll throw this in.
Thomas is right that the number of Unix epoch timestamp seconds per day is fixed. What this means is that on days where there is a leap second, the second right before midnight (the 61st second of the UTC minute before midnight) is given the same timestamp as the previous second.
That timestamp is "replayed", if you will, so the same Unix timestamp is used for two real-world seconds. This also means that if you're working with fractional Unix timestamps, the whole second will repeat:
X86399.0, X86399.5, X86400.0, X86400.5, X86400.0, X86400.5, then X86401.0 (X standing in for the unchanging leading part of the timestamp).
So unix time can't unambiguously represent leap seconds - the leap second timestamp is also the timestamp for the previous real-world second.

MongoDB's ISODate() vs. UNIX Timestamp

Is there any sort of advantage (performance, indexes, size, etc) to storing dates in MongoDB as an ISODate() vs. storing as a regular UNIX timestamp?
The amount of overhead of an ISODate compared to a time_t is trivial compared to the advantages of the former.
An ISO 8601 format date is human readable, it can be used to express dates prior to January 1, 1970, and most importantly, it isn't prey to the Y2038 problem.
This last bit can't be stressed enough. In 1960, it seemed ludicrous that wasting an octet or two on a century number could yield any benefit, as the turn of the century was impossibly far off. We know how wrong that turned out to be. The year 2038 will be here sooner than you expect, and a 32-bit time_t is already insufficient for representing, for example, the schedule of payments on a 30-year contract.
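To make the Y2038 point concrete, a small Python sketch (assuming the usual signed 32-bit time_t limit of 2**31 - 1 seconds):
from datetime import datetime, timezone

t = int(datetime(2040, 1, 1, tzinfo=timezone.utc).timestamp())
print(t)              # 2208988800
print(t > 2**31 - 1)  # True: this no longer fits in a signed 32-bit time_t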
MongoDB's built-in Date type is very similar to a Unix timestamp stored in a time_t. The only difference is that Dates are a 64-bit field storing milliseconds since Jan 1, 1970, rather than a 32-bit field storing seconds since the same epoch. The only downside is that current releases treat the count as unsigned, so it can't handle dates before 1970 correctly. This will be fixed in MongoDB 2.0, scheduled for release in about a month.
A possible point of confusion is the name "ISODate". It is just a helper function in the shell that wraps JavaScript's horrible Date constructor. If you call either "ISODate()" or "new Date()" you will get back the exact same Date object; we just changed how it prints. You are still free to use normal ISO date strings or time_t ints without using our constructors, but you won't get nice Date objects back in your language of choice.

What is the range of times that Ruby's Time class can represent?

Times farthest in the past and farthest in the future that can be represented?
Is it absolute moments in time, or distance in time from the present moment?
I couldn't find it in the docs for the Time class.
Does it depend on the system? If so, how can I access it in my code?
UPDATE
After some experimentation, I found that it's from about 108 years in the past to about 29 years in the future. Still wondering if it's system dependent.
DateTime (in the Date library, included with Ruby) goes back to 1 January 4713 BCE and farther into the future than you are likely to need.
"Time is stored internally as the number of seconds and microseconds since the epoch, January 1, 1970 00:00 UTC. On some operating systems, this offset is allowed to be negative."
So clearly it's an absolute time, not relative to now.
It sounds like there is a C time implementation under the covers (integers can be signed or unsigned depending on OS / processor / compiler): it means the bounds are system-dependent.
But if you need to handle dates that far in the past or future, I guess you won't really need the "time of day" part and can use a Date instead!?

Does the windows FILETIME structure include leap seconds?

The FILETIME structure counts from January 1 1601 (presumably the start of that day) according to the Microsoft documentation, but does this include leap seconds?
The question shouldn't be if FILETIME includes leap seconds.
It should be:
Do the people, functions, and libraries that interpret a FILETIME (e.g. FileTimeToSystemTime) include leap seconds when counting the duration?
The simple answer is "no". FileTimeToSystemTime returns seconds as 0..59.
The simpler answer is: "of course not, how could it?".
My Windows 2000 machine doesn't know that there were 2 leap seconds added in the decade since it was released. Any interpretation it makes of a FILETIME is wrong.
Finally, rather than relying on logic, we can determine the answer to the poster's question by direct experimental observation:
var
  systemTime: TSystemTime;
  fileTime: TFileTime;
begin
  //Construct a system time for 12/31/2008 11:59:59 pm
  ZeroMemory(@systemTime, SizeOf(systemTime));
  systemTime.wYear := 2008;
  systemTime.wMonth := 12;
  systemTime.wDay := 31;
  systemTime.wHour := 23;
  systemTime.wMinute := 59;
  systemTime.wSecond := 59;

  //Convert it to a file time
  SystemTimeToFileTime(systemTime, {var}fileTime);

  //There was a leap second at 12/31/2008 11:59:60 pm.
  //Add one second to our file time to land on the leap second.
  fileTime.dwLowDateTime := fileTime.dwLowDateTime + 10000000; //10,000,000 * 100ns = 1s

  //Convert the file time, sitting on a leap second, back to a system time
  FileTimeToSystemTime(fileTime, {var}systemTime);

  //And now display the system time
  ShowMessage(DateTimeToStr(SystemTimeToDateTime(systemTime)));
end;
Adding one second to
12/31/2008 11:59:59pm
gives
1/1/2009 12:00:00am
rather than
12/31/2008 11:59:60pm
Q.E.D.
Original poster might not like it, but god intentionally rigged it so that a year is not evenly divisible by a day. He did it just to screw up programmers.
There can be no single answer to this question without first deciding: what is the Windows FILETIME actually counting? The Microsoft docs say it counts 100-nanosecond intervals since 1601 UTC, but this is problematic.
No form of internationally coordinated time existed prior to the year 1960. The name UTC itself does not occur in any literature prior to 1964. The name UTC as an official designation did not exist until 1970. But it gets worse. The Royal Greenwich Observatory was not established until 1676, so even trying to interpret the FILETIME as GMT has no clear meaning, and it was only around then that pendulum clocks with accurate escapements began to give accuracies of 1 second.
If FILETIME is interpreted as mean solar seconds then the number of leap seconds since 1601 is zero, for UT has no leap seconds. If FILETIME is interpreted as if there had been atomic chronometers then the number of leap seconds since 1601 is about -60 (that's negative 60 leap seconds).
That is ancient history; what about the era since atomic chronometers? It is no better, because national governments have not made the distinction between mean solar seconds and SI seconds. For a decade the ITU-R has been discussing abandoning leap seconds, but it has not achieved international consensus. Part of the reason for that can be seen in the JavaScript on this page (also see the delta-T link on that page for plots of the ancient history). Because national governments have not made a clear distinction, any attempt to define the count of seconds since 1972 runs the risk of being invalid according to the laws of some jurisdiction. The delegates to the ITU-R are aware of this complexity, as are the folks on the POSIX committee. Until the diplomatic issues are worked out, until national governments and international standards make a clear distinction and choice between mean solar and SI seconds, there is little hope that the computer standards can follow suit.
Here's some more info about why that particular date was chosen.
The FILETIME structure records time in the form of 100-nanosecond intervals since January 1, 1601. Why was that date chosen?
The Gregorian calendar operates on a 400-year cycle, and 1601 is the first year of the cycle that was active at the time Windows NT was being designed. In other words, it was chosen to make the math come out nicely.
I actually have the email from Dave Cutler confirming this.
The answer to this question used to be no, but has changed to: YES, sort of, sometimes...
Per the Windows Networking team blog article:
Starting in Server 2019 and the Windows 10 October [2018] update time APIs will now take into account all leap seconds the Operating System is aware of when it translates FILETIME to SystemTime.
As there have been no leap seconds issued since the time of this feature being added, the operating system is still unaware of any leap seconds. However, when the next official leap second makes its way into the world, Windows computers that have this new feature enabled will keep track of it, and thus FILETIME values will be offset by the number of leap seconds on the computer at the time they are interpreted.
The blog post goes on to describe:
No change is made to FILETIME. It still represents the number of 100 ns intervals since the start of the epoch. What has changed is the interpretation of that number when it is converted to SYSTEMTIME and back. Here is a list of affected APIs:
GetSystemTime
GetLocalTime
FileTimeToSystemTime
FileTimeToLocalTime
SystemTimeToFileTime
SetSystemTime
SetLocalTime
Previous to this release, SYSTEMTIME had valid values for wSecond between 0 and 59. SYSTEMTIME has now been updated to allow a value of 60, provided the year, month, and day represent a day on which a leap second is valid.
...
In order to receive the 60th second in the SYSTEMTIME structure, a process must explicitly opt in.
Note that the opt-in applies to the behavior within the functions listed on how a FILETIME is mapped to a SYSTEMTIME. Regardless of whether you opt-in or not, the operating system will still offset FILETIME values according to the leap seconds it is aware of.
With regard to compatibility, the article states:
Applications that rely on 3rd party frameworks should ensure their framework’s implementation on Windows is also calling into the correct APIs to calculate the correct time, or else the application will have the wrong time reported.
It also provides a link to an earlier post which describes how to disable the entire feature, as follows:
... you can revert to the prior operating system behavior and disable leap seconds across the board by adding the following registry key:
HKLM:\SYSTEM\CurrentControlSet\Control\LeapSecondInformation
Type: "REG_DWORD"
Name: Enabled
Value: 0 Disables the system-wide setting
Value: 1 Enables the system-wide setting
Next, restart your system.
Leap seconds are added unpredictably by the IERS. 27 seconds have been added since 1972, when UTC and leap seconds were defined. As Wikipedia says, "because the Earth's rotation rate is unpredictable in the long term, it is not possible to predict the need for them more than six months in advance."
Since you'd have to keep a history of when leap seconds were inserted, and keep updating the OS to keep a reference of when they had been inserted, and the difference is so small, it's fair not to expect a general-purpose OS to compensate for leap seconds.
In addition, regular clock drift, of the simple electronic clock in your PC compared to UTC, is so much larger than the compensation required for leap seconds. If you need the kind of precision to compensate for leap seconds, you shouldn't use the highly-inaccurate PC clock.
According to this comment, Windows is totally unaware of leap seconds. If you add 24 * 60 * 60 seconds to a FILETIME that represents 1:39:45 today, you get a FILETIME that represents 1:39:45 tomorrow, no matter what.
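A sketch of that arithmetic (Python, using the standard 11644473600-second offset between the 1601 and 1970 epochs; the helper names are just for illustration and this is not a Windows API):
# Offset between the FILETIME epoch (1601-01-01 UTC) and the Unix epoch
# (1970-01-01 UTC), in 100-nanosecond ticks: 369 years, 89 of them leap.
EPOCH_DIFF_TICKS = 116_444_736_000_000_000
TICKS_PER_SECOND = 10_000_000

def filetime_to_unix(filetime: int) -> float:
    return (filetime - EPOCH_DIFF_TICKS) / TICKS_PER_SECOND

def unix_to_filetime(unix_seconds: float) -> int:
    return int(unix_seconds * TICKS_PER_SECOND) + EPOCH_DIFF_TICKS

# Adding a "day" is just adding 86400 seconds' worth of ticks; no leap
# second table is consulted anywhere in this arithmetic.
ft = unix_to_filetime(867_715_199)                        # 1997-06-30 23:59:59 UTC
print(filetime_to_unix(ft + 86_400 * TICKS_PER_SECOND))   # 867801599.0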
A very crude summary:
UTC = (Atomic Time) - (Leap Seconds) ~~ (Mean Solar Time)
The MS documentation says, specifically, "UTC", and so should include the leap seconds. As always with MS, your mileage may vary.
