How do I model "relative" time in a database?

Clarification: I am not trying to calculate friendly times (e.g., "8 seconds ago") by using a timestamp and the current time.
I need to create a timeline of events in my data model, but where these events are only relative to each other. For example, I have events A, B, and C. They happen in order, so it may be that B occurs 20 seconds after A, and that C occurs 20 years after B.
I don't care about the unit of time. For my purpose, there is no time, just relativity.
I intend to model this like a linked list, where each event is a node:
Event
id
name
prev_event
next_event
Is this the most efficient way to model relative events?

All time recorded by computers is relative time: nominally, it is stored as an offset (in seconds or milliseconds, for example) from an epoch. On Unix, that epoch is 1970-01-01 00:00:00 UTC.
If you store ordinary everyday timestamp values, you already have relative time between events as long as they are all sequential. You just need to subtract consecutive timestamps to get the intervals between them. What you are calling relative times are actually intervals.
You can use whatever resolution you need. Milliseconds is what most things use; if you are sampling at sub-millisecond resolution, you would use nanoseconds.
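As a rough illustration of the subtraction described above, here is a minimal Java sketch. The class name EventIntervals and the sample timestamps are made up for the example; it assumes the timestamps are already sorted.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

class EventIntervals {
    /** Returns the elapsed time between each pair of consecutive timestamps. */
    static List<Duration> intervals(List<Instant> sortedTimestamps) {
        List<Duration> result = new ArrayList<>();
        for (int i = 1; i < sortedTimestamps.size(); i++) {
            result.add(Duration.between(sortedTimestamps.get(i - 1), sortedTimestamps.get(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical events A, B, and C; only their spacing matters.
        Instant a = Instant.parse("2000-01-01T00:00:00Z");
        Instant b = a.plusSeconds(20);
        Instant c = b.plusSeconds(3600);
        System.out.println(intervals(List.of(a, b, c))); // [PT20S, PT1H]
    }
}
```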

I don't think you need to link to previous and next event, why not just use a timestamp and order by the timestamp?
If you can have multiple, simultaneous event timelines, then you would use some kind of identifier to identify the timeline (int, guid, whatever) and key that in with the timestamp. No id is even necessary unless you need to refer to an event by a single number.
Something like this:
Event
TimeLineID (key)
datetime (key)
Name
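To show how the composite key behaves, here is a small in-memory Java sketch that mirrors what an ORDER BY on the timestamp within one timeline would return. The Event class and its field names are illustrative only, not a prescribed schema.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class TimelineEvents {
    /** One row of the hypothetical Event table: (timelineId, at) is the key. */
    static final class Event {
        final int timelineId;
        final Instant at;
        final String name;

        Event(int timelineId, Instant at, String name) {
            this.timelineId = timelineId;
            this.at = at;
            this.name = name;
        }
    }

    /** Names of the events on one timeline, earliest first. */
    static List<String> namesInOrder(List<Event> events, int timelineId) {
        return events.stream()
                .filter(e -> e.timelineId == timelineId)
                .sorted(Comparator.comparing((Event e) -> e.at))
                .map(e -> e.name)
                .collect(Collectors.toList());
    }
}
```

No prev_event or next_event pointers are needed: the neighbours of an event are simply the rows immediately before and after it in this ordering.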

Related

Data Structure for time scheduling?

I am in need of a data structure that can properly model blocks of time, like appointments. For example, each appointment has a time it starts on, and a time it ends on. I need to have extremely fast access to things like:
Does a specified start time and end time conflict with an existing event?
What events exist from a specified start time and end time?
Ideally the data structure could model something like the image below.
I thought of using a binary search tree (ex. Java's TreeMap) but I can't think of what key or value I would use. Is there a single data structure or combination of data structures that is strong at modeling this?
A Guava Table would probably work for your use case, depending on what it is you want to actually index on.
A naive approach would be to index by name, then time of day, and then have a value whether or not this particular block is occupied by that particular person.
This would make the instantiation of the object become...
Table<String, LocalDateTime, Boolean> calendar = TreeBasedTable.create();
You would populate each individual's allocation at a given interval. You get to set what that interval is: 15-minute, 30-minute, or 1-hour periods (as defined by the table).
To find if a time is occupied, you look for the closest interval to the time you want to schedule a person. You'd use the column() method for this to see if there's any availability, or you could get specific and get a row for the individual. This means you'd have to pull two values; the start time you want, and however many minutes your interval is out. That part I'll have to leave as an exercise for the reader.
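Separately from the table approach, the TreeMap idea mentioned in the question can also be made to work. The sketch below is only one possible design: it keys a TreeMap by start time with the end time as the value, treats appointments as half-open ranges [start, end), and assumes that stored appointments never overlap each other. The class name AppointmentBook and its methods are hypothetical.

```java
import java.time.LocalDateTime;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class AppointmentBook {
    // start -> end; invariant: stored appointments never overlap each other.
    private final TreeMap<LocalDateTime, LocalDateTime> byStart = new TreeMap<>();

    /** True if [start, end) overlaps any stored appointment. */
    boolean conflicts(LocalDateTime start, LocalDateTime end) {
        // Because stored appointments do not overlap, the one with the latest
        // start before `end` also has the latest end, so checking it is enough.
        Map.Entry<LocalDateTime, LocalDateTime> candidate = byStart.lowerEntry(end);
        return candidate != null && candidate.getValue().isAfter(start);
    }

    /** Stores the appointment and returns true, or returns false on a conflict. */
    boolean add(LocalDateTime start, LocalDateTime end) {
        if (!start.isBefore(end)) {
            throw new IllegalArgumentException("start must be before end");
        }
        if (conflicts(start, end)) {
            return false;
        }
        byStart.put(start, end);
        return true;
    }

    /** All stored appointments that overlap [start, end), keyed by start time. */
    NavigableMap<LocalDateTime, LocalDateTime> overlapping(LocalDateTime start, LocalDateTime end) {
        LocalDateTime from = start;
        // An appointment that began earlier may still be running at `start`.
        Map.Entry<LocalDateTime, LocalDateTime> earlier = byStart.lowerEntry(start);
        if (earlier != null && earlier.getValue().isAfter(start)) {
            from = earlier.getKey();
        }
        return byStart.subMap(from, true, end, false);
    }
}
```

Both queries cost O(log n) lookups in the tree. If stored appointments may overlap one another (several people, several rooms), this shortcut no longer holds and a different structure is needed.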

How to store time series data in a list (or any other data structure) to get reasonable trends over a variety of horizons?

Say I want to store a forex rate trend in which, I receive two updates every second on average. But I don't want to store all updates against the timestamp over a day as the data would be huge. But I want to show every update in the last two minutes, every second update in the last 1 hour and so on with reducing frequencies over a day. Which algorithm/data structure is best for this?
You could use a circular buffer. But generally StackOverflow is not for questions like that.
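For reference, a circular buffer can be sketched in a few lines of Java. This is a minimal, hypothetical RingBuffer class, not a tuned implementation. One way to get the decreasing resolution described in the question would be to keep one buffer per horizon (for instance, 240 slots for the last two minutes at two updates per second) and to feed only every second update into the hourly buffer; those sizes are assumptions, not requirements.

```java
import java.util.ArrayList;
import java.util.List;

class RingBuffer {
    private final double[] values;
    private int next = 0; // index where the next value will be written
    private int size = 0; // number of valid entries, at most values.length

    RingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        values = new double[capacity];
    }

    /** Appends a value, overwriting the oldest one once the buffer is full. */
    void add(double value) {
        values[next] = value;
        next = (next + 1) % values.length;
        if (size < values.length) {
            size++;
        }
    }

    /** Returns the retained values, oldest first. */
    List<Double> snapshot() {
        List<Double> out = new ArrayList<>(size);
        int start = (next - size + values.length) % values.length;
        for (int i = 0; i < size; i++) {
            out.add(values[(start + i) % values.length]);
        }
        return out;
    }
}
```

Memory use is fixed by the capacity, so the amount of data retained per horizon never grows no matter how many updates arrive.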

Daylight Savings Time Gap/Overlap definitions? When to "correct" for them?

What is the definition of Daylight Savings Time 'Overlap' & 'Gap'?
I have a hazy understanding of them, so I'd like to confirm... What does it mean to be "within" either of them?
What does it mean to "correct" for DST Gap or DST Overlap? When does a time need correcting, and when does it not need correcting?
The above questions are language-agnostic, but an example of their application I have is:
When to call org.joda.time.LocalDateTime#correctDstTransition?
Correct date in case of DST overlap. The Date object created has exactly the same fields as this date-time, except when the time would be invalid due to a daylight savings gap. In that case, the time will be set to the earliest valid time after the gap. In the case of a daylight savings overlap, the earlier instant is selected.
Much of this is already explained in the DST tag wiki, but I will answer your specific questions.
What is the definition of Daylight Savings Time 'Overlap' & 'Gap'?
...
What does it mean to be "within" either of them?
When daylight saving time begins, the local time is advanced - usually by one hour. This creates a "gap" in the values of local time in that time zone.
For example, when DST starts in the United States, the clocks tick from 1:59 AM to 3:00 AM. Any local time value from 2:00 AM through 2:59 AM would be considered to be "within the gap".
Note that values in the gap are non-existent. They do not occur in the real world, unless a clock was not correctly advanced. In practice, one typically gets to a value within the gap by adding or subtracting an elapsed time value from another local time.
When daylight saving time ends, the local time is retracted by the same amount that was added when it began (again, usually 1 hour). This creates an "overlap" in the local time values of that time zone.
For example, when DST ends in the United States, the clocks tick from 1:59 AM back to 1:00 AM. Any local time value from 1:00 AM through 1:59 AM is ambiguous if there is no additional qualifying information.
To be "within the overlap" means that you have a value that is potentially ambiguous because it falls into this range.
Such values may belong to the daylight time occurrence (which comes first sequentially), or may belong to the standard time occurrence (which comes second sequentially).
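One way to test whether a local value is "within" a gap or an overlap, sketched here with Java 8's java.time rather than Joda-Time, is to count the valid UTC offsets for it: zero means a gap, two means an overlap. The class name DstClassifier and the 2021 US dates are just for illustration.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

class DstClassifier {
    /** Classifies a local date-time as "gap", "overlap", or "normal" in the given zone. */
    static String classify(LocalDateTime local, ZoneId zone) {
        int validOffsets = zone.getRules().getValidOffsets(local).size();
        if (validOffsets == 0) {
            return "gap";     // the local time never occurs
        }
        if (validOffsets == 2) {
            return "overlap"; // the local time occurs twice
        }
        return "normal";
    }

    public static void main(String[] args) {
        ZoneId newYork = ZoneId.of("America/New_York");
        System.out.println(classify(LocalDateTime.of(2021, 3, 14, 2, 30), newYork)); // gap
        System.out.println(classify(LocalDateTime.of(2021, 11, 7, 1, 30), newYork)); // overlap
        System.out.println(classify(LocalDateTime.of(2021, 7, 1, 12, 0), newYork));  // normal
    }
}
```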
What does it mean to "correct" for DST Gap or DST Overlap?
Correcting for the gap means to ensure that the local time value is valid by possibly moving it to a different value. There are various schemes in use for doing so, but the most common and most sensible is to advance the local time value by the amount of the gap.
For example, if you have a local time of 2:30 AM, and you determine it to occur on the day of the spring-forward transition in the United States, then it falls into the gap. Advance it to 3:30 AM.
This approach tends to work well because it simulates the act of a human manually advancing an analog clock, or rather, correcting with the idea that it had not been properly advanced.
Correcting for the overlap means to ensure that all local times are well qualified. Usually this is accomplished by assigning a time zone offset to all values.
In the case of a value that is not ambiguous, the offset is deterministic.
In the case of a value that falls within the overlap on the day of a fall-back transition, it often makes sense to choose the first of the two possible values (which will have the daylight time offset). This is because time moves in a forward direction. However, there are sometimes cases where it makes sense to use a different rule, so YMMV.
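These are also the rules that Java 8's java.time applies by default when a local value is attached to a zone, so they can be demonstrated with a short sketch. The class name DstCorrection and the 2021 US dates are illustrative; Joda-Time's behavior may differ in detail.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class DstCorrection {
    /** Gap: shifted later by the length of the gap. Overlap: the earlier offset is chosen. */
    static ZonedDateTime correct(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone);
    }

    /** Same as correct(), but picks the later (standard time) offset in an overlap. */
    static ZonedDateTime correctPreferLater(LocalDateTime local, ZoneId zone) {
        return local.atZone(zone).withLaterOffsetAtOverlap();
    }

    public static void main(String[] args) {
        ZoneId newYork = ZoneId.of("America/New_York");
        // 2:30 AM on the spring-forward day does not exist and becomes 3:30 AM.
        System.out.println(correct(LocalDateTime.of(2021, 3, 14, 2, 30), newYork));
        // 2021-03-14T03:30-04:00[America/New_York]

        // 1:30 AM on the fall-back day occurs twice; the daylight occurrence comes first.
        System.out.println(correct(LocalDateTime.of(2021, 11, 7, 1, 30), newYork));
        // 2021-11-07T01:30-04:00[America/New_York]
        System.out.println(correctPreferLater(LocalDateTime.of(2021, 11, 7, 1, 30), newYork));
        // 2021-11-07T01:30-05:00[America/New_York]
    }
}
```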
When does a time need correcting, and when does it not need correcting?
If you are attempting to work with time as an instantaneous value, such as determining the elapsed duration between two values, adding an elapsed time to a specific value, or when converting to UTC, then you need to correct for gaps and overlaps as they occur.
If you are only working with user input and output, always displaying the exact value a user gave you (and never using it for math or time zone conversions) then you do not need to correct for gaps and overlaps.
Also, if you are working with date-only values, or time-only values, then you should not be applying time zone information at all, and thus do not need to correct for gaps and overlaps.
Lastly, if you are working strictly with Coordinated Universal Time (UTC), which has no daylight saving time, then you do not need to correct for gaps and overlaps.
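To see why elapsed-time math needs the correction, consider this java.time sketch, which measures the same pair of local values with and without a time zone. The class name ElapsedAcrossDst is hypothetical, and the dates are the 2021 US spring-forward transition.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.ZoneId;

class ElapsedAcrossDst {
    /** Real elapsed time between two local values in a zone, with gaps and overlaps resolved. */
    static Duration elapsed(LocalDateTime from, LocalDateTime to, ZoneId zone) {
        return Duration.between(from.atZone(zone), to.atZone(zone));
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2021, 3, 14, 0, 0);
        LocalDateTime end = LocalDateTime.of(2021, 3, 15, 0, 0);

        // Naive local arithmetic ignores the transition entirely.
        System.out.println(Duration.between(start, end));                        // PT24H
        // With the zone applied, the spring-forward day is only 23 hours long.
        System.out.println(elapsed(start, end, ZoneId.of("America/New_York")));  // PT23H
        // UTC has no DST, so no correction is ever needed.
        System.out.println(elapsed(start, end, ZoneId.of("UTC")));               // PT24H
    }
}
```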
When to call org.joda.time.LocalDateTime#correctDstTransition?
You don't. That method is private, and is called by other Joda-Time functions as needed.

How to handle static time

I'm looking for some best practices to handle and store static time values.
A static time is usually the time of a recurring event, e.g. the activities in a sport centre, the opening times of a restaurant, the time a TV show is aired every day.
These time values are not bound to a specific date, and should not be affected by daylight saving time. For example, a restaurant will open at 11:00am both in winter and summer.
What's the best way to handle this situation? How should this kind of values be stored?
I'm mainly interested in issues with automatic TimeZone and DST adjustments (which should be avoided), and in keeping the time values independent of any specific date.
The best strategies I've found so far are:
store the time as an integer number of seconds since midnight,
store the time as a string.
I did read this question, but it's mostly about the normal time values and not the use cases I described.
Update
The library I'm working on: github
Regarding database storage, consider the following in order from most preferred to least preferred option:
Use a TIME type if your database supports it, such as in SQL Server (2008 and greater), MySQL, and Postgres, or INTERVAL HOUR TO SECOND in Oracle.
Use separate integer fields for Hours and Minutes (and Seconds if you need them). Consider using a custom user-defined type to bind these together if your DB supports it.
Use string in 24-hour format with a leading zero, such as "01:23:00", "12:00:00" or "23:59:00". If you include seconds, then always include seconds. You want to keep the strings lexicographically sortable. Don't mix and match formatting. Be consistent.
Regarding the approach of storing a whole number of minutes (or seconds) elapsed since midnight, I recommend avoiding it. That works great when you are actually storing an elapsed duration of time, but not so great when storing a time of day. Consider:
Not every day has a midnight. In some time zones (e.g., Brazil, while it observed DST), on the day of the spring-forward DST transition, the clocks go from 23:59:59 to 01:00:00.
In any time zone that has DST, the "time elapsed since midnight" could be lying to you. Even when midnight exists, if you save 10:00 as "10 hours", then that's potentially a false statement. There may have been 9 hours or 11 hours elapsed since midnight, if you consider the two days per-year involved in DST transitions.
At some point in your application, you'll likely be applying this time-of-day value to some particular date. When you do, if you are using "elapsed time" semantics, you might be tempted to simply add the elapsed time to midnight of the date in question. That will lead to errors on DST transition days, for the reasons I just mentioned. If you are instead representing a "time of day" in your storage, you'll be more likely to combine them together properly. Of course, this is highly dependent on what language and API you are using.
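The difference between the two semantics shows up as soon as the value is applied to a transition day. Here is a java.time sketch; the class name TimeOfDayVsElapsed and the chosen date (the 2021 US spring-forward day) are illustrative.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class TimeOfDayVsElapsed {
    /** "Time of day" semantics: combine the date and the wall-clock time, then apply the zone. */
    static ZonedDateTime asTimeOfDay(LocalDate date, LocalTime time, ZoneId zone) {
        return date.atTime(time).atZone(zone);
    }

    /** "Elapsed time" semantics: add a duration to the start of the day. */
    static ZonedDateTime asElapsed(LocalDate date, Duration sinceMidnight, ZoneId zone) {
        return date.atStartOfDay(zone).plus(sinceMidnight);
    }

    public static void main(String[] args) {
        ZoneId newYork = ZoneId.of("America/New_York");
        LocalDate springForward = LocalDate.of(2021, 3, 14);

        System.out.println(asTimeOfDay(springForward, LocalTime.of(10, 0), newYork));
        // 2021-03-14T10:00-04:00[America/New_York]
        System.out.println(asElapsed(springForward, Duration.ofHours(10), newYork));
        // 2021-03-14T11:00-04:00[America/New_York]  (an hour late)
    }
}
```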
With any of these, be careful when using recurrence patterns. Say you store a time of "02:00:00" when a bar closes every night. When DST springs forward, that time might not exist, and when it falls back, it will exist twice. You need to be prepared to check for this condition when you apply the time to any particular date.
What you should do is entirely up to your use case. In many situations, the sensible thing to do is to jump forward one hour in the spring-forward gap, and to pick the first of the two points in the fall-back overlap. But YMMV.
See also, the DST tag wiki.
Per comments, it looks like the "tod" gem will suffice for your Ruby code.
The question seems a little vague, but I will have a try.
Generally speaking, using an integer seems good enough for me. It is easy to compare, easy to add or subtract a duration (of seconds), and is space- and time-efficient. You can consider wrapping it in a class if you are using an object-oriented language.
As far as I know, there are no existing classes for your needs in C or C++.
In the .NET world, the TimeSpan structure may be useful for your purpose. It has some conveniences, like: you can get the TimeSpan value from DateTime.TimeOfDay; you can add an interval (another TimeSpan) to a TimeSpan; you can get the hours, minutes, and seconds components separately; etc.
If you use Python, datetime.time is also a good candidate. It is designed exactly for usages like yours.
I do not know other good candidates in other languages.
Speaking for Java:
In Java, the use cases you describe are not covered well by the old java.util.Date (which is a global timestamp despite its name) or java.util.GregorianCalendar (which is a kind of combination of date, time, zone, etc.), but:
In Java 8 you have the new built-in class java.time.LocalTime, which covers your use cases well. Its predecessor is the identically named class LocalTime in the popular external Java library JodaTime, which has worked since Java 5. Furthermore, in my own alpha-state library I have the type net.time4j.PlainTime, which is similar but also offers 24:00 support (useful, for example, for shop opening times). All in all, Java is a well-suited language with interesting time libraries that can mostly do what you wish. In detail:
a) TimeZone and DST adjustments are not handled by the Java classes mentioned above. Instead they are only handled if you convert such a plain wall time to another type like org.joda.time.DateTime which contains a reference to a timezone.
b) Indeed these time classes are completely independent from calendar date, too.
c) The internal storage strategy is for JSR-310 (Java 8):
private final byte hour;
private final byte minute;
private final byte second;
private final int nano;
JodaTime uses the other strategy of local milliseconds instead (elapsed time since midnight).
You cannot represent a time unless you also know the day/month/year. There is no such thing as "should not be affected by daylight saving time" as there are many complicated issues to deal with, including leap seconds and so on. Time, as a human sees it, is a complicated thing that cannot easily be dealt with mathematically.
If you really need to store "11am" without any date associated, then that's what you should store. Just store 11am (or perhaps just 11, use 24 hour time).
Then, if you need to do any math you must apply a date before doing any operations on the time.
I would also refrain from storing "11am" as "x seconds from midnight". You really should just use 11 hours, since that is what the user sees, and then have a good date/time library convert it to a useful format. For example, to tell the user whether the restaurant is open right now, you'd pass it to a date library along with today's date.
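A rough Java sketch of that "is it open right now" check, using java.time as the date library. The class name OpeningHours is hypothetical, and the sketch assumes that the closing time falls on the same day as the opening time and that `now` is expressed in the restaurant's own time zone.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

class OpeningHours {
    /** True if `now` falls within [opens, closes) on its own calendar date. */
    static boolean isOpen(LocalTime opens, LocalTime closes, ZonedDateTime now) {
        ZoneId zone = now.getZone();
        LocalDate today = now.toLocalDate();
        // Apply today's date only at the moment the comparison is needed.
        ZonedDateTime openAt = today.atTime(opens).atZone(zone);
        ZonedDateTime closeAt = today.atTime(closes).atZone(zone);
        return !now.isBefore(openAt) && now.isBefore(closeAt);
    }

    public static void main(String[] args) {
        LocalTime opens = LocalTime.of(11, 0);
        LocalTime closes = LocalTime.of(22, 0);
        ZonedDateTime now = ZonedDateTime.now(ZoneId.of("Europe/Rome"));
        System.out.println(isOpen(opens, closes, now));
    }
}
```

The stored values stay plain 11:00 and 22:00 all year; the DST rules are consulted only for the particular date being checked.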

Data model to use for a DVR's recording schedule

A DVR needs to store a list of programs to record. Each program has a starting time and duration. This data needs to be stored in a way that allows the system to quickly determine if a new recording request conflicts with existing scheduled recordings.
The issue is that merely looking to see if there is a show with a conflicting start time is inadequate because the end of a longer program can overlap with a shorter one. I suppose one could create a data structure that tracked the availability of each time slice, perhaps at half-hour granularity, but this would fail if we cannot assume all shows start and end at the half-hour boundary, and tracking at the minute level seems inefficient, both in storage and look up.
Is there a data structure that allows one to query by range, where you supply the lower and upper bound and it returns a collection of all elements that fall within or overlap that range?
An interval tree (maybe using the augmented tree data structure?) does exactly what you're looking for. You'd enter all scheduled recordings into the tree and, when a new request comes in, check whether it overlaps any of the existing intervals. In a balanced tree, both this conflict check and adding a new request take O(log(n)) time, where n is the number of intervals currently stored; listing all k overlapping intervals takes O(log(n) + k).
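A minimal sketch of the augmented-tree variant in Java follows. It is deliberately simplified: the tree is an unbalanced binary search tree keyed by start time, so the O(log(n)) bound only holds while the tree happens to stay balanced, and a real implementation would build on a self-balancing tree. Times are plain long values (for example, minutes since some epoch), intervals are half-open [start, end), and the class and method names are hypothetical.

```java
class IntervalTree {
    private static final class Node {
        final long start;
        final long end;   // exclusive
        long maxEnd;      // largest end anywhere in this node's subtree
        Node left;
        Node right;

        Node(long start, long end) {
            this.start = start;
            this.end = end;
            this.maxEnd = end;
        }
    }

    private Node root;

    /** Adds the interval [start, end) to the tree. */
    void insert(long start, long end) {
        root = insert(root, start, end);
    }

    private static Node insert(Node node, long start, long end) {
        if (node == null) {
            return new Node(start, end);
        }
        if (start < node.start) {
            node.left = insert(node.left, start, end);
        } else {
            node.right = insert(node.right, start, end);
        }
        node.maxEnd = Math.max(node.maxEnd, end);
        return node;
    }

    /** True if [start, end) overlaps at least one stored interval. */
    boolean overlapsAny(long start, long end) {
        Node node = root;
        while (node != null) {
            if (node.start < end && start < node.end) {
                return true;
            }
            // If some interval on the left ends after `start`, then an overlap,
            // if one exists at all, can be found on the left. Otherwise nothing
            // on the left can overlap, so only the right side remains.
            if (node.left != null && node.left.maxEnd > start) {
                node = node.left;
            } else {
                node = node.right;
            }
        }
        return false;
    }
}
```

Because the intervals are half-open, a recording that ends at exactly the minute another begins is not reported as a conflict.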

Resources