How to handle recurring times? - algorithm

First off, I marked this question as language agnostic, but I'm using PHP and MySQL. It shouldn't affect the question itself very much tho.
I'm creating an application which shows times of certain shows throughout the week. Every single show is recurring (on weekly basis) and there might be shows which are airing through 2 days - eg. starting on Sunday at 23:30, ending on Monday at 00:30. I'm storing start of the show (day of the week - Monday, Tuesday... - it's never exact date; time) and duration. There are never shows that would take more than 24 hours.
My problem is with validation if newly added shows aren't overlapping some old ones. Especially if it comes to Sunday-Monday shows.
How are such recurring events usually handled on both DB side and server side?
tl;dr version with stuff I considered
My first idea was to create some custom validation algorithm, but it seemed too cumbersome and complicated. Not that I'd whine about complicated hand-made solutions, but I'm interested if there isn't something more basic that I'm missing.
Another alternative that came to mind was to change the table structure to use datetime (instead of "day of week" and "time"), and use a fake fixed date range to store the data. For example, all Mondays would be set to 5th Jan 1970, and Sundays would use 11th Jan 1970. There would be one exception to this rule: if a show starts on Sunday and ends on Monday, its end would be stored as 12th Jan 1970. This solution would allow more flexible querying of the DB than the original one, and it would also simplify queries for shows which overlap between individual weeks (since we can do the comparison directly in the query). There are some disadvantages to this solution as well (for one, using fake dates might make it confusing).
Both solutions smell of wrong algorithms to me and would love to hear some opinions from more experienced fellow developers.

Sounds like you could just store the starting minute of each show as an integer number of minutes since the start of the week (10,080 possible values).
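Then a show starting at minute $a with duration $dur_a is still running when a show starting at minute $b begins if and only if
(10080 + $b - $a) % 10080 < $dur_a
Two shows overlap when this test holds in either direction, so run it a second time with $a and $b swapped (using $dur_b).
For example, consider a show starting at 11pm Sunday and another starting at 12:30am Monday, counting minutes from Monday 00:00. Here $a == 10020 and $dur_a == 120 and $b == 30. (10080 + $b - $a) % 10080 == 90. This is less than $dur_a and hence the shows overlap.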
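A minimal sketch of the minute-of-week test, written in Python rather than PHP. The function names are made up for illustration, and the check is applied in both directions so that it also catches an old show that starts while the new one is still running.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10080

def starts_during(a_start, a_duration, b_start):
    """True if show b starts while show a is still running (weekly wrap-around)."""
    # Python's % always returns a non-negative result, so adding 10080 first is not needed.
    return (b_start - a_start) % MINUTES_PER_WEEK < a_duration

def shows_overlap(a_start, a_duration, b_start, b_duration):
    """Two recurring shows overlap if either one starts during the other."""
    return (starts_during(a_start, a_duration, b_start)
            or starts_during(b_start, b_duration, a_start))

# Sunday 23:00 for 120 minutes vs. Monday 00:30 for 60 minutes (week starts Monday 00:00)
print(shows_overlap(10020, 120, 30, 60))
```

The same expression can be used in a SQL WHERE clause if the start minute and duration are stored as integer columns.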

This problem could be simplified by converting the data into a format that is amenable to the calculations that are required. I recommend creating a type that represents the start times as the number of minutes from Sunday at midnight. Then simple integer range comparisons could be used to find overlapping shows.
The internal representation must, of course, be hidden and abstracted. You may, at some point, want to change the representation from minutes to seconds, for example.
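One possible shape for such a type, as a sketch only: the class name `WeekTime`, its methods, and the Sunday-first day list are assumptions for illustration. Callers never see the minute count directly, so the unit could later change to seconds without touching them.

```python
from dataclasses import dataclass

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
MINUTES_PER_WEEK = 7 * 24 * 60

@dataclass(frozen=True)
class WeekTime:
    """A point in the recurring week, stored as minutes since Sunday 00:00."""
    _minutes: int

    @classmethod
    def from_day_time(cls, day, hour, minute):
        return cls(DAYS.index(day) * 24 * 60 + hour * 60 + minute)

    def minutes_until(self, other):
        """Minutes from this point forward to `other`, wrapping past the end of the week."""
        return (other._minutes - self._minutes) % MINUTES_PER_WEEK

start = WeekTime.from_day_time("Saturday", 23, 30)
end = WeekTime.from_day_time("Sunday", 0, 30)
print(start.minutes_until(end))
```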

I would opt for a custom validation algorithm:
For each show, compute all showing intervals [start1, end1], [start2, end2], ... [startN, endN], where N is the number of recurrence of the show.
For a new show, also compute these intervals.
Now check if any of these new intervals intersect any old intervals. This is the case if the start or the end of one interval is contained in the other.
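The interval approach could be sketched like this in Python. It assumes shows are given as (start minute of week, duration in minutes) pairs and treats intervals as half-open, so back-to-back shows do not count as overlapping; both choices are assumptions, not part of the answer above. A show that runs past the end of the week is split into two pieces.

```python
WEEK = 7 * 24 * 60  # minutes in a week

def weekly_intervals(start, duration):
    """Return the show's half-open [start, end) pieces within a single week."""
    end = start + duration
    if end <= WEEK:
        return [(start, end)]
    # The show wraps past the end of the week: split it in two.
    return [(start, WEEK), (0, end - WEEK)]

def intersects(p, q):
    """Half-open intervals intersect if each starts before the other ends."""
    return p[0] < q[1] and q[0] < p[1]

def conflicts(new_show, existing_shows):
    """True if the new (start, duration) show intersects any existing show."""
    return any(
        intersects(p, q)
        for show in existing_shows
        for p in weekly_intervals(*new_show)
        for q in weekly_intervals(*show)
    )

print(conflicts((10020, 120), [(30, 60)]))
```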

Related

Simple algorithm to alternate days

I need to alternate between 2 tasks every day, and I need a simple algorithm to know which task I need to do.
I need to be able to run this algorithm by head, using simple general knowledge (like day of week, day of month, etc), and it must not rely on which task was done the previous day (because I have a crappy memory).
I have tried checking for parity in a combination of day of week / day of month / # of month, etc, but couldn't find a suitable system: day of week have 2 consecutive odd numbers, same goes for day of month every so often.
I am afraid that this is impossible: if you can't remember what you did the day before, any other procedure will require more mnemonic effort.
remember what you did on January first (or another date),
remember the parities of the cumulated months: oeoeoeooeoe or ooeoeoeeoeo for a leap year,
add the cumulated parity of the month before* to the parity of the day,
add that to the parity of the first task.
E.g. if A on January 1st 2022, then on March 17, 2022: e + o = o gives B.
*In January, use even.
You can also state the month parity rule as: until August inclusive, use the co-parity of the month number; then use the parity. But for a leap year, change that parity after February (excluded).
I need to be able to run this algorithm by head
So you don't need any help from computer science. You can use the human cognitive ability to map one thing to another.
Note: This need not make sense to everybody though, if you are thinking out of the box.
Map task 1 as God's day.
Map task 2 as Devil's day in your brain.
This should be simple just like day and night.
Now, remember that devil's evil karma is always burnt by God the next day and that devil never learns his lesson. So this way, alternating would be easy.
Friends Episode snippet on Youtube
Just count the number of days in between your date and a given "zero" one...then use parity.
Take number of seconds (or milli, or whatever) since EPOCH (common zero for date and time), divide (integer division) by 60x60x24 (or 1000x60x60x24, or what is appropriate), you then get the number of days since EPOCH.
----EDIT----
Example: I got 1653910695 seconds since EPOCH (at the time of my experiment). Dividing it by 60x60x24 gives 19142 days. Tomorrow it will give 19143, etc.
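For reference, the day-count parity idea can be sketched in a few lines of Python. The function name, the task labels, and the choice of 1970-01-01 as the "zero" day are assumptions for illustration.

```python
from datetime import date

def task_for(day, zero_day=date(1970, 1, 1), tasks=("Task 1", "Task 2")):
    """Pick a task from the parity of the number of days since zero_day."""
    return tasks[(day - zero_day).days % 2]

# 1653910695 seconds since the epoch falls on 2022-05-30 (UTC)
print(1653910695 // (60 * 60 * 24))
print(task_for(date(2022, 5, 30)), task_for(date(2022, 5, 31)), task_for(date(2022, 6, 1)))
```

Because the count never resets, the tasks keep alternating across month and year boundaries.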
<?php
// Day of the month (1-31) for today and for yesterday.
$day = date('j');
$previous_day = date('j', strtotime("-1 days"));
if ($day % 2 == 0 || $previous_day % 2 != 0) {
    echo "Task 1";
} else {
    echo "Task 2";
}
?>

Summing times in Google sheets

I have a sheet where I record my working hours (this is more for me to remind me to stop working than anything else). For every day, I have three possible shifts - early, normal & late, and I have a formula which will sum up any times put into these columns and give me the daily total hours.
To summarise the duration of time spent working in a day, I use the following formula: =(C41-B41)+(E41-D41)+12+(G41-F41) which is:
early end time minus early start time
normal end time minus normal start time PLUS 12 hours
late end time minus late start time
Which gives me output like this:
What I cannot seem to achieve is, the ability to sum the daily totals into something which shows me the total hours worked over 1-week. If I attempt to sum the daily totals together for the example image shown, I get some wild figure such as 1487:25:00 when formatting as 'Duration' or 23:25:00 when formatted as 'Time'!
All my cells where I record the hours worked are formatted as 'Time'
When using arithmetic operations on date values in Google Sheets, it's important to remember that the internal representation of a date is numeric, and understood as the number of days since December 30, 1899. A time of day is the fractional part of that number.
What follows from that, is that if you want to add 12 hours to a time duration, you should not write "+12" because that will in fact add 12 days. Instead add "+12/24". In other words, try the following formula instead of the one you are using now:
=(C41-B41)+(E41-D41)+(12/24+G41-F41)
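To see why the unit matters, here is a small Python sketch that treats a number the way a spreadsheet does, as a count of days. The helper name `as_duration` is made up for illustration.

```python
from datetime import timedelta

def as_duration(serial_days):
    """Interpret a spreadsheet-style number as a duration measured in days."""
    return timedelta(days=serial_days)

print(as_duration(12))       # adding "12" adds twelve whole days
print(as_duration(12 / 24))  # adding "12/24" adds twelve hours
```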

Is there a complex date filter algorithm?

Essentially, I want a system that can filter as simply as "Between August 4th and August 7th", but also be as complicated as "Every third Saturday or Monday of each January on leap years".
I figured that in order to represent the complicated boolean algebra, I would need a tree structure. Each node would either be a boolean operation (AND, OR, XOR, NOT) and then would have children that it apply to, which can either be specific filters or another boolean operation.
Each "specific filter" would be something like "Sundays" or "Leap Years". I think everything up to this point is very doable. However, the problem then arises in parsing the tree to actually find what dates are needed, in order to then make database queries to get the data points.
With the example above (every third Saturday or Monday of each January on leap years), suppose we pre-restrict ourselves to the years for which we have data (5 years' worth). If the Sat/Mon filters happen to be the top nodes in the tree, we will end up with 500 separate dates (2 per week, 50 weeks a year, 5 years). Then the next node has to search through all 500 to find which ones conform to the "every third" filter. This isn't even the most complicated example, because an arbitrary number of filters should be allowed, and XOR makes that even crazier.
So, is there any easy route? Did someone already build this? This is just a small part of a project involving data visualization, but it seems that it could be an entire project by itself.
I found a couple in Ruby. IceCube seems promising, even though it might not support all your needs.
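If no library fits, the filter tree from the question can be sketched as composable predicates in Python. Every leaf and every boolean node is a function from a date to True/False, and the whole tree is evaluated once per candidate day instead of passing date lists from node to node. All names below are hypothetical, and "third" is read here as the third occurrence of that weekday within the month.

```python
import calendar
from datetime import date, timedelta

# Leaf filters: each returns a predicate date -> bool.
def weekday_is(*weekdays):       # Monday=0 ... Sunday=6
    return lambda d: d.weekday() in weekdays

def month_is(month):
    return lambda d: d.month == month

def nth_occurrence_in_month(n):  # e.g. n=3: third Saturday, third Monday, ...
    return lambda d: (d.day - 1) // 7 + 1 == n

def in_leap_year(d):
    return calendar.isleap(d.year)

# Boolean nodes.
def all_of(*preds):
    return lambda d: all(p(d) for p in preds)

def any_of(*preds):
    return lambda d: any(p(d) for p in preds)

def xor(p, q):
    return lambda d: p(d) != q(d)

def negate(p):
    return lambda d: not p(d)

def matching_dates(pred, start, end):
    """Yield every date in [start, end] that satisfies the predicate tree."""
    d = start
    while d <= end:
        if pred(d):
            yield d
        d += timedelta(days=1)

rule = all_of(in_leap_year, month_is(1), nth_occurrence_in_month(3),
              any_of(weekday_is(5), weekday_is(0)))
print(list(matching_dates(rule, date(2019, 1, 1), date(2023, 12, 31))))
```

Five years of data is fewer than 2,000 days, so checking each day against the whole tree stays cheap however the nodes are ordered; the resulting dates can then be fed into the database query.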

Best approach: transfer daily values from one year to another

I will try to explain what I want to accomplish. I am looking for an algorithm or approach, not the actual implementation in my specific system.
I have a table with actuals (incoming customer requests) on a daily basis. These actuals need to be "copied" into the next year, where they will be used as a basis for planning the amount of requests in the future.
The smallest timespan for planning, on a technical basis, is a "period", which consists of at least one day. A period always changes after a week or after a month. This means, that if a week is both in May and June, it will be split in two periods.
Here's an example:
2010-05-24 - 2010-05-30 Week 21 | Period_Id 123
2010-05-31 - 2010-05-31 Week 22 | Period_Id 124
2010-06-01 - 2010-06-06 Week 22 | Period_Id 125
We did this to reduce the amount of data, because we have a few thousand items that each have 365 daily values. For planning, this is reduced to "a few thousand x 65" (or whatever the period count is per year). I can aggregate a month, or a week, by combining all periods that belong to it. The important thing about this is that I could still use daily values, then find the corresponding period and add them there if necessary.
What I need, is an approach on aggregating the actuals for every (working)day, week or month in next years equivalent period. My requirements are not fixed here. The actuals have a certain distribution, because there are certain deadlines and habits that are reflected in the data. I would like to be able to preserve this as far as possible, but planning is never completely accurate, so I can make a compromise here.
Don't know if this is what you're looking for, but this is a strategy for calculating the forecasts using flexible periods:
First define a mapping for each day in next year to the corresponding day in this year. Then when you need a forecast for period x you take all days in that period and sum the actuals for the matching days.
With this you can precalculate every week/month but create new forecasts if the contents of periods change.
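A rough Python sketch of this strategy, under the assumption that "corresponding day" means the same ISO week and weekday one year earlier (other mappings are equally possible). The function names and the `actuals` dictionary of date to request count are hypothetical.

```python
from datetime import date, timedelta

def matching_day_last_year(day):
    """Map a day to the same ISO week and weekday in the previous ISO year."""
    iso_year, iso_week, iso_weekday = day.isocalendar()
    try:
        return date.fromisocalendar(iso_year - 1, iso_week, iso_weekday)
    except ValueError:
        # The previous year has no week 53: fall back to its week 52.
        return date.fromisocalendar(iso_year - 1, 52, iso_weekday)

def forecast_for_period(first_day, last_day, actuals):
    """Sum last year's actuals over the days matching the period's days."""
    total = 0
    day = first_day
    while day <= last_day:
        total += actuals.get(matching_day_last_year(day), 0)
        day += timedelta(days=1)
    return total

actuals = {date(2010, 5, 31): 5, date(2010, 6, 1): 7}
print(forecast_for_period(date(2011, 5, 30), date(2011, 5, 31), actuals))
```

Because the mapping works per day, the period boundaries can move from year to year without invalidating anything that was computed earlier.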
Map weeks to weeks. The first full week of this year to the first full week of the next. Don't worry about "periods" and aggregation; they are irrelevant.
Where a missing holiday leaves a hole in the data, just take the values for the same day of the previous week or the next week, and do the same at the beginning/end of the year.
Now for each day of the week, combine the results for the year and look for events more than, say, two standard deviations from the mean (if you don't know what that means then skip this step), and look for correlations with known events like holidays. If a holiday doesn't show an effect in this test then ignore it. If you find an effect, shift it to compensate for the different date next year. Don't worry about higher-order effects, you don't have enough data to pin them down.
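The outlier step could look roughly like this in Python. It is a sketch only: the function name, the `actuals` mapping of date to request count, and the use of the population standard deviation are assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, pstdev

def unusual_days(actuals, threshold=2.0):
    """Return days whose value is more than `threshold` standard deviations
    away from the mean of all values for the same weekday."""
    by_weekday = defaultdict(list)
    for day, value in actuals.items():
        by_weekday[day.weekday()].append((day, value))

    flagged = []
    for entries in by_weekday.values():
        values = [value for _, value in entries]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # every value is identical, nothing stands out
        flagged.extend(day for day, value in entries
                       if abs(value - mu) > threshold * sigma)
    return sorted(flagged)

# Ten Mondays with 100 requests each, except one with 500.
mondays = [date(2010, 1, 4) + timedelta(weeks=i) for i in range(10)]
sample = {day: 100 for day in mondays}
sample[mondays[3]] = 500
print(unusual_days(sample))
```

The flagged dates are only candidates; they still have to be compared by hand against known holidays and events.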
Now draw in periods wherever you like and aggregate all you want.
Don't make any promises about the accuracy of these predictions, there's no way to know it. Don't worry about whether this is the best possible way; it isn't, but it's as good as any you're likely to find. You can spend as much more time and effort fine-tuning this as you wish; it might raise expectations but it's not likely to make the results much more accurate-- it's about as likely to make them worse.
There is no a priori way to answer that question. You have to look at your data and use the results to decide what the important parameters are (day of week, week number, month, season, temperature outside?).
For example, if many of your customers are Jewish or Muslim, then the Gregorian calendar, ISO week numbers and all that won't help you much, because Jewish and Muslim holidays (and so the users' behaviour) are determined using other calendars.
Another example - Trying to predict iPhone search volume according to last year's search doesn't sound like a good idea. It seems that the important timescales are much longer than a year (the technology becoming mainstream over the years) and much shorter than a year (Specific events that affect us for days-weeks).

Efficient algorithm for determining if a date is in DST

I'm looking for a better than O(n) algorithm to determine if a date in the future will have daylight savings time applied (and how much). Given a year, month, day, hour, minute and time zone (and a copy of the Olson Time Zone database), how does one efficiently determine if that date will be in DST? I'm looking for the algorithm, not a library function to call.
Thank you.
FURTHER EXPLANATION: The date library I'm using is very slow when you create an object with a date in the future and a time zone. It turns out it's doing a linear calculation to determine if the date is in daylight savings time. Not only that, it's doing this at object creation time. Obviously it could wait until asked, but it should also be more efficient.
Sure, DST rules change and a date library can't predict the future, but the alternative is to put an arbitrary upper limit on localized dates.
Everybody's already commented on the problems with always-changing DSTs. But I can accept the premise that we just pretend the currently known rules will apply forever.
To get your DST information, the first thing to do is to calculate the year/month/day for your future date (if it isn't in that form already). Then you look up your time zone and pull out the variation against UTC, the DST on/off rule and offset. There could be several different rules depending on which year, you want to be sure to grab the right one for your "target" year. For reasons explained below, it may be handy to also be aware of the rules for the preceding year.
The on/off rules will have a funny spec like "Oct lastSun": That means the switch occurs in the night of the last Sunday in October.
What you need to do is gather up all of these tersely formatted "rules" and develop a bit of code for each to determine the last date indicated by that rule. It's currently December, so given a couple of rules like "Mar lastSun" and "Oct lastSun" for my time zone, those dates would be March 29, 2009 and October 25, 2009. Which of these dates is more recent? October. October is associated with an "off", so we must currently have NO DST.
You can calculate the DST on/off dates for the current (i.e. target) year regardless of whether the target date is before or after those dates; if the on/off date is in the future of your target date, then simply do the rule calculation again for the previous year. Note that the rules may have changed during the interval, so be sure to apply the correct one for the year you're looking at.
Worst case for this calculation is, you have to repeat your two rule calculations for the previous year. But there's no searching going on otherwise, so it's strictly O(1).
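A simplified Python sketch of the rule evaluation, assuming a zone whose rules are "Mar lastSun" (on) and "Oct lastSun" (off) as in the example above. It ignores the switch hour and rule changes between years, and the function names are made up for illustration.

```python
import calendar
from datetime import date

def last_sunday(year, month):
    """Date of the last Sunday in the given month, computed without searching."""
    last_day = calendar.monthrange(year, month)[1]
    weekday = date(year, month, last_day).weekday()   # Monday=0 ... Sunday=6
    return date(year, month, last_day - (weekday + 1) % 7)

def is_dst(day, on_month=3, off_month=10):
    """True if `day` falls between the 'on' switch and the 'off' switch of its year.

    Assumes the 'on' rule comes before the 'off' rule within a calendar year.
    """
    return last_sunday(day.year, on_month) <= day < last_sunday(day.year, off_month)

print(last_sunday(2009, 3), last_sunday(2009, 10))
print(is_dst(date(2009, 12, 1)), is_dst(date(2009, 7, 1)))
```

Each call does a fixed amount of arithmetic regardless of how far in the future the date lies, which is the O(1) behaviour described above.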
I found a Local/DST/Tz calculator here: http://home-4.tiscali.nl/~t876506/WhatDay.html and as it's a JavaScript applet you should be able to simply crib the code. It doesn't handle all rules, though, so you will need to add some code for the remaining rules.
Update: I just noticed you have an hour and minute in your time as well. That complicates matters just a little. If your date is not on a "switch" date then the instructions I gave above will do you fine. Otherwise, you need to consider the time. I guess the cleanest thing to do would be to include the time in your determination of "most recent". I.e. if your target time is 00:30 UTC and switch time for the given zone is 01:00, then the target year's switch time is still in the future and you have to use the previous year's switch time instead. For practical purposes, this will mean that the "other" switch time was the most recent, and its on/off status applies.
Your number one problem is daylight savings rules that are set by the local authorities. The latter can pass almost any law at any time and therefore change the rules in a way you can't possibly predict.
As far as I know, the DST changes that are known start and end on a fixed day each year (first weekend in April, last weekend in October, stuff like that). So you could use the Doomsday Algorithm to find the days of the week for the given year and calculate the conversion dates from that. Then you can determine if DST is in effect in the source and/or destination locale. The conversion itself is simply a matter of adding and/or subtracting an hour to compensate for DST and then factoring in the timezone difference.
Hmm, as I see the problematic point is to determine weekday for a given day, far in the future.
For that, I suggest something like that:
After every 400 years the complete system repeats, so first reduce the number of years modulo 400. In 400 Gregorian years there are 97 leap years and 303 simple ones, so the weekday shifts by 303+2x97 = 497 days, and 497 = 0 (mod 7). If an arbitrary day is a Monday, then the same day 400 years later is a Monday again, which means whole 400-year blocks can simply be dropped:
years_left = (target_year - ref_year) mod 400
Then you can do further optimizations, but I guess you can go year-by-year for the remaining years; that will do it. At most you make 399 simple operations: if (leap_year) then += 2 else ++, mod 7.
After you have the weekday for Jan 1 that year, you can calculate DST switching dates, as Carl Smotricz has written.
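A sketch of the weekday calculation in Python, relying on the fact that the Gregorian calendar repeats exactly every 400 years (146097 days, a whole number of weeks). The function name is made up, and the reference point (January 1, 2000 was a Saturday) is an assumption chosen for illustration.

```python
import calendar

def jan1_weekday(target_year, ref_year=2000, ref_wday=5):
    """Weekday of January 1 of target_year (Monday=0 ... Sunday=6).

    ref_wday is the weekday of January 1 of ref_year (2000 started on a Saturday).
    """
    # Whole 400-year cycles do not change the weekday, so only the remainder matters.
    years_left = (target_year - ref_year) % 400
    wday = ref_wday
    for year in range(ref_year, ref_year + years_left):
        # A common year shifts January 1 by one weekday, a leap year by two.
        wday = (wday + (2 if calendar.isleap(year) else 1)) % 7
    return wday

print(jan1_weekday(2009))
```

The loop runs at most 399 times no matter how distant the target year is.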
