Repeating "Events" (Calendar) - events

I'm currently working on an application that allows people to schedule "Shows" for an online radio station.
I want the user to be able to set up a repeated event, for example:
"Manic Monday" show - Every Monday From 9-11
"Mid Month Madness" - Every Second Thursday of the Month
"This months new music" - 1st of every month.
What, in your opinion, is the best way to model this (based around an MVC/MTV structure).
Note: I'm actually coding this in Django. But I'm more interested in the theory behind it, rather than specific implementation details.

Ah, repeated events - one of the banes of my life, along with time zones. Calendaring is hard.
You might want to model this in terms of RFC2445 (the iCalendar specification). However, that may well give you far more flexibility, and complexity, than you really want.
A few things to consider:
Do you need any finer granularity than a certain time on given dates? If you need to repeat based on time as well, it becomes trickier.
Consider date corner cases such as "the 30th of every month" and what that means for leap years
Consider time corner cases such as "1.30am every day" - sometimes 1.30am may happen twice, and sometimes it may not happen at all, due to daylight saving time
Do you need to share the schedule with people in other time zones? That makes life trickier again
Do you need to represent the number of times an event occurs, or a final date on which it occurs? ("Count" or "until" basically.) You may not need either, or you may need one or both.
I realise this is a list of things to think about more than a definitive answer, but I think it's important to define the parameters of your problem before you try to work out a solution.
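To make the "30th of every month" corner case concrete, here is a minimal sketch of the two obvious policies: skip months that lack the day, or fall back to the last day of the month. The function name `monthly_on_day` and the `clamp` flag are made up for illustration; which policy is right depends on what your users expect.

```python
import calendar
from datetime import date


def monthly_on_day(year, day, clamp=False):
    """Yield the dates in `year` for a "day-th of every month" rule.

    Hypothetical helper, not from any library.
    clamp=False skips months that do not have that day;
    clamp=True falls back to the last day of the month instead.
    """
    for month in range(1, 13):
        # monthrange returns (weekday of the 1st, number of days in the month)
        last_day = calendar.monthrange(year, month)[1]
        if day <= last_day:
            yield date(year, month, day)
        elif clamp:
            yield date(year, month, last_day)
```

With `clamp=False`, February simply drops out of the series; with `clamp=True`, it lands on the 28th or 29th depending on the year.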

From reading other posts, Martin Fowler describes recurring events the best.
http://martinfowler.com/apsupp/recurring.pdf
Someone implemented these classes for Java.
http://www.google.com/codesearch#vHK4YG0XgAs/src/java/org/chronicj/DateRange.java

One thought: generate the repeated events as rows of a new model when the original event is saved. That avoids recomputing occurrences every time the calendar is loaded, and it lets me cancel a single "Show" in a series. The cost is that generation has to be limited to a certain time frame, so someone who looked, say, a year into the future wouldn't see these repeated shows, and at some point further occurrences would (potentially) have to be generated.
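A rough, framework-free sketch of that idea: each recurrence rule is a predicate on a date, and `materialize` expands a rule into concrete dates inside a limited horizon. All names here are hypothetical, and "Every Second Thursday of the Month" is read as "the second Thursday of each month", which is an assumption. In Django the returned dates would presumably become rows of an occurrence model, so that a single one can be cancelled.

```python
from datetime import date, timedelta


def every_weekday(weekday):
    """Rule: every given weekday (Monday == 0, as in date.weekday())."""
    return lambda d: d.weekday() == weekday


def nth_weekday_of_month(n, weekday):
    """Rule: the n-th given weekday of each month (n starts at 1)."""
    return lambda d: d.weekday() == weekday and (d.day - 1) // 7 + 1 == n


def day_of_month(day):
    """Rule: a fixed day number in every month that has it."""
    return lambda d: d.day == day


def materialize(rule, start, horizon_days):
    """Return the dates in [start, start + horizon_days) that match `rule`."""
    days = (start + timedelta(days=i) for i in range(horizon_days))
    return [d for d in days if rule(d)]


# Example: "Manic Monday" for the two weeks starting 1 January 2024.
manic_mondays = materialize(every_weekday(0), date(2024, 1, 1), 14)
```

A periodic job could call `materialize` again with a later `start` to extend the series as the horizon moves forward.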

Related

Algorithm for Scheduling One Appointment in Already Full Schedule

I'm building a calendar scheduling application for, let's say a plumbing company. The company has one or more plumbers, who each have a schedule of appointments at different times throughout the day. So Josh's schedule on May 30th might include a 30-minute appointment at 10 AM, a 45-minute appointment at 1 PM, and an hour-long appointment at 3 PM, while Maria has a completely different schedule that day. Now say a customer wants to book an appointment with this company, and my program has already calculated the time this new appointment will take. I want my program to return a list of possible appointment times for any plumber(s). Is there a standard algorithm for this type of problem?
I'd prefer language-agnostic, general steps just to be more helpful to anyone who might be in a similar situation with a different language, though I'm using PHP and PostgreSQL if there's a specific language feature suited to this.
Here's what I've tried so far:
Get all available shifts for every plumber on the requested day
Get all appointments already made on that day
Do a sort of boolean subtraction to cut the appointments out of the shifts, leaving gaps in each plumber's schedule
Get rid of all schedule gaps that are smaller than the requested appointment length (I also calculate drive times here so I know how far appointments need to be from one another)
Return those gaps to the customer, trimmed to the appointment length, as appointment possibilities
I've learned that the problem with this approach is that it doesn't understand what to do with a gap much larger than the requested appointment. If you have a gap from 10 AM to 6 PM but you want an hour-long appointment, it will only suggest 10 AM to 11 AM. This approach doesn't allow for time-of-day choices, either. In that same scenario, what if the customer wants a morning appointment? Then it should only suggest 10-11 and 11-12. Or if they want an evening appointment, it should only suggest 5-6 PM. This approach also doesn't consider two plumbers working together. If we assume that two workers = half the time, then maybe the algorithm should look for the same 30 minutes available in both Josh and Maria's schedules along with 60-minute gaps in either plumber's schedule. Lastly, this algorithm feels very inefficient.
By the way, I've looked at several other questions here and around the Internet about how to solve similar situations, but I'm finding that most (if not all) of those questions involve optimizing a schedule. That might be valuable for other parts of this program, but for now, let's assume that the existing appointments are fixed and unchangeable. We're just looking to fit a new appointment into an existing schedule. I know this is possible because applications like Calendly have similar inputs and outputs.
In short, is there a better way of meeting these goals:
Suggest available gaps in one plumber's schedule given a time interval
If possible, only return appointment possibilities in the given time of day (morning = 4-12, afternoon = 12-5, evening = 5-10, night = 10-4, or any), and if not possible, continue with the algorithm as if no time of day had been specified
Suggest smaller gaps where n plumbers might do the job in 1/n time (there aren't that many plumbers, so setting a limit on this isn't necessary). This isn't as important as the other criteria, so if this isn't possible or would make the algorithm far more complex, then don't worry about it.
Split big appointment gaps into smaller gaps so we can suggest 4 hour-long gaps in between 10 AM and 2 PM. Obviously we can't suggest all possible hour-long segments of that gap because they'd be infinite
Thank you in advance.
There is no need for any sophisticated algorithm. There is only a small number of possible appointment times throughout a day, say every 30 minutes or so. Iterate over all possible times: 06:00, 06:30, 07:00, ... 20:00. For each time, check whether it matches the requirements; that check can return either a yes/no result or a number that says how good a match that time is. You end up with a list of possible appointment times; pick the best one or return all of them.
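A minimal sketch of that iteration for one plumber, using a yes/no check. The function name, the 30-minute step and the representation of bookings as `(start, end)` pairs are assumptions for illustration; drive times and time-of-day preferences would be extra conditions inside the same loop.

```python
from datetime import datetime, timedelta


def candidate_slots(shift_start, shift_end, booked, duration,
                    step=timedelta(minutes=30)):
    """Return start times where [start, start + duration) fits in the
    shift and does not overlap any booked (start, end) interval."""
    slots = []
    start = shift_start
    while start + duration <= shift_end:
        end = start + duration
        # The slot is free if it ends before, or starts after, every booking.
        if all(end <= b_start or start >= b_end for b_start, b_end in booked):
            slots.append(start)
        start += step
    return slots


# Example: a 09:00-13:00 shift with one booking from 10:00 to 10:30.
day = datetime(2024, 5, 30)
booked = [(day.replace(hour=10), day.replace(hour=10, minute=30))]
free = candidate_slots(day.replace(hour=9), day.replace(hour=13),
                       booked, timedelta(hours=1))
```

Running the same function per plumber and merging the results gives the list of possibilities to show the customer.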

What Machine Learning algorithm would be appropriate?

I am working on a predictor for learning the most likely period for grape harvesting, depending on weather and on the characteristics of the grape, namely sugar level, pH, and acidity. I've got two datasets and I am thinking of how to merge them: one is the pre-harvest analysis data of some Italian vineyards in the 2003-2013 period, the other is the weather over that decade. What I want to do is learn from my samples when to harvest, given a range for the optimal sugar level, pH and acidity, and given a weather forecast.
I thought that some Reinforcement Learning approach could work. Since the pre-harvest analyses are done about 5 times during the grape maturation period, I thought that those could be the states I step through, while the weather conditions could be the "probabilities" of going from one state to another.
Yet I am not sure which algorithm would be best, as every state and every "probability" depends on several variables. I was told that a Hidden Markov Model would work, but it seems to me that my problem doesn't fit the model perfectly.
Do you have any suggestions? Thanks in advance.
This has nothing to do with the actual algorithm, but the problem you are going to run into here is that weather is extremely local. One vineyard can have completely different weather than another only a mile away from it, believe it or not. If you put rain gauges at each vineyard, you will find this out. To get really good results you need to have a mini weather station at each vineyard. Absent this, your best option is to use only vineyards in the immediate vicinity of the weather measurements. For example, if your data is from an airport, only use vineyards right next to the airport.
Reinforcement learning is appropriate when you can control the action. It is like a monkey pushing buttons. You push a button and get shocked, so you don't push that button again. Here you have a passive data set and cannot conduct experimental actions, so reinforcement learning does not apply.
Here you have a complex set of uncontrolled inputs, the weather data, a controlled input (harvest time), and several output parameters, sugar etc. Given that data, you want to predict what harvest time to use for some future, unknown weather pattern.
In general, what you are doing is sensitivity analysis: trying to figure out how your factors affected the outcome that occurred. The tricky part is that the outcomes may be driven by some non-obvious pattern. For example, maybe 3 weeks of drought, followed by 2 weeks of heavy rain implies the best harvest will be 65 days hence, or something like that.
So, what you have to do is featurize the data to characterize it in possible likely ways, then do a sensitivity analysis. If the analysis has a strong correlation, then you have found a solution. If it does not, then you have to find a different way to featurize the data. For example, your featurization might be number of days with rain over 2 inches, or it might be most number of days without rain, or it might be total number of days with bright sunshine. Possibly multiple features might combine to make a solution. The options are limited only by your imagination.
Of course, as I was saying above, the fly in the ointment is that your weather data will only roughly approximate the real and actual weather at the particular vineyard, so there will be noise in the data, possibly so much noise as to make getting a good result impossible.
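As a rough illustration of what "featurizing" could look like, here are two of the example features mentioned above computed from a list of daily rainfall readings. The function names and the input layout (one value per day, in inches) are assumptions; the real features would have to be chosen by experiment.

```python
def days_with_rain_over(rain_inches, threshold=2.0):
    """Count the days whose rainfall exceeds `threshold` inches."""
    return sum(1 for r in rain_inches if r > threshold)


def longest_dry_spell(rain_inches):
    """Length of the longest run of consecutive days with no rain."""
    longest = current = 0
    for r in rain_inches:
        # Extend the current dry run, or reset it on a rainy day.
        current = current + 1 if r == 0 else 0
        longest = max(longest, current)
    return longest
```

Each season's readings would be reduced to a handful of such numbers, and those numbers are what the sensitivity analysis is then run on.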
Why you actually don't care too much about the weather
Getting back to the data, having unreliable weather information is actually not a problem, because you actually don't care too much about the weather. The reason is two-fold. First of all, the question you are trying to answer is not when to harvest the grapes, it is whether to wait to harvest or not. The vintner can always measure the current sugar of the grapes. So, he just has to decide, "Should I harvest the grapes now with sugar X%, or should I wait and possibly get a better sugar Z% later?" To answer this question the real data you need is not the weather, it is a series of sugar/acidity readings taken over time. What you want to predict is whether, given a situation, the grapes will get better or whether they will get worse.
Secondly, grapevines have an optimal amount of moisture they like. If the vine gets too dry, that is bad; if it gets too wet, that is also bad. You cannot predict how moist a vine is from the weather. Some soils hold moisture well, others are sandy. A sandy vineyard will require more rain than a clay vineyard to have the same moisture levels. Also, the vintner can water his vineyards, completely invalidating the rainfall pattern. Therefore, weather is pretty much a non-factor.
I agree with Tyler that from a feasibility standpoint weather might harm your analysis. However, I think this is for you to test and find out; there could be some interesting data that comes out of it.
I'm not sure exactly what your test is, but a simple way to start is perhaps to turn this into a classification problem using an SVM (or even logistic regression, since you want probabilities) and use all the data as the input for the algorithm, assuming you know which years were good harvest years and which were not. You could even test each variable individually and see how it affects your performance. I suggest you go this way if you can, simply because there are massive amounts of resources on the net, and people here on SO, that can help you tune your algorithm.
When you have a handle on this, I would try the HMM, as has apparently been suggested to you before, since it will tell you which day was probably the best for the harvest. This is where the weather might hurt, but you'll come to understand more about your data from the simpler experiments.
The thing I've learned about machine learning is that while there are guidelines for when to choose which algorithm, they're not set in stone, and you can change your question slightly and try a new approach to the problem, depending on how much freedom you have to play with the data. Good luck and have fun!

When to discard events in discrete event simulation

In most examples of DES I've seen an Event triggers a State change and possibly schedules some new Events in the future. However, if I simulate a Billiard game this is not the whole story.
In this case the Events of interest are the shots and the collisions of the balls with each other and with the cushion. The State consists of the position and velocity of each ball.
After a collision or a shot I will first recalculate a new State and from there I will calculate all possible future (first) collisions. The strange thing is that I will have to discard all Events which were scheduled previously as these describe collisions which were possible only before the state change.
So there seem to be two ways of doing DES.
One, where the future Events are computed from the State and all Events scheduled in the past are discarded with each State change (as in the Billiard example), and
another one, where each Event causes a state change and possibly schedules new Events, but where old Events are never discarded (as in most examples I've seen).
This is hard to believe.
The Billiard example also has the irritating property, that future events are calculated from the global state of the system. All Balls need to be considered, not just the ones which participated in a collision or a shot.
I wonder if my Billiard example is different from classic DES. In any case, I am looking for the correct way to reason about such issues, i.e.
How do I know which Events are to be discarded?
How do I know what States to consider when scheduling future events?
Is there a possible "safe" or "foolproof" way to compute future events (at the cost of performance)?
An obvious answer is "it all depends on your problem domain". A more precise answer or a pointer to literature would be much appreciated.
Your example is not unique or different from other DES models.
There's a third option which you omitted, which is that when certain events occur, specific other events will be cancelled. For example, in an epidemic model you might schedule infection events. Each infection event subsequently schedules 1) the critical time for the patient beyond which death becomes inevitable, with some probability and some delay corresponding to the patient's demographics, mortality rate for that demographic, and rate of progression for the disease; or 2) the patient's recovery. If medical interventions get queued up according to some triage strategy, treatment may or may not occur prior to the critical time. If not, a death gets scheduled, otherwise cancel the critical time event and schedule a recovery event.
These sorts of event scheduling, event cancellation, and parameterizations so that you can identify which entities the scheduling/cancelling applies to can all be described by a notation called "event graphs," created by Lee Schruben. See 'Schruben, Lee 1983. Simulation modeling with event graphs. Communications of the ACM. 26: 957-963' for the original paper, or check out this tutorial from the 1996 Winter Simulation Conference which is freely available online.
You might also want to look at this paper titled "Simple Movement and Detection in Discrete Event Simulation", which appeared in the 2005 Winter Simulation Conference.
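The cancellation idea from the epidemic example can be sketched with a priority queue that supports lazy deletion: cancelled events stay in the heap but are skipped when they reach the front. This is only one common way to implement cancellation, not something taken from the cited papers, and the class and event names are made up for illustration.

```python
import heapq
import itertools


class EventQueue:
    """Future event list with cancellation (hypothetical sketch)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal times
        self._cancelled = set()

    def schedule(self, time, name):
        """Add an event and return an id that can later be cancelled."""
        event_id = next(self._counter)
        heapq.heappush(self._heap, (time, event_id, name))
        return event_id

    def cancel(self, event_id):
        """Mark an event as cancelled; it is dropped when popped."""
        self._cancelled.add(event_id)

    def pop_next(self):
        """Return (time, name) of the earliest live event, or None."""
        while self._heap:
            time, event_id, name = heapq.heappop(self._heap)
            if event_id in self._cancelled:
                self._cancelled.discard(event_id)
                continue
            return time, name
        return None


# Treatment arrives before the critical time, so the critical-time
# event is cancelled and a recovery is scheduled instead.
queue = EventQueue()
critical = queue.schedule(5.0, "critical_time")
queue.schedule(3.0, "treatment")
first = queue.pop_next()
queue.cancel(critical)
queue.schedule(8.0, "recovery")
second = queue.pop_next()
```

Keeping the id returned by `schedule` on the entity (here, the patient) is what lets a later event identify which specific event to cancel.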
The State consists of the position and velocity of each ball.
Once you get that working, you'll need to add the spin and axis of rotation for each ball, since the proper use of spin is what differentiates the pros from the amateurs.
I will have to discard all Events which were scheduled previously
Yup, that's true, so don't bother scheduling them at all. See below.
So there seem to be two ways of doing DES (both involving the scheduling of events)
Actually, there's a third way. Simply search the problem space to determine the time of the first future event, and then jump to that time. There is no need to schedule Events. You only care about the one Event that will occur first.
All Balls need to be considered
Yes, this is true. Start by considering one of the balls and determining the time of its next collision. That time then puts an upper limit on how far the other balls can move. For example, imagine the first ball will collide after 0.1 seconds. Then the question for the second ball is, "Is it possible for the second ball to hit anything within 0.1 seconds?" If not, then move along to the third ball. If so, then reduce the time limit to the time it takes for the second ball to collide, and then move on to the third ball.
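A simplified sketch of "search for the first future event". It assumes equal-radius balls moving at constant velocity (no friction, no spin) and ignores the cushions, and it checks every pair by brute force rather than pruning with the running time limit described above. The function names are hypothetical.

```python
import math
from itertools import combinations


def collision_time(p1, v1, p2, v2, radius):
    """Time until two balls of equal radius touch, or None if they never do.

    Solves |dp + dv * t| = 2 * radius for the smallest t >= 0.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dvx, dvy = v2[0] - v1[0], v2[1] - v1[1]
    a = dvx * dvx + dvy * dvy
    b = 2 * (dx * dvx + dy * dvy)
    c = dx * dx + dy * dy - (2 * radius) ** 2
    if a == 0 or b >= 0:
        return None  # no relative motion, or the balls are not approaching
    disc = b * b - 4 * a * c
    if disc < 0:
        return None  # the paths never come within two radii of each other
    t = (-b - math.sqrt(disc)) / (2 * a)
    return t if t >= 0 else None


def next_collision(balls, radius):
    """balls is a list of (position, velocity) pairs.

    Returns (time, i, j) for the earliest ball-ball collision, or None.
    """
    best = None
    for (i, (p1, v1)), (j, (p2, v2)) in combinations(enumerate(balls), 2):
        t = collision_time(p1, v1, p2, v2, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, i, j)
    return best
```

The simulation loop would then advance every ball by the returned time, resolve that one collision, and search again; nothing is ever scheduled, so nothing needs to be discarded.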
An obvious answer is "it all depends on your problem domain"
That's true. My comments apply only to your example of a billiards simulation. For other problem domains, different rules apply.

How to manage transactions, debt, interest and penalty?

I am making a BI system for a bank-like institution. This system should manage credit contracts, invoices, payments, penalties and interest.
Now, I need to make a method that builds an invoice. I have to calculate how much the customer has to pay right now. He has a debt, which he has to pay for. He also has to pay for the interest. If he was ever late with due payment, penalties are applied for each day he's late.
I thought there were 2 ways of doing this:
By having only 1 original state - the contract's original state. Each time the monthly payment the customer has to make is computed, take into account the payments actually made.
By constantly making intermediary states, going from the last intermediary state, and considering only the events that took place between the time of these 2 intermediary states. This means having a job that runs periodically (daily, monthly), takes the last saved state, applies the changes (due payments, actual payments, changes in global constants like the penalty rate, which is controlled by the Central Bank), and saves the resulting state.
The benefits of the first variant:
Always up to date. If changes were made with a date from the past (a guy came with a paid invoice 5 days after he made the payment to the bank), they will be correctly reflected in the results.
The flaws of the first variant:
Takes long to compute
Documents printed with the current results may differ if the correct data changes due to operations entered with a back date.
The benefits of the second variant:
Works fast, and aggregated data is always available for search and reports.
Simpler to compute
The flaws of the second variant:
Vulnerable to failed jobs.
Errors in the past propagate until the end, to the final results.
An intermediary result cannot be changed if new data from past transactions arrives (it can, but it's hard, and with many implications, so I'd rather mark it as taboo)
Jobs cannot be performed successfully and without problems if an unfinished transaction exists (an issued invoice that wasn't yet paid)
Is there any other way? Can I combine the benefits from these two? Which one is used in other similar systems you've encountered? Please share any experience.
Problems of this nature are always more complicated than they first appear. This is a consequence of what I like to call the Rumsfeldian problem of the unknown unknown. Basically, whatever you do now, be prepared to make adjustments for arbitrary future rules.
This is a tough proposition. Some future possibilities that may have a significant impact on your calculation model are back dated payments, adjustments and charges. Forgiven interest periods may also become an issue (particularly if back dated). So may requirements to provide various point-in-time (PIT) calculations based on either what was "known" at that PIT (past view of the past) or taking into account transactions occurring after the reference PIT that were back dated to a PIT before the reference (current view of the past). Calculations of this nature can be a real pain in the head.
My advice would be to calculate from "scratch" (i.e. first variant). Implement optimizations (e.g. second variant) only when necessary to meet performance constraints. Doing calculations from the beginning is a compute intensive model but is generally more flexible with respect to accommodating unexpected left turns.
If performance is a problem but the frequency of complicating factors (e.g. back dated transactions) is relatively low you could explore a hybrid model employing the best of both variants. Here you store the current state and calculate forward using only those transactions that posted since the last stored state to create a new current state. If you hit a "complication" re-do the entire account from the beginning to reestablish the current state.
Being able to accommodate the unexpected without triggering a re-write is probably more important in the long run than shaving calculation time right now. Do not place restrictions on your computation model until you have to. Saving current state often brings with it a number of built in assumptions and restrictions that reduce wiggle room for accommodating future requirements.
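The hybrid model described above might be sketched like this. Everything here is a placeholder: the `Transaction` fields, the function names, and especially the balance calculation, which is a plain sum of integer amounts standing in for the real interest and penalty rules. The point is only the control flow: roll forward from the snapshot when nothing is back dated, otherwise recompute from the beginning.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    effective: date  # date the transaction applies to
    posted: date     # date it was entered into the system
    amount: int      # in cents; positive = charge, negative = payment


def balance_from_scratch(transactions):
    """First variant: recompute from the contract's original state.

    Placeholder arithmetic; real interest and penalties would go here.
    """
    return sum(t.amount for t in transactions)


def current_balance(transactions, snapshot_balance, snapshot_date):
    """Hybrid: return (balance, recomputed_from_scratch)."""
    new = [t for t in transactions if t.posted > snapshot_date]
    if any(t.effective <= snapshot_date for t in new):
        # A back-dated transaction invalidates the stored state.
        return balance_from_scratch(transactions), True
    return snapshot_balance + sum(t.amount for t in new), False
```

After a full recomputation the stored snapshot would be replaced, so the fast path applies again until the next back-dated entry arrives.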

Time Calendar Data Structure

We are looking at updating (rewriting) our system which stores information about when people can reserve rooms etc. during the day. Right now we store the start and end time and the date the room is available in one table, and in another we store the individual appointment times.
On the surface it seemed like a logical idea to store the information this way, but as time progressed and the system came under heavy load, we began to realize that this data structure appears to be inefficient. (It becomes an intensive operation to search all rooms for available times and calculate when the rooms are available. If the room is available for a given time, is the time that it is available long enough to accommodate the requested time?)
We have gone around in circles about how to make the system more efficient, and we feel there has to be a better way to approach this. Does anyone have suggestions about how to go about this, or have any places where to look about how to build something like this?
I found this book to be inspiring and a must-read for any kind of database involving time management/constraints:
Developing Time-Oriented Database Applications in SQL
(Added by editor: the book is available online, via Richard Snodgrass's home page. It is a good book.)
@Radu094 has pointed you to a good source of information - but it will be tough going processing that.
At a horribly pragmatic level, have you considered recording appointments and available information in a single table, rather than in two tables? For each day, slice the time up into 'never available' (before the office opens, after the office closes - if such a thing happens), 'available - can be allocated', and 'not available'. These (two or) three classes of bookings would be recorded in contiguous intervals (with start and end time for each interval in a single record).
For each room and each date, it is necessary to create a set of 'not in use' bookings (depending on whether you go with 'never available', the set might be one 'available' record or it might include the early shift and late shift 'never available' records too).
Then you have to work out what questions you are asking. For example:
Can I book Room X on Day Y between T1 and T2?
Is there any room available on Day Y between T1 and T2?
At what times on Day Y is Room X still available?
At what times on Day Y is a room with audio-visual capabilities and capacity for 12 people available?
Who has Room X booked during the morning of Day Y?
This is only a small subset of the possibilities. But with some care and attention to detail, the queries become manageable. Validating the constraints in the DBMS will be harder. That is, ensuring that if the time [T1..T2) is booked, then no-one else books [T1+00:01..T2-00:01) or any other overlapping period. See Allen's Interval Algebra at Wikipedia and other places (including this one at uci.edu).
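The overlap constraint for half-open intervals reduces to a single comparison, which may help when writing the queries. A minimal sketch, with hypothetical function names and bookings represented as `(start, end)` pairs for one room on one day:

```python
from datetime import time


def overlaps(start_a, end_a, start_b, end_b):
    """True if half-open intervals [start_a, end_a) and [start_b, end_b)
    share at least one instant."""
    return start_a < end_b and start_b < end_a


def can_book(existing, start, end):
    """True if [start, end) overlaps none of the existing bookings."""
    return not any(overlaps(start, end, s, e) for s, e in existing)


# Room booked 09:00-10:00: a 10:00-11:00 request is fine because the
# intervals are half-open, but 09:30-10:30 is rejected.
existing = [(time(9, 0), time(10, 0))]
back_to_back = can_book(existing, time(10, 0), time(11, 0))
clash = can_book(existing, time(9, 30), time(10, 30))
```

The same condition (`start_a < end_b AND start_b < end_a`) can be used in a SQL `WHERE` clause to find conflicting rows.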

Resources