As is usual with cubes, the users want things that don't fit into a hierarchy to be displayed hierarchically. They'd like to see Day > Week > Month > Quarter > Year as the hierarchy, but the problem with weeks is they can be part of 1-2 months, not just 1 month (and by extension part of 2 quarters, semesters, years).
So my question is: how to set up the attribute relationships, and how to set up the hierarchy? Here is what I have, but I know it's not optimal.
Hierarchies (cycle == weeks):
Attribute Relationships:
There is no Cycle -> Year, because it is a many-to-many relationship
There are four types of weeks to concern yourself with:
Week of Year (1-53). Hierarchy: Year > Week
You should decide whether you want Week 1 to begin on Jan 1, or follow the ISO standard
Week of Month (1-5). Hierarchy: Year [> Quarter] > Month > Week
You should decide whether you want Week 1 to begin on the first day of the month, or the first Sunday/Monday of the month.
Week of Fiscal Year (1-53). Hierarchy: Fiscal Year > Week
Week of Fiscal Month (1-5). Hierarchy: Fiscal Year [> Quarter] > Month > Week
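To make the difference between these week definitions concrete, here is a minimal Python sketch. The function names are hypothetical, and the "simple" variants assume that week 1 starts on the first day of the year or month; fiscal weeks would need the fiscal calendar's start date, which is not given here.

```python
from datetime import date

def iso_week_of_year(d):
    """Return (ISO year, ISO week 1-53). ISO weeks start on Monday."""
    iso = d.isocalendar()
    return iso[0], iso[1]

def simple_week_of_year(d):
    """Week 1 starts on Jan 1; a new week starts every 7 days (1-53)."""
    return (d.timetuple().tm_yday - 1) // 7 + 1

def week_of_month(d):
    """Week 1 starts on the first day of the month (1-5)."""
    return (d.day - 1) // 7 + 1
```

Note that under the ISO rule a date can belong to a week of the neighbouring year: `iso_week_of_year(date(2021, 1, 1))` returns week 53 of 2020, which is exactly the many-to-many problem described in the question.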
You can leave the design as you have it, or even add the weeks to the "Calendar Time" hierarchy, thus having only one hierarchy as requested by the users. This is just what Microsoft calls a non-natural hierarchy. Analysis Services query response time is much slower for these than for natural hierarchies (those in which every child level has an attribute relationship to its parent level). That is why you get a warning in BIDS. You will just have to test whether the performance is good enough for your users. If it is, fine. If it is not, you can fall back from the one-hierarchy solution to the two-hierarchy solution described in your question. Then at least the first hierarchy would be fast, and only the second would perform badly.
There are other solutions, but they are all a bit unnatural. Some companies define months to have exactly four or five weeks so that the relationship can be set, but then a reporting month starts or ends a few days before or after the calendar month. See e.g. the 4-4-5 calendar article in Wikipedia.
Given the new legal regulations due to covid-19 the daycare of my child and all involved parents are overwhelmed and we need to come up with a schedule of when which child can be cared for.
Given the demanded care time per child (see below), we need an algorithm to optimize the following:
Minimal total contacts / fixed groups. If children meet, it is best if they stay in that group and do not see children of other groups.
While point 1 is more important, the second priority is to reduce the size of each group; perhaps this is better phrased as minimizing the number of different children each child meets.
Even less important: reduce total contact time.
(Maybe there are other requirements, that I overlooked?)
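One way to make these priorities comparable is to score a candidate schedule with a tuple that is compared lexicographically. The sketch below is only an illustration of that idea, not a solver: the schedule format (a list of `(hours, children)` slots) and the function names are assumptions, and it does not yet model "Required" versus "Nice to have" demands.

```python
from itertools import combinations

def contacts_per_child(schedule):
    """Map each child to the set of other children it meets in any slot."""
    contacts = {}
    for _hours, children in schedule:
        for child in children:
            contacts.setdefault(child, set())
        for a, b in combinations(sorted(children), 2):
            contacts[a].add(b)
            contacts[b].add(a)
    return contacts

def score(schedule):
    """Lower is better: (distinct contact pairs, most contacts of any child,
    total contact hours summed over all pairs)."""
    contacts = contacts_per_child(schedule)
    pairs = sum(len(c) for c in contacts.values()) // 2
    largest = max((len(c) for c in contacts.values()), default=0)
    pair_hours = sum(h * len(ch) * (len(ch) - 1) / 2 for h, ch in schedule)
    return pairs, largest, pair_hours

# Hypothetical week: child 1 meets child 2 twice and child 3 once.
week = [(4.5, {1, 2}), (4.5, {1, 2}), (4.5, {1, 3})]
print(score(week))
```

A search procedure (local search, integer programming, or brute force for 15 children) could then compare candidate schedules by this tuple.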
The demands are of the following nature (timespan and type):
| Case | Child | Timespan | Type |
|---|---|---|---|
| (1) Fixed time, Required | 1 | Monday, 8:30 - 13:00 | Required |
| (2) Fixed time, nice to have | 1 | Tuesday, 8:30 - 13:00 | Nice to have |
| (see (1)) | 1 | Tuesday, 13:00 - 16:00 | Required |
| (see (1)) | 1 | Thursday, 8:30 - 13:00 | Required |
| (see (1)) | 2 | Monday 8:30 - 13:00 | Required |
| (3) Flexible date, required | 2 | Any 2 other days 8:30 - 13:00 | Required |
| (4) Flexible date, nice to have | 2 | Any day 13:00 - 16:00 | Nice to have |
| (5) Flexible datetime, Required | 3 | 3 hours | Required |
| (6) Flexible datetime, Nice to have | 3 | 3 additional hours | Nice to have |
| ... | ... | ... | |
Required = The child must have daycare
Nice to have = daycare is demanded but not required. E.g. if child 1 meets child 2 and 3 on Monday and Thursday, it would be fine to meet the same children on Tuesday morning as well, but if it is a completely different group of children then this would not make sense.
All provided timespans must stay in one continuous piece (meaning that 3 hours cannot be split up into multiple slots).
Additional Information
There is only one room available.
There are 15 children in total.
If one solution is much better than another, it is ok-ish to violate "Required" demands in a few cases. We might be able to find a different solution for the parents in a few situations. The algorithm should hence contain a parameter like maxAllowedViolations (let's say it's 3), and it should compare how much better the solution is with the violations than without them.
The demand is provided per week and might change from week to week. I only know the demand one week in advance. The ideal grouping hence might change per week, but it might be better to respect the grouping of the last week as a guidance, because corona has about 7 to 10 days of incubation time.
The caregivers are tested for covid-19 twice a week, the children are not.
I do not care in which language or pseudo-code-ish way the algorithm is, but I will try to implement the algorithm in a web-based format so other daycare centers can use it as well.
I want to find the "same day of the week last year". I'm sure this question has a litany of subtleties that I've not yet begun to think about, but I believe it is a common one.
Here are a few use cases where someone might want to use this algorithm:
Example 1
I'm a manager at a Walmart. I want to find out how many kitten mittens I sold the same day last year. I know that kitten mitten purchases are closely related to day of week and week of year. Thus I want to know how many kitten mittens I need to stock for tomorrow, "the first Tuesday in January".
Example 2
I'm a nurse at a hospital. I want to determine how many patients are coming in each day next week so I can better align staff with bed demand. I know that there are strong trends with how many patients arrive at the hospital by day of week, and I want to see how many patients we had "the third Friday of November" last year.
I feel like this is a standard problem people must have come across. Is there a best approach to this challenge? I can imagine issues where the current year has five Fridays in, say, November and last year had only four, so you would not be able to report in that manner.
What is a language-independent (although if you're curious, I would be implementing this in M) approach to this algorithm?
There is a function that returns the weekday number. In Caché it's $zd(date,10); GT.M should have a similar one. All you need to do is correct your date using this function:
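set currentDate=+$h  ; today's date as a $HOROLOG day count
set currentWeekDay=$zd(currentDate,10)  ; weekday number of today
set dateAboutYearAgo=currentDate-365  ; roughly one year back
set weekDayAboutYearAgo=$zd(dateAboutYearAgo,10)  ; weekday number of that date
set sameWeekDayAboutYearAgo=dateAboutYearAgo-weekDayAboutYearAgo+currentWeekDay  ; move within that week to today's weekday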
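For readers who want a language-independent reference, here is a rough Python sketch of two possible readings of "same day last year". The function names are hypothetical. The first simply goes back 52 whole weeks, which always lands on the same weekday; the second looks for "the n-th weekday of the month" and returns `None` when last year had no such day (the five-Fridays case from the question).

```python
from datetime import date, timedelta

def same_weekday_last_year(d):
    """Go back 52 whole weeks (364 days): same weekday, about a year earlier."""
    return d - timedelta(weeks=52)

def nth_weekday_of_month(d):
    """Return (n, weekday) such that d is the n-th such weekday of its month."""
    return (d.day - 1) // 7 + 1, d.weekday()

def same_nth_weekday_last_year(d):
    """The n-th <weekday> of the same month one year earlier, or None."""
    n, wd = nth_weekday_of_month(d)
    first = date(d.year - 1, d.month, 1)
    offset = (wd - first.weekday()) % 7      # days until the first such weekday
    day = 1 + offset + 7 * (n - 1)
    try:
        return date(d.year - 1, d.month, day)
    except ValueError:                       # e.g. no fifth Friday last year
        return None
```

Which of the two is appropriate depends on whether the data follows the week-of-year pattern or the position within the month.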
Using the data from this online calculator, http://homer.freeshell.org/dd.cgi, I've successfully written a working version; however, its data is limited to the years 1500 to 2600. I want to modify it (and make a better one) so that I can calculate for any year > 2600.
Referring to Table X, is there actually a formula to calculate the base doomsday for all base centuries (above 2600)?
I've tried working it out myself by putting in centuries higher than this, e.g. 2700 gave me a base doomsday of '00', 2800 gave '02', 2900 back to '00' again...
Help appreciated.
As I understand it, that page's “Base Doomsday” is just an offset to allow for the four-hundred-year cycle of leap day calculations. So, you can extend it indefinitely into the future simply by adding blocks of four centuries.
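The four-century cycle can be written as a one-line formula. The sketch below uses the usual Gregorian century anchor days numbered 0 = Sunday to 6 = Saturday; that numbering is an assumption and may not match the page's own "Base Doomsday" encoding. The check function compares the formula with Python's `datetime`, using the fact that 4 April always falls on the year's doomsday.

```python
from datetime import date

def century_anchor(year):
    """Doomsday of the century year containing `year` (0=Sunday ... 6=Saturday)."""
    century = year // 100
    return (5 * (century % 4) + 2) % 7   # repeats every 400 years: 2, 0, 5, 3

def check_against_datetime(first=1600, last=9900):
    """Compare with 4 April of each century year (datetime uses Monday=0)."""
    return all(
        century_anchor(y) == (date(y, 4, 4).weekday() + 1) % 7
        for y in range(first, last + 1, 100)
    )
```

Because only `century % 4` matters, the same function works for any century after 2600 without extending a table.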
Are there any other calculators out there that do this?
Two common methods for calculating the day of the week given a date are Doomsday, which you are using, and Zeller's Congruence. www.merlyn.demon.co.uk provides some really interesting information on date/time calculations, various calendar systems and significant dates as they relate to calendar/date calculations.
The calculator at http://homer.freeshell.org/dd.cgi is the best at explaining the Doomsday algorithm cleanly and clearly for humans, with one small caveat.
If you input 2/29/1900, it says it's a Thursday. But there is no 2/29/1900, because 1900 is not a leap year.
Of course, if you input 1/35/2016, it will be "garbage in, garbage out" for you as well.
Imagine there are only 364 days in a year, then the day of week for each date will never change year after year, because mod(364,7)==0.
But we have 365 days a year, so the day steps forward 1 each year, that's where the second term mod(year, 7) comes from.
In addition, every 4 years there is a leap year, which contributes the last term, mod(year, 4).
But every 100 years, you subtract a leap year, and every 400 years, you add one leap year. That's where the first term "3,2,0,5" comes in.
You see, it's all because of this leap year, and mod(365,7)==1 business.
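As an illustration of how these terms combine, here is a small Python sketch of the Doomsday rule in its common textbook form (century anchor, plus the two-digit year, plus the number of leap years within the century). It may be arranged differently from the calculator's tables, and the function names are hypothetical. The call to `date(...)` is only there to reject impossible inputs such as 2/29/1900.

```python
from datetime import date

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def doomsday(year):
    """Weekday (0=Sunday ... 6=Saturday) shared by 4/4, 6/6, 8/8, 10/10, 12/12."""
    anchor = (5 * ((year // 100) % 4) + 2) % 7   # 400-year cycle: 3, 2, 0, 5
    yy = year % 100
    return (anchor + yy + yy // 4) % 7           # +1 per year, +1 more per leap year

def day_of_week(year, month, day):
    date(year, month, day)                       # raises ValueError for invalid dates
    leap = is_leap(year)
    # One doomsday date per month, January to December.
    known = [4 if leap else 3, 29 if leap else 28, 14, 4, 9, 6, 11, 8, 5, 10, 7, 12]
    return (doomsday(year) + day - known[month - 1]) % 7
```

For example, `day_of_week(2000, 1, 1)` gives 6 (Saturday), while `day_of_week(1900, 2, 29)` raises `ValueError` instead of returning a weekday for a date that does not exist.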
The mnemonic "7/11, 5 to 9" helps greatly in remembering table Z.
Here is my problem: I am designing an application that will allow students to select the classes they want to take for the semester and create potential schedule layouts for them. Each class typically has several possible sections that occur at different times.
So I am looking for a good data structure to use in order to develop an algorithm that will do this.
Is there a common way to do this? Any data structures and/or algorithms I can apply to this situation? I am just looking for a place to get started.
EDIT: The classes tend to be Monday, Wednesday, Friday or Tuesday, Thursday. In a lot of cases there are also labs or recitations that occur at various times during the week
Thanks,
Rob
This is a problem where genetic algorithms are suitable. At least, my University staff developed an algorithm based on it. Here are some of their papers where the technique is presented.
http://morgoth.zemris.fer.hr/people/Marko.Cupic/files/2009-425555.EvoCOP_2009.pdf
http://morgoth.zemris.fer.hr/people/Marko.Cupic/files/2009-422047.iti2009.pdf
I would use a tree
At each node (which represents a class) branch for each section and an additional branch for not taking the course
You can prune for scheduling conflicts at any time
This shouldn't get too big as long as you aren't storing these forever, and as long as you don't include too many courses per student per semester
The tree would be rooted at any arbitrary class. Each branch from root would be a section of that class (and the extra branch for not taking it)
Then at the end of each of these branches you have more nodes. These nodes would all represent the second class you're fitting in the schedule.
Each of these nodes would have another branch for each section of the second class. And so on.
ex:
                  math
             /     |     \
         2:00     1:00     blank
          |        |         |
         p.e      p.e       p.e
        /   \    /   \     /   \
     2:00 blank 2:00 blank 2:00 blank
       |
    conflict
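The tree above does not have to be stored explicitly; a recursive generator can walk it and prune conflicting branches as it goes. The following Python sketch assumes a hypothetical input format: each course is a `(name, sections)` pair, and each section maps a label to a list of `(day, start_hour, end_hour)` meetings.

```python
def overlaps(a, b):
    """Two meetings (day, start, end) clash if they share a day and overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def clashes(meetings, taken):
    return any(overlaps(m, t) for m in meetings for t in taken)

def build_schedules(courses, picked=None, taken=()):
    """Yield one dict per conflict-free leaf of the tree (None = course skipped)."""
    picked = picked or {}
    if not courses:
        yield dict(picked)
        return
    name, sections = courses[0]
    for label, meetings in sections.items():
        if not clashes(meetings, taken):          # prune conflicting branches
            yield from build_schedules(
                courses[1:], {**picked, name: label}, taken + tuple(meetings))
    # The extra "blank" branch: do not take this course at all.
    yield from build_schedules(courses[1:], {**picked, name: None}, taken)

# The example from the diagram: math at 2:00 or 1:00, p.e only at 2:00.
courses = [
    ("math", {"2:00": [("Mon", 14, 15)], "1:00": [("Mon", 13, 14)]}),
    ("p.e", {"2:00": [("Mon", 14, 15)]}),
]
for schedule in build_schedules(courses):
    print(schedule)
```

With this input the branch "math 2:00 + p.e 2:00" is pruned, leaving five of the six leaves.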
Does each class have the same schedule each day of the week? Or are they like mine were, where some were MWF, others TuTh, and others Sat?
If all the classes are at the same time every day of the week, the model's pretty easy. You need tables for students, classes, classSections, and studentSchedules.
For your classSection table, since the classes aren't the same time every day, if they're the same days each week, you can include fields for each day of the week (M-Sa), start time, class length (in hours,) and, of course, the classCodeID.
At a minimum:
Student
    ID
Class
    classCodeID
    description
classSection
    classCodeID
    classSectionID
    isOnM
    isOnTu
    isOnW
    isOnTh
    isOnF
    isOnSat
    startTime
    length
studentSchedule
    studentID
    classCodeID
    classSectionID
You could also normalize the days of the week instead of having them in the classSection table, but I like seeing the week mapped out in a bunch of checkboxes.
I see you have multiple start times per week, so you'll need another ID field in the classSection table.
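As a sketch of the normalized variant, the schema below moves the weekdays and start times out of classSection into a separate table with its own ID, so that a section can meet at different times on different days. The table `classMeeting` and its columns are hypothetical names chosen for this example; it uses Python's built-in `sqlite3` module with an in-memory database purely for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE student (id INTEGER PRIMARY KEY);
CREATE TABLE class (classCodeID TEXT PRIMARY KEY, description TEXT);
CREATE TABLE classSection (
    classSectionID INTEGER PRIMARY KEY,
    classCodeID TEXT NOT NULL REFERENCES class(classCodeID)
);
-- Hypothetical table: one row per weekly meeting of a section.
CREATE TABLE classMeeting (
    classMeetingID INTEGER PRIMARY KEY,
    classSectionID INTEGER NOT NULL REFERENCES classSection(classSectionID),
    dayOfWeek TEXT NOT NULL,       -- 'M', 'Tu', 'W', 'Th', 'F', 'Sat'
    startTime TEXT NOT NULL,       -- 'HH:MM'
    lengthHours REAL NOT NULL
);
CREATE TABLE studentSchedule (
    studentID INTEGER NOT NULL REFERENCES student(id),
    classSectionID INTEGER NOT NULL REFERENCES classSection(classSectionID),
    PRIMARY KEY (studentID, classSectionID)
);
"""

def create_db():
    """Create an empty in-memory database with the schema above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

The checkbox-style columns (isOnM, isOnTu, ...) remain a reasonable choice when every meeting of a section starts at the same time.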
The app you have seems ok, don't you have a data model already? Looks like you don't even need to be a student to see the class schedules.
I will try to explain what I want to accomplish. I am looking for an algorithm or approach, not the actual implementation in my specific system.
I have a table with actuals (incoming customer requests) on a daily basis. These actuals need to be "copied" into the next year, where they will be used as a basis for planning the amount of requests in the future.
The smallest timespan for planning, on a technical basis, is a "period", which consists of at least one day. A period always changes after a week or after a month. This means that if a week spans both May and June, it will be split into two periods.
Here's an example:
2010-05-24 - 2010-05-30 Week 21 | Period_Id 123
2010-05-31 - 2010-05-31 Week 22 | Period_Id 124
2010-06-01 - 2010-06-06 Week 22 | Period_Id 125
We did this to reduce the amount of data, because we have a few thousand items that each have 365 daily values. For planning, this is reduced to "a few thousand x 65" (or whatever the period count is per year). I can aggregate a month, or a week, by combining all periods that belong to that month or week. The important thing about this is that I could still use daily values, then find the corresponding period and add them there if necessary.
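For illustration, the period definition can be sketched in a few lines of Python. This assumes that weeks are ISO weeks starting on Monday, which matches the example dates above; the function name and the tuple layout are hypothetical, and period IDs are left out.

```python
from datetime import date, timedelta

def periods(year):
    """Split a year into (start, end, iso_week) periods that end at every
    week boundary (Sunday) and every month boundary."""
    result = []
    last = date(year, 12, 31)
    start = day = date(year, 1, 1)
    while day <= last:
        nxt = day + timedelta(days=1)
        # Close the period if tomorrow is a new year, a new month or a Monday.
        if nxt > last or nxt.month != day.month or nxt.weekday() == 0:
            result.append((start, day, day.isocalendar()[1]))
            start = nxt
        day = nxt
    return result
```

Calling `periods(2010)` yields, among others, the three periods from the example: 24 to 30 May (week 21), 31 May alone (week 22) and 1 to 6 June (week 22).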
What I need is an approach for aggregating the actuals for every (working) day, week or month into next year's equivalent period. My requirements are not fixed here. The actuals have a certain distribution, because there are certain deadlines and habits that are reflected in the data. I would like to preserve this as far as possible, but planning is never completely accurate, so I can make a compromise here.
Don't know if this is what you're looking for, but this is a strategy for calculating the forecasts using flexible periods:
First define a mapping for each day in next year to the corresponding day in this year. Then when you need a forecast for period x you take all days in that period and sum the actuals for the matching days.
With this you can precalculate every week/month but create new forecasts if the contents of periods change.
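A minimal Python sketch of this strategy follows. The day mapping used here (the same weekday 52 weeks earlier) is only one possible assumption; holidays or deadlines might need a hand-maintained mapping instead. `actuals` is assumed to be a dict from date to value.

```python
from datetime import date, timedelta

def matching_day(d):
    """Hypothetical mapping: the same weekday 52 weeks earlier."""
    return d - timedelta(weeks=52)

def forecast(period_start, period_end, actuals):
    """Sum the actuals of the days that map onto the given period."""
    total = 0
    day = period_start
    while day <= period_end:
        total += actuals.get(matching_day(day), 0)   # missing days count as 0
        day += timedelta(days=1)
    return total

# Actuals for 31 May to 6 June 2010 with values 1..7.
actuals = {date(2010, 5, 31) + timedelta(days=i): i + 1 for i in range(7)}
print(forecast(date(2011, 5, 30), date(2011, 6, 5), actuals))
```

Because the forecast is computed from daily values, it does not matter where next year's period boundaries fall.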
Map weeks to weeks. The first full week of this year to the first full week of the next. Don't worry about "periods" and aggregation; they are irrelevant.
Where a missing holiday leaves a hole in the data, just take the values for the same day of the previous week or the next week, and do the same at the beginning/end of the year.
Now for each day of the week, combine the results for the year and look for events more than, say, two standard deviations from the mean (if you don't know what that means then skip this step), and look for correlations with known events like holidays. If a holiday doesn't show an effect in this test then ignore it. If you find an effect, shift it to compensate for the different date next year. Don't worry about higher-order effects, you don't have enough data to pin them down.
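The outlier step could look roughly like this in Python. It is a sketch under simple assumptions: `actuals` is a dict from date to value, each weekday is treated as its own group, and the population standard deviation is used. The function name and the default threshold of 2.0 are illustrative.

```python
from datetime import date, timedelta
from statistics import mean, pstdev

def unusual_days(actuals, threshold=2.0):
    """Return the dates whose value is more than `threshold` standard
    deviations away from the mean of the same weekday."""
    by_weekday = {}
    for day, value in actuals.items():
        by_weekday.setdefault(day.weekday(), []).append((day, value))
    flagged = []
    for items in by_weekday.values():
        values = [v for _, v in items]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:                 # all values identical: nothing stands out
            continue
        flagged += [d for d, v in items if abs(v - mu) > threshold * sigma]
    return sorted(flagged)

# Twenty Mondays with 100 requests each, except one with none (a holiday, say).
mondays = [date(2010, 1, 4) + timedelta(weeks=i) for i in range(20)]
data = {d: 100 for d in mondays}
data[mondays[13]] = 0
print(unusual_days(data))
```

The flagged dates are then the candidates to compare against a list of known holidays or events.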
Now draw in periods wherever you like and aggregate all you want.
Don't make any promises about the accuracy of these predictions, there's no way to know it. Don't worry about whether this is the best possible way; it isn't, but it's as good as any you're likely to find. You can spend as much more time and effort fine-tuning this as you wish; it might raise expectations but it's not likely to make the results much more accurate-- it's about as likely to make them worse.
There is no a priori way to answer that question. You have to look at your data and use the results to decide which parameters are important (day of week, week number, month, season, temperature outside?).
For example, if many of your customers are Jewish or Muslim, then the Gregorian calendar, ISO week numbers and all that won't help you much, because Jewish and Muslim holidays (and so user behaviour) are determined using other calendars.
Another example - Trying to predict iPhone search volume according to last year's search doesn't sound like a good idea. It seems that the important timescales are much longer than a year (the technology becoming mainstream over the years) and much shorter than a year (Specific events that affect us for days-weeks).