Scheduling Algorithm with contiguous breaks - algorithm

I'm lost here. Here's the problem, and I think it's NP-hard. A center is staffed by a finite number of workers under the following conditions:
There are 3 shifts per day with 2 people in each shift
Each employee works 5 days straight and then gets 2 days off, working only one shift per day
So the problem is: how many workers do we need to keep the center active every day, and what does a feasible schedule look like?
Update:
Thanks for all the great answers. The closest I've come (with a randomized brute-force algorithm) is the following:
X 3 0
1 0 3
2 3 1
2 1 3
0 1 2
0 2 1
3 0 2
I've simplified the problem into batches of 2 people (0-3 represent 4 batches) in the hope of getting a feasible solution. X refers to a shift that has not been assigned (which was not the initial goal, but it looks like there may not be an alternative).

The constraints cannot be respected exactly as expressed in the question.
That's because the numbers don't add up (or rather "divide up").
Consequently, the problem should be reworded to require
exactly 3 shifts per day
exactly 2 workers per shift
workers work a maximum of 5 consecutive days
workers rest a minimum of 2 consecutive days
With the introduction of the minimum and maximum qualifiers, the minimum number of workers required is 9 (again assuming no part-time workers).
Note that although 9 appears to be an absolute minimum, given the need to cover 42 worker-shifts per week (3 shifts * 2 workers * 7 days) with workers who can each cover a maximum of 5 shifts per week (5 work days + 2 rest days = a week), there is no assurance that 9 would be sufficient given the consecutive work and/or rest day requirements.
This is how I figure...
8 workers aren't enough, and the following 9-worker line-up is an example of such a schedule.
To make things easy, I assigned all workers except #1 and #9 to an optimal schedule of exactly 5 days on and 2 days off; #1 and #9 work less. Of course many other arrangements would work (maybe this is what the OP sensed when hinting at an NP-complete problem). Also, each week's schedule is exactly the same for everyone, but that too could be changed (say, introducing some fairness by giving every worker a lighter week every once in a while, though this can make it harder to respect the 5-day maximum).
The sample schedule shows two consecutive weeks to make the consecutive work and rest days easier to see, but as said, every week is the same for everyone.
Max Conseq Ws Min Conseq Rs
Worker #1 RRWWWRW RRWWWRW 3 2
Worker #2 WWWWWRR WWWWWRR 5 2
Worker #3 WWWRRWW WWWRRWW 5 2
Worker #4 WWWRRWW WWWRRWW 5 2
Worker #5 WRRWWWW WRRWWWW 5 2
Worker #6 WRRWWWW WRRWWWW 5 2
Worker #7 RWWWWWR RWWWWWR 5 2
Worker #8 RWWWWWR RWWWWWR 5 2
Worker #9 WWRRRRW WWRRRRW 3 3
Nb of Ws 6666666 6666666
The tally at the bottom shows exactly 6 workers per day (respecting the need to cover 3 shifts with 2 workers each); the max and min columns on the right show that the maximum consecutive work and minimum consecutive rest requirements are respected.
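If you want to double-check a line-up like this mechanically, here is a small Python sketch (my own addition) that counts the daily coverage and each worker's longest work stretch; the week strings are copied from the table above:

week = ["RRWWWRW", "WWWWWRR", "WWWRRWW", "WWWRRWW", "WRRWWWW",
        "WRRWWWW", "RWWWWWR", "RWWWWWR", "WWRRRRW"]

# daily coverage: we need 6 workers (3 shifts x 2 people) every day
coverage = [sum(w[d] == "W" for w in week) for d in range(7)]
print(coverage)

def longest_run(s, ch):
    best = run = 0
    for c in s:
        run = run + 1 if c == ch else 0
        best = max(best, run)
    return best

for w in week:
    # repeat the week so work stretches that wrap the week boundary are counted
    print(w, "max consecutive W:", longest_run(w + w, "W"))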

3 shifts per day * 2 people per shift * (7 days per week / 5 working days per person) = 8.4 people (9 if part-time is not an option).

3 shifts x 7 days = 21
this does not divide evenly by either 5 or 2, so your constraints will not allow a complete filling of the slots.

OK - even though you have an answer, let me take a shot.
Let's take the general problem: 7 days x 3 shifts = 21 different shifts to fill
There are 7 possible employee schedules expressed as days on (1) & days off (0)
MTWTFSS
0011111
1001111
1100111
1110011
1111001
1111100
0111110
We want to minimize the number of scheduled employees while meeting the staffing requirements.
I have one integer variable per schedule type: the number of employees assigned that schedule. My optimization model is:
Min (number of employees)
Subject to: sum of (# of employees on each schedule * that schedule's 0/1 vector) = staff required for each day
and
number of employees scheduled is integer
You can change the = sign in the first constraint to a >=. Then you'll get a feasible solution, possibly with extra staff. You can solve this in Excel with the basic Solver add-in.
Let's say I need four employees on shift each day, but I'm willing to tolerate extra staff.
A solution using the schedules above is:
Number of staff by schedule type: 0,2,0,2,0,2,0
Schedule types 0011111,1001111,1100111,1110011,1111001,1111100,0111110
(In other words, 2 with schedule 1001111, 2 with schedule 1110011, and 2 more with schedule 1111100)
This results in one day (Monday) with two extra staff and 4 employees on all the other days.
Of course, this isn't a unique solution. There are at least 6 other solutions with two extra staff members. Constraint programming would be a better and much faster approach, since there will often be many feasible schedules.
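For what it's worth, here is a rough Python sketch of the same model using the PuLP library (assumed installed; the variable names are mine). It reproduces the four-per-day example with the >= form of the constraint:

import pulp

schedules = ["0011111", "1001111", "1100111", "1110011",
             "1111001", "1111100", "0111110"]
required = [4] * 7  # staff needed each day, as in the example

prob = pulp.LpProblem("staffing", pulp.LpMinimize)
x = [pulp.LpVariable("n_" + s, lowBound=0, cat="Integer") for s in schedules]

prob += pulp.lpSum(x)  # minimize the number of employees
for d in range(7):     # meet the staffing requirement each day (>= allows extras)
    prob += pulp.lpSum(int(s[d]) * x[i] for i, s in enumerate(schedules)) >= required[d]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for s, v in zip(schedules, x):
    if v.value():
        print(s, int(v.value()))
print("total:", int(pulp.value(prob.objective)))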

Related

Algorithm for dispatching a certain number of hours into 6 groups of at least 18 and 2 of at least 15

Here's my problem: I'm part of a group of 8 teachers and we have to divide up the classes for next year. Each class represents a certain number of hours. We have several 8h classes and 4h classes, but also 1h classes, 3.5h classes, etc.
There are two categories of teachers: those who have to work 18h (6 of us) and those who have to work 15h (2 of us).
The number of classes is normally large enough that dividing them up is easy and some of us end up with a few extra hours. But this year the number of hours we have to do (6 * 18 + 2 * 15) is almost equal to the total number of hours available, and it seems to me that it may be impossible to create a partition, knowing that nobody can do less than 18h or 15h.
So I'm trying to find an algorithm which could divide those hours into groups, so that 6 of us get at least 18h and the other two at least 15h.
I've seen some papers on the subject (https://www.ijcai.org/Proceedings/09/Papers/096.pdf for instance), but none fully answers my problem.
So if anyone could help me, I would be very grateful.
Thanks in advance.
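Not a full answer, but since this is essentially a feasibility question, here is a rough sketch of casting it as a 0/1 integer program with the PuLP library (assumed installed); the class hours below are made-up placeholders, not the asker's real data:

import pulp

classes = [8]*9 + [4]*8 + [3.5]*4 + [3]*6 + [1]*4   # placeholder hours, not real data
minimums = [18]*6 + [15]*2                          # six 18h teachers, two 15h teachers

prob = pulp.LpProblem("dispatch", pulp.LpMinimize)
prob += pulp.lpSum([])  # no objective: this is a pure feasibility check
# x[c][t] == 1 iff class c goes to teacher t
x = [[pulp.LpVariable(f"x_{c}_{t}", cat="Binary") for t in range(len(minimums))]
     for c in range(len(classes))]

for c in range(len(classes)):        # every class is assigned exactly once
    prob += pulp.lpSum(x[c]) == 1
for t, m in enumerate(minimums):     # every teacher reaches their minimum load
    prob += pulp.lpSum(classes[c] * x[c][t] for c in range(len(classes))) >= m

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.LpStatus[prob.status])    # "Optimal" means a valid dispatch exists
for t in range(len(minimums)):
    load = [classes[c] for c in range(len(classes)) if x[c][t].value() == 1]
    print("teacher", t, load, "=", sum(load), "h")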

Algorithm to group people by two factors

I have thought long about this but couldn't figure it out. I am looking for an algorithm (in any language) to group a bunch of people following these 2 rules:
Group by ascending skill level, which is represented by a number (the higher, the more skilled). The best and weakest in a group should not differ by more than 1 point, where possible.
Spread out people from the same country as far as possible, i.e. don't put people from the same country in the same group, while at the same time not breaking rule 1 above. Where possible, a group should not consist of people from a single country.
Each group can have at most 4 people, or 3 where the numbers don't divide evenly; e.g., if there are 18 people, they are split into 3 groups of 4 and 2 groups of 3.
Sample data (skill level followed by country):
5 US
5 US
5 US
5 US
6 GB
6 GB
6 GB
7 CN
7 CN
7 CN
7 CN
7 HK
8 US
8 US
8 US
8 CA
8 CN
8 CN
...to be grouped into 3 groups of 4 and 2 groups of 3.
Please help if you have any ideas.
Thank you in advance.
I would suggest the following.
First, aggregate the data by country and skill level (country, skill, count), so the data looks more like:
US 5 4
GB 6 3
. . .
Sort this by the highest ranking first.
Then use a greedy algorithm.
Determine the number of members in the group (either size or size - 1)
Take one from the first group (highest ranking).
Continue taking one from each subsequent group meeting the country condition (so you might need to skip the US).
That defines the first group.
Then repeat.
This is not guaranteed to be optimal. But then again, optimality is not defined for the problem. Which is more important? Country diversity or skill sameness?
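Here is a quick Python sketch of this greedy idea, using the sample data from the question; the tie-breaking and the top-up step when the country rule can't be met are my own choices:

from math import ceil

people = [(5, "US"), (5, "US"), (5, "US"), (5, "US"),
          (6, "GB"), (6, "GB"), (6, "GB"),
          (7, "CN"), (7, "CN"), (7, "CN"), (7, "CN"), (7, "HK"),
          (8, "US"), (8, "US"), (8, "US"), (8, "CA"), (8, "CN"), (8, "CN")]

def group_sizes(n, size=4):
    k = ceil(n / size)            # number of groups
    base, extra = divmod(n, k)    # sizes are base or base + 1
    return [base + 1] * extra + [base] * (k - extra)

def greedy_groups(people, size=4):
    pool = sorted(people, key=lambda p: -p[0])   # highest skill first
    groups = []
    for g in group_sizes(len(people), size):
        group, used = [], set()
        for p in pool[:]:                        # one per country where possible
            if len(group) == g:
                break
            if p[1] not in used:
                group.append(p)
                used.add(p[1])
                pool.remove(p)
        while len(group) < g:                    # relax the country rule if stuck
            group.append(pool.pop(0))
        groups.append(group)
    return groups

for g in greedy_groups(people):
    print(g)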

Greedy Algorithm Optimization

Consider a DVR recorder that has the duty to record television programs.
Each program has a starting time and ending time.
The DVR has the following restrictions:
It may only record up to two items at once.
If it chooses to record an item, it must record it from start to end.
Given the number of television programs and their starting/ending times, what is the maximum number of programs the DVR can record?
For example: Consider 6 programs:
They are written in the form:
a b c. a is the program number, b is starting time, and c is ending time
1 0 3
2 6 7
3 3 10
4 1 5
5 2 8
6 1 9
The optimal way to record is to have programs 1 and 3 recorded back to back on one tuner, and programs 4 and 2 recorded one after the other on the other tuner. 4 and 2 will be recording alongside 1 and 3.
This means the max number of programs is 4.
What is an efficient algorithm to find the max number of programs that can be recorded?
This is a classic example for a greedy algorithm.
You create an array with a tuple for each program in the input.
Now you sort this array by end time and sweep from left to right. If you can take the next program (i.e., at most one of the two recorders is busy at its start time), you increment the result counter and remember the end time on the recorder you used. If no recorder is free, you can't record the program and discard it.
This way you get the maximum number of programs that can be recorded in O(n log n) time.
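Here is a short Python sketch of that sweep for the two-recorder case, using the programs from the question; giving the program to the tuner that freed up latest is a standard tie-break for the multi-machine variant:

programs = [(0, 3), (6, 7), (3, 10), (1, 5), (2, 8), (1, 9)]  # (start, end)

def max_recordings(programs):
    free = [0, 0]    # time at which each tuner becomes free
    count = 0
    for start, end in sorted(programs, key=lambda p: p[1]):  # earliest end first
        # among the tuners already free at `start`, prefer the one that
        # finished latest, keeping the other free for earlier programs
        candidates = [i for i in (0, 1) if free[i] <= start]
        if candidates:
            tuner = max(candidates, key=lambda i: free[i])
            free[tuner] = end
            count += 1
    return count

print(max_recordings(programs))  # 4, matching the example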

Find minimum number of moves for Tower of London task

I am looking for a solution to a task similar to the Tower of Hanoi task; however, this differs from Hanoi in that the disks are not constrained by size. The Tower of London task I am creating has 8 disks, instead of the traditional 3 or 5 (as shown in the Wikipedia link). I am using PEBL software, which is "programmed primarily in C++ (although you do not need to know C++ to use PEBL), but also uses flex and bison (GNU versions of lex and yacc) to handle parsing."
Here is a video of what the task looks like in action: http://www.youtube.com/watch?v=IiBJ94HRpeM&noredirect=1
Each disk is a number, e.g., blue disk = 1, red disk = 2, etc.
1 \
2 ----\
3 ----/ 3 1
4 5 / 2 4 5
========= =========
The left side consists of the disks you have to move, to match the right side. There are 3 columns.
So if I am making it with 8 disks, I would create a trial to look like this:
1 \
2 ----\ 7 8
6 3 8 ----/ 3 6 1
7 4 5 / 2 4 5
========= =========
How do I figure out the minimum number of moves needed to make the left side look like the right? I don't need to use PEBL to code this, but I need to know the number, since I am calculating how close to the minimum a person gets on each trial.
The principle is easy, and it's called breadth-first search:
Each state has a certain number of successor states (defined by the moves possible).
You start out with a set of states that contains the initial state and step number 0.
If the end state is in the set of states, return the step number.
Increment the step number.
Rebuild the set of states by replacing the current states with each of their successor states.
Go to 2
So, in each step, compute the successor states of your currently available states and look if you reached the target state.
BUT, be warned, this can take a while and eat up a lot of memory!
You can optimize a bit in this case, since you can leave out the predecessor state.
Still, you will have 5 possible moves in most states, which means on the order of 5^N states to consider after N steps.
For example, your second example will need 10 moves, if I'm not mistaken. That gives you about 10 million states. Most contemporary computers will not be able to search beyond depth 15.
I think that an algorithm to find a solution would be easy and fast, but we have no proof this solution would be the shortest one.
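To make the idea concrete, here is a minimal Python sketch of the search. It assumes a state is a tuple of three piles (top of each pile last) and that each column has a fixed capacity; the capacities and the start/goal positions are hypothetical placeholders, so adapt them to your actual board:

CAPACITY = (3, 3, 3)  # placeholder column capacities

def moves(state):
    for i, src in enumerate(state):
        if not src:
            continue
        for j, dst in enumerate(state):
            if i != j and len(dst) < CAPACITY[j]:
                nxt = list(map(list, state))
                nxt[j].append(nxt[i].pop())   # move the top disk from i to j
                yield tuple(map(tuple, nxt))

def min_moves(start, goal):
    frontier, seen, steps = {start}, {start}, 0
    while frontier:
        if goal in frontier:
            return steps
        # keeping all visited states prunes more than just the predecessor
        frontier = {n for s in frontier for n in moves(s)} - seen
        seen |= frontier
        steps += 1
    return None  # goal not reachable

start = ((1, 2), (3, 4), (5,))   # hypothetical 5-disk position
goal = ((3,), (1, 2, 4), (5,))   # hypothetical target position
print(min_moves(start, goal))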

Finding similarities in a multidimensional array

Consider a sales department that sets a sales goal for each day. The total goal isn't important, but the overage or underage is. For example, if Monday of week 1 has a goal of 50 and we sell 60, that day gets a score of +10. On Tuesday, our goal is 48 and we sell 46 for a score of -2. At the end of the week, we score the week like this:
[0,0]=10,[0,1]=-2,[0,2]=1,[0,3]=7,[0,4]=6
In this example, both Monday (0,0) and Thursday and Friday (0,3 and 0,4) are "hot"
If we look at the results from week 2, we see:
[1,0]=-4,[1,1]=2,[1,2]=-1,[1,3]=4,[1,4]=5
For week 2, the end of the week is hot, and Tuesday is warm.
Next, if we compare weeks one and two, we see that the end of the week tends to be better than the first part of the week. So, now let's add weeks 3 and 4:
[0,0]=10,[0,1]=-2,[0,2]=1,[0,3]=7,[0,4]=6
[1,0]=-4,[1,1]=2,[1,2]=-1,[1,3]=4,[1,4]=5
[2,0]=-8,[2,1]=-2,[2,2]=-1,[2,3]=2,[2,4]=3
[3,0]=2,[3,1]=3,[3,2]=4,[3,3]=7,[3,4]=9
From this, we see that the "end of the week is better" theory holds true. But we also see that the end of the month is better than the start. Of course, we would next want to compare this month with the next, or compare groups of months for quarterly or annual results.
I'm not a math or stats guy, but I'm pretty sure there are algorithms designed for this type of problem. Since I don't have a math background (and don't remember any algebra from my earlier days), where would I look for help? Does this type of "hotspot" logic have a name? Are there formulas or algorithms that can slice and dice and compare multidimensional arrays?
Any help, pointers or advice is appreciated!
This data isn't really multidimensional, it's just a simple time series, and there are many ways to analyse it. I'd suggest you start with the Fourier Transform, it detects "rhythms" in a series, so this data would show a spike at 7 days, and also around thirty, and if you extended the data set to a few years it would show a one-year spike for seasons and holidays. That should keep you busy for a while, until you're ready to use real multidimensional data, say by adding in weather information, stock market data, results of recent sports events and so on.
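As a starting point, here is a small numpy sketch of that, run over the 20 daily scores from the question (note these sample weeks only contain 5 scored days each, so a weekly rhythm shows up at a period of about 5 samples rather than 7):

import numpy as np

scores = np.array([10, -2, 1, 7, 6, -4, 2, -1, 4, 5,
                   -8, -2, -1, 2, 3, 2, 3, 4, 7, 9])
spectrum = np.abs(np.fft.rfft(scores - scores.mean()))
periods = 1 / np.fft.rfftfreq(len(scores), d=1.0)[1:]  # in days; skip the DC term
strongest = sorted(zip(spectrum[1:], periods), reverse=True)[:3]
print(strongest)  # the three most prominent (strength, period) pairs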
The following might be relevant to you: Stochastic oscillators in technical analysis, which are used to determine whether a stock has been overbought or oversold.
I'm oversimplifying here, but essentially you have two moving calculations:
14-day stochastic: 100 * (today's closing price - low of last 14 days) / (high of last 14 days - low of last 14 days)
3-day stochastic: same calculation, but relative to 3 days.
The 14-day and 3-day stochastics will have a tendency to follow the same curve. Your stochastics will fall somewhere between 0 and 100; readings above 80 are considered overbought or bearish, while readings below 20 indicate oversold or bullish conditions. More specifically, when your 3-day stochastic "crosses" the 14-day stochastic in one of those regions, you have a predictor of momentum in the prices.
Although some people consider technical analysis to be voodoo, empirical evidence indicates that it has some predictive power. For what it's worth, a stochastic is a very easy and efficient way to visualize the momentum of prices over time.
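Here's a small illustration in Python with pandas, treating the daily scores from the question as a stand-in price series (the window lengths are the usual 14 and 3):

import pandas as pd

close = pd.Series([10, -2, 1, 7, 6, -4, 2, -1, 4, 5,
                   -8, -2, -1, 2, 3, 2, 3, 4, 7, 9])

def stochastic(series, window):
    lo = series.rolling(window).min()
    hi = series.rolling(window).max()
    return 100 * (series - lo) / (hi - lo)

fast, slow = stochastic(close, 3), stochastic(close, 14)
print(pd.DataFrame({"3-day": fast, "14-day": slow}).dropna())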
It seems to me that an OLAP approach (like pivot tables in MS Excel) fits the problem perfectly.
What you want to do is quite simple: calculate the autocorrelation of your data and look at the correlogram. From the correlogram you can see 'hidden' periods in your data, and you can then use this information to analyze those periods.
Here is the result - your numbers and their normalized autocorrelation.
10 1.000
-2 0.097
1 -0.121
7 0.084
6 0.098
-4 0.154
2 -0.082
-1 -0.550
4 -0.341
5 -0.027
-8 -0.165
-2 -0.212
-1 -0.555
2 -0.426
3 -0.279
2 0.195
3 0.000
4 -0.795
7 -1.000
9
I used Excel to get the values: put the sequence in column A, add the formula =CORREL($A$1:$A$20;$A1:$A20) to cell B1, and copy it down to B19. If you then add a line chart, you can nicely see the structure of the data.
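The same lagged correlations can be computed outside Excel; a quick Python equivalent with numpy (my own sketch) is:

import numpy as np

x = np.array([10, -2, 1, 7, 6, -4, 2, -1, 4, 5,
              -8, -2, -1, 2, 3, 2, 3, 4, 7, 9])
for lag in range(len(x) - 1):
    # correlate the series with itself shifted by `lag`
    r = np.corrcoef(x[:len(x) - lag], x[lag:])[0, 1]
    print(lag, round(float(r), 3))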
You can already make reasonable guesses about the periods of the patterns: you're looking at things like weekly and monthly. To look for weekly patterns, for example, just average all the Mondays together, and so on. The same goes for days of the month and for months of the year.
Sure, you could use a complex algorithm to find out that there's a weekly pattern, but you already know to expect that. If you think there really may be patterns buried there that you'd never suspect (there's a strange community of people who use a 5-day week and frequent your business), by all means use a strong tool; but if you know what kinds of things to look for, there's really no need.
Daniel has the right idea when he suggested correlation, but I don't think autocorrelation is what you want. Instead, I would suggest correlating each week with each other week. Peaks in your correlation, that is, values close to 1, suggest that the values of the weeks resemble each other (i.e., are periodic) for that particular shift.
For example, when you cross-correlate
0 0 1 2 0 0
with
0 0 0 1 1 0
the (circular) cross-correlation would be
2 0 0 0 1 3
The highest value is 3, which corresponds to circularly shifting the second array right by 5 (equivalently, left by 1)
0 0 0 1 1 0 --> 0 0 1 1 0 0
and then multiplying component-wise:
0 0 1 2 0 0
0 0 1 1 0 0
----------------------
0 + 0 + 1 + 2 + 0 + 0 = 3
Note that you can also create your own "fake" week and cross-correlate it with all your real weeks; the idea is that you are looking for "shapes" in your weekly values that correspond to the shape of the fake week, by looking for peaks in the correlation result.
So if you are interested in finding weeks that are strong near the end, you could use the "fake" week
-1 -1 -1 -1 1 1
and if you get a high response in the first value of the correlation, it means that the real week you correlated with has roughly this shape.
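If you want to sanity-check the circular cross-correlation example above, a few lines of numpy will do it (np.roll shifts the array circularly to the right):

import numpy as np

a = np.array([0, 0, 1, 2, 0, 0])
b = np.array([0, 0, 0, 1, 1, 0])

# dot product of a with b circularly shifted right by each amount
corr = [int(np.dot(a, np.roll(b, s))) for s in range(len(a))]
print(corr)                  # [2, 0, 0, 0, 1, 3]
print(int(np.argmax(corr)))  # 5: shift b right by 5 (= left by 1) to match a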
This is probably beyond the scope of what you're looking for, but one technical approach that would give you the ability to do forecasting, look at things like statistical significance, etc., would be ARIMA or similar Box-Jenkins models.
