I need to assign n people to m courses, where each person has specified a first and a second preference and each course has a maximum number of attendees. Each person can attend only one course. The algorithm should find a solution where
1. the number of people assigned to one of their preferred courses is maximized
2. the number of people assigned to their first choice is maximized (subject to 1, which has higher priority).
I guessed that this is not an uncommon problem, but a search returned nothing too useful, so I decided to roll my own. This is what I have come up with so far:
For courses which have fewer first preferences than their maximum number of attendees, assign all those persons to the course
For other courses: put random people who selected the course as their first choice into it until the course is full
For courses which have fewer second preferences than free spaces, assign all those persons to the course
For other courses: put random people who selected the course as their second choice into it until the course is full
For each person without a course: at their first (then second) preference, look for a person already placed there whose other choice still has free spots (if more than one is found, take the one whose other choice has the most free spots), move that person to their second choice, and give the freed spot to the unplaced person
I don't think this algorithm will always find the optimal solution, because of the last step. Any ideas how to improve it? Are there other algorithms which solve this problem?
Place everyone in their first choice course if possible.
If there is anyone who didn't get it, place them in their second choice.
Now, we might get some who didn't get any of their choices. (the "losers".)
Find a person who got his first choice course, which is also the second choice of the "loser". This guy will be reassigned to his second choice, while the "loser" takes his slot. If there is no such person, then your problem is unsolvable.
Note that this maximizes the number of people who got their first choice:
If you got your second choice, then it means either:
someone else already got your first choice as his first choice
someone else got your first choice as his second choice, but only because his first choice was taken as someone else's second choice, and whose first choice was filled with first choice students.
(Possibly that last bit is a bit hard to follow, so here's a rewording:)
For person X with first choice A and second choice B:
If X got choice B, then:
Y took X's slot in A, and Y's first choice is A.
Y took X's slot in A, and Y's second choice is A. Y's first choice is C, but C's slots are all filled with other students whose first choice is C as well.
This is similar to the stable marriage problem.
Given n men and n women, where each person has ranked all members of the opposite sex with a unique number between 1 and n in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. If there are no such people, all the marriages are "stable".
Update:
Taking @bdares's comments into account, and given that the courses have finite capacity, it would be hard to cast the problem as stable matching.
I would solve this as a linear program with the objective function based on the number of people who get their first choice and the course size as a constraint.
The first problem can be modeled as a maximum cardinality bipartite matching problem. The second problem can be modeled as a weighted bipartite matching problem (also known as the assignment problem).
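To make the weighted-matching idea concrete, here is a minimal sketch that treats it as a min-cost flow solved with successive shortest augmenting paths, in plain Python. It is only one possible way to do it. The names `assign_courses`, `prefs` and `capacity` are made up for illustration, and the costs (0 for a first choice, 1 for a second choice) are an assumption that matches the two goals in the question: pushing as much flow as possible places as many people as possible, and among those solutions the cheapest one has the most first choices.

```python
from collections import deque

def assign_courses(prefs, capacity):
    """prefs: {person: (first, second)}; capacity: {course: seats}.
    Returns {person: course}; people who cannot be placed are left out."""
    people, courses = list(prefs), list(capacity)
    S, T = 0, 1
    first_course = 2 + len(people)
    pid = {p: 2 + i for i, p in enumerate(people)}
    cid = {c: first_course + i for i, c in enumerate(courses)}
    n = first_course + len(courses)
    # Each edge is [target, residual capacity, cost, index of reverse edge].
    graph = [[] for _ in range(n)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for p, (first, second) in prefs.items():
        add_edge(S, pid[p], 1, 0)
        add_edge(pid[p], cid[first], 1, 0)       # first choice costs nothing
        if second != first:
            add_edge(pid[p], cid[second], 1, 1)  # second choice costs 1
    for c, seats in capacity.items():
        add_edge(cid[c], T, seats, 0)

    while True:
        # Queue-based Bellman-Ford: cheapest path in the residual graph.
        dist = [float("inf")] * n
        prev = [None] * n
        in_queue = [False] * n
        dist[S] = 0
        queue = deque([S])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not in_queue[v]:
                        queue.append(v)
                        in_queue[v] = True
        if prev[T] is None:
            break  # no augmenting path left: the flow is maximal
        v = T
        while v != S:  # push one unit of flow back along the path
            u, i = prev[v]
            graph[u][i][1] -= 1
            graph[v][graph[u][i][3]][1] += 1
            v = u

    result = {}
    for p in people:
        for v, cap, _, _ in graph[pid[p]]:
            if v >= first_course and cap == 0:  # saturated person->course edge
                result[p] = courses[v - first_course]
    return result

print(assign_courses({"ann": ("A", "B"), "bob": ("A", "B"), "cy": ("A", "C")},
                     {"A": 1, "B": 1, "C": 1}))
```

In the example all three people want course A, which has one seat. Everyone can only be placed if cy takes C, so the flow gives A to either ann or bob and B to the other.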
Sounds like a linear bottleneck assignment problem. While you are on the wiki page, check out the link provided in the reference section.
My problem is the following:
My team and I are moving to another part of the office and we have to decide where everybody sits. However, everybody has priorities. I would like to find an algorithm which helps us distribute the seats in a way that everybody is satisfied. (Or most of them, at least.)
I've started to implement my own algorithm where I ask everybody for 3 preferred options (the team consists of 10 people and there are 10 places) and use their "seniority" (the length of time they have spent in the team) to rank them against each other.
However, I got stuck. I tried to browse the internet for an algorithm which solves a similar problem but didn't find any.
What would be the best way to solve this? Is there any generally known algorithm which solves this or a similar problem?
Thank you!
What first comes to mind for me is the stable marriage problem. Here's the problem statement for the original algorithm:
Given n men and n women, where each person has ranked all members of the opposite sex in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. When there are no such pairs of people, the set of marriages is deemed stable.
Please read up on the Gale–Shapley algorithm, which is what I'll adapt for this problem.
Have each worker make a list of their rankings for all the spots. These will be the "men". Then, each spot will use the seniority ranking as their rankings for the "men". The spots will be the "women" in the Gale-Shapley algorithm.
You will get a seat assignment that has no "unstable marriage". Here's what an unstable marriage is:
There is an element A of the first matched set which prefers some given element B of the second matched set over the element to which A is already matched, and
B also prefers A over the element to which B is already matched.
In this context, an unstable marriage means that there is a worker-seat assignment between W1 and S1 such that another worker, W2, has ranked S1 higher than the seat W2 was given. Not only that, S1 has also ranked W2 higher than W1. Since the seats made their lists based on the seniority list, this means that W2 has higher seniority.
In effect, this means that you'll get a seating assignment such that no worker has a seat that someone else with higher seniority wants "more".
The bottom of that Wiki article mentions packages in R and Python that have already implemented the algorithm, so it's just up to you to input the preference lists.
Disclaimer: This is probably not the most efficient algorithm. All the seats have the same ranking list, so there's probably a shortcut somewhere. However, it's easier to use a cannon to kill a fly, if the cannon is already written in R/Python for you. Also, this is the only algorithm I remember from uni, so this is the only hammer I have for any nail.
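For readers who would rather not pull in a package, here is a minimal sketch of the adaptation described above, with workers proposing to seats and every seat ranking workers by seniority. The names `gale_shapley_seats`, `worker_prefs` and `seniority` are made up for illustration. It assumes every worker ranks every seat and that there are at least as many seats as workers.

```python
def gale_shapley_seats(worker_prefs, seniority):
    """worker_prefs: {worker: [seats, most preferred first]} (complete lists).
    seniority: list of workers, most senior first; every seat uses this ranking.
    Returns {worker: seat}."""
    rank = {w: i for i, w in enumerate(seniority)}  # lower index = more senior
    next_choice = {w: 0 for w in worker_prefs}      # next seat each worker will ask for
    holder = {}                                     # seat -> worker currently holding it
    free = list(worker_prefs)
    while free:
        w = free.pop()
        seat = worker_prefs[w][next_choice[w]]
        next_choice[w] += 1
        current = holder.get(seat)
        if current is None:
            holder[seat] = w                 # seat was empty
        elif rank[w] < rank[current]:
            holder[seat] = w                 # more senior worker takes the seat
            free.append(current)
        else:
            free.append(w)                   # rejected, will try the next seat
    return {w: s for s, w in holder.items()}

prefs = {
    "ana": ["s1", "s2", "s3"],
    "ben": ["s1", "s3", "s2"],
    "cat": ["s1", "s2", "s3"],
}
print(gale_shapley_seats(prefs, ["ana", "ben", "cat"]))
# ana gets s1, ben gets s3, cat gets s2
```

Because all seats share one ranking, the outcome is the same as letting people pick in seniority order, which is the shortcut hinted at in the disclaimer.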
I decided to implement a brute force solution as lots of the comments suggested.
So:
I asked everybody on the team to rank the seats (from 10 down to 1; I use this rank as the score of each "teamMember-seat" pairing, 10 being the highest score)
collected all of the "teamMember-seat" pairings with scores, e.g. name:Steve, seat:seat1, score:5 (the score comes from the ranking given in the previous step)
generated all the possible seating combinations from these
e.g.
List1: [name:Steve seat:seat1 score:5], [name:John seat:seat2 score:3] ... [name:X seat:seatY score:X]
List2: [name:Steve seat:seat2 score:4], [name:John seat:seat1 score:4] ... [name:X seat:seatY score:X]
...
ListX: [],[]...
chose the "teamMember-seat" list(s) with the highest score (the score of a list is calculated by summing the scores of its "teamMember-seat" pairings)
if there are 2 lists with equal scores, the algorithm chooses the one where the most senior team members get the seats they prefer most
if there is still more than one list (combination), the algorithm chooses one randomly
I'm sure there are some better algorithms to do this as some of you suggested but I've run out of time.
I didn't post the code since it is really long and not too complicated to implement. However, if you need it, don't hesitate to drop a private message.
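Since the code was not posted, here is a short sketch of what such a brute force could look like. This is not the original code; `best_seating`, `scores` and `seniority` are hypothetical names. Ties on the total are broken by comparing the members' scores in seniority order. If several seatings are still tied, this sketch keeps the first one found instead of picking randomly.

```python
from itertools import permutations

def best_seating(scores, seniority):
    """scores: {member: {seat: score}}, higher is better.
    seniority: list of members, most senior first.
    Returns {member: seat} with the highest total score."""
    members = list(seniority)
    seats = list(scores[members[0]])
    best_key, best = None, None
    for perm in permutations(seats, len(members)):
        per_member = [scores[m][s] for m, s in zip(members, perm)]
        # Compare by total first, then by the senior members' own scores.
        key = (sum(per_member), per_member)
        if best_key is None or key > best_key:
            best_key, best = key, dict(zip(members, perm))
    return best

scores = {
    "steve": {"seat1": 3, "seat2": 2, "seat3": 1},
    "john":  {"seat1": 3, "seat2": 1, "seat3": 2},
    "mary":  {"seat1": 2, "seat2": 3, "seat3": 1},
}
print(best_seating(scores, ["steve", "john", "mary"]))
# {'steve': 'seat1', 'john': 'seat3', 'mary': 'seat2'}
```

With 10 people and 10 seats this walks through 10! = 3,628,800 seatings, which is still manageable, but the approach stops being practical quickly as the team grows.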
Say you have a class with 5 sections: A,B,C,D,E. Each section meets at different times, thus students registering for the course will have preference for which section they will take (they can only take one section). When students register for the course, they list 3 sections they would prefer to take, in order of preference.
Each section has n students. Let's say for simplicity that exactly n*5 students have registered for the course.
So, the question is: How do you efficiently match students to their preferred section?
I've seen some questions with similar matching scenario questions, but none quite fit and I'm afraid I don't know enough about algorithms to make up my own. BTW, this is a real problem and I know the department in question takes a few days to do it by hand.
To determine whether each student can be assigned to a preferred section, construct an integer-valued maximum flow in the following network, where the three Xs stand for capacity-1 arcs from students to the sections they prefer (polynomial-time via, e.g., the push-relabel algorithm). There's a solution if and only if the maximum flow moves m = n*5 units; then the assignments are determined by which arc from each student is saturated.
      capacity-1 arcs                 capacity-n arcs
            |                               |
            v                               v
          student 1
        / student 2           section1
       /      .          X    section2 \
source <      .          X    section3  > sink
       \      .          X    section4 /
        \ student m-1         section5
          student m
To take the order of preference into account, switch to solving a min-cost flow problem, still poly-time solvable (though you may find the network simplex mode of a general-purpose LP solver easier to use), which allows a cost to be specified for each arc. Choose a score for each preference level depending on what you think is fair.
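As a rough illustration of the feasibility check (the max-flow step, before any costs are added), here is a sketch that uses simple augmenting paths instead of push-relabel, since the network is small and mostly unit-capacity. The function and parameter names are invented for this example.

```python
def assign_sections(prefs, capacity):
    """prefs: {student: [preferred sections]}; capacity: {section: seats}.
    Returns {student: section} if every student can get a preferred
    section, otherwise None."""
    members = {sec: [] for sec in capacity}  # section -> students placed there

    def try_place(student, visited):
        # Depth-first search for an augmenting path starting at `student`.
        for sec in prefs[student]:
            if sec in visited:
                continue
            visited.add(sec)
            if len(members[sec]) < capacity[sec]:
                members[sec].append(student)  # free seat: take it
                return True
            # Section is full: try to move one occupant to another section.
            for i, other in enumerate(members[sec]):
                if try_place(other, visited):
                    members[sec][i] = student
                    return True
        return False

    for student in prefs:
        if not try_place(student, set()):
            return None  # the maximum flow is smaller than the number of students
    return {st: sec for sec, sts in members.items() for st in sts}

prefs = {"s1": ["A", "B"], "s2": ["A"], "s3": ["A", "B"]}
print(assign_sections(prefs, {"A": 1, "B": 2}))
# {'s2': 'A', 's1': 'B', 's3': 'B'}
```

This only answers whether everyone can get one of their listed sections. It ignores the order of preference, which is what the min-cost variant is for.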
I'm positive that this has been asked before, but scheduling problems are like snowflakes, and I can't find the old question by keywords alone.
Maybe you could randomly distribute them into sections. Next you select random pairs of student and consider if swapping them improves the distribution (does it increase the match with their preferences?). You can iterate until there is no improvement possible for X iterations.
This is obviously a very naive approach, but if your sample is small it might converge quickly. You cannot guarantee that you have the optimal solution; for that you'd need a brute-force approach, which is probably not feasible.
Is there a scoring system in which, say, student 1 being in section A scores 20, while student 2 being in section A scores 15?
I'm asking because if there's only one spot left in section A, and both student 1 and student 2 have section A as their most preferred, then whoever registered first gets the spot, instead of whoever is the best fit (higher score).
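A minimal sketch of that random-swap idea, under a few assumptions of my own: every section holds the same number of students (`size`), a student's cost is the position of the section in their preference list (unlisted sections cost the length of the list), and the search stops after `max_stale` consecutive swaps that do not help. All names are hypothetical.

```python
import random

def local_search(prefs, sections, size, max_stale=1000, seed=0):
    """prefs: {student: [sections, best first]}; sections: list of section names.
    Needs at least 2 students and at most len(sections) * size of them.
    Returns {student: section}."""
    rng = random.Random(seed)
    students = list(prefs)
    rng.shuffle(students)
    # Random initial distribution that respects the section size.
    assigned = {st: sections[i // size] for i, st in enumerate(students)}

    def cost(st, sec):
        p = prefs[st]
        return p.index(sec) if sec in p else len(p)

    stale = 0
    while stale < max_stale:
        a, b = rng.sample(students, 2)
        before = cost(a, assigned[a]) + cost(b, assigned[b])
        after = cost(a, assigned[b]) + cost(b, assigned[a])
        if after < before:
            # Swapping keeps section sizes unchanged and lowers the total cost.
            assigned[a], assigned[b] = assigned[b], assigned[a]
            stale = 0
        else:
            stale += 1
    return assigned

prefs = {"a": ["A"], "b": ["A"], "c": ["B"], "d": ["B"]}
print(local_search(prefs, ["A", "B"], size=2))
```

Pairwise swaps can get stuck in a local optimum on larger inputs, so this is a heuristic and not a guarantee.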
If there is no scoring, you can just loop through the students and put them in the section they prefer. If the first one is full, try their second preference, then the next. If all three sections the student prefers are filled, just enroll them to one that isn't filled.
(It'd be different if there is scoring since you have to go with a priority queue for each section and maximize that.)
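The no-scoring loop could look roughly like this. It is a sketch with hypothetical names, where `prefs` is assumed to be in registration order (Python dicts keep insertion order).

```python
def first_come_first_served(prefs, capacity):
    """prefs: {student: [sections, best first]}, in registration order.
    capacity: {section: seats}. Returns {student: section}."""
    remaining = dict(capacity)
    assigned = {}
    for student, choices in prefs.items():
        # Preferred sections first, then any other section as a fallback.
        fallback = [sec for sec in remaining if sec not in choices]
        for sec in list(choices) + fallback:
            if remaining[sec] > 0:
                remaining[sec] -= 1
                assigned[student] = sec
                break
    return assigned

prefs = {name: ["A", "B", "C"] for name in ["s1", "s2", "s3", "s4"]}
print(first_come_first_served(prefs, {"A": 1, "B": 1, "C": 1, "D": 1}))
# {'s1': 'A', 's2': 'B', 's3': 'C', 's4': 'D'}
```

As the example shows, the last student to register ends up in a section they never asked for, which is exactly the fairness issue raised above.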
Hi, I am building a program in which students sign up for an exam that is conducted in several cities throughout the country. While signing up, students provide a list of three cities where they would like to take the exam, in order of preference. So a student may say his first preference for an exam centre is New York, followed by Chicago, followed by Boston.
Now, keeping in mind that the exam centres have limited capacity, they cannot accommodate every student's first choice. We would, however, try to give as many students as possible their first or second choice of centre, and as far as possible avoid giving a student their third-choice centre.
Any ideas for a sorting algorithm that would make this process more efficient? The simple way to do this would be to first go through the list of students' first choices and allot as many as possible, then go through the list of second choices and allot those. However, this may lead to the students who are first in the list getting their first-choice centre and the last students getting their third choice or, worse, none of their choices. Is there anything that could make this more efficient?
Sounds like a variant of the classic stable marriages problem or the college admission problem. The Wikipedia lists a linear-time (in the number of preferences, O(n²) in the number of persons) algorithm for the former; the NRMP describes an efficient algorithm for the latter.
I suspect that if you randomly generate preferences of exam places for students (one Fisher–Yates shuffle per exam place) and then apply the stable marriages algorithm, you'll get a pretty fair and efficient solution.
This problem could be formulated as an instance of minimum cost flow. Let N be the number of students. Let each student be a source vertex with capacity 1. Let each exam center be a sink vertex with capacity, well, its capacity. Make an arc from each student to his first, second, and third choices. Set the cost of first choice arcs to 0; the cost of second choice arcs to 1; and the cost of third choice arcs to N + 1.
Find a minimum-cost flow that moves N units of flow. Assuming that your solver returns an integral solution (it should; flow LPs are totally unimodular), each student flows one unit to his assigned center. The costs minimize the number of third-choice assignments, breaking ties by the number of second-choice assignments.
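To see why the cost N + 1 gives third choices strict priority, here is a short worked argument. The symbols s and t (the number of students who get their second and third choice, respectively) are introduced only for this explanation.

```latex
\begin{align*}
C &= 0 \cdot (N - s - t) + 1 \cdot s + (N+1)\, t = s + (N+1)\, t,
    \qquad 0 \le s \le N. \\
\intertext{Take two feasible assignments with $t_1 < t_2$. Then}
C_1 &= s_1 + (N+1)\, t_1 \;\le\; N + (N+1)\, t_1 \;<\; (N+1)(t_1 + 1)
     \;\le\; (N+1)\, t_2 \;\le\; C_2 .
\end{align*}
```

So an assignment with fewer third choices is always cheaper, no matter how many second choices it uses, and among assignments with the same t the one with the smaller s wins.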
There is a class of algorithms that address this kind of allocation of limited resources, called auctions. Basically, in this case each student would get a certain amount of money (a number they can spend), and your software would then place bids on behalf of those students. You might use a formula based on preferences.
An example would be for tutorial times. If you put down your preferences, then you would effectively bid more for those times and less for the times you don't want. So if you don't get your preferences you have more "money" to bid with for other tutorials.
We have a simulation program where we take a very large population of individual people and group them into families. Each family is then run through the simulation.
I am in charge of grouping the individuals into families, and I think it is a really cool problem.
Right now, my technique is pretty naive/simple. Each individual record has some characteristics, including married/single, age, gender, and income level. For married people I select an individual and loop through the population and look for a match based on a match function. For people/couples with children I essentially do the same thing, looking for a random number of children (selected according to an empirical distribution) and then loop through all of the children and pick them out and add them to the family based on a match function. After this, not everybody is matched, so I relax the restrictions in my match function and loop through again. I keep doing this, but I stop before my match function gets too ridiculous (marries 85-year-olds to 20-year-olds for example). Anyone who is leftover is written out as a single person.
This works well enough for our current purposes, and I'll probably never get time or permission to rework it, but I at least want to plan for the occasion or learn some cool stuff - even if I never use it. Also, I'm afraid the algorithm will not work very well for smaller sample sizes. Does anybody know what type of algorithms I can study that might relate to this problem or how I might go about formalizing it?
For reference, I'm comfortable with chapters 1-26 of CLRS, but I haven't really touched NP-Completeness or Approximation Algorithms. Not that you shouldn't bring up those topics, but if you do, maybe go easy on me because I probably won't understand everything you are talking about right away. :) I also don't really know anything about evolutionary algorithms.
Edit: I am specifically looking to improve the following:
Fewer ridiculous marriages.
Fewer single people at the end.
Perhaps what you are looking for is cluster analysis?
Let's try to think of your problem like this (starting by solving the spouse matching):
If you were to have a matrix where each row is a male and each column is a female, and every cell in that matrix is the match function's returned value, what you are now looking for is selecting cells so that there won't be a row or a column in which more than one cell is selected, and the total sum of all selected cells should be maximal. This is very similar to the N Queens Problem, with the modification that each allocation of a "queen" has a reward (which we should maximize).
You could solve this problem by using a graph where:
You have a root,
each of the first row's cells' values is an edge's weight leading to first depth vertices
each of the second row's cells' values is an edge's weight leading to second depth vertices..
Etc.
(Notice that when you find a match to the first female, you shouldn't consider her anymore, and so for every other female you find a match to)
Then finding the maximum allocation can be done by BFS, or better still by A* (notice A* typically looks for minimum cost, so you'll have to modify it a bit).
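The matrix formulation above (pick at most one cell per row and per column so that the sum is maximal) is known as the assignment problem. As an alternative to the tree search, here is a sketch that solves it with the Hungarian algorithm in plain Python. Note that this swaps in a different technique from the BFS/A* search described above. The name `max_weight_matching` is made up, and the sketch assumes the matrix has no more rows than columns (transpose it otherwise).

```python
def max_weight_matching(score):
    """score[i][j]: match value of row i with column j (rows <= columns).
    Returns a list of (row, column) pairs maximising the total score."""
    n, m = len(score), len(score[0])
    INF = float("inf")
    cost = [[-x for x in row] for row in score]  # maximise = minimise negated
    u = [0] * (n + 1)    # row potentials
    v = [0] * (m + 1)    # column potentials
    p = [0] * (m + 1)    # p[j] = row matched to column j (1-based, 0 = none)
    way = [0] * (m + 1)  # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break  # reached an unmatched column
        while j0:      # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return sorted((p[j] - 1, j - 1) for j in range(1, m + 1) if p[j])

score = [[7, 5, 1, 0],
         [6, 8, 2, 3],
         [5, 9, 6, 1]]
print(max_weight_matching(score))  # [(0, 0), (1, 1), (2, 2)]
```

Every row ends up matched, so to avoid "ridiculous" pairs one option (an assumption on my part) is to drop pairs whose score falls below some threshold afterwards and write those people out as singles.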
For matching between couples (or singles, more on that later..) and children, I think KNN with some modifications is your best bet, but you'll need to optimize it to your needs. But now I have to relate to your edit..
How do you measure your algorithm's efficiency?
You need a function that receives the expected distribution of all states (single, married with one child, single with two children, etc.), and the distribution of all states in your solution, and grades the solution accordingly. How do you calculate the expected distribution? That's quite a bit of statistics work..
First you need to know the distribution of all states (single, married.. as mentioned above) in the population,
then you need to know the distribution of ages and genders in the population,
and last thing you need to know - the distribution of ages and genders in your population.
Only then, according to those three, can you calculate how many people you expect to be in each state.. And then you can measure the distance between what you expected and what you got... That is a lot of typing.. Sorry for the general parts...
I'm trying to write a program to automate a ticket draft.
We have a certain number of season ticket passes and want to split up the tickets among a group of people. There are X number of games, Y number of season passes, and Z number of people. Each of Z people has ranked the X games.
My code basically goes through the draft order and back picking out the tickets from their ranking if available, otherwise, picking the next highest ranking. For the most part it works. The problem is, there's a point where most of the tickets are taken and the remaining tickets left are ones you already have so you just don't pick them. People therefore have different numbers of tickets. Is there a good way to get around this?
If you have X games and Y season passes, presumably there are X*Y tickets available to give to the Z people, right?
This sounds like it could be treated as an optimization problem, but to do so you have to identify your main goals. I'm guessing you want each person to receive X*Y / Z tickets (split them evenly), but maybe not. I'm guessing you also want to maximize the aggregate satisfaction (defined in some way according to the rankings) with the tickets. You would probably want to give a large penalty in satisfaction for a person if he receives more than 1 ticket for the same game. I believe this last aspect might be why the straight draft approach is not the best, but I could be mistaken.
Once you are clear on what you are trying to optimize (if this is indeed an optimization problem), then you can consider the best approach to the problem. This could be your own custom-built solution, or you could try an existing technique (genetic algorithm, etc.). Before doing so though it is important that you frame the problem properly.
If there were no preferences involved, this would be a straightforward maximum-flow problem (http://en.wikipedia.org/wiki/Maximum_flow_problem), set up as follows:
Create a source vertex A. From A, create Z vertices, one for each person. The capacity can be infinite (or very, very large). Create a sink B, and create X vertices, one for each game, linked to B; the capacity should be Y (you have Y tickets per game). From each person, link to each game they've ranked, with capacity 1.
If you look at the wiki link above, there are about 10 algorithms to solve this basic problem. Find one you understand and can implement yourself, because you'll need to modify it slightly. I'm not familiar with all of them, but the ones I know about have a step 'pick an edge' or 'pick a path.' You should modify the 'how you pick an edge' logic to take the priority ordering of the games into account. I'm not sure exactly what the ordering should be (you'll probably need to experiment), but if you say the lowest ranked game is 1, the next is 2, up to X, then a score like 'ranking of the edge - number of games the person is already signed up for' might work.
I think this is a variant of the Stable Marriage Problem or the Stable Roommates Problem, for which algorithms are known.