My problem is the following:
My team and I are moving to another part of the office, and we have to decide where everybody sits. However, everybody has preferences. I would like to find an algorithm that helps us distribute the seats so that everybody is satisfied (or at least most of them).
I've started to implement my own algorithm, in which I ask everybody for 3 preferred options (the team consists of 10 people and there are 10 places) and use their "seniority" (the length of time they have spent in the team) to rank them against each other.
However, I got stuck. I tried to browse the internet for an algorithm that solves a similar problem but didn't find any.
What would be the best way to solve this? Is there any generally known algorithm which solves this or a similar problem?
Thank you!
What first comes to mind for me is the stable marriage problem. Here's the problem statement for the original algorithm:
Given n men and n women, where each person has ranked all members of the opposite sex in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. When there are no such pairs of people, the set of marriages is deemed stable.
Please read up on the Gale–Shapley algorithm, which is what I'll adapt for this problem.
Have each worker make a list of their rankings for all the spots. These will be the "men". Then, each spot will use the seniority ranking as their rankings for the "men". The spots will be the "women" in the Gale-Shapley algorithm.
You will get a seat assignment that has no "unstable marriage". Here's what an unstable marriage is:
There is an element A of the first matched set which prefers some given element B of the second matched set over the element to which A is already matched, and
B also prefers A over the element to which B is already matched.
In this context, an unstable marriage means that there is a worker-seat assignment between W1 and S1 such that another worker, W2, has ranked S1 higher than the seat W2 was given. Not only that, S1 has also ranked W2 higher than W1. Since the seats made their lists based off the seniority list, it means that W2 has higher seniority.
In effect, this means that you'll get a seating assignment such that no worker has a seat that someone else with higher seniority wants "more".
The bottom of that Wiki article mentions packages in R and Python that have already implemented the algorithm, so it's just up to you to input the preference lists.
Disclaimer: This is probably not the most efficient algorithm. All the seats have the same ranking list, so there's probably a shortcut somewhere. However, it's easier to use a cannon to kill a fly, if the cannon is already written in R/Python for you. Also, this is the only algorithm I remember from uni, so this is the only hammer I have for any nail.
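A minimal sketch of the adaptation described above, in case it helps. It assumes every worker ranks all of the seats (not just three) and that there are as many seats as workers. The function name `assign_seats` and the input formats are made up for illustration and do not come from any of the packages mentioned.

```python
def assign_seats(worker_prefs, seniority):
    """Worker-proposing Gale-Shapley where every seat ranks workers by seniority.

    worker_prefs: dict worker -> list of ALL seats, most preferred first.
    seniority: list of workers, most senior first (hypothetical input format).
    """
    rank = {w: i for i, w in enumerate(seniority)}  # lower value = more senior
    next_choice = {w: 0 for w in worker_prefs}      # index of the next seat to propose to
    holder = {}                                     # seat -> worker currently holding it
    free = list(worker_prefs)
    while free:
        w = free.pop()
        seat = worker_prefs[w][next_choice[w]]
        next_choice[w] += 1
        current = holder.get(seat)
        if current is None:
            holder[seat] = w              # seat was empty
        elif rank[w] < rank[current]:
            holder[seat] = w              # the seat "prefers" the more senior worker
            free.append(current)
        else:
            free.append(w)                # rejected; w proposes to the next seat later
    return {w: s for s, w in holder.items()}


prefs = {
    "Ann": ["s1", "s2", "s3"],
    "Bob": ["s1", "s3", "s2"],
    "Cy": ["s1", "s2", "s3"],
}
result = assign_seats(prefs, seniority=["Ann", "Bob", "Cy"])
print(result)
```

Because all seats share one ranking, the outcome is the same as letting people pick in seniority order: here Ann takes s1, Bob takes his next choice s3, and Cy is left with s2.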
I decided to implement a brute force solution as lots of the comments suggested.
So:
I asked everybody on the team to rank the seats in order of preference (10 down to 1; I use this as the score of each "teamMember-seat" pairing, with 10 being the highest score)
collected all of the "teamMember-seat" pairings with scores, e.g. name:Steve, seat:seat1, score:5 (the score comes from the order given in the previous step)
generated all the possible seating combinations from these
e.g.
List1: [name:Steve seat:seat1 score:5], [name:John seat:seat2 score:3] ... [name:X seat:seatY score:X]
List2: [name:Steve seat:seat2 score:4], [name:John seat:seat1 score:4] ... [name:X seat:seatY score:X]
...
ListX: [],[]...
chose the "teamMember-seat" list(s) with the highest score (score of the list is calculated by summing the scores of the "teamMember-seat" pairings)
if there are 2 lists with equal scores, the algorithm chooses the one where the most senior team members get their most preferred seats
if there is still more than one list (combination), the algorithm chooses one randomly
I'm sure there are some better algorithms to do this as some of you suggested but I've run out of time.
I didn't post the code since it is really long and not too complicated to implement. However, if you need it, don't hesitate to drop a private message.
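For readers who want a starting point, here is a rough sketch of the brute force described above. It is not the original code. The names `best_seating`, `scores` and `seniority` are assumptions, the seniority tie-break is implemented as a lexicographic comparison of the scores in seniority order, and the final random tie-break is left out (the first best list found is kept).

```python
from itertools import permutations


def best_seating(scores, seniority):
    """scores: dict name -> dict seat -> score (higher = more preferred).
    seniority: list of names, most senior first. Both formats are assumptions."""
    seats = list(next(iter(scores.values())))
    best_key, best = None, None
    for perm in permutations(seats):
        per_person = [scores[name][seat] for name, seat in zip(seniority, perm)]
        # Compare by total score first, then by each person's score in seniority order.
        key = (sum(per_person), per_person)
        if best_key is None or key > best_key:
            best_key, best = key, dict(zip(seniority, perm))
    return best, best_key[0]


scores = {
    "Steve": {"seat1": 2, "seat2": 1},
    "John": {"seat1": 2, "seat2": 1},
}
seating, total = best_seating(scores, seniority=["Steve", "John"])
print(seating, total)
```

With 10 people there are 10! = 3,628,800 seatings to check, which is still feasible, but the approach stops scaling soon after that.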
I'm making a matchmaking client that matches 10 people together into two teams:
Each person chooses four people they would like to play with, ranked from highest to lowest.
Two teams are then formed out of the strongest relationships in that set.
How would you create an algorithm that solves this problem?
Example:
Given players [a, b, c, d, e, f, g, h, i, j], '->' meaning a preference pick.
a -> b (weight: 4)
a -> c (weight: 3)
a -> d (weight: 2)
a -> e (weight: 1)
b -> d (weight: 4)
b -> h (weight: 3)
b -> a (weight: 2)
...and so on
This problem seemed simple on the surface (after all it is only just a matchmaking client), but after thinking about it for a while it seems that there needs to be quite a lot of relationships taken into account.
Edit (pasted from a comment):
Ideally, I would avoid a brute-force approach to scale to larger games which require 100 players and 25 teams, where picking your preferred teammates would be done through a search function. I understand that this system may not be the best for its purpose - however, it is an interesting problem and I would like to find an efficient solution while learning something along the way.
A disclaimer first.
If your user suggested this, there are two possibilities.
Either they can provide the exact details of the algorithm, so ask them.
Or they most probably don't know what they are talking about, and just generated a partial idea on the spot, in which case, it's sadly not worth much on average.
So, one option is to search how matchmaking works in other projects, disregarding the idea completely.
Another is to explore the user's idea.
Probably it won't turn into a good system, but there is a chance it will.
In any case, you will have to do some experiments yourself.
Now, to the case where you are going to have fun exploring the idea.
First, for separating ten items into two groups of five, there are just choose(10,5)=252 possibilities, so, unless the system has to do it millions of times per second, you can just calculate some score for all of them, and choose the best one.
The most straightforward way is perhaps to consider all 2^{10} = 1024 subsets of the 10 elements, and then keep only the ones of size 5.
But there may be better, more to-the-point, tools readily available, depending on the language or framework.
The 10-choose-5 combination is one group, the items not taken are the other group.
So, what would be the score of a combination?
Now we look at our preferences.
For each preference satisfied, we can add its weight, or its weight squared, or otherwise, to the score.
Which works best would sure need some experimentation.
Similarly, for each preference not satisfied, we can add a penalty depending on its weight.
Next, we can consider all players, and maybe add more penalty for each of the players which has none of their preferences satisfied.
Another thing to consider is team balance.
Since the only data so far are preferences (which may well turn out to be insufficient), an imbalance means that one team has many of their preferences satisfied, and the other has only few, if any at all.
So, we add yet another penalty depending on the absolute difference of (satisfaction sum of the first team) and (satisfaction sum of the second team).
Sure there can be other things to factor in...
Based on all this, construct a system which at least looks plausible on the surface, and then experiment and experiment again, tweaking it so that it better fits the matchmaking goals.
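To make the enumeration concrete, here is a small sketch under stated assumptions: the score is the sum of satisfied weights on both teams minus a penalty proportional to the imbalance between them. The name `best_split`, the `prefs` format and the `imbalance_penalty` parameter are placeholders to experiment with, not a recommended scoring scheme. One player is pinned to the first team so that each split is scored once rather than twice.

```python
from itertools import combinations


def best_split(players, prefs, imbalance_penalty=1.0):
    """prefs: dict (chooser, chosen) -> weight. Returns (score, team_a, team_b)."""

    def satisfaction(team):
        members = set(team)
        return sum(w for (a, b), w in prefs.items() if a in members and b in members)

    half = len(players) // 2
    first = players[0]
    best = None
    # Pin the first player to team A so mirrored splits are not scored twice.
    for rest in combinations(players[1:], half - 1):
        team_a = (first,) + rest
        team_b = tuple(p for p in players if p not in team_a)
        sa, sb = satisfaction(team_a), satisfaction(team_b)
        score = sa + sb - imbalance_penalty * abs(sa - sb)
        if best is None or score > best[0]:
            best = (score, team_a, team_b)
    return best


players = list("abcd")
prefs = {("a", "b"): 4, ("b", "a"): 2, ("c", "d"): 3, ("a", "c"): 1}
best = best_split(players, prefs)
print(best)
```

Penalties for unsatisfied preferences or for players with no preference met could be added to `score` in the same place.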
I would think of a way to score proposed teams against the selections from people, such as scoring proposed teams against the weights.
I would try and optimise this by hill-climbing (e.g. swapping a pair of people and looking to see if that improves the score) if only because people could look at the final solution and try this themselves - so you don't want to miss improvements of this sort.
I would hill-climb multiple times, from different starting points, and pick the answer found with the best score, because hill-climbing will probably end at local optima, not global optima.
At least some of the starting points should be based on people's original selections. This would be easiest if you got people's selections to amount to an entire team's worth of choices, but you can probably build up a team from multiple suggestions if you say that you will follow person A's suggestions, and then person B's selection if needed, and then person C's selection if needed, and so on.
If you include as starting points everybody's selections, or selections based on priority ABCDE.. and then priority BCDE... and then priority CDEF... then you have the property that if anybody submits a perfect selection your algorithm will recognise it as such.
If your hill-climbing algorithm tries swapping all pairs of players to improve, and continues until it finds a local optimum and then stops, then you also have the property that if anybody submits a selection which is only one swap away from perfection, your algorithm will recognise it as such.
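A rough sketch of the swap-based hill-climbing with restarts, assuming the score is simply the sum of weights of preferences whose two players end up on the same team. For brevity the starting points here are random shuffles rather than being built from people's selections, and all names (`team_score`, `hill_climb`, `multi_start`) are made up for this example.

```python
import random


def team_score(teams, prefs):
    """Sum of the weights of every preference whose two players share a team."""
    team_of = {p: i for i, team in enumerate(teams) for p in team}
    return sum(w for (a, b), w in prefs.items() if team_of[a] == team_of[b])


def hill_climb(teams, prefs):
    """Swap pairs of players across teams until no single swap improves the score."""
    teams = [list(t) for t in teams]
    best = team_score(teams, prefs)
    improved = True
    while improved:
        improved = False
        for i in range(len(teams)):
            for j in range(i + 1, len(teams)):
                for x in range(len(teams[i])):
                    for y in range(len(teams[j])):
                        teams[i][x], teams[j][y] = teams[j][y], teams[i][x]
                        s = team_score(teams, prefs)
                        if s > best:
                            best, improved = s, True
                        else:
                            # Not an improvement: undo the swap.
                            teams[i][x], teams[j][y] = teams[j][y], teams[i][x]
    return best, teams


def multi_start(players, team_size, prefs, restarts=20, seed=0):
    """Run hill_climb from several random starting points and keep the best result."""
    rng = random.Random(seed)
    best = None
    for _ in range(restarts):
        order = players[:]
        rng.shuffle(order)
        start = [order[k:k + team_size] for k in range(0, len(order), team_size)]
        result = hill_climb(start, prefs)
        if best is None or result[0] > best[0]:
            best = result
    return best


prefs = {("a", "b"): 4, ("c", "d"): 3, ("a", "c"): 1}
score, teams = multi_start(list("abcd"), team_size=2, prefs=prefs)
print(score, teams)
```

Every accepted swap strictly increases the score, so each climb terminates at a local optimum.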
I’m working on a program for the English language school I work for. I’m not being paid; it’s just a kind of hobby to improve / automate my workflow.
It’s a residential school, and one aspect I’m looking at automating is the way we allocate rooms to students. Although I don’t want a full-blown solution, I was hoping someone could point me in the right direction, for example by suggesting how you might approach this or which algorithms to look at.
Basically, at the school we have a whole bunch of different rooms, ranging from singles to dormitories for 8 people. We get lots of different nationalities from all over the world, and we always try to make sure each room has a mix of nationalities. Where there is more than one nationality, we try to balance them. Age is also important: we always put students of a similar age together, while still trying to mix nationalities, and it’s unusual for us to have students sharing with more than two years between them.
More generically speaking, I am interested in how to sort a given set of students based on two parameters to an optimal result, with a few rules attached.
I hope I’ve explained clearly what I am trying to achieve. In a way it sounds really simple, but I’ve been trying to think how to do it in a simple way, i.e. by sorting by nationality and then by age, and it just doesn’t cut it. I know there must be a better way of approaching this. When I do it “by hand” on an Excel sheet it does feel quite intuitive.
Thank you to anyone who offers help / advice.
This is an interesting question, but it's not easy to answer. It's connected with subdivision and bin packing or the cutting-stock problem. You may want to look at topological sorting too. You could also look at Drools, a business logic platform that lets you define such rules.
First of all you might find this interesting: Stable Room-mates Problem (wikipedia). Unfortunately it does not answer your question.
Try a genetic algorithm.
There are three main criteria for using a genetic algorithm:
ability to represent a solution as a mutable array. We can have an array of integers such that a[i] is the room for the ith student.
mutation of the state should produce predictable results. In our case this is true. Mutating the array will predictably shuffle students between the rooms.
easy to write a fast fitness function. Shouldn't be too hard to write a O(n) fitness function.
This is an interesting problem. I'll try writing some code with this approach and we'll see what happens.
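As a starting point, here is a sketch of what the O(n) fitness function and a simple mutation might look like for the array representation above. The reward and penalty weights (1 per nationality, 5 for an age gap, 10 per overbooked bed) are arbitrary placeholders, and the `students` and `capacity` formats are assumptions. The rest of the genetic algorithm (population, selection, crossover) is not shown.

```python
import random
from collections import defaultdict


def fitness(assignment, students, capacity, max_age_gap=2):
    """Higher is better. assignment[i] is the room index of student i.

    students: list of (age, nationality) tuples; capacity: list of room sizes.
    """
    rooms = defaultdict(list)
    for i, room in enumerate(assignment):
        rooms[room].append(students[i])
    score = 0
    for room, members in rooms.items():
        ages = [age for age, _ in members]
        nationalities = {nat for _, nat in members}
        score += len(nationalities)                        # reward mixed rooms
        if max(ages) - min(ages) > max_age_gap:
            score -= 5                                     # penalise wide age gaps
        if len(members) > capacity[room]:
            score -= 10 * (len(members) - capacity[room])  # penalise overfilled rooms
    return score


def mutate(assignment, rng=random):
    """Swap the rooms of two random students; room occupancy counts stay unchanged."""
    child = list(assignment)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


students = [(14, "FR"), (14, "FR"), (15, "IT"), (15, "ES")]
capacity = [2, 2]
mixed = [0, 1, 0, 1]      # each room gets one French student
unmixed = [0, 0, 1, 1]    # both French students share a room
print(fitness(mixed, students, capacity), fitness(unmixed, students, capacity))
```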
How about thinking of a room as something that repels students of a nationality it already has, and attracts students whose age is close to that of the students it already has? The closer the age to the room's average age, the more it attracts the student, and the more students of nationality X are in the room, the more it repels students of nationality X.
Then you would, for every new student to be added, iterate through each room and see which is the one that attracts it more. I guess if the room is empty you can set all forces to 0. Also, you would have a couple of constants that multiply each of both "forces" so you can calibrate it depending on how important is to have the same age against how important is to have different nationalities.
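Here is one possible reading of the idea as code. The exact force formula is an assumption on my part: attraction is `k_age / (1 + age difference from the room's mean)` and repulsion is `k_nat` times the number of room-mates with the same nationality. The constants `k_age` and `k_nat` are the two calibration knobs mentioned above.

```python
def place_students(students, capacities, k_age=1.0, k_nat=1.0):
    """students: list of (name, age, nationality); capacities: list of room sizes.
    Returns a list of rooms, each a list of the students placed in it."""
    rooms = [[] for _ in capacities]
    for student in students:
        _, age, nat = student
        best_room, best_force = None, None
        for idx, room in enumerate(rooms):
            if len(room) >= capacities[idx]:
                continue                                   # room is full
            if room:
                mean_age = sum(s[1] for s in room) / len(room)
                same_nat = sum(1 for s in room if s[2] == nat)
                attraction = k_age / (1 + abs(age - mean_age))
                repulsion = k_nat * same_nat
                force = attraction - repulsion
            else:
                force = 0.0                                # empty rooms are neutral
            if best_force is None or force > best_force:
                best_room, best_force = idx, force
        if best_room is None:
            raise ValueError("not enough beds for all students")
        rooms[best_room].append(student)
    return rooms


students = [("A", 14, "FR"), ("B", 14, "FR"), ("C", 14, "IT"), ("D", 14, "ES")]
rooms = place_students(students, capacities=[2, 2], k_nat=2.0)
names = [[s[0] for s in room] for room in rooms]
print(names)
```

The result depends on the order in which students arrive, so it may be worth trying a few different orderings and keeping the best one.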
I'd analyze each student and create a 'personality' vector based on his/her age & nationality. Then I'd sort the vectors, and maybe scramble the results a bit after sorting to encourage diversity.
The general theme of "assign x to y with respect to constraints while optimizing some quantity" falls within operations research or more specifically http://en.wikipedia.org/wiki/Mathematical_optimization. The usual approach is to formally specify the problem and use a generic optimization solver such as one of those listed in http://en.wikipedia.org/wiki/List_of_optimization_software.
Give it a try, the formal specification languages for using the existing solvers are rather easy to learn and you might get an optimal solution without having to debug a complicated algorithm.
Formulation as a General Optimization Problem
It will be useful to formalize constraints and parameters. Let us assume that for 1 <= i <= 8, we have n_i rooms available of size i. Now let us impose the hard constraint that in a particular room S, every two students a, b \in S, we have that:
|Grade(a) - Grade(b)| <= 2 (1)
Now we are interested in optimizing the "diversity" function which intuitively represents the idea that we want rooms to be as mixed as possible. So we can represent this goal as:
max over all arrangements {{ Sum over all rooms S of DiversityScore(S) }}
where we have DiversityScore(S) = # of Different Nationalities in the Room
Formulation as a Graph Problem
This is the most general setting, but clearly maximizing over all arrangements is not computationally feasible. Now let us pose this as a graph problem with the hard constraint (1). Represent each student as a vertex in a graph G. Connect two vertices if the two students satisfy constraint (1). A clique in this graph then represents a group of students that can all be placed in the same room. Now proceed in a greedy manner. Choose the clique of the room's size (say 4) which has the largest diversity score. Then place them in a room and continue until all rooms are filled. This clique search method can also incorporate gender constraints, which is useful; however, note that clique finding is an NP-hard problem.
Now, before trying to come up with something that may be faster, let us think about how to weaken the hard constraint (1). We can massage our graph formulation by including edge weights in the picture. If the hard constraint is satisfied, denote the edge weight from i to j as 1. If two students i and j differ in age by more than 2, denote the edge weight as 1 / (Age Difference)^2 or something similar. Then the score of a clique should be the product of the clique's edge weights with some diversity score. However, it becomes clear that the problem is now on a complete graph, which is just the general optimization we hoped to avoid, so we need to impose some hard restrictions to reduce the connectivity of our graph.
A Basic Sorting Approximation Algorithm
Sort all students by their age, so we have a sorted array where all students in a[i] have the same age, and all students in a[i] are older than all students in a[j] for all j < i.
Now consider each pair i, j, of which there are O(n^2), where we also have that |Age[i] - Age[j]| <= 2. Find the largest group of students with different nationalities and place them in a room together. We successively iterate over O(n^2) index pairs which satisfy the hard constraint and take any students with nationality difference (which we can find by preprocessing and hashing on the index pairs). Doing this carefully (like looking at indices i j which are spread apart before close together) improves running time further. It feels like it should be polytime, but I think there are certain subtleties to address first before saying so.
I need to assign n people to m courses, where each person specified their first and second preference and each course has a maximum number of persons attending. Each person can only attend one course. The algorithm should find one solution where
1. the number of people assigned one course out of their preferences is maximized
2. the number of people assigned their first choice is maximized (subject to 1, which has higher priority).
I guessed that this is not an uncommon problem, but a search returned nothing too useful, so I decided to roll my own. This is what I have come up with so far:
For courses which have fewer first preferences than the maximum number of people attending, assign all those persons to the course
For other courses: Put random people into the course which have selected this course as first choice until the course is full
For courses which have fewer second preferences than free spaces, assign all those persons to the course
For other courses: Put random people into the course which have selected this course as second choice until the course is full
For each person without a course: At their first (then second) preference look out for a person which has chosen another course where spots are still free (if more than one is found take the one which has chosen the course with most free spots), move this person to their second choice and assign the missing person
I still don't think this algorithm will find the optimal solution to the problem, because of the last step. Any ideas how to make it better? Are there other algorithms which solve this problem?
Place everyone in their first choice course if possible.
If there is anyone who didn't get it, place them in their second choice.
Now, we might get some who didn't get any of their choices. (the "losers".)
Find a person who got his first choice course, which is also the second choice of the "loser". This guy will be reassigned to his second choice, while the "loser" takes his slot. If there is no such person, then your problem is unsolvable.
Note that this maximizes the number of people who got their first choice:
If you got your second choice, then it means either:
someone else already got your first choice as his first choice
someone else got your first choice as his second choice, but only because his first choice was taken as someone else's second choice, and whose first choice was filled with first choice students.
(Possibly that last bit is a bit hard to follow, so here's a rewording:)
For person X with first choice A and second choice B:
If X got choice B, then:
Y took X's slot in A, and Y's first choice is A.
Y took X's slot in A, and Y's second choice is A. Y's first choice is C, but C's slots are all filled with other students whose first choice is C as well.
This is similar to the stable marriage problem.
Given n men and n women, where each person has ranked all members of the opposite sex with a unique number between 1 and n in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. If there are no such people, all the marriages are "stable".
Update:
Taking @bdares's comments into account, and the fact that the courses have a finite capacity, it would be hard to cast the problem as stable matching.
I would solve this as a linear program with the objective function based on the number of people who get their first choice and the course size as a constraint.
The first problem can be modeled as a maximum cardinality bipartite matching problem. The second problem can be modeled as a weighted bipartite matching problem (also known as the assignment problem).
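A sketch of the first model only (maximum cardinality matching), using the classic augmenting-path idea extended to courses with several seats. It maximizes how many people get one of their two choices. It does not attempt the second goal of maximizing first choices, which would need the weighted version. The name `assign_courses` and the input formats are assumptions.

```python
def assign_courses(prefs, capacity):
    """prefs: dict person -> [first_choice, second_choice];
    capacity: dict course -> number of seats.
    Returns dict person -> course for as many people as possible."""
    enrolled = {c: [] for c in capacity}   # course -> people currently assigned
    assigned = {}

    def try_assign(person, visited):
        for course in prefs[person]:
            if course in visited:
                continue
            visited.add(course)
            if len(enrolled[course]) < capacity[course]:
                enrolled[course].append(person)
                assigned[person] = course
                return True
            # Course is full: try to move one of its occupants to another course.
            for other in list(enrolled[course]):
                if try_assign(other, visited):
                    enrolled[course].remove(other)
                    enrolled[course].append(person)
                    assigned[person] = course
                    return True
        return False

    for person in prefs:
        try_assign(person, set())
    return assigned


prefs = {"p1": ["A", "B"], "p2": ["A", "C"], "p3": ["A", "B"]}
capacity = {"A": 1, "B": 1, "C": 1}
result = assign_courses(prefs, capacity)
print(result)
```

In this example all three people want course A first, but moving earlier arrivals to their second choices lets everyone receive one of their preferences.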
Sounds like a linear bottleneck assignment problem. While you are on the wiki page, check out the link provided in the reference section.
We have a simulation program where we take a very large population of individual people and group them into families. Each family is then run through the simulation.
I am in charge of grouping the individuals into families, and I think it is a really cool problem.
Right now, my technique is pretty naive/simple. Each individual record has some characteristics, including married/single, age, gender, and income level. For married people I select an individual and loop through the population and look for a match based on a match function. For people/couples with children I essentially do the same thing, looking for a random number of children (selected according to an empirical distribution) and then loop through all of the children and pick them out and add them to the family based on a match function. After this, not everybody is matched, so I relax the restrictions in my match function and loop through again. I keep doing this, but I stop before my match function gets too ridiculous (marries 85-year-olds to 20-year-olds for example). Anyone who is leftover is written out as a single person.
This works well enough for our current purposes, and I'll probably never get time or permission to rework it, but I at least want to plan for the occasion or learn some cool stuff - even if I never use it. Also, I'm afraid the algorithm will not work very well for smaller sample sizes. Does anybody know what type of algorithms I can study that might relate to this problem or how I might go about formalizing it?
For reference, I'm comfortable with chapters 1-26 of CLRS, but I haven't really touched NP-Completeness or Approximation Algorithms. Not that you shouldn't bring up those topics, but if you do, maybe go easy on me because I probably won't understand everything you are talking about right away. :) I also don't really know anything about evolutionary algorithms.
Edit: I am specifically looking to improve the following:
Fewer ridiculous marriages.
Fewer single people at the end.
Perhaps what you are looking for is cluster analysis?
Let's try to think of your problem like this (starting by solving the spouse matching):
If you were to have a matrix where each row is a male and each column is a female, and every cell in that matrix is the match function's returned value, what you are now looking for is selecting cells so that there won't be a row or a column in which more than one cell is selected, and the total sum of all selected cells should be maximal. This is very similar to the N Queens Problem, with the modification that each allocation of a "queen" has a reward (which we should maximize).
You could solve this problem by using a graph where:
You have a root,
each of the first row's cells' values is an edge's weight leading to first depth vertices
each of the second row's cells' values is an edge's weight leading to second depth vertices..
Etc.
(Notice that when you find a match to the first female, you shouldn't consider her anymore, and so for every other female you find a match to)
Then finding the maximum allocation can be done by BFS, or better still by A* (notice A* typically looks for minimum cost, so you'll have to modify it a bit).
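The row-by-row search described above can be sketched as a memoised recursion over "which row am I on, and which columns are already used". This selection task is usually called the assignment problem. The sketch below is exponential in the matrix size, so it only illustrates the idea on small inputs; a large population would need a polynomial method such as the Hungarian algorithm. The name `best_matching` and the square `match` matrix are assumptions.

```python
from functools import lru_cache


def best_matching(match):
    """match[i][j]: match-function value for male i and female j (square matrix).
    Returns (total, pairs) with the maximum total."""
    n = len(match)

    @lru_cache(maxsize=None)
    def solve(row, used):
        # used is a bitmask of the columns (females) that are already matched
        if row == n:
            return 0, ()
        best = None
        for col in range(n):
            if used & (1 << col):
                continue                     # this female is already taken
            total, pairs = solve(row + 1, used | (1 << col))
            total += match[row][col]
            if best is None or total > best[0]:
                best = (total, ((row, col),) + pairs)
        return best

    return solve(0, 0)


match = [
    [5, 1, 0],
    [4, 3, 2],
    [0, 6, 1],
]
total, pairs = best_matching(match)
print(total, pairs)
```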
For matching between couples (or singles, more on that later..) and children, I think KNN with some modifications is your best bet, but you'll need to optimize it to your needs. But now I have to relate to your edit..
How do you measure your algorithm's efficiency?
You need a function that receives the expected distribution of all states (single, married with one child, single with two children, etc.), and the distribution of all states in your solution, and grades the solution accordingly. How do you calculate the expected distribution? That's quite a bit of statistics work..
First you need to know the distribution of all states (single, married.. as mentioned above) in the population,
then you need to know the distribution of ages and genders in the population,
and last thing you need to know - the distribution of ages and genders in your population.
Only then, according to those three, can you calculate how many people you expect to be in each state.. And then you can measure the distance between what you expected and what you got... That is a lot of typing.. Sorry for the general parts...
I'm trying to write a program to automate a ticket draft.
We have a certain number of season ticket passes and want to split up the tickets among a group of people. There are X number of games, Y number of season passes, and Z number of people. Each of Z people has ranked the X games.
My code basically goes through the draft order and back picking out the tickets from their ranking if available, otherwise, picking the next highest ranking. For the most part it works. The problem is, there's a point where most of the tickets are taken and the remaining tickets left are ones you already have so you just don't pick them. People therefore have different numbers of tickets. Is there a good way to get around this?
If you have X games and Y season passes, presumably there are X*Y tickets available to give to the Z people, right?
This sounds like it could be treated as an optimization problem, but to do so you have to identify your main goals. I'm guessing you want each person to receive X*Y / Z tickets (split them evenly), but maybe not. I'm guessing you also want to maximize the aggregate satisfaction (defined in some way according to the rankings) in tickets. You would probably want to give a large penalty in satisfaction for a person if he receives more than 1 ticket for the same game. I believe this last aspect might be why the straight draft approach is not the best, but I could be mistaken.
Once you are clear on what you are trying to optimize (if this is indeed an optimization problem), then you can consider the best approach to the problem. This could be your own custom-built solution, or you could try an existing technique (genetic algorithm, etc.). Before doing so though it is important that you frame the problem properly.
If there were no preferences involved, this would be a straight max-flow/min-cut problem (http://en.wikipedia.org/wiki/Maximum_flow_problem), set up as follows:
Create a source vertex A. From A, create Z vertices, one for each person. The capacity can be infinite (or very, very large). Create a sink B, and create X vertices, one for each game, linked to B; the capacity should be Y (you have Y tickets per game). From each person, link to each game they've ranked, with capacity 1.
If you look at the wiki link above, there are about 10 algorithms to solve this basic problem. Find one you understand and can implement yourself, because you'll need to modify it slightly. I'm not familiar with all of them, but the ones I know about have a step 'pick an edge' or 'pick a path.' You should modify the 'how you pick an edge' logic to take the priority ordering of the games into account. I'm not sure exactly what the ordering should be (you'll probably need to experiment), but if you say the lowest ranked game is 1, the next is 2, up to X, then a score like 'ranking of the edge - number of games the person is already signed up for' might work.
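A sketch of the plain network described above, solved with breadth-first augmenting paths (the Edmonds-Karp variant). It ignores the rankings entirely, so the preference-aware edge selection is still left to experiment with. The optional `per_person` cap is my addition to force an even split; the node labels and the name `ticket_flow` are likewise made up for this example.

```python
from collections import defaultdict, deque


def ticket_flow(people, games, passes, ranked, per_person=None):
    """ranked: dict person -> list of games they ranked.
    per_person: optional cap on tickets per person (None = effectively unlimited).
    Returns dict person -> list of games assigned."""
    cap = defaultdict(lambda: defaultdict(int))   # residual capacities
    big = len(games) * passes
    for p in people:
        cap["SRC"][("p", p)] = big if per_person is None else per_person
        for g in ranked[p]:
            cap[("p", p)][("g", g)] = 1           # at most one ticket per game
    for g in games:
        cap[("g", g)]["SINK"] = passes            # Y tickets per game

    def augment():
        # Breadth-first search for a path with spare capacity.
        parent = {"SRC": None}
        queue = deque(["SRC"])
        while queue:
            u = queue.popleft()
            for v, c in list(cap[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if "SINK" not in parent:
            return False
        v = "SINK"
        while parent[v] is not None:              # push one unit along the path
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        return True

    while augment():
        pass
    # A positive reverse capacity game -> person means that ticket was assigned.
    return {p: [g for g in ranked[p] if cap[("g", g)][("p", p)] > 0] for p in people}


people = ["x", "y"]
games = ["g1", "g2"]
ranked = {"x": ["g1", "g2"], "y": ["g1", "g2"]}
tickets = ticket_flow(people, games, passes=1, ranked=ranked, per_person=1)
print(tickets)
```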
I think this is a variant of the Stable Marriage Problem or the Stable Roommates Problem, for which known algorithms exist.