I'm trying to write a program to automate a ticket draft.
We have a certain number of season ticket passes and want to split up the tickets among a group of people. There are X number of games, Y number of season passes, and Z number of people. Each of Z people has ranked the X games.
My code basically goes through the draft order and back picking out the tickets from their ranking if available, otherwise, picking the next highest ranking. For the most part it works. The problem is, there's a point where most of the tickets are taken and the remaining tickets left are ones you already have so you just don't pick them. People therefore have different numbers of tickets. Is there a good way to get around this?
If you have X games and Y season passes, presumably there are X*Y tickets available to give to the Z people, right?
This sounds like it could be treated as an optimization problem, but to do so you have to identify your main goals? I'm guessing you want each person to receive X*Y / Z tickets (split them evenly), but maybe not. I'm guessing you also want to maximize the aggregate satisfaction (defined in some way according to the rankings) in tickets. You would probably want to give a large penalty in satisfaction for a person if he receives more than 1 ticket for the same game. I believe this last aspect might be why the straight draft approach is not the best, but I could be mistaken.
Once you are clear on what you are trying to optimize (if this is indeed an optimization problem), then you can consider the best approach to the problem. This could be your own custom-built solution, or you could try an existing technique (genetic algorithm, etc.). Before doing so though it is important that you frame the problem properly.
If there were no preferences involved, this would be a straight min-cut max flow problem. http://en.wikipedia.org/wiki/Maximum_flow_problem, as follows:
Create a source vertex A. From A, create Z vertices, one for each person. The capacity can be infinite (or very, very large). Create a sink B, and create X vertices, one for each game, linked to B; the capacity should be Y (you have Y tickets per game). From each person, link to each game they've ranked, with capacity 1.
If you look at the wiki link above, there are about 10 algorithms to solve this basic problem. Find one you understand and can implement yourself, because you'll need to modify it slightly. I'm not familiar with all of them, but the ones I know about have a step 'pick an edge' or 'pick a path.' You should modify the 'how you pick an edge' logic to take the priority ordering of the games into account. I'm not sure exactly what the ordering should be (you'll probably need to experiment), but if you say the lowest ranked game is 1, the next is 2, up to X, then a score like 'ranking of the edge - number of games the person is already signed up for' might work.
I think this is a variant of the Stable Marriage Problem or the Stable Roommates Problem for which there are known algorithms for solving.
Related
My problem is the following:
Me and my team are moving to another part of the office and we have to decide everybody's place to sit. However, everybody has priorities. I would like to find an algorithm which helps us to distribute the seats in a way that everybody is satisfied. (Or the most of them at least.)
I've started to implement my own algorithm where I ask 3 preferred options (the team consists of 10 people and there are 10 places) from everybody and consider there "seniority" (the length of the time they have spent in the team) as a rank between them.
However, I've stuck without any luck, tried to browse the internet for an algorithm which solves a similar problem but didn't find any.
What would be the best way to solve this? Is there any
generally known algorithm which solves this or a similar problem?
Thank you!
What first comes to mind for me is the stable marriage problem. Here's the problem statement for the original algorithm:
Given n men and n women, where each person has ranked all members of the opposite sex in order of preference, marry the men and women together such that there are no two people of opposite sex who would both rather have each other than their current partners. When there are no such pairs of people, the set of marriages is deemed stable.
Please read up on the Gale–Shapley algorithm, which is what I'll adapt for this problem.
Have each worker make a list of their rankings for all the spots. These will be the "men". Then, each spot will use the seniority ranking as their rankings for the "men". The spots will be the "women" in the Gale-Shapley algorithm.
You will get a seat assignment that has no "unstable marriage". Here's what an unstable marriage is:
There is an element A of the first matched set which prefers some given element B of the second matched set over the element to which A is already matched, and
B also prefers A over the element to which B is already matched.
In this context, an unstable marriage means that there is a worker-seat between W1 and S1 assignment such that another worker, W2, has ranked S1 higher. Not only that, S1 has also ranked W2 higher. Since the seats made their list based off the seniority list, it means that W2 has higher seniority.
In effect, this means that you'll get a seating assignment such that no worker has a seat that someone else with higher seniority wants "more".
The bottom of that Wiki article mentions packages in R and Python that have already implemented the algorithm, so it's just up to you to input the preference lists.
Disclaimer: This is probably not the most efficient algorithm. All the seats have the same ranking list, so there's probably a shortcut somewhere. However, it's easier to use a cannon to kill a fly, if the cannon is already written in R/Python for you. Also, this is the only algorithm I remember from uni, so this is the only hammer I have for any nail.
I decided to implement a brute force solution as lots of the comments suggested.
So:
I asked everybody from the team to give a preference order between the seats (10 to 1, what I use as score the "teamMember-seat" pairings, 10 is the highest score)
collected all of the "teamMember-seat" pairings with scores e.g. name:Steve, seat:seat1, score:5 (the score is from the given order from the previous step)
generated all the possible sitting combination from these
e.g.
List1: [name:Steve seat:seat1 score:5], [name:John seat:seat2 score:3] ... [name:X seat:seatY score:X]
List2: [name:Steve seat:seat2 score:4], [name:John seat:seat1 score:4] ... [name:X seat:seatY score:X]
...
ListX: [],[]...
chose the "teamMember-seat" list(s) with the highest score (score of the list is calculated by summing the scores of the "teamMember-seat" pairings)
if there are 2 lists with equal scores, then the algorithm choose that one where the most senior team members get the most preferred seats of them
if still there are more then one list (combination) the algorithm choose one randomly
I'm sure there are some better algorithms to do this as some of you suggested but I've run out of time.
I didn't post the code since it is really long and not too complicated to implement. However, if you need it, don't hesitate to drop a private message.
I'm making a matchmaking client that matches 10 people together into two teams:
Each person chooses four people they would like to play with, ranked from highest to lowest.
Two teams are then formed out of the strongest relationships in that set.
How would you create an algorithm that solves this problem?
Example:
Given players [a, b, c, d, e, f, g, h, i, j], '->' meaning a preference pick.
a -> b (weight: 4)
a -> c (weight: 3)
a -> d (weight: 2)
a -> e (weight: 1)
b -> d (weight: 4)
b -> h (weight: 3)
b -> a (weight: 2)
...and so on
This problem seemed simple on the surface (after all it is only just a matchmaking client), but after thinking about it for a while it seems that there needs to be quite a lot of relationships taken into account.
Edit (pasted from a comment):
Ideally, I would avoid a brute-force approach to scale to larger games which require 100 players and 25 teams, where picking your preferred teammates would be done through a search function. I understand that this system may not be the best for its purpose - however, it is an interesting problem and I would like to find an efficient solution while learning something along the way.
A disclaimer first.
If your user suggested this, there are two possibilities.
Either they can provide the exact details of the algorithm, so ask them.
Or they most probably don't know what they are talking about, and just generated a partial idea on the spot, in which case, it's sadly not worth much on average.
So, one option is to search how matchmaking works in other projects, disregarding the idea completely.
Another is to explore the user's idea.
Probably it won't turn into a good system, but there is a chance it will.
In any case, you will have to do some experiments yourself.
Now, to the case where you are going to have fun exploring the idea.
First, for separating ten items into two groups of five, there are just choose(10,5)=252 possibilities, so, unless the system has to do it millions of times per second, you can just calculate some score for all of them, and choose the best one.
The most straightforward way is perhaps to consider all 2^{10} = 1024 ways to form a subset of 10 elements, and then explore the ones where the size of the subset is 5.
But there may be better, more to-the-point, tools readily available, depending on the language or framework.
The 10-choose-5 combination is one group, the items not taken are the other group.
So, what would be the score of a combination?
Now we look at our preferences.
For each preference satisfied, we can add its weight, or its weight squared, or otherwise, to the score.
Which works best would sure need some experimentation.
Similarly, for each preference not satisfied, we can add a penalty depending on its weight.
Next, we can consider all players, and maybe add more penalty for each of the players which has none of their preferences satisfied.
Another thing to consider is team balance.
Since the only data so far are preferences (which may well turn out to be insufficient), an imbalance means that one team has many of their preferences satisfied, and the other has only few, if any at all.
So, we add yet another penalty depending on the absolute difference of (satisfaction sum of the first team) and (satisfaction sum of the second team).
Sure there can be other things to factor in...
Based on all this, construct a system which at least looks plausible on the surface, and then experiment and experiment again, tweaking it so that it better fits the matchmaking goals.
I would think of a way to score proposed teams against the selections from people, such as scoring proposed teams against the weights.
I would try and optimise this by hill-climbing (e.g. swapping a pair of people and looking to see if that improves the score) if only because people could look at the final solution and try this themselves - so you don't want to miss improvements of this sort.
I would hill-climb multiple times, from different starting points, and pick the answer found with the best score, because hill-climbing will probably end at local optima, not global optima.
At least some of the starting points should be based on people's original selections. This would be easiest if you got people's selections to amount to an entire team's worth of choices, but you can probably build up a team from multiple suggestions if you say that you will follow person A's suggestions, and then person B's selection if needed, and then person C's selection if needed, and so on.
If you include as starting points everybody's selections, or selections based on priority ABCDE.. and then priority BCDE... and then priority CDEF... then you have the property that if anybody submits a perfect selection your algorithm will recognise it as such.
If your hill-climbing algorithm tries swapping all pairs of players to improve, and continues until it finds a local optimum and then stops, then you also have the property that if anybody submits a selection which is only one swap away from perfection, your algorithm will recognise it as such.
Say you have a class with 5 sections: A,B,C,D,E. Each section meets at different times, thus students registering for the course will have preference for which section they will take (they can only take one section). When students register for the course, they list 3 sections they would prefer to take, in order of preference.
Each section has n students. Let's say for simplicity that exactly n*5 students have registered for the course.
So, the question is: How do you efficiently match students to their preferred section?
I've seen some questions with similar matching scenario questions, but none quite fit and I'm afraid I don't know enough about algorithms to make up my own. BTW, this is a real problem and I know the department in question takes a few days to do it by hand.
To determine whether each student can be assigned to a preferred section, construct an integer-valued maximum flow in the following network, where the three Xs stand for capacity-1 arcs from students to the sections they prefer (polynomial-time via, e.g., the push-relabel algorithm). There's a solution if and only if the maximum flow moves m = n*5 units; then the assignments are determined by which arcs from each student is saturated.
capacity-1 arcs capacity-n arcs
| |
v v
student 1
/ student 2 section1
/ . X section2 \
source < . X section3 > sink
\ . X section4 /
\ student m-1 section5
student m
To take the order of preference into account, switch to solving a min-cost flow problem, still poly-time solvable (though you may find the network simplex mode of a general-purpose LP solver easier to use) which allows a cost to specified for each arc. Choose a score for each preference level depending on what you think is fair.
I'm positive that this has been asked before, but scheduling problems are like snowflakes, and I can't find the old question by keywords alone.
Maybe you could randomly distribute them into sections. Next you select random pairs of student and consider if swapping them improves the distribution (does it increase the match with their preferences?). You can iterate until there is no improvement possible for X iterations.
This is obviously a very naive approach but if your sample is small it might converge quickly. You cannot guarantee you have the optimal solution, but therefore you'd need a brute force approach which is probably not possible.
Is there a scoring system in which if student 1 is in section A the score is 20? (on the other hand if student 2 is in section A, score is 15?
I'm asking since if there's only one spot left for section A, and both student 1 and 2 has section A most preferred, then who ever gets registered first gets the spot. Instead of who ever is best fit (higher score).
If there is no scoring, you can just loop through the students and put them in the section they prefer. If the first one is full, try their second preference, then the next. If all three sections the student prefers are filled, just enroll them to one that isn't filled.
(It'd be different if there is scoring since you have to go with a priority queue for each section and maximize that.)
I am required to solve a specific problem.
I'm given a representation of a social network.
Each node is a person, each edge is a connection between two persons. The graph is undirected (as you would expect).
Each person has a personal "affinity" for buying a product (to simplify things, let's say there's only one product involved in this whole problem).
In each "step" in time, each person, independently, chooses whether to buy the product or not.
There's probability invovled here. A few parameters are taken into account:
His personal affinity for the product,
The percentage of his friends that already bought the product
The gain for a person buying the product is 1 dollar.
The problem is to point out X persons (let's say, 5 persons) that will receive the product in step 0, and will maximize the total expected value of the gain after Y steps (let's say, 10 steps)
The network is very large. It's not possible to simulate all the options in a naive way.
What tool / library / algorithm should I be using?
Thank you.
P.S.
When investigating this matter in google and wikipedia, a few terms kept popping up:
Dynamic network analysis
Epidemic model
but it didn't help me to find an answer
Generally, people who have the most neighbours have the most influence when they buy something.
So my heuristic would be to order people first by the number of neighbours they have (in decreasing order), then by the number of neighbours that each of those neighbours has (in order from highest to lowest), and so on. You will need at most Y levels of neighbour counts, though fewer may suffice in practice. Then simply take the first X people on this list.
This is only a heuristic, because e.g. if a person has many neighbours but most or all of them are likely to have already bought the product through other connections, then it may give a higher expectation to select a different person having fewer neighbours, but whose neighbours are less likely to already own the product.
You do not need to construct the entire list and then sort it; you can construct the list and then insert each item into a heap, and then just extract the highest-scoring X people. This will be much faster if X is small.
If X and Y are as low as you suggest then this calculation will be pretty fast, so it would be worth doing repeated runs in which instead of starting with the first X people owning the product, for each run you randomly select the initial X owners according to a probability that depends on their position in the list (the further down the list, the lower the probability).
Check out the concept of submodularity, a pretty powerful mathematical concept. In particular, check out slide 19, where submodularity is used to answer the question "Given a social graph, who should get free cell phones?". If you have access, also read the corresponding paper. That should get you started.
I’m working on program for the English Language school I work for. I’m not being paid, its just a kind of a hobby to improve / automate my work flow.
It’s a residential school and one aspects I’m looking at automating is the way we allocate room to students, and although I don’t want a full blown solution I was hoping someone could point me in the right direction… Suggestions of the way you might approach this or by suggesting algorithms to look at etc.
Basically at the school we have a whole bunch of different rooms ranging from singles to dormitories for 8 people. We get lots of different nationalities from all over the world, and we always try to maker sure each room has a mix of nationalities. Where there is more than one nationality we try to balance them. Age is also important, we always put students of a similar age together, while still trying to mix nationalities, and its unusual for us to have students sharing with more than two years between them.
I suppose more generically speaking, I am in interested in how to sort a given set of students based on two parameters to an optimal result with a few rules attached.
I hope I’ve explain clearly what I am trying to achieve… in a way it sounds really simple, but I’ve trying to think how to do it in a simple way, i.e. by sorting by nationality and then by age but it just doesn’t cut it and I know there must be a better way of approaching this. When I do it “by hand” on an excel sheet it does feel quite intuitive.
Thank you to anyone who offers help / advice.
This is an interesting question but it's not easy to answer. Somehow it's connected with subdivsion and bin packing or the cutting-stock problem. You may want to look for a topological sort too. You can look for Drools a business logic platform that let you define such rules.
First of all you might find this interesting: Stable Room-mates Problem (wikipedia). Unfortunately it does not answer your question.
Try a genetic algorithm.
There are three main criteria for using a genetic algorithm:
ability to represent a solution as a mutable array. We can have an array of integers such that a[i] is the room for the ith student.
mutation of the state should produce predictable results. In our case this is true. Mutating the array will predictably shuffle students between the rooms.
easy to write a fast fitness function. Shouldn't be too hard to write a O(n) fitness function.
This is an interesting problem. I'll try writing some code with this approach and we'll see what happens.
How about, you think of a room as something that repels students of a nationality it already has, and attracts students of a close age to what it already has. The closer the age to the average age, the more it attracts it, and the more guys of X nationality are in the room, the more if repels guys of X nationality.
Then you would, for every new student to be added, iterate through each room and see which is the one that attracts it more. I guess if the room is empty you can set all forces to 0. Also, you would have a couple of constants that multiply each of both "forces" so you can calibrate it depending on how important is to have the same age against how important is to have different nationalities.
I'd analyze each student and create a 'personality' vector based on his/her age & nationality. Then I'd sort the vectors, and maybe scramble the results a bit after sorting to encourage diversity.
The general theme of "assign x to y with respect to constraints while optimizing some quantity" falls within operations research or more specifically http://en.wikipedia.org/wiki/Mathematical_optimization. The usual approach is to formally specify the problem and use a generic optimization solver such as one of those listed in http://en.wikipedia.org/wiki/List_of_optimization_software.
Give it a try, the formal specification languages for using the existing solvers are rather easy to learn and you might get an optimal solution without having to debug a complicated algorithm.
Formulation as a General Optimization Problem
It will be useful to formalize constraints and parameters. Let us assume that for 1 <= i <= 8, we have n_i rooms available of size i. Now let us impose the hard constraint that in a particular room S, every two students a, b \in S, we have that:
|Grade(a) - Grade(b)| <= 2 (1)
Now we are interested in optimizing the "diversity" function which intuitively represents the idea that we want rooms to be as mixed as possible. So we can represent this goal as:
max over all arrangements {{ Sum over all rooms S of DiversityScore(S) }}
where we have DiversityScore(S) = # of Different Nationalities in the Room
Formulation as a Graph Problem
This is the most general setting, but clearly max over all arrangements is not computationally feasible. Now let us pose this as a sort of graph problem with the hard grade constraints. Denote all students as a vertex in a Graph G. Connect two vertices if students satisfy constraint (1). Now a clique in this graph represents a group of students that can all be placed in the same room. Now proceed in a greedy manner. Choose the largest clique of size 4 which has the largest Diversity Score. Then place them in a room and continue until all rooms are filled. This clique search method can also incorporate gender constraints which is useful, however not that Clique finding is NP Hard Problem.
Now before trying to come up with something that may be faster, let us think about how to weaken the hard constraint (1). We can massage our graph formulation by including edge weights into the picture. So if the hard constraint is satisfied denote the edge weight from i to j as 1. If two students i and j deviate by age more than 2 denote the edge weight as 1 / (Age Difference)^2 or something. Then the score of a clique should be a product of the cliques edge weights with some diversity score. However it becomes clear that now the problem is on a complete graph, which is just the general optimization we hoped to avoid, so we need to impose some hard restrictions to reduce the connectivity of our graph.
A Basic Sorting Approximation Algorithm
Sort all students by their age, so we have a sorted array where all students in a[i] have the same age, and all students in a[i] are older than all students in a[j] for all j < i.
Now consider each pair i, j, of which there are O(n^2), where we also have that |Age[i] - Age[j]| <= 2. Find the largest group of students with different nationalities and place them in a room together. We successively iterate over O(n^2) index pairs which satisfy the hard constraint and take any students with nationality difference (which we can find by preprocessing and hashing on the index pairs). Doing this carefully (like looking at indices i j which are spread apart before close together) improves running time further. It feels like it should be polytime, but I think there are certain subtleties to address first before saying so.