I was solving some example questions from an old programming contest. In this question the input tells us how many bartenders we have and which recipes they know. Each cocktail takes 1 minute to make, and we need to determine whether the order can be finished within 5 minutes, using all bartenders.
The key to solving this problem is assigning cocktails as efficiently as possible, and that's where I'm stuck. My current algorithm gives each order to the bartender who knows the fewest other recipes, but of course this isn't 100% correct yet. Could anyone point me in the right direction (or give me an algorithm name to google) that solves this "bartender problem"?
This could be solved with a flow network.
The source has edges to each bartender, with capacity 5.
Each bartender has an edge to each drink he/she can make, with capacity 5.
Each drink has an edge to the sink, with capacity equal to the number of that drink ordered.
Compute the maximum flow from the source to the sink. If any order remains unfulfilled, there is no solution.
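A minimal sketch of this flow network in Python, assuming the input arrives as a dict from bartender to known recipes and a dict from drink to ordered quantity. The names `max_flow` and `can_serve` are made up for illustration, and Edmonds-Karp is just one of several max-flow algorithms that would work here.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest paths (found by BFS) until none remain."""
    # residual[u][v] is the remaining capacity on edge u -> v
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left
        # Collect the path and push as much flow as its tightest edge allows.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        flow += push

def can_serve(bartenders, order, minutes=5):
    """bartenders: name -> set of recipes; order: drink -> quantity (assumed input shape)."""
    capacity = {"source": {}}
    for name, recipes in bartenders.items():
        capacity["source"][("bartender", name)] = minutes
        capacity[("bartender", name)] = {
            ("drink", d): minutes for d in recipes if d in order
        }
    for drink, count in order.items():
        capacity[("drink", drink)] = {"sink": count}
    # The order is feasible only if every ordered drink carries flow.
    return max_flow(capacity, "source", "sink") == sum(order.values())
```

For example, `can_serve({"A": {"mojito", "martini"}, "B": {"mojito"}}, {"mojito": 6, "martini": 4})` is feasible because B can make five mojitos while A makes one mojito and four martinis.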
Create a list of the cocktails on the order, sorted by how many tenders know how to make each cocktail.
For example, the order is for
(2*CocktailA, 1*CocktailB, 2*CocktailC, 1*CocktailD)
CocktailA can be made by 4 tenders (Tenders A, B, C, D)
CocktailA can be made by 4 tenders (Tenders A, B, C, D)
CocktailB can be made by 3 tenders (Tenders A, B, C)
CocktailC can be made by 1 tender (Tender A)
CocktailC can be made by 1 tender (Tender A)
CocktailD can be made by 1 tender (Tender B)
Work backwards through that list, assigning jobs to tenders. If multiple tenders can make the cocktail, pick the one with the fewest jobs already assigned.
CocktailD = Tender B
CocktailC = Tender A
CocktailC = Tender A (again)
CocktailB = Tender C
CocktailA = Tender D
CocktailA = Tender B (again)
Tenders A and B both have 2 jobs, so the order will take 2 mins.
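The steps above can be sketched roughly as follows. This is only an illustration of the greedy heuristic as described (scarcest cocktails first, least-loaded tender wins), not a guaranteed-optimal method; the function name `greedy_assign` and the input shapes are assumptions.

```python
def greedy_assign(order, can_make):
    """order: cocktail -> quantity; can_make: cocktail -> list of tenders."""
    # One job per ordered drink, cocktails with the fewest capable tenders first.
    jobs = sorted(
        (c for c, qty in order.items() for _ in range(qty)),
        key=lambda c: len(can_make[c]),
    )
    load = {t: 0 for tenders in can_make.values() for t in tenders}
    assignment = []
    for cocktail in jobs:
        # Among the tenders who know this cocktail, pick the least busy one.
        tender = min(can_make[cocktail], key=lambda t: load[t])
        load[tender] += 1
        assignment.append((cocktail, tender))
    # The busiest tender determines how many minutes the order takes.
    return assignment, max(load.values(), default=0)

order = {"CocktailA": 2, "CocktailB": 1, "CocktailC": 2, "CocktailD": 1}
can_make = {
    "CocktailA": ["A", "B", "C", "D"],
    "CocktailB": ["A", "B", "C"],
    "CocktailC": ["A"],
    "CocktailD": ["B"],
}
assignment, minutes = greedy_assign(order, can_make)
```

On the example order this gives the same 2-minute result as the walkthrough.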
This is a vertex coloring problem. It is exactly analogous to the register allocation problem which is very well studied. See http://en.wikipedia.org/wiki/Register_allocation. It can also be thought of as a set cover problem which is analogous to vertex coloring.
Of course, here we need not find the actual coloring, we just need to determine whether its cardinality is 5 or less. If the bartender graph can be colored in 5 or fewer colors, then the answer is Yes, otherwise No. Here is another nice paper describing the problem in terms of "tasks" and "days" and "machines": http://www.polymtl.ca/pub/sites/lagrapheur/docs/en/documents/NotesChap7.pdf.
Now, figuring this out means computing what is called the "chromatic number" or "chromatic index" of the graph, which is NP-hard. In fact, someone has already asked on SO for an algorithm to find the chromatic number of a graph, but unfortunately did not get much of a response, see Algorithm for Chromatic Number of a Graph?
Just looking around the web I did find some code resources for doing colorings. One that can do this problem is called SMALLK. SMALLK can find colorings up to 8. Since we only need 5 for this problem this package can do it.
This is a variant of the college matching problem, where drinks are students and bartenders are colleges. That problem is in turn a generalization of the stable marriage problem, which might be of more use to you.
Related
I have to design an algorithm to solve a problem:
We have two groups of people (group A and group B; the number of people in group A is always less than or equal to the number of people in group B), all standing on a one-dimensional line. Each person has a corresponding number indicating their location. When the timer starts, each person in group A must find a partner in group B, but people in group B cannot move at all, and each person in group B can have at most 1 partner.
Suppose that people in group A move 1 unit/sec. How can I find the minimum time for everyone in group A to find a partner?
For example, if there are three people in group A at locations {5,7,8} and four people in group B at locations {2,3,4,9}, the optimal solution is 3 sec because max(5-3, 7-4, 9-8) = 3.
I could just use brute-force to solve it, but is there a better way of solving this problem?
This problem is a special case of the edit distance problem, and so a similar Dynamic Programming solution can be used to solve it. It's possible that a faster solution exists for this special case.
Let A = [a_0, a_1...,a_(m-1)] be the (sorted) positions of our m moving people, and B = [b_0, b_1...,b_(n-1)] be the n (sorted) destination spots, with m <= n. For the edit distance analogy, the allowed operations are:
Insert a number into A (free), or
Substitute an element a -> a' in A with cost |a-a'|.
We can solve this in O(n*m) time (plus sorting time of both A and B, if necessary).
We can define the dynamic programming via a cost function C(i, j) which is the minimum cost to move the first i people a_0, ... a_(i-1) using only the first j spots b_0, ... b_(j-1). You want C(m,n). Define C as follows:
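The definition of C is cut off above. As a sketch only, here is one recurrence that fits this setup, under the assumption that the goal is the one stated in the question (minimize the longest walk, so costs combine with max rather than a sum): either spot b_(j-1) stays unused, or person a_(i-1) takes it. The function name `min_meeting_time` is made up for illustration.

```python
def min_meeting_time(a, b):
    """Smallest possible maximum walking distance when everyone in a gets a distinct spot in b."""
    a, b = sorted(a), sorted(b)
    m, n = len(a), len(b)
    INF = float("inf")
    # cost[i][j]: best achievable maximum distance when the first i people
    # are matched using only the first j spots (INF if i > j).
    cost = [[INF] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        cost[0][j] = 0  # nobody to move
    for i in range(1, m + 1):
        for j in range(i, n + 1):
            skip = cost[i][j - 1]  # spot b[j-1] stays unused
            use = max(cost[i - 1][j - 1], abs(a[i - 1] - b[j - 1]))  # person i-1 takes it
            cost[i][j] = min(skip, use)
    return cost[m][n]
```

The table has (m+1)*(n+1) entries and each is filled in constant time, which matches the O(n*m) bound. On the question's example, `min_meeting_time([5, 7, 8], [2, 3, 4, 9])` evaluates to 3.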
Find the most appropriate team compositions for days in which it is possible. A set of n participants, k days, a team has m slots. A participant specifies how many days he wants to be a part of and which days he is available.
Result constraints:
Participants must not be participating in more days than they want
Participants must not be scheduled in days they are not available in.
Algorithm should do its best to include as many unique participants as possible.
A day will not be scheduled if less than m participants are available for that day.
I find myself solving this problem manually every week at work for my football team scheduling and I'm sure there is a smart programmatic approach to solve it. Currently, we consider only 2 days per week and colleagues write down their name for which day they wanna participate, and it ends up having big lists for each day and impossible to please everyone.
I considered a new approach in which each colleague writes down his name, desired times per week to play and which days he is available, an example below:
Kane 3 1 2 3 4 5
The above line means that Kane wants to play 3 times this week and he is available Monday through Friday. The first number is the number of days to play; the following numbers are the available days (1 to 7, Monday to Sunday).
Days with fewer than m (in my case, m = 12) participants are not gonna be scheduled. What would be the best way to approach this problem in order to find a solution that does its best to include each participant at least once and also considers their desires (when to play, how much to play)?
I can do programming, I just need to know what kind of algorithm to implement and maybe have a brief logical explanation for the choice.
Result constraints:
Participants must not play more than they want
Participants must not be scheduled in days they don't want to play
Algorithm should do its best to include as many participants as possible.
A day will not be scheduled if less than m participants are available for that day.
Scheduling problems can get pretty gnarly, but yours isn't too bad actually. (Well, at least until you put out the first automated schedule and people complain about it and you start adding side constraints.)
The fact that a day can have a match or not creates the kind of non-convexity that makes these problems hard, but if k is small (e.g., k = 7), it's easy enough to brute force through all of the 2^k possibilities for which days have a match. For the rest of this answer, assume we know.
Figuring out how to assign people to specific matches can be formulated as a min-cost circulation problem. I'm going to write it as an integer program because it's easier to understand in my opinion, and once you add side constraints you'll likely be reaching for an integer program solver anyway.
Let P be the set of people and M be the set of matches. For p in P and m in M let p ~ m if p is willing to play in m. Let U(p) be the upper bound on the number of matches for p. Let D be the number of people demanded by each match.
For each p ~ m, let x(p, m) be a 0-1 variable that is 1 if p plays in m and 0 if p does not play in m. For all p in P, let y(p) be a 0-1 variable (intuitively 1 if p plays in at least one match and 0 if p plays in no matches, but hold on a sec). We have constraints
# player doesn't play in too many matches
for all p in P, sum_{m in M | p ~ m} x(p, m) ≤ U(p)
# match has the right number of players
for all m in M, sum_{p in P | p ~ m} x(p, m) = D
# y(p) = 1 only if p plays in at least one match
for all p in P, y(p) ≤ sum_{m in M | p ~ m} x(p, m)
The objective is to maximize
sum_{p in P} y(p)
Note that we never actually force y(p) to be 1 if player p plays in at least one match. The maximization objective takes care of that for us.
You can write code to programmatically formulate and solve a given instance as a mixed-integer program (MIP) like this. With a MIP formulation, the sky's the limit for side constraints, e.g., avoid playing certain people on consecutive days, biasing the result to award at least two matches to as many people as possible given that as many people as possible got their first, etc., etc.
I have an idea if you need a basic solution that you can optimize and refine in small steps. I am talking about flow networks. Most people who already know what they are are probably turning up their noses, because flow networks are usually used to solve maximization problems, not general optimization problems. They are right in a sense, but I think the problem can initially be seen as maximizing the number of players who play each day. Needless to say, it is a kind of greedy approach if we stop here.
No more introduction: the purpose is to find the maximum flow in this graph:
Each player has a number of days on which he wants to play, represented as the capacity of the edge from the source to the node player x. Each player node has as many edges from player x to day_of_week as the capacity previously found. Each of these 2nd-level edges has a capacity of 1. The third level consists of the edges that link each day_of_week to the sink node. Quick example: player 2 is available on 2 days, Monday and Tuesday, and both days have a player limit of 12.
So far the 1st, 2nd and 4th constraints are satisfied (well, that was the easy part): after you have found the maximum flow of the entire graph, you select only those paths that have no residual capacity on both the 2nd level (from players to day_of_weeks) and the 3rd level (from day_of_weeks to the sink). It is easy to prove that with this level of "optimization", and under certain conditions, it may fail to find any acceptable path even though it would have found one had it made different choices while visiting the graph.
This is the optimization problem that I meant before. I came up with at least two heuristic improvements:
While you visit the graph, store day_of_weeks in a priority queue where days with more players assigned have a higher priority. This way the residual capacity of the entire graph is certainly less evenly distributed.
Randomness is your friend. You are not obliged to run this algorithm only once, and every time you run it you should pick a random edge from a node in the players' level. At the end you average the results and choose the most common outcome. This is a situation where the majority rule applies perfectly.
I should specify that everything above is just a starting point: the purpose of a heuristic is to find the best approximate solution possible. With this type of problem and given your probably small input, this is not the right way, but it is the easiest one when you do not know where to start.
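A rough sketch of the basic network from this answer, without the two heuristic improvements. It assumes one player-to-day edge per available day and an input dict of the form name -> (wanted_days, available_days); the helper names `add_edge`, `max_flow` and `schedule` are made up. A day is reported only if its edge to the sink is saturated, and as noted above, plain max flow may leave a day unfilled that a different set of choices would have filled.

```python
from collections import deque

def add_edge(residual, u, v, cap):
    residual.setdefault(u, {})[v] = cap
    residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge for the residual graph

def max_flow(residual, source, sink):
    """Edmonds-Karp on a dict-of-dicts residual graph, modified in place."""
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push

def schedule(players, team_size):
    """Return day -> sorted player names, for the days that were completely filled."""
    residual = {"source": {}, "sink": {}}
    days = set()
    for name, (wanted, available) in players.items():
        add_edge(residual, "source", ("player", name), wanted)  # 1st level
        for day in available:
            add_edge(residual, ("player", name), ("day", day), 1)  # 2nd level
            days.add(day)
    for day in days:
        add_edge(residual, ("day", day), "sink", team_size)  # 3rd level
    max_flow(residual, "source", "sink")
    teams = {}
    for day in sorted(days):
        if residual[("day", day)]["sink"] == 0:  # no residual capacity: day is full
            teams[day] = sorted(
                name for name, (_, available) in players.items()
                if day in available and residual[("player", name)][("day", day)] == 0
            )
    return teams

players = {"Kane": (1, [1, 2]), "Ann": (2, [1, 2]), "Bob": (1, [1])}
teams = schedule(players, team_size=2)
```

In this toy instance (teams of 2 instead of 12) both days can be filled only if Ann plays twice, Bob plays on day 1 and Kane on day 2, and the flow finds exactly that.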
I have a problem which I am converting into a TSP kind of problem so that I can explain it here and I am looking for any existing algorithms that might help me.
There are a list of places that I need to visit and I need to visit them all.
Some places have to be visited among the first x of n (i.e., they need to be among the first 3 or first 5 places visited, where the number is arbitrary).
Some other places have to be visited among the last y of n (i.e., they need to be among the last 3 or last 5 places visited).
The places could be categorized (some may not have a category). Places in the same category need to be visited as far apart from each other as possible (i.e., if 2 places are categorized as green, then I would like to visit as many places of other categories as possible between these green places).
Here is an example list:
A: category green: last 3
B: category none: ordering none
C: category pink: first 3
D: category none: ordering none
E: category none: last 3
F: category green: ordering none
G: category pink: first 3
The order I would like to come up with is:
G(pink,first3) -> F(green,none) -> C(pink,first3) -> D(none,none) -> B(none,none) -> E(none,last3) -> A(green,last3)
Explanation:
G came first, to keep it as far away from C as possible.
F came next to keep it as far away from A as possible.
C came next as it needed to be in first 3. C and G could be interchanged
D B could be placed anywhere
E came next as it had to be last 3
A came last as it had to be last 3 and by placing it at the end, it was as far as possible from F.
My idea is to evaluate each edge cost, where the edge cost would be calculated dynamically. So if you tried to visit A and then F it would have a high cost, as opposed to visiting A, then some other place, and then F (where the number of places in between would somehow be part of the cost). Also, I would introduce a start and an end place, so if I had to visit some places among the first x, I would be able to give them a low cost if start was within N places of that place. Same for the end.
I was wondering if there is a graph algorithm that can account for such dynamic weights/cost and determine the shortest path from start to end?
Note: in some cases a best case may not be available, and that would be OK, as long as I can show that the cost is high because there wasn't enough category separation (e.g., all places were in the same category).
Brute force algorithm
My initial idea is: given a list of places, generate every possible ordering, calculate the cost of each, and choose the cheapest. (But this would mean evaluating n! orderings; for 8 places that is 8! = 40320 orders to evaluate. Why 8? Because that is what I believe will be the average number of places to evaluate.)
But is there an algorithm that I could potentially use to determine it without testing all orderings?
One thing you could do:
Order the places as follows: first 1, ... first n, unordered, last n, ... last 1.
Go through the list and separate elements with the same color where possible without violating the previous order
Calculate the cost of this list and store it as the current best
Use this list to determine the order in which you evaluate permutations
While you build permutations, keep track of the cost
Abort building the current permutation when the cost exceeds the current best (including the theoretical minimum cost for the remaining places, if there is any)
Also abort when you have the theoretically possible best score.
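A simplified sketch of the pruned permutation search described above. The cost function is purely hypothetical (each same-category pair costs n minus the distance between them, so closer pairs cost more), the first/last requirements are treated as hard constraints, and the color-separation pre-pass and the theoretical-minimum bound are left out. `best_order` and the tuple layout are assumptions for illustration.

```python
def best_order(places):
    """places: name -> (category or None, "first"/"last"/None, k)."""
    n = len(places)

    def allowed(name, pos):
        _, side, k = places[name]
        if side == "first":
            return pos < k          # must be among the first k
        if side == "last":
            return pos >= n - k     # must be among the last k
        return True

    def added_cost(prefix, name):
        # Hypothetical penalty: same-category places cost more the closer they are.
        cat = places[name][0]
        if cat is None:
            return 0
        pos = len(prefix)
        return sum(n - (pos - i) for i, other in enumerate(prefix)
                   if places[other][0] == cat)

    # Evaluation order: "first" places, then unconstrained ones, then "last" places.
    rank = {"first": 0, None: 1, "last": 2}
    candidates = sorted(places, key=lambda p: rank[places[p][1]])
    best = {"cost": float("inf"), "order": None}

    def extend(prefix, used, cost):
        if cost >= best["cost"]:
            return  # abort: this partial permutation cannot beat the current best
        if len(prefix) == n:
            best["cost"], best["order"] = cost, list(prefix)
            return
        for name in candidates:
            if name not in used and allowed(name, len(prefix)):
                step = added_cost(prefix, name)
                used.add(name)
                prefix.append(name)
                extend(prefix, used, cost + step)
                prefix.pop()
                used.remove(name)

    extend([], set(), 0)
    return best["order"], best["cost"]

places = {
    "A": ("green", "last", 3), "B": (None, None, 0), "C": ("pink", "first", 3),
    "D": (None, None, 0), "E": (None, "last", 3), "F": ("green", None, 0),
    "G": ("pink", "first", 3),
}
order, cost = best_order(places)
```

Because every step adds a non-negative cost, the running total is a valid lower bound for the finished permutation, which is what makes the early abort safe.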
I have a list of 15 cities. I randomly draw 70 pairs out of the possible 15*14/2=105 pairs of cities. For each of the 70 pairs, I ask my participants to decide whether city A is bigger than city B.
The important thing is, sometimes participants make 'mistakes' and give an answer that's incompatible with their previous answers. (i.e., it violates transitivity).
I need a way to sort my cities based on each participant's response, in a way that minimizes the number of trials that violate transitivity.
I don't need the actual order of cities, as there might not be a unique solution. I just need to calculate the (minimum) number of intransitive answers given by each participant.
How could I do this other than using exhaustive search?
EDIT: To give an example, take cities A,B,C,D and E. Participant Jon Doe thinks that the correct order of the cities (from smallest to biggest) is ABCDE. I don't care whether he's actually right or not, I just care about how well his responses -listed below- match his belief.
In five independent trials, Jon replied the following:
trial 1: A < B
trial 2: B < C (+)
trial 3: C < D
trial 4: D < E (+)
trial 5: E > B (*)
So, the answer in trial 5 (*) is incompatible with those in trials 2 and 4 together. Either one trial (nr. 5) did not correspond to Jon's belief, or 2 trials (2 and 4) didn't. I don't care to figure out which was Jon's belief (ABCDE), I just need to know that the "minimum number of intransitive answers" for Jon Doe is 1.
So... the problem might be interesting, but it's not clear what you want. You need to sort your cities but you don't need their order?
Minimize the number of trials that violate transitivity... how do you do that? The intransitivity is in the answers you get, not in whatever you do with them.
Calculate the number of intransitive answers given by each participant... if you have all the answers of each of the subjects, then an inconsistency is a cycle in the directed graph where nodes are cities and a node points to another iff the participant said its city is bigger than the other's. There are algorithms for that, see this question.
Of course an edge might be part of more than one cycle, and in this case we could try to find the minimum number of edges we have to remove to make the graph acyclic. Unfortunately, that problem is NP-complete, so you won't find a fast general algorithm. However, since your numbers are fairly low, you might manage to find a passably fast solution.
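With only 15 cities, one way to get an exact answer without trying all orderings is a dynamic program over subsets of cities (2^15 = 32768 subsets). The sketch below is one possible approach, not something taken from the linked question; it assumes each pair of cities is asked at most once, and the name `min_intransitive` and the sample answers are made up for illustration.

```python
def min_intransitive(cities, answers):
    """answers: list of (smaller, bigger) pairs as judged by one participant.

    Returns the minimum number of answers that must be discarded so that
    the remaining ones agree with some total order of the cities.
    """
    index = {c: i for i, c in enumerate(cities)}
    n = len(cities)
    # said_bigger[v]: bitmask of cities u for which the participant answered "v < u"
    said_bigger = [0] * n
    for small, big in answers:
        said_bigger[index[small]] |= 1 << index[big]
    INF = float("inf")
    # best[subset]: fewest contradicted answers when the cities in subset
    # occupy the lowest positions of the order, in the best arrangement
    best = [INF] * (1 << n)
    best[0] = 0
    for subset in range(1, 1 << n):
        for v in range(n):
            if (subset >> v) & 1:
                rest = subset & ~(1 << v)
                # v is ranked above everything in rest, so each answer
                # "v < u" with u in rest is contradicted.
                violated = bin(said_bigger[v] & rest).count("1")
                best[subset] = min(best[subset], best[rest] + violated)
    return best[(1 << n) - 1]

consistent = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
with_cycle = consistent + [("E", "B")]  # hypothetical answer "E < B" closes a cycle
```

Here `min_intransitive("ABCDE", consistent)` is 0, while adding the hypothetical answer "E < B" creates the cycle B < C < D < E < B, so one answer has to go.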
Hope this helps.
I have a complex problem and I want to know if an existing and well understood solution model exists or applies, like the Traveling Salesman problem.
Input:
A calendar of N time events, defined by starting and finishing time, and place.
The capacity of each meeting place (maximum amount of people it can simultaneously hold)
A set of pairs (Ai,Aj) which indicates that attendant Ai wishes to meet with attendant Aj, and Aj accepted that invitation.
Output:
For each attendant A, a schedule of all the events he will attend. The main criterion is that each attendant should meet as many of the attendants who accepted his invites as possible, while satisfying the space constraints.
So far, we thought of solving with backtracking (trying out all possible solutions), and using linear programming (i.e. defining a model and solving with the simplex algorithm)
Update: If Ai already met Aj in some event, they don't need to meet anymore (they have already met).
Your problem is as hard as the minimum maximal matching problem in interval graphs. W.l.o.g., assume the capacity of each room is 2, which means a room can handle only one meeting at a time. You can model your problem with interval graphs: each interval (one per person) is a node, and there is an edge between A_i and A_j if they have time in common and also want to see each other. Set the weight of each edge to the amount of time they should see each other. If you find the minimum maximal matching in this graph, you have the solution for your restricted case. But notice that this graph is n-partite and each part is an interval graph.
P.S.: note that if the amount of time that people should spend with each other is fixed, this will be easier than the weighted case.
If you have access to a good MIP solver (CPLEX/Gurobi via their academic initiatives; COIN-OR and lp_solve are open-source and not bad either), I would definitely give simplex a try. I took a look at formulating your problem as a mixed integer program, and my feeling is that it will have pretty strong relaxations, so branch and cut and price will go a long way for you. These solvers scale remarkably well nowadays, especially the commercial ones. An advantage is that they also provide an upper bound, so you get an idea of solution quality, which is not the case for heuristics.
Formulation:
Define z(i,j) (binary) as a variable indicating that i and j are together in at least one event m in {1,2,...,N}.
Define z(i,j,m) (binary) to indicate they are together in event m.
Define z(i,m) to indicate that i is attending m.
z(i,j) and z(i,j,m) only exist if i and j are supposed to meet.
For each t, M^t is a subset of time events that are held simultaneously.
So if event 1 is from 9 to 11, event 2 is from 10 to 12 and event 3 is from 11 to 13, then
M^1 = {event 1, event 2} and M^2 = {event 2, event 3}, i.e. no person can attend both 1 and 2, or 2 and 3, but 1 and 3 is fine.
Maximize sum_{i,j} z(i,j), subject to:
z(i,j) <= sum_m z(i,j,m) (for every i,j)
(i and j can meet if they are in the same event m at least once)
z(i,j,m)<= z(i,m) (for every i,j,m)
(if i and j attend m, then i attends m)
z(i,j,m)<= z(j,m) (for every i,j,m)
(if i and j attend m, then j attends m)
sum_i z(i,m) <= C(m) (for every m)
(only C(m) persons can visit event m)
sum_(m in M^t) z(i,m) <= 1 (for every t and i)
(if m and m' are both overlapping time t, then no person can visit them both. )
As pointed out by @SaeedAmiri, this looks like a complex problem.
My guess would be that the backtracking and linear programming options you are considering will explode as soon as the number of assistants grows a bit (maybe in the order of tens of assistants).
Maybe you should consider a (meta)heuristic approach if optimality is not a requirement, or constraint programming to build an initial model and see how it scales.
To give you a more precise answer: why do you need to solve this problem? What would be the typical number of attendees? Number of rooms?