Shortest time for everyone to get to the destination - algorithm

I have to design an algorithm to solve a problem:
We have two groups of people (group A and group B; the number of people in group A is always less than or equal to the number of people in group B), all standing on a one-dimensional line. Each person has a number indicating their location. When the timer starts, each person in group A must find a partner in group B, but people in group B cannot move at all, and each person in group B can have at most 1 partner.
Suppose that people in group A move 1 unit/sec, how can I find the minimum time for everyone in group A to find a partner?
For example, if there are three people in group A at locations {5,7,8} and four people in group B at locations {2,3,4,9}, the optimal solution is 3 sec because max(5-3,7-4,9-8)=3.
I could just use brute-force to solve it, but is there a better way of solving this problem?

This problem is a special case of the edit distance problem, and so a similar Dynamic Programming solution can be used to solve it. It's possible that a faster solution exists for this special case.
Let A = [a_0, a_1, ..., a_(m-1)] be the (sorted) positions of our m moving people, and B = [b_0, b_1, ..., b_(n-1)] be the n (sorted) destination spots, with m <= n. For the edit distance analogy, the allowed operations are:
Insert a number into A (free), or
Substitute an element a -> a' in A with cost |a-a'|.
We can solve this in O(n*m) time (plus sorting time of both A and B, if necessary).
We can define the dynamic programming via a cost function C(i, j) which is the minimum cost to move the first i people a_0, ... a_(i-1) using only the first j spots b_0, ... b_(j-1). You want C(m,n). Define C as follows:
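A minimal sketch of one recurrence that fits this setup, assuming the cost is the maximum walking distance (as in the question) and that an optimal pairing can be taken in sorted order: C(i, j) = min(C(i, j-1), max(C(i-1, j-1), |a_(i-1) - b_(j-1)|)), with C(0, j) = 0 and C(i, j) infinite when j < i. The function name `min_meeting_time` is hypothetical and not part of the original answer.

```python
def min_meeting_time(a, b):
    """Smallest T such that everyone in `a` can reach a distinct spot in `b` within T."""
    a, b = sorted(a), sorted(b)
    m, n = len(a), len(b)
    INF = float("inf")
    # C[i][j]: best worst-case distance when the first i people use only the
    # first j spots. Row 0 is all zeros: with nobody to move, the cost is 0.
    C = [[0] * (n + 1)] + [[INF] * (n + 1) for _ in range(m)]
    for i in range(1, m + 1):
        for j in range(i, n + 1):
            skip = C[i][j - 1]  # leave spot b[j-1] unused
            use = max(C[i - 1][j - 1], abs(a[i - 1] - b[j - 1]))  # pair a[i-1] with b[j-1]
            C[i][j] = min(skip, use)
    return C[m][n]


print(min_meeting_time([5, 7, 8], [2, 3, 4, 9]))  # 3, matching the example in the question
```

The two nested loops give the O(n*m) running time mentioned above, on top of the sorting cost.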

Related

Optimize event seat assignments with Corona restrictions

Problem:
Given a set of group registrations, each for a varying number of people (1-7),
and a set of seating groups (immutable, at least 2m apart) varying from 1-4 seats,
I'd like to find the optimal assignment of people groups to seating groups:
People groups may be split among several seating groups (though preferably not)
Seating groups may not be shared by different people groups
(optional) the assignment should minimize the number of 'wasted' seats, i.e. maximize the number of seats in empty seating groups
(ideally it should run from within a Google Apps script, so memory and computational complexity should be as small as possible)
First attempt:
I'm interested in the decision problem (is it feasible?) as well as the optimization problem (see optional optimization function). I've modeled it as a SAT problem, but this does not find an optimal solution.
For this reason, I've tried to model it as an optimization problem. I'm thinking along the lines of a (remote) variation of multiple-knapsack, but I haven't been able to name it yet:
items: seating groups (size -> weight)
knapsacks: people groups (size -> container size)
constraint: combined item weight >= container size
optimization: minimize the number of items
As you can see, the constraint and optimization are inverted compared to the standard problem. So my question is: Am I on the right track here or would you go about it another way? If it's correct, does this optimization problem have a name?
You could approach this as an Integer Linear Programming Problem, defined as follows:
let P = the set of people groups, people group i consists of p_i people;
let T = the set of tables, table j has t_j places;
let x_ij be 1 if people from people group i are placed at table j, 0 otherwise
let M be a large penalty factor for empty seats
let N be a large penalty factor for splitting groups
// # of empty seats = # seats at occupied tables - # people seated
// every time a group uses more than one table,
// a penalty of N * (#tables - 1) is incurred
min M * [SUM_j(SUM_i[x_ij] * t_j) - SUM_i(p_i)] + N * SUM_i[(SUM_j(x_ij) - 1)]
// at most one group per table
s.t. SUM_i(x_ij) <= 1 for all j
// every group has enough seats
SUM_j(x_ij * t_j) >= p_i for all i
x_ij in {0, 1}
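To make the objective and constraints concrete, here is a rough brute-force sketch for tiny instances only. It is not an ILP solver: it simply tries every way of giving each table to one group or leaving it empty, which enforces "at most one group per table" by construction. The function name `best_seating` and the default values of M and N are assumptions for illustration.

```python
from itertools import product


def best_seating(people, tables, M=100, N=10):
    """Try every table -> group assignment; return (cost, owner) or None if infeasible.

    people[i] is the size of group i, tables[j] the number of seats at table j.
    owner[j] is the group seated at table j, or None if the table stays empty.
    """
    best = None
    for owner in product([None, *range(len(people))], repeat=len(tables)):
        seats = [0] * len(people)  # seats handed to each group
        used = [0] * len(people)   # tables handed to each group
        for j, i in enumerate(owner):
            if i is not None:
                seats[i] += tables[j]
                used[i] += 1
        if any(s < p for s, p in zip(seats, people)):
            continue  # some group does not have enough seats
        empty = sum(seats) - sum(people)   # empty seats at occupied tables
        splits = sum(u - 1 for u in used)  # extra tables beyond one per group
        cost = M * empty + N * splits
        if best is None or cost < best[0]:
            best = (cost, owner)
    return best


print(best_seating([3, 2], [2, 2, 4, 1]))
```

The search space grows as (groups + 1)^tables, so this only serves to check a formulation on small inputs before handing it to a real solver.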
Although this minimises the number of empty seats, it does not minimise the number of tables used or maximise the number of groups admitted. If you'd like to do that, you could expand the objective function by adding a penalty for every group turned away.
Integer linear programming is NP-hard in general, so without a suitable solver it might not be feasible to run this from Google Apps Script. I have no experience with that, so I'm afraid I can't help you there. But there are some methods to reduce your search space.
One would be through something called column generation. Here, the problem is split into two parts. The complex master problem is your main research question, but instead of the entire solution space, it tries to find the optimum from different candidate assignments (or columns).
The goal is then to define a subproblem that proposes new candidate solutions, which are then incorporated into the master problem. The power of a good subproblem is that it should be reducible to a simpler model, like a knapsack or shortest-path (Dijkstra) problem.

Best approach to a variation of a bucketing problem

Find the most appropriate team compositions for days in which it is possible. A set of n participants, k days, a team has m slots. A participant specifies how many days he wants to be a part of and which days he is available.
Result constraints:
Participants must not be participating in more days than they want
Participants must not be scheduled in days they are not available in.
Algorithm should do its best to include as many unique participants as possible.
A day will not be scheduled if less than m participants are available for that day.
I find myself solving this problem manually every week at work for my football team scheduling, and I'm sure there is a smart programmatic approach to it. Currently we consider only 2 days per week, and colleagues write down their names for the day they want to play; we end up with big lists for each day, and it is impossible to please everyone.
I considered a new approach in which each colleague writes down his name, desired times per week to play and which days he is available, an example below:
Kane 3 1 2 3 4 5
The above line means that Kane wants to play 3 times this week and is available Monday through Friday. The first number is the number of days to play; the remaining numbers are the available days (1 to 7, Monday to Sunday).
Days with fewer than m (in my case, m = 12) participants will not be scheduled. What would be the best way to approach this problem in order to find a solution that does its best to include each participant at least once and also considers their wishes (when to play, how much to play)?
I can do programming, I just need to know what kind of algorithm to implement and maybe have a brief logical explanation for the choice.
Scheduling problems can get pretty gnarly, but yours isn't too bad actually. (Well, at least until you put out the first automated schedule and people complain about it and you start adding side constraints.)
The fact that a day can have a match or not creates the kind of non-convexity that makes these problems hard, but if k is small (e.g., k = 7), it's easy enough to brute force through all of the 2^k possibilities for which days have a match. For the rest of this answer, assume we know which days those are.
Figuring out how to assign people to specific matches can be formulated as a min-cost circulation problem. I'm going to write it as an integer program because it's easier to understand in my opinion, and once you add side constraints you'll likely be reaching for an integer program solver anyway.
Let P be the set of people and M be the set of matches. For p in P and m in M let p ~ m if p is willing to play in m. Let U(p) be the upper bound on the number of matches for p. Let D be the number of people demanded by each match.
For each p ~ m, let x(p, m) be a 0-1 variable that is 1 if p plays in m and 0 if p does not play in m. For all p in P, let y(p) be a 0-1 variable (intuitively 1 if p plays in at least one match and 0 if p plays in no matches, but hold on a sec). We have constraints
# player doesn't play in too many matches
for all p in P, sum_{m in M | p ~ m} x(p, m) ≤ U(p)
# match has the right number of players
for all m in M, sum_{p in P | p ~ m} x(p, m) = D
# y(p) = 1 only if p plays in at least one match
for all p in P, y(p) ≤ sum_{m in M | p ~ m} x(p, m)
The objective is to maximize
sum_{p in P} y(p)
Note that we never actually force y(p) to be 1 if player p plays in at least one match. The maximization objective takes care of that for us.
You can write code to programmatically formulate and solve a given instance as a mixed-integer program (MIP) like this. With a MIP formulation, the sky's the limit for side constraints, e.g., avoid playing certain people on consecutive days, biasing the result to award at least two matches to as many people as possible given that as many people as possible got their first, etc., etc.
I have an idea if you need a basic solution that you can optimize and refine in small steps: flow networks. Those who already know what they are will probably turn up their noses, because flow networks are usually used to solve pure maximization problems rather than general optimization problems. They are right in a sense, but I think the problem can initially be seen as maximizing the number of players who play on each day. Needless to say, it is a kind of greedy approach if we stop here.
No more introduction, the purpose is to find the maximum flow inside this graph:
Each player has a number of days on which he wants to play, represented as the capacity of the edge from the source to that player's node. Each player node has an edge to every day_of_week node on which the player is available, and each of these second-level edges has a capacity of 1. The third level consists of the edges that link each day_of_week node to the sink. Quick example: player 2 is available 2 days, Monday and Tuesday, and both days have a player limit of 12.
So far the 1st, 2nd and 4th constraints are satisfied (well, that was the easy part): after you find the maximum flow of the entire graph, you select only those paths that have no residual capacity on both the second level (from players to day_of_weeks) and the third level (from day_of_weeks to the sink). It is easy to show that with this level of "optimization", and under certain conditions, the algorithm may fail to find an acceptable path even though it would have found one had it made different choices while traversing the graph.
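The graph described above can be sketched roughly as follows. This is a plain Edmonds-Karp maximum flow, picked here only for brevity (the answer does not name a particular max-flow algorithm), and the names `max_flow` and `build_roster` as well as the input format are assumptions for illustration. Only days that end up completely full are kept, mirroring the "no residual capacity" rule.

```python
from collections import defaultdict, deque


def max_flow(cap, source, sink):
    """Edmonds-Karp on a dict-of-dicts residual graph; mutates `cap`, returns the flow value."""
    total = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        path, v = [], sink
        while parent[v] is not None:  # walk back from the sink to the source
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        total += push


def build_roster(players, team_size):
    """players: {name: (days_wanted, [available_days])}; returns {day: [names]} for full days."""
    cap = defaultdict(lambda: defaultdict(int))

    def add_edge(u, v, c):
        cap[u][v] += c
        cap[v][u] += 0  # make sure the reverse (residual) edge exists

    days = set()
    for name, (wanted, available) in players.items():
        add_edge("source", ("player", name), wanted)
        for day in available:
            add_edge(("player", name), ("day", day), 1)
            days.add(day)
    for day in days:
        add_edge(("day", day), "sink", team_size)
    max_flow(cap, "source", "sink")
    roster = {}
    for day in sorted(days):
        # Residual capacity on the reverse edge day -> player equals the flow sent player -> day.
        playing = sorted(n for n in players if cap[("day", day)][("player", n)] > 0)
        if len(playing) == team_size:  # keep only days whose edge to the sink is saturated
            roster[day] = playing
    return roster


players = {"Kane": (2, [1, 2]), "Ann": (1, [1]), "Bob": (1, [2]), "Cy": (1, [3])}
print(build_roster(players, team_size=2))
```

In this toy input, day 3 has only one available player, so it is dropped from the result.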
This part is the optimization problem that I mentioned before. I came up with at least two heuristic improvements:
While you traverse the graph, store the day_of_weeks in a priority queue in which days with more players assigned have a higher priority. This way the residual capacity of the entire graph ends up less evenly distributed.
Randomness is your friend. You are not obliged to run this algorithm only once, and every time you run it you should pick a random edge from a node in the players' level. At the end you tally the results and choose the most common outcome. This is a situation where majority rule applies perfectly.
I should stress that everything above is just a starting point: the purpose of a heuristic is to find the best approximate solution possible. For this type of problem, and given your probably small input, this is not the ideal approach, but it is the easiest one when you do not know where to start.

Resource allocation- Matching

I've got a problem which I'm not sure can be solved by Linear Programming. Essentially there are 2 groups of people who list their preferences for one another and will subsequently be matched. I'm writing an algorithm for this. Each person in Group A has up to 4 choices from Group B and vice versa.
In formulating a solution, I am currently assigning a cost to each combination of pairs. For example if Person 1 from Group A ranks Person 3 from Group B as his/her number 1 choice and vice versa, then the cost is minimal (Pair 1-3 cost: 0.01). Similarly, I would allot a cost to other pairs, devising an objective function which seeks to have pairings which minimize overall cost.
However, I do not see this being feasible because I don't know how to define my constraints and overall objective function. Reading online and from textbooks, I find resource allocation problems to be different from what I am trying to do.
Can I seek your advice on how to proceed?
Your problem can be formulated as an "Assignment Problem." As a canonical case, assignment problems are for assigning "jobs" to "machines." They can just as easily be used for Matching two sets.
Here's the formulation:
Two sets of people A and B
Decision Variable Xij
Let Xij be 1 if person i (ith person in set A) is matched with jth person in set B; 0 otherwise
Parameters:
Let Cij be the cost of pairing person i with person j
Objective Function: Minimize (Sum over i) (sum over j) Cij * Xij
Constraints:
Every Person i gets paired exactly once
Sum over j Xij = 1 (for each i)
Every Person j gets paired exactly once
Sum over i Xij = 1 (for each j)
Xij are binary variables
Xij in {0, 1}
The neat thing about Assignment problems is that the optimal pairings can be found using the fairly easy to understand 'Hungarian Method.' You can also use an LP/IP solver you have at your disposal.
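As a small illustration of the formulation (not of the Hungarian method itself), the sketch below simply tries every permutation, which is only workable for a handful of people. The function name `best_assignment` and the cost matrix are made up for the example; it assumes both groups have the same size, as the "exactly once" constraints require.

```python
from itertools import permutations


def best_assignment(cost):
    """Return (total_cost, pairs) for the cheapest perfect matching.

    cost[i][j] is the cost of pairing person i of group A with person j of group B.
    A permutation perm encodes X: X[i][perm[i]] = 1 and every other X[i][j] = 0,
    so each row and each column is used exactly once.
    """
    n = len(cost)

    def total(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total)
    return total(best), list(enumerate(best))


# Hypothetical costs: low numbers mean the two people ranked each other highly.
cost = [
    [1, 5, 9],
    [4, 2, 7],
    [6, 8, 3],
]
print(best_assignment(cost))  # (6, [(0, 0), (1, 1), (2, 2)])
```

For realistic sizes, a polynomial-time routine such as SciPy's `scipy.optimize.linear_sum_assignment` would be a more sensible choice than enumerating n! permutations.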
Hope that helps.

Combinatorial best match

Say I have a Group data structure which contains a list of Element objects, such that each group has a unique set of elements:
public class Group
{
public List<Element> Elements;
}
and say I have a list of populations who require certain elements, in such a way that each population has a unique set of required elements:
public class Population
{
public List<Element> RequiredElements;
}
I have an unlimited quantity of each defined Group, i.e. they are not consumed by populations.
Say I am looking at a particular Population. I want to find the best possible match of groups such that there is minimum excess elements, and no unmatched elements.
For example: I have a population which needs wood, steel, grain, and coal. The only groups available are {wood, herbs}, {steel, coal, oil}, {grain, steel}, and {herbs, meat}.
The last group, {herbs, meat}, isn't required at all by my population, so it isn't used. All the others are needed, but herbs and oil are not required, so they are wasted. Furthermore, steel appears twice in the minimum set, so one lot of steel is also wasted. The best match in this example has a wastage of 3.
So for a few hundred Population objects, I need to find the minimum wastage best match and compute how many elements are wasted.
How do I even begin to solve this? Once I have found a match, counting the wastage is trivial. Finding the match in the first place is hard. I could enumerate all possibilities but with a few thousand populations and many hundreds of groups, it's quite a task. Especially considering this whole thing sits inside each iteration of a simulated annealing algorithm.
I'm wondering whether I can formulate the whole thing as a mixed-integer program and call a solver like GLPK at each iteration.
I hope I have explained the problem correctly. I can clarify anything that's unclear.
Here's my binary program, for those of you interested...
x is the decision vector, with entries in {0,1}; entry i says whether the population in question does or doesn't receive from group i. There is an entry for each group.
b is the column vector, with entries in {0,1}, which says which resources the population in question does or doesn't need. There is an entry for each resource.
A is a matrix, with entries in {0,1}, which says which resources are in which groups.
The program is:
Minimise: ((Ax - b)' * 1-vector) + (x' * 1-vector);
Subject to: Ax >= b;
The constraint just says that all required resources must be satisfied. The objective is to minimise all excess and the total number of groups used. (i.e. 0 excess with 1 group used is better than 0 excess with 5 groups used).
You can formulate an integer program for each population P as follows. Use a binary variable xj to denote whether group j is chosen or not. Let A be a binary matrix, such that Aij is 1 if and only if item i is present in group j. Then the integer program is:
min sum_{i,j} x_j A_ij
s.t. sum_j x_j A_ij >= 1 for all i in P,
x_j in {0, 1} for all j.
Note that you can obtain the minimum wastage by subtracting |P| from the optimal solution of the above IP.
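To check the formulation against the wood/steel/grain/coal example from the question, here is a brute-force sketch that enumerates every subset of groups instead of calling an IP solver. The function name `min_wastage` is hypothetical, and the exhaustive search is only meant for a handful of groups.

```python
from itertools import combinations


def min_wastage(required, groups):
    """Return (wastage, chosen_groups) for the cheapest cover, or None if no cover exists.

    Objective: total number of elements in the chosen groups (sum_{i,j} x_j A_ij).
    Constraint: every required element appears in at least one chosen group.
    """
    required = set(required)
    best = None
    for r in range(1, len(groups) + 1):
        for chosen in combinations(range(len(groups)), r):
            covered = set().union(*(groups[j] for j in chosen))
            if not required <= covered:
                continue  # some required element is left unmatched
            total = sum(len(groups[j]) for j in chosen)
            if best is None or total < best[0]:
                best = (total, chosen)
    if best is None:
        return None
    # Wastage = objective value minus |P|, as noted above.
    return best[0] - len(required), [groups[j] for j in best[1]]


groups = [{"wood", "herbs"}, {"steel", "coal", "oil"}, {"grain", "steel"}, {"herbs", "meat"}]
wastage, chosen = min_wastage({"wood", "steel", "grain", "coal"}, groups)
print(wastage)  # 3
```

The three chosen groups contain 7 elements in total, and 7 - 4 = 3 agrees with the wastage worked out by hand in the question.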
Do you mean the Maximum matching problem?
You need to build a bipartite graph in which one side is your populations and the other is your groups, with an edge between a group and a population if the group contains an element the population requires.
To find a maximum matching you can use Kuhn's algorithm, which is described very well on TopCoder.
But if you want to find a minimum edge dominating set (the smallest set of edges that covers all the vertices), the problem becomes NP-hard, and no polynomial-time algorithm is known for it.
Take a look at the weighted set cover problem, I think this is exactly what you described above. A basic description of the (unweighted) problem can be found here.
Finding the minimal waste as you defined above is equivalent to finding a set cover such that the sum of the cardinalities of the covering sets is minimal. Hence, the weight of each set (=a group of elements) has to be defined equal to its cardinality.
Since even the unweighted set cover problem is NP-complete, it is not likely that an efficient algorithm for your problem instances exists. Maybe a good greedy approximation algorithm will be sufficient for your purpose? Googling weighted set cover provides several promising results, e.g. this script.

Algorithm to find minimum number of weightings required to find defective ball from a set of n balls

Okay, here is a puzzle I have come across many times:
Given a set of 12 balls, one of which is defective (it weighs either less or more than the others), you are allowed to weigh 3 times to find the defective ball and also tell whether it weighs less or more.
The solution to this problem exists, but I want to know whether, given a set of n balls, we can algorithmically determine the minimum number of times you would need to use a beam balance to find which ball is defective and how (lighter or heavier).
A wonderful algorithm by Jack Wert can be found here
http://www.cut-the-knot.org/blue/OddCoinProblems.shtml
(as described for the case n is of the form (3^k-3)/2, but it is generalizable to other n, see the writeup below)
A shorter version and probably more readable version of that is here
http://www.cut-the-knot.org/blue/OddCoinProblemsShort.shtml
For n of the form (3^k-3)/2, the above solution applies perfectly and the minimum number of weighings required is k.
In other cases...
Adapting Jack Wert's algorithm for all n.
In order to modify the above algorithm for all n, you can try the following (I haven't tried proving the correctness, though):
First check if n is of the form (3^k-3)/2. If it is, apply the above algorithm.
If not,
If n = 3t (i.e. n is a multiple of 3), you find the least m > n such that m is of the form (3^k-3)/2. The number of weighings required will be k. Now form the groups 1, 3, 3^2, ..., 3^(k-2), Z, where 3^(k-2) < Z < 3^(k-1) and repeat the algorithm from Jack's solution.
Note: We would also need to generalize method A (the case when we know whether the coin is heavier or lighter) to arbitrary Z.
If n = 3t+1, try to solve for 3t (keeping one ball aside). If you don't find the odd ball among 3t, the one you kept aside is defective.
If n = 3t+2, form the groups for 3t+3, but have one group not have the one ball group. If you come to the stage when you have to rotate the one ball group, you know the defective ball is one of two balls and you can then weigh one of those two balls against one of the known good balls (from among the other 3t).
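The weighing count used in the steps above (the least k such that n <= (3^k-3)/2) is easy to compute. The sketch below only encodes that counting rule, taking the unproven generalization at face value, and does not construct the weighing scheme; the function name `min_weighings` and the n >= 3 restriction are assumptions.

```python
def min_weighings(n):
    """Least k with (3**k - 3) // 2 >= n, i.e. the count claimed in the steps above."""
    if n < 3:
        raise ValueError("this sketch assumes at least 3 balls")
    k = 2  # (3**2 - 3) // 2 == 3 is the smallest value of the form
    while (3**k - 3) // 2 < n:
        k += 1
    return k


for n in (3, 12, 13, 39, 40):
    print(n, min_weighings(n))
```

For the classic puzzle this gives min_weighings(12) == 3, and adding a thirteenth ball pushes the count to 4.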
Trichotomy ! :)
Explanation :
Given a set of n balls, subdivide it into 3 sets A, B and C of n/3 balls each.
Compare A and B. If they are equal, then the defective ball is in C.
etc.
So your minimum number of weighings is the number of times you can divide n by three, i.e. the base-3 logarithm of n, rounded up.
You could use a general planning algorithm: http://www.inf.ed.ac.uk/teaching/courses/plan/

Resources