Algorithm for dividing x people into n rooms of different sizes

For a project I have to design an algorithm that will fit a group of people into hotel rooms given their preference. I have created a dictionary in Python that has a person as key, and as a value a list of all people they would like to be in a room with.
There are different types of rooms that can hold between 2-10 people. How many rooms of what type there are is specified by the user of the program.
I have tried to brute force this problem by trying all room combinations and then giving each room a score based on the preference of the residents and looking for the maximum score. This works fine for small group sizes but having a group of 200 will give 200! combinations which my poor computer will not be able to compute within my lifetime.
I was wondering if there is an algorithm that I have not been able to find with the solution to my problem.
Thanks in advance!
Thijs

What you can do is think of your dictionary as a graph. Then you can create an adjacency matrix.
For example, let's say you have a group of 4 people: A, B, C and D.
A: wants to be with B and C
B: wants to be with A
C: wants to be with D
D: wants to be with A and C
Your matrix would look like this:
//   A B C D
// A 0 1 1 0
// B 1 0 0 0
// C 0 0 0 1
// D 1 0 1 0
Let's call this matrix M. You can then calculate the transpose (let's call it MT) and add M to MT. You will get something like this.
//   A B C D
// A 0 2 1 1
// B 2 0 0 0
// C 1 0 0 2
// D 1 0 2 0
Then order the rows (or the columns; it doesn't matter because the matrix is symmetric) by the sum of their values, largest first.
//   A B C D
// A 0 2 1 1
// C 1 0 0 2
// D 1 0 2 0
// B 2 0 0 0
Do the same with the columns
//   A C D B
// A 0 1 1 2
// C 1 0 2 0
// D 1 2 0 0
// B 2 0 0 0
Start filling your rooms from the first row, taking the greatest values in that row, and reduce the matrix by removing the people who have been assigned a room. Fill the biggest room first.
For example, if we have a room that holds 2 people, you'd assign A and B to it, since the biggest value in the first row is 2 and it corresponds to person B.
The reduced matrix would then be:
//   C D
// C 0 2
// D 2 0
And you loop till all is done.
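The steps above can be sketched in Python roughly as follows. This is only a minimal illustration, not a tuned implementation: the function name `greedy_rooms` is made up, and for rooms larger than 2 it assumes that roommates are picked by the largest values in the seed person's row, which the description leaves open.

```python
def greedy_rooms(prefs, room_sizes):
    """Greedy room filling based on the symmetric matrix M + MT.

    prefs maps each person to the list of people they want to room with.
    room_sizes is a list of room capacities (assumed input format).
    """
    people = list(prefs)

    def mutual(a, b):
        # Entry of M + MT: 2 if both want each other, 1 if only one does, else 0.
        return (b in prefs[a]) + (a in prefs[b])

    # Order people by the sum of their row, largest first (stable for ties).
    order = sorted(people, key=lambda p: -sum(mutual(p, q) for q in people if q != p))
    unassigned = list(order)
    rooms = []
    for size in sorted(room_sizes, reverse=True):  # biggest room first
        if not unassigned:
            break
        seed = unassigned.pop(0)  # first remaining row
        # Assumption: take the largest values in the seed's row as roommates.
        mates = sorted(unassigned, key=lambda q: -mutual(seed, q))[:size - 1]
        for m in mates:
            unassigned.remove(m)
        rooms.append([seed] + mates)
    return rooms


prefs = {"A": ["B", "C"], "B": ["A"], "C": ["D"], "D": ["A", "C"]}
rooms = greedy_rooms(prefs, [2, 2])
```

With the four-person example and two rooms of size 2, this pairs A with B and C with D, as in the walkthrough. Being greedy, it gives no guarantee of an optimal assignment for larger groups.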

You already had a greedy solution described. So instead I'll suggest a simulated annealing solution.
For this you first assign everyone to rooms randomly. And now you start considering swapping people at random. You always accept swaps that improve your score, but have a chance of accepting a bad swap. The chance of accepting a bad swap goes down if the swap is really bad, and also goes down with time. After you've experimented enough, whatever you have is probably pretty good.
It is called "simulated annealing" because it is a simulation of the process by which a slowly cooling substance forms a well-organized crystal structure. So the parameter that you usually use is called T for temperature. And a standard function is:
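    import math
    import random

    def maybe_swap(assignment, x, y, T):
        # score() and swap() are problem-specific helpers that you supply.
        score_now = score(assignment)
        swapped = swap(assignment, x, y)
        score_swapped = score(swapped)
        delta = score_swapped - score_now
        # Always accept improvements (this also avoids overflow in exp);
        # accept a worse swap with probability exp(delta / T).
        if delta >= 0 or random.random() < math.exp(delta / T):
            return swapped
        else:
            return assignment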
And then you just have to play around with how much work to do. Something like this:
    for count_down in range(400, 0, -1):  # stop at 1 so that T never reaches 0
        for i in range(n ** 2):  # n is the number of slots in the assignment
            x = random.randrange(n)
            y = random.randrange(n)
            if x != y:
                assignment = maybe_swap(assignment, x, y, count_down / 100.0)
(You should play around with the parameters.)
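To make the pieces concrete, here is one possible self-contained version for the room problem. It is a sketch under several assumptions that are not in the answer above: the assignment is a flat list of bed slots, empty beds are padded with `None`, the score is the number of satisfied "wants to room with" preferences, and the best assignment seen so far is remembered. The name `anneal_rooms` and its parameters are illustrative.

```python
import math
import random


def anneal_rooms(prefs, room_sizes, rounds=400, seed=None):
    """Simulated annealing over bed slots; returns (rooms, best_score)."""
    rng = random.Random(seed)
    people = list(prefs)
    slots = sum(room_sizes)
    if slots < len(people):
        raise ValueError("not enough beds for everyone")
    # One entry per bed; None marks an empty bed.
    assignment = people + [None] * (slots - len(people))
    rng.shuffle(assignment)
    # room_of_slot[i] is the index of the room that owns bed i.
    room_of_slot = [r for r, size in enumerate(room_sizes) for _ in range(size)]

    def score(a):
        room_of = {p: room_of_slot[i] for i, p in enumerate(a) if p is not None}
        # Count every satisfied preference (p wants q and they share a room).
        return sum(1 for p, wanted in prefs.items() for q in wanted
                   if q in room_of and room_of[q] == room_of[p])

    current = score(assignment)
    best, best_score = list(assignment), current
    n = len(assignment)
    for count_down in range(rounds, 0, -1):
        T = count_down / 100.0
        for _ in range(n * n):
            x, y = rng.randrange(n), rng.randrange(n)
            if x == y:
                continue
            assignment[x], assignment[y] = assignment[y], assignment[x]
            new = score(assignment)
            delta = new - current
            if delta >= 0 or rng.random() < math.exp(delta / T):
                current = new
                if current > best_score:
                    best, best_score = list(assignment), current
            else:
                # Rejected: undo the swap.
                assignment[x], assignment[y] = assignment[y], assignment[x]
    rooms = [[] for _ in room_sizes]
    for i, p in enumerate(best):
        if p is not None:
            rooms[room_of_slot[i]].append(p)
    return rooms, best_score


prefs = {"A": ["B", "C"], "B": ["A"], "C": ["D"], "D": ["A", "C"]}
rooms, best_score = anneal_rooms(prefs, [2, 2], rounds=50, seed=1)
```

Recomputing the full score after every swap is the simplest option. For 200 people it would probably be worth updating only the two affected rooms instead.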

Related

Assignment regarding dynamic programming. Making my code more efficient?

I've got an assignment regarding dynamic programming.
I'm to design an efficient algorithm that does the following:
There is a path, covered in spots. The user can move forward to the end of the path using a series of push buttons. There are 3 buttons. One moves you forward 2 spots, one moves you forward 3 spots, one moves you forward 5 spots. The spots on the path are either black or white, and you cannot land on a black spot. The algorithm finds the smallest number of button pushes needed to reach the end (past the last spot, can overshoot it).
The user inputs "n", the number of spots, and fills the array with n values of B or W (black or white). The first spot must be white. Here's what I have so far (it's only meant to be pseudocode):
    int x = 0
    int totalsteps = 0
    n = user input
    int countAtIndex[n-1] <- Set all values to -1 // I'll do the nitty gritty stuff like this after
    int spots[n-1] = user input
    pressButton(totalSteps, x) {
        if(countAtIndex[x] != -1 AND totalsteps >= countAtIndex[x]) {
            FAILED } //Test to see if the value has already been modified (not -1 or not better)
        else
            if (spots[x] = "B") {
                countAtIndex[x] = -2 // Indicator of invalid spot
                FAILED }
            else if (x >= n-5) { // Reached within 5 of the end, press 5 so take a step and win
                GIVE VALUE OF TOTALSTEPS + 1 A SUCCESSFUL SHORTEST OUTPUT
                FINISH }
            else
                countAtIndex[x] = totalsteps
                pressButton(totalsteps + 1, x+5) //take 5 steps
                pressButton(totalsteps + 1, x+3) //take 3 steps
                pressButton(totalsteps + 1, x+2) //take 2 steps
    }
I appreciate this may look quite bad but I hope it comes across okay, I just want to make sure the theory is sound before I write it out better. I'm wondering if this is not the most efficient way of doing this problem. In addition to this, where there are capitals, I'm unsure on how to "Fail" the program, or how to return the "Successful" value.
Any help would be greatly appreciated.
I should add, in case it's unclear, that I'm using countAtIndex[] to store the number of moves needed to get to that index in the path. I.e., position 3 (countAtIndex[2]) could have a value of 1, meaning it has taken 1 move to get there.

I'm converting my comment into an answer since this will be too long for a comment.
There are always two ways to solve a dynamic programming problem: top-down with memoization, or bottom-up by systematically filling an output array. My intuition says that the implementation of the bottom-up approach will be simpler. And my intent with this answer is to provide an example of that approach. I'll leave it as an exercise for the reader to write the formal algorithm, and then implement the algorithm.
So, as an example, let's say that the first 11 elements of the input array are:
index:  0 1 2 3 4 5 6 7 8 9 10 ...
spot:   W B W B W W W B B W B  ...
To solve the problem, we create an output array (aka the DP table), to hold the information we know about the problem. Initially all values in the output array are set to infinity, except for the first element which is set to 0. So the output array looks like this:
index:  0 1 2 3 4 5 6 7 8 9 10 ...
spot:   W B W B W W W B B W B
output: 0 - x - x x x - - x -
where - is a black space (not allowed), and x is being used as the symbol for infinity (a spot that's either unreachable, or hasn't been reached yet).
Then we iterate from the beginning of the table, updating entries as we go.
From index 0, we can reach 2 and 5 with one move. We can't move to 3 because that spot is black. So the updated output array looks like this:
index:  0 1 2 3 4 5 6 7 8 9 10 ...
spot:   W B W B W W W B B W B
output: 0 - 1 - x 1 x - - x -
Next, we skip index 1 because the spot is black. So we move on to index 2. From 2, we can reach 4,5, and 7. Index 4 hasn't been reached yet, but now can be reached in two moves. The jump from 2 to 5 would reach 5 in two moves. But 5 can already be reached in one move, so we won't change it (this is where the recurrence relation comes in). We can't move to 7 because it's black. So after processing index 2, the output array looks like this:
index:  0 1 2 3 4 5 6 7 8 9 10 ...
spot:   W B W B W W W B B W B
output: 0 - 1 - 2 1 x - - x -
After skipping index 3 (black) and processing index 4 (can reach 6 and 9), we have:
index:  0 1 2 3 4 5 6 7 8 9 10 ...
spot:   W B W B W W W B B W B
output: 0 - 1 - 2 1 3 - - 3 -
Processing index 5 won't change anything because 7,8,10 are all black. Index 6 doesn't change anything because 8 is black, 9 can already be reached in three moves, and we aren't showing index 11. Indexes 7 and 8 are skipped because they're black. And all jumps from 9 are into parts of the array that aren't shown.
So if the goal was to reach index 11, the number of moves would be 4, and the possible paths would be 2,4,6,11 or 2,4,9,11. Or if the array continued, we would simply keep iterating through the array, and then check the last five elements of the array to see which has the smallest number of moves. One more push of the 5 button from that spot then moves past the end, so the answer is that minimum plus one.
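For reference, the bottom-up table filling walked through above might look like this in Python. It is only a sketch: the function name `min_pushes` and the choice of a string of "W"/"B" characters as input are assumptions, and instead of inspecting the last five entries afterwards it records a finishing move whenever a jump lands past the end, which is equivalent.

```python
import math


def min_pushes(spots, jumps=(2, 3, 5)):
    """Fewest button pushes needed to move past the last spot, or None if impossible.

    spots is a non-empty sequence of "W"/"B"; spots[0] is assumed to be "W".
    """
    n = len(spots)
    best = [math.inf] * n  # best[i] = fewest pushes to stand on spot i
    best[0] = 0
    answer = math.inf
    for i in range(n):
        if spots[i] != "W" or best[i] == math.inf:
            continue  # black, or not reachable
        for j in jumps:
            target = i + j
            if target >= n:
                # This push moves past the end of the path.
                answer = min(answer, best[i] + 1)
            elif spots[target] == "W":
                # Recurrence: keep the smaller of the old and new move counts.
                best[target] = min(best[target], best[i] + 1)
    return None if answer == math.inf else answer


pushes = min_pushes("WBWBWWWBBWB")
```

Each spot is processed once and tries three jumps, so the running time is linear in the number of spots.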

Converting a number into a special base system

I want to convert a number in base 10 into a special base form like this:
A*2^2 + B*3^1 + C*2^0
A can take on values of [0,1]
B can take on values of [0,1,2]
C can take on values of [0,1]
For example, the number 8 would be
1*2^2 + 1*3 + 1.
It is guaranteed that the given number can be converted to this specialized base system.
I know how to convert from this base system back to base-10, but I do not know how to convert from base-10 to this specialized base system.

In short, treat every base number (2^2, 3^1, 2^0 in your example) as the weight of an item, and the whole number as the capacity of a bag. The problem asks us to find a combination of these items that fills the bag exactly.
First of all, this problem is NP-complete. It is equivalent to the subset sum problem, which can also be seen as a variant of the knapsack problem.
Despite this, the problem can be solved by a pseudo-polynomial-time algorithm using dynamic programming in O(nW) time, where n is the number of bases and W is the number to decompose. The details can be found on this Wikipedia page: http://en.wikipedia.org/wiki/Knapsack_problem#Dynamic_programming and this SO page: What's it called when I want to choose items to fill container as full as possible - and what algorithm should I use?.

Simplifying your "special base":
X = A * 4 + B * 3 + C
A ∈ {0,1}
B ∈ {0,1,2}
C ∈ {0,1}
Obviously the largest number that can be represented is 4 + 2 * 3 + 1 = 11
To figure out how to get the values of A, B, C you can do one of two things:
1. There are only 12 possible inputs: create a lookup table. Ugly, but quick.
2. Use some algorithm. A bit trickier.
Let's look at (1) first:
A B C X
0 0 0 0
0 0 1 1
0 1 0 3
0 1 1 4
0 2 0 6
0 2 1 7
1 0 0 4
1 0 1 5
1 1 0 7
1 1 1 8
1 2 0 10
1 2 1 11
Notice that 2 and 9 cannot be expressed in this system, while 4 and 7 occur twice. The fact that you have multiple possible solutions for a given input is a hint that there isn't a really robust algorithm (other than a look up table) to achieve what you want. So your table might look like this:
int A[] = {0,0,-1,0,0,1,0,1,1,-1,1,1};
int B[] = {0,0,-1,1,1,0,2,1,1,-1,2,2};
int C[] = {0,1,-1,0,1,1,0,0,1,-1,0,1};
Then look up A, B, C. If A < 0, there is no solution.
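If you would rather not type the table by hand, it can be generated by trying every digit combination. The sketch below is illustrative only: the name `decompose` and its parameters are made up, and where a number has two representations it simply returns the first one found.

```python
from itertools import product


def decompose(x, weights=(4, 3, 1), max_digits=(1, 2, 1)):
    """Return digits (A, B, C) with A*4 + B*3 + C*1 == x, or None if there are none."""
    # Enumerate every allowed digit combination (12 in total for this system).
    for digits in product(*(range(m + 1) for m in max_digits)):
        if sum(d * w for d, w in zip(digits, weights)) == x:
            return digits
    return None


# Build the lookup table for every input from 0 to 11.
table = {x: decompose(x) for x in range(12)}
```

Brute force is fine here because there are so few combinations. For long digit lists the dynamic programming approach mentioned earlier scales better.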

Resource Sharing/Trading algorithm

Let's say we have 3 people: Alice, Bob, and Charlie.
Let's say each of them has one kind of resource: apples, bananas, and coconuts, respectively.
Each of them has 3 units of that resource.
The goal of the algorithm is to make 1-1 trades such that each of them end up with 1 of each of our 3 resources. A list of those trades is what I want to obtain.
Ideally I would like to know how to solve this. But I'm willing to settle for the name of this kind of problem, or a problem similar to it that I can research and get ideas from.
The problem I'm working on will have around 600 objects and ~1000 people, each with a random amount/type of starting resources (with the assumption that there are enough resources to satisfy our end result), so ideally any solution provided would be feasible at that scale. But I'll take whatever I can get; I just need some kind of starting point.

The answers of ElKamina and Tyler Durden are decent, but they don't seem to take into account that Kuriso would like to perform 1-1 trades, that people may have multiple commodities, and that they may have multiple units of each commodity. I have a naive solution that does.
I think the original example was a bit oversimplified, so let's take another one:
     c1  c2  c3  c4
A     5   0   1   0
B     0   1   0   1
C     0   6   2   0
Where A,B,C are people and c1,c2,c3,c4 are the commodities.
First, let's calculate the ideal distribution, which is easily done: for each commodity, divide the sum of stuff by the number of people, rounded down, and everybody gets that:
     c1  c2  c3  c4
A     1   2   1   0
B     1   2   1   0
C     1   2   1   0
Now let's define a WANT function denoting how much of commodity c person X would need in order to reach the ideal position: WANT(X,c) = IDEAL(c) - Xc, where Xc is the amount of c that X currently holds.
     c1  c2  c3  c4  sum
A    -4   2   0   0   -2
B     1   1   1  -1    2
C     1  -4  -1   0   -4
Let's make a list of people ordered by the sum of their wants. Take the richest person, the one with the lowest want sum (in this case C), and try to satisfy his wants by matching him up with the person who has the most to offer of the commodity he wants most. If they can make a trade, great; if not, continue until we find a match (a match is guaranteed, eventually). In this example, C needs c1 and the one offering the most c1 is A. Iterating over the commodities, we find that A needs c2 and C has surplus c2, so they exchange them. Update their positions in the list, or remove them if they no longer have any needs. Iterate this until nobody has any wants. This won't produce a perfectly equal distribution, but one as equal as it can get with 1-for-1 trading.
This is indeed a naive solution, relying on the heuristic that the richest person has the best chance of offering something in return for the commodity he needs. The complexity is high, but with ordered lists it should be manageable for the numbers you specified.
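The bookkeeping part of this approach (the ideal distribution, the WANT values, and the ordering by want sum) could be sketched like this in Python. The function name `want_table` and the dictionary-of-lists input format are assumptions made for the illustration; the trading loop itself is not included.

```python
def want_table(holdings):
    """Compute the ideal per-person amounts and each person's WANT values.

    holdings maps a person to a list of amounts, one entry per commodity.
    """
    people = list(holdings)
    n_commodities = len(holdings[people[0]])
    # IDEAL(c): total amount of commodity c divided by the number of people, rounded down.
    ideal = [sum(holdings[p][c] for p in people) // len(people)
             for c in range(n_commodities)]
    # WANT(X, c) = IDEAL(c) - Xc; negative values are surplus.
    want = {p: [ideal[c] - holdings[p][c] for c in range(n_commodities)]
            for p in people}
    return ideal, want


holdings = {"A": [5, 0, 1, 0], "B": [0, 1, 0, 1], "C": [0, 6, 2, 0]}
ideal, want = want_table(holdings)
# People ordered from "richest" (lowest want sum) to "poorest".
by_wealth = sorted(want, key=lambda p: sum(want[p]))
```

For the example holdings this puts C first in the ordering, followed by A and then B.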

Assume you have a total of x1 resources of kind 1, ..., xn resources of kind n.
Assume you have k people and each of them has (or needs to end up with) y1, y2, ..., yk resources respectively.
Now, pick a person i and assign him the resources that are most prevalent. Once the assignment is done, decrement the corresponding xj's (i.e. if resource j is assigned to i, decrement xj).
Keep repeating until all resources are assigned.
This is the way to assign stuff most evenly. It assumes that you don't care about the sequence of trades, only about the end result itself.

To restate this, let's say you have a set of lists like this:
{ 1, 1, 1 }
{ 2, 2, 2 }
{ 3, 3, 3 }
and you want to swap elements from different sets until you have the sets like this:
{ 1, 2, 3 }
{ 1, 2, 3 }
{ 1, 2, 3 }
Now, you might notice that if we regard these lists as a single matrix then one matrix is the transpose of the other. You can perform this transposition by swapping across the 1-2-3 diagonal.
So item 2 in list 1 is swapped with item 1 in list 2, item 3 in list 1 is swapped with item 1 in list 3, and finally item 3 in list 2 is swapped with item 2 in list 3.
To sum up: do a matrix transposition by swapping across the diagonal.
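As a rough illustration, swapping across the diagonal and recording each swap as a trade could look like the following. This sketch only covers the square case from the example (n people who each start with n units of their own resource); the name `transpose_trades` is made up.

```python
def transpose_trades(lists):
    """Swap entries across the main diagonal in place and return the list of trades.

    Each trade is ((i, j), (j, i)): item j of list i is exchanged with item i of list j.
    Indices are 0-based.
    """
    n = len(lists)
    trades = []
    for i in range(n):
        for j in range(i + 1, n):
            lists[i][j], lists[j][i] = lists[j][i], lists[i][j]
            trades.append(((i, j), (j, i)))
    return trades


lists = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
trades = transpose_trades(lists)
```

Every swap is a 1-1 trade between two people, and n people need n(n-1)/2 of them.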

Hungarian algorithm - assign systematically

I'm implementing the Hungarian algorithm in a project. I managed to get it working up to what is called step 4 on Wikipedia. I can get the computer to create enough zeroes so that the minimal number of covering lines equals the number of rows/columns, but I'm stuck when it comes to actually assigning the right agent to the right job. I can see how I would make the assignment by hand, but that's more trial and error; I do not see the systematic method, which is of course essential for the computer to do it.
Say we have this matrix in the end:
    a   b   c   d
0  30   0   0   0
1   0  35   5   0
2  60   5   0   0
3   0  50  35  40
The zeroes we have to take to have each agent assigned to a job are (a, 3), (b, 0), (c, 2) and (d, 1). What is the system behind choosing these ones? My code now picks (b, 0) first, and ignores row 0 and column b from then on. However, it then picks (a, 1), and with this value picked there is no assignment possible for row 3 anymore.
Any hints are appreciated.

Well, I did manage to solve it in the end. The method I used was to check whether there are any columns/rows with only one zero. In that case, that agent must take that job, and that column and row have to be ignored from then on. Then repeat, so as to get a job for every agent.
In my example, (b, 0) would be the first choice. After that we have:
    a   b   c   d
0   x   x   x   x
1   0   x   5   0
2  60   x   0   0
3   0   x  35  40
Using the method again, we can do (a, 3), etc. I'm not sure whether it has been proven that this is always correct, but it seems it is.
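The single-zero rule can stall when every remaining row and column contains two or more zeroes. A different technique that does not depend on that rule is to treat the zero cells as edges of a bipartite graph and find a matching with augmenting paths. The sketch below uses that swapped-in technique, not the method described above; the name `assign_on_zeros` is illustrative.

```python
def assign_on_zeros(cost):
    """Match every row to a distinct column whose cell is 0, using augmenting paths.

    Returns a dict {row: column}, or None if no complete zero assignment exists.
    """
    n = len(cost)
    col_owner = [None] * n  # col_owner[c] = row currently assigned to column c

    def try_row(r, seen):
        for c in range(n):
            if cost[r][c] == 0 and c not in seen:
                seen.add(c)
                # Take a free column, or move its current owner to another zero.
                if col_owner[c] is None or try_row(col_owner[c], seen):
                    col_owner[c] = r
                    return True
        return False

    for r in range(n):
        if not try_row(r, set()):
            return None
    return {r: c for c, r in enumerate(col_owner)}


# Columns a, b, c, d are indices 0, 1, 2, 3.
cost = [
    [30, 0, 0, 0],
    [0, 35, 5, 0],
    [60, 5, 0, 0],
    [0, 50, 35, 40],
]
assignment = assign_on_zeros(cost)
```

On the example matrix this yields row 0 to b, row 1 to d, row 2 to c and row 3 to a, the same assignment as in the question. When row 3 finds column a taken by row 1, the search moves row 1 to column d instead of giving up.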

Sorting a binary 2D matrix?

I'm looking for some pointers here as I don't quite know where to start researching this one.
I have a 2D matrix with 0 or 1 in each cell, such as:
  1 2 3 4
A 0 1 1 0
B 1 1 1 0
C 0 1 0 0
D 1 1 0 0
And I'd like to sort it so it is as "upper triangular" as possible, like so:
  4 3 1 2
B 0 1 1 1
A 0 1 0 1
D 0 0 1 1
C 0 0 0 1
The rows and columns must remain intact, i.e. elements can't be moved individually and can only be swapped "whole".
I understand that there'll probably be pathological cases where a matrix has multiple possible sorted results (i.e. same shape, but differ in the identity of the "original" rows/columns.)
So, can anyone suggest where I might find some starting points for this? An existing library/algorithm would be great, but I'll settle for knowing the name of the problem I'm trying to solve!
I doubt it's a linear algebra problem as such, and maybe there's some kind of image processing technique that's applicable.
Any other ideas aside, my initial guess is just to write a simple insertion sort on the rows, then the columns and iterate that until it stabilises (and hope that detecting the pathological cases isn't too hard.)
More details: Some more information on what I'm trying to do may help clarify. Each row represents a competitor, each column represents a challenge. Each 1 or 0 represents "success" for the competitor on a particular challenge.
By sorting the matrix so all 1s are in the top-right, I hope to then provide a ranking of the intrinsic difficulty of each challenge and a ranking of the competitors (which will take into account the difficulty of the challenges they succeeded at, not just the number of successes.)
Note on accepted answer: I've accepted Simulated Annealing as "the answer" with the caveat that this question doesn't have a right answer. It seems like a good approach, though I haven't actually managed to come up with a scoring function that works for my problem.

An algorithm based on simulated annealing can handle this sort of thing without too much trouble. It's not great if you have small matrices, which most likely have a fixed solution, but great if your matrices get larger and the problem becomes more difficult.
(However, it also fails your desire that insertions can be done incrementally.)
Preliminaries
Devise a performance function that "scores" a matrix - matrices that are closer to your triangleness should get a better score than those that are less triangle-y.
Devise a set of operations that are allowed on the matrix. Your description was a little ambiguous, but if you can swap rows then one op would be SwapRows(a, b). Another could be SwapCols(a, b).
The Annealing loop
I won't give a full exposition here, but the idea is simple. You perform random transformations on the matrix using your operations. You measure how much "better" the matrix is after the operation (using the performance function before and after the operation). Then you decide whether to commit that transformation. You repeat this process a lot.
Deciding whether to commit the transform is the fun part: you need to decide whether to perform that operation or not. Toward the end of the annealing process, you only accept transformations that improved the score of the matrix. But earlier on, in a more chaotic time, you allow transformations that don't improve the score. In the beginning, the algorithm is "hot" and anything goes. Eventually, the algorithm cools and only good transforms are allowed. If you linearly cool the algorithm, then the choice of whether to accept a transformation is:
    public bool ShouldAccept(double cost, double temperature, Random random) {
        return Math.Exp(-cost / temperature) > random.NextDouble();
    }
You should read the excellent information contained in Numerical Recipes for more information on this algorithm.
Long story short, you should learn some of these general purpose algorithms. Doing so will allow you to solve large classes of problems that are hard to solve analytically.
Scoring algorithm
This is probably the trickiest part. You will want to devise a scorer that guides the annealing process toward your goal. The scorer should be a continuous function that results in larger numbers as the matrix approaches the ideal solution.
How do you measure the "ideal solution" - triangleness? Here is a naive and easy scorer: For every point, you know whether it should be 1 or 0. Add +1 to the score if the matrix is right, -1 if it's wrong. Here's some code so I can be explicit (not tested! please review!)
    int Score(Matrix m) {
        var score = 0;
        for (var r = 0; r < m.NumRows; r++) {
            for (var c = 0; c < m.NumCols; c++) {
                var val = m.At(r, c);
                var shouldBe = (c >= r) ? 1 : 0;
                if (val == shouldBe) {
                    score++;
                }
                else {
                    score--;
                }
            }
        }
        return score;
    }
With this scoring algorithm, a random field of 1s and 0s will give a score of 0. An "opposite" triangle will give the most negative score, and the correct solution will give the most positive score. Diffing two scores will give you the cost.
If this scorer doesn't work for you, then you will need to "tune" it until it produces the matrices you want.
This algorithm is based on the premise that tuning this scorer is much simpler than devising the optimal algorithm for sorting the matrix.
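Putting the scorer, the two swap operations and the acceptance test together, a Python version of the whole loop might look like the sketch below. All names and parameters (`anneal_matrix`, `steps`, `t_start`) are illustrative choices rather than part of the description above. It assumes at least two rows and two columns, cools linearly, and remembers the best matrix seen so far.

```python
import math
import random


def score(m):
    """+1 for each cell that matches the upper-triangular target, -1 otherwise."""
    total = 0
    for r, row in enumerate(m):
        for c, val in enumerate(row):
            should_be = 1 if c >= r else 0
            total += 1 if val == should_be else -1
    return total


def swap_rows(m, labels, a, b):
    m[a], m[b] = m[b], m[a]
    labels[a], labels[b] = labels[b], labels[a]


def swap_cols(m, labels, a, b):
    for row in m:
        row[a], row[b] = row[b], row[a]
    labels[a], labels[b] = labels[b], labels[a]


def anneal_matrix(matrix, steps=5000, t_start=2.0, seed=None):
    """Return (best_score, best_matrix, row_labels, col_labels)."""
    rng = random.Random(seed)
    m = [row[:] for row in matrix]
    rows = list(range(len(m)))     # original row index at each position
    cols = list(range(len(m[0])))  # original column index at each position
    current = score(m)
    best = (current, [row[:] for row in m], rows[:], cols[:])
    for step in range(steps):
        temperature = t_start * (1 - step / steps) + 1e-9  # linear cooling, never 0
        if rng.random() < 0.5:
            op, labels = swap_rows, rows
        else:
            op, labels = swap_cols, cols
        a, b = rng.sample(range(len(labels)), 2)
        op(m, labels, a, b)
        new = score(m)
        cost = current - new  # positive when the move made the matrix worse
        if cost <= 0 or math.exp(-cost / temperature) > rng.random():
            current = new
            if current > best[0]:
                best = (current, [row[:] for row in m], rows[:], cols[:])
        else:
            op(m, labels, a, b)  # undo the rejected move
    return best


original = [[0, 1, 1, 0], [1, 1, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
best_score, best_m, row_labels, col_labels = anneal_matrix(original, steps=2000, seed=0)
```

The returned labels say which original row and column ended up at each position, which is what a ranking of competitors and challenges would be read from.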

I came up with the algorithm below, and it seems to work correctly.
Phase 1: move rows with most 1s up and columns with most 1s right.
First the rows. Sort the rows by counting their 1s. We don't care if 2 rows have the same number of 1s.
Now the columns. Sort the cols by counting their 1s. We don't care if 2 cols have the same number of 1s.
Phase 2: repeat phase 1 but with extra criteria, so that we satisfy the triangular matrix morph.
Criterion for rows: if 2 rows have the same number of 1s, we move up the row that begins with fewer 0s.
Criterion for cols: if 2 cols have the same number of 1s, we move right the col that has fewer 0s at the bottom.
Example:
Phase 1
   1 2 3 4                  1 2 3 4                  4 1 3 2
A  0 1 1 0               B  1 1 1 0               B  0 1 1 1
B  1 1 1 0 - sort rows-> A  0 1 1 0 - sort cols-> A  0 0 1 1
C  0 1 0 0               D  1 1 0 0               D  0 1 0 1
D  1 1 0 0               C  0 1 0 0               C  0 0 0 1
Phase 2
   4 1 3 2                  4 1 3 2
B  0 1 1 1               B  0 1 1 1
A  0 0 1 1 - sort rows-> D  0 1 0 1 - sort cols-> "completed"
D  0 1 0 1               A  0 0 1 1
C  0 0 0 1               C  0 0 0 1
Edit: it turns out that my algorithm doesn't give proper triangular matrices always.
For example:
Phase 1
   1 2 3 4                  1 2 3 4
A  1 0 0 0               B  0 1 1 1
B  0 1 1 1 - sort rows-> C  0 0 1 1 - sort cols-> "completed"
C  0 0 1 1               A  1 0 0 0
D  0 0 0 1               D  0 0 0 1
Phase 2
   1 2 3 4                  1 2 3 4                  2 1 3 4
B  0 1 1 1               B  0 1 1 1               B  1 0 1 1
C  0 0 1 1 - sort rows-> C  0 0 1 1 - sort cols-> C  0 0 1 1
A  1 0 0 0               A  1 0 0 0               A  0 1 0 0
D  0 0 0 1               D  0 0 0 1               D  0 0 0 1
                         (no change)
(*) Perhaps a phase 3 will increase the good results. In that phase we place the rows that start with fewer 0s in the top.

Look for a 1987 paper by Anna Lubiw on "Doubly Lexical Orderings of Matrices".
There is a citation below. The ordering is not identical to what you are looking for, but is pretty close. If nothing else, you should be able to get a pretty good idea from there.
http://dl.acm.org/citation.cfm?id=33385

Here's a starting point:
Convert each row from binary bits into a number
Sort the numbers in descending order.
Then convert each row back to binary.

Basic algorithm:
Determine the row sums and store the values. Determine the column sums and store the values.
Sort the rows by row sum in descending order. Sort the columns by column sum in ascending order.
Hopefully, you should have a matrix with as close to an upper-right triangular region as possible.
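A sum-based sort along these lines could be sketched as follows. The name `sort_by_sums` and the label arguments are assumptions made for the illustration. To push the 1s toward the upper right, rows are ordered from most 1s to fewest and columns from fewest 1s to most; ties keep their original order.

```python
def sort_by_sums(m, row_labels, col_labels):
    """Reorder whole rows and columns by their sums; returns (matrix, rows, cols)."""
    # Rows with the most 1s go to the top.
    row_order = sorted(range(len(m)), key=lambda r: -sum(m[r]))
    # Columns with the most 1s go to the right.
    col_order = sorted(range(len(m[0])), key=lambda c: sum(row[c] for row in m))
    sorted_m = [[m[r][c] for c in col_order] for r in row_order]
    return (sorted_m,
            [row_labels[r] for r in row_order],
            [col_labels[c] for c in col_order])


m = [[0, 1, 1, 0], [1, 1, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
sorted_m, rows, cols = sort_by_sums(m, ["A", "B", "C", "D"], [1, 2, 3, 4])
```

A single pass like this does not break ties between rows or columns with equal sums, so the result is not guaranteed to be the most triangular arrangement.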

Treat the rows as binary numbers, with the leftmost column as the most significant bit, and sort them in descending order, top to bottom.
Treat the columns as binary numbers, with the bottommost row as the most significant bit, and sort them in ascending order, left to right.
Repeat until you reach a fixed point. Proof that the algorithm terminates is left as an exercise for the reader.
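In Python the two sorts map directly onto lexicographic comparison of rows and of bottom-to-top reversed columns. The sketch below is illustrative: the name `doubly_sort` is made up, and the `max_rounds` cap is an added safety net rather than part of the description.

```python
def doubly_sort(matrix, max_rounds=100):
    """Alternate row and column sorts until the matrix stops changing."""
    m = [list(row) for row in matrix]
    for _ in range(max_rounds):
        before = [row[:] for row in m]
        # Rows as binary numbers, leftmost column most significant, descending.
        m.sort(reverse=True)
        # Columns as binary numbers, bottommost row most significant, ascending.
        cols = sorted(zip(*m), key=lambda col: col[::-1])
        m = [list(row) for row in zip(*cols)]
        if m == before:
            break  # fixed point reached
    return m


m = [[0, 1, 1, 0], [1, 1, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
result = doubly_sort(m)
```

On the 4x4 example from the question this reaches a fixed point after one round of sorting. To recover the rankings you would also need to carry the row and column labels along, which is omitted here.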

Resources