Enumerate all possible distributions of n balls into k boxes [duplicate] - algorithm

The exact problem I am referring to, and the number of distributions for it, is computed here. I am interested in listing those distributions explicitly.
For example, with 5 balls and 3 boxes, one distribution is 2 balls in box 1, 2 in box 2, and 1 in box 3, written as, say, 221. Now I want to list all such possible distributions:
212
131
104
.
.
.
One way is to run the MATLAB command perms([0,0,0,0,0,1,1,1]). This essentially generates all permutations of 5 balls and 2 sticks, but there is massive over-counting because perms does not recognize identical objects.

Very simple ... sort of.
function alloc(balls, boxes):
    if boxes == 1
        return [[balls]]
    else
        result = []
        for n in range 0:balls
            prepend n to each list in alloc(balls-n, boxes-1) and add those lists to result
        return result
That's the basic recursion logic: pick each possible quantity of balls, then recur on the remaining balls and one box fewer.
The list-gluing methods will be language-dependent; I leave them as an exercise for the student.
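For readers who want the list gluing spelled out, here is a minimal Python sketch of the recursion above. The choice of Python and the list-of-lists return type are assumptions for illustration, not part of the original answer.

```python
def alloc(balls, boxes):
    """List every way to put `balls` identical balls into `boxes` labelled boxes."""
    if boxes == 1:
        return [[balls]]  # the last box takes whatever is left
    result = []
    for n in range(balls + 1):  # n balls go into the current box
        for rest in alloc(balls - n, boxes - 1):
            result.append([n] + rest)  # glue the choice onto each sub-result
    return result


for counts in alloc(5, 3):
    print("".join(map(str, counts)))  # e.g. 221 means 2, 2 and 1 balls
```

For 5 balls and 3 boxes this yields 21 distributions, which agrees with the stars-and-bars count C(7, 2).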

You can use unique() to get rid of identical rows generated by perms():
A = unique(perms([0,0,0,0,0,1,1,1]), 'rows');
% `A` will contain every distinct arrangement of [0,0,0,0,0,1,1,1], one per row
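The perms-then-unique route still builds every permutation before discarding the duplicates. A possible alternative, sketched here in Python as an illustration rather than as the answerer's method, is to choose the stick positions directly, so each distribution is produced exactly once. The function name stars_and_bars is hypothetical.

```python
from itertools import combinations


def stars_and_bars(balls, boxes):
    """Yield each distribution of identical balls into labelled boxes exactly once."""
    total = balls + boxes - 1  # slots shared by balls and sticks
    for sticks in combinations(range(total), boxes - 1):
        counts = []
        previous = -1
        for position in sticks:
            counts.append(position - previous - 1)  # balls between two sticks
            previous = position
        counts.append(total - previous - 1)  # balls after the last stick
        yield counts


for counts in stars_and_bars(5, 3):
    print(counts)
```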

Related

Double hashing using composite numbers in second hash function

I realize that the best practice is to use the largest prime number (smaller than the size of the array) in the mod function of the second hash function.
But my question is regarding the use of numbers that are not prime numbers.
I'm not interested in a pseudo-code just the idea behind the concept.
Let's say I have an array of size m=20, and I have to choose between 6, 9, 12 and 15 as the value used in the second hash function. Which of them will give me the best 'spread'?
My first thought is to go for the same idea as choosing a prime number, only slightly modified, which means using the largest number that has the fewest divisors (other than 1 and itself):
6 -> 2,3
9 -> 3
12 -> 2,3,4,6
15 -> 3,5
Right off the bat I can rule out 6 (a larger number with the same number of divisors exists) and 12 (too many divisors).
Now the question arises: should I use 9, which has the fewest divisors, or should I choose 15, which has more divisors but is much larger than 9 and a lot closer to the size of the array (m=20)?
Am I correct in using this approach? Or is there a better way of choosing a number, given that I can only choose from the numbers stated above?
I have found the answer I was looking for, so I'm leaving the question here with the correct answer in case anyone else ever needs it.
If we are forced to choose a number that is not a prime number as the number to be used in the second hash function (in the mod of that function):
The correct approach is to use the GCD function (Greatest Common Divisor) to find numbers that are "prime with respect to each other" (coprime). This means that we are looking for any number whose gcd with 20 is 1.
In this case:
gcd(20,6)= 2
gcd(20,9)= 1
gcd(20,12)= 4
gcd(20,15)= 5
As we can see, the gcd between 20 and 9 is 1, which means that they have no common factors other than 1. Therefore, 9 is the correct answer.
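The gcd values can be checked with a short Python sketch. As an illustration of why coprimality matters, it also treats each candidate as a probe step and counts how many of the 20 slots that step reaches. Using the candidate itself as the step is an assumption made for the demonstration, and slots_visited is a hypothetical helper.

```python
from math import gcd


def slots_visited(table_size, step, start=0):
    """Count the distinct slots reached by probing start, start+step, ... (mod table_size)."""
    seen = set()
    position = start
    while position not in seen:
        seen.add(position)
        position = (position + step) % table_size
    return len(seen)


m = 20
for candidate in (6, 9, 12, 15):
    # A step that shares a factor with m can only reach m / gcd(m, step) slots.
    print(candidate, gcd(m, candidate), slots_visited(m, candidate))
```

Only the candidate whose gcd with 20 is 1 reaches every slot of the table.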

Select optimal pairings of elements

I have the following problem, e.g.:
Given a bucket with symbols
1 1 2 3 3 4
And a book of recipes to create pairs, e.g.:
12 13 24
Select from the bucket an optimal pairing, leaving as few symbols as possible in the bucket. Using the example values above, the optimal pairing would be:
13 13 24, which would use all the symbols given.
Naive picking from the bucket could result in something like:
12 13, leaving the 3 and 4 unmatched. 3 and 4 cannot be matched because the book does not contain a recipe for that particular pair.
Notes:
The real problem consists on average of 500 elements in the bucket, drawn from about 30 kinds of symbols.
We've tried to implement the solution using a brute-force algorithm, however I am afraid that even our grandchildren will not live long enough to see the result :).
There is no limit to the size of the recipe book; it could even contain every possible pair in the bucket. A pair made of the same element twice is not allowed.
The answer is not required to empty the bucket completely. It's just about getting the most pairs out of the bucket. It's okay to leave some in the bucket. It would be best to look for the optimal solution, however a close approximation is also good enough.
I will appreciate an answer that proposes or hints at an algorithm to solve the problem.
Examples:
Bucket:
1 1 2 2 2 2 3 3 3 4 5 6 7 8 8
Recipe book:
12 34 15 68
Optimal result (one of possible):
{1 2} {1 2} {3 4} {6 8}
Leftover:
2 2 3 3 5 7 8
This problem is essentially the maximum matching problem with the small twist that you're allowed to have duplicate objects. Here's one way to solve this problem assuming you have a solver for maximum matching:
Create a node for each number in the input list.
For each recipe, for each pair of numbers matching that recipe, add an edge between the nodes for those numbers.
Run a maximum matching algorithm and return the pairs reported that way.
There are a good number of off-the-shelf maximum matching algorithms you can use, and if you need to code one up yourself, consider Edmonds' Blossom Algorithm, which is reasonably efficient and less tricky to code up than other approaches.
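The three steps above can be sketched in Python as follows. This is an illustrative sketch, not the answerer's code: build_graph and max_matching are hypothetical names, recipes are assumed to be given as pairs of symbols, and the matcher is a compact BFS-based variant of the blossom algorithm, so an off-the-shelf solver may be preferable in practice.

```python
from collections import deque


def build_graph(bucket, recipes):
    """One node per bucket item; an edge joins two items whose symbols form a recipe."""
    allowed = {frozenset(r) for r in recipes}
    adj = [[] for _ in bucket]
    for i in range(len(bucket)):
        for j in range(i + 1, len(bucket)):
            if frozenset((bucket[i], bucket[j])) in allowed:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def max_matching(adj):
    """Maximum matching in a general graph, returned as a list of index pairs."""
    n = len(adj)
    match, parent, base = [-1] * n, [-1] * n, list(range(n))

    def lca(a, b):
        # Lowest common ancestor of two even vertices in the alternating tree.
        seen = [False] * n
        while True:
            a = base[a]
            seen[a] = True
            if match[a] == -1:
                break
            a = parent[match[a]]
        while True:
            b = base[b]
            if seen[b]:
                return b
            b = parent[match[b]]

    def mark_path(v, b, child, in_blossom):
        # Walk from v up to the blossom base b, flagging the vertices on the way.
        while base[v] != b:
            in_blossom[base[v]] = in_blossom[base[match[v]]] = True
            parent[v] = child
            child = match[v]
            v = parent[match[v]]

    def find_path(root):
        # BFS for an augmenting path from root, contracting odd cycles (blossoms).
        used = [False] * n
        for i in range(n):
            parent[i], base[i] = -1, i
        used[root] = True
        queue = deque([root])
        while queue:
            v = queue.popleft()
            for to in adj[v]:
                if base[v] == base[to] or match[v] == to:
                    continue
                if to == root or (match[to] != -1 and parent[match[to]] != -1):
                    cur = lca(v, to)
                    in_blossom = [False] * n
                    mark_path(v, cur, to, in_blossom)
                    mark_path(to, cur, v, in_blossom)
                    for i in range(n):
                        if in_blossom[base[i]]:
                            base[i] = cur
                            if not used[i]:
                                used[i] = True
                                queue.append(i)
                elif parent[to] == -1:
                    parent[to] = v
                    if match[to] == -1:
                        return to  # free vertex reached: augmenting path found
                    used[match[to]] = True
                    queue.append(match[to])
        return -1

    for root in range(n):
        if match[root] == -1:
            v = find_path(root)
            while v != -1:  # flip matched/unmatched edges along the path
                pv = parent[v]
                nxt = match[pv]
                match[v], match[pv] = pv, v
                v = nxt
    return [(i, match[i]) for i in range(n) if match[i] > i]


bucket = [1, 1, 2, 3, 3, 4]
recipes = [(1, 2), (1, 3), (2, 4)]
pairs = max_matching(build_graph(bucket, recipes))
print([(bucket[i], bucket[j]) for i, j in pairs])
```

With about 500 bucket items the graph has at most 124750 edges, which this cubic-time sketch should handle, although a tuned library implementation would be faster.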
First generate all possible pairs of symbols and store them together with the indices of each symbol. If you have n symbols, then n*(n-1)/2 pairs are going to be generated (in the max case n=500, that is 124750 pairs).
Ex: bucket with symbols 1 1 3
Then the pairs that are going to be generated are (11,1,2)(13,1,3)(13,2,3).
General format: ( a[i]a[j], i, j ).
Now let's loop over the generated pairs and delete the pairs that don't exist in the book of recipes, so now we have at most 30 pairs.
Next let's build a graph such that the nodes are our generated pairs, and two nodes are connected if the two pairs share no index (using 2 nested loops over our pairs).
Finally we can perform BFS or DFS and find the largest of the resulting graphs, which holds the answer to our problem.
If you want a C++/Java implementation, please don't hesitate to ask.

program to get all the combinations of ball-box application

I am new to combination and permutation related algorithms. Does anybody have any thoughts on how to write a program to solve this classical problem? There are 3 boxes (A,B,C) and 10 balls (1,2,3,...,10), and we want to put all the balls into the boxes. The result should be {Box A: ball 1; Box B: ball 2,3,4; Box C: ball 5 6 7 8 9 10}, {Box A: ball 1 2; Box B: ball 3 4 5; Box C: ball 6 7 8 9 10}, .... I want to get all the combinations (not the number of different combinations).
Furthermore, what if there is a constraint that each box contains at most 4 balls?
Thank you.
You can put the first ball in any of three boxes, so you have three variants.
There are three variants for the second ball, three for the third and so on.
The choices are independent, so you have 3^10 variants, and each variant maps 1:1 to a number in the range 0..3^10-1.
Write that number in the ternary number system, so that the k-th ternary digit (counting from the least significant digit) tells us which box (a=0, b=1, c=2) the k-th ball belongs to.
Example for 3 balls:
Number 14 = 112 in ternary, so the first ball is in C, and the second and third are in B.
For the case of limited box size, a simple approach is recursive generation: the arguments of the recursion are the list of balls still to place and the current combination (the list of boxes with their balls and vacant places).
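Both ideas can be sketched in Python. The function names and the representation (a tuple giving the box index of each ball, with a=0, b=1, c=2) are assumptions made for this illustration, not part of the original answer.

```python
def assignment_from_number(number, balls, boxes=3):
    """Decode a number into box indices; entry k is the box of ball k+1."""
    digits = []
    for _ in range(balls):
        number, digit = divmod(number, boxes)  # peel off the lowest digit
        digits.append(digit)
    return digits


def assignments_with_capacity(balls, boxes, capacity):
    """Yield every assignment of distinct balls to boxes with at most `capacity` per box."""
    counts = [0] * boxes
    current = []

    def place(ball):
        if ball == balls:
            yield tuple(current)
            return
        for box in range(boxes):
            if counts[box] < capacity:  # only use boxes with a vacant place
                counts[box] += 1
                current.append(box)
                yield from place(ball + 1)
                current.pop()
                counts[box] -= 1

    yield from place(0)


print(assignment_from_number(14, 3))  # box index for ball 1, ball 2, ball 3
print(sum(1 for _ in assignments_with_capacity(10, 3, 4)))
```

The capacity check prunes a branch as soon as a box is full, so the recursion never builds an assignment that violates the limit.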

Can't make/configure the difficult function

While solving a sudoku, I can remove the candidate digits (1) and (2) from the cells D[1,2] and D[2,2], because (8) and (9) are possible only in those cells, so those cells are either (8 and 9) or (9 and 8). This means that digits (1) and (2) must be in the 3rd line of the D block. That's why I can eliminate the candidate digit (1) from the cell A[3,3].
I have been trying to write a function to do this for the last 40 hours, but couldn't manage it. Can anyone write a function that detects this kind of deduction (eliminating some candidates because some other n candidates can exist only in n cells; in our case the 2 digits 8 and 9 can exist only in the 2 cells D[1,2] and D[2,2])?
Please don't advise me about other sudoku functions; I have already done them. The only algorithm that I couldn't program is this one. By the way, you can use r[i] (a string containing the candidates for row number i), c[i] for the columns, and b[i] for the blocks (e.g. b[4] (block A in this image) = 1,2,3,4,5,6,7 because 8 and 9 are already defined). Thanks.
I really don't see the problem, you basically already answered your problem.
In general you should do the following:
Step 1: Loop over all 9 cells of one block and check if (1) is contained only in two cells.
Step 2: If not, try the next number. If yes, Loop over all 9 cells and check if (2) is also in those two cells but not in any other of the remaining 7.
Step 3: If not, check the next number. If yes, remove the other possibilities of the two cells except for the two numbers you found and you are basically done.
Step 4: If no matching number could be found for (1) (or any larger number that was chosen in the "not" part of step 2), start over from step 1 but trying the next number, unless you are already at 8, then you can stop.
In the end you could dynamically extend the same pattern for 3 numbers in 3 cells, 4 numbers...
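The steps above, including the extension to 3 numbers in 3 cells, can be sketched in Python roughly as follows. The representation (one set of candidate digits per cell of a block, row or column) and the name apply_hidden_subsets are assumptions for illustration, and the example block is made up rather than taken from the question's image.

```python
from itertools import combinations


def apply_hidden_subsets(cells, size=2):
    """If `size` digits fit only in `size` cells, strip every other candidate there.

    `cells` is a list of candidate sets for one unit; it is modified in place.
    Returns True if any candidate was removed.
    """
    changed = False
    digits = sorted(set().union(*cells))
    for combo in combinations(digits, size):
        wanted = set(combo)
        # Cells that can still hold at least one of the chosen digits.
        holders = [i for i, candidates in enumerate(cells) if candidates & wanted]
        if len(holders) == size:
            for i in holders:
                if cells[i] - wanted:
                    cells[i] = cells[i] & wanted
                    changed = True
    return changed


# Hypothetical block: 8 and 9 are possible only in the first two cells.
block = [{1, 2, 8, 9}, {1, 2, 8, 9}, {1, 2, 3}, {4}, {5}, {6}, {7}, {1, 2, 3}, {1, 2, 3}]
apply_hidden_subsets(block, size=2)
print(block[0], block[1])
```

Calling the function with size=3 or size=4 covers the larger patterns mentioned in the last step.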

Generate multiple sequences of numbers with unique values at each index

I have a row with numbers 1:n. I'm looking to add a second row also with the numbers 1:n but these should be in a random order while satisfying the following:
No positions have the same number in both rows
No combination of numbers occurs twice
For example, in the following
Row 1: 1 2 3 4 5 6 7 ...
Row 2: 3 6 15 8 13 12 7 ...
the number 7 occurs at the same position in both rows 1 and 2 (namely position 7; thereby not satisfying rule 1)
while in the following
Row 1: 1 2 3 4 5 6 7 ...
Row 2: 3 7 15 8 13 12 2 ...
the combination of 2+7 appears twice (in positions 2 and 7; thereby not satisfying rule 2).
It would perhaps be possible – but unnecessarily time-consuming – to do this by hand (at least up until a reasonable number), but there must be quite an elegant solution for this in MATLAB.
This problem is called a derangement of a permutation.
Use the function randperm, in order to find a random permutation of your data.
x = [1 2 3 4 5 6 7];
y = x(randperm(numel(x))); % randperm takes a count, not a vector
Then, you can check that the sequence is legal. If not, do it again and again.
Each attempt succeeds with probability of about 1/e (roughly 0.37), which means that you need about e (roughly 2.7) tries on average until you find one.
Therefore you will find the answer really quickly.
Alternatively, you can use this algorithm to create a random derangement.
Edit
If you want to have only cycles of size > 2, this is a generalization of the problem.
In that case the probability of success is smaller, but still big enough to find a valid permutation within a bounded expected number of tries. So the same approach is still valid.
This is fairly straightforward. Create a random permutation of the nodes, but interpret the list as follows: Interpret it as a random walk around the nodes, and if node 'b' appears after node 'a', it means that node 'b' appears below node 'a' in the lists:
So if your initial random permutation is
3 2 5 1 4
Then the walk in this case is 3 -> 2 -> 5 -> 1 -> 4 (and back to 3), and you create the rows as follows:
Row 1: 1 2 3 4 5
Row 2: 4 5 2 3 1
This random walk will satisfy both conditions.
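A minimal Python sketch of this construction, assuming the walk wraps around from its last node back to its first (which the example rows imply); rows_from_walk and random_single_cycle_rows are hypothetical names.

```python
import random


def rows_from_walk(walk):
    """Build row 2 so that each node's successor in the walk sits below it."""
    n = len(walk)
    row2 = [0] * n
    for a, b in zip(walk, walk[1:] + walk[:1]):  # the last node wraps to the first
        row2[a - 1] = b
    return list(range(1, n + 1)), row2


def random_single_cycle_rows(n):
    """Draw a random walk over 1..n and turn it into the two rows."""
    walk = list(range(1, n + 1))
    random.shuffle(walk)
    return rows_from_walk(walk)


row1, row2 = rows_from_walk([3, 2, 5, 1, 4])
print(row1)
print(row2)
```

Because the result is always a single cycle through all n numbers, it has no fixed points and, for n > 2, no swapped pairs.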
But do you wish to allow more than one cycle in your network? I know you don't want two people to have each other's hat. But what about 7 people, where 3 of them have each other's hats and the other 4 have each other's hats? Is this acceptable and/or desirable?
Andrey has already pointed you to randperm and the rejection-sampling-like approach. After generating a permutation p, an easy way to check whether it has a fixed point is any(p==1:n). An easy way to check whether it contains cycles of length 2 is any(p(p)==1:n).
So this gets permutations p of 1:n fulfilling your requirements:
p = [];
while (isempty(p))
    p = randperm(n);
    if any(p==1:n), p = [];
    elseif any(p(p)==1:n), p = [];
    end
end
Surrounding this with a for loop and for each counting the iterations of the while loop, it seems that one needs to generate on average 4.5 permutations for every "valid" one (and 6.2 if cycles of length three are not allowed, either). Very interesting.
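The 4.5 figure is consistent with the known limit for random permutations: the probability of having no cycles of length 1 or 2 tends to e^(-3/2), about 0.22, so roughly e^(3/2), about 4.48, tries are needed on average. A small Python sketch of the same experiment, with the seed, n = 10 and the trial count chosen arbitrarily for illustration:

```python
import math
import random


def random_valid_permutation(n, rng):
    """Rejection-sample a permutation of 0..n-1 with no fixed points or 2-cycles."""
    tries = 0
    while True:
        tries += 1
        p = list(range(n))
        rng.shuffle(p)
        if any(p[i] == i for i in range(n)):  # fixed point
            continue
        if any(p[p[i]] == i for i in range(n)):  # cycle of length 2
            continue
        return p, tries


rng = random.Random(0)
trials = 2000
average_tries = sum(random_valid_permutation(10, rng)[1] for _ in range(trials)) / trials
print(average_tries, math.exp(1.5))
```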
