How to optimize selection of elements that matches criteria and sizing objectives? - algorithm

Good morning,
Imagine a list of many bags (100).
Each bag can be color=red/yellow/green, size=small/big, heavy=yes/no, woolmade=yes/no
I want to select P=10 numbers of bags that satisfies these conditions:
A=3 number of red bags
B=5 number of small bags
C=2 number of heavy=yes bags
D=3 number of woolmade=yes bags
Here is a concrete example (simplified to 2 attributes):
List of 10 bags (id, color, woolmade Y/N):
(1, red, Y), (2, red, Y), (3, red, Y), (4, red, Y), (5, red, N), (6, green, N), (7, green, N),(8, green, N), (9, green, N), (10, green, Y)
I want to get 5 bags with 3 of red and 4 of woolmade=Y
One possible answer are IDs: 1, 2, 3, 9 & 10
The following answer is NOT correct: IDs 1, 2, 3, 5 & 10 because I will have 4 of red (I only want 3 of red) and 4 of woolmade=Y (correct)
I am interested in a algorithm explanation and a possibly implementation (nodes, java, python, vba, ...)
Thanks

Brute Force:
I just thought about, how to search in all possible combinations for the right ones.
There are 24 different kinds of bags so there are 24 ^ 10 possible combinations.
These can be generated like numbers to root 24 simply by incrementing these.
Each number can be checked according to your criteria.
If the checking of one combinations needs 1 microsecond, about 17612 hours are necessary to check all combinations.
Using only one thread, in about 2 years you might find all possible combinations that fit your criteria.
Backtracking:
If you stop at the first combination of 10 bags that works out, you implemented a backtracking algorithm. How long this might last I can't calculate right now.
Better Brute Force:
Looking at the example below, you see that not all numbers to base 24 are interesting. The problem can be solved by looking at all combinations of 10 elements of numbers between 0 and 23 which may repeat themselves and check these elements. There are 92561040 possible combinations. The check of those may last about 92 seconds. Afterward you will have all possible combinations which solve your problem, if you stop at the first combination you might be much faster.
Example
to understand the 10 numbers see the bag mapping below.
0000000000 10 red, 10 small, 10 heavy, 10 wool
0000000001 9 red, 1 yellow, 10 small, 10 heavy, 10 wool
0000000002 9 red, 1 green, 10 small, 10 heavy, 10 wool
...
000000000A 9 red, 1 yellow, 9 small, 1 big, 9 heavy, 1 small, 10 wool
000000000B 9 red, 1 green, 9 small, 1 big, 9 heavy, 1 small, 10 wool
...
as you can see, for the result, it is only important the name of the same digits, not the place. therefore the number of interesting combinations are not all numbers to the base 24 with 10 places, but all combinations of 10 numbers between 0 and 23 which may be repeated:
(24 + 10 - 1)!/((24 - 1)!10!) = 92561040 combinations which may be checked in 92 seconds if each check need 1 microsecond.
Mapping:
red/small/yes/yes = 0
yellow/small/yes/yes = 1
green/small/yes/yes = 2
red/big/yes/yes = 3
yellow/big/yes/yes = 4
green/big/yes/yes = 5
red/small/no/yes = 6
yellow/small/no/yes = 7
green/small/no/yes = 8
red/big/no/yes = 9
yellow/big/no/yes = A
green/big/no/yes = B
red/small/yes/no = C
yellow/small/yes/no = D
green/small/yes/no = E
red/big/yes/no = F
yellow/big/yes/no = G
green/big/yes/no = H
red/small/no/no = I
yellow/small/no/no = J
green/small/no/no = K
red/big/no/no = L
yellow/big/no/no = M
green/big/no/no = N

Related

Finding the best combination of elements with the max total parameter value

I have 100 elements. Each element has 4 features A,B,C,D. Each feature is an integer.
I want to select 2 elements for each feature, so that I have selected a total of 8 distinct elements. I want to maximize the sum of the 8 selected features A,A,B,B,C,C,D,D.
A greedy algorithm would be to select the 2 elements with highest A, then the two elements with highest B among the remaining elements, etc. However, this might not be optimal, because the elements that have highest A could also have a much higher B.
Do we have an algorithm to solve such a problem optimally?
This can be solved as a minimum cost flow problem. In particular, this is an Assignment problem
First of all, see that we only need the 8 best elements of each features, meaning 32 elements maximum. It should even be possible to cut the search space further (as if the 2 best elements of A is not one of the 6 best elements of any other feature, we can already assigne those 2 elements to A, and each other feature only needs to look at the first 6 best elements. If it's not clear why, I'll try to explain further).
Then we make the vertices S,T and Fa,Fb,Fc,Fd and E1,E2,...E32, with the following edge :
for each vertex Fx, an edge from S to Fx with maximum flow 2 and a weight of 0 (as we want 2 element for each feature)
for each vertex Ei, an edge from Fx to Ei if Ei is one of the top elements of feature x, with maximum flow 1 and weight equal to the negative value of feature x of Ei. (negative because the algorithm will find the minimum cost)
for each vertex Ei, an edge from Ei to T, with maximum flow 1 and weight 0. (as each element can only be selected once)
I'm not sure if this is the best way, but It should work.
As suggested per #AloisChristen, this can be written as an assignment problem:
On the one side, we select the 8 best elements for each feature; that's 32 elements or less, since one element might be in the best 8 for more than one feature;
On the other side, we put 8 seats A,A,B,B,C,C,D,D
Solve the resulting assignment problem.
Here the problem is solved using scipy's linear_sum_assignment optimization function:
from numpy.random import randint
from numpy import argpartition, unique, concatenate
from scipy.optimize import linear_sum_assignment
# PARAMETERS
n_elements = 100
n_features = 4
n_per_feature = 2
# RANDOM DATA
data = randint(0, 21, (n_elements, n_features)) # random data with integer features between 0 and 20 included
# SELECT BEST 8 CANDIDATES FOR EACH FEATURE
n_selected = n_features * n_per_feature
n_candidates = n_selected * n_features
idx = argpartition(data, range(-n_candidates, 0), axis=0)
idx = unique(idx[-n_selected:].ravel())
candidates = data[idx]
n_candidates = candidates.shape[0]
# SOLVE ASSIGNMENT PROBLEM
cost_matrix = -concatenate((candidates,candidates), axis=1) # 8 columns in order ABCDABCD
element_idx, seat_idx = linear_sum_assignment(cost_matrix)
score = -cost_matrix[element_idx, seat_idx].sum()
# DISPLAY RESULTS
print('SUM OF SELECTED FEATURES: {}'.format(score))
for e,s in zip(element_idx, seat_idx):
print('{:2d}'.format(idx[e]),
'ABCDABCD'[s],
-cost_matrix[e,s],
data[idx[e]])
Output:
SUM OF SELECTED FEATURES: 160
3 B 20 [ 5 20 14 11]
4 A 20 [20 9 3 12]
6 C 20 [ 3 3 20 8]
10 A 20 [20 10 9 9]
13 C 20 [16 12 20 18]
23 D 20 [ 6 10 4 20]
24 B 20 [ 5 20 6 8]
27 D 20 [20 13 19 20]

Strategy with regard to how to approach this algorithm?

I was asked this question in a test and I need help with regards to how I should approach the solution, not the actual answer. The question is
You have been given a 7 digit number(with each digit being distinct and 0-9). The number has this property
product of first 3 digits = product of last 3 digits = product of central 3 digits
Identify the middle digit.
Now, I can do this on paper by brute force(trial and error), the product is 72 and digits being
8,1,9,2,4,3,6
Now how do I approach the problem in a no brute force way?
Let the number is: a b c d e f g
So as per the rule(1):
axbxc = cxdxe = exfxg
more over we have(2):
axb = dxe and
cxd = fxg
This question can be solved with factorization and little bit of hit/trial.
Out of the digits from 1 to 9, 5 and 7 can rejected straight-away since these are prime numbers and would not fit in the above two equations.
The digits 1 to 9 can be factored as:
1 = 1, 2 = 2, 3 = 3, 4 = 2X2, 6 = 2X3, 8 = 2X2X2, 9 = 3X3
After factorization we are now left with total 7 - 2's, 4 - 3's and the number 1.
As for rule 2 we are left with only 4 possibilities, these 4 equations can be computed by factorization logic since we know we have overall 7 2's and 4 3's with us.
1: 1X8(2x2x2) = 2X4(2x2)
2: 1X6(3x2) = 3X2
3: 4(2x2)X3 = 6(3x2)X2
4: 9(3x3)X2 = 6(3x2)X3
Skipping 5 and 7 we are left with 7 digits.
With above equations we have 4 digits with us and are left with remaining 3 digits which can be tested through hit and trial. For example, if we consider the first case we have:
1X8 = 2X4 and are left with 3,6,9.
we have axbxc = cxdxe we can opt c with these 3 options in that case the products would be 24, 48 and 72.
24 cant be correct since for last three digits we are left with are 6,9,4(=216)
48 cant be correct since for last three digits we are left with 3,9,4(=108)
72 could be a valid option since the last three digits in that case would be 3,6,4 (=72)
This question is good to solve with Relational Programming. I think it very clearly lets the programmer see what's going on and how the problem is solved. While it may not be the most efficient way to solve problems, it can still bring desired clarity and handle problems up to a certain size. Consider this small example from Oz:
fun {FindDigits}
D1 = {Digit}
D2 = {Digit}
D3 = {Digit}
D4 = {Digit}
D5 = {Digit}
D6 = {Digit}
D7 = {Digit}
L = [D1 D2 D3] M = [D3 D4 D5] E= [D5 D6 D7] TotL in
TotL = [D1 D2 D3 D4 D5 D6 D7]
{Unique TotL} = true
{ProductList L} = {ProductList M} = {ProductList E}
TotL
end
(Now this would be possible to parameterize furthermore, but non-optimized to illustrate the point).
Here you first pick 7 digits with a function Digit/0. Then you create three lists, L, M and E consisting of the segments, as well as a total list to return (you could also return the concatenation, but I found this better for illustration).
Then comes the point, you specify relations that have to be intact. First, that the TotL is unique (distinct in your tasks wording). Then the next one, that the segment products have to be equal.
What now happens is that a search is conducted for your answers. This is a depth-first search strategy, but could also be breadth-first, and a solver is called to bring out all solutions. The search strategy is found inside the SolveAll/1 function.
{Browse {SolveAll FindDigits}}
Which in turns returns this list of answers:
[[1 8 9 2 4 3 6] [1 8 9 2 4 6 3] [3 6 4 2 9 1 8]
[3 6 4 2 9 8 1] [6 3 4 2 9 1 8] [6 3 4 2 9 8 1]
[8 1 9 2 4 3 6] [8 1 9 2 4 6 3]]
At least this way forward is not using brute force. Essentially you are searching for answers here. There might be heuristics that let you find the correct answer sooner (some mathematical magic, perhaps), or you can use genetic algorithms to search the space or other well-known strategies.
Prime factor of distinct digit (if possible)
0 = 0
1 = 1
2 = 2
3 = 3
4 = 2 x 2
5 = 5
6 = 2 x 3
7 = 7
8 = 2 x 2 x 2
9 = 3 x 3
In total:
7 2's + 4 3's + 1 5's + 1 7's
With the fact that When A=B=C, composition of prime factor of A must be same as composition of prime factor of B and that of C, 0 , 5 and 7 are excluded since they have unique prime factor that can never match with the fact.
Hence, 7 2's + 4 3's are left and we have 7 digit (1,2,3,4,6,8,9). As there are 7 digits only, the number is formed by these digits only.
Recall the fact, A, B and C must have same composition of prime factors. This implies that A, B and C have same number of 2's and 3's in their composition. So, we should try to achieve (in total for A and B and C):
9 OR 12 2's AND
6 3's
(Must be product of 3, lower bound is total number of prime factor of all digits, upper bound is lower bound * 2)
Consider point 2 (as it has one possibility), A has 2 3's and same for B and C. To have more number of prime factor in total, we need to put digit in connection digit between two product (third or fifth digit). Extract digits with prime factor 3 into two groups {3,6} and {9} and put digit into connection digit. The only possible way is to put 9 in connection digit and 3,6 on unconnected product. That mean xx9xx36 or 36xx9xx (order of 3,6 is not important)
With this result, we get 9 x middle x connection digit = connection digit x 3 x 6. Thus, middle = (3 x 6) / 9 = 2
My answer actually extends #Ansh's answer.
Let abcdefg be the digits of the number. Then
ab=de
cd=fg
From these relations we can exclude 0, 5 and 7 because there are no other multipliers of these numbers between 0 and 9. So we are left with seven numbers and each number is included once in each answer. We are going to examine how we can pair the numbers (ab, de, cd, fg).
What happens with 9? It can't be combined with 3 or 6 since then their product will have three times the factor 3 and we have at total 4 factors of 3. Similarly, 3 and 6 must be combined at least one time together in response to the two factors of 9. This gives a product of 18 and so 9 must be combined at least once with 2.
Now if 9x2 is in a corner then 3x6 must be in the middle. Meaning in the other corner there must be another multiplier of 3. So 9 and 2 are in the middle.
Let's suppose ab=3x6 (The other case is symmetric). Then d must be 9 or 2. But if d is 9 then f or g must be multiplier of 3. So d is 2 and e is 9. We can stop here and answer the middle digit is
2
Now we have 2c = fg and the remaining choices are 1, 4, 8. We see that the only solutions are c = 4, f = 1, g = 8 and c = 4, f = 8, g = 1.
So if is 3x6 is in the left corner we have the following solutions:
3642918, 3642981, 6342918, 6342981
If 3x6 is in the right corner we have the following solutions which are the reverse of the above:
8192463, 1892463, 8192436, 1892436
Here is how you can consider the problem:
Let's note the final solution N1 N2 N3 N4 N5 N6 N7 for the 3 numbers N1N2N3, N3N4N5 and N5N6N7
0, 5 and 7 are to exclude because they are prime and no other ciphers is a multiple of them. So if they had divided one of the 3 numbers, no other number could have divided the others.
So we get the 7 remaining ciphers : 1234689
where the product of the ciphers is 2^7*3^4
(N1*N2*N3) and (N5*N6*N7) are equals so their product is a square number. We can then remove, one of the number (N4) from the product of the previous point to find a square number (i.e. even exponents on both numbers)
N4 can't be 1, 3, 4, 6, 9.
We conclude N4 is 2 or 8
If N4 is 8 and it divides (N3*N4*N5), we can't use the remaining even numbers (2, 4, 6) to divides
both (N1*N2*N3) and (N6*N7*N8) by 8. So N4 is 2 and 8 does not belong to the second group (let's put it in N1).
Now, we have: 1st grp: 8XX, 2nd group: X2X 3rd group: XXX
Note: at this point we know that the product is 72 because it is 2^3*3^2 (the square root of 2^6*3^4) but the result is not really important. We have made the difficult part knowing the 7 numbers and the middle position.
Then, we know that we have to distribute 2^3 on (N1*N2*N3), (N3*N4*N5), (N5*N6*N7) because 2^3*2*2^3=2^7
We already gave 8 to N1, 2 to N4 and we place 6 to N6, and 4 to N5 position, resulting in each of the 3 numbers being a multiple of 8.
Now, we have: 1st grp: 8XX, 2nd group: X24 3rd group: 46X
We have the same way of thinking considering the odd number, we distribute 3^2, on each part knowing that we already have a 6 in the last group.
Last group will then get the 3. And first and second ones the 9.
Now, we have: 1st grp: 8X9, 2nd group: 924 3rd group: 463
And, then 1 at N2, which is the remaining position.
This problem is pretty easy if you look at the number 72 more carefully.
We have our number with this form abcdefg
and abc = cde = efg, with those digits 8,1,9,2,4,3,6
So, first, we can conclude that 8,1,9 must be one of the triple, because, there is no way 1 can go with other two numbers to form 72.
We can also conclude that 1 must be in the start/end of the whole number or middle of the triple.
So now we have 819defg or 918defg ...
Using some calculations with the rest of those digits, we can see that only 819defg is possible, because, we need 72/9 = 8,so only 2,4 is valid, while we cannot create 72/8 = 9 from those 2,4,3,6 digits, so -> 81924fg or 81942fg and 819 must be the triple that start or end our number.
So the rest of the job is easy, we need either 72/4 = 18 or 72/2 = 36, now, we can have our answers: 8192436 or 8192463.
7 digits: 8,1,9,2,4,3,6
say XxYxZ = 72
1) pick any two from above 7 digits. say X,Y
2) divide 72 by X and then Y.. you will get the 3rd number i.e Z.
we found XYZ set of 3-digits which gives result 72.
now repeat 1) and 2) with remaining 4 digits.
this time we found ABC which multiplies to 72.
lets say, 7th digit left out is I.
3) divide 72 by I. result R
4) divide R by one of XYZ. check if result is in ABC.
if No, repeat the step 3)
if yes, found the third pair.(assume you divided R by Y and the result is B)
YIB is the third pair.
so... solution will be.
XZYIBAC
You have your 7 numbers - instead of looking at it in groups of 3 divide up the number as such:
AB | C | D | E | FG
Get the value of AB and use it to get the value of C like so: C = ABC/AB
Next you want to do the same thing with the trailing 2 digits to find E using FG. E = EFG/FG
Now that you have C & E you can solve for D
Since CDE = ABC then D = ABC/CE
Remember your formulas - instead of looking at numbers create a formula aka an algorithm that you know will work every time.
ABC = CDE = EFG However, you have to remember that your = signs have to balance. You can see that D = ABC/CE = EFG/CE Once you know that, you can figure out what you need in order to solve the problem.
Made a quick example in a fiddle of the code:
http://jsfiddle.net/4ykxx9ve/1/
var findMidNum = function() {
var num = [8, 1, 9, 2, 4, 3, 6];
var ab = num[0] * num[1];
var fg = num[5] * num[6];
var abc = num[0] * num[1] * num[2];
var cde = num[2] * num[3] * num[4];
var efg = num[4] * num[5] * num[6];
var c = abc/ab;
var e = efg/fg;
var ce = c * e
var d = abc/ce;
console.log(d); //2
}();
You have been given a 7 digit number(with each digit being distinct and 0-9). The number has this property
product of first 3 digits = product of last 3 digits = product of central 3 digits
Identify the middle digit.
Now, I can do this on paper by brute force(trial and error), the product is 72 and digits being
8,1,9,2,4,3,6
Now how do I approach the problem in a no brute force way?
use linq and substring functions
example var item = array.Skip(3).Take(3) in such a way that you have a loop
for(f =0;f<charlen.length;f++){
var xItemSum = charlen[f].Skip(f).Take(f).Sum(f => f.Value);
}
// untested code

Project Euler - 68

I have already read What is an "external node" of a "magic" 3-gon ring? and I have solved problems up until 90 but this n-gon thing totally baffles me as I don't understand the question at all.
So I take this ring and I understand that the external circles are 4, 5, 6 as they are outside the inner circle. Now he says there are eight solutions. And the eight solutions are without much explanation listed below. Let me take
9 4,2,3; 5,3,1; 6,1,2
9 4,3,2; 6,2,1; 5,1,3
So how do we arrive at the 2 solutions? I understand 4, 3, 2, is in straight line and 6,2,1 is in straight line and 5, 1, 3 are in a straight line and they are in clockwise so the second solution makes sense.
Questions
Why does the first solution 4,2,3; 5,3,1; 6,1,2 go anti clock wise? Should it not be 423 612 and then 531?
How do we arrive at 8 solutions. Is it just randomly picking three numbers? What exactly does it mean to solve a "N-gon"?
The first doesn't go anti-clockwise. It's what you get from the configuration
4
\
2
/ \
1---3---5
/
6
when you go clockwise, starting with the smallest number in the outer ring.
How do we arrive at 8 solutions. Is it just randomly picking three numbers? What exactly does it mean to solve a "N-gon"?
For an N-gon, you have an inner N-gon, and for each side of the N-gon one spike, like
X
|
X---X---X
| |
X---X---X
|
X
so that the spike together with the side of the inner N-gon connects a group of three places. A "solution" of the N-gon is a configuration where you placed the numbers from 1 to 2*N so that each of the N groups sums to the same value.
The places at the end of the spikes appear in only one group each, the places on the vertices of the inner N-gon in two. So the sum of the sums of all groups is
N
∑ k + ∑{ numbers on vertices }
k=1
The sum of the numbers on the vertices of the inner N-gon is at least 1 + 2 + ... + N = N*(N+1)/2 and at most (N+1) + (N+2) + ... + 2*N = N² + N*(N+1)/2 = N*(3*N+1)/2.
Hence the sum of the sums of all groups is between
N*(2*N+1) + N*(N+1)/2 = N*(5*N+3)/2
and
N*(2*N+1) + N*(3*N+1)/2 = N*(7*N+3)/2
inclusive, and the sum per group must be between
(5*N+3)/2
and
(7*N+3)/2
again inclusive.
For the triangle - N = 3 - the bounds are (5*3+3)/2 = 9 and (7*3+3)/2 = 12. For a square - N = 4 - the bounds are (5*4+3)/2 = 11.5 and (7*4+3)/2 = 15.5 - since the sum must be an integer, the possible sums are 12, 13, 14, 15.
Going back to the triangle, if the sum of each group is 9, the sum of the sums is 27, and the sum of the numbers on the vertices must be 27 - (1+2+3+4+5+6) = 27 - 21 = 6 = 1+2+3, so the numbers on the vertices are 1, 2 and 3.
For the sum to be 9, the value at the end of the spike for the side connecting 1 and 2 must be 6, for the side connecting 1 and 3, the spike value must be 5, and 4 for the side connecting 2 and 3.
If you start with the smallest value on the spikes - 4 - you know you have to place 2 and 3 on the vertices of the side that spike protrudes from. There are two ways to arrange the two numbers there, leading to the two solutions for sum 9.
If the sum of each group is 10, the sum of the sums is 30, and the sum of the numbers on the vertices must be 9. To represent 9 as the sum of three distinct numbers from 1 to 6, you have the possibilities
1 + 2 + 6
1 + 3 + 5
2 + 3 + 4
For the first group, you have one side connecting 1 and 2, so you'd need a 7 on the end of the spike to make 10 - no solution.
For the third group, the minimal sum of two of the numbers is 5, but 5+6 = 11 > 10, so there's no place for the 6 - no solution.
For the second group, the sums of the sides are
1 + 3 = 4 -- 6 on the spike
1 + 5 = 6 -- 4 on the spike
3 + 5 = 8 -- 2 on the spike
and you have two ways to arrange 3 and 5, so that the group is either 2-3-5 or 2-5-3, the rest follows again.
The solutions for the sums 11 and 12 can be obtained similarly, or by replacing k with 7-k in the solutions for the sums 9 resp. 10.
To solve the problem, you must now find out
what it means to obtain a 16-digit string or a 17-digit string
which sum for the groups gives rise to the largest value when the numbers are concatenated in the prescribed way.
(And use pencil and paper for the fastest solution.)

Tennis match scheduling

There are a limited number of players and a limited number of tennis courts. At each round, there can be at most as many matches as there are courts.
Nobody plays 2 rounds without a break. Everyone plays a match against everyone else.
Produce the schedule that takes as few rounds as possible. (Because of the rule that there must a break between rounds for everyone, there can be a round without matches.)
The output for 5 players and 2 courts could be:
| 1 2 3 4 5
-|-------------------
2| 1 -
3| 5 3 -
4| 7 9 1 -
5| 3 7 9 5 -
In this output the columns and rows are the player-numbers, and the numbers inside the matrix are the round numbers these two players compete.
The problem is to find an algorithm which can do this for larger instances in a feasible time. We were asked to do this in Prolog, but (pseudo-) code in any language would be useful.
My first try was a greedy algorithm, but that gives results with too many rounds.
Then I suggested an iterative deepening depth-first search, which a friend of mine implemented, but that still took too much time on instances as small as 7 players.
(This is from an old exam question. No one I spoke to had any solution.)
Preface
In Prolog, CLP(FD) constraints are the right choice for solving such scheduling tasks.
See clpfd for more information.
In this case, I suggest using the powerful global_cardinality/2 constraint to restrict the number of occurrences of each round, depending on the number of available courts. We can use iterative deepening to find the minimal number of admissible rounds.
Freely available Prolog systems suffice to solve the task satisfactorily. Commercial-grade systems will run dozens of times faster.
Variant 1: Solution with SWI-Prolog
:- use_module(library(clpfd)).
tennis(N, Courts, Rows) :-
length(Rows, N),
maplist(same_length(Rows), Rows),
transpose(Rows, Rows),
Rows = [[_|First]|_],
chain(First, #<),
length(_, MaxRounds),
numlist(1, MaxRounds, Rounds),
pairs_keys_values(Pairs, Rounds, Counts),
Counts ins 0..Courts,
foldl(triangle, Rows, Vss, Dss, 0, _),
append(Vss, Vs),
global_cardinality(Vs, Pairs),
maplist(breaks, Dss),
labeling([ff], Vs).
triangle(Row, Vs, Ds, N0, N) :-
length(Prefix, N0),
append(Prefix, [-|Vs], Row),
append(Prefix, Vs, Ds),
N #= N0 + 1.
breaks([]).
breaks([P|Ps]) :- maplist(breaks_(P), Ps), breaks(Ps).
breaks_(P0, P) :- abs(P0-P) #> 1.
Sample query: 5 players on 2 courts:
?- time(tennis(5, 2, Rows)), maplist(writeln, Rows).
% 827,838 inferences, 0.257 CPU in 0.270 seconds (95% CPU, 3223518 Lips)
[-,1,3,5,7]
[1,-,5,7,9]
[3,5,-,9,1]
[5,7,9,-,3]
[7,9,1,3,-]
The specified task, 6 players on 2 courts, solved well within the time limit of 1 minute:
?- time(tennis(6, 2, Rows)),
maplist(format("~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+\n"), Rows).
% 6,675,665 inferences, 0.970 CPU in 0.977 seconds (99% CPU, 6884940 Lips)
- 1 3 5 7 10
1 - 6 9 11 3
3 6 - 11 9 1
5 9 11 - 2 7
7 11 9 2 - 5
10 3 1 7 5 -
Further example: 7 players on 5 courts:
?- time(tennis(7, 5, Rows)),
maplist(format("~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+\n"), Rows).
% 125,581,090 inferences, 17.476 CPU in 18.208 seconds (96% CPU, 7185927 Lips)
- 1 3 5 7 9 11
1 - 5 3 11 13 9
3 5 - 9 1 7 13
5 3 9 - 13 11 7
7 11 1 13 - 5 3
9 13 7 11 5 - 1
11 9 13 7 3 1 -
Variant 2: Solution with SICStus Prolog
With the following additional definitions for compatibility, the same program also runs in SICStus Prolog:
:- use_module(library(lists)).
:- use_module(library(between)).
:- op(700, xfx, ins).
Vs ins D :- maplist(in_(D), Vs).
in_(D, V) :- V in D.
chain([], _).
chain([L|Ls], Pred) :-
chain_(Ls, L, Pred).
chain_([], _, _).
chain_([L|Ls], Prev, Pred) :-
call(Pred, Prev, L),
chain_(Ls, L, Pred).
pairs_keys_values(Ps, Ks, Vs) :- keys_and_values(Ps, Ks, Vs).
foldl(Pred, Ls1, Ls2, Ls3, S0, S) :-
foldl_(Ls1, Ls2, Ls3, Pred, S0, S).
foldl_([], [], [], _, S, S).
foldl_([L1|Ls1], [L2|Ls2], [L3|Ls3], Pred, S0, S) :-
call(Pred, L1, L2, L3, S0, S1),
foldl_(Ls1, Ls2, Ls3, Pred, S1, S).
time(Goal) :-
statistics(runtime, [T0|_]),
call(Goal),
statistics(runtime, [T1|_]),
T #= T1 - T0,
format("% Runtime: ~Dms\n", [T]).
Major difference: SICStus, being a commercial-grade Prolog that ships with a serious CLP(FD) system, is much faster than SWI-Prolog in this use case and others like it.
The specified task, 6 players on 2 courts:
?- time(tennis(6, 2, Rows)),
maplist(format("~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+\n"), Rows).
% Runtime: 34ms (!)
- 1 3 5 7 10
1 - 6 11 9 3
3 6 - 9 11 1
5 11 9 - 2 7
7 9 11 2 - 5
10 3 1 7 5 -
The larger example:
| ?- time(tennis(7, 5, Rows)),
maplist(format("~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+~t~w~3+\n"), Rows).
% Runtime: 884ms
- 1 3 5 7 9 11
1 - 5 3 9 7 13
3 5 - 1 11 13 7
5 3 1 - 13 11 9
7 9 11 13 - 3 1
9 7 13 11 3 - 5
11 13 7 9 1 5 -
Closing remarks
In both systems, global_cardinality/3 allows you to specify options that alter the propagation strength of the global cardinality constraint, enabling weaker and potentially more efficient filtering. Choosing the right options for a specific example may have an even larger impact than the choice of Prolog system.
This is very similar to the Traveling Tournament Problem, which is about scheduling football teams. In TTP, they can find the optimal solution up to only 8 teams. Anyone who breaks an ongoing record of 10 or more teams, has it a lot easier to get published in a research journal.
It is NP hard and the trick is to use meta-heuristics, such as tabu search, simulated annealing, ... instead of brute force or branch and bound.
Take a look my implementation with Drools Planner (open source, java).
Here are the constraints, it should be straightforward to replace that with constraints such as Nobody plays 2 rounds without a break.
Each player must play at least n - 1 matches where n is the number of players. So the minimum of rounds is 2(n - 1) - 1, since every player needs to rest a match. The minimum is also bound by (n(n-1))/2 total matches divided by number of courts. Using the smallest of these two gives you the length of the optimal solution. Then it's a matter of coming up with a good lower estimating formula ((number of matches+rests remaining)/courts) and run A* search.
As Geoffrey said, I believe the problem is NP Hard, but meta-heuristics such as A* is very applicable.
Python Solution:
import itertools
def subsets(items, count = None):
if count is None:
count = len(items)
for idx in range(count + 1):
for group in itertools.combinations(items, idx):
yield frozenset(group)
def to_players(games):
return [game[0] for game in games] + [game[1] for game in games]
def rounds(games, court_count):
for round in subsets(games, court_count):
players = to_players(round)
if len(set(players)) == len(players):
yield round
def is_canonical(player_count, games_played):
played = [0] * player_count
for players in games_played:
for player in players:
played[player] += 1
return sorted(played) == played
def solve(court_count, player_count):
courts = range(court_count)
players = range(player_count)
games = list( itertools.combinations(players, 2) )
possible_rounds = list( rounds(games, court_count) )
rounds_last = {}
rounds_all = {}
choices_last = {}
choices_all = {}
def update(target, choices, name, value, choice):
try:
current = target[name]
except KeyError:
target[name] = value
choices[name] = choice
else:
if current > value:
target[name] = value
choices[name] = choice
def solution(games_played, players, score, choice, last_players):
games_played = frozenset(games_played)
players = frozenset(players)
choice = (choice, last_players)
update(rounds_last.setdefault(games_played, {}),
choices_last.setdefault(games_played, {}),
players, score, choice)
update(rounds_all, choices_all, games_played, score, choice)
solution( [], [], 0, None, None)
for games_played in subsets(games):
if is_canonical(player_count, games_played):
try:
best = rounds_all[games_played]
except KeyError:
pass
else:
for next_round in possible_rounds:
next_games_played = games_played.union(next_round)
solution(
next_games_played, to_players(next_round), best + 2,
next_round, [])
for last_players, score in rounds_last[games_played].items():
for next_round in possible_rounds:
if not last_players.intersection( to_players(next_round) ):
next_games_played = games_played.union(next_round)
solution( next_games_played, to_players(next_round), score + 1,
next_round, last_players)
all_games = frozenset(games)
print rounds_all[ all_games ]
round, prev = choices_all[ frozenset(games) ]
while all_games:
print "X ", list(round)
all_games = all_games - round
if not all_games:
break
round, prev = choices_last[all_games][ frozenset(prev) ]
solve(2, 6)
Output:
11
X [(1, 2), (0, 3)]
X [(4, 5)]
X [(1, 3), (0, 2)]
X []
X [(0, 5), (1, 4)]
X [(2, 3)]
X [(1, 5), (0, 4)]
X []
X [(2, 5), (3, 4)]
X [(0, 1)]
X [(2, 4), (3, 5)]
This means it will take 11 rounds. The list shows the games to be played in the rounds in reverse order. (Although I think the same schedule works forwards and backwords.)
I'll come back and explain why I have the chance.
Gets incorrect answers for one court, five players.
Some thoughts, perhaps a solution...
Expanding the problem to X players and Y courts, I think we can safely say that when given the choice, we must select the players with the fewest completed matches, otherwise we run the risk of ending up with one player left who can only play every other week and we end up with many empty weeks in between. Picture the case with 20 players and 3 courts. We can see that during round 1 players 1-6 meet, then in round 2 players 7-12 meet, and in round 3 we could re-use players 1-6 leaving players 13-20 until later. Therefor, I think our solution cannot be greedy and must balance the players.
With that assumption, here is a first attempt at a solution:
1. Create master-list of all matches ([12][13][14][15][16][23][24]...[45][56].)
2. While (master-list > 0) {
3. Create sub-list containing only eligible players (eliminate all players who played the previous round.)
4. While (available-courts > 0) {
5. Select match from sub-list where player1.games_remaining plus player2.games_remaining is maximized.
6. Place selected match in next available court, and
7. decrement available-courts.
8. Remove selected match from master-list.
9. Remove all matches that contain either player1 or player2 from sub-list.
10. } Next available-court
11. Print schedule for ++Round.
12. } Next master-list
I can't prove that this will produce a schedule with the fewest rounds, but it should be close. The step that may cause problems is #5 (select match that maximizes player's games remaining.) I can imagine that there might be a case where it's better to pick a match that almost maximizes 'games_remaining' in order to leave more options in the following round.
The output from this algorithm would look something like:
Round Court1 Court2
1 [12] [34]
2 [56] --
3 [13] [24]
4 -- --
5 [15] [26]
6 -- --
7 [35] [46]
. . .
Close inspection will show that in round 5, if the match on Court2 had been [23] then match [46] could have been played during round 6. That, however, doesn't guarantee that there won't be a similar issue in a later round.
I'm working on another solution, but that will have to wait for later.
I don't know if this matters, the "5 Players and 2 Courts" example data is missing three other matches: [1,3], [2,4] and [3,5]. Based on the instruction: "Everyone plays a match against everyone else."

Using one probability set to generate another [duplicate]

This question already has answers here:
Expand a random range from 1–5 to 1–7
(78 answers)
Closed 8 years ago.
How can I generate a bigger probability set from a smaller probability set?
This is from Algorithm Design Manual -Steven Skiena
Q:
Use a random number generator (rng04) that generates numbers from {0,1,2,3,4} with equal probability to write a random number generator that generates numbers from 0 to 7 (rng07) with equal probability?
I tried for around 3 hours now, mostly based on summing two rng04 outputs. The problem is that in that case the probability of each value is different - 4 can come with 5/24 probability while 0 happening is 1/24. I tried some ways to mask it, but cannot.
Can somebody solve this?
You have to find a way to combine the two sets of random numbers (the first and second random {0,1,2,3,4} ) and make n*n distinct possibilities. Basically the problem is that with addition you get something like this
X
0 1 2 3 4
0 0 1 2 3 4
Y 1 1 2 3 4 5
2 2 3 4 5 6
3 3 4 5 6 7
4 4 5 6 7 8
Which has duplicates, which is not what you want. One possible way to combine the two sets would be the Z = X + Y*5 where X and Y are the two random numbers. That would give you a set of results like this
X
0 1 2 3 4
0 0 1 2 3 4
Y 1 5 6 7 8 9
2 10 11 12 13 14
3 15 16 17 18 19
4 20 21 22 23 24
So now that you have a bigger set of random numbers, you need to do the reverse and make it smaller. This set has 25 distinct values (because you started with 5, and used two random numbers, so 5*5=25). The set you want has 8 distinct values. A naïve way to do this would be
x = rnd(5) // {0,1,2,3,4}
y = rnd(5) // {0,1,2,3,4}
z = x+y*5 // {0-24}
random07 = x mod 8
This would indeed have a range of {0,7}. But the values {1,7} would appear 3/25 times, and the value 0 would appear 4/25 times. This is because 0 mod 8 = 0, 8 mod 8 = 0, 16 mod 8 = 0 and 24 mod 8 = 0.
To fix this, you can modify the code above to this.
do {
x = rnd(5) // {0,1,2,3,4}
y = rnd(5) // {0,1,2,3,4}
z = x+y*5 // {0-24}
while (z != 24)
random07 = z mod 8
This will take the one value (24) that is throwing off your probabilities and discard it. Generating a new random number if you get a 'bad' value like this will make your algorithm run very slightly longer (in this case 1/25 of the time it will take 2x as long to run, 1/625 it will take 3x as long, etc). But it will give you the right probabilities.
The real problem, of course, is the fact that the numbers in the middle of the sum (4 in this case) occur in many combinations (0+4, 1+3, etc.) whereas 0 and 8 have exactly one way to be produced.
I don't know how to solve this problem, but I'm going to try to reduce it a bit for you. Some points to consider:
The 0-7 range has 8 possible values, so ultimately the total number of possible situations that you should aim for has to be a multiple of 8. That way you can have an integral number of distributions per value in that codomain.
When you take the sum of two density functions, the number of possible situations (not necessarily distinct when you evaluate the sum, just in terms of different permutations of inputs) is equal to the product of the size of each of the input sets.
Thus, given two {0,1,2,3,4} sets summed together, you have 5*5=25 possibilities.
It will not be possible to get a multiple of eight (see first point) from powers of 5 (see second point, but extrapolate it to any number of sets > 1), so you will need to have a surplus of possible situations in your function and ignore some of them if they occur.
The simplest way to do that, as far as I can see at this point, is to use the sum of two {0,1,2,3,4} sets (25 possibilities) and ignore 1 (to leave 24, a multiple of 8).
Thus the challenge now has been reduced to this: Find a way to distribute the remaining 24 possibilities among the 8 output values. For this, you'll probably NOT want to use the sum, but rather just the input values.
One way to do that is, imagine a number in base 5 constructed from your input. Ignore 44 (that's your 25th, superfluous value; if you get it, synthesize a new set of inputs) and take the others, modulo 8, and you'll get your 0-7 across 24 different input combinations (3 each), which is an equal distribution.
My logic would be this:
rn07 = 0;
do {
num = rng04;
}
while(num == 4);
rn07 = num * 2;
do {
num = rng04;
}
while(num == 4);
rn07 += num % 2

Resources