permutations without repetition - algorithm

I would like to know, what is the best approach to solve this problem:
Given x, y and y integers: a1, a2, a3 .. ay find all combinations of
a1 ± a2 ± ... ± ay = x, y < 20.
My recent approach is to find all permutations of 1 and 0 stored in table T and then, depending on whether number T[i] is 1 and 0, add or subtract ai from sum. The problem is that there are n! permutations of n-element array. Hence, for 20-element array, I have to check 20! possibilities where most of them are repeated. Could you please suggest me any potential approach to solving my problem?

There are only 2^20 (just over a million) binary vectors of length 20 rather than the infeasible 20!. Use should be able to brute-force that few in less than a second, especially if you use a Gray Code which would allow you to pass from one candidate sum to another in a single step (e.g. to go from a + b - c -d to a + b - c + d just add 2*d.
The excellent branch and bound idea of #MikeWise would be good if y gets much larger. Generate a tree starting with a root node of 0. Give it children of -a1 and +a1. Then 4 grand children by adding and subtracting a2, etc. If you ever get farther than the sum of the remaining ai from the target x -- you can prune that branch. In the worst case, this might be slightly worse than the Gray-code based brute force (because you need to do so much more processing at each node), but in the best case you might be able to prune away most possibilities.
On Edit: Here is some Python code. First I define a generator which, given an integer n, successively returns which bit position needs to flip to step through a Gray code:
def grayBit(n):
code = [0]*n
odd = True
done = False
while not done:
if odd:
code[0] = 1 - code[0] #flip bit
odd = False
yield 0
else:
i = code.index(1)
if i == n-1:
done = True
else:
code[i+1] = 1 - code[i+1]
odd = True
yield i+1
(This uses an algorithm which I learned years ago in the excellent book "Constructive Combinatorics" by Stanton and White).
Then -- I use this to return all solutions (as lists consisting of the input list of numbers with negative signs inserted as needed). The key point is that I can take the current bit-to-flip and either add or subtract twice the corresponding number:
def signedSums(nums, target):
n = len(nums)
patterns = []
total = sum(nums)
pattern = [1]*n
if target == total: patterns.append([x*y for x,y in zip(nums,pattern)])
deltas = [2*i for i in nums]
for i in grayBit(n):
if pattern[i] == 1:
total -= deltas[i]
else:
total += deltas[i]
pattern[i] = -1 * pattern[i]
if target == total: patterns.append([x*y for x,y in zip(nums,pattern)])
return patterns
Typical output:
>>> signedSums([1,2,3,4,5,9],6)
[[1, -2, -3, -4, 5, 9], [1, 2, 3, -4, -5, 9], [-1, 2, -3, 4, -5, 9], [1, 2, 3, 4, 5, -9]]
It only takes about a second to evaluate:
>>> len(signedSums([i for i in range(1,21)],100))
2865
Hence there are 2865 ways to add or subtract the integers in the range 1,2,..,20 to get a net sum of 100.
I assumed that a1 can be either added or subtracted (instead of just added, which is what your question implies if taken literally). Note that if you really want to insist that a1 occurs positively, then you could just subtract it from x and apply the above algorithm to the rest of the list and the adjusted target.
Finally, it is not too hard to see that if you solve the subset sub problem with the set of weights {2*a1, 2*a2, 2*a3, .... 2*ay} and with a target sum of x + a1 + a2 + ... + ay then the subsets selected will correspond exactly to the subsets where the positive signs occur in the solution to the original problem. Thus your problem is easily reducible to the subset-sum problem and it is thus NP-complete to determine if it has any solutions (and NP-hard to list them all).

We have conditions:
a1 ± a2 ± ... ± ay = x, y<20 [1]
First of all, I would generalize the condition [1], allowing all 'a' including 'a1' to be ±:
±a1 ± a2 ± ... ± ay = x [2]
If we have solution for [2], we can easily get solution for [1]
To solve [2] we can use the following approach:
combinations list x
| x == 0 && null list = [[]]
| null list = []
| otherwise = plusCombinations ++ minusCombinations where
a = head list
rest = tail list
plusCombinations = map (\c -> a:c) $ combinations rest (x-a)
minusCombinations = map (\c -> -a:c) $ combinations rest (x+a)
Explanation:
First condition checks if x reached zero and used all numbers from list. This means that solution found and we return single solution: [[]]
Second condition checks that list is empty and as far as x is not 0 this means that no solution can be found, returning empty solution: []
Third branch means that we can two alternatives: to use ai with '+' or with '-' so we concatenate plus and minus combinations
Example output:
*Main> combinations [1,2,3,4] 2
[[1,2,3,-4],[-1,2,-3,4]]
*Main> combinations [1,2,3,4] 3
[]
*Main> combinations [1,2,3,4] 4
[[1,2,-3,4],[-1,-2,3,4]]

Related

Minimum Delete operations to empty the vector

My friend was asked this question in an interview:
We have a vector of integers consisting only of 0s and 1s. A delete consists of selecting consecutive equal numbers and removing them. The remaining parts are then attached to each other. For e.g., if the vector is [0,1,1,0] then after removing [1,1] we get [0,0]. We need one delete to remove an element from the vector, if no consecutive elements are found.
We need to write a function that returns the minimum number of deletes to make the vector empty.
Examples 1:
Input: [0,1,1,0]
Output: 2
Explanation: [0,1,1,0] -> [0,0] -> []
Examples 2:
Input: [1,0,1,0]
Output: 3
Explanation: [1,0,1,0] -> [0,1,0] -> [0,0] -> [].
Examples 3:
Input: [1,1,1]
Output: 1
Explanation: [1,1,1] -> []
I am unsure of how to solve this question. I feel that we can use a greedy approach:
Remove all consecutive equal elements and increment the delete counter for each;
Remove elements of the form <a, b, c> where a==c and a!=b, because of we had multiple consecutive bs, it would have been deleted in step (1) above. Increment the delete counter once as we delete one b.
Repeat steps (1) and (2) as long as we can.
Increment delete counter once for each of the remaining elements in the vector.
But I am not sure if this would work. Could someone please confirm if this is the right approach? If not, how do we solve this?
Hint
You can simplify this problem greatly by noticing the following fact: a chain of consecutive zeros or ones can be shortened or lengthened without changing the final solution. By example, the two vectors have the same solution:
[1, 0, 1]
[1, 0, 0, 0, 0, 0, 0, 1]
With that in mind, the solution becomes simpler. So I encourage you to pause and try to figure it out!
Solution
With the previous remark, we can reduce the problem to vectors of alternating zeros and ones. In fact, since zero and one have no special meaning here, it suffices to solve for all such vector which start by... say a one.
[] # number of steps: 0
[1] # number of steps: 1
[1, 0] # number of steps: 2
[1, 0, 1] # number of steps: 2
[1, 0, 1, 0] # number of steps: 3
[1, 0, 1, 0, 1] # number of steps: 3
[1, 0, 1, 0, 1, 0] # number of steps: 4
[1, 0, 1, 0, 1, 0, 1] # number of steps: 4
We notice a pattern, the solution seems to be floor(n / 2) + 1 for n > 1 where n is the length of those sequences. But can we prove it..?
Proof
We will proceed by induction. Suppose you have a solution for a vector of length n - 2, then any move you do (except for deleting the two characters on the edges of the vector) will have the following result.
[..., 0, 1, 0, 1, 0 ...]
^------------ delete this one
Result:
[..., 0, 1, 1, 0, ...]
But we already mentioned that a chain of consecutive zeros or ones can be shortened or lengthened without changing the final solution. So the result of the deletion is in fact equivalent to now having to solve for:
[..., 0, 1, 0, ...]
What we did is one deletion in n elements and arrived to a case which is equivalent to having to solve for n - 2 elements. So the solution for a vector of size n is...
Solution(n) = Solution(n - 2) + 1
= [floor((n - 2) / 2) + 1] + 1
= floor(n / 2) + 1
Keeping in mind that the solutions for [1] and [1, 0] are respectively 1 and 2, this concludes our proof. Notice here, that [] turns out to be an edge case.
Interestingly enough, this proof also shows us that the optimal sequence of deletions for a given vector is highly non-unique. You can simply delete any block of ones or zeros, except for the first and last ones, and you will end up with an optimal solution.
Conclusion
In conclusion, given an arbitrary vector of ones and zeros, the smallest number of deletions you will need can be computed by counting the number of groups of consecutive ones or zeros. The answer is then floor(n / 2) + 1 for n > 1.
Just for fun, here is a Python implementation to solve this problem.
from itertools import groupby
def solution(vector):
n = 0
for group in groupby(vector):
n += 1
return n // 2 + 1 if n > 1 else n
Intuition: If we remove the subsegments of one integer, then all the remaining integers are of one type leads to only one operation.
Choosing the integer which is not the starting one to remove subsegments leads to optimal results.
Solution:
Take the integer other than the one that is starting as a flag.
Count the number of contiguous segments of the flag in a vector.
The answer will be the above count + 1(one operation for removing a segment of starting integer)
So, the answer is:
answer = Count of contiguous segments of flag + 1
Example 1:
[0,1,1,0]
flag = 1
Count of subsegments with flag = 1
So, answer = 1 + 1 = 2
Example 2:
[1,0,1,0]
flag = 0
Count of subsegments with flag = 2
So, answer = 2 + 1 = 3
Example 3:
[1,1,1]
flag = 0
Count of subsegments with flag = 0
So, answer = 0 + 1 = 1

Minimum number X such that X % P == N

This question is taken from an ACM-ICPC Romanian archive.
You are given T tuples of the form (N, P), find the smallest number X for every tuple such that X % P == N. If this is not possible, print -1. X can only be formed using digits from the set {2, 3, 5, 7}.
Example :
3
52 100
11 100
51 1123
Output for given example :
52
-1
322352
Restrictions :
1 ≤ P ≤ 5 * 10^6
1 ≤ N ≤ P - 1
I attempted solving this problem by using a recursive function that would build numbers with digits from the given set and check if the condition is met, but that is way too slow because I have no idea when to stop searching (i.e. when there's no solution for the given tuple).
The author hints at using BFS somehow, but I really don't see any way to construct a meaningful graph using the input data of this problem.
How would you approach solving this problem?
You can solve this with a BFS, starting from 0, where adjacent vertices to a number n are 10n+2, 10n+3, 10n+5 and 10n+7. By keeping a record of all numbers mod p already queued, one can reduce the size of the search space, but more importantly know when the whole space has been searched.
Here's a simple Python implementation:
import collections
def ns(n, p):
q = collections.deque([0])
done = set()
while q:
x = q.popleft()
for d in [2, 3, 5, 7]:
nn = 10 * x + d
if nn % p in done:
continue
if nn % p == n:
return nn
q.append(nn)
done.add(nn % p)
return -1
assert ns(52, 100) == 52
assert ns(11, 100) == -1
assert ns(51, 1123) == 322352
assert ns(0, 55) == 55

Finding the lowest sum of values in a list to form a target factor

I'm stuck as to how to make an algorithm to find a combination of elements from a list where the sum of those factors is the lowest possible where the factor of those numbers is a predetermined target value.
For instance a list:
(2,5,7,6,8,2,3)
And a target value:
12
Would result in these factors:
(2,2,3) and (2,6)
But the optimal combination would be:
(2,2,3)
As it has a lower sum
First erase from the list all numbers that aren't factors of n. So in your example your list would reduce to (2, 6, 2, 3). Then I would sort the list. So you have (2, 2, 3, 6). Start multiplying the elements from the left to right if you reach n stop. If you exceed n find the next smallest permutation of your numbers and repeat. This will be (2, 2, 6, 3) (for a C++ function that finds the next permutation see this link). This will guarantee to find the multiplication with the smallest sum because the we are checking the products in order from smallest sum to largest. This runs in the size of your list factorial but I think that is as good as you're going to get. This problem sounds NP hard.
You can do slightly better by pruning the permutations. Lets say you were looking for 24 and your list is (2, 4, 8, 12). The only subset is (2, 12). But the next permutation will be (2, 4, 12, 8) which you don't even need to generate because you knew that 2*4 was too small and 2*4*8 was too big and swapping 12 with 8 only increased 2*4*8. This way you didn't have to test that permutation.
You should be able to break the problem down recursively. You have a multiset of potential factors S = {n_1, n_2, ..., n_k}. Let f(S,n) be the maximum sum n_i_1 + n_i_2 + ... + n_i_j where n_i_l are distinct elements of the multiset and n_i_1 * ... * n_i_j = n. Then f(S,n) = max_i { (n_i + f(S-{n_i},n/n_i)) where n_i divides n }. In other words, f(S,n) can be computed recursively. With a little more work you can get the algorithm to spit out the actual n_is that work. The time complexity could be bad, but you don't say what your goals are in that regard.
def primes(n):
primfac = []
d = 2
while d*d <= n:
while (n % d) == 0:
primfac.append(d) # supposing you want multiple factors repeated
n //= d
d += 1
if n > 1:
primfac.append(n)
return primfac
def get_factors_list(dividend, ceiling = float('infinity')):
""" Yield all lists of factors where the largest is no larger than ceiling """
for divisor in range(min(ceiling, dividend - 1), 1, -1):
quotient, mod = divmod(dividend, divisor)
if mod == 0:
if quotient <= divisor:
yield [divisor, quotient]
for factors in get_factors_list(quotient, divisor):
yield [divisor] + factors
def print_factors(x):
factorList = []
if x > 0:
for factors in get_factors_list(x):
factorList.append(list(map(int, factors)))
return factorList
Here's is how you could do it in Haskell:
import Data.List(sortBy, subsequences)
import Data.Function(on)
lowestSumTargetFactor :: (Ord b, Num b) => [b] -> b -> [b]
lowestSumTargetFactor xs target = do
let l = filter (/= []) $ sortBy (compare `on` sum)
[x | x <- subsequences xs, product x == target]
if l == []
then error $ "lowestSumTargetFactor: " ++
"no subsequence product equals target."
else head l
Here's what is happening:
[x | x <- subsequences xs, product x == target] builds a list made of all subsequences of the list xs whose product equals target. In your example, it would build the list [[2,6],[6,2],[2,2,3]].
Then the sortBy (compareonsum) part sorts that list of list by the sum of it's list elements. It would return the list [[2,2,3],[2,6],[6,2]].
I then filter that list, removing any [] elements because product [] returns 1 (don't know the reasoning for this, yet). This was done because lowestSumTargetFactor [1, 1, 1] 1 would return [] instead of the expected [1].
Then I ask if the list we built is []. If no, I use the function head to return the first element of that list ([2,2,3] in your case). If yes, it returns the error as written.
Obs1: where it appears above, the $ just means that everything after it is enclosed in parentheses.
Obs2: the lowestSumTargetFactor :: (Ord b, Num b) => [b] -> b -> [b] part is just the function's type signature. It means that the function takes a list made of bs, a second argument b and returns another list made of bs, b being a member of both the Ord class of totally ordered datatypes, and the Num class, the basic numeric class.
Obs3: I'm still a beginner. A more experienced programmer would probably do this much more efficiently and elegantly.

Indexing ranked permutations into other ranked permutations

I'm considering all permutations of 0, ..., n-1 in lexicographic order. I'm given two ranks, i and j, and asked to find the rank of the permutation that results from applying the i'th permutation to the j'th permutation.
A couple examples for n=3:
p(3) = [1, 2, 0], p(4) = [2, 0, 1], result = [0, 1, 2], rank = 0
Given i = j = 4, we get [2, 0, 1] applied to itself is [1, 2, 0], rank = 3.
What I've come up with so far: I convert the ranks to their respective permutations via Lehmer codes, calculate the desired permutation, and convert back to rank via Lehmer codes.
Can anyone suggest a way to get the rank of the desired permutation from the other two ranks, without having to actually calculate the permutations? Storing the n! x n! array is not an option.
-edit- Note that I'm not wedded to lexicographic order if some other ordering would enable this.
-edit- Here are the n! by n! grids for n=3 & 4, for lexicographic ranks. Row i is indexed into column j to get the output. Note that the n=3 grid is identical to the top-left corner of the n=4 grid.
00|01|02|03|04|05|
01|00|03|02|05|04|
02|04|00|05|01|03|
03|05|01|04|00|02|
04|02|05|00|03|01|
05|03|04|01|02|00|
00|01|02|03|04|05|06|07|08|09|10|11|12|13|14|15|16|17|18|19|20|21|22|23|
01|00|03|02|05|04|07|06|09|08|11|10|13|12|15|14|17|16|19|18|21|20|23|22|
02|04|00|05|01|03|08|10|06|11|07|09|14|16|12|17|13|15|20|22|18|23|19|21|
03|05|01|04|00|02|09|11|07|10|06|08|15|17|13|16|12|14|21|23|19|22|18|20|
04|02|05|00|03|01|10|08|11|06|09|07|16|14|17|12|15|13|22|20|23|18|21|19|
05|03|04|01|02|00|11|09|10|07|08|06|17|15|16|13|14|12|23|21|22|19|20|18|
06|07|12|13|18|19|00|01|14|15|20|21|02|03|08|09|22|23|04|05|10|11|16|17|
07|06|13|12|19|18|01|00|15|14|21|20|03|02|09|08|23|22|05|04|11|10|17|16|
08|10|14|16|20|22|02|04|12|17|18|23|00|05|06|11|19|21|01|03|07|09|13|15|
09|11|15|17|21|23|03|05|13|16|19|22|01|04|07|10|18|20|00|02|06|08|12|14|
10|08|16|14|22|20|04|02|17|12|23|18|05|00|11|06|21|19|03|01|09|07|15|13|
11|09|17|15|23|21|05|03|16|13|22|19|04|01|10|07|20|18|02|00|08|06|14|12|
12|18|06|19|07|13|14|20|00|21|01|15|08|22|02|23|03|09|10|16|04|17|05|11|
13|19|07|18|06|12|15|21|01|20|00|14|09|23|03|22|02|08|11|17|05|16|04|10|
14|20|08|22|10|16|12|18|02|23|04|17|06|19|00|21|05|11|07|13|01|15|03|09|
15|21|09|23|11|17|13|19|03|22|05|16|07|18|01|20|04|10|06|12|00|14|02|08|
16|22|10|20|08|14|17|23|04|18|02|12|11|21|05|19|00|06|09|15|03|13|01|07|
17|23|11|21|09|15|16|22|05|19|03|13|10|20|04|18|01|07|08|14|02|12|00|06|
18|12|19|06|13|07|20|14|21|00|15|01|22|08|23|02|09|03|16|10|17|04|11|05|
19|13|18|07|12|06|21|15|20|01|14|00|23|09|22|03|08|02|17|11|16|05|10|04|
20|14|22|08|16|10|18|12|23|02|17|04|19|06|21|00|11|05|13|07|15|01|09|03|
21|15|23|09|17|11|19|13|22|03|16|05|18|07|20|01|10|04|12|06|14|00|08|02|
22|16|20|10|14|08|23|17|18|04|12|02|21|11|19|05|06|00|15|09|13|03|07|01|
23|17|21|11|15|09|22|16|19|05|13|03|20|10|18|04|07|01|14|08|12|02|06|00|
Here are the factoradics for n=4. I left off the last digit, which is always zero, for compactness.
000|001|010|011|020|021|100|101|110|111|120|121|200|201|210|211|220|221|300|301|310|311|320|321|
001|000|011|010|021|020|101|100|111|110|121|120|201|200|211|210|221|220|301|300|311|310|321|320|
010|020|000|021|001|011|110|120|100|121|101|111|210|220|200|221|201|211|310|320|300|321|301|311|
011|021|001|020|000|010|111|121|101|120|100|110|211|221|201|220|200|210|311|321|301|320|300|310|
020|010|021|000|011|001|120|110|121|100|111|101|220|210|221|200|211|201|320|310|321|300|311|301|
021|011|020|001|010|000|121|111|120|101|110|100|221|211|220|201|210|200|321|311|320|301|310|300|
100|101|200|201|300|301|000|001|210|211|310|311|010|011|110|111|320|321|020|021|120|121|220|221|
101|100|201|200|301|300|001|000|211|210|311|310|011|010|111|110|321|320|021|020|121|120|221|220|
110|120|210|220|310|320|010|020|200|221|300|321|000|021|100|121|301|311|001|011|101|111|201|211|
111|121|211|221|311|321|011|021|201|220|301|320|001|020|101|120|300|310|000|010|100|110|200|210|
120|110|220|210|320|310|020|010|221|200|321|300|021|000|121|100|311|301|011|001|111|101|211|201|
121|111|221|211|321|311|021|011|220|201|320|301|020|001|120|101|310|300|010|000|110|100|210|200|
200|300|100|301|101|201|210|310|000|311|001|211|110|320|010|321|011|111|120|220|020|221|021|121|
201|301|101|300|100|200|211|311|001|310|000|210|111|321|011|320|010|110|121|221|021|220|020|120|
210|310|110|320|120|220|200|300|010|321|020|221|100|301|000|311|021|121|101|201|001|211|011|111|
211|311|111|321|121|221|201|301|011|320|021|220|101|300|001|310|020|120|100|200|000|210|010|110|
220|320|120|310|110|210|221|321|020|300|010|200|121|311|021|301|000|100|111|211|011|201|001|101|
221|321|121|311|111|211|220|320|021|301|011|201|120|310|020|300|001|101|110|210|010|200|000|100|
300|200|301|100|201|101|310|210|311|000|211|001|320|110|321|010|111|011|220|120|221|020|121|021|
301|201|300|101|200|100|311|211|310|001|210|000|321|111|320|011|110|010|221|121|220|021|120|020|
310|210|320|110|220|120|300|200|321|010|221|020|301|100|311|000|121|021|201|101|211|001|111|011|
311|211|321|111|221|121|301|201|320|011|220|021|300|101|310|001|120|020|200|100|210|000|110|010|
320|220|310|120|210|110|321|221|300|020|200|010|311|121|301|021|100|000|211|111|201|011|101|001|
321|221|311|121|211|111|320|220|301|021|201|011|310|120|300|020|101|001|210|110|200|010|100|000|
I found an algorithm to convert between permutations and ranks in linear time. That's not quite what I want, but is probably good enough. It turns out that the fact that I don't care about lexicographic order is important. The ranking this uses is weird. I'm going to give two functions, one that converts from a rank to a permutation, and one that does the inverse.
First, to unrank (go from rank to permutation)
Initialize:
n = length(permutation)
r = desired rank
p = identity permutation of n elements [0, 1, ..., n]
unrank(n, r, p)
if n > 0 then
swap(p[n-1], p[r mod n])
unrank(n-1, floor(r/n), p)
fi
end
Next, to rank:
Initialize:
p = input permutation
q = inverse input permutation (in linear time, q[p[i]] = i for 0 <= i < n)
n = length(p)
rank(n, p, q)
if n=1 then return 0 fi
s = p[n-1]
swap(p[n-1], p[q[n-1]])
swap(q[s], q[n-1])
return s + n * rank(n-1, p, q)
end
That's the pseudocode. For my project I'll be careful to work with a copy of p so I don't mutate it when calculating its rank.
The running time of both of these is O(n).
There's a nice, readable paper explaining why this works: Ranking & Unranking Permutations in Linear Time, by Myrvold & Ruskey, Information Processing Letters Volume 79, Issue 6, 30 September 2001, Pages 281–284.
http://webhome.cs.uvic.ca/~ruskey/Publications/RankPerm/MyrvoldRuskey.pdf
If, in addition to R, you are not wedded to a particular P either, we could redefine the permutation function to facilitate a possible answer. The function, newPerm, below would permute a list in relation to R with the same consistency as the permuting function that "indexes into."
The example below is not optimized for efficiency (e.g., ranking/unranking can be done in O(n)). The last two lines of output compare the redefined permuting function to the "indexing" permuting function - as you can see, they both generate the same number of unique permutations when mapped to the permutation set. The function, f, would be the answer to the question.
Haskell code:
import Data.List (sort,permutations)
import Data.Maybe (fromJust)
sortedPermutations = sort $ permutations [0,1,2,3,4,5,6]
rank p = fromJust (lookup p rs) where rs = zip sortedPermutations [0..]
unrank r = fromJust (lookup r ps) where ps = zip [0..] sortedPermutations
tradPerm p s = foldr (\a b -> s!!a : b) [] p
newPerm p s = unrank (f (rank p) (rank s))
f r1 r2 = let l = r1 - r2 in if l < 0 then length sortedPermutations + l else l
Output:
*Main Data.List> unrank 3
[0,1,2,3,5,6,4]
*Main Data.List> unrank 8
[0,1,2,4,5,3,6]
*Main Data.List> f 3 8
5035
*Main Data.List> newPerm [0,1,2,3,5,6,4] [0,1,2,4,5,3,6]
[6,5,4,3,0,2,1]
*Main Data.List> rank [6,5,4,3,0,2,1]
5035
*Main Data.List> length $ group $ sort $ map (tradPerm [1,2,5,0,4,3,6]) sortedPermutations
5040
*Main Data.List> length $ group $ sort $ map (newPerm [1,2,5,0,4,3,6]) sortedPermutations
5040

Create many constrained, random permutation of a list

I need to make a random list of permutations. The elements can be anything but assume that they are the integers 0 through x-1. I want to make y lists, each containing z elements. The rules are that no list may contain the same element twice and that over all the lists, the number of times each elements is used is the same (or as close as possible). For instance, if my elements are 0,1,2,3, y is 6, and z is 2, then one possible solution is:
0,3
1,2
3,0
2,1
0,1
2,3
Each row has only unique elements and no element has been used more than 3 times. If y were 7, then 2 elements would be used 4 times, the rest 3.
This could be improved, but it seems to do the job (Python):
import math, random
def get_pool(items, y, z):
slots = y*z
use_each_times = slots/len(items)
exceptions = slots - use_each_times*len(items)
if (use_each_times > y or
exceptions > 0 and use_each_times+1 > y):
raise Exception("Impossible.")
pool = {}
for n in items:
pool[n] = use_each_times
for n in random.sample(items, exceptions):
pool[n] += 1
return pool
def rebalance(ret, pool, z):
max_item = None
max_times = None
for item, times in pool.items():
if times > max_times:
max_item = item
max_times = times
next, times = max_item, max_times
candidates = []
for i in range(len(ret)):
item = ret[i]
if next not in item:
candidates.append( (item, i) )
swap, swap_index = random.choice(candidates)
swapi = []
for i in range(len(swap)):
if swap[i] not in pool:
swapi.append( (swap[i], i) )
which, i = random.choice(swapi)
pool[next] -= 1
pool[swap[i]] = 1
swap[i] = next
ret[swap_index] = swap
def plist(items, y, z):
pool = get_pool(items, y, z)
ret = []
while len(pool.keys()) > 0:
while len(pool.keys()) < z:
rebalance(ret, pool, z)
selections = random.sample(pool.keys(), z)
for i in selections:
pool[i] -= 1
if pool[i] == 0:
del pool[i]
ret.append( selections )
return ret
print plist([0,1,2,3], 6, 2)
Ok, one way to approximate that:
1 - shuffle your list
2 - take the y first elements to form the next row
4 - repeat (2) as long as you have numbers in the list
5 - if you don't have enough numbers to finish the list, reshuffle the original list and take the missing elements, making sure you don't retake numbers.
6 - Start over at step (2) as long as you need rows
I think this should be as random as you can make it and will for sure follow your criteria. Plus, you have very little tests for duplicate elements.
First, you can always randomly sort the list in the end, so let's not worry about making "random permutations" (hard); and just worry about 1) making permutations (easy) and 2) randomizing them (easy).
If you want "truly" random groups, you have to accept that randomization by nature doesn't really allow for the constraint of "even distribution" of results -- you may get that or you may get a run of similar-looking ones. If you really want even distribution, first make the sets evenly distributed, and then randomize them as a group.
Do you have to use each element in the set x evenly? It's not clear from the rules that I couldn't just make the following interpretation:
Note the following: "over all the lists, the number of times each elements is used is the same (or as close as possible)"
Based on this criteria, and the rule that z < x*, I postulate that you can simply enumerate all the items over all the lists. So you automatically make y list of the items enumerated to position z. Your example doesn't fulfill the rule above as closely as my version will. Using your example of x={0,1,2,3} y=6 and z=2, I get:
0,1 0,1 0,1 0,1 0,1 0,1
Now I didn't use 2 or 3, but you didn't say I had to use them all. If I had to use them all and I don't care to be able to prove that I am "as close as possible" to even usage, I would just enumerate across all the items through the lists, like this:
0,1 2,3 0,1 2,3 0,1 2,3
Finally, suppose I really do have to use all the elements. To calculate how many times each element can repeat, I just take (y*z)/(count of x). That way, I don't have to sit and worry about how to divide up the items in the list. If there is a remainder, or the result is less than 1, then I know that I will not get an exact number of repeats, so in those cases, it doesn't much matter to try to waste computational energy to make it perfect. I contend that the fastest result is still to just enumerate as above, and use the calculation here to show why either a perfect result was or wasn't achieved. A fancy algorithm to extract from this calculation how many positions will be duplicates could be achieved, but "it's too long to fit here in the margin".
*Each list has the same z number of elements, so it will be impossible to make lists where z is greater than x and still fulfill the rule that no list may contain the same element twice. Therefore, this rule demands that z cannot be greater than x.
Based on new details in the comments, the solution may simply be an implementation of a standard random permutation generation algorithm. There is a lengthy discussion of random permutation generation algorithms here:
http://www.techuser.net/randpermgen.html
(From Google search: random permutation generation)
This works in Ruby:
# list is the elements to be permuted
# y is the number of results desired
# z is the number of elements per result
# equalizer keeps track of who got used how many times
def constrained_permutations list, y, z
list.uniq! # Never trust the user. We want no repetitions.
equalizer = {}
list.each { |element| equalizer[element] = 0 }
results = []
# Do this until we get as many results as desired
while results.size < y
pool = []
puts pool
least_used = equalizer.each_value.min
# Find how used the least used element was
while pool.size < z
# Do this until we have enough elements in this resultset
element = nil
while element.nil?
# If we run out of "least used elements", then we need to increment
# our definition of "least used" by 1 and keep going.
element = list.shuffle.find do |x|
!pool.include?(x) && equalizer[x] == least_used
end
least_used += 1 if element.nil?
end
equalizer[element] += 1
# This element has now been used one more time.
pool << element
end
results << pool
end
return results
end
Sample usage:
constrained_permutations [0,1,2,3,4,5,6], 6, 2
=> [[4, 0], [1, 3], [2, 5], [6, 0], [2, 5], [3, 6]]
constrained_permutations [0,1,2,3,4,5,6], 6, 2
=> [[4, 5], [6, 3], [0, 2], [1, 6], [5, 4], [3, 0]]
enter code here
http://en.wikipedia.org/wiki/Fisher-Yates_shuffle

Resources