N-fold partition of an array with equal sum in each partition - algorithm

Given an array of integers a and two numbers N and M, return N groups of integers from a such that each group sums to M.
For example, say:
a = [1,2,3,4,5]
N = 2
M = 5
Then the algorithm could return [2, 3], [1, 4] or [5], [2, 3] or possibly others.
What algorithms could I use here?
Edit:
I wasn't aware that this problem is NP-complete. So maybe it would help if I provided more details on my specific scenario:
So I'm trying to create a "match-up" application. Given the number of teams N and the number of players per team M, the application listens for client requests. Each client request will give a number of players that the client represents. So if I need 2 teams of 5 players, then if 5 clients send requests, each representing 1, 2, 3, 4, 5 players respectively, then my application should generate a match-up between clients [1, 4] and clients [2, 3]. It could also generate a match-up between [1, 4] and [5]; I don't really care.
One implication is that any client representing more than M or less than 0 players is invalid. I hope this simplifies the problem.

This appears to be a variation of the subset sum problem. As this problem is NP-complete, there will be no efficient algorithm without further constraints.
Note that it is already hard to find even a single subset of the original set whose elements sum to M.

People give up too easily on NP-complete problems. Just because a problem is NP-complete doesn't mean there aren't more and less efficient algorithms for it in the general case. You can't guarantee that every input can be answered faster than a brute-force search, but for many problems you can certainly have methods that are faster than the full search for most inputs.
For this problem there are certainly 'perverse' sets of numbers that will produce worst-case search times, because there may be, say, a large vector of integers with only one solution, so you end up trying a very large number of combinations.
But for non-perverse sets there are probably many solutions, and an efficient way of 'tripping over' a good partitioning will run much faster than a full exponential search.
How you solve this will depend a lot on what you expect to be the more common parameters. It also makes a difference if the integers are all positive, or if negatives are allowed.
In this case I'll assume that:
N is small relative to the length of the vector
All integers are positive.
Integers cannot be re-used.
Algorithm:
Sort the vector, v (in descending order, as in the example below).
Eliminate elements bigger than M. They can't be part of any solution.
Add up all remaining numbers in v, divide by N. If the result is smaller than M, there is no solution.
Create a new array w, the same size as v. For each w[i], sum all the numbers in v[i+1..end].
So if v was 5 4 3 2 1, w would be 10, 6, 3, 1, 0.
While you have not found enough sets:
Choose the largest number, x. If it is equal to M, emit a solution set with just x, remove it from the vector, and remove the first element from w.
Still not enough sets? (Likely.) Then, again, while you have not found enough sets:
A solution theory is ([a,b,c], R), where [a,b,c] is a partial set of elements of v and R is the remainder, R = M - sum([a,b,c]). Extending a theory means adding a number to the partial set and subtracting that number from R. As you extend the theories, if R == 0, that is a possible solution.
Recursively create theories like so: loop over the elements of v, creating a theory ([v[i]], R) for each v[i], and then recursively extend each theory using only part of v. Binary search into v to find the first element equal to or smaller than R, say v[j]. Starting at v[j], extend the theory with the elements of v from index j up to the last index k at which v[k] plus the remaining suffix sum w[k] is still at least R.
The numbers from v[j] to v[k] are the only numbers that can be used to extend a theory and still get R to 0: numbers larger than v[j] would make R negative, and beyond v[k] there aren't enough numbers left in the array to bring R to 0 even if you used them all. A sketch of this search is given below.
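A minimal Python sketch of this pruned search, under the assumptions above (positive integers, v sorted descending). The helper names find_group and n_groups are mine, not from the description; it scans linearly where the text suggests a binary search, and because it greedily keeps the first group it finds, it can in principle miss solutions that require a different first group.
def find_group(v, w, M, start=0, chosen=None, R=None):
    # v sorted descending, w[i] = sum(v[i+1:]); a "theory" is (chosen, R) with R = M - sum(chosen)
    chosen = [] if chosen is None else chosen
    R = M if R is None else R
    if R == 0:
        return chosen
    for j in range(start, len(v)):
        if v[j] > R:          # too big: would make R negative
            continue
        if w[j] + v[j] < R:   # even v[j] plus everything after it can't reach M
            break
        found = find_group(v, w, M, j + 1, chosen + [v[j]], R - v[j])
        if found is not None:
            return found
    return None

def n_groups(a, N, M):
    v = sorted((x for x in a if x <= M), reverse=True)  # drop elements bigger than M
    groups = []
    while len(groups) < N:
        w = [sum(v[i+1:]) for i in range(len(v))]
        g = find_group(v, w, M)
        if g is None:
            return None       # no further group summing to M exists in what's left
        groups.append(g)
        for x in g:
            v.remove(x)
    return groups

print(n_groups([1, 2, 3, 4, 5], 2, 5))   # [[5], [4, 1]]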

Here is my own Python solution that uses dynamic programming. The algorithm is given here.
def get_subset(lst, s):
    '''Given a list of integers `lst` and an integer s, returns
    a subset of lst that sums to s, as well as lst minus that subset
    '''
    # q[(i, j)] = (True, subset) if some subset of lst[:i+1] sums to j
    q = {}
    for i in range(len(lst)):
        for j in range(1, s+1):
            if lst[i] == j:
                q[(i, j)] = (True, [j])
            elif i >= 1 and q[(i-1, j)][0]:
                q[(i, j)] = (True, q[(i-1, j)][1])
            elif i >= 1 and j >= lst[i] and q[(i-1, j-lst[i])][0]:
                q[(i, j)] = (True, q[(i-1, j-lst[i])][1] + [lst[i]])
            else:
                q[(i, j)] = (False, [])
        if q[(i, s)][0]:
            for k in q[(i, s)][1]:
                lst.remove(k)
            return q[(i, s)][1], lst
    return None, lst
def get_n_subset(n, lst, s):
    ''' Returns n subsets of lst, each of which sums to s'''
    solutions = []
    for i in range(n):
        sol, lst = get_subset(lst, s)
        solutions.append(sol)
    return solutions, lst
# print(get_n_subset(7, [1, 2, 3, 4, 5, 7, 8, 4, 1, 2, 3, 1, 1, 1, 2], 5))
# [stdout]: ([[2, 3], [1, 4], [5], [4, 1], [2, 3], [1, 1, 1, 2], None], [7, 8])

Related

Dynamic Programming Solution for a Variant of Coin Exchange

I am practicing Dynamic Programming. I am focusing on the following variant of the coin exchange problem:
Let S = [1, 2, 6, 12, 24, 48, 60] be a constant set of integer coin denominations. Let n be a positive integer amount of money attainable via coins in S. Consider two persons A and B. In how many different ways can I split n among persons A and B so that each person gets the same amount of coins (disregarding the actual amount of money each gets)?
Example
n = 6 can be split in 4 different ways:
Person A gets {2, 2} and person B gets {1, 1}.
Person A gets {2, 1} and person B gets {2, 1}.
Person A gets {1, 1} and person B gets {2, 2}.
Person A gets {1, 1, 1} and person B gets {1, 1, 1}.
Notice that each way is non-redundant per person, i.e. we do not count both {2, 1} and {1, 2} as two different ways.
Previous research
I have studied very similar DP problems, such as the coin exchange problem and the partition problem. In fact, there are questions on this site referring to almost the same problem:
Dynamic Programming for a variant of the coin exchange - Here, OP studies the recursion relationship, but seems confused introducing the parity constraint.
Coin Change :Dynamic Programming - Here, OP seems to pursue the reconstruction of the solution.
Coin change(Dynamic programming) - Here, OP seems to also pursue the reconstruction of the solution.
https://cs.stackexchange.com/questions/87230/dynamic-programming-for-a-variant-of-the-coin-exchange-problem - Here, OP seems to ask about a similar problem, yet parity, i.e. splitting into two persons, becomes the main issue.
I am interested mostly in the recursion relation that could help me solve this problem. Defining it will allow me to easily apply either a memoization or a tabulation approach to design an algorithm for this problem.
For example, this recursion:
def f(n, coins):
    if n < 0:
        return 0
    if n == 0:
        return 1
    return sum([f(n - coin, coins) for coin in coins])
Is tempting, yet it does not work, because when executed:
# => f(6, [1, 2, 6]) # 14
Here's an example of a run for S' = {1, 2, 6} and n = 6, in order to help me clarify the pattern (there might be errors):
This is what you can try:
Let C(n, k, S) be the number of distinct representations of an amount n using some k coins from S.
Then C(n, k, S) = sum(C(n - s_i, k - 1, S[i:])). The summation is over every s_i from S; S[i:] means all the elements of S from the i-th element to the end - we need this to prevent repeated combinations.
The initial conditions are C(0, 0, _) = 1 and C(n, k, _) = 0 if n < 0, or k < 0, or (n > 0 and k < 1).
The number you want to calculate:
R = sum(C(i, k, S) * C(n - i, k, S)) for i = 1..n-1 and k = 1..min(i, n-i)/Smin, where Smin is the smallest coin denomination in S.
The value min(i, n-i)/Smin represents the maximum number of coins that is possible when partitioning the given sum. For example if the sum n = 20 and i = 8 (1st person gets $8, 2nd gets $12) and the minimum coin denomination is $2, the maximum possible number of coins is 8/2 = 4. You can't get $8 with >4 coins.
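Not part of the original answer, but a short memoized sketch of this recurrence and the final summation may help (the function name count_splits and the variable names are mine):
from functools import lru_cache

def count_splits(n, S):
    S = tuple(sorted(S))
    smin = min(S)

    @lru_cache(maxsize=None)
    def C(n, k, i):
        # Ways to write n as a sum of exactly k coins taken from S[i:],
        # with coin indices chosen in non-decreasing order (prevents repeats).
        if n == 0 and k == 0:
            return 1
        if n <= 0 or k <= 0:
            return 0
        return sum(C(n - S[j], k - 1, j) for j in range(i, len(S)))

    return sum(C(a, k, 0) * C(n - a, k, 0)
               for a in range(1, n)
               for k in range(1, min(a, n - a) // smin + 1))

print(count_splits(6, [1, 2, 6]))  # 4, matching the example in the question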
Here is a table implementation and a little elaboration on algrid's beautiful answer. This produces an answer for f(500, [1, 2, 6, 12, 24, 48, 60]) in about 2 seconds.
The simple declaration of C(n, k, S) = sum(C(n - s_i, k - 1, S[i:])) means adding up all the ways to reach the current sum n using k coins. Then, if we split n into all the ways it can be partitioned in two, we can just add all the ways each of those parts can be made from the same number, k, of coins.
The beauty of fixing the subset of coins we choose from to a diminishing list means that any arbitrary combination of coins will only be counted once - it will be counted in the calculation where the leftmost coin in the combination is the first coin in our diminishing subset (assuming we order them in the same way). For example, the arbitrary subset [6, 24, 48], taken from [1, 2, 6, 12, 24, 48, 60], would only be counted in the summation for the subset [6, 12, 24, 48, 60] since the next subset, [12, 24, 48, 60] would not include 6 and the previous subset [2, 6, 12, 24, 48, 60] has at least one 2 coin.
Python code (see it here; confirm here):
import time

def f(n, coins):
    t0 = time.time()
    min_coins = min(coins)
    # m[_n][k][i] = number of ways to make _n with exactly k coins drawn from coins[:i+1]
    m = [[[0] * len(coins) for k in range(n // min_coins + 1)] for _n in range(n + 1)]
    # Initialize base case
    for i in range(len(coins)):
        m[0][0][i] = 1
    for i in range(len(coins)):
        for _i in range(i + 1):
            for _n in range(coins[_i], n + 1):
                for k in range(1, _n // min_coins + 1):
                    m[_n][k][i] += m[_n - coins[_i]][k - 1][_i]
    result = 0
    for a in range(1, n + 1):
        b = n - a
        for k in range(1, n // min_coins + 1):
            result = result + m[a][k][len(coins) - 1] * m[b][k][len(coins) - 1]
    total_time = time.time() - t0
    return (result, total_time)

print(f(500, [1, 2, 6, 12, 24, 48, 60]))

List of all possible permutations of factors of a number

I am trying to find all the possible factorizations of a given number in Python.
For example: 1) given n=12,
the output will be f(n)=[[2,2,3],[4,3],[6,2],[12]]
2) given n=24,
the output will be f(n)=[[2,2,2,3],[2,2,6],[2,12],[4,6],[8,3],[24]]
Here is my code:
def p(a):
    k=1
    m=1
    n=[]
    for i in range (len(a)):
        for j in range(0,i+1):
            k*=a[j]
        for l in range(i+1,len(a)):
            m*=a[l]
        n+=[[k,m],]
        k=1
        m=1
    return n

def f(n):
    primfac = []
    d = 2
    while d*d <= n:
        while (n % d) == 0:
            primfac.append(d)
            n //= d
        d += 1
    if n > 1:
        primfac.append(n)
    return p(primfac)
But my code returns the following values:
1) For n=12, the output is
[[2, 10], [4, 5], [20, 1]]
2) For n=24, the output is
[[2, 12], [4, 6], [8, 3], [24, 1]]
What can I do to get the expected results?
I don't know Python, so I can't help you with the code, but here is an explanation I provided for a related question (with a bit of Java code as well, if you can read Java).
Get your number factored with multiplicity - with high probability this is the most expensive step, O(sqrt(N)) - you can stop here if this is all that you want.
Build your sets {1, p_i, p_i^2, ..., p_i^(m_i)} - p_i being a prime factor with multiplicity m_i.
Perform a Cartesian product between these sets and you'll get all the divisors of your number - you'll spend longer here only for numbers with many distinct factors (and multiplicities) - e.g. 2^10 x 3^8 x 5^4 x 7^3 will have (10+1)(8+1)(4+1)(3+1) = 1980 divisors.
Now, each divisor d resulting from the above comes with its pair (N/d), so if you want distinct factorisations irrespective of order, you'll need to sort them and eliminate the duplicates. A Python sketch of these steps follows below.
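Not from the original answer, but a small Python sketch of the steps above (the helper names prime_factors and divisors are mine):
from itertools import product

def prime_factors(n):
    '''Return {prime: multiplicity} for n.'''
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def divisors(n):
    '''All divisors of n, via the Cartesian product of the sets {1, p, p^2, ..., p^m}.'''
    power_sets = [[p ** e for e in range(m + 1)] for p, m in prime_factors(n).items()]
    result = set()
    for combo in product(*power_sets):
        d = 1
        for x in combo:
            d *= x
        result.add(d)
    return sorted(result)

print(divisors(24))                                                 # [1, 2, 3, 4, 6, 8, 12, 24]
print(sorted({tuple(sorted((d, 24 // d))) for d in divisors(24)}))  # [(1, 24), (2, 12), (3, 8), (4, 6)]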

Number of permutations of the numbers 1 to n in which P_i > P_(i+1) and P_i > P_(i-1)

For a given N, how many permutations of [1, 2, 3, ..., N] satisfy the following property?
Let P_1, P_2, ..., P_N denote the permutation. The property we want to satisfy is that there exists an i between 2 and N-1 (inclusive) such that
P_j > P_(j+1) ∀ i ≤ j ≤ N - 1, and
P_j > P_(j-1) ∀ 2 ≤ j ≤ i.
For example, for N = 3, the permutations [1, 3, 2] and [2, 3, 1] satisfy the property.
Is there a direct formula or algorithm to find this set programmatically?
There are 2^(n-1) - 2 such permutations. If n is the largest element, then the permutation is uniquely determined by the nonempty, proper subset of {1, 2, ..., n-1} which lies to the left of n in the permutation. This answer is consistent with the excellent answer of @גלעדברקן, in view of the well-known fact that the elements in each row of Pascal's triangle sum to a power of two (hence the part of the row between the two ones sums to two less than a power of two).
Here is a Python enumeration which generates all n! permutations and checks them for validity:
import itertools

def validPerm(p):
    n = max(p)
    i = p.index(n)
    if i == 0 or i == n-1:
        return False
    else:
        before = p[:i]
        after = p[i+1:]
        return before == sorted(before) and after == sorted(after, reverse = True)

def validPerms(n):
    nums = list(range(1,n+1))
    valids = []
    for p in itertools.permutations(nums):
        lp = list(p)
        if validPerm(lp): valids.append(lp)
    return valids
For example,
>>> validPerms(4)
[[1, 2, 4, 3], [1, 3, 4, 2], [1, 4, 3, 2], [2, 3, 4, 1], [2, 4, 3, 1], [3, 4, 2, 1]]
which gives the expected number of 6.
On further edit: The above code was to verify the formula for nondegenerate unimodal permutations (to coin a phrase since "unimodal permutations" is used in the literature for the 2^(n-1) permutations with exactly one peak, but the 2 which either begin or end with n are arguably in some sense degenerate). From an enumeration point of view you would want to do something more efficient. The following is a Python implementation of the idea behind the answer of @גלעדברקן:
def validPerms(n):
    valids = []
    nums = list(range(1,n)) #1,2,...,n-1
    snums = set(nums)
    for i in range(1,n-1):
        for first in itertools.combinations(nums,i):
            #first will be already sorted
            rest = sorted(snums - set(first),reverse = True)
            valids.append(list(first) + [n] + rest)
    return valids
It is functionally equivalent to the above code, but substantially more efficient.
Let's look at an example:
{1,2,3,4,5,6}
Clearly, any positioning of 6 at i will mean the right side of it will be sorted descending and the left side of it ascending. For example, i = 3
{1,2,6,5,4,3}
{1,3,6,5,4,2}
{1,4,6,5,3,2}
...
So for each positioning of N between 2 and n-1, we have (n - 1) choose (position - 1) arrangements. This leads to the answer:
sum [(n - 1) choose (i - 1)], for i = 2...(n - 1)
There are ans such permutations, where ans = 2^(n-1) - 2.
The sum of (n-1 choose i-1) over i = 1..n is 2^(n-1); since i needs to satisfy 2 <= i <= n-1, we drop the two end terms, and (n-1 choose 0) = (n-1 choose n-1) = 1, so ans = 2^(n-1) - 2.
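As a quick cross-check (a small addition, not from either answer), the binomial sum, the closed form, and the brute-force validPerms enumeration defined earlier in this thread all agree:
import math

for n in range(3, 9):
    brute = len(validPerms(n))                                     # enumeration from above
    binom_sum = sum(math.comb(n - 1, i - 1) for i in range(2, n))  # sum over i = 2..n-1
    assert brute == binom_sum == 2 ** (n - 1) - 2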

Finding a group of subsets that does not overlap

I am reviewing for an upcoming programming contest and was working on the following problem:
Given a list of integers, an integer t, an integer r, and an integer p, determine if the list contains t sets of 3, r runs of 3, and p pairs of numbers. For each of these subsets, the numbers must be adjacent and any given number can only exist in one subset, if any at all.
Currently, I am solving the problem by simply finding all sets of 3, runs of 3, and pairs and then checking all permutations until finding one which has no overlapping subsets. This seems inefficient, however, and I was wondering if there was a better solution to the problem.
Here are two examples of the problem:
{1, 1, 1, 2, 3, 4, 4, 4, 5, 5, 1, 0}, t = 1, r = 1, p = 2.
This works because we have the triple {4 4 4}, the run {1 2 3}, and the pairs {1 1} and {5 5}
{1, 1, 1, 2, 3, 3}, t = 1, r = 1, p = 1
This does not work because the only triple is {1 1 1} and the only run is {1 2 3} and the two overlap (They share a 1).
I am looking for a more efficient approach to this problem.
There is probably a faster way, but you can solve this with dynamic programming. Compute a recursive function F(t, r, p, n) which decides whether it is possible to have t triples, r runs, and p pairs in the sequence starting at position 1 and ending at position n, storing the last subset of the solution ending at position n when it is possible. If a triple, run, or pair can end at position n, then you have a recursive case, either F(t-1, r, p, n-3), F(t, r-1, p, n-3), or F(t, r, p-1, n-2), with that last subset stored; otherwise you have the recursive case F(t, r, p, n-1). This looks like fourth-power complexity but it really isn't, because the value of n is always decreasing, so the complexity is actually O(n + TRP), where T is the total desired number of triples, R the total desired number of runs, and P the total desired number of pairs - so O(n^3) in the worst case. A memoized sketch is given below.
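A minimal memoized sketch of this recursion. My assumptions: a "triple" is three equal adjacent values, a "run" is three adjacent consecutive values, a "pair" is two equal adjacent values; it returns only True/False rather than reconstructing the subsets, and the name has_partition is mine.
from functools import lru_cache

def has_partition(seq, t, r, p):
    seq = tuple(seq)

    @lru_cache(maxsize=None)
    def F(t, r, p, n):
        # Can the prefix seq[:n] supply t triples, r runs and p pairs?
        if t == 0 and r == 0 and p == 0:
            return True            # nothing left to place
        if n <= 0 or t < 0 or r < 0 or p < 0:
            return False
        if F(t, r, p, n - 1):      # element at position n-1 used in no subset
            return True
        if n >= 3 and seq[n-3] == seq[n-2] == seq[n-1] and F(t - 1, r, p, n - 3):
            return True            # a triple ends here
        if (n >= 3 and seq[n-2] == seq[n-3] + 1 and seq[n-1] == seq[n-2] + 1
                and F(t, r - 1, p, n - 3)):
            return True            # a run of 3 ends here
        if n >= 2 and seq[n-2] == seq[n-1] and F(t, r, p - 1, n - 2):
            return True            # a pair ends here
        return False

    return F(t, r, p, len(seq))

print(has_partition([1, 1, 1, 2, 3, 4, 4, 4, 5, 5, 1, 0], 1, 1, 2))  # True
print(has_partition([1, 1, 1, 2, 3, 3], 1, 1, 1))                    # False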

Partitioning a superset and getting the list of original sets for each partition

Introduction
While trying to do some categorization of nodes in a graph (which will be rendered differently), I find myself confronted with the following problem:
The Problem
Given a superset of elements S = {0, 1, ..., M} and a number n of non-disjoint subsets T_i thereof, with 0 <= i < n, what is the best algorithm to find the partition of S, called P?
P: S is the union of all disjoint parts P_j of the original superset S, with 0 <= j < M, such that all elements x in a given P_j have the same list of "parents" among the "original" sets T_i.
Example
S = [1, 2, 3, 4, 5, 6, 8, 9]
T_1 = [1, 4]
T_2 = [2, 3]
T_3 = [1, 3, 4]
So all P_js would be:
P_1 = [1, 4] # all elements x have the same list of "parents": T_1, T_3
P_2 = [2] # all elements x have the same list of "parents": T_2
P_3 = [3] # all elements x have the same list of "parents": T_2, T_3
P_4 = [5, 6, 8, 9] # all elements x have the same list of "parents": only S (so they're not in any of the T_i)
Questions
What are good functions/classes in the Python packages to compute all P_j and the list of their "parents", ideally restricted to numpy and scipy? Perhaps there's already a function which does just that.
What is the best algorithm to find those partitions P_j and, for each one, the list of "parents"? Let's note T_0 = S.
I think the brute-force approach would be to generate all 2-combinations of T sets and split each pair into at most 3 disjoint sets, which would be added back to the pool of T sets, and then repeat the process until all resulting Ts are disjoint - at which point we've arrived at our answer, the set of P sets. A little problematic could be caching all the "parents" along the way.
I suspect a dynamic programming approach could be used to optimize the algorithm.
Note: I would have loved to write the math parts in latex (via MathJax), but unfortunately this is not activated :-(
The following should be linear time (in the number of the elements in the Ts).
from collections import defaultdict

S = [1, 2, 3, 4, 5, 6, 8, 9]
T_1 = [1, 4]
T_2 = [2, 3]
T_3 = [1, 3, 4]
Ts = [S, T_1, T_2, T_3]

parents = defaultdict(int)
for i, T in enumerate(Ts):
    for elem in T:
        parents[elem] += 2 ** i

children = defaultdict(list)
for elem, p in parents.items():
    children[p].append(elem)

print(list(children.values()))
Result:
[[5, 6, 8, 9], [1, 4], [2], [3]]
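A small addition (not part of the original answer): the integer keys of children also encode, in their bits, which Ts each part belongs to, so the "parents" the question asks for can be read back out. Run this after the snippet above:
# Bit i of each key says whether Ts[i] (with Ts[0] == S) contains that part.
for key, part in children.items():
    parent_indices = [i for i in range(len(Ts)) if key & (1 << i)]
    print(part, "-> parent T indices:", parent_indices)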
The way I'd do this is to construct an M × n boolean array In where In(i, j) = (S_i ∈ T_j). You can construct that in O(Σ_j |T_j|), provided you can map an element of S onto its integer index in O(1), by scanning all of the sets T and marking the corresponding bit in In.
You can then read the "signature" of each element i directly from In by concatenating row i into a binary number of n bits. The signature is precisely the equivalence relation of the partition you are seeking.
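Since the question mentioned numpy, here is a minimal sketch of this signature idea (the names In, sig and parts are mine, and elements are mapped to indices through their position in S):
import numpy as np

S = [1, 2, 3, 4, 5, 6, 8, 9]
Ts = [[1, 4], [2, 3], [1, 3, 4]]

index = {x: i for i, x in enumerate(S)}        # element -> row index, O(1) lookup
In = np.zeros((len(S), len(Ts)), dtype=bool)   # In[i, j] == (S[i] in Ts[j])
for j, T in enumerate(Ts):
    for x in T:
        In[index[x], j] = True

# Pack each row into an integer "signature"; equal signatures = same part P_j.
sig = In.astype(int).dot(2 ** np.arange(len(Ts)))
parts = {}
for i, s in enumerate(sig):
    parts.setdefault(int(s), []).append(S[i])

print(list(parts.values()))   # [[1, 4], [2], [3], [5, 6, 8, 9]]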
By the way, I'm in total agreement with you about Math markup. Perhaps it's time to mount a new campaign.
