I am facing difficulty in coming up with a solution for the problem given below:
We are given n boxes each having a weight ( it means each ball in box B_i have weight C_i),
Each box contain some balls specifically
{b1,b2,b3...,b_n} (b_i is the count of balls in Box B_i).
we have to choose m balls out of it such that sum of the weights of m chosen balls be less than a given number T.
How many ways to do it?
First, let's have a look on a similar problem:
The similar problem is: you are looking to maximize the sum (such that it is still smaller then T), you are facing a variation of subset-sum problem, which is NP-Hard. The variation with a constant number of items is discussed in this thread: Sum-subset with a fixed subset size.
An alternative way to look at the problem is with a 2-dimensional knapsack problem, where weight = cost, and an extra dimension for number of elements. This concept is discussed in this thread: What's the fastest way to solve knapsack prob with two properties
Now, look at your problem: Finding the number of possible ways to achieve a sum which is smaller/equal T is still NP-Hard.
Assume you had a polynomial algorithm to do it, let it be A.
Running A(T) and A(T-1) will give you two numbers, if A(T) > A(T-1), the answer to the subset sum problem would have been true - otherwise it is false, so given a polynomial solution to this problem, we could prove P=NP.
You can solve it by using dynamic programming techniques.
Let f[i][j][k] denote the number of ways to choose j balls from B_1 to B_i with sum of weights to be exactly k. The answer you want to get is f[n][m][T].
Initially, let f[i][j][k] = 1 for all i,j,k
for i = 1 to n
for j = 0 to m
for k = 0 to T
for x = 0 to min(b_i,j) # choose x balls from B_i
y = x * C_i
if y <= k
f[i][j][k] = f[i][j][k] * f[i-1][j-x][k-y] * Comb(b_i,x)
Comb(n,k) is the number of ways to choose k elements from n elements.
The time complexity is O(n m T b) where b is the maximum number of balls in a box.
Note that, because of the T in the big-O notation, theoretically it is NP-hard. However, in practice, when T is relatively small, this algorithm is still feasible.
Related
Given a non-negative integer n and a positive real weight vector w with dimension m, partition n into a length-m non-negative integer vector that sums to n (call it v) such that max w_iv_i is the smallest, that is, we want to find the vector v such that the maximum of element-wise product between w and v is the smallest. There maybe several partitions, and we only want the smallest value of max w_iv_i among all possible v.
Seems like this problem can use a greedy algorithm to solve. From a target vector v for n-1, we add 1 to each entry, and find the minimum among those m vectors. but I don't think it's correct. The intuition is that it might add "over" the minimum. That is, there exists another partition not yielded by the add 1 procedure that falls in between the "minimum" of n-1 produced by this greedy algorithm and that of n produced by this greedy algorithm. Can anyone prove if this is correct or incorrect?
If you already know the maximum element-wise product P, then you can just set vi = floor(P/wi) until you run out of n.
Use binary search to find the smallest possible value of P.
The largest guess you need to try is n * min(w), so that means testing log(n) + log(min(w)) candidates, spending O(m) time for each test, or O(m*(log n + log(min(w))) all together.
Given a list of integers l, how can I partition it into 2 lists a and b such that d(a,b) = abs(sum(a) - sum(b)) is minimum. I know the problem is NP-complete, so I am looking for a pseudo-polynomial time algorithm i.e. O(c*n) where c = sum(l map abs). I looked at Wikipedia but the algorithm there is to partition it into exact halves which is a special case of what I am looking for...
EDIT:
To clarify, I am looking for the exact partitions a and b and not just the resulting minimum difference d(a, b)
To generalize, what is a pseudo-polynomial time algorithm to partition a list of n numbers into k groups g1, g2 ...gk such that (max(S) - min(S)).abs is as small as possible where S = [sum(g1), sum(g2), ... sum(gk)]
A naive, trivial and still pseudo-polynomial solution would be to use the existing solution to subset-sum, and repeat for sum(array)/2to 0 (and return the first one found).
Complexity of this solution will be O(W^2*n) where W is the sum of the array.
pseudo code:
for cand from sum(array)/2 to 0 descending:
subset <- subsetSumSolver(array,cand)
if subset != null:
return subset
The above will return the maximal subset that is lower/equals sum(array)/2, and the other part is the complement for the returned subset.
However, the dynamic programming for subset-sum should be enough.
Recall that the formula is:
f(0,i) = true
f(x,0) = false | x != 0
f(x,i) = f(x-arr[i],i-1) OR f(x,i-1)
When building the matrix, the above actually creates you each row with value lower than the initial x, if you input sum(array)/2 - it's basically all values.
After you generate the DP matrix, just find the maximal value of x such that f(x,n)=true, and this is the best partition you can get.
Complexity in this case is O(Wn)
You can phrase this as a 0/1 integer linear programming optimization problem. Let wi be the ith number, and let xi be a 0/1 variable which indicates whether wi is in the first set or not. Then you want to minimize sum(xi wi) - sum((1 - xi) wi) subject to
sum(xi wi) >= sum((1 - xi) wi)
and also subject to all xi being 0 or 1. There has been a lot of research into optimizing 0/1 linear programming solvers. For large total sum W this may be an improvement over the O(W n) pseudo-polynomial time algorithm presented because the W factor is scary.
My first thought is to:
Sort list of integers
Create two empty lists A and B
While iterating from biggest integer to smallest integer...add next integer to the list with the smallest current sum.
This is, of course, not guaranteed to give you the best result but you can bound the result it will give you by the size of the biggest integer in your list
I have to implement an algorithm that solves the multi-selection problem.
The multiselection problem is:
Given a set S of n elements drawn from a linearly ordered set, and a set K = {k1, k2,...,kr} of positive integers between 1 and n, the multiselection problem is to select the ki-th smallest element for all values of i, 1 <= i <= r
I need to solve the average case on Θ(n log r)
I've found a paper that implements the solution I need, but it assumes that there are no repeated numbers on the set S. The problem is that I can't assume that and I don't know how to adapt the algorithm of that paper to support repeated numbers.
The paper is here: http://www.ccse.kfupm.edu.sa/~suwaiyel/publications/multiselection_parCom.pdf
and the algorithm is on the second page. Any tips are welcome!
For posterity: the algorithm to which Ivan refers is to sort K, then solve the problem recursively as follows. Use QuickSelect to find the ki-th smallest element x where i is ceil(r/2), then recurse on the smaller halves of K and S, and the larger halves of K and S, splitting K about i and S about x.
Finding algorithms that work in the presence of degeneracy (here, equal elements) is often not a high priority for authors of theoretical works, because it makes the presentation of the common case more difficult and doesn't often play a role in determining the computational complexity of the problem. This is essentially a one-dimensional problem, and the black box solution is easy; replace the i-th element of the input yi by (yi, i) and break ties in the comparisons using the second component.
In practice, we can do better. Instead of recursing on {y : y in S, y < x} and {y : y in S, y > x}, use a three-way partitioning algorithm about x (see, e.g., every sufficiently complete treatment of QuickSort), then divide the array S by index instead of value.
I've given some intervals I = {I(1), I(2), ..., I(m)} for I(i) = [a_i, b_i] (1<=a_i<=b_i<=n). You may suppose that intervals cover each other(sorry i'm poor in english), so there's no intervals such as {[1,5], [3,6]}, {[2,5], [5,7]}. And {[1,1], [2,2], ..., [n,n]} must be included in I.
Let's suppose C(i) = b_i - a_i + 1.
I want to find {I(c_1), I(c_2), ..., I(c_k)} that are non overlapped by each other, and C(c_1) + C(c_2) + ... + C(c_k) = T. (1 <= T <= n).
I could find O(n*T) DP solution using Subset Sum problem, and I think it's NP, but I'm not sure. Can I optimize more than O(n*T)?
The problem is reduceable from the Subset Sum problem (Given a set of numbers and a target number, find out if there is a subset that sums to this target) with a simple reduction:
Given an instance of subset-sum: S={c_1,c_2,..,c_n},T - create an instance of this problem by creating n non overlapping intervals, interval i, with c_i points (easy to do by ascending order). The same T remains.
Now, the answer to the subset-sum problem is true if and only if there is a subset of intervals that sums to T. It is basically the same problem, since all intervals do not overlap each other by definition of the problem.
From this we can conclude - your problem is NP-Hard.
Moreover, if we could solve the problem better then O(T*n), we could use the same approach to solve the subset sum problem better then O(T*n)1,2.
However, AFAIK, best pseudo polynomial solution to subset sum is O(T*n), so if you have such solution - stick with it.
(1) Converting the problem is O(n)
(2) This claim is true for this specific reduction alone, and NOT for the general case of polynomial reductions.
Given an array of integers, find a set of at least one integer which sums to 0.
For example, given [-1, 8, 6, 7, 2, 1, -2, -5], the algorithm may output [-1, 6, 2, -2, -5] because this is a subset of the input array, which sums to 0.
The solution must run in polynomial time.
You'll have a hard time doing this in polynomial time, as the problem is known as the Subset sum problem, and is known to be NP-complete.
If you do find a polynomial solution, though, you'll have solved the "P = NP?" problem, which will make you quite rich.
The closest you get to a known polynomial solution is an approximation, such as the one listed on Wikipedia, which will try to get you an answer with a sum close to, but not necessarily equal to, 0.
This is a Subset sum problem, It's NP-Compelete but there is pseudo polynomial time algorithm for it. see wiki.
The problem can be solved in polynomial if the sum of items in set is polynomially related to number of items, from wiki:
The problem can be solved as follows
using dynamic programming. Suppose the
sequence is
x1, ..., xn
and we wish to determine if there is a
nonempty subset which sums to 0. Let N
be the sum of the negative values and
P the sum of the positive values.
Define the boolean-valued function
Q(i,s) to be the value (true or false)
of
"there is a nonempty subset of x1, ..., xi which sums to s".
Thus, the solution to the problem is
the value of Q(n,0).
Clearly, Q(i,s) = false if s < N or s
P so these values do not need to be stored or computed. Create an array to
hold the values Q(i,s) for 1 ≤ i ≤ n
and N ≤ s ≤ P.
The array can now be filled in using a
simple recursion. Initially, for N ≤ s
≤ P, set
Q(1,s) := (x1 = s).
Then, for i = 2, …, n, set
Q(i,s) := Q(i − 1,s) or (xi = s) or Q(i − 1,s − xi) for N ≤ s ≤ P.
For each assignment, the values of Q
on the right side are already known,
either because they were stored in the
table for the previous value of i or
because Q(i − 1,s − xi) = false if s −
xi < N or s − xi > P. Therefore, the
total number of arithmetic operations
is O(n(P − N)). For example, if all
the values are O(nk) for some k, then
the time required is O(nk+2).
This algorithm is easily modified to
return the subset with sum 0 if there
is one.
This solution does not count as
polynomial time in complexity theory
because P − N is not polynomial in the
size of the problem, which is the
number of bits used to represent it.
This algorithm is polynomial in the
values of N and P, which are
exponential in their numbers of bits.
A more general problem asks for a
subset summing to a specified value
(not necessarily 0). It can be solved
by a simple modification of the
algorithm above. For the case that
each xi is positive and bounded by the
same constant, Pisinger found a linear
time algorithm.[2]
It is well known Subset sum problem which NP-complete problem.
If you are interested in algorithms then most probably you are math enthusiast that I advise you look at
Subset Sum problem in mathworld
and here you can find the algorithm for it
Polynomial time approximation algorithm
initialize a list S to contain one element 0.
for each i from 1 to N do
let T be a list consisting of xi+y,
for all y in S
let U be the union of T and S
sort U
make S empty
let y be the smallest element of U
add y to S
for each element z of U in
increasing order do //trim the list by
eliminating numbers
close one to another
if y<(1-c/N)z, set y=z and add z to S
if S contains a number between (1-c)s and s, output yes, otherwise no