Algorithms for bucketizing integers into buckets with zero sums

Suppose we have an array of integers (both negative and positive) A[1 ... n] such that all the elements sum to zero. Whenever a bunch of integers sums to zero, I will call it a group, and I want to split A into as many disjoint groups as possible. Can you suggest any paper discussing this very same problem?

It sounds like your problem consists of two NP-Complete problems.
The first would be finding all subsets that solve the Subset Sum problem. This problem does have an exponential time complexity (as implied by amit in the comments), but it is a very reasonable extension of the Subset Sum problem from a theoretical standpoint. For example, if you can solve the Subset Sum problem by dynamic programming and generate the canonical 2D array as a result, this array will contain enough information to generate all possible solutions using a traceback.
The second NP-Complete problem embedded within your problem is the Integer Linear Programming problem. Given all possible subsets solving the Subset Sum problem, N total, we want to select n of them, 0 <= n <= N, such that n is maximized and no element of A appears in more than one chosen subset.
I doubt there is a publication devoted to describing this problem because it seems to involve a straightforward application of known theory.
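To make the first step concrete, here is a minimal brute-force sketch (names are mine, not from any paper) that enumerates every zero-sum subset by index; as noted above, this is exponential in n and only viable for small arrays:

```python
from itertools import combinations

def zero_sum_subsets(a):
    """Enumerate every non-empty subset of `a` (as index tuples) summing to zero.

    Exponential in len(a); a DP table with traceback prunes this in practice,
    but the total number of solutions can still be exponential.
    """
    n = len(a)
    result = []
    for r in range(1, n + 1):
        for idx in combinations(range(n), r):
            if sum(a[i] for i in idx) == 0:
                result.append(idx)
    return result
```

Selecting a maximum collection of pairwise-disjoint subsets from this list is then the (also NP-hard) packing step described above.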

Related

Subset sum with unlimited elements

I am trying to solve a coding problem at topcoder for practice. I believe I have solved it partly, but am struggling with the other half.
The essence of the problem is "Given a set P with positive integers, find the smallest set of numbers that adds up to a sum S. You can use an element of the set more than once. There may be a case where the sum is not attainable as well."
For small inputs, the exponential algorithm of searching through all possible subsets works. However, the size of the set can go up to 1024.
What is the idea behind solving this problem? Is this problem even an extension of subset-sum?
[EDIT]
This is the problem on topcoder : https://community.topcoder.com/stat?c=problem_statement&pm=8571
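Since elements can be reused, this is the unbounded "coin change" formulation rather than plain subset sum: a 1-D DP over target sums solves it in O(|P| * S). A minimal sketch (function name is mine):

```python
def fewest_terms(p, s):
    """Fewest elements of p (with repetition allowed) summing to s; -1 if unattainable."""
    INF = float("inf")
    dp = [0] + [INF] * s          # dp[t] = fewest elements summing to t
    for t in range(1, s + 1):
        for x in p:
            if x <= t and dp[t - x] + 1 < dp[t]:
                dp[t] = dp[t - x] + 1
    return dp[s] if dp[s] < INF else -1
```

Note the table is indexed by the target sum, not by subsets, which is why the set size of 1024 is not a problem.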

Fewest subsets with sum less than N

I have a specific sub-problem for which I am having trouble coming up with an optimal solution. This problem is similar to the subset sum group of problems as well as space filling problems, but I have not seen this specific problem posed anywhere. I don't necessarily need the optimal solution (as I am relatively certain it is NP-hard), but an effective and fast approximation would certainly suffice.
Problem: Given a list of positive integers, find the fewest number of disjoint subsets covering the entire list such that each subset sums to less than N. Obviously no integer in the original list can be greater than N.
In my application I have many lists and I can concatenate them into columns of a matrix as long as they fit in the matrix together. For downstream purposes I would like to have as little "wasted" space in the resulting ragged matrix, hence the space filling similarity.
Thus far I am employing a greedy-like approach, processing from the largest integers down and finding the largest integer that fits into the current subset under the limit N. Once the smallest integer no longer fits into the current subset I proceed to the next subset similarly until all numbers are exhausted. This almost certainly does not find the optimal solution, but was the best I could come up with quickly.
BONUS: My application actually requires batches, where there is a limit on the number of subsets in each batch (M). Thus the larger problem is to find the fewest batches where each batch contains M subsets and each subset sums to less than N.
Straight from Wikipedia (with some bold amendments):
In the bin packing problem, objects [integers] of different volumes [values] must be packed into a finite number of bins [sets] or containers, each of volume V [subset sum < V], in a way that minimizes the number of bins [sets] used. In computational complexity theory, it is a combinatorial NP-hard problem.
https://en.wikipedia.org/wiki/Bin_packing_problem
As far as I can tell, this is exactly what you are looking for.
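A common heuristic for this formulation, close to the greedy described in the question, is first-fit decreasing: sort the integers in descending order and place each into the first bin that still has room. A minimal sketch (assuming the capacity constraint means each subset sums to at most N):

```python
def first_fit_decreasing(items, n):
    """Pack positive integers into bins of capacity n; returns the bins' contents.

    Heuristic only -- it does not guarantee the optimal bin count.
    """
    bins = []  # each bin is [remaining capacity, contents]
    for item in sorted(items, reverse=True):
        for b in bins:
            if b[0] >= item:          # first bin with room wins
                b[0] -= item
                b[1].append(item)
                break
        else:                          # no existing bin fits: open a new one
            bins.append([n - item, [item]])
    return [contents for _, contents in bins]
```

For the batched bonus variant, one simple approach is to run this heuristic and then slice the resulting bin list into consecutive batches of M.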

How to choose the space of optimal substructures for dynamic programming algorithms?

I am reading the dynamic programming chapter of Introduction to Algorithms by Cormen et al. and trying to understand how to characterize the space of subproblems. They give two examples of dynamic programming, both with an input of size n:
Rod cutting problem (cut a rod of size n optimally)
Matrix parenthesization problem (parenthesize the matrix product A1 . A2 . A3 ... An optimally to get the least number of scalar multiplications)
For the first problem, they choose subproblems of the following form: make a cut of length k, assume the left piece resulting from the cut is not cut any further, and recurse only on the right piece, giving a single subproblem of size (n-k).
But for the second problem they choose subproblems of the type Ai...Aj where 1 <= i <= j <= n. Why did they keep both ends open for this problem? Why not close one end and consider only subproblems of size (n-k)? Why are both i and j needed here instead of a single split point k?
It is an art. There are many types of dynamic programming problems, and it is not easy to define one way to work out what dimensions of space we want to solve sub-problems for.
It depends on how the sub-problems interact, and very much on the size of each dimension of space.
Dynamic programming is a general term describing the caching or memoization of sub-problems to solve larger problems more efficiently. But there are so many different problems that can be solved by dynamic programming in so many different ways, that I cannot explain it all, unless you have a specific dynamic programming problem that you need to solve.
All I can suggest when solving a problem is:
if you know how to solve one problem, you can use similar techniques for similar problems.
try different approaches and estimate the order of complexity (in time and memory) in terms of the input size of each dimension; then, given the size of each dimension, see whether it executes fast enough and within memory limits.
Some algorithms that can be described as dynamic programming, include:
shortest path algorithms (Dijkstra, Floyd-Warshall, ...)
string algorithms (longest common subsequence, Levenshtein distance, ...)
and much more...
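As a concrete instance of the string-algorithm entry above, the longest common subsequence problem takes prefix pairs of the two inputs as its subproblem space, one index per input sequence. A minimal sketch:

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of x and y.

    dp[i][j] = LCS length of the prefixes x[:i] and y[:j].
    """
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]
```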
Vazirani's technical note on dynamic programming, http://www.cs.berkeley.edu/~vazirani/algorithms/chap6.pdf, has some useful ways to create subproblems given an input. I have added some other ways to the list below:
Input x_1, x_2, ..., x_n. Subproblem is x_1, ..., x_i.
Input x_1, x_2, ..., x_n. Subproblem is x_i, ..., x_j.
Input x_1, x_2, ..., x_n and y_1, y_2, ..., y_m. Subproblem is x_1, ..., x_i and y_1, ..., y_j.
Input is a rooted tree. Subproblem is a rooted subtree.
Input is a matrix. Subproblem is a submatrix of some size that shares a corner with the original matrix.
Input is a matrix. Subproblem is any submatrix.
Which subproblems to use usually depends on the problem. Try out these known variations and see which one suits your needs best.
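To make the contrast in the question concrete: splitting Ai...Aj at position k leaves two pieces, Ai...Ak and A(k+1)...Aj, and the left piece is itself open at both ends, so the subproblems need two indices, not one. A minimal sketch of the standard interval DP:

```python
def matrix_chain_order(dims):
    """Minimum scalar multiplications to compute A1...An, where Ai is dims[i-1] x dims[i].

    m[i][j] = cost of the subchain Ai...Aj; both ends vary, unlike rod cutting.
    """
    n = len(dims) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # 1-indexed table
    for length in range(2, n + 1):              # chains of increasing length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = min(
                m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                for k in range(i, j)            # try every split point
            )
    return m[1][n]
```

In rod cutting, one cut produces a piece that is never revisited, so the remaining subproblem is a single prefix/suffix of size (n-k); here both halves of a split must be solved recursively.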

Test whether k elements of a set add up to a certain number

Is there a way to determine if k elements of a set add up to a certain number in polynomial time?
How big is the number?
This is a variation on the subset sum problem, which is well known and NP-complete. However, dynamic programming makes it polynomial whenever the set of possible sums the subsets can take grows only polynomially. With general integers that isn't true, but with numbers picked from a restricted range it happens surprisingly often.
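One way to make that precise is a pseudo-polynomial DP over (count, sum) pairs: the running time is proportional to the number of reachable pairs, which is polynomial when the sums come from a restricted range. A minimal sketch:

```python
def k_subset_sum(values, k, target):
    """True iff some k distinct elements of values sum to target."""
    reachable = {(0, 0)}  # (elements used, sum so far)
    for v in values:
        # extend every existing state by v; the snapshot semantics of the
        # comprehension ensure each element is used at most once
        reachable |= {(c + 1, s + v) for c, s in reachable if c < k}
    return (k, target) in reachable
```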

Maximum two-dimensional subset-sum

I'm given a task to write an algorithm that computes the maximum-sum two-dimensional subarray (submatrix) of a matrix of integers. However, I'm not interested in help with such an algorithm; I'm more interested in knowing the best worst-case complexity with which this can be solved.
Our current algorithm is O(n^3).
I've been considering something like divide and conquer: splitting the matrix into a number of sub-matrices by simply adding up the elements within them, thereby limiting the number of matrices one has to consider in order to find an approximate solution.
Worst case (exhaustive search) is definitely no worse than O(n^3). There are several descriptions of this on the web.
Best case can be far better: O(1). If all of the elements are non-negative, then the answer is the matrix itself. If the elements are non-positive, the answer is the element that has its value closest to zero.
Likewise if there are entire rows/columns on the edges of your matrix that are nothing but non-positive integers, you can chop these off in your search.
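For reference, the O(n^3) bound comes from the standard trick: fix a top and bottom row, collapse the strip between them into per-column sums, and run Kadane's 1-D maximum-subarray scan over those sums. A minimal sketch:

```python
def max_submatrix_sum(mat):
    """Maximum sum over all contiguous submatrices, in O(rows^2 * cols)."""
    rows, cols = len(mat), len(mat[0])
    best = mat[0][0]
    for top in range(rows):
        col_sums = [0] * cols
        for bottom in range(top, rows):
            for c in range(cols):                 # extend the strip one row down
                col_sums[c] += mat[bottom][c]
            cur = 0                               # Kadane's scan over the strip
            for s in col_sums:
                cur = max(s, cur + s)
                best = max(best, cur)
    return best
```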
I've figured that there isn't a better way to do it, at least none known yet.
And I'm going to stick with the solution I have, mainly because it's simple.
