Partitioning of an array into k blocks - algorithm

Given an array of positive integers {a1, a2, ..., an}, you are required to partition the array into k blocks/partitions such that the maximum of the sums of the integers in each partition is as small as possible. Restriction: you cannot alter the order in which the numbers appear (example: if you have {2, 5, 80, 1, 200, 80, 8000, 90}, one partition CANNOT be {2, 80, 1, 90}). I need as output the partition values and the maximum partition sum. Some kind of Knuth's algorithm, or anything else? Any suggestion? I have no idea...
So, for example:
{11, 16, 5, 5, 12, 10}, k = 3
The best partitioning according to the problem is:
[(11), (16, 5, 5), (12, 10)]

Given that you can't change the order of the numbers in the array, my suggested solution is:
Binary search on the maximum sum.
Given the current candidate maximum sum, use a greedy algorithm to see if k blocks are enough to cover the whole array.
The algorithm looks like:
l = 0
r = sum(A)
while l <= r:
    mid = (l + r) // 2
    if greedy(mid):
        r = mid - 1
    else:
        l = mid + 1
The final maximum sum will be l, and you can use it to construct the partition using greedy.
And the function greedy will look like:
def greedy(s):
    # Can A be covered by at most k blocks, each with sum <= s?
    now_k = 1              # blocks opened so far
    now_sum = 0
    for i in A:
        if i > s:          # a single element larger than s can never fit in any block
            return False
        if now_sum + i > s:
            now_k += 1     # close the current block and start a new one with i
            now_sum = i
        else:
            now_sum += i
    return now_k <= k
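Once the binary search has settled on l, one more greedy pass over A can emit the actual blocks. A minimal sketch of that reconstruction (the helper name build_partition is mine, not part of the answer above):

def build_partition(A, s):
    # Greedily cut a new block whenever adding the next element would exceed s.
    blocks, current, total = [], [], 0
    for x in A:
        if current and total + x > s:
            blocks.append(current)
            current, total = [], 0
        current.append(x)
        total += x
    blocks.append(current)
    return blocks

# build_partition([11, 16, 5, 5, 12, 10], 26)
# -> [[11], [16, 5, 5], [12, 10]], maximum block sum 26

If this produces fewer than k blocks, any block with more than one element can be split further without increasing the maximum sum.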

Related

Algorithm for obtaining from a set all sequences of size N that include a given subset

To be more specific, the sequences that we want to output can have the elements of the input subset in a different order than the one in the input subset.
EXAMPLE: Let's say we have a set {1, 9, 1, 5, 6, 9, 0 , 9, 9, 1, 10} and an input subset {9, 1}.
We want to find all subsets of size 3 that include all elements of the input subset.
This would return the sets {1, 9, 1}, {9, 1, 5}, {9, 9, 1}, {9, 1, 10}.
Of course, the algorithm with the lowest complexity possible is preferred.
EDIT: Edited with better terminology. Also, here's what I considered, in pseudocode:
1. For each sequence s in the set of size n, do the following:
2. Create a list l that is a copy of the input subset i.
3. j = 0
4. For each element e in s, do the following:
5. Check if e is in l.
If it is, remove it from l.
6. If l is empty, add s to the sequences to return,
skip to the next sequence (go to step 1)
7. Increment j.
8. if j == n, go to the next sequence (go to step 1)
This is what I came up with, but it takes an awful amount of time because we consider EVERY sequence, with no memory whatsoever of previously scanned ones. I really don't have an idea how to implement such memory; this is all very new to me.
You could simply find all occurrences of that subset and then produce all lists that contain that combination.
In python for example:
def get_Allsubsets_withSubset(original_list, N, subset):
    subset_len = len(subset)
    overflow = N - subset_len              # how far the window may extend past the subset
    index_of_occurrence = []
    # find every position where the subset occurs as a contiguous run
    for index in range(len(original_list) - subset_len + 1):
        if original_list[index:index + subset_len] == subset:
            index_of_occurrence.append(index)
    final_result = []
    for index in index_of_occurrence:
        for value in range(overflow + 1):
            i = index - value              # slide the window left, but not past the start
            if i < 0:
                continue
            window = original_list[i:i + N]
            if len(window) == N:           # skip windows cut short at the end of the list
                final_result.append(window)
    return final_result
It's not beautiful, but I would say it's a start.
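With the off-by-one fixes above, the example from the question gives exactly the expected output, though note the function only finds windows in which the subset occurs as a contiguous run in the given order (which happens to be enough for this input):

get_Allsubsets_withSubset([1, 9, 1, 5, 6, 9, 0, 9, 9, 1, 10], 3, [9, 1])
# -> [[9, 1, 5], [1, 9, 1], [9, 1, 10], [9, 9, 1]]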

Subset sum problem, break ties by initial ordering

Given a set of numbers, find the subset s.t. the sum of all numbers in the subset is equal to N. Break ties between all feasible subsets by the initial order of their elements. Assume that the numbers are integers, and there exists such subset that perfectly sums up to N.
For example, given an array [2, 5, 1, 3, 4, 5] and N = 6, the output needs to be {2, 1, 3}. Although {5, 1}, {2, 4} and {1, 5} are also subsets whose total sums up to 6, we need to return {2, 1, 3} according to the ordering of the numbers in the array.
For the classic problem I know how to do it with dynamic programming but to break ties I can't think of a better approach besides finding ALL possible subsets first and then choosing the one with the best ordering. Any ideas?
I will try to provide an interesting way of breaking ties.
Let's assign another value to the items; the value of the i-th item will be called v[i].
Let there be T items; the i-th item gets value v[i] = 2^(T - i) and weight w[i] equal to the number itself.
Among all the subsets of maximum total weight (which will be N, since a subset summing exactly to N is guaranteed to exist), we will look for the one that has the maximum accumulated value of the items. I have set the values as powers of two so as to guarantee that an item that comes first in the array is prioritized over all of its successors.
Here is a practical example:
Consider N = 8, as the limit weight that we have.
Items {8, 4, 2, 2}
Values {8, 4, 2, 1}
We will have two distinct subsets that have weight sum = 8, {8} and {4, 2, 2}.
But the first has accumulated value = 8, and the other one has accumulated_value = 7, so we will choose {8} over {4, 2, 2}.
The idea now is to formulate a dynamic programming that takes into account the total value.
DP[i][W] = the maximum accumulated value among all subsets of the first i items whose total weight is exactly W.
I will give a pseudocode for the solution
Set all DP[i][j] = -infinity
DP[0][0] = 0
for(int item = 1; item <= T; ++item )
{
    for(int weight = 0; weight <= N; ++weight )
    {
        DP[item][weight] = DP[item - 1][weight];   // not using the current item
        if( w[item] <= weight )
        {
            DP[item][weight] = max( DP[item][weight], DP[item - 1][weight - w[item]] + v[item] );
        }
    }
}
How to recover the items is really trivial after running the algorithm.
DP[T][N] will be a sum of powers of two (or -infinity if it is not possible to select items that sum up to N), and the i-th item belongs to the answer if and only if (DP[T][N] & v[i]) == v[i].
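A compact Python sketch of this idea, using 0-indexed items, v[i] = 2^(T-1-i), and a one-dimensional knapsack over exact weights (the function name is mine):

def subset_sum_earliest(a, N):
    # Earlier items get larger power-of-two values, so maximizing the
    # accumulated value prefers the earliest-ordered feasible subset.
    T = len(a)
    v = [1 << (T - 1 - i) for i in range(T)]
    NEG = float("-inf")
    dp = [NEG] * (N + 1)       # dp[w] = best value with total weight exactly w
    dp[0] = 0
    for i in range(T):
        for w in range(N, a[i] - 1, -1):   # reverse order: each item used at most once
            if dp[w - a[i]] != NEG and dp[w - a[i]] + v[i] > dp[w]:
                dp[w] = dp[w - a[i]] + v[i]
    if dp[N] == NEG:
        return None            # no subset sums to N
    # Item i belongs to the answer iff its power-of-two bit is set in dp[N].
    return [a[i] for i in range(T) if dp[N] & v[i]]

subset_sum_earliest([2, 5, 1, 3, 4, 5], 6)   # -> [2, 1, 3]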

Finding a group of subsets that does not overlap

I am reviewing for an upcoming programming contest and was working on the following problem:
Given a list of integers, an integer t, an integer r, and an integer p, determine if the list contains t sets of 3, r runs of 3, and p pairs of numbers. For each of these subsets, the numbers must be adjacent and any given number can only exist in one subset, if any at all.
Currently, I am solving the problem by simply finding all sets of 3, runs of 3, and pairs and then checking all permutations until finding one which has no overlapping subsets. This seems inefficient, however, and I was wondering if there was a better solution to the problem.
Here are two examples of the problem:
{1, 1, 1, 2, 3, 4, 4, 4, 5, 5, 1, 0}, t = 1, r = 1, p = 2.
This works because we have the triple {4 4 4}, the run {1 2 3}, and the pairs {1 1} and {5 5}
{1, 1, 1, 2, 3, 3}, t = 1, r = 1, p = 1
This does not work because the only triple is {1 1 1} and the only run is {1 2 3} and the two overlap (They share a 1).
I am looking for a more efficient approach to this problem.
There is probably a faster way, but you can solve this with dynamic programming. Compute a recursive function F(t,r,p,n) which decides whether it is possible to have t triples, r runs, and p pairs in the sequence starting at position 1 and ending at position n, storing the last subset of the solution ending at position n if one exists. If you can have a triple, run, or pair ending at position n, then you have a recursive case: F(t-1,r,p,n-3), F(t,r-1,p,n-3), or F(t,r,p-1,n-2) respectively, with that last subset stored; otherwise you have the recursive case F(t,r,p,n-1). This looks like fourth-power complexity, but it really isn't, because the value of n is always decreasing, so the complexity is actually O(n + T*R*P), where T is the total desired number of triples, R is the total desired number of runs, and P is the total desired number of pairs. So O(n^3) in the worst case.
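A memoised Python sketch of that recursion (argument order rearranged, names mine; a "run" is assumed to mean three adjacent values increasing by one, as in the {1 2 3} example, and the branch that leaves the last element unused is always tried):

from functools import lru_cache

def can_partition(a, t, r, p):
    # F(n, t, r, p): can the first n elements supply t triples, r runs of 3,
    # and p adjacent pairs, with no element used in more than one block?
    @lru_cache(maxsize=None)
    def F(n, t, r, p):
        if t == 0 and r == 0 and p == 0:
            return True
        if n == 0:
            return False
        if F(n - 1, t, r, p):                       # a[n-1] left unused
            return True
        if (t > 0 and n >= 3 and a[n-3] == a[n-2] == a[n-1]
                and F(n - 3, t - 1, r, p)):         # triple ends at position n
            return True
        if (r > 0 and n >= 3 and a[n-2] == a[n-3] + 1 and a[n-1] == a[n-2] + 1
                and F(n - 3, t, r - 1, p)):         # run of 3 ends at position n
            return True
        if (p > 0 and n >= 2 and a[n-2] == a[n-1]
                and F(n - 2, t, r, p - 1)):         # pair ends at position n
            return True
        return False
    return F(len(a), t, r, p)

can_partition([1, 1, 1, 2, 3, 4, 4, 4, 5, 5, 1, 0], 1, 1, 2)   # True
can_partition([1, 1, 1, 2, 3, 3], 1, 1, 1)                     # False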

Efficient data structure for a list of index sets

I am trying to explain by example:
Imagine a list of numbered elements E = [elem0, elem1, elem2, ...].
One index set could now be {42, 66, 128}, referring to elements in E. The ordering in this set is not important, so {42, 66, 128} == {66, 128, 42}, but each element appears at most once in any given index set (so it is an actual set).
What I want now is a space efficient data structure that gives me another ordered list M that contains index sets that refer to elements in E. Each index set in M will only occur once (so M is a set in this regard) but M must be indexable itself (so M is a List in this sense, whereby the precise index is not important). If necessary, index sets can be forced to all contain the same number of elements.
For example, M could look like:
0: {42, 66, 128}
1: {42, 66, 9999}
2: {1, 66, 9999}
I could now do the following:
for(i in M[2]) { element = E[i]; /* do something with E[1], E[66], and E[9999] */ }
You probably see where this is going: You may now have another map M2 that is an ordered list of sets pointing into M which ultimately point to elements in E.
As you can see in this example, index sets can be relatively similar (M[0] and M[1] share the first two entries, M[1] and M[2] share the last two) which makes me think that there must be something more efficient than the naive way of using an array-of-sets. However, I may not be able to come up with a good global ordering of index entries that guarantee good "sharing".
I could think of anything ranging from representing M as a tree (where M's index comes from the depth-first search ordering or something) to hash maps of union-find structures (no idea how that would work though:)
Pointers to any textbook data structure for something like this are highly welcome (is there anything in the world of databases?), but I would also appreciate it if you propose a "self-made" solution or just random ideas.
Space efficiency is important for me because E may contain thousands or even a few million elements, (some) index sets are potentially large, similarities between at least some index sets should be substantial, and there may be multiple layers of mappings.
Thanks a ton!
You may combine all numbers from the M sets, remove duplicates, and call the result UniqueM.
Each M[X] collection is then converted to a bit mask over UniqueM. A single int value can store 32 numbers (to support an unlimited count you store an array of ints; with an array of size 10 you can cover 320 distinct elements in total), and a long type can store 64 bits.
E: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
M[0]: {6, 8, 1}
M[1]: {2, 8, 1}
M[2]: {6, 8, 5}
Will be converted to:
UniqueM: {6, 8, 1, 2, 5}
M[0]: 11100 {this is 7}
M[1]: 01110 {this is 14}
M[2]: 11001 {this is 19}
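(The bit strings above are written least-significant bit first.) A small Python sketch of that conversion; the helper name is mine:

def build_unique_and_masks(index_sets):
    # Collect the distinct indices in first-seen order, then encode each set as a
    # bit mask over positions in that order (bit p set <=> unique[p] is in the set).
    unique, position = [], {}
    for s in index_sets:
        for x in s:
            if x not in position:
                position[x] = len(unique)
                unique.append(x)
    masks = [sum(1 << position[x] for x in s) for s in index_sets]
    return unique, masks

build_unique_and_masks([[6, 8, 1], [2, 8, 1], [6, 8, 5]])
# -> ([6, 8, 1, 2, 5], [7, 14, 19])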
Note: you may also combine my approach and ring0's: instead of rearranging E, build a new UniqueM and use intervals inside it.
It will be pretty hard to beat an index. You could save some space by using the right data type (e.g. in GNU C, short if E has fewer than 64K elements, int if fewer than 4G, ...).
Besides,
since you say the order in E is not important, you could sort E in a way that maximizes runs of consecutive elements matching the Ms as much as possible.
For instance,
E: { 1,2,3,4,5,6,7,8 }
0: {1,3,5,7}
1: {1,3,5,8}
2: {3,5,7,8}
By re-arranging E
E: { 1,3,5,7,8,2,4,6 }
and using E indexes, not values, you could define the Ms based on subsets of E, giving indexes
0: {0-3} // E[0]: 1, E[1]: 3, E[2]: 5, E[3]: 7 etc...
1: {0-2,4}
2: {1-3,4}
This way:
you use indexes instead of the raw numbers (indexes are usually smaller, and never negative),
the Ms are made of sub-ranges, 0-3 meaning 0,1,2,3.
The difficult part is designing the algorithm that re-arranges E so as to maximize the sub-range sizes and thus minimize the sizes of the Ms.
E rearrangement algo suggestion
sort all Ms
process all Ms:
Build a map which gives, for an element 'x', its list of neighbours 'y' along with a score 'z': the number of times 'y' appears just after 'x'.
Map map (x,y) -> z
for m in Ms
    for e,f in m            // e and f are consecutive elements of m
        if ( ! map(e,f) ) map(e,f) = 1
        else map(e,f)++
    rof
rof
Get E rearranged
ER = {}                     // E rearranged
Map mas = sort_map(map)     // mas(x) -> list(y), the 'y' sorted descending on their score 'z'
e = get_min_elem(mas)       // init with the lowest element (regardless of its 'z' scores)
while (mas has elements)
    ER += e                 // append element e to ER
    if ( empty(mas(e)) )
        e = get_min_elem(mas)   // no neighbour left: take the next lowest remaining value
    else
        f = mas(e)[0]       // most likely neighbour of e, i.e. first in its list
        delete mas(e)[0]
        e = f
    fi
elihw
The algo (the map) should take O(n*m) space, with n elements in E and m elements across all the Ms.
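A rough Python translation of this sketch, with helper names of my own (elements of E that appear in no M can simply be appended at the end):

from collections import defaultdict

def rearrange_E(ms):
    # Score consecutive pairs across all (sorted) index sets, then chain elements
    # greedily so that frequently-adjacent values end up next to each other.
    score = defaultdict(int)                 # score[(x, y)] = how often y follows x
    elements = set()
    for m in ms:
        m = sorted(m)
        elements.update(m)
        for x, y in zip(m, m[1:]):
            score[(x, y)] += 1
    succ = defaultdict(list)                 # successors of x, best score first
    for (x, y), z in sorted(score.items(), key=lambda kv: -kv[1]):
        succ[x].append(y)
    er, placed = [], set()
    ordered = sorted(elements)
    e = ordered[0]
    while len(placed) < len(elements):
        er.append(e)
        placed.add(e)
        nxt = next((y for y in succ[e] if y not in placed), None)
        if nxt is None:                      # no unplaced neighbour: restart at lowest
            nxt = next((x for x in ordered if x not in placed), None)
        e = nxt
    return er

rearrange_E([[1, 3, 5, 7], [1, 3, 5, 8], [3, 5, 7, 8]])   # -> [1, 3, 5, 7, 8]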
Bit arrays may be used. They are arrays of elements a[i] which are 1 if i is in the set and 0 if it is not, so every set occupies exactly size(E) bits even if it contains few or no members. Not very space efficient on its own, but if you compress this array with some compression algorithm it will be much smaller (possibly approaching the entropy limit). So you can try a dynamic Markov coder, RLE, or group Huffman, and choose whichever is most efficient for you. The iteration process can then include on-the-fly decompression followed by a linear scan for 1 bits. For long runs of 0s you could modify the decompression algorithm to detect such cases (RLE is the simplest case for this).
If you find sets having a small difference, you may store the sets A and A xor B instead of A and B, saving space on the common parts. In this case, to iterate over B you will have to unpack both A and A xor B and then xor them.
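A tiny sketch of that xor-delta idea using Python ints as bit arrays (names are mine):

def to_bits(s):
    # bit i is set <=> the element with index i is in the set
    mask = 0
    for i in s:
        mask |= 1 << i
    return mask

A = to_bits({42, 66, 128})
B = to_bits({42, 66, 9999})
delta = A ^ B            # stored instead of B; sparse when A and B are similar
B_restored = A ^ delta   # unpack B on demand
assert B_restored == B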
Another useful solution:
E: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
M[0]: {1, 2, 3, 4, 5, 10, 14, 15}
M[1]: {1, 2, 3, 4, 5, 11, 14, 15}
M[2]: {1, 2, 3, 4, 5, 12, 13}
Cache frequently used items:
Cache[1] = {1, 2, 3, 4, 5}
Cache[2] = {14, 15}
Cache[3] = {-2, 7, 8, 9} //Not used just example.
M[0]: {-1, 10, -2}
M[1]: {-1, 11, -2}
M[2]: {-1, 12, 13}
Links to a cached list are marked as negative numbers.
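A small decoding sketch for these cache references (names are mine); nested references like Cache[3] above also work because the expansion recurses:

def expand(m_entry, cache):
    # Negative numbers refer to cached lists: -1 -> cache[1], -2 -> cache[2], ...
    out = []
    for x in m_entry:
        if x < 0:
            out.extend(expand(cache[-x], cache))
        else:
            out.append(x)
    return out

cache = {1: [1, 2, 3, 4, 5], 2: [14, 15]}
expand([-1, 10, -2], cache)   # -> [1, 2, 3, 4, 5, 10, 14, 15]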

List all k-tuples with entries summing to n, ignoring rotations

Is there an efficient algorithm for finding all sequences of k non-negative integers that sum to n, while avoiding rotations (completely, if possible)? The order matters, but rotations are redundant for the problem I'm working on.
For example, with k = 3 and n = 3, I would want to get a list like the following:
(3, 0, 0), (2, 1, 0), (2, 0, 1), (1, 1, 1).
The tuple (0, 3, 0) should not be on the list, since it is a rotation of (3, 0, 0). However, (0, 3, 0) could be in the list instead of (3, 0, 0). Note that both (2, 1, 0) and (2, 0, 1) are on the list -- I do not want to avoid all permutations of a tuple, just rotations. Additionally, 0 is a valid entry -- I am not looking for partitions of n.
My current procedure is to loop over 1 <= i <= n, set the first entry equal to i, and then recursively solve the problem for n' = n - i and k' = k - 1. I get some speed-up by mandating that no entry is strictly greater than the first, but this approach still generates a lot of rotations -- for example, given n = 4 and k = 3, both (2,2,0) and (2,0,2) are in the output list.
Edit: Added clarifications in bold. I apologize for not making these issues as clear as I should have in the original post.
You can first generate the partitions (which ignore order completely) as a tuple (x_1, x_2, ..., x_n), where x_i is the number of times i occurs, so that Sum_i i * x_i = n.
I believe you already know how to do this (from your comments).
Once you have a partition, you can now generate the permutations for this (viewing it as a multiset {1,1,...,2,2...,...n}, where i occurs x_i times) which ignore rotations, using the answer to this question:
Is there an algorithm to generate all unique circular permutations of a multiset?
Hope that helps.
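For small n and k you can cross-check either approach with a brute force that keeps only the lexicographically smallest rotation of each tuple (a sketch of my own, not part of the answer above):

from itertools import product

def tuples_up_to_rotation(n, k):
    # Enumerate all k-tuples of non-negative ints summing to n and keep one
    # canonical representative per rotation class. Only viable for small n, k.
    seen, result = set(), []
    for t in product(range(n + 1), repeat=k):
        if sum(t) != n:
            continue
        canon = min(t[i:] + t[:i] for i in range(k))
        if canon not in seen:
            seen.add(canon)
            result.append(canon)
    return result

tuples_up_to_rotation(3, 3)
# -> [(0, 0, 3), (0, 1, 2), (0, 2, 1), (1, 1, 1)]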
You could just sort your solutions and eliminate rotations that way.
OR
you can try to make your recursive solution build tuples that will only ever be sorted.
How? Here's something I made up quickly:
static list<tuple> tups;
void recurse(tuple l, int n, int k, int m)
{
    if (k == 0 && n == 0)
    {
        tups.add(l);
        return;
    }
    if (k == 0)
        return;
    if (k*m > n)  // prune: the k remaining entries are all >= m, so they cannot sum to n
        return;
    for (int x = m; x <= n; x++)
        recurse(l.add(x), n - x, k - 1, x);  // only build non-decreasing tuples
}
call this with m = 0 and an empty list for the initial step.
here's a C# console app implementation : http://freetexthost.com/b0i05jkb4e
Oh, I see my mistake in the assumption about rotation: I thought you just meant permutations, not an actual rotation.
However, you can extend my solution to create non-rotational permutations of the unique increasing tuples. I'm working on it now.
You need to generate Integer Partitions in lexicographical order.
Here is a very good paper with fast algorithms.
HTH.
Note that CAS programs usually implement these functions. For example in Mathematica:
Input: IntegerPartitions[10, {3}]
Output: {{8, 1, 1}, {7, 2, 1}, {6, 3, 1},
{6, 2, 2}, {5, 4, 1}, {5, 3, 2},
{4, 4, 2}, {4, 3, 3}}
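A rough Python equivalent of that call, for readers without a CAS (the function name is mine):

def integer_partitions(n, k, largest=None):
    # Partitions of n into exactly k positive parts, each part <= largest,
    # listed with parts in non-increasing order.
    if largest is None:
        largest = n
    if k == 1:
        if 1 <= n <= largest:
            yield (n,)
        return
    for first in range(min(n - k + 1, largest), 0, -1):
        for rest in integer_partitions(n - first, k - 1, first):
            yield (first,) + rest

list(integer_partitions(10, 3))
# -> [(8, 1, 1), (7, 2, 1), (6, 3, 1), (6, 2, 2), (5, 4, 1), (5, 3, 2), (4, 4, 2), (4, 3, 3)]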

Resources