Dividing an integer array into two equal sized sub-arrays? - algorithm

I came across this question and couldn't find a reasonable solution.
How would you divide an unsorted integer array into 2 equal-sized sub-arrays such that the difference between the sub-array sums is minimum?
For example: given an unsorted integer array a[N], we want to split it into a1 and a2 where a1.length == a2.length, i.e. N/2, and the absolute value of (sum of all numbers in a1 - sum of all numbers in a2) should be minimum.
For the sake of simplicity, let's assume all numbers are positive, but there might be repetitions.

While others have mentioned that this is a modified case of the partition problem, I'd like to point out, more specifically, that it is closely related to the minimum makespan problem with two machines. Namely, if you solve the two-machine makespan problem and obtain a value m, the minimum difference is 2*m - sum(i : i in arr).
Note that the makespan problem is NP-hard even for two machines, since the two-machine case contains the partition problem. The List scheduling algorithm, applied to a list sorted in non-increasing order, runs in polynomial time, but in general it provides only an approximate answer.
For details, and some more theoretical results on this algorithm, see here.
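For small inputs, the equal-size split from the question can also be solved exactly. The sketch below is not the list scheduling approach; it is a pseudo-polynomial dynamic program that records, for each subset size, which sums are reachable. The function name `min_equal_split_difference` is made up for illustration, and the code assumes an even-length array of integers.

```python
def min_equal_split_difference(arr):
    """Smallest |sum(a1) - sum(a2)| over splits into two halves of equal length."""
    n = len(arr)
    if n % 2 != 0:
        raise ValueError("array length must be even")
    half = n // 2
    total = sum(arr)
    # reachable[k] holds every sum obtainable from exactly k chosen elements.
    reachable = [set() for _ in range(half + 1)]
    reachable[0].add(0)
    for x in arr:
        # Go from large k to small k so that x is used at most once.
        for k in range(half, 0, -1):
            reachable[k].update(s + x for s in reachable[k - 1])
    # If one half sums to s, the other sums to total - s.
    return min(abs(total - 2 * s) for s in reachable[half])
```

The running time grows with both N and the sum of the elements, so this is only practical when the values are moderate.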

Related

Complement of a set of intervals

I have a set of intervals inside the range [0,k].
How can I produce the complement set of this set of intervals?
I can come up with an algorithm, but it requires sorting the intervals.
Therefore, the complexity is O(nlogn), where n is the number of intervals.
Is there any faster algorithm to do this? If not, is there any way to prove that this is the optimal complexity?
Thank you.
In practice, let us assume that you have found an algorithm to perform this task (find the complement set) in O(n).
Then we can show that you have invented a new sort algorithm working in O(n).
To simplify, let us assume that the array to be sorted consists of natural numbers, and that there is no repetition.
If [a1 a2 ... an] need to be sorted, then consider the intervals [a1, a1+1) [a2, a2 + 1) ... [an, an + 1).
Applying your algorithm to generate the complement set of intervals in O(n), we get about n intervals of the form
[x1a + 1, x1b) [x2a + 1, x2b) ... [xna + 1, xnb)
where each pair {xia, xib} corresponds to two successive aj elements in sorted order.
Let us treat this relation as a directed edge in a graph, connecting the two vertices xia and xib.
To get the original array in sorted order, we need to find the start of the graph and then walk through it, which can be done in O(n).
The sort has been performed in O(n).
Ignoring repetitions is not a serious limitation from a theoretical point of view, since with hashing, for example, we can remove the repetitions in O(n).
Ignoring floating-point values is a detail: finding a new O(n) sort algorithm for natural numbers would already be a great result.
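For reference, the sorting-based O(n log n) approach mentioned in the question could look like the sketch below. It assumes half-open intervals (start, end) that lie inside [0, k); the function name `complement_intervals` is hypothetical.

```python
def complement_intervals(intervals, k):
    """Return the gaps of [0, k) not covered by the given half-open intervals."""
    gaps = []
    cursor = 0  # everything in [0, cursor) is already covered or reported
    for start, end in sorted(intervals):  # the O(n log n) step
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)  # also handles overlapping intervals
    if cursor < k:
        gaps.append((cursor, k))
    return gaps
```

After sorting, a single left-to-right sweep is enough, so the sort dominates the cost.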

Count the total number of subsets that don't have consecutive elements

I'm trying to solve a pretty complex problem involving combinatorics and counting subsets. First of all, let's say we are given the set A = {1, 2, 3, ... N} where N <= 10^(18). Now we want to count the subsets that don't contain consecutive numbers.
Example
Let's say N = 3, and A = {1,2,3}. There are 2^3 total subsets but we don't want to count the subsets (1,2), (2,3) and (1,2,3). So in total for this question we want to answer 5 because we want to count only the remaining 5 subsets. Those subsets are (Empty subset), (1), (2), (3), (1,3). Also we want to print the result modulo 10^9 + 7.
What I've done so far
I was thinking that this should be solved using dynamic programming with two states (whether we take the i-th element or not), but then I saw that N could go up to 10^18, so I think it has to be solved with a mathematical formula. Can you please give me some hints on where I should start to get the formula?
Thanks in advance.
Take a look at How many subsets contain no consecutive elements? on the Mathematics Stack Exchange.
They come to the conclusion that the number of non-consecutive subsets of the set {1,2,3...n} is fib(n+2), where fib is the Fibonacci sequence with fib(1) = fib(2) = 1. Your answer for n=3 agrees with this, since fib(5) = 5. If you can implement the Fibonacci algorithm, then you can solve this problem, but a naive linear loop will be far too slow for a number as large as 10^18.
As mentioned in the comments here, you can check out the fast doubling algorithm on Hacker Earth.
It will find Fibonacci numbers in O(log n).
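A minimal sketch of fast doubling, assuming the usual convention F(0) = 0, F(1) = 1 and reducing modulo 10^9 + 7 as the question asks. The function names are made up for illustration.

```python
MOD = 10**9 + 7

def fib_pair(n, mod=MOD):
    """Return (F(n), F(n+1)) modulo mod, with F(0) = 0 and F(1) = 1."""
    if n == 0:
        return (0, 1)
    a, b = fib_pair(n >> 1, mod)           # a = F(k), b = F(k+1), k = n // 2
    c = a * ((2 * b - a) % mod) % mod      # F(2k)   = F(k) * (2*F(k+1) - F(k))
    d = (a * a + b * b) % mod              # F(2k+1) = F(k)^2 + F(k+1)^2
    if n & 1:
        return (d, (c + d) % mod)
    return (c, d)

def count_nonconsecutive_subsets(n, mod=MOD):
    """Number of subsets of {1..n} without two consecutive elements, mod `mod`."""
    return fib_pair(n + 2, mod)[0]
```

The recursion depth is about log2(n), so roughly 60 levels for n = 10^18.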

Algorithms for bucketizing integers into buckets with zero sums

Suppose we have an array of integers (both negative and positive) A[1 ... n] such that all the elements sum to zero. Whenever I have a bunch of integers that sum to zero, I will call them a group, and I want to split A into as many disjoint groups as possible. Can you suggest any paper discussing this very same problem?
It sounds like your problem consists of two NP-Complete problems.
The first would be finding all subsets that solve the Subset Sum problem. This problem does have an exponential time complexity (as implied by amit in the comments), but it is a very reasonable extension of the Subset Sum problem from a theoretical standpoint. For example, if you can solve the Subset Sum problem by dynamic programming and generate the canonical 2D array as a result, this array will contain enough information to generate all possible solutions using a traceback.
The second NP-Complete problem embedded within your problem is the Integer Linear Programming problem. Given all possible subsets solving the Subset Sum problem, N in total, we want to select n of them, 0<=n<=N, such that the value of n is maximized and no element of A appears in more than one selected subset.
I doubt there is a publication devoted to describing this problem because it seems to involve a straightforward application of known theory.
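For very small arrays, the answer can be checked by brute force. The sketch below does not follow the Subset Sum plus Integer Linear Programming decomposition described above; it swaps in a bitmask dynamic program that runs in O(n 2^n) time. It relies on the observation that a split into g zero-sum groups corresponds to an ordering of the elements whose running sum hits zero g times. The name `max_zero_sum_groups` is hypothetical.

```python
def max_zero_sum_groups(a):
    """Maximum number of disjoint zero-sum groups that partition the list a."""
    if sum(a) != 0:
        raise ValueError("elements must sum to zero")
    n = len(a)
    size = 1 << n
    subset_sum = [0] * size
    best = [0] * size  # best[mask]: most zero-sum prefixes over orderings of mask
    for mask in range(1, size):
        low = mask & -mask                 # lowest set bit of mask
        subset_sum[mask] = subset_sum[mask ^ low] + a[low.bit_length() - 1]
        # Try every element of mask as the last one placed in the ordering.
        prev = max(best[mask ^ (1 << j)] for j in range(n) if (mask >> j) & 1)
        best[mask] = prev + (1 if subset_sum[mask] == 0 else 0)
    return best[size - 1]
```

Memory use is proportional to 2^n, so this is only a sanity check for roughly n <= 20.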

How to generate random permutations fast

I read a question in an algorithm book:
"Given a positive integer n, choose 100 random permutations of [1,2,...,n],..."
I know how to generate random permutations with Knuth's algorithm, but do any fast algorithms exist for generating a large number of permutations?
Knuth shuffles require you to do n random swaps for a permutation of n elements (see http://en.wikipedia.org/wiki/Random_permutation#Knuth_shuffles) so the complexity is O(n) which is about the best you can expect if you receive a permutation on n elements.
Is this causing you a practical problem? If so, perhaps you could look at what you are doing with all those permutations in practice. Apart from simply getting by on fewer, you could think about deferring the generation of a permutation until you are sure you need it. If you need a permutation on n objects but only look at k of those n objects, perhaps you need a scheme for generating only those k elements. For small k, you could simply generate k random numbers in the range [0, n), drawing again whenever a number has already come up. For small k, such repeats would be unlikely.
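A minimal sketch of both ideas from this answer: a full Knuth (Fisher-Yates) shuffle in O(n), and drawing only k distinct positions by redrawing on repeats. The function names and the `rng` parameter are assumptions for illustration.

```python
import random

def knuth_shuffle(n, rng=random):
    """Return a uniformly random permutation of [1, 2, ..., n]."""
    perm = list(range(1, n + 1))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i + 1)          # uniform index in [0, i]
        perm[i], perm[j] = perm[j], perm[i]
    return perm

def sample_k_distinct(n, k, rng=random):
    """Draw k distinct numbers from [0, n), redrawing whenever one repeats."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    seen = set()
    picked = []
    while len(picked) < k:
        x = rng.randrange(n)
        if x not in seen:                 # repeats are rare when k is small
            seen.add(x)
            picked.append(x)
    return picked
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which is handy when testing.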
There exist N! permutations of the numbers from 1 to N. If you sort them lexicographically, like in a dictionary, it is possible to construct a permutation knowing only its position in the sorted list of permutations.
For example, let N=3; the lexicographically sorted list of permutations is {123,132,213,231,312,321}. You generate a number between 1 and 3!, for example 5. The 5th permutation is 312. How do we construct it?
Let's find the 1st number of the 5th permutation. Divide the permutations into blocks by their 1st number, that is, into the groups {123,132},{213,231},{312,321}. Each group contains (n-1)! elements. The first number of the permutation is the block number. The 5th permutation is in block ceil(5/(3-1)!)=3. So we have just found the first number of the 5th permutation: it is 3.
Now I'm looking not for the 5th but for the (5-(n-1)!*(ceil(5/2)-1))=5-2*2=1st permutation in {3,1,2},{3,2,1}. 3 is determined and the same for all group members, so I'm actually searching for the 1st permutation in {1,2},{2,1}, and N is now 2. Again, next_num = ceil(1/(new_N-1)!) = 1.
Continue this N times.
Hope you got the idea. Complexity is O(N), because you construct the permutation elements one by one with arithmetic tricks.
UPDATE
When you get the next number X by arithmetic operations, you should also keep track of which elements have been used, and take the X-th unused element instead of X itself. The complexity becomes N log N, because log N is needed to get the X-th unused element.
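The construction above can be sketched as follows, using a 1-based rank as in the example. For brevity this version keeps the unused elements in a plain list, so each removal costs O(N) and the whole thing is O(N^2); reaching N log N would need an order-statistic structure instead. The function name `kth_permutation` is made up for illustration.

```python
from math import factorial

def kth_permutation(n, rank):
    """Return the rank-th (1-based) permutation of 1..n in lexicographic order."""
    if not 1 <= rank <= factorial(n):
        raise ValueError("rank must be in 1..n!")
    unused = list(range(1, n + 1))        # still-available elements, sorted
    index = rank - 1                      # switch to a 0-based rank
    perm = []
    for remaining in range(n, 0, -1):
        block = factorial(remaining - 1)  # permutations sharing one first element
        position, index = divmod(index, block)
        perm.append(unused.pop(position)) # take the position-th unused element
    return perm
```

To get a random permutation, draw `rank` uniformly from 1 to n! and call the function; for N=3 and rank 5 it reproduces the 312 from the example.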

Rewrite O(N W) in terms of N

I have this question that asks to rewrite the complexity of the subset sum problem in terms of only N.
In case you are unaware, the problem is: given weights, each with cost 1, how would you find the optimal solution for a given max weight W to achieve?
So O(NW) is the space and time cost, where the space is for the 2D matrix used in dynamic programming. This problem is a special case of the knapsack problem.
I'm not sure how to approach this. I tried to think about it, and the only thing I thought of was to find the sum of all weights and treat that as a general worst-case scenario. Thanks
If the weight is not bounded, and so the complexity must depend solely on N, there is at least an O(2^N) approach, which is trying all possible subsets of the N elements and computing their sums.
If you are willing to use exponential space rather than polynomial space, you can solve the problem in O(n 2^(n/2)) time and O(2^(n/2)) space. Split your set of n weights into two sets A and B of roughly equal size and compute the sum of weights for every subset of each set. Then hash all sums of subsets of A, and hash W - x for all sums x of subsets of B. If you get a collision between a subset of A and a subset of B in the hash table, you have found a subset that sums to W.
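A minimal sketch of this meet-in-the-middle idea, using a Python set as the hash table. The function name and the boolean return value are assumptions; recovering the actual subset would require storing which elements produced each sum.

```python
def subset_sum_meet_in_middle(weights, target):
    """Return True if some subset of weights sums exactly to target."""
    half = len(weights) // 2
    left, right = weights[:half], weights[half:]

    def all_subset_sums(items):
        sums = [0]                        # sum of the empty subset
        for w in items:
            # Each existing sum either skips w or includes it.
            sums += [s + w for s in sums]
        return sums

    left_sums = set(all_subset_sums(left))        # about 2^(n/2) entries
    # A hit means a left subset and a right subset together reach target.
    return any(target - x in left_sums for x in all_subset_sums(right))
```

Each half contributes about 2^(n/2) sums, which is where the space bound comes from.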
