Find algorithm to split sequence in 2 to minimize difference in sum [duplicate] - algorithm

This question already has answers here:
Is partitioning an array into halves with equal sums P or NP?
(5 answers)
Closed 9 years ago.
Here's the problem: given a sequence of numbers, split it into two sequences so that the difference between their sums is minimal. For example, given the sequence [5, 4, 3, 3, 3], the solution is:
[5, 4] -> sum is 9
[3, 3, 3] -> sum is 9
The difference is 0
In other words, can you find an algorithm (C preferred) that, given a variable-size input vector of integers, outputs two vectors whose sums differ by the minimum amount?
A brute-force algorithm should be avoided.
To be sure the solution is right, it would be nice to benchmark the results of your algorithm against those of a brute-force algorithm.

It sounds like a sub-arrays problem (which is my interpretation of "sequences").
Meaning the only possibilities for 5, 4, 3, 3, 3 are:
| 5, 4, 3, 3, 3 => 0 - 18 => 18
5 | 4, 3, 3, 3 => 5 - 13 => 8
5, 4 | 3, 3, 3 => 9 - 9 => 0
5, 4, 3 | 3, 3 => 12 - 6 => 6
5, 4, 3, 3 | 3 => 15 - 3 => 12
5, 4, 3, 3, 3 | => 18 - 0 => 18 (same as first)
It is as simple as just comparing the sums on either side of every index.
Code: (untested)
int total = 0;
for (int i = 0; i < n; i++)
    total += arr[i];
int best = INT_MAX, bestPos = -1, current = 0;
for (int i = 0; i < n; i++)
{
    current += arr[i];
    // left sum is current, right sum is total - current
    int diff = abs(current - (total - current));
    if (diff < best)
    {
        best = diff;
        bestPos = i;
    }
    // else break; - optimisation, may not work
}
printf("The best position is at %d\n", bestPos);
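The above is O(n); logically, you can't do much better than that, since every element has to be read at least once.
You can slightly optimize the above with a binary-search-like process on the prefix sums to get down to n + log n steps rather than 2n, but both are O(n). This relies on the prefix sums being sorted, which holds when the numbers are non-negative. Basic pseudo-code: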
sum[0] = arr[0]
// sum[i] represents sum from indices 0 to i
for (i = 1:n-1)
  sum[i] = sum[i-1] + arr[i]
total = sum[n-1]
start = 0
end = n-1
best = MAX
repeat:
  if (start > end) stop
  mid = (start + end) / 2
  sumFromMidToN = total - sum[mid]   // sum of the elements after mid
  best = min(best, abs(sumFromMidToN - sum[mid]))
  if (sum[mid] > sumFromMidToN)
    end = mid - 1
  else if (sum[mid] < sumFromMidToN)
    start = mid + 1
  else
    stop
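As a rough, runnable sketch of the same idea (not the answer's own code), the binary search can be delegated to Python's `bisect` module. The function name `best_split` and the convention of returning the length of the left part are my own choices, and it assumes non-negative values so that the prefix sums are sorted.

```python
from bisect import bisect_left
from itertools import accumulate


def best_split(arr):
    """Return (difference, k): arr[:k] and arr[k:] have the closest sums.

    Assumes every value is non-negative, so the prefix sums are sorted.
    """
    prefix = [0] + list(accumulate(arr))  # prefix[k] == sum(arr[:k])
    total = prefix[-1]
    # First k whose prefix sum reaches half of the total.
    k = bisect_left(prefix, total / 2)
    # The optimum is at k or just before it.
    candidates = [c for c in (k - 1, k) if 0 <= c <= len(arr)]
    return min((abs(total - 2 * prefix[c]), c) for c in candidates)


print(best_split([5, 4, 3, 3, 3]))  # (0, 2)
```

Here `(0, 2)` means a difference of 0 with the first two elements on the left, i.e. `5, 4 | 3, 3, 3`.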
If it's actually subsets, then, as already mentioned, it appears to be the optimization version of the Partition problem, which is a lot more difficult.
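For the subset interpretation, one common approach is a pseudo-polynomial subset-sum table. The sketch below is only an illustration under the assumption that all numbers are non-negative integers; `min_partition_difference` is a name I made up, and the running time grows with the sum of the values, so it does not contradict the hardness of the general problem.

```python
def min_partition_difference(nums):
    """Smallest |sum(A) - sum(B)| over all ways to split nums into subsets A and B.

    Assumes non-negative integers; time grows with sum(nums), not only len(nums).
    """
    total = sum(nums)
    reachable = 1  # bit s is set when some subset sums to s; bit 0 = empty subset
    for x in nums:
        reachable |= reachable << x
    # The best subset sum is the reachable value closest to total / 2.
    for s in range(total // 2, -1, -1):
        if (reachable >> s) & 1:
            return total - 2 * s


print(min_partition_difference([5, 4, 3, 3, 3]))  # 0
```

The Python integer is used as a bit set here, which keeps the table update to a single shift and OR per element.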

Related

Find the number of greater elements to the left and smaller elements to the right

Consider an array a of n integers, indexed from 1 to n.
For every index i such that 1<i<n, define:
count_left(i) = number of indices j such that 1 <= j < i and a[j] > a[i];
count_right(i) = number of indices j such that i < j <= n and a[j] < a[i];
diff(i) = abs(count_left(i) - count_right(i)).
The problem is: given array a, find the maximum possible value of diff(i) for 1 < i < n.
I have a brute-force solution. Can anyone suggest a better one?
Constraint: 3 < n <= 10^5
Example
Input Array: [3, 6, 9, 5, 4, 8, 2]
Output: 4
Explanation:
diff(2) = abs(0 - 3) = 3
diff(3) = abs(0 - 4) = 4
diff(4) = abs(2 - 2) = 0
diff(5) = abs(3 - 1) = 2
diff(6) = abs(1 - 1) = 0
maximum is 4.
O(n log n) approach:
Walk through the array from left to right and add every element to an augmented binary search tree (red-black, AVL, etc.) whose nodes store the subtree size, the initial index, and a temporary rank field. Immediately after adding an element we know its rank in the current state of the tree; keep that rank in the temprank field.
lb = index - temprank
is the number of bigger elements to the left.
After filling the tree with all the items, traverse it again, retrieving each element's final rank.
rs = finalrank - temprank
is the number of smaller elements to the right. Now just take the absolute difference of lb and rs:
diff = abs(lb - rs) = abs(index - temprank - finalrank + temprank) =
abs(index - finalrank)
But ... we can see that we don't need temprank at all.
Moreover, we don't need a binary tree!
Just sort the pairs (element; initial index) by element and take the maximum absolute difference of new_index - old_index (excluding old indices 1 and n).
a    3  6  9  5  4  8  2
old     2  3  4  5  6
new     5  7  4  3  6
dif     3  4  0  2  0
Python code for concept checking
a = [3, 6, 9, 5, 4, 8, 2]
b = sorted([[e,i] for i,e in enumerate(a)])
print(b)
print(max([abs(n-o[1]) if 0<o[1]<len(a)-1 else 0 for n,o in enumerate(b)]))
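To gain some confidence in the sorting shortcut, it can be compared with a direct evaluation of the definition. This is just a sanity-check sketch; both function names are mine, indices are 0-based, and ties are broken by original index, as in the (element, index) pairs above.

```python
def max_diff_by_sorting(a):
    """Sort-based method: compare each element's sorted position with its old one."""
    order = sorted(range(len(a)), key=lambda i: a[i])  # old indices in value order
    return max(abs(new - old) for new, old in enumerate(order)
               if 0 < old < len(a) - 1)


def max_diff_brute_force(a):
    """Direct O(n^2) evaluation of the definition, used only as a check."""
    best = 0
    for i in range(1, len(a) - 1):
        left_bigger = sum(1 for j in range(i) if a[j] > a[i])
        right_smaller = sum(1 for j in range(i + 1, len(a)) if a[j] < a[i])
        best = max(best, abs(left_bigger - right_smaller))
    return best


sample = [3, 6, 9, 5, 4, 8, 2]
print(max_diff_by_sorting(sample), max_diff_brute_force(sample))  # 4 4
```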

Finding the maximum possible sum/product combination of integers

Given an input of a list of N integers always starting with 1, for example: 1, 4, 2, 3, 5. And some target integer T.
Processing the list in order, the algorithm decides whether to add or multiply the number by the current score to achieve the maximum possible output < T.
For example: [input] 1, 4, 2, 3, 5 T=40
1 + 4 = 5
5 * 2 = 10
10 * 3 = 30
30 + 5 = 35 which is < 40, so valid.
But
1 * 4 = 4
4 * 2 = 8
8 * 3 = 24
24 * 5 = 120 which is > 40, so invalid.
I'm having trouble conceptualizing this in an algorithm -- I'm just looking for advice on how to think about it or at most pseudo-code. How would I go about coding this?
My first instinct was to think about the +/* as 1/0, and then test permutations like 0000 (where length == N-1, I think), then 0001, then 0011, then 0111, then 1111, then 1000, etc. etc.
But I don't know how to put that into pseudo-code given a general N integers. Any help would be appreciated.
You can use recursion to try every combination of operators. Python code below:
MINIMUM = -2147483648
def solve(input, T, index, temp):
    # if negative value exists in input, remove below two lines
    if temp >= T:
        return MINIMUM
    if index == len(input):
        return temp
    ans0 = solve(input, T, index + 1, temp + input[index])
    ans1 = solve(input, T, index + 1, temp * input[index])
    return max(ans0, ans1)
print(solve([1, 4, 2, 3, 5], 40, 1, 1))
But this method requires O(2^n) time complexity.
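If all the inputs are positive integers (an assumption the question does not state explicitly), the exponential blow-up can be avoided by keeping only the set of distinct scores below T after each step, since at most T such values exist. A minimal sketch of that idea, with `best_score` as a hypothetical name:

```python
def best_score(values, target):
    """Largest result below target from inserting + or * left to right.

    Assumes positive integers, so a partial score that reaches target can be
    dropped: it can never shrink again.
    """
    reachable = {values[0]} if values[0] < target else set()
    for v in values[1:]:
        reachable = {r for s in reachable for r in (s + v, s * v) if r < target}
    return max(reachable, default=None)  # None when no combination fits


print(best_score([1, 4, 2, 3, 5], 40))  # 35
```

Each step touches at most T values, so the work is roughly proportional to N times T rather than 2^N.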

Algorithm for combination index in array

I know how to compute the binomial coefficient given n and k (and I use a library for that).
Now imagine you store all of those combinations inside an array. Each combination has an index. I don't actually need to store them in an array, but I need to know, if I were to store them, what the array index would be for each combination.
E.g. given an element of the C(n,k) set of possible combinations, I need a function that gives me its index i inside the array, without actually creating the whole array. The programming language does not matter: I need to implement the function in Java, but an algorithm in any (pseudo-)language will do, and I will then translate it to Java.
For example, in the case of n=5 and k=2, I empirically define this function f(n, k, [a,b]) => N as:
f(5, 2, [1,2]) = 0
f(5, 2, [1,3]) = 1
f(5, 2, [1,4]) = 2
f(5, 2, [1,5]) = 3
f(5, 2, [2,3]) = 4
f(5, 2, [2,4]) = 5
f(5, 2, [2,5]) = 6
f(5, 2, [3,4]) = 7
f(5, 2, [3,5]) = 8
f(5, 2, [4,5]) = 9
meaning that the (3,5) combination would occupy index 8 in the array. Another example with n=5 and k=3 is f(n, k, [a,b,c]) => N, empirically defined as
f(5, 3, [1,2,3]) = 0
f(5, 3, [1,2,4]) = 1
f(5, 3, [1,2,5]) = 2
f(5, 3, [1,3,4]) = 3
f(5, 3, [1,3,5]) = 4
f(5, 3, [1,4,5]) = 5
f(5, 3, [2,3,4]) = 6
f(5, 3, [2,3,5]) = 7
f(5, 3, [2,4,5]) = 8
f(5, 3, [3,4,5]) = 9
EDIT for clarification after comments:
In the example above, [1,2,3], [2,4,5] and so on are elements of the C(n,k) set, i.e. each is one of the possible combinations of k numbers out of n possible numbers. The function needs them in order to compute their index in the array.
However I need to write this function for generic values of n and k and without creating the array. Maybe such a function already exists in some combinatorial calculus library, but I don't even know how it would be called.
You should have a look at the combinatorial number system, the section "Place of a combination in the ordering" in particular.
There is even an example, which might help you: National Lottery example in Excel. (I'm sorry, but I can't type any math in here.)
In your case this would be
f(5, 3, [2,3,4]) = binom(5,3) - 1 - binom(5-2,3) - binom(5-3,2) - binom(5-4,1) =
= 10 - 1 - 1 - 1 - 1 = 6
or, if you accept a reversed order, you may omit the binom(5,3) - 1 part and calculate
f'(5, 3, [2,3,4]) = binom(5-2,3) + binom(5-3,2) + binom(5-4,1) =
= 1 + 1 + 1 = 3
(This should save you some time, as binom(5, 3) is not really necessary here. Note that f' = binom(n,k) - 1 - f, so the last combination, [3,4,5], gets index 0.)
In Java the method could be
int f(int n, int k, int[] vars) {
    int res = binom(n, k) - 1;
    for(int i = 0; i < k; i++) {
        res -= binom(n - vars[i], k - i);
    }
    return res;
}
or
int fPrime(int n, int k, int[] vars) {
    int res = 0;
    for(int i = 0; i < k; i++) {
        res += binom(n - vars[i], k - i);
    }
    return res;
}
assuming a method int binom(int n, int k) and binom(n, k) = 0 for n < k.
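A quick way to convince yourself that the forward formula matches the tables in the question is to compare it with an explicit enumeration. The following is only a checking sketch in Python (3.8 or later for `math.comb`); `combination_index` is my own name for f.

```python
from itertools import combinations
from math import comb


def combination_index(n, k, combo):
    """Lexicographic index of a sorted k-combination of 1..n."""
    index = comb(n, k) - 1
    for i, value in enumerate(combo):
        index -= comb(n - value, k - i)  # math.comb returns 0 when n - value < k - i
    return index


# Check against an explicit enumeration (itertools yields lexicographic order).
for n, k in [(5, 2), (5, 3), (7, 4)]:
    for expected, combo in enumerate(combinations(range(1, n + 1), k)):
        assert combination_index(n, k, combo) == expected

print(combination_index(5, 3, [2, 3, 4]))  # 6
```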

given n, how to find the number of different ways to write n as the sum of 1, 3, 4 in ruby?

Problem: given n, find the number of different ways to write n as the sum of 1, 3, 4
Example: for n=5, the answer is 6
5=1+1+1+1+1
5=1+1+3
5=1+3+1
5=3+1+1
5=1+4
5=4+1
I have tried a permutation-based method, but it is very inefficient. Is there a more efficient way to do this?
Using dynamic programming with a lookup table (implemented with a hash, as it makes the code simpler):
nums=[1,3,4]
n=5
table={0=>1}
1.upto(n) { |i|
table[i] = nums.map { |num| table[i-num].to_i }.reduce(:+)
}
table[n]
# => 6
Note: Just checking one of the other answers, mine was instantaneous for n=500.
def add_next sum, a1, a2
residue = a1.inject(sum, :-)
residue.zero? ? [a1] : a2.reject{|x| residue < x}.map{|x| a1 + [x]}
end
a = [[]]
until a == (b = a.flat_map{|a| add_next(5, a, [1, 3, 4])})
a = b
end
a:
[
[1, 1, 1, 1, 1],
[1, 1, 3],
[1, 3, 1],
[1, 4],
[3, 1, 1],
[4, 1]
]
a.length #=> 6
I believe this problem should be addressed in two steps.
Step 1
The first step is to determine the different numbers of 1s, 3s and 4s that sum to the given number. For n = 5, there are only 3, which we could write:
[[5,0,0], [2,1,0], [1,0,1]]
These 3 elements are respectively interpreted as "five 1s, zero 3s and zero 4s", "two 1s, one 3 and zero 4s" and "one 1, zero 3s and one 4".
To compute these combinations efficiently, I first compute the possible combinations that use only 1s and sum to each number between zero and 5 (which of course is trivial). These values are saved in a hash whose keys are the sums and whose values are the numbers of 1s needed to reach each sum:
h0 = { 0 => 0, 1 => 1, 2 => 2, 3 => 3, 4 => 4, 5 => 5 }
(If the first number had been 2, rather than 1, this would have been:
h0 = { 0 => 0, 2 => 1, 4 => 2 }
since there is no way to sum only 2s to equal 1 or 3.)
Next we consider using both 1 and 3 to sum to each value between 0 and 5. There are only two choices for the number of 3s used, zero or one. This gives rise to the hash:
h1 = { 0 => [[0,0]], 1 => [[1,0]], 2 => [[2,0]], 3 => [[3,0], [0,1]],
4 => [[4,0], [1,1]], 5 => [[5,0], [2,1]] }
This indicates, for example, that:
there is only 1 way to use 1 and 3 to sum to 1: 1 => [1,0], meaning one 1 and zero 3s.
there are two ways to sum to 4: 4 => [[4,0], [1,1]], meaning four 1s and zero 3s or one 1 and one 3.
Similarly, when 1, 3 and 4 can all be used, we obtain the hash:
h2 = { 5 => [[5,0,0], [2,1,0], [1,0,1]] }
Since this hash corresponds to the use of all three numbers, 1, 3 and 4, we are concerned only with the combinations that sum to 5.
In constructing h2, we can use zero 4s or one 4. If we use zero 4s, we would use 1s and 3s that sum to 5. We see from h1 that there are two combinations:
5 => [[5,0], [2,1]]
For h2 we write these as:
[[5,0,0], [2,1,0]]
If one 4 is used, 1s and 3s totalling 5 - 1*4 = 1 are used. From h1 we see there is just one combination:
1 => [[1,0]]
which for h2 we write as
[[1,0,1]]
so the value for the key 5 in h2 is:
[[5,0,0], [2,1,0]] + [[1,0,1]] = [[5,0,0], [2,1,0], [1,0,1]]
Aside: because of the form I've chosen for the hashes h1 and h2, it is actually more convenient to represent h0 as:
h0 = { 0 => [[0]], 1 => [[1]],..., 5 => [[5]] }
It should be evident how this sequential approach could be used for any collection of integers whose combinations are to be summed.
Step 2
The numbers of distinct arrangements of each array [n1, n3, n4] produced in Step 1 equals:
(n1+n3+n4)!/(n1!n3!n4!)
Note that if one of the n's were zero, these would be binomial coefficients. In fact, these are the coefficients of the multinomial distribution, which is a generalization of the binomial distribution. The reasoning is simple. The numerator gives the number of permutations of all the numbers. The n1 1s can be permuted n1! ways for each distinct arrangement, so we divide by n1!. The same goes for n3 and n4.
For the example of summing to 5, there are:
5!/5! = 1 distinct arrangement for [5,0,0]
(2+1)!/(2!1!) = 3 distinct arrangements for [2,1,0] and
(1+1)!/(1!1!) = 2 distinct arrangements for [1,0,1], for a total of:
1+3+2 = 6 distinct arrangements for the number 5.
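The arithmetic of Step 2 is easy to check mechanically. Here is a small Python sketch of just the multinomial count (the answer's own code is Ruby and follows below); `arrangements` is a hypothetical helper name.

```python
from math import factorial


def arrangements(counts):
    """Distinct orderings of a multiset given how often each summand appears."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result


combos = [[5, 0, 0], [2, 1, 0], [1, 0, 1]]  # the Step 1 result for n = 5
print([arrangements(c) for c in combos])      # [1, 3, 2]
print(sum(arrangements(c) for c in combos))   # 6
```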
Code
def count_combos(arr, n)
a = make_combos(arr,n)
a.reduce(0) { |tot,b| tot + multinomial(b) }
end
def make_combos(arr, n)
arr.size.times.each_with_object([]) do |i,a|
val = arr[i]
if i.zero?
a[0] = (0..n).each_with_object({}) { |t,h|
h[t] = [[t/val]] if (t%val).zero? }
else
first = (i==arr.size-1) ? n : 0
a[i] = (first..n).each_with_object({}) do |t,h|
combos = (0..t/val).each_with_object([]) do |p,b|
prev = a[i-1][t-p*val]
prev.map { |pr| b << (pr +[p]) } if prev
end
h[t] = combos unless combos.empty?
end
end
end.last[n]
end
def multinomial(arr)
(arr.reduce(:+)).factorial/(arr.reduce(1) { |tot,n|
tot * n.factorial })
end
and a helper:
class Integer # Fixnum was merged into Integer in Ruby 2.4
  def factorial
    return 1 if self < 2
    (1..self).reduce(:*)
  end
end
Examples
count_combos([1,3,4], 5) #=> 6
count_combos([1,3,4], 6) #=> 9
count_combos([1,3,4], 9) #=> 40
count_combos([1,3,4], 15) #=> 714
count_combos([1,3,4], 30) #=> 974169
count_combos([1,3,4], 50) #=> 14736260449
count_combos([2,3,4], 50) #=> 72581632
count_combos([2,3,4,6], 30) #=> 82521
count_combos([1,3,4], 500) #1632395546095013745514524935957247\
00017620846265794375806005112440749890967784788181321124006922685358001
(I broke the result of the last example, one long number, into two pieces for display purposes.)
count_combos([1,3,4], 500) took about 2 seconds to compute; the others were essentially instantaneous.
@sawa's method and mine gave the same results for n between 6 and 9, so I'm confident they are both correct. sawa's solution times increase much more quickly with n than mine do, because he is computing and then counting all the permutations.
Edit: @Karole, who just posted an answer, and I get the same results for all my tests (including the last one!). Which answer do I prefer? Hmmm. Let me think about that.
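I don't know Ruby, so I am writing it in C++. Say n = 5, as in your example.
Use dynamic programming, where D[i] is the number of ways to write i:
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;
int main() {
    int n;
    cin >> n;
    // Base cases for 0..3; after that every sum ends in a 1, a 3 or a 4.
    vector<long long> D(max(n, 3) + 1);
    D[0] = 1; D[1] = 1; D[2] = 1; D[3] = 2;
    for (int i = 4; i <= n; i++)
        D[i] = D[i-1] + D[i-3] + D[i-4];
    cout << D[n];
}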

Linked list algorithm to find pairs adding up to 10

Can you suggest an algorithm that finds all pairs of nodes in a linked list that add up to 10?
I came up with the following.
Algorithm: Compare each node, starting with the second node, with each node starting from the head node till the previous node (previous to the current node being compared) and report all such pairs.
I think this algorithm should work; however, it's certainly not the most efficient one, having a complexity of O(n^2).
Can anyone hint at a more efficient solution (perhaps one that takes linear time)? Additional or temporary nodes can be used by such a solution.
If their range is limited (say between -100 and 100), it's easy.
Create an array quant[-100..100] then just cycle through your linked list, executing:
quant[value] = quant[value] + 1
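Then the following loops will do the trick (i stops at 4 so that each pair is reported only once and j stays inside the array; 5 can only pair with itself):
for i = -90 to 4:
    j = 10 - i
    for k = 1 to quant[i] * quant[j]
        output i, " ", j
for k = 1 to quant[5] * (quant[5] - 1) / 2
    output 5, " ", 5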
Even if their range isn't limited, you can have a more efficient method than what you proposed, by sorting the values first and then keeping a count for each distinct value rather than the individual values (same as in the above solution).
This is achieved by running two pointers, one at the start of the list and one at the end. When the numbers at those pointers add up to 10, output them and move the end pointer down and the start pointer up.
When they're greater than 10, move the end pointer down. When they're less, move the start pointer up.
This relies on the sorted nature. Less than 10 means you need to make the sum higher (move start pointer up). Greater than 10 means you need to make the sum less (end pointer down). Since there are no duplicates in the list (because of the counts), being equal to 10 means you move both pointers.
Stop when the pointers pass each other.
There's one more tricky bit and that's when the pointers are equal and the value sums to 10 (this can only happen when the value is 5, obviously).
You don't output the number of pairs based on the product of the counts; rather, it's based on the count minus 1. That's because a value 5 with a count of 1 doesn't actually sum to 10 (since there's only one 5). In general, a value with count c pairs with itself c * (c - 1) / 2 times.
So, for the list:
2 3 1 3 5 7 10 -1 11
you get:
Index   a   b   c   d   e   f   g   h
Value  -1   1   2   3   5   7  10  11
Count   1   1   1   2   1   1   1   1
You start pointer p1 at a and p2 at h. Since -1 + 11 = 10, you output those two numbers (as above, you do it N times where N is the product of the counts). That's one copy of (-1,11). Then you move p1 to b and p2 to g.
1 + 10 > 10 so leave p1 at b, move p2 down to f.
1 + 7 < 10 so move p1 to c, leave p2 at f.
2 + 7 < 10 so move p1 to d, leave p2 at f.
3 + 7 = 10, output two copies of (3,7) since the count of d is 2, move p1 to e, p2 to e.
5 + 5 = 10 but p1 = p2, so the special case applies: with a count of 1 there is no pair. Output nothing, move p1 to f, p2 to d.
Loop ends since p1 > p2.
Hence the overall output was:
(-1,11)
( 3, 7)
( 3, 7)
which is correct.
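The walkthrough above can be sketched compactly in Python. This is an illustration rather than the answer's own code: `pairs_with_sum` is a made-up name, the input is taken as a plain list instead of a linked list, and a value paired with itself is counted as c * (c - 1) / 2 pairs, which is my assumption about what "distinct pairs" should mean.

```python
from collections import Counter


def pairs_with_sum(values, target=10):
    """List every pair of elements adding up to target, smaller value first."""
    counts = Counter(values)
    keys = sorted(counts)                 # distinct values, ascending
    lo, hi = 0, len(keys) - 1
    pairs = []
    while lo <= hi:
        total = keys[lo] + keys[hi]
        if total < target:
            lo += 1                       # need a larger sum
        elif total > target:
            hi -= 1                       # need a smaller sum
        else:
            c = counts[keys[lo]]
            # Assumption: a value paired with itself forms c * (c - 1) / 2 pairs.
            n = c * (c - 1) // 2 if lo == hi else c * counts[keys[hi]]
            pairs += [(keys[lo], keys[hi])] * n
            lo += 1
            hi -= 1
    return pairs


print(pairs_with_sum([2, 3, 1, 3, 5, 7, 10, -1, 11]))
# [(-1, 11), (3, 7), (3, 7)]
```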
Here's some test code. You'll notice that I've forced the count for 7 (the midpoint, since SUM is 14 here) to a specific value for testing. Obviously, you wouldn't do this.
#include <stdio.h>
#include <stdlib.h>  // srand, rand
#include <time.h>    // time
#define SZSRC 30
#define SZSORTED 20
#define SUM 14
int main (void) {
    int i, s, e, prod;
    int srcData[SZSRC];
    int sortedVal[SZSORTED];
    int sortedCnt[SZSORTED];
    // Make some random data.
    srand (time (0));
    for (i = 0; i < SZSRC; i++) {
        srcData[i] = rand() % SZSORTED;
        printf ("srcData[%2d] = %5d\n", i, srcData[i]);
    }
    // Convert to value/size array.
    for (i = 0; i < SZSORTED; i++) {
        sortedVal[i] = i;
        sortedCnt[i] = 0;
    }
    for (i = 0; i < SZSRC; i++)
        sortedCnt[srcData[i]]++;
    // Force 7+7 to specific count for testing.
    sortedCnt[7] = 2;
    for (i = 0; i < SZSORTED; i++)
        if (sortedCnt[i] != 0)
            printf ("Sorted [%3d], count = %3d\n", i, sortedCnt[i]);
    // Start and end pointers.
    s = 0;
    e = SZSORTED - 1;
    // Loop until they overlap.
    while (s <= e) {
        // Equal to desired value?
        if (sortedVal[s] + sortedVal[e] == SUM) {
            // Get product (special case at midpoint: c equal values
            // form c * (c - 1) / 2 pairs among themselves).
            prod = (s == e)
                ? sortedCnt[s] * (sortedCnt[s] - 1) / 2
                : sortedCnt[s] * sortedCnt[e];
            // Output the right count.
            for (i = 0; i < prod; i++)
                printf ("(%3d,%3d)\n", sortedVal[s], sortedVal[e]);
            // Move both pointers and continue.
            s++;
            e--;
            continue;
        }
        // Less than desired, move start pointer.
        if (sortedVal[s] + sortedVal[e] < SUM) {
            s++;
            continue;
        }
        // Greater than desired, move end pointer.
        e--;
    }
    return 0;
}
You'll see that the code above is all O(n) since I'm not sorting in this version, just intelligently using the values as indexes.
If the minimum is below zero (or very high to the point where it would waste too much memory), you can just use a minVal to adjust the indexes (another O(n) scan to find the minimum value and then just use i-minVal instead of i for array indexes).
And, even if the range from low to high is too expensive on memory, you can use a sparse array. You'll have to sort it, O(n log n), and search it for updating counts, also O(n log n), but that's still better than the original O(n^2). The reason the binary search is O(n log n) is that a single search would be O(log n) but you have to do it for each value.
And here's the output from a test run, which shows you the various stages of calculation.
srcData[ 0] = 13
srcData[ 1] = 16
srcData[ 2] = 9
srcData[ 3] = 14
srcData[ 4] = 0
srcData[ 5] = 8
srcData[ 6] = 9
srcData[ 7] = 8
srcData[ 8] = 5
srcData[ 9] = 9
srcData[10] = 12
srcData[11] = 18
srcData[12] = 3
srcData[13] = 14
srcData[14] = 7
srcData[15] = 16
srcData[16] = 12
srcData[17] = 8
srcData[18] = 17
srcData[19] = 11
srcData[20] = 13
srcData[21] = 3
srcData[22] = 16
srcData[23] = 9
srcData[24] = 10
srcData[25] = 3
srcData[26] = 16
srcData[27] = 9
srcData[28] = 13
srcData[29] = 5
Sorted [ 0], count = 1
Sorted [ 3], count = 3
Sorted [ 5], count = 2
Sorted [ 7], count = 2
Sorted [ 8], count = 3
Sorted [ 9], count = 5
Sorted [ 10], count = 1
Sorted [ 11], count = 1
Sorted [ 12], count = 2
Sorted [ 13], count = 3
Sorted [ 14], count = 2
Sorted [ 16], count = 4
Sorted [ 17], count = 1
Sorted [ 18], count = 1
( 0, 14)
( 0, 14)
( 3, 11)
( 3, 11)
( 3, 11)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 5, 9)
( 7, 7)
Create a hash set (HashSet in Java). You could use a sparse array instead if your numbers are well bounded, i.e. you know they fall within +/- 100.
For each node, first check if 10-n is in the set. If so, you have found a pair. Either way, then add n to the set and continue.
So for example you have
1 - 6 - 3 - 4 - 9
1 - is 9 in the set? Nope
6 - 4? No.
3 - 7? No.
4 - 6? Yup! Print (6,4)
9 - 1? Yup! Print (9,1)
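In code, the single pass might look roughly like this. It is a sketch only: `find_pairs` is a hypothetical name, the list stands in for the linked list, and I report each pair as (current, earlier), which is my own choice of ordering. Because a set collapses duplicates, each node reports at most one partner value.

```python
def find_pairs(values, target=10):
    """Report (current, earlier) pairs in a single pass, as in the walkthrough."""
    seen = set()
    pairs = []
    for n in values:
        if target - n in seen:
            pairs.append((n, target - n))
        seen.add(n)
    return pairs


print(find_pairs([1, 6, 3, 4, 9]))  # [(4, 6), (9, 1)]
```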
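This is a small special case of the subset sum problem. The general problem is NP-complete, but with the subset size fixed at two elements it can be solved in polynomial time.
If you were to first sort the set, it would reduce the number of pairs that need to be evaluated.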
