Partition n elements into k groups with equal sums - algorithm

I need to partition n elements into k groups such that the sums of the groups are all the same.
For instance:
I have a list of numbers going from 1 to 99 [1, 2, 3, 4, 5...] and I need to partition the list into 3 groups. The sum of all elements of each group must be equal. In this example, n=99 and k=3.
What is an efficient and elegant algorithm to achieve this?
I'm just asking for algorithm suggestions to use; I don't want a solution.

Let me start by clarifying something: when you say you want an algorithm, but no solution, I have to assume that what you mean is that you want an algorithm that'll give you a solution for k=3 and n any positive integer.
I believe the problem has a solution for any n >= 5, provided either n or n+1 is divisible by 3.
This is because the sum 1 + ... + n equals n*(n+1)/2, which is divisible by 3 if and only if either n or n+1 is divisible by 3.
Assuming n satisfies these conditions, the algorithm goes like this.
Let s = [n*(n+1)/2]/3.
Find the number m in the decreasing sequence n, n-1, n-2, ..., such that
n + (n - 1) + (n - 2) + ... + m <= s < n + (n - 1) + (n - 2 ) + ... + m + (m - 1).
Let h = s - [n + (n - 1) + (n - 2) + ... + m].
Then h + [n + (n - 1) + (n - 2) + ... + m] = s, and we have our first sum.
Note that, given how m was obtained, h is guaranteed to be in the decreasing sequence m-2, m-3, ..., 1, hence is available for our first sum.
We now find the number q in the decreasing sequence m-1, m-2, ..., h+1, h, h-1, ..., 1, such that
m + (m - 1) + (m - 2) + ... + q <= s < m + (m - 1) + (m - 2) + ... + q + (q - 1),
keeping in mind that these sums may include h+1 or h-1, but not h. I am abusing the notational conventions for sums here.
This is where things are getting a bit hairy.
I conjecture that the number p = s - [m + (m - 1) + (m - 2) + ... + q] is among the numbers left over (unused) from the original sequence, but I won't prove it. (Exercise for the reader, as they say.)
Then p + [m + (m - 1) + (m - 2) + ... + q] = s, and we have our second sum.
(Again: the number h may not be in the sum [m + (m - 1) + (m - 2) + ... + q].)
The last sum is that of all the remaining numbers, left over from the original sequence.
For example (well, yes, that is a solution --- but for illustrative purposes only...):
1 + ... + 99 = 99*100/2 = 3*1650
= (99 + 98 + ... + 83 + 82 + 21)
+ (81 + 80 + ... + 60 + 59 + 40)
+ (58 + ... + 41 + 39 + ... + 22 + 20 + ... + 1).
Here n = 99, m = 82, h = 21, q = 59, and p = 40.
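The scheme above can be sketched in Python. This is a sketch under the answer's own (unproved) conjecture that the top-up numbers h and p are always available; `partition3` and its greedy helper are my names, not from the original:

```python
def partition3(n):
    """Split 1..n into three disjoint groups of equal sum s.

    Greedily take the largest unused numbers while the running sum
    stays <= s; the last number taken is the "top-up" (h, then p)
    that lands exactly on s, as described in the answer.
    """
    total = n * (n + 1) // 2
    assert n >= 5 and total % 3 == 0, "need n >= 5 and 3 | n or 3 | n+1"
    s = total // 3
    used = set()

    def next_group():
        group, acc = [], 0
        for x in range(n, 0, -1):
            if x not in used and acc + x <= s:
                group.append(x)
                acc += x
                if acc == s:
                    break
        used.update(group)
        return group

    g1, g2 = next_group(), next_group()
    g3 = [x for x in range(n, 0, -1) if x not in used]
    return g1, g2, g3
```

For n = 99 each group sums to 1650; the greedy scan reproduces the descending-run-plus-top-up structure because every number larger than the top-up would overshoot s.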


Finding sum in less than O(N)

Question:
In less than O(n), find a number K in the sequence 1,2,3,...,N such that the sum of 1,2,3,...,K is exactly half of the sum of 1,2,3,...,N.
Maths:
I know that the sum of the sequence 1,2,3....N is N(N+1)/2.
Therefore our task is to find K such that:
K(K+1) = 1/2 * (N)(N+1)/2 if such a K exists.
Pseudo-Code:
sum1 = n(n+1)/2
sum2 = 0
for (i = 1; i < n; i++)
{
    sum2 += i;
    if (sum2 == sum1)
    {
        index = i;
        break;
    }
}
Problem: This solution is O(n), but I need something better, such as O(log(n)) or O(1)...
You're close with your equation, but you dropped the divide by 2 from the K side. You actually want
K * (K + 1) / 2 = N * (N + 1) / (2 * 2)
Or
2 * K * (K + 1) = N * (N + 1)
Plugging that into wolfram alpha gives the real solutions:
K = 1/2 * (-sqrt(2N^2 + 2N + 1) - 1)
K = 1/2 * (sqrt(2N^2 + 2N + 1) - 1)
Since you probably don't want the negative value, the second equation is what you're looking for. That should be an O(1) solution.
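In code, that closed form still needs an exactness check, since K must be a whole number. A sketch in Python (`find_k` is my own name), using integer square roots to avoid floating-point error:

```python
from math import isqrt

def find_k(n):
    """K = (sqrt(2*N^2 + 2*N + 1) - 1) / 2, if it is a whole number.

    Returns K, or None when no integer solution exists.
    """
    d = 2 * n * n + 2 * n + 1      # discriminant from the closed form
    r = isqrt(d)
    if r * r != d:
        return None                # not a perfect square: no integer K
    k = (r - 1) // 2
    return k if 2 * k * (k + 1) == n * (n + 1) else None
```

This is still O(1) arithmetic operations (treating big-integer ops as constant time), and it rejects the many N for which no integer K exists.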
The other answers show the analytical solutions of the equation
k * (k + 1) = n * (n + 1) / 2, where n is given
The OP needs k to be a whole number, though, and such value may not exist for every chosen n.
We can adapt Newton's method to solve this equation using only integer arithmetic.
sum_n = n * (n + 1) / 2
k = n
repeat indefinitely       // It usually needs only a few iterations, it's O(log(n))
    f_k = k * (k + 1)
    if f_k == sum_n
        k is the solution, exit
    if f_k < sum_n
        there's no k, exit
    k_n = (f_k - sum_n) / (2 * k + 1)   // Newton step: f(k)/f'(k)
    if k_n == 0
        k_n = 1                         // Avoid infinite loop
    k = k - k_n
Here is a C++ implementation.
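The pseudocode above also translates almost line for line into Python; this is my own sketch (`solve_k` is my name), not the linked C++ implementation:

```python
def solve_k(n):
    """Integer Newton iteration for k * (k + 1) == n * (n + 1) / 2.

    Returns k, or None when no integer solution exists.
    """
    sum_n = n * (n + 1) // 2
    k = n                          # start above the root and descend
    while True:
        f_k = k * (k + 1)
        if f_k == sum_n:
            return k
        if f_k < sum_n:
            return None            # overshot the target: no integer k
        step = (f_k - sum_n) // (2 * k + 1)   # Newton step f(k)/f'(k)
        k -= max(step, 1)          # force progress to avoid an infinite loop
```

Since Newton's method converges quadratically on this convex function, the loop runs O(log n) times.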
We can find all the pairs (n, k) that satisfy the equation for 0 < k < n ≤ N by adapting the algorithm posted in the question.
n = 1              // This algorithm compares 2 * k * (k + 1) and n * (n + 1)
k = 1              // It finds all the pairs (n, k) where 0 < n ≤ N in O(N)
sum_n = 1          // sum_n  = 1 + 2 + ... + n = n * (n + 1) / 2
sum_2k = 2         // sum_2k = 2 * (1 + 2 + ... + k) = k * (k + 1)
while n <= N       // Note that n / k → sqrt(2) when n → ∞
    while sum_n < sum_2k
        n = n + 1           // This inner loop requires a couple of
        sum_n = sum_n + n   // iterations, at most.
    if sum_n == sum_2k
        print n and k
    k = k + 1
    sum_2k = sum_2k + 2 * k
Here is an implementation in C++ that can find all the pairs with N < 200,000,000:
N K K * (K + 1)
----------------------------------------------
3 2 6
20 14 210
119 84 7140
696 492 242556
4059 2870 8239770
23660 16730 279909630
137903 97512 9508687656
803760 568344 323015470680
4684659 3312554 10973017315470
27304196 19306982 372759573255306
159140519 112529340 12662852473364940
Of course it becomes impractical for too large values and eventually overflows.
Besides, there's a far better way to find all those pairs (have you noticed the patterns in the sequences of the last digits?).
We can start by manipulating this Diophantine equation:
2k(k + 1) = n(n + 1)
introducing u = n + 1 → n = u - 1
            v = k + 1 → k = v - 1
2(v - 1)v = (u - 1)u
2(v^2 - v) = u^2 - u
2(4v^2 - 4v) = 4u^2 - 4u
2(4v^2 - 4v) + 2 = 4u^2 - 4u + 2
2(4v^2 - 4v + 1) = (4u^2 - 4u + 1) + 1
2(2v - 1)^2 = (2u - 1)^2 + 1
substituting x = 2u - 1 → u = (x + 1)/2
             y = 2v - 1 → v = (y + 1)/2
2y^2 = x^2 + 1
x^2 - 2y^2 = -1
Which is the negative Pell's equation for 2.
It's easy to find its fundamental solutions by inspection, x1 = 1 and y1 = 1. Those would correspond to n = k = 0, a solution of the original Diophantine equation, but not of the original problem (I'm ignoring the sums of 0 terms).
Once those are known, we can calculate all the other ones with two simple recurrence relations
x_{i+1} = x_i + 2*y_i
y_{i+1} = y_i + x_i
Note that we need to "skip" the even ys, as they would lead to non-integer solutions. So we can directly use these:
x_{i+2} = 3*x_i + 4*y_i   →   u_{i+1} = 3*u_i + 4*v_i - 3   →   n_{i+1} = 3*n_i + 4*k_i + 3
y_{i+2} = 2*x_i + 3*y_i   →   v_{i+1} = 2*u_i + 3*v_i - 2   →   k_{i+1} = 2*n_i + 3*k_i + 2
Summing up:
n k
-----------------------------------------------
3* 0 + 4* 0 + 3 = 3 2* 0 + 3* 0 + 2 = 2
3* 3 + 4* 2 + 3 = 20 2* 3 + 3* 2 + 2 = 14
3*20 + 4*14 + 3 = 119 2*20 + 3*14 + 2 = 84
...
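These recurrences can be run directly; a small Python sketch (`pell_pairs` is my name, not from the original) that reproduces the table above:

```python
def pell_pairs(count):
    """Generate (n, k) pairs with 2*k*(k+1) == n*(n+1) via the
    recurrences n' = 3n + 4k + 3 and k' = 2n + 3k + 2."""
    pairs = []
    n, k = 0, 0                    # trivial solution, not reported
    for _ in range(count):
        n, k = 3 * n + 4 * k + 3, 2 * n + 3 * k + 2
        pairs.append((n, k))
    return pairs
```

Each generated pair satisfies 2*k*(k+1) = n*(n+1) exactly, and the sequence grows geometrically (ratio 3 + 2*sqrt(2)), so all pairs up to N take only O(log N) steps.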
It seems that the problem is asking to solve the Diophantine equation
2K(K+1) = N(N+1).
By inspection, K=2, N=3 is a solution !
Note that technically this is an O(1) problem, because N has a finite value and does not vary (and if no solution exists, the dependency on N is even meaningless).
The condition you have is that the sum of 1..N is twice the sum of 1..K
So you have N(N+1) = 2K(K+1) or K^2 + K - (N^2 + N) / 2 = 0
Which means K = (-1 +/- sqrt(1 + 2(N^2 + N)))/2
Which is O(1)

Solving a recurrence relation using Smoothness Rule

Consider this recurrence relation: x(n) = x(n/2) + n, for n > 1, with x(1) = 1.
Here the method of backward substitution will struggle for values of n that are not powers of 2, so the usual advice is to apply the smoothness rule to this type of question: solve for n = 2^k (n a power of 2), which gives the solution x(n) = 2n - 1.
However, if we use the method of backward substitution, this recurrence relation will have a solution!
x(n) = x(n/2) + n = x(n/4) + n/2 + n = x(n/8) + n/4 + n/2 + n = x(n/16) + n/8 + n/4 + n/2 + n = ....
where the pattern is
x(n) = x(n/i) + n/(i/2) + n/(i/4) + n/(i/8) + n/(i/16) + ...
which will stop when n = 1 (i.e when i = n) and in this case
x(n) = x(n/n) + n/(n/2) + n/(n/4) + n/(n/8) + n/(n/16) + ... = 1 + 2 + 4 + 8 + 16 + ... = 2^(n+1) - 1
which gives two different answers!
So I am confused, because the textbook (Introduction to the Design and Analysis of Algorithms by Anany Levitin) says we should use the smoothness rule here, but as you can see I have solved it exactly by the method of backward substitution, where the method was expected to struggle, and nothing went wrong!
The transition 1 + 2 + 4 + 8 + 16 + ... = 2^(n+1) - 1 is false.
That is because the number of terms in the series on the left is log n + 1 (the powers 2^0 through 2^(log n)), so the sum is 2^(log n + 1) - 1, which is exactly 2n - 1.
The reason there are log n + 1 terms is that n/(2^i) = 1 (the last element of the series is 1) when i = log n.
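A quick numeric sanity check of that answer (a sketch; note that the closed form 2n - 1 requires the base case x(1) = 1, while x(1) = 0 would give 2n - 2):

```python
def x(n):
    # the recurrence x(n) = x(n // 2) + n for n > 1, with x(1) = 1
    return 1 if n == 1 else x(n // 2) + n
```

For every power of two, x(n) equals 2n - 1, not 2^(n+1) - 1, confirming that both solution methods agree once the term count is fixed.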

is there any O(logn) solution for [1+2a+3a^2+4a^3+....+ba^(b-1)] MOD M

Suppose we have a series summation
s = 1 + 2a + 3a^2 + 4a^3 + .... + ba^(b-1)
I need to find s MOD M, where M is a prime number and b is a relatively big integer.
I have found an O((log n)^2) divide and conquer solution.
where,
g(n) = (1 + a + a^2 + ... + a^n) MOD M
f(a, b) = [f(a, b/2) + a^(b/2) * (f(a, b/2) + (b/2) * g(b/2))] MOD M, where b is even
f(a, b) = [f(a, b/2) + a^(b/2) * (f(a, b/2) + (b/2) * g(b/2)) + b * a^(b-1)] MOD M, where b is odd
is there any O(log n) solution for this problem?
Yes. Observe that 1 + 2a + 3a^2 + ... + ba^(b-1) is the derivative in a of 1 + a + a^2 + a^3 + ... + a^b. (The field of formal power series covers a lot of tricks like this.) We can evaluate the latter with automatic differentiation using dual numbers, in O(log b) arithmetic ops. Something like this:
def fdf(a, b, m):
    if b == 0:
        return (1, 0)
    elif b % 2 == 1:
        f, df = fdf((a**2) % m, (b - 1) // 2, m)
        df *= 2 * a
        return ((1 + a) * f % m, (f + (1 + a) * df) % m)
    else:
        f, df = fdf((a**2) % m, (b - 2) // 2, m)
        df *= 2 * a
        return ((1 + (a + a**2) * f) % m,
                ((1 + 2 * a) * f + (a + a**2) * df) % m)
The answer is fdf(a, b, m)[1]. Note the use of the chain rule when we go from the derivative with respect to a**2 to the derivative with respect to a.
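For small inputs the result can be cross-checked against a direct evaluation; this brute-force reference (`s_bruteforce` is my own name, O(b log b), intended only for validation) computes the same sum:

```python
def s_bruteforce(a, b, m):
    # reference value of (1 + 2a + 3a^2 + ... + b*a^(b-1)) mod m
    return sum((i + 1) * pow(a, i, m) for i in range(b)) % m
```

fdf(a, b, m)[1] should agree with s_bruteforce(a, b, m) for all small a and b.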

time complexity for the following

int i = 1, s = 1;
while (s <= n)
{
    i++;
    s = s + i;
}
The time complexity for this is O(root(n)).
I do not understand how,
since the series goes like 1 + 2 + ... + k.
Please help.
s(k) = 1 + 2 + 3 + ... + k = (k + 1) * k / 2
For s(k) >= n you need at least k steps. n = (k + 1) * k / 2, thus k = -1/2 +- sqrt(1 + 8 * n)/2;
you ignore constants and coefficients, and O(-1/2 + sqrt(1 + 8n)/2) = O(sqrt(n))
Let the loop execute x times. Now, the loop will execute as long as s is less than n.
We have :
After 1st iteration :
s = s + 1
After 2nd iteration :
s = s + 1 + 2
As it goes on for x iterations, finally we will have
1 + 2 + ... + x <= n
=> x * (x + 1) / 2 <= n
=> x^2 = O(n)
=> x = O(root(n))
This computes the sum s(k)=1+2+...+k and stops when s(k) > n.
Since s(k)=k*(k+1)/2, the number of iterations required for s(k) to exceed n is O(sqrt(n)).
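That bound is easy to confirm empirically; a sketch (`iterations` is my own name) that instruments the loop from the question:

```python
from math import isqrt

def iterations(n):
    # count how many times the loop body "i++; s = s + i" runs
    i, s, count = 1, 1, 0
    while s <= n:
        i += 1
        s = s + i
        count += 1
    return count
```

iterations(n) tracks sqrt(2n) closely, e.g. iterations(100) = 13 while sqrt(200) ≈ 14.1, matching the O(sqrt(n)) analysis.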

Given a number N, find the number of ways to write it as a sum of two or more consecutive integers

Here is the problem, tagged dynamic-programming: given a number N, find the number of ways to write it as a sum of two or more consecutive integers. For example: 15 = 7+8 = 1+2+3+4+5 = 4+5+6.
I solved it with math like this:
a + (a + 1) + (a + 2) + (a + 3) + ... + (a + k) = N
(k + 1)*a + (1 + 2 + 3 + ... + k) = N
(k + 1)a + k(k+1)/2 = N
(k + 1)*(2*a + k)/2 = N
Then I check whether N is divisible appropriately by (k + 1) and (2*a + k); since k(k+1)/2 < N forces k = O(sqrt(N)), I can find the answer in O(sqrt(N)) time.
Here is my question: how can you solve this by dynamic programming, and what is the complexity (O)?
P.S.: excuse me if this is a duplicate question. I searched but couldn't find one.
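The O(sqrt(N)) math approach described in the question can be sketched as follows (`count_ways` is my own name; it counts k >= 1, i.e. at least two terms, with a >= 1):

```python
def count_ways(n):
    """Count representations n = a + (a+1) + ... + (a+k) with
    k >= 1 and a >= 1, using (k+1)*(2a+k)/2 = n."""
    count = 0
    k = 1
    while k * (k + 1) // 2 < n:    # beyond this, a would drop below 1
        rem = n - k * (k + 1) // 2
        if rem % (k + 1) == 0:     # a = rem / (k + 1) is a positive integer
            count += 1
        k += 1
    return count
```

The loop runs while k(k+1)/2 < n, i.e. O(sqrt(n)) iterations.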
The accepted answer was great, but the better approach wasn't clearly presented. Posting my Java code below for reference. It might be quite verbose, but it explains the idea more clearly. This assumes that the consecutive integers are all positive.
private static int count(int n) {
    int i = 1, j = 1, count = 0, sum = 1;
    while (j < n) {
        if (sum == n) { // matched, move sub-array section forward by 1
            count++;
            sum -= i;
            i++;
            j++;
            sum += j;
        } else if (sum < n) { // not matched yet, extend sub-array at end
            j++;
            sum += j;
        } else { // exceeded, reduce sub-array at start
            sum -= i;
            i++;
        }
    }
    return count;
}
We can use dynamic programming to calculate the sums of 1+2+3+...+K for all K up to N. sum[i] below represents the sum 1+2+3+...+i.
sum = [0]
for i in 1..N:
    append sum[i-1] + i to sum
With these sums we can quickly find all sequences of consecutive integers summing to N. The sum (i+1) + (i+2) + ... + j is equal to sum[j] - sum[i]. If the sum is less than N, we increment j. If the sum is greater than N, we increment i. If the sum is equal to N, we increment our counter and both i and j.
i = 0
j = 0
count = 0
while j < N:
    cur_sum = sum[j] - sum[i]
    if cur_sum == N:
        count++
    if cur_sum <= N:
        j++
    if cur_sum >= N:
        i++
There are better alternatives than using this dynamic programming solution though. The sum array can be calculated mathematically using the formula k(k+1)/2, so we could calculate it on-the-fly without need for the additional storage. Even better though, since we only ever shift the end-points of the sum we're working with by at most 1 in each iteration, we can calculate it even more efficiently on the fly by adding/subtracting the added/removed values.
i = 0
j = 0
sum = 0
count = 0
while j < N:
    cur_sum = sum
    if cur_sum == N:
        count++
    if cur_sum <= N:
        j++
        sum += j
    if cur_sum >= N:
        i++
        sum -= i
For odd N, this problem is equivalent to finding the number of divisors of N not exceeding sqrt(N). (For even N, there is a couple of twists.) That task takes O(sqrt(N)/ln(N)) if you have access to a list of primes, O(sqrt(N)) otherwise.
I don't see how dynamic programming can help here.
In order to solve the problem we will try all sums of consecutive integers in [1, M], where M is derived from M(M+1)/2 = N.
int count = 0
for i in [1, M]
    for j in [i, M]
        s = sum(i, j)   // s = i + (i+1) + ... + (j-1) + j
        if s == N
            count++
        if s >= N
            break
return count
Since we do not want to calculate sum(i, j) in every iteration from scratch, we'll use a technique known as "memoization". Let's create a matrix of integers sum[M+1][M+1] and set sum[i][j] to i + (i+1) + ... + (j-1) + j.
for i in [1, M]
    sum[i][i] = i
int count = 0
for i in [1, M]
    for j in [i + 1, M]
        sum[i][j] = sum[i][j-1] + j
        if sum[i][j] == N
            count++
        if sum[i][j] >= N
            break
return count
The complexity is obviously O(M^2), i.e. O(N)
1) For n >= 0 an integer, the sum of the integers from 0 to n is n*(n+1)/2. This is classic: write this sum first like this:
S = 0 + 1 + ... + n
and then like this:
S = n + (n-1) + ... + 0
You see that 2*S is equal to (0+n) + (1+(n-1)) + ... + (n+0) = (n+1)*n, so that S = n*(n+1)/2 indeed. (Well known, but I prefer my answer to be self-contained.)
2) From 1), if we denote by cons(m,n) the sum m + (m+1) + ... + (n-1) + n of consecutive non-negative integers such that 1 <= m <= n, we see that:
cons(m,n) = (0+1+...+n) - (0+1+...+(m-1)), which gives, from 1):
cons(m,n) = n*(n+1)/2 - m*(m-1)/2
3) The question is then recast into the following: in how many ways can we write N in the form N = cons(m,n) with m, n integers such that 1 <= m <= n? If we have N = cons(m,n), this is equivalent to m^2 - m + (2*N - n^2 - n) = 0, that is, the real polynomial T^2 - T + (2*N - n^2 - n) has a real root m: its discriminant delta must then be a square. But we have:
delta = 1 - 4*(2*N - n^2 - n)
And this delta is an integer which must be a square. There exists therefore an integer M such that:
delta = 1 - 4*(2*N - n^2 - n) = M^2
that is
M^2 = 1 - 8*N + 4*n*(n+1)
Since 8*N and 4*n*(n+1) are even, M^2 is odd, implying that M must be odd.
4) Rewrite our previous equation as:
4*n^2 + 4*n + (1 - 8*N - M^2) = 0
This shows that the real polynomial 4*X^2 + 4*X + (1 - 8*N - M^2) has a real zero, n: its discriminant gamma must therefore be a square, but:
gamma = 16 - 16*(1 - 8*N - M^2) = 16*(M^2 + 8*N)
and this must be a square, so that here again there exists an integer G such that
G^2 = M^2 + 8*N
Using M^2 = 1 - 4*(2*N - n*(n+1)) and 2*N = n*(n+1) - m*(m-1), this can be rewritten as
G^2 = 1 + 4*(2*N + m*(m-1))
which shows that, as M is odd, G is odd also.
5) Subtracting M^2 = 1 - 4*(2*N - n*(n+1)) from G^2 = 1 + 4*(2*N + m*(m-1)) yields:
G^2 - M^2 = 4*(2*N + m*(m-1)) + 4*(2*N -n*(n+1))
= 16*N + 4*( m*(m-1) - n*(n+1) )
= 16*N - 8*N (because N = cons(m,n))
= 8*N
And finally this can be rewritten as :
(G-M)*(G+M) = 8*N, that is
[(G-M)/2]*[(G+M)/2] = 2*N
where (G-M)/2 and (G+M)/2 are integers (G-M and G+M are even since G and M are odd)
6) Thus, to each way of writing N as cons(m,n), we can associate one way (and only one, as M and G are uniquely determined) to factor 2*N into a product x*y, with x = (G-M)/2 and y = (G+M)/2, where G and M are two odd integers. Since G = x + y and M = y - x, and G and M are odd, x and y must have opposite parities. Thus among x and y, one is even and the other is odd. Let c be the odd one among x and y, and d be the even one. Then 2*N = c*d, thus N = c*(d/2). So c is an odd number dividing N, uniquely determined by the pair (m,n). Conversely, as soon as N has an odd divisor, one can reverse engineer all this stuff to find n and m.
7) Conclusion: there exists a one-to-one correspondence between the ways of writing N = cons(m,n) (which, as we have seen, is the number of ways of writing N as a sum of consecutive integers) and the odd divisors of N.
8) Finally, the number we are looking for is the number of odd divisors of N. I guess that computing this is easier than solving the original problem by DP or whatever.
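This correspondence is easy to verify numerically; a brute-force sketch (both function names are mine) comparing the two counts, with the single-term sum cons(n,n) = n included on both sides:

```python
def odd_divisors(n):
    # number of odd divisors of n
    return sum(1 for d in range(1, n + 1, 2) if n % d == 0)

def consecutive_sums(n):
    # number of ways n = m + (m+1) + ... + n2 with 1 <= m <= n2,
    # single-term sums allowed (they pair with one odd divisor each)
    count = 0
    for m in range(1, n + 1):
        s = 0
        for j in range(m, n + 1):
            s += j
            if s == n:
                count += 1
            if s >= n:
                break
    return count
```

For the original "two or more terms" formulation, subtract 1 from the odd-divisor count, since the single-term sum N = N always accounts for one correspondence.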
When you think of it upside down (Swift)...
func cal(num: Int) -> Int {
    var noInt = 0
    let halfVal = Double(Double(num) / 2.0).rounded(.up)
    let endval = Int((halfVal / 2).rounded(.down))
    let halfInt: Int = Int(halfVal)
    for obj in (endval...halfInt).reversed() {
        var sum: Int = 0
        for subVal in (1...obj).reversed() {
            sum = sum + subVal
            if sum > num {
                break
            }
            if sum == num {
                noInt += 1
                break
            }
        }
    }
    return noInt
}
