Calculating a particular row of Pascal's triangle mod 10^9+7

I want to store the elements of a particular row of Pascal's triangle, mod 10^9+7, in an array. I tried to code it, but it fails somewhere when the row index is large, around 10^5.
Here is the code. I have tried to apply modular inverses and modular arithmetic; mod is 10^9+7.
void pascal_row(ll n){
memset(soo,0,MAX);
soo[0] = 1; //First element is always 1
for(ll i=1; i<n/2+1; i++){ //Progress up, until reaching the middle value
soo[i] = ( ( soo[i-1] %mod ) * ((( (n-i+1)%mod * calcInverse(i,mod)%mod) % mod ))%mod)%mod;
}
for(ll i=n/2+1; i<=n; i++){ //Copy the inverse of the first part
soo[i] = soo[n-i]%mod;
}
}
Here is what my modular inverse function looks like:
long long calcInverse(long long a, long long n)
{
long long t = 0, newt = 1;
long long r = n, newr = a;
while (newr != 0) {
auto quotient = r /newr;
tie(t, newt) = make_tuple(newt, t- quotient * newt);
tie(r, newr) = make_tuple(newr, r - quotient * newr);
}
if (r > 1)
throw runtime_error("a is not invertible");
if (t < 0)
t += n;
return t;
}
Please tell me the correct way of doing this. Thanks.

There's a relationship between the elements of Pascal's triangle and the choose function, where n choose r = n!/(r!*(n-r)!). Specifically, counting from zero, the entry in the n'th row and r'th column of Pascal's triangle is n choose r. To find a particular row, you know what n you want, so you iterate over the possible values of r (0 to n), find n choose r, and then take your modulus.
I'd recommend Java's BigInteger class for this, because it handles arbitrarily large values and so avoids any overflow errors you might be getting.
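For a row as long as 10^5, reducing modulo the prime at every step keeps every intermediate value small, so big integers are not strictly needed. A minimal sketch in Python, assuming the modulus 10^9+7 from the question (a prime, so by Fermat's little theorem the inverse of i is i^(p-2) mod p) and n smaller than the modulus; the name pascal_row_mod is made up for illustration:

```python
MOD = 10**9 + 7  # prime modulus taken from the question


def pascal_row_mod(n, mod=MOD):
    """Return [C(n, 0) % mod, ..., C(n, n) % mod]; assumes mod is prime and n < mod."""
    row = [1] * (n + 1)
    for i in range(1, n // 2 + 1):
        # C(n, i) = C(n, i-1) * (n - i + 1) / i, with the division replaced
        # by multiplication with the modular inverse of i
        inv_i = pow(i, mod - 2, mod)
        row[i] = row[i - 1] * ((n - i + 1) % mod) % mod * inv_i % mod
    # the second half mirrors the first: C(n, i) == C(n, n - i)
    for i in range(n // 2 + 1, n + 1):
        row[i] = row[n - i]
    return row
```

Calling pascal_row_mod(5) gives the familiar row 1, 5, 10, 10, 5, 1; for large n the entries are the residues rather than the true coefficients.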

Related

How to print values in memoization method - Dynamic programming

I know that a problem that can be solved using DP can be solved by either the tabulation (bottom-up) approach or the memoization (top-down) approach. Personally, I find memoization an easy and even efficient approach (analysis is required just to get the recursive formula; once the recursive formula is obtained, a brute-force recursive method can easily be converted to store the sub-problems' results and reuse them). The only problem I am facing with this approach is that I am not able to construct the actual result from the table which I filled on demand.
For example, in the Matrix Product Parenthesization problem (deciding in which order to perform the multiplications on matrices so that the cost of multiplication is minimum) I am able to calculate the minimum cost but not able to generate the order.
For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Here I am able to get the min-cost as 4500 but unable to get the order, which is (AB)C.
I used this. Suppose F[i, j] represents least number of multiplication needed to multiply Ai.....Aj and an array p[] is given which represents the chain of matrices such that the ith matrix Ai is of dimension p[i-1] x p[i]. So
$F[i,j] = \begin{cases} 0 & \text{if } i = j \\ \min_{i \le k < j} \left( F[i,k] + F[k+1,j] + p_{i-1}\, p_k\, p_j \right) & \text{if } i < j \end{cases}$
Below is the implementation that i have created.
#include<stdio.h>
#include<limits.h>
#include<string.h>
#define MAX 5 // matrices are indexed 1..n-1 (here 1..4), so the table needs n rows and columns
int lookup[MAX][MAX];
int MatrixChainOrder(int p[], int i, int j)
{
if(i==j) return 0;
int min = INT_MAX;
int k, count;
if(lookup[i][j]==0){
// recursively calculate count of multiplcations and return the minimum count
for (k = i; k<j; k++) {
int gmin=0;
if(lookup[i][k]==0)
lookup[i][k]=MatrixChainOrder(p, i, k);
if(lookup[k+1][j]==0)
lookup[k+1][j]=MatrixChainOrder(p, k+1, j);
count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < min){
min = count;
printf("\n****%d ",k); // i think something has be done here to represent the correct answer ((AB)C)D where first mat is represented by A second by B and so on.
}
}
lookup[i][j] = min;
}
return lookup[i][j];
}
// Driver program to test above function
int main()
{
int arr[] = {2,3,6,4,5};
int n = sizeof(arr)/sizeof(arr[0]);
memset(lookup, 0, sizeof(lookup));
int width =10;
printf("Minimum number of multiplications is %d ", MatrixChainOrder(arr, 1, n-1));
printf("\n ---->");
for(int l=0;l<MAX;++l)
printf(" %*d ",width,l);
printf("\n");
for(int z=0;z<MAX;z++){
printf("\n %d--->",z);
for(int x=0;x<MAX;x++)
printf(" %*d ",width,lookup[z][x]);
}
return 0;
}
I know that with the tabulation approach printing the solution is much easier, but I want to do it with the memoization technique.
Thanks.

Your code correctly computes the minimum number of multiplications, but you're struggling to display the optimal chain of matrix multiplications.
There are two possibilities:
1. When you compute the table, you can store the best index found in another memoization array.
2. You can recompute the optimal splitting points from the results in the memoization array.
The first would involve creating the split points in a separate array:
int lookup_splits[MAX][MAX];
And then updating it inside your MatrixChainOrder function:
...
if (count < min) {
min = count;
lookup_splits[i][j] = k;
}
You can then generate the multiplication chain recursively like this:
void print_mult_chain(int i, int j) {
if (i == j) {
putchar('A' + i - 1);
return;
}
putchar('(');
print_mult_chain(i, lookup_splits[i][j]);
print_mult_chain(lookup_splits[i][j] + 1, j);
putchar(')');
}
You can call the function with print_mult_chain(1, n - 1) from main.
The second possibility is that you don't cache lookup_splits and recompute it as necessary.
int get_lookup_splits(int p[], int i, int j) {
    int best = INT_MAX;
    int k_best = i; // initialised so the function never returns garbage
    for (int k = i; k < j; k++) {
        int count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
        if (count < best) {
            best = count;
            k_best = k;
        }
    }
    return k_best; // the best split point, not the loop variable k
}
This is essentially the same computation you did inside MatrixChainOrder, so if you go with this solution you should factor the code appropriately to avoid having two copies.
With this function, you can adapt print_mult_chain above to use it rather than the lookup_splits array. (You'll need to pass the p array in).
[None of this code is tested, so you may need to edit the answer to fix bugs].
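As a rough illustration of the first possibility (recording the best split while memoizing), here is a small Python sketch. It is not a translation of the C code above: the function name matrix_chain, the split dictionary and the use of functools.lru_cache for the memo table are all choices made for this example.

```python
from functools import lru_cache


def matrix_chain(p):
    """Matrix i (1-based) has shape p[i-1] x p[i]. Returns (min cost, parenthesization)."""
    n = len(p) - 1
    split = {}  # (i, j) -> best split point k, recorded while memoizing

    @lru_cache(maxsize=None)
    def cost(i, j):
        if i == j:
            return 0
        best = None
        for k in range(i, j):
            c = cost(i, k) + cost(k + 1, j) + p[i - 1] * p[k] * p[j]
            if best is None or c < best:
                best = c
                split[(i, j)] = k  # remember where the best split was
        return best

    def render(i, j):
        # rebuild the order from the recorded split points
        if i == j:
            return chr(ord("A") + i - 1)
        k = split[(i, j)]
        return "(" + render(i, k) + render(k + 1, j) + ")"

    total = cost(1, n)
    return total, render(1, n)
```

For the example in the question, matrix_chain([10, 30, 5, 60]) returns the cost 4500 together with the string ((AB)C).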

Efficiently calculate function

Given
f(n) = 1+x+x^2+x^3+…+x^n, (n >= 0 && n is an integer)
input x, n, how can we work out the result with a greater efficiency?

It's a geometric progression. Noting that
(x-1)f(n) = x^{n+1}-1
you get
f(n)=(x^{n+1}-1)/(x-1) for x ≠ 1; for x = 1 the sum is simply f(n) = n+1.

The loop below does n multiplies and n increments. It's easy to put the sum into closed form, but computing the closed form requires evaluating x^(n+1), which could also end up doing n multiplies, and unlike the loop it requires a divide.
Although this is actually valid C, think of it as pseudocode. A real implementation would check for negative n rather than looping through half the int numberspace. If you needed to apply this to an integer x rather than a floating point x, this would definitely be the way to go.
double polysum(int n, double x) {
double a = 1;
while (n--) a = x * a + 1;
return a;
}
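For integer x the two approaches can be compared side by side. The following Python sketch is only illustrative (the function names are invented here): the first function is the same Horner-style loop, the second uses the closed form with an exact integer division and handles x = 1 separately.

```python
def geometric_sum(x, n):
    """1 + x + ... + x**n via Horner's rule: n multiplies, no division."""
    total = 1
    for _ in range(n):
        total = total * x + 1
    return total


def geometric_sum_closed(x, n):
    """Closed form for integer x; x == 1 needs its own branch to avoid dividing by zero."""
    if x == 1:
        return n + 1
    # (x**(n+1) - 1) is always divisible by (x - 1), so // is exact here
    return (x ** (n + 1) - 1) // (x - 1)
```

Both return 2047 for x = 2, n = 10.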

public class Test {
public static void main(String args[]) {
int x = 2, n = 10;
double sum = 0; // a primitive double avoids the deprecated Double constructor and repeated boxing
for (int i = 0 ; i <= n ; i++) {
sum = sum + Math.pow(x, i);
}
System.out.println(sum);
}
}

how to calculate combination of large numbers

I calculated the permutation of numbers as:
nPr = n!/(n-r)!
where n and r are given, with
1 <= n,r <= 100
I find p = (n-r)+1
and then
for(i=n;i>=p;i--)
multiply digit by digit and store the result in an array.
But how do I calculate nCr = n!/[r! * (n-r)!] for the same range?
I did this using recursion as follows:
#include <stdio.h>
typedef unsigned long long i64;
i64 dp[100][100];
i64 nCr(int n, int r)
{
if(n==r) return dp[n][r] = 1;
if(r==0) return dp[n][r] = 1;
if(r==1) return dp[n][r] = (i64)n;
if(dp[n][r]) return dp[n][r];
return dp[n][r] = nCr(n-1,r) + nCr(n-1,r-1);
}
int main()
{
int n, r;
while(scanf("%d %d",&n,&r)==2)
{
r = (r<n-r)? r : n-r;
printf("%llu\n",nCr(n,r));
}
return 0;
}
but the range is n <= 100, and this is not working for n > 60.

Consider using a BigInteger type of class to represent your big numbers. BigInteger is available in Java and C# (version 4+ of the .NET Framework). From your question, it looks like you are using C++ (which you should always add as a tag). So try looking here and here for a usable C++ BigInteger class.
One of the best methods for calculating the binomial coefficient I have seen suggested is by Mark Dominus. It is much less likely to overflow with larger values for N and K than some other methods.
static long GetBinCoeff(long N, long K)
{
// This function gets the total number of unique combinations based upon N and K.
// N is the total number of items.
// K is the size of the group.
// Total number of unique combinations = N! / ( K! (N - K)! ).
// This function is less efficient, but is more likely to not overflow when N and K are large.
// Taken from: http://blog.plover.com/math/choose.html
//
if (K > N) return 0;
long r = 1;
long d;
for (d = 1; d <= K; d++)
{
r *= N--;
r /= d;
}
return r;
}
Just replace all the long definitions with BigInt and you should be good to go.
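To show the same multiply-then-divide idea with arbitrary-precision integers, here is a short Python sketch (Python's built-in int plays the role of BigInt; the function name binomial is chosen for this example only):

```python
def binomial(n, k):
    """Exact C(n, k) by multiplying first and dividing second at every step."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)  # C(n, k) == C(n, n - k), so use the smaller one
    result = 1
    for d in range(1, k + 1):
        # after this step result == C(n, d), so the division is always exact
        result = result * (n - d + 1) // d
    return result
```

With this, binomial(100, 50) evaluates to 100891344545564193334812497256, well beyond what a 64-bit integer can hold.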

Fastest way to generate binomial coefficients

I need to calculate combinations for a number.
What is the fastest way to calculate nCp where n>>p?
I need a fast way to generate the binomial coefficients of a polynomial expansion, and I need to get the coefficients of all the terms and store them in an array.
(a+b)^n = a^n + nC1 a^(n-1) * b + nC2 a^(n-2) * b^2 + ...
+ nC(n-1) a * b^(n-1) + b^n
What is the most efficient way to calculate nCp?

You can use dynamic programming to generate binomial coefficients.
You can create an array and then use an O(N^2) loop to fill it:
C[n, k] = C[n-1, k-1] + C[n-1, k];
where
C[n, 0] = C[n, n] = 1
After that, in your program you can get the C(n, k) value just by looking at your 2D array at the [n, k] indices.
UPDATE: something like this:
for (int k = 1; k <= K; k++) C[0][k] = 0;
for (int n = 0; n <= N; n++) C[n][0] = 1;
for (int n = 1; n <= N; n++)
for (int k = 1; k <= K; k++)
C[n][k] = C[n-1][k-1] + C[n-1][k];
where N and K are the maximum values of your n and k.
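The same table can be sketched in Python, where integers do not overflow. This is only an illustration of Pascal's rule; the name binomial_table is made up here:

```python
def binomial_table(max_n):
    """table[n][k] == C(n, k) for 0 <= k <= n <= max_n; entries with k > n stay 0."""
    table = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    for n in range(max_n + 1):
        table[n][0] = 1
        for k in range(1, n + 1):
            # Pascal's rule: each entry is the sum of the two entries above it
            table[n][k] = table[n - 1][k - 1] + table[n - 1][k]
    return table
```

Building the table costs O(N^2) time and memory once; afterwards every lookup such as binomial_table(10)[10][3] is a constant-time array access.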

If you need to compute them for all n, Ribtoks's answer is probably the best.
For a single n, you're better off doing it like this:
C[0] = 1;
for (int k = 0; k < n; ++k)
    C[k+1] = (C[k] * (n-k)) / (k+1);
The division is exact if done after the multiplication.
And beware of overflow in C[k] * (n-k): use large enough integers.

If you want complete expansions for large values of n, FFT convolution might be the fastest way. In the case of a binomial expansion with equal coefficients (e.g. a series of fair coin tosses) and an even order (e.g. number of tosses) you can exploit symmetries thus:
Theory
Represent the results of two coin tosses (e.g. half the difference between the total number of heads and tails) with the expression A + A*cos(Pi*n/N). N is the number of samples in your buffer - a binomial expansion of even order O will have O+1 coefficients and require a buffer of N >= O/2 + 1 samples - n is the sample number being generated, and A is a scale factor that will usually be either 2 (for generating binomial coefficients) or 0.5 (for generating a binomial probability distribution).
Notice that, in frequency, this expression resembles the binomial distribution of those two coin tosses - there are three symmetrical spikes at positions corresponding to the number (heads-tails)/2. Since modelling the overall probability distribution of independent events requires convolving their distributions, we want to convolve our expression in the frequency domain, which is equivalent to multiplication in the time domain.
In other words, by raising our cosine expression for the result of two tosses to a power (e.g. to simulate 500 tosses, raise it to the power of 250 since it already represents a pair), we can arrange for the binomial distribution for a large number to appear in the frequency domain. Since this is all real and even, we can substitute the DCT-I for the DFT to improve efficiency.
Algorithm
1. Decide on a buffer size, N, that is at least O/2 + 1 and can be conveniently DCTed.
2. Initialise it with the expression pow(A + A*cos(Pi*n/N),O/2).
3. Apply the forward DCT-I.
4. Read out the coefficients from the buffer: the first number is the central peak where heads=tails, and subsequent entries correspond to symmetrical pairs successively further from the centre.
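The steps above can be tried out on a small scale with the following Python sketch. It makes several assumptions that are not in the answer: A is fixed at 2, the buffer holds N+1 samples including both endpoints (the usual DCT-I layout), N is chosen as O/2 + 1, and the transform is a naive O(N^2) DCT-I written out by hand rather than an FFT-based one, so it demonstrates the idea and not the speed.

```python
import math


def binomials_via_dct(order):
    """Return [C(order, order/2 + d) for d = 0..order/2] for an even order."""
    if order % 2:
        raise ValueError("order must be even")
    m = order // 2
    N = m + 1  # number of intervals; N > m keeps the transform free of aliasing
    # Sample (2 + 2*cos(theta))**m at theta = pi*n/N for n = 0..N.
    samples = [(2 + 2 * math.cos(math.pi * n / N)) ** m for n in range(N + 1)]
    coeffs = []
    for d in range(m + 1):
        # Naive DCT-I: the two end samples carry half weight.
        acc = 0.5 * samples[0] + 0.5 * samples[N] * math.cos(math.pi * d)
        for n in range(1, N):
            acc += samples[n] * math.cos(math.pi * d * n / N)
        # Round away the floating-point noise to recover the integer coefficient.
        coeffs.append(round(acc / N))
    return coeffs
```

The first entry is the central coefficient and the following entries move outwards, so the full row is obtained by mirroring the list around its first element.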
Accuracy
There's a limit to how high O can be before accumulated floating-point rounding errors rob you of accurate integer values for the coefficients, but I'd guess the number is pretty high. Double-precision floating-point can represent 53-bit integers with complete accuracy, and I'm going to ignore the rounding loss involved in the use of pow() because the generating expression will take place in FP registers, giving us an extra 11 bits of mantissa to absorb the rounding error on Intel platforms. So assuming we use a 1024-point DCT-I implemented via the FFT, that means losing 10 bits' accuracy to rounding error during the transform and not much else, leaving us with ~43 bits of clean representation. I don't know what order of binomial expansion generates coefficients of that size, but I dare say it's big enough for your needs.
Asymmetrical expansions
If you want the asymmetrical expansions for unequal coefficients of a and b, you'll need to use a two-sided (complex) DFT and a complex pow() function. Generate the expression A*A*e^(-Pi*i*n/N) + A*B + B*B*e^(+Pi*i*n/N) [using the complex pow() function to raise it to the power of half the expansion order] and DFT it. What you have in the buffer is, again, the central point (but not the maximum if A and B are very different) at offset zero, and it is followed by the upper half of the distribution. The upper half of the buffer will contain the lower half of the distribution, corresponding to heads-minus-tails values that are negative.
Notice that the source data is Hermitian symmetrical (the second half of the input buffer is the complex conjugate of the first), so this algorithm is not optimal and can be performed using a complex-to-complex FFT of half the required size for optimum efficiency.
Needless to say, all the complex exponentiation will chew more CPU time and hurt accuracy compared to the purely real algorithm for symmetrical distributions above.

This is my version:
def binomial(n, k):
    if k == 0:
        return 1
    elif 2*k > n:
        return binomial(n, n-k)
    else:
        e = n-k+1
        for i in range(2, k+1):
            e *= (n-k+i)
            e //= i  # integer division is exact here and keeps e an int in Python 3
        return e

I recently wrote a piece of code that needed to call for a binomial coefficient about 10 million times. So I did a combination lookup-table/calculation approach that's still not too wasteful of memory. You might find it useful (and my code is in the public domain). The code is at
http://www.etceterology.com/fast-binomial-coefficients
It's been suggested that I inline the code here. A big honking lookup table seems like a waste, so here's the final function, and a Python script that generates the table:
#include <assert.h>
extern long long bctable[]; /* See below */
long long binomial(int n, int k) {
int i;
long long b;
assert(n >= 0 && k >= 0);
if (0 == k || n == k) return 1LL;
if (k > n) return 0LL;
if (k > (n - k)) k = n - k;
if (1 == k) return (long long)n;
if (n <= 54 && k <= 54) {
return bctable[(((n - 3) * (n - 3)) >> 2) + (k - 2)];
}
/* Last resort: actually calculate */
b = 1LL;
for (i = 1; i <= k; ++i) {
b *= (n - (k - i));
if (b < 0) return -1LL; /* Overflow */
b /= i;
}
return b;
}
#!/usr/bin/env python3
import sys

class App(object):
    def __init__(self, max):
        self.table = [[0 for k in range(max + 1)] for n in range(max + 1)]
        self.max = max

    def build(self):
        for n in range(self.max + 1):
            for k in range(self.max + 1):
                if k == 0: b = 1
                elif k > n: b = 0
                elif k == n: b = 1
                elif k == 1: b = n
                elif k > n-k: b = self.table[n][n-k]
                else:
                    b = self.table[n-1][k] + self.table[n-1][k-1]
                self.table[n][k] = b

    def output(self, val):
        if val > 2**63: val = -1
        text = " {0}LL,".format(val)
        if self.column + len(text) > 76:
            print("\n ", end = "")
            self.column = 3
        print(text, end = "")
        self.column += len(text)

    def dump(self):
        count = 0
        print("long long bctable[] = {", end="")
        self.column = 999
        for n in range(self.max + 1):
            for k in range(self.max + 1):
                if n < 4 or k < 2 or k > n-k:
                    continue
                self.output(self.table[n][k])
                count += 1
        print("\n}}; /* {0} Entries */".format(count))

    def run(self):
        self.build()
        self.dump()
        return 0

def main(args):
    return App(54).run()

if __name__ == "__main__":
    sys.exit(main(sys.argv))

If you really only need the case where n is much larger than p, one way to go would be to use Stirling's formula for the factorials (if n >> 1 and p is of order one, approximate n! and (n-p)! with Stirling's formula and keep p! as it is).
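A minimal sketch of that idea in Python, assuming the common form of Stirling's approximation ln n! ≈ n ln n - n + (1/2) ln(2πn) and working with logarithms to avoid overflow; the function names are invented for this example and the result is an approximation, not an exact integer:

```python
import math


def log_factorial_stirling(n):
    """Stirling's approximation to ln(n!), intended for large n (n >= 1)."""
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)


def binomial_stirling(n, p):
    """Approximate C(n, p) for n >> p: Stirling for n! and (n-p)!, exact p!."""
    log_ratio = log_factorial_stirling(n) - log_factorial_stirling(n - p)
    return math.exp(log_ratio) / math.factorial(p)
```

For n = 1000 and p = 3 this lands very close to the exact value 166167000, because the Stirling errors of n! and (n-p)! largely cancel in the ratio.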

The fastest reasonable approximation in my own benchmarking is the approximation used by the Apache Commons Maths library: http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/special/Gamma.html#logGamma(double)
My colleagues and I tried to see if we could beat it, while using exact calculations rather than approximates. All approaches failed miserably (many orders slower) except one, which was 2-3 times slower. The best performing approach uses https://math.stackexchange.com/a/202559/123948, here is the code (in Scala):
var i: Int = 0
var binCoeff: Double = 1
while (i < k) {
binCoeff *= (n - i) / (k - i).toDouble
i += 1
}
binCoeff
The really bad approaches were various attempts at implementing Pascal's Triangle using tail recursion.

nCp = n! / ( p! (n-p)! ) =
( n * (n-1) * (n-2) * ... * (n - p) * (n - p - 1) * ... * 1 ) /
( p * (p-1) * ... * 1 * (n - p) * (n - p - 1) * ... * 1 )
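If we prune the same terms of the numerator and the denominator, we are left with the minimal multiplication required. We can write a function in C to perform 2p multiplications and 1 division to get nCp. Note that num is a plain int here, so with a 32-bit int it overflows for fairly small inputs (for example n = 20, p = 10):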
int binom ( int p, int n ) {
if ( p == 0 ) return 1;
int num = n;
int den = p;
while ( p > 1 ) {
p--;
num *= n - p;
den *= p;
}
return num / den;
}

I was looking for the same thing and couldn't find it, so I wrote one myself that seems optimal for any binomial coefficient for which the end result fits into a long.
// Calculate Binomial Coefficient
// Jeroen B.P. Vuurens
public static long binomialCoefficient(int n, int k) {
// take the lowest possible k to reduce computing using: n over k = n over (n-k)
k = java.lang.Math.min( k, n - k );
// holds the high number: fi. (1000 over 990) holds 991..1000
long highnumber[] = new long[k];
for (int i = 0; i < k; i++)
highnumber[i] = n - i; // the high number first order is important
// holds the dividers: fi. (1000 over 990) holds 2..10
int dividers[] = new int[k - 1];
for (int i = 0; i < k - 1; i++)
dividers[i] = k - i;
// for every dividers there is always exists a highnumber that can be divided by
// this, the number of highnumbers being a sequence that equals the number of
// dividers. Thus, the only trick needed is to divide in reverse order, so
// divide the highest divider first trying it on the highest highnumber first.
// That way you do not need to do any tricks with primes.
for (int divider: dividers) {
boolean eliminated = false;
for (int i = 0; i < k; i++) {
if (highnumber[i] % divider == 0) {
highnumber[i] /= divider;
eliminated = true;
break;
}
}
if(!eliminated) throw new Error(n+","+k+" divider="+divider);
}
// multiply remainder of highnumbers
long result = 1;
for (long high : highnumber)
result *= high;
return result;
}

If I understand the notation in the question, you don't just want nCp, you actually want all of nC1, nC2, ... nC(n-1). If this is correct, we can leverage the following relationship to make this fairly trivial:
for all k>0: nCk = prod_{from i=1..k}( (n-i+1)/i )
i.e. for all k>0: nCk = nC(k-1) * (n-k+1) / k
Here's a python snippet implementing this approach:
def binomial_coef_seq(n, k):
    """Returns a list of all binomial terms from choose(n,0) up to choose(n,k)"""
    b = [1]
    for i in range(1, k+1):
        # multiply first, then divide: the integer division is exact
        b.append(b[-1] * (n-i+1) // i)
    return b
If you need all coefficients up to some k > ceiling(n/2), you can use symmetry to reduce the number of operations you need to perform by stopping at the coefficient for ceiling(n/2) and then just backfilling as far as you need.
def binomial_coef_seq2(n, k):
    """Returns a list of all binomial terms from choose(n,0) up to choose(n,k), for k <= n"""
    k2 = (n + 1) // 2  # ceiling(n/2) without needing numpy
    b = [1]
    # compute directly only up to the middle of the row (or up to k if that is smaller)
    for i in range(1, min(k, k2) + 1):
        b.append(b[-1] * (n-i+1) // i)
    # backfill the rest by symmetry: choose(n, i) == choose(n, n-i)
    for i in range(k2 + 1, k + 1):
        b.append(b[n - i])
    return b

Time Complexity : O(denominator)
Space Complexity : O(1)
public class binomialCoeff {
static double binomialcoeff(int numerator, int denominator)
{
double res = 1;
//invalid numbers
if (denominator>numerator || denominator<0 || numerator<0) {
res = -1;
return res;}
//default values
if(denominator==numerator || denominator==0 || numerator==0)
return res;
// Since C(n, k) = C(n, n-k)
if ( denominator > (numerator - denominator) )
denominator = numerator - denominator;
// Calculate value of [n * (n-1) *---* (n-k+1)] / [k * (k-1) *----* 1]
while (denominator>=1)
{
res *= numerator;
res = res / denominator;
denominator--;
numerator--;
}
return res;
}
/* Driver program to test above function*/
public static void main(String[] args)
{
int numerator = 120;
int denominator = 20;
System.out.println("Value of C("+ numerator + ", " + denominator+ ") "
+ "is" + " "+ binomialcoeff(numerator, denominator));
}
}

3-PARTITION problem

here is another dynamic programming question (Vazirani ch6)
Consider the following 3-PARTITION problem. Given integers a1...an, we want to determine whether it is possible to partition {1...n} into three disjoint subsets I, J, K such that
sum(I) = sum(J) = sum(K) = 1/3*sum(ALL)
For example, for input (1; 2; 3; 4; 4; 5; 8) the answer is yes, because there is the partition (1; 8), (4; 5), (2; 3; 4). On the other hand, for input (2; 2; 3; 5) the answer is no. Devise and analyze a dynamic programming algorithm for 3-PARTITION that runs in time polynomial in n and (Sum a_i).
How can I solve this problem? I know 2-partition but still can't solve it

It's easy to generalize the 2-set solution to the 3-set case.
In the original version, you create an array of booleans, sums, where sums[i] tells whether the sum i can be reached with numbers from the set or not. Then, once the array is created, you just check whether sums[TOTAL/2] is true or not.
Since you said you already know the old version, I'll describe only the difference between them.
In the 3-partition case, you keep a two-dimensional array of booleans, sums, where sums[i][j] tells whether the first set can have sum i and the second sum j. Then, once the array is created, you just check whether sums[TOTAL/3][TOTAL/3] is true or not.
If the original complexity is O(TOTAL*n), here it's O(TOTAL^2*n).
It may not be polynomial in the strictest sense of the word, but then the original version isn't strictly polynomial either :)
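A small Python sketch of this two-dimensional table, assuming non-negative integers as input. The names can_three_partition and reachable are made up here, and the table is capped at TOTAL/3 in each dimension because larger sums can never be part of a solution:

```python
def can_three_partition(nums):
    """True if nums can be split into three groups with equal sums."""
    total = sum(nums)
    if total % 3:
        return False
    target = total // 3
    # reachable[i][j]: two disjoint groups with sums i and j can be formed
    reachable = [[False] * (target + 1) for _ in range(target + 1)]
    reachable[0][0] = True
    for a in nums:
        # go downwards so that each number is used at most once per pass
        for i in range(target, -1, -1):
            for j in range(target, -1, -1):
                if reachable[i][j]:
                    if i + a <= target:
                        reachable[i + a][j] = True
                    if j + a <= target:
                        reachable[i][j + a] = True
    # if two groups reach target, the leftover numbers form the third
    return reachable[target][target]
```

On the inputs from the question it answers yes for (1; 2; 3; 4; 4; 5; 8) and no for (2; 2; 3; 5).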

I think by reduction it goes like this:
Reducing 2-partition to 3-partition:
Let S be the original set and A its total sum, then let S' = union({A/2}, S).
Performing a 3-partition on the set S' yields three sets X, Y, Z.
Among X, Y, Z, one of them must be {A/2}; say it's set Z. Then X and Y form a 2-partition of S.
A witness of a 3-partition of S' is also a witness of a 2-partition of S, thus 2-partition reduces to 3-partition.

If this problem is to be solvable, then sum(ALL)/3 must be an integer. Any solution must have SUM(J) + SUM(K) = SUM(I) + sum(ALL)/3. This represents a solution to the 2-partition problem over concat(ALL, {sum(ALL)/3}).
You say you have a 2-partition implementation: use it to solve that problem. Then (at least) one of the two partitions will contain the number sum(ALL)/3 - remove the number from that partition, and you've found I. For the other partition, run 2-partition again, to split J from K; after all, J and K must be equal in sum themselves.
Edit: This solution is probably incorrect - the 2-partition of the concatenated set will have several solutions (at least one for each of I, J, K) - however, if there are other solutions, then the "other side" may not consist of the union of two of I, J, K, and may not be splittable at all. You'll need to actually think, I fear :-).
Try 2: Iterate over the multiset, maintaining the following map: R(i,j,k) :: Boolean, which represents whether, up to the current iteration, the numbers permit division into three multisets that have sums i, j, k. I.e., for any R(i,j,k) that holds and next number n, in the next state R' it holds that R'(i+n,j,k) and R'(i,j+n,k) and R'(i,j,k+n). Note that the complexity (as per the exercise) depends on the magnitude of the input numbers; this is a pseudo-polynomial-time algorithm. Nikita's solution is conceptually similar but more efficient than this solution since it doesn't track the third set's sum: that's unnecessary since you can trivially compute it.

As I answered in another similar question, a C++ implementation would look something like this:
#include <numeric> // accumulate
#include <vector>
using namespace std;
int partition3(vector<int> &A)
{
    int sum = accumulate(A.begin(), A.end(), 0);
    if (sum % 3 != 0)
    {
        return false;
    }
    int size = A.size();
    vector<vector<int>> dp(sum + 1, vector<int>(sum + 1, 0));
    dp[0][0] = true;
    // process the numbers one by one
    for (int i = 0; i < size; i++)
    {
        for (int j = sum; j >= 0; --j)
        {
            for (int k = sum; k >= 0; --k)
            {
                if (dp[j][k])
                {
                    // guard the indices: j + A[i] or k + A[i] can exceed sum
                    if (j + A[i] <= sum) dp[j + A[i]][k] = true;
                    if (k + A[i] <= sum) dp[j][k + A[i]] = true;
                }
            }
        }
    }
    return dp[sum / 3][sum / 3];
}

Let's say you want to partition the sequence $X = \{x_1, ..., x_n\}$ into $k$ parts.
Create an $n \times k$ table. Let the cost $M[i,j]$ be the smallest achievable value of the largest part sum when the first $i$ elements are split, in order, into $j$ parts. Recursively use the following optimality criterion to fill it:
$M[n,k] = \min_{i \leq n} \max ( M[i, k-1], \sum_{j=i+1}^{n} x_j )$
Using these initial values for the table:
$M[i,1] = \sum_{j=1}^{i} x_j$ and $M[1,j] = x_1$
The running time is $O(kn^2)$ (polynomial).
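Reading the recurrence as the contiguous (linear) partition problem, which is an interpretation and not something the answer states outright, it could be sketched in Python as follows. The function name, the prefix-sum helper and the choice of exactly k non-empty blocks (so k must not exceed the number of elements) are assumptions of this example:

```python
def linear_partition_cost(xs, k):
    """Smallest possible largest-block sum when xs is cut into exactly k contiguous blocks."""
    n = len(xs)
    prefix = [0]
    for x in xs:
        prefix.append(prefix[-1] + x)
    INF = float("inf")
    # M[i][j]: best cost for the first i elements using j blocks
    M = [[INF] * (k + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            for s in range(i):  # the last block is xs[s:i]
                cand = max(M[s][j - 1], prefix[i] - prefix[s])
                if cand < M[i][j]:
                    M[i][j] = cand
    return M[n][k]
```

The three nested loops give the O(kn^2) running time mentioned above. Note that this minimises the largest block rather than deciding whether an exact three-way split exists.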

Create a three-dimensional array, where size is the count of elements plus one, and part is the sum of all elements divided by 3, plus one. Each cell array[seq][sum1][sum2] tells whether you can create sum1 and sum2 using at most the first seq elements of the given array A[]. Compute all values of the array; the result will be in the cell array[using all elements][sum of all elements / 3][sum of all elements / 3]. If you can create two disjoint sets each summing to sum/3, the remaining elements form the third set.
Logic of checking: exclude the A[seq] element (assign it to the third sum, which is not stored) and check whether the cell without that element has the same two sums; OR include it in sum1, if it is possible to get two sets without the seq element where sum1 is smaller by the value of A[seq] and sum2 is unchanged; OR include it in sum2, checked like the previous case.
int partition3(vector<int> &A)
{
int part=0;
for (int a : A)
part += a;
if (part%3)
return 0;
int size = A.size()+1;
part = part/3+1;
bool array[size][part][part];
//sequence from 0 integers inside to all inside
for(int seq=0; seq<size; seq++)
for(int sum1=0; sum1<part; sum1++)
for(int sum2=0;sum2<part; sum2++) {
bool curRes;
if (seq==0)
if (sum1 == 0 && sum2 == 0)
curRes = true;
else
curRes= false;
else {
int curInSeq = seq-1;
bool excludeFrom = array[seq-1][sum1][sum2];
bool includeToSum1 = (sum1>=A[curInSeq]
&& array[seq-1][sum1-A[curInSeq]][sum2]);
bool includeToSum2 = (sum2>=A[curInSeq]
&& array[seq-1][sum1][sum2-A[curInSeq]]);
curRes = excludeFrom || includeToSum1 || includeToSum2;
}
array[seq][sum1][sum2] = curRes;
}
int result = array[size-1][part-1][part-1];
return result;
}

Another example in C++ (based on the previous answers):
bool partition3(vector<int> const &A) {
int sum = 0;
for (int i = 0; i < A.size(); i++) {
sum += A[i];
}
if (sum % 3 != 0) {
return false;
}
vector<vector<vector<int>>> E(A.size() + 1, vector<vector<int>>(sum / 3 + 1, vector<int>(sum / 3 + 1, 0)));
for (int i = 1; i <= A.size(); i++) {
for (int j = 0; j <= sum / 3; j++) {
for (int k = 0; k <= sum / 3; k++) {
E[i][j][k] = E[i - 1][j][k];
if (A[i - 1] <= k) {
E[i][j][k] = max(E[i][j][k], E[i - 1][j][k - A[i - 1]] + A[i - 1]);
}
if (A[i - 1] <= j) {
E[i][j][k] = max(E[i][j][k], E[i - 1][j - A[i - 1]][k] + A[i - 1]);
}
}
}
}
return (E.back().back().back() / 2 == sum / 3);
}

You really want Korf's Complete Karmarkar-Karp algorithm (http://ac.els-cdn.com/S0004370298000861/1-s2.0-S0004370298000861-main.pdf, http://ijcai.org/papers09/Papers/IJCAI09-096.pdf). A generalization to three-partitioning is given. The algorithm is surprisingly fast given the complexity of the problem, but requires some implementation.
The essential idea of KK is to ensure that large blocks of similar size appear in different partitions. One groups pairs of blocks, which can then be treated as a smaller block of size equal to the difference in sizes that can be placed as normal: by doing this recursively, one ends up with small blocks that are easy to place. One then does a two-coloring of the block groups to ensure that the opposite placements are handled. The extension to 3-partition is a bit complicated. The Korf extension is to use depth-first search in KK order to find all possible solutions or to find a solution quickly.
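To make the differencing idea concrete, here is a Python sketch of only the basic two-way Karmarkar-Karp heuristic. It is not Korf's complete algorithm and not the three-way extension; the function name is invented, and the result is the difference the heuristic reaches, which is not guaranteed to be optimal.

```python
import heapq


def karmarkar_karp_difference(nums):
    """Two-way differencing heuristic: returns the final difference between the two sides."""
    heap = [-x for x in nums]  # negate values to get a max-heap from heapq
    heapq.heapify(heap)
    while len(heap) > 1:
        largest = -heapq.heappop(heap)
        second = -heapq.heappop(heap)
        # commit the two largest numbers to opposite sides;
        # from now on only their difference matters
        heapq.heappush(heap, -(largest - second))
    return -heap[0] if heap else 0
```

For the numbers 8, 7, 6, 5, 4 the heuristic ends with a difference of 2, although a perfect split (8+7 against 6+5+4) exists; closing that gap is what the complete search in KK order is for.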
