how to calculate combination of large numbers - algorithm

I calculate the number of permutations as
nPr = n!/(n-r)!
where n and r are given and 1 <= n, r <= 100.
I find p = (n-r)+1, and then
for(i=n;i>=p;i--)
multiply digit by digit and store the result in an array.
But how can I calculate nCr = n!/[r! * (n-r)!] for the same range?
I tried this using recursion as follows:
#include <stdio.h>
typedef unsigned long long i64;
i64 dp[100][100];
i64 nCr(int n, int r)
{
if(n==r) return dp[n][r] = 1;
if(r==0) return dp[n][r] = 1;
if(r==1) return dp[n][r] = (i64)n;
if(dp[n][r]) return dp[n][r];
return dp[n][r] = nCr(n-1,r) + nCr(n-1,r-1);
}
int main()
{
int n, r;
while(scanf("%d %d",&n,&r)==2)
{
r = (r<n-r)? r : n-r;
printf("%llu\n",nCr(n,r));
}
return 0;
}
But the range is n <= 100, and this does not work for n > 60.

Consider using a BigInteger type of class to represent your big numbers. BigInteger is available in Java and C# (version 4+ of the .NET Framework). From your question, it looks like you are using C++ (which you should always add as a tag). So try looking here and here for a usable C++ BigInteger class.
One of the best methods for calculating the binomial coefficient I have seen suggested is by Mark Dominus. It is much less likely to overflow with larger values for N and K than some other methods.
static long GetBinCoeff(long N, long K)
{
    // This function gets the total number of unique combinations based upon N and K.
    // N is the total number of items.
    // K is the size of the group.
    // Total number of unique combinations = N! / ( K! (N - K)! ).
    // This function is less efficient, but is more likely to not overflow when N and K are large.
    // Taken from: http://blog.plover.com/math/choose.html
    //
    if (K > N) return 0;
    long r = 1;
    long d;
    for (d = 1; d <= K; d++)
    {
        r *= N--;
        r /= d;
    }
    return r;
}
Just replace all the long definitions with BigInt and you should be good to go.
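As a rough illustration of that substitution, here is a minimal C++ sketch that runs the same multiply-then-divide loop on a decimal digit array instead of a library BigInt. The names Digits, mulSmall, divSmall and binomial are made up for this example, and it assumes n is small enough (such as the n <= 100 in the question) that every factor fits comfortably in an int.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Little-endian decimal digits: digits[0] is the ones place.
using Digits = std::vector<int>;

// Multiply the digit array in place by a small positive integer.
void mulSmall(Digits& x, int factor) {
    int carry = 0;
    for (int& d : x) {
        int cur = d * factor + carry;
        d = cur % 10;
        carry = cur / 10;
    }
    while (carry > 0) {
        x.push_back(carry % 10);
        carry /= 10;
    }
}

// Divide the digit array in place by a small positive integer.
// The division is expected to be exact.
void divSmall(Digits& x, int divisor) {
    int rem = 0;
    for (std::size_t i = x.size(); i-- > 0;) {
        int cur = rem * 10 + x[i];
        x[i] = cur / divisor;
        rem = cur % divisor;
    }
    assert(rem == 0);
    while (x.size() > 1 && x.back() == 0) x.pop_back();
}

// n choose k as a decimal string, using r = r * (n - d + 1) / d for d = 1..k.
// After step d the value is exactly C(n, d), so each division is exact.
std::string binomial(int n, int k) {
    assert(n >= 0 && k >= 0);
    if (k > n) return "0";
    if (k > n - k) k = n - k;  // symmetry keeps the loop short
    Digits r{1};
    for (int d = 1; d <= k; ++d) {
        mulSmall(r, n - d + 1);
        divSmall(r, d);
    }
    std::string out;
    for (auto it = r.rbegin(); it != r.rend(); ++it) out.push_back(char('0' + *it));
    return out;
}
```

Calling binomial(100, 50) returns the 30-digit value as a string, which is well beyond what unsigned long long can hold.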


Count integer partitions with k parts, each below some threshold m

I want to count the number of ways we can partition the number n into k distinct parts, where each part is not larger than m.
For k := 2 I have the following algorithm:
public int calcIntegerPartition(int n, int k, int m) {
    int cnt=0;
    for(int i=1; i <= m;i++){
        for(int j=i+1; j <= m; j++){
            if(i+j == n){
                cnt++;
                break;
            }
        }
    }
    return cnt;
}
But how can I count integer partitions with k > 2? Usually I have n > 100000, k := 40, m < 10000.
Thank you in advance.
Let's start by choosing the k largest legal numbers: m, m-1, m-2, ..., m-(k-1). This adds up to k*m - k(k-1)/2. If m < k, there are no solutions because the smallest part would be <= 0. Let's assume m >= k.
Let's say p = (km - k(k-1)/2) - n.
If p < 0, there are no solutions because the largest number we can make is less than n. Let's assume p >= 0. Note that if p = 0 there is exactly one solution, so let's assume p > 0.
Now, imagine we start by choosing the k largest distinct legal integers, and we then correct this to get a solution. Our correction involves moving values to the left (on the number line) 1 slot, into empty slots, exactly p times. How many ways can we do this?
The smallest value to start with is m-(k-1), and it can move as far down as 1, so up to m-k times. After this, each successive value can move up to its predecessor's move.
Now the problem is, how many nonincreasing integer sequences of length k with a max value of m-k sum to p? This is the partition problem, i.e., how many ways can we partition p into at most k parts, each at most m-k. There is no closed-form solution to this.
Someone has already written up a nice answer of this problem here (which will need slight modification to meet your restrictions):
Is there an efficient algorithm for integer partitioning with restricted number of parts?
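To make the reduction above concrete, here is a small C++ sketch that counts the partitions of p into at most k parts of size at most m-k with a plain dynamic-programming table. The function name countDistinctParts is invented for this example, the table costs O(k*p) memory and O(m*k*p) time, and the 64-bit counters will overflow for large inputs, so treat it as a way to check small cases rather than a solution at the sizes in the question.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of ways to write n as a sum of k distinct positive integers, each <= m.
// Uses the reduction: start from m, m-1, ..., m-(k-1) and count the partitions
// of the surplus p into at most k parts, each at most m-k.
std::uint64_t countDistinctParts(long long n, int k, int m) {
    assert(n >= 0 && k >= 1 && m >= 1);
    if (m < k) return 0;
    const long long top = 1LL * k * m - 1LL * k * (k - 1) / 2;
    const long long p = top - n;
    if (p < 0) return 0;
    const int cap = m - k;

    // ways[c][s] = partitions of s into exactly c parts, using the part sizes seen so far.
    std::vector<std::vector<std::uint64_t>> ways(
        k + 1, std::vector<std::uint64_t>(p + 1, 0));
    ways[0][0] = 1;
    for (int v = 1; v <= cap; ++v) {           // allow parts of size v
        for (int c = 1; c <= k; ++c) {         // ascending c lets v repeat
            for (long long s = v; s <= p; ++s) {
                ways[c][s] += ways[c - 1][s - v];
            }
        }
    }

    std::uint64_t total = 0;
    for (int c = 0; c <= k; ++c) total += ways[c][p];  // at most k parts
    return total;
}
```

For example, countDistinctParts(10, 3, 6) counts {1,3,6}, {1,4,5} and {2,3,5}.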
As @Dave alludes to, there is already a really nice answer to the simple restricted integer case (found here (same link as @Dave): Is there an efficient algorithm for integer partitioning with restricted number of parts?).
Below is a variant in C++ which takes into account the maximum value of each restricted part. First, here is the workhorse:
#include <vector>
#include <algorithm>
#include <iostream>
int width;
int blockSize;
static std::vector<double> memoize;
double pStdCap(int n, int m, int myMax) {
if (myMax * m < n || n < m) return 0;
if (myMax * m == n || n <= m + 1) return 1;
if (m < 2) return m;
const int block = myMax * blockSize + (n - m) * width + m - 2;
if (memoize[block]) return memoize[block];
int niter = n / m;
if (m == 2) {
if (myMax * 2 >= n) {
myMax = std::min(myMax, n - 1);
return niter - (n - 1 - myMax);
} else {
return 0;
}
}
double count = 0;
for (; niter--; n -= m, --myMax) {
count += (memoize[myMax * blockSize + (n - m) * width + m - 3] = pStdCap(n - 1, m - 1, myMax));
}
return count;
}
As you can see, pStdCap is very similar to the linked solution. The one noticeable difference is the two additional checks at the top:
if (myMax * m < n || n < m) return 0;
if (myMax * m == n || n <= m + 1) return 1;
And here is the function that sets up the recursion:
double CountPartLenCap(int n, int m, int myMax) {
if (myMax * m < n || n < m) return 0;
if (myMax * m == n || n <= m + 1) return 1;
if (m < 2) return m;
if (m == 2) {
if (myMax * 2 >= n) {
myMax = std::min(myMax, n - 1);
return n / m - (n - 1 - myMax);
} else {
return 0;
}
}
width = m;
blockSize = m * (n - m + 1);
memoize = std::vector<double>((myMax + 1) * blockSize, 0.0);
return pStdCap(n, m, myMax);
}
Explanation of the parameters:
n is the integer that you are partitioning
m is the length of each partition (the number of parts)
myMax is the maximum value that can appear in a given partition. (the OP refers to this as the threshold)
Here is a live demonstration https://ideone.com/c3WohV
And here is a non memoized version of pStdCap which is a bit easier to understand. This is originally found in this answer to Is there an efficient way to generate N random integers in a range that have a given sum or average?
int pNonMemoStdCap(int n, int m, int myMax) {
if (myMax * m < n) return 0;
if (myMax * m == n) return 1;
if (m < 2) return m;
if (n < m) return 0;
if (n <= m + 1) return 1;
int niter = n / m;
int count = 0;
for (; niter--; n -= m, --myMax) {
count += pNonMemoStdCap(n - 1, m - 1, myMax);
}
return count;
}
If you actually intend to calculate the number of partitions for numbers as large as 10000, you are going to need a big int library as CountPartLenCap(10000, 40, 300) > 3.2e37 (Based off the OP's requirement).

Finding kth element in the nth order of Farey Sequence

Farey sequence of order n is the sequence of completely reduced fractions, between 0 and 1 which when in lowest terms have denominators less than or equal to n, arranged in order of increasing size. Detailed explanation here.
Problem
The problem is: given n and k, where n is the order of the sequence and k is the element index, can we find that particular element of the sequence? For example, the answer for (n=5, k=6) is 1/2.
Lead
There are many less-than-optimal solutions available, but I am looking for a near-optimal one. One such algorithm is discussed here, but I am unable to understand its logic and hence unable to work through the examples.
Question
Can someone please explain the solution in more detail, preferably with an example?
Thank you.
I've read the method provided in your link, and the accepted C++ solution to it. Let me post them, for reference:
Editorial Explanation
Several less-than-optimal solutions exist. Using a priority queue, one
can iterate through the fractions (generating them one by one) in O(K
log N) time. Using a fancier math relation, this can be reduced to
O(K). However, neither of these solution obtains many points, because
the number of fractions (and thus K) is quadratic in N.
The “good” solution is based on meta-binary search. To construct this
solution, we need the following subroutine: given a fraction A/B
(which is not necessarily irreducible), find how many fractions from
the Farey sequence are less than this fraction. Suppose we had this
subroutine; then the algorithm works as follows:
1. Determine a number X such that the answer is between X/N and (X+1)/N; such a number can be determined by binary searching the range 1...N, thus calling the subroutine O(log N) times.
2. Make a list of all fractions A/B in the range X/N...(X+1)/N. For any given B, there is at most one A in this range, and it can be determined trivially in O(1).
3. Determine the appropriate order statistic in this list (doing this in O(N log N) by sorting is good enough).
It remains to show how we can construct the desired subroutine. We
will show how it can be implemented in O(N log N), thus giving a O(N
log^2 N) algorithm overall. Let us denote by C[j] the number of
irreducible fractions i/j which are less than X/N. The algorithm is
based on the following observation: C[j] = floor(X*j/N) - Sum(C[D],
where D is a proper divisor of j). A direct implementation, which tests
whether any D is a divisor, yields a quadratic algorithm. A better approach,
inspired by Eratosthenes' sieve, is the following: at step j, we know
C[j], and we subtract it from all multiples of j. The running time of
the subroutine becomes O(N log N).
Relevant Code
#include <cassert>
#include <algorithm>
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
const int kMaxN = 2e5;
typedef int int32;
typedef long long int64_x;
// #define int __int128_t
// #define int64 __int128_t
typedef long long int64;
int64 count_less(int a, int n) {
vector<int> counter(n + 1, 0);
for (int i = 2; i <= n; i += 1) {
counter[i] = min(1LL * (i - 1), 1LL * i * a / n);
}
int64 result = 0;
for (int i = 2; i <= n; i += 1) {
for (int j = 2 * i; j <= n; j += i) {
counter[j] -= counter[i];
}
result += counter[i];
}
return result;
}
int32 main() {
// ifstream cin("farey.in");
// ofstream cout("farey.out");
int64_x n, k; cin >> n >> k;
assert(1 <= n);
assert(n <= kMaxN);
assert(1 <= k);
assert(k <= count_less(n, n));
int up = 0;
for (int p = 29; p >= 0; p -= 1) {
if ((1 << p) + up > n)
continue;
if (count_less((1 << p) + up, n) < k) {
up += (1 << p);
}
}
k -= count_less(up, n);
vector<pair<int, int>> elements;
for (int i = 1; i <= n; i += 1) {
int b = i;
// find a such that up/n < a / b and a / b <= (up+1) / n
int a = 1LL * (up + 1) * b / n;
if (1LL * up * b < 1LL * a * n) {
} else {
continue;
}
if (1LL * a * n <= 1LL * (up + 1) * b) {
} else {
continue;
}
if (__gcd(a, b) != 1) {
continue;
}
elements.push_back({a, b});
}
sort(elements.begin(), elements.end(),
[](const pair<int, int>& lhs, const pair<int, int>& rhs) -> bool {
return 1LL * lhs.first * rhs.second < 1LL * rhs.first * lhs.second;
});
cout << (int64_x)elements[k - 1].first << ' ' << (int64_x)elements[k - 1].second << '\n';
return 0;
}
Basic Methodology
The above editorial explanation results in the following simplified version. Let me start with an example.
Let's say, we want to find 7th element of Farey Sequence with N = 5.
We start with writing a subroutine, as said in the explanation, that gives us the "k" value (how many Farey Sequence reduced fractions there exist before a given fraction - the given number may or may not be reduced)
So, take your F5 sequence:
k = 0, 0/1
k = 1, 1/5
k = 2, 1/4
k = 3, 1/3
k = 4, 2/5
k = 5, 1/2
k = 6, 3/5
k = 7, 2/3
k = 8, 3/4
k = 9, 4/5
k = 10, 1/1
If we can find a function that finds the count of the previous reduced fractions in Farey Sequence, we can do the following:
int64 k_count_2 = count_less(2, 5); // result = 4
int64 k_count_3 = count_less(3, 5); // result = 6
int64 k_count_4 = count_less(4, 5); // result = 9
This function is written in the accepted solution. It uses the exact methodology explained in the last paragraph of the editorial.
As you can see, the count_less() function generates the same k values as in our hand written list.
We know the values of the reduced fractions for k = 4, 6, 9 using that function. What about k = 7? As explained in the editorial, we list all the reduced fractions a/b with X/N < a/b <= (X+1)/N; here X = 3 and N = 5.
Using the loop near the bottom of the accepted solution, we list and sort these reduced fractions.
After that we remap our k values so that they index into the new array, like this:
k = -, 0/1
k = -, 1/5
k = -, 1/4
k = -, 1/3
k = -, 2/5
k = -, 1/2
k = -, 3/5 <-|
k = 0, 2/3 | We list and sort the possible reduced fractions
k = 1, 3/4 | in between these numbers
k = -, 4/5 <-|
k = -, 1/1
(That's why there is this piece of code: k -= count_less(up, n);, it basically remaps the k values)
(And we also subtract one more during indexing, i.e.: cout << (int64_x)elements[k - 1].first << ' ' << (int64_x)elements[k - 1].second << '\n';. This is just to basically call the right position in the generated array.)
So, for our new re-mapped k values, for N = 5 and k = 7 (original k), our result is 2/3.
(We select the value k = 0, in our new map)
If you compile and run the accepted solution, it will give you this:
Input: 5 7 (Enter)
Output: 2 3
I believe this is the basic point of the editorial and accepted solution.
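For small n, the hand-written list above can be cross-checked by walking the sequence term by term. The sketch below is not the editorial's method; it uses the standard next-term relation for Farey sequences (given neighbours a/b and c/d, the following term is (q*c - a)/(q*d - b) with q = floor((n + b)/d)), which corresponds to the O(K) approach the editorial mentions in passing. The function name fareyByWalking is made up here, and k is counted from 0 at 0/1, as in the list above.

```cpp
#include <cassert>
#include <utility>

// Return the k-th term (k = 0 is 0/1) of the Farey sequence of order n
// by stepping through consecutive terms. Runs in O(k), so it is only
// meant as a reference for checking small cases.
std::pair<long long, long long> fareyByWalking(long long n, long long k) {
    assert(n >= 1 && k >= 0);
    long long a = 0, b = 1;  // current term a/b
    long long c = 1, d = n;  // next term c/d
    for (long long step = 0; step < k; ++step) {
        assert(a < b);  // stepping past 1/1 is not allowed
        const long long q = (n + b) / d;
        const long long e = q * c - a;
        const long long f = q * d - b;
        a = c;
        b = d;
        c = e;
        d = f;
    }
    return {a, b};
}
```

With n = 5 and k = 7 this gives the pair (2, 3), matching the output of the accepted solution for the same input.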

How to print values in memoization method - Dynamic programming

I know that a problem that can be solved using DP can be solved by either the tabulation (bottom-up) approach or the memoization (top-down) approach. Personally I find memoization the easier and even an efficient approach: analysis is required only to get the recursive formula, and once that is obtained, a brute-force recursive method can easily be converted to store each sub-problem's result and reuse it. The only problem I am facing with this approach is that I am not able to construct the actual result from the table which I filled on demand.
For example, in the Matrix Product Parenthesization problem (deciding in which order to perform the multiplications on matrices so that the cost of multiplication is minimal) I am able to calculate the minimum cost but not able to generate the order.
For example, suppose A is a 10 × 30 matrix, B is a 30 × 5 matrix, and C is a 5 × 60 matrix. Then,
(AB)C = (10×30×5) + (10×5×60) = 1500 + 3000 = 4500 operations
A(BC) = (30×5×60) + (10×30×60) = 9000 + 18000 = 27000 operations.
Here I am able to get the min-cost as 4500 but unable to get the order, which is (AB)C.
I used this. Suppose F[i, j] represents least number of multiplication needed to multiply Ai.....Aj and an array p[] is given which represents the chain of matrices such that the ith matrix Ai is of dimension p[i-1] x p[i]. So
$$F[i,j] = \begin{cases} 0 & \text{if } i = j \\ \min_{i \le k < j} \left( F[i,k] + F[k+1,j] + p_{i-1}\, p_k\, p_j \right) & \text{if } i < j \end{cases}$$
Below is the implementation that i have created.
#include<stdio.h>
#include<limits.h>
#include<string.h>
#define MAX 5 // the 4 matrices are indexed 1..4, so the table needs indices 0..4
int lookup[MAX][MAX];
int MatrixChainOrder(int p[], int i, int j)
{
if(i==j) return 0;
int min = INT_MAX;
int k, count;
if(lookup[i][j]==0){
// recursively calculate count of multiplcations and return the minimum count
for (k = i; k<j; k++) {
int gmin=0;
if(lookup[i][k]==0)
lookup[i][k]=MatrixChainOrder(p, i, k);
if(lookup[k+1][j]==0)
lookup[k+1][j]=MatrixChainOrder(p, k+1, j);
count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
if (count < min){
min = count;
printf("\n****%d ",k); // i think something has be done here to represent the correct answer ((AB)C)D where first mat is represented by A second by B and so on.
}
}
lookup[i][j] = min;
}
return lookup[i][j];
}
// Driver program to test above function
int main()
{
int arr[] = {2,3,6,4,5};
int n = sizeof(arr)/sizeof(arr[0]);
memset(lookup, 0, sizeof(lookup));
int width =10;
printf("Minimum number of multiplications is %d ", MatrixChainOrder(arr, 1, n-1));
printf("\n ---->");
for(int l=0;l<MAX;++l)
printf(" %*d ",width,l);
printf("\n");
for(int z=0;z<MAX;z++){
printf("\n %d--->",z);
for(int x=0;x<MAX;x++)
printf(" %*d ",width,lookup[z][x]);
}
return 0;
}
I know that with the tabulation approach printing the solution is much easier, but I want to do it with the memoization technique.
Thanks.
Your code correctly computes the minimum number of multiplications, but you're struggling to display the optimal chain of matrix multiplications.
There are two possibilities:
1. When you compute the table, you can store the best index found in another memoization array.
2. You can recompute the optimal splitting points from the results in the memoization array.
The first would involve creating the split points in a separate array:
int lookup_splits[MAX][MAX];
And then updating it inside your MatrixChainOrder function:
...
if (count < min) {
min = count;
lookup_splits[i][j] = k;
}
You can then generate the multiplication chain recursively like this:
void print_mult_chain(int i, int j) {
    if (i == j) {
        putchar('A' + i - 1);
        return;
    }
    putchar('(');
    print_mult_chain(i, lookup_splits[i][j]);
    print_mult_chain(lookup_splits[i][j] + 1, j);
    putchar(')');
}
You can call the function with print_mult_chain(1, n - 1) from main.
The second possibility is that you don't cache lookup_splits and recompute it as necessary.
int get_lookup_splits(int p[], int i, int j) {
    int best = INT_MAX;
    int k_best = i; // overwritten by the first candidate split
    for (int k = i; k < j; k++) {
        int count = lookup[i][k] + lookup[k+1][j] + p[i-1]*p[k]*p[j];
        if (count < best) {
            best = count;
            k_best = k;
        }
    }
    return k_best; // the split point that gave the minimum cost
}
This is essentially the same computation you did inside MatrixChainOrder, so if you go with this solution you should factor the code appropriately to avoid having two copies.
With this function, you can adapt print_mult_chain above to use it rather than the lookup_splits array. (You'll need to pass the p array in).
[None of this code is tested, so you may need to edit the answer to fix bugs].
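Putting the first possibility together, a self-contained C++ sketch might look like the following. The class name MatrixChain and its members are invented for this example, it uses -1 rather than 0 as the "not computed yet" marker, and it returns the parenthesization as a string instead of printing it; it is one way to arrange the pieces above, not a drop-in replacement for the question's program.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Memoized matrix-chain ordering. Matrix i (1-based) has dimensions p[i-1] x p[i].
class MatrixChain {
public:
    explicit MatrixChain(std::vector<int> dims)
        : p(std::move(dims)),
          cost(p.size(), std::vector<long long>(p.size(), -1)),
          split(p.size(), std::vector<int>(p.size(), 0)) {
        assert(p.size() >= 2);
    }

    // Minimum number of scalar multiplications for A_i ... A_j.
    long long minCost(int i, int j) {
        if (i == j) return 0;
        if (cost[i][j] >= 0) return cost[i][j];  // already memoized
        long long best = -1;
        for (int k = i; k < j; ++k) {
            long long c = minCost(i, k) + minCost(k + 1, j)
                        + 1LL * p[i - 1] * p[k] * p[j];
            if (best < 0 || c < best) {
                best = c;
                split[i][j] = k;  // remember where the best split was
            }
        }
        return cost[i][j] = best;
    }

    // Fully parenthesized order, naming the matrices A, B, C, ...
    std::string order(int i, int j) {
        if (i == j) return std::string(1, char('A' + i - 1));
        minCost(i, j);  // make sure split[i][j] has been filled in
        int k = split[i][j];
        return "(" + order(i, k) + order(k + 1, j) + ")";
    }

private:
    std::vector<int> p;
    std::vector<std::vector<long long>> cost;
    std::vector<std::vector<int>> split;
};
```

For the dimensions {10, 30, 5, 60}, minCost(1, 3) and order(1, 3) give the cost and the order of the cheaper of the two parenthesizations worked out by hand in the question.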

Calculating a particular row of Pascal's triangle mod 10^9+7

I want to store a particular row of Pascal's triangle, with each element taken mod 10^9+7, in an array. I tried to code it, but it fails somewhere when the row index is large, around 10^5.
Here is the code. I have tried to apply modular arithmetic and the modular inverse; mod is 10^9+7.
void pascal_row(ll n){
memset(soo,0,MAX);
soo[0] = 1; //First element is always 1
for(ll i=1; i<n/2+1; i++){ //Progress up, until reaching the middle value
soo[i] = ( ( soo[i-1] %mod ) * ((( (n-i+1)%mod * calcInverse(i,mod)%mod) % mod ))%mod)%mod;
}
for(ll i=n/2+1; i<=n; i++){ //Copy the inverse of the first part
soo[i] = soo[n-i]%mod;
}
}
Here is what my modular inverse function looks like:
long long calcInverse(long long a, long long n)
{
long long t = 0, newt = 1;
long long r = n, newr = a;
while (newr != 0) {
auto quotient = r /newr;
tie(t, newt) = make_tuple(newt, t- quotient * newt);
tie(r, newr) = make_tuple(newr, r - quotient * newr);
}
if (r > 1)
throw runtime_error("a is not invertible");
if (t < 0)
t += n;
return t;
}
Please tell me the correct way of doing this. Thanks.
There's a relationship between the elements of Pascal's triangle and the choose function, where n choose r = n!/(r!*(n-r)!). Specifically, counting from zero, the entry in the n-th row and r-th column of Pascal's triangle is n choose r. To find a particular row, you know what n you want, and then you should iterate over the possible values of r, find n choose r, and then take your modulus.
I'd recommend Java's BigInteger class for this, because it will handle any overflow errors you might be getting.
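If you would rather stay in C++ and avoid big integers altogether, the row can also be built directly in modular arithmetic, which is closer to what the question attempts. The sketch below is an alternative to the BigInteger suggestion, not a translation of it: it uses C(n, i) = C(n, i-1) * (n-i+1) / i and replaces the division by a modular inverse obtained from Fermat's little theorem, which is valid because 10^9+7 is prime and i < 10^9+7. The names MOD, powMod and pascalRowMod are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const std::uint64_t MOD = 1000000007ULL;

// base^exp mod MOD by repeated squaring.
// Every product is below MOD^2, which fits in 64 bits.
std::uint64_t powMod(std::uint64_t base, std::uint64_t exp) {
    std::uint64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

// Row n of Pascal's triangle, every entry reduced mod MOD.
std::vector<std::uint64_t> pascalRowMod(std::uint64_t n) {
    assert(n < MOD);  // guarantees each i in 1..n is invertible mod MOD
    std::vector<std::uint64_t> row(n + 1);
    row[0] = 1;
    for (std::uint64_t i = 1; i <= n; ++i) {
        const std::uint64_t inv = powMod(i, MOD - 2);  // Fermat inverse of i
        row[i] = row[i - 1] * ((n - i + 1) % MOD) % MOD * inv % MOD;
    }
    return row;
}
```

pascalRowMod(4) yields 1 4 6 4 1, and a row index around 10^5 only needs about 10^5 modular exponentiations.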

Another Pollard Rho Implementation

In an attempt to solve the 3rd problem on project Euler (https://projecteuler.net/problem=3), I decided to implement Pollard's Rho algorithm (at least part of it, I'm planning on including the cycling later). The odd thing is that it works for numbers such as: 82123(factor = 41) and 16843009(factor 257). However when I try the project Euler number: 600851475143, I end up getting 71 when the largest prime factor is 6857. Here's my implementation(sorry for wall of code and lack of type casting):
#include <iostream>
#include <math.h>
#include <vector>
using namespace std;
long long int gcd(long long int a,long long int b);
long long int f(long long int x);
int main()
{
long long int i, x, y, N, factor, iterations = 0, counter = 0;
vector<long long int>factors;
factor = 1;
x = 631;
N = 600851475143;
factors.push_back(x);
while (factor == 1)
{
y = f(x);
y = y % N;
factors.push_back(y);
cout << "\niteration" << iterations << ":\t";
i = 0;
while (factor == 1 && (i < factors.size() - 1))
{
factor = gcd(abs(factors.back() - factors[i]), N);
cout << factor << " ";
i++;
}
x = y;
//factor = 2;
iterations++;
}
system("PAUSE");
return 0;
}
long long int gcd(long long int a, long long int b)
{
long long int remainder;
do
{
remainder = a % b;
a = b;
b = remainder;
} while (remainder != 0);
return a;
}
long long int f(long long int x)
{
//x = x*x * 1024 + 32767;
x = x*x + 1;
return x;
}
Pollard's rho algorithm guarantees nothing. It doesn't guarantee to find the largest factor. It doesn't guarantee that any factor it finds is prime. It doesn't even guarantee to find a factor at all. The rho algorithm is probabilistic; it will probably find a factor, but not necessarily. Since your function returns a factor, it works.
That said, your implementation isn't very good. It's not necessary to store all previous values of the function and compute the gcd against each of them every time through the loop. Here is pseudocode for a better version of the function:
function rho(n)
    for c from 1 to infinity
        h, t := 1, 1
        repeat
            h := (h*h+c) % n # the hare runs ...
            h := (h*h+c) % n # ... twice as fast
            t := (t*t+c) % n # as the tortoise
            g := gcd(t-h, n)
        while g == 1
        if g < n then return g
This function returns a single factor of n, which may be either prime or composite. It stores only two values of the random sequence, and stops when it finds a cycle (when g == n), restarting with a different random sequence (by incrementing c). Otherwise it keeps going until it finds a factor, which shouldn't take too long as long as you limit the input to 64-bit integers. Find more factors by applying rho to the remaining cofactor, or to the factor itself if it is composite, stopping when all the prime factors have been found.
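A possible C++ rendering of this pseudocode is sketched below. A few things are assumptions of the sketch rather than part of the pseudocode: the helper names mulMod and gcd64, the shortcut for even n, the restriction to n below 2^63 so the add-and-double multiplication cannot overflow, and the bound on c, which returns 0 instead of looping forever. Like the pseudocode, it is intended for composite n.

```cpp
#include <cassert>
#include <cstdint>

// (a * b) % n without overflow, by add-and-double. Requires n < 2^63.
std::uint64_t mulMod(std::uint64_t a, std::uint64_t b, std::uint64_t n) {
    std::uint64_t result = 0;
    a %= n;
    while (b > 0) {
        if (b & 1) {
            result += a;
            if (result >= n) result -= n;
        }
        a += a;
        if (a >= n) a -= n;
        b >>= 1;
    }
    return result;
}

std::uint64_t gcd64(std::uint64_t a, std::uint64_t b) {
    while (b != 0) {
        std::uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Returns some nontrivial factor of a composite n (not necessarily prime),
// or 0 if no value of c produced one.
std::uint64_t rho(std::uint64_t n) {
    assert(n >= 4 && n < (1ULL << 63));
    if (n % 2 == 0) return 2;
    for (std::uint64_t c = 1; c < n; ++c) {
        std::uint64_t h = 1, t = 1, g = 1;
        do {
            h = (mulMod(h, h, n) + c) % n;  // the hare runs ...
            h = (mulMod(h, h, n) + c) % n;  // ... twice as fast
            t = (mulMod(t, t, n) + c) % n;  // as the tortoise
            g = gcd64(h > t ? h - t : t - h, n);
        } while (g == 1);
        if (g < n) return g;  // g == n means a cycle: try the next c
    }
    return 0;
}
```

To get the largest prime factor, keep splitting both the returned factor and the cofactor n / g until every piece is prime.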
By the way, you don't need Pollard's rho algorithm to solve Project Euler #3; simple trial division is sufficient. This algorithm finds all the prime factors of a number, from which you can extract the largest:
function factors(n)
    f := 2
    while f * f <= n
        while n % f == 0
            print f
            n := n / f
        f := f + 1
    if n > 1 then print n
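In C++ the same trial division could be written roughly as follows. The function name primeFactors is made up for this sketch, and it collects the factors in a vector instead of printing them so that the largest one is simply the last element.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Prime factors of n in nondecreasing order, found by trial division.
std::vector<std::uint64_t> primeFactors(std::uint64_t n) {
    assert(n >= 2);
    std::vector<std::uint64_t> out;
    // f <= n / f is the same test as f * f <= n, but cannot overflow.
    for (std::uint64_t f = 2; f <= n / f; ++f) {
        while (n % f == 0) {
            out.push_back(f);
            n /= f;
        }
    }
    if (n > 1) out.push_back(n);  // whatever is left is a prime
    return out;
}
```

For the Project Euler number, primeFactors(600851475143ULL).back() is 6857, the largest prime factor quoted in the question.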
