Minimum slice position - Order N algorithm

A non-empty zero-indexed array A consisting of N integers is given. A pair of integers (P, Q), such that 0 ≤ P < Q < N, is called a slice of array A
(notice that the slice contains at least two elements). The average of a slice (P, Q) is the sum of A[P] + A[P + 1] + ... + A[Q] divided by the
length of the slice. To be precise, the average equals (A[P] + A[P + 1] + ... + A[Q]) / (Q − P + 1).
Write a function:
int solution(int A[], int N);
that, given a non-empty zero-indexed array A consisting of N integers, returns the starting position of the slice with the minimal average.
If there is more than one slice with a minimal average, you should return the smallest starting position of such a slice.
Assume that:
N is an integer within the range [2..100,000];
each element of array A is an integer within the range [−10,000..10,000].
Complexity:
expected worst-case time complexity is O(N);
expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
Can you post only solutions that run in O(N) time?

If A had only positive numbers, you could get away with this:
pos = 0
min_avg = A[0] + A[1]
for (i=2; i<N; i++)
    m = A[i-1] + A[i]
    if (m < min_avg)
        min_avg = m
        pos = i-1
return pos
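This only looks at slices of two numbers (and compares sums rather than averages, since every slice has the same length). The reasoning is that a larger slice can be split into smaller ones, and its average cannot be smaller than the smallest average among those pieces. Note that a slice of odd length cannot be split into pairs alone, so slices of three numbers need to be checked as well: in [1, 100, 1] the slice of three has average 34, while both pairs have average 50.5.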
If A has negative numbers, you could adjust all values upwards first:
offset = min(A)
for (i=0; i<N; i++)
    A[i] -= offset
Combined with the previous algorithm:
offset = min(A) * 2   // because we're adding two numbers below
pos = 0
min_avg = A[0] + A[1] - offset
for (i=2; i<N; i++)
    m = A[i-1] + A[i] - offset
    if (m < min_avg)
        min_avg = m
        pos = i-1
return pos
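As a hedged sketch of the linear scan, here is a Python version that checks every slice of length 2 and of length 3. It relies on the observation that any longer slice can be cut into pieces of length 2 and 3, at least one of which has an average no larger than the whole. The function name `min_avg_slice_start` is made up for this illustration; averages are compared as integers scaled by 6 to avoid floating-point ties.

```python
def min_avg_slice_start(A):
    """Return the smallest start index of a slice with minimal average.

    Only slices of length 2 and 3 are examined. Averages are compared
    as 6 * average (an integer) so that equal averages compare equal.
    """
    best_pos = 0
    best_key = 3 * (A[0] + A[1])  # 6 * average of the first pair
    for i in range(len(A) - 1):
        key2 = 3 * (A[i] + A[i + 1])
        if key2 < best_key:
            best_key, best_pos = key2, i
        if i + 2 < len(A):
            key3 = 2 * (A[i] + A[i + 1] + A[i + 2])
            if key3 < best_key:
                best_key, best_pos = key3, i
    return best_pos


print(min_avg_slice_start([4, 2, 2, 5, 1, 5, 8]))
```

Because the comparison is strict and the scan runs left to right, ties are resolved in favour of the smallest starting position.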

I think you're right, the best I can do is an O(N^2) solution (this is in Python):
from random import randint
N = 1000
A = [randint(-10000, 10000) for _ in range(N)]
def solution(A, N):
    min_avg = 10001  # larger than any possible average
    for p in range(N):
        s = A[p]
        for q in range(1, N - p):
            s += A[p + q]
            a = s / (q + 1.)
            if a < min_avg:
                min_avg = a
                pos = (p, q + 1)  # (start position, slice length)
    return pos
print(solution(A, N))
However, averages of larger slices tend towards the mean (middle) value of the original range. In this case, the average is zero, halfway between -10000 and 10000. Most of the time, the smallest average is of a slice of two values, but sometimes it can be a slice of three values and rarely it can be even more values. So I think my previous answer works in most (>90%) of the cases. It really depends on the data values.

#include <assert.h>
struct Slice { unsigned P, Q; };
struct Slice MinSlice( int A[], unsigned N ) {
assert( N>=2 );
// find min slice of length 2
unsigned P = 0;
double min_sum = A[P] + A[P+1];
for (unsigned i = 1; i < N-1; ++i)
if ( min_sum > A[i] +A[i+1] ) {
P = i;
min_sum = A[P] + A[P+1];
}
unsigned Q = P+1;
double min_avg = min_sum / 2;
//extend the min slice if the avg can be reduced.
//(in the direction that most reduces the avg)
for (;;) {
if ( P > 0 && ( Q >= N-1 || A[P-1] <= A[Q+1] ) ) {
//reducing P might give the best reduction in avg
double new_sum = A[P-1] + min_sum;
double new_avg = new_sum / (Q - P + 2);
if ( min_avg < new_avg )
break;
min_sum = new_sum;
min_avg = new_avg;
--P;
} else if ( Q < N-1 && ( P <= 0 || A[P-1] >= A[Q+1] ) ) {
//increasing Q might give the best reduction in avg
double new_sum = min_sum + A[Q+1];
double new_avg = new_sum / (Q - P + 2);
if ( min_avg < new_avg )
break;
min_sum = new_sum;
min_avg = new_avg;
++Q;
} else
break;
}
struct Slice slice = { .P = P, .Q= Q };
return slice;
}

Related

Count integer partitions with k parts, each below some threshold m

I want to count the number of ways to partition the number n into k distinct parts, where each part is not larger than m.
For k := 2 I have the following algorithm:
public int calcIntegerPartition(int n, int k, int m) {
    int cnt=0;
    for(int i=1; i <= m;i++){
        for(int j=i+1; j <= m; j++){
            if(i+j == n){
                cnt++;
                break;
            }
        }
    }
    return cnt;
}
But how can I count integer partitions with k > 2? Usually I have n > 100000, k := 40, m < 10000.
Thank you in advance.
Let's start by choosing the k largest legal numbers: m, m-1, m-2, ..., m-(k-1). This adds up to k*m - k(k-1)/2. If m < k, there are no solutions because the smallest partition would be <= 0. Let's assume m >= k.
Let's say p = (km - k(k-1)/2) - n.
If p < 0, there are no solutions because the largest number we can make is less than n. Let's assume p >= 0. Note that if p = 0 there is exactly one solution, so let's assume p > 0.
Now, imagine we start by choosing the k largest distinct legal integers, and we then correct this to get a solution. Our correction involves moving values to the left (on the number line) 1 slot, into empty slots, exactly p times. How many ways can we do this?
The smallest value to start with is m-(k-1), and it can move as far down as 1, so up to m-k times. After this, each successive value can move up to its predecessor's move.
Now the problem is, how many nonincreasing integer sequences with a max value of m-k sum to p? This is the partition problem, i.e., how many ways can we partition p into at most k parts, each no larger than m-k. There is no closed-form solution to this.
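The reduction above can be sketched in Python as a small dynamic program. This is an illustration only, not the linked answer's method: `count_bounded_partitions` and `count_distinct_partitions` are hypothetical names, and the table costs roughly (m-k) * k * p operations, which is likely too slow for the sizes mentioned in the question without further optimisation.

```python
def count_bounded_partitions(p, max_parts, max_value):
    """Count partitions of p into at most max_parts parts, each <= max_value."""
    # ways[j][s]: partitions of s into exactly j parts, using the part
    # sizes processed so far (1..v).
    ways = [[0] * (p + 1) for _ in range(max_parts + 1)]
    ways[0][0] = 1
    for v in range(1, max_value + 1):
        for j in range(1, max_parts + 1):
            for s in range(v, p + 1):
                # Either no part equals v (already counted), or remove one v.
                ways[j][s] += ways[j - 1][s - v]
    return sum(ways[j][p] for j in range(max_parts + 1))


def count_distinct_partitions(n, k, m):
    """Count ways to write n as k distinct parts, each in 1..m."""
    if m < k:
        return 0
    # Distance between the largest achievable sum and n.
    p = k * m - k * (k - 1) // 2 - n
    if p < 0:
        return 0
    return count_bounded_partitions(p, k, m - k)


print(count_distinct_partitions(10, 3, 6))
```

For n = 10, k = 3, m = 6 the three solutions are {1, 3, 6}, {1, 4, 5} and {2, 3, 5}.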
Someone has already written up a nice answer of this problem here (which will need slight modification to meet your restrictions):
Is there an efficient algorithm for integer partitioning with restricted number of parts?
As @Dave alludes to, there is already a really nice answer to the simple restricted integer case (found here (same link as @Dave): Is there an efficient algorithm for integer partitioning with restricted number of parts?).
Below is a variant in C++ which takes into account the maximum value of each restricted part. First, here is the workhorse:
#include <vector>
#include <algorithm>
#include <iostream>
int width;
int blockSize;
static std::vector<double> memoize;
double pStdCap(int n, int m, int myMax) {
if (myMax * m < n || n < m) return 0;
if (myMax * m == n || n <= m + 1) return 1;
if (m < 2) return m;
const int block = myMax * blockSize + (n - m) * width + m - 2;
if (memoize[block]) return memoize[block];
int niter = n / m;
if (m == 2) {
if (myMax * 2 >= n) {
myMax = std::min(myMax, n - 1);
return niter - (n - 1 - myMax);
} else {
return 0;
}
}
double count = 0;
for (; niter--; n -= m, --myMax) {
count += (memoize[myMax * blockSize + (n - m) * width + m - 3] = pStdCap(n - 1, m - 1, myMax));
}
return count;
}
As you can see, pStdCap is very similar to the linked solution. The one noticeable difference is the 2 additional checks at the top:
if (myMax * m < n || n < m) return 0;
if (myMax * m == n || n <= m + 1) return 1;
And here is the function that sets up the recursion:
double CountPartLenCap(int n, int m, int myMax) {
if (myMax * m < n || n < m) return 0;
if (myMax * m == n || n <= m + 1) return 1;
if (m < 2) return m;
if (m == 2) {
if (myMax * 2 >= n) {
myMax = std::min(myMax, n - 1);
return n / m - (n - 1 - myMax);
} else {
return 0;
}
}
width = m;
blockSize = m * (n - m + 1);
memoize = std::vector<double>((myMax + 1) * blockSize, 0.0);
return pStdCap(n, m, myMax);
}
Explanation of the parameters:
n is the integer that you are partitioning
m is the length of each partition (the number of parts)
myMax is the maximum value that can appear in a given partition. (the OP refers to this as the threshold)
Here is a live demonstration https://ideone.com/c3WohV
And here is a non memoized version of pStdCap which is a bit easier to understand. This is originally found in this answer to Is there an efficient way to generate N random integers in a range that have a given sum or average?
int pNonMemoStdCap(int n, int m, int myMax) {
if (myMax * m < n) return 0;
if (myMax * m == n) return 1;
if (m < 2) return m;
if (n < m) return 0;
if (n <= m + 1) return 1;
int niter = n / m;
int count = 0;
for (; niter--; n -= m, --myMax) {
count += pNonMemoStdCap(n - 1, m - 1, myMax);
}
return count;
}
If you actually intend to calculate the number of partitions for numbers as large as 10000, you are going to need a big int library as CountPartLenCap(10000, 40, 300) > 3.2e37 (Based off the OP's requirement).
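For comparison, here is a rough Python port of pNonMemoStdCap. Python integers have arbitrary precision, so no separate big-integer library is needed. The name `count_parts_capped` and the use of `functools.lru_cache` for memoization are choices made for this sketch, not part of the original answer, and the cache may grow large for big inputs.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parts_capped(n, m, cap):
    """Count partitions of n into exactly m parts, each at most cap."""
    if cap * m < n or n < m:
        return 0
    if cap * m == n or n <= m + 1:
        return 1
    if m < 2:
        return m
    count = 0
    # Mirrors the C++ loop: for (; niter--; n -= m, --myMax)
    for _ in range(n // m):
        count += count_parts_capped(n - 1, m - 1, cap)
        n -= m
        cap -= 1
    return count


print(count_parts_capped(6, 3, 3))
```

The recursion depth is bounded by m, so the default recursion limit is enough for m = 40.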

How can I solve this dynamic programming problem?

I got stuck on a problem while studying dynamic programming.
I have a string of digits. I need to find the length of the longest substring in which the sum of the digits in the first half equals the sum of the digits in the second half.
For example,
Input string: 142124
Output : 6
When the input string is "142124", the sum of the digits in the first half (142) and the sum of the digits in the second half (124) are the same, so the entire string is the longest such substring. Therefore, the output is 6, the length of the entire string.
Input string: 9430723
Output: 4
The longest substring of this string whose first-half and second-half digit sums are equal is "4307".
I solved this problem this way
#include <string.h>
int maxSubStringLength(char* str){
    int n = strlen(str);
    int maxLen = 0;
    int sum[n][n];  // sum[i][j] = digit sum of str[i..j]
    for(int i=0; i<n; i++)
        sum[i][i] = str[i] - '0';
    for(int len =2; len <=n; len++){
        for(int i = 0; i < n - len + 1; i++){
            int j = i + len - 1;
            int k = len / 2;
            sum[i][j] = sum[i][j-k] + sum[j-k+1][j];
            if(len%2 == 0 && sum[i][j-k] == sum[j-k+1][j] && len > maxLen)
                maxLen = len;
        }
    }
    return maxLen;
}
This code has a time complexity of O (n * n) and a space complexity of O (n * n).
However, this problem requires solving with O (1) space complexity with O (n * n) time complexity.
Is it possible to solve this problem with the space complexity of O (1)?
You can easily solve this problem with O(1) space complexity and O(n^2) time complexity.
Here is one approach:
Go from m = 0 to n-2. This denotes the middle of the string (you split after the mth character).
For i = 1 to n (break if you get out of bounds). Build the left and right sums, if they are equal, compare i to best so far and update it if better.
Solution is 2 times best (because it denotes the half string).
In Java it would be something like this:
public int maxSubstringLength(String s) {
    int best = 0;
    for (int m = 0; m < s.length() - 1; m++) {
        int l = 0; // left sum
        int r = 0; // right sum
        for (int i = 1; m - i + 1 >= 0 && m + i < s.length(); i++) {
            l += s.charAt(m - i + 1) - '0'; // convert digit character to its value
            r += s.charAt(m + i) - '0';
            if (l == r && i > best)
                best = i;
        }
    }
    return 2 * best;
}
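To check the approach against the two examples from the question, here is a hedged Python transcription of the same expand-around-the-middle idea. The function name `max_balanced_substring_length` is invented for this sketch.

```python
def max_balanced_substring_length(s):
    """Length of the longest even-length substring whose two halves
    have equal digit sums, using O(1) extra space."""
    best = 0  # best half-length found so far
    n = len(s)
    for m in range(n - 1):  # split point: just after index m
        left = right = 0
        i = 1
        # Grow the window by one digit on each side while in bounds.
        while m - i + 1 >= 0 and m + i < n:
            left += int(s[m - i + 1])
            right += int(s[m + i])
            if left == right and i > best:
                best = i
            i += 1
    return 2 * best


print(max_balanced_substring_length("142124"))   # 6
print(max_balanced_substring_length("9430723"))  # 4
```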

Count of co-prime pairs from two arrays in less than O(n^2) complexity

I came across this problem in a challenge.
There are two arrays A and B both of size of N and we need to return the count of pairs (A[i],B[j]) where gcd(A[i],B[j])==1 and A[i] != B[j].
I could only think of a brute-force approach, which exceeded the time limit for a few test cases.
for(int i=0; i<n; i++) {
    for(int j=0; j<n; j++) {
        if(__gcd(a[i],b[j])==1) {
            printf("%d %d\n", a[i], b[j]);
        }
    }
}
Can you advise a time-efficient algorithm to solve this?
Edit: Not able to share question link as this was from a hiring challenge. Adding the constraints and input/output format as I remember.
Input -
First line will contain N, the number of elements present in both arrays.
Second line will contain N space separated integers, elements of array A.
Third line will contain N space separated integers, elements of array B.
Output -
The count of pairs A[i],B[j] as per the conditions.
Constraints -
1 <= N <= 10^5
1 < A[i],B[j] <= 10^9 where i,j < N
The first step is to use Eratosthenes sieve to calculate the prime numbers up to sqrt(10^9). This sieve can then be used to quickly find all prime factors of any number less than 10^9 (see the getPrimeFactors(...) function in the code sample below).
Next, for each A[i] with prime factors p0, p1, ..., pk, we compute all possible sub-products X - p0, p1, p0p1, p2, p0p2, p1p2, p0p1p2, p3, p0p3, ..., p0p1p2...pk and count them in map cntp[X]. Effectively, the map cntp[X] tells us the number of elements A[i] divisible by X, where X is a product of prime numbers to the power of 0 or 1. So for example, for the number A[i] = 12, the prime factors are 2, 3. We will count cntp[2]++, cntp[3]++ and cntp[6]++.
Finally, for each B[j] with prime factors p0, p1, ..., pk, we again compute all possible sub-products X and use the Inclusion-exclusion principle to count all non-coprime pairs C_j (i.e. the number of A[i]s that share at least one prime factor with B[j]). The numbers C_j are then subtracted from the total number of pairs - N*N to get the final answer.
Note: the Inclusion-exclusion principle looks like this:
C_j = (cntp[p0] + cntp[p1] + ... + cntp[pk]) -
      (cntp[p0p1] + cntp[p0p2] + ... + cntp[p(k-1)pk]) +
      (cntp[p0p1p2] + cntp[p0p1p3] + ... + cntp[p(k-2)p(k-1)pk]) -
      ...
and accounts for the fact that in cntp[X] and cntp[Y] we could have counted the same number A[i] twice, given that it is divisible by both X and Y.
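A compact way to see the inclusion-exclusion step in action is the following Python sketch. It is an illustration under simplifying assumptions: it factors each number by plain trial division instead of a precomputed sieve (so it is slow for N = 10^5), and the helper names `prime_factors`, `squarefree_products` and `count_coprime_pairs` are made up here.

```python
from collections import Counter
from itertools import combinations
from math import prod


def prime_factors(x):
    """Distinct prime factors of x, by trial division."""
    factors = []
    d = 2
    while d * d <= x:
        if x % d == 0:
            factors.append(d)
            while x % d == 0:
                x //= d
        d += 1
    if x > 1:
        factors.append(x)
    return factors


def squarefree_products(primes):
    """Yield (subset size, product) for every non-empty subset of primes."""
    for r in range(1, len(primes) + 1):
        for combo in combinations(primes, r):
            yield r, prod(combo)


def count_coprime_pairs(A, B):
    # cntp[X]: how many A[i] are divisible by the squarefree product X
    cntp = Counter()
    for a in A:
        for _, x in squarefree_products(prime_factors(a)):
            cntp[x] += 1
    total = len(A) * len(B)
    for b in B:
        for r, x in squarefree_products(prime_factors(b)):
            # Odd-sized subsets are subtracted, even-sized ones added back.
            total += -cntp[x] if r % 2 else cntp[x]
    return total


print(count_coprime_pairs([2, 3, 4], [2, 3, 9]))
```

With A = [2, 3, 4] and B = [2, 3, 9], cntp is {2: 2, 3: 1}, and subtracting 2, 1 and 1 from the 9 possible pairs leaves 5 coprime pairs.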
Here is a possible C++ implementation of the algorithm, which produces the same results as the naive O(n^2) algorithm by OP:
#include <cstdint>
#include <cstdio>
#include <map>
#include <vector>
// get prime factors of a using pre-generated sieve
std::vector<int> getPrimeFactors(int a, const std::vector<int> & primes) {
std::vector<int> f;
for (auto p : primes) {
if (p > a) break;
if (a % p == 0) {
f.push_back(p);
do {
a /= p;
} while (a % p == 0);
}
}
if (a > 1) f.push_back(a);
return f;
}
// find coprime pairs A_i and B_j
// A_i and B_i <= 1e9
void solution(const std::vector<int> & A, const std::vector<int> & B) {
// generate prime sieve
std::vector<int> primes;
primes.push_back(2);
for (int i = 3; i*i <= 1e9; ++i) {
bool isPrime = true;
for (auto p : primes) {
if (i % p == 0) {
isPrime = false;
break;
}
}
if (isPrime) {
primes.push_back(i);
}
}
int N = A.size();
struct Entry {
int n = 0;
int64_t p = 0;
};
// cntp[X] - number of times the product X can be expressed
// with prime factors of A_i
std::map<int64_t, int64_t> cntp;
for (int i = 0; i < N; i++) {
auto f = getPrimeFactors(A[i], primes);
// count possible products using non-repeating prime factors of A_i
std::vector<Entry> x;
x.push_back({ 0, 1 });
for (auto p : f) {
int k = x.size();
for (int i = 0; i < k; ++i) {
int nn = x[i].n + 1;
int64_t pp = x[i].p*p;
++cntp[pp];
x.push_back({ nn, pp });
}
}
}
// use Inclusion–exclusion principle to count non-coprime pairs
// and subtract them from the total number of pairs N*N
int64_t cnt = N; cnt *= N;
for (int i = 0; i < N; i++) {
auto f = getPrimeFactors(B[i], primes);
std::vector<Entry> x;
x.push_back({ 0, 1 });
for (auto p : f) {
int k = x.size();
for (int i = 0; i < k; ++i) {
int nn = x[i].n + 1;
int64_t pp = x[i].p*p;
x.push_back({ nn, pp });
if (nn % 2 == 1) {
cnt -= cntp[pp];
} else {
cnt += cntp[pp];
}
}
}
}
printf("cnt = %lld\n", (long long) cnt); // cnt can exceed the range of int for large N
}
I cannot estimate the complexity analytically, but here are some profiling results on my laptop for different N and uniformly random A[i] and B[j]:
For N = 1e2, takes ~0.02 sec
For N = 1e3, takes ~0.05 sec
For N = 1e4, takes ~0.38 sec
For N = 1e5, takes ~3.80 sec
For comparison, the O(n^2) approach takes:
For N = 1e2, takes ~0.00 sec
For N = 1e3, takes ~0.15 sec
For N = 1e4, takes ~15.1 sec
For N = 1e5, takes too long, didn't wait to finish
Python Implementation:
import math
from collections import defaultdict
def sieve(MAXN):
    spf = [0 for i in range(MAXN)]
    spf[1] = 1
    for i in range(2, MAXN):
        spf[i] = i
    for i in range(4, MAXN, 2):
        spf[i] = 2
    for i in range(3, math.ceil(math.sqrt(MAXN))):
        if (spf[i] == i):
            for j in range(i * i, MAXN, i):
                if (spf[j] == j):
                    spf[j] = i
    return(spf)
def getFactorization(x,spf):
    ret = list()
    while (x != 1):
        ret.append(spf[x])
        x = x // spf[x]
    return(list(set(ret)))
def coprime_pairs(N,A,B):
    MAXN=max(max(A),max(B))+1
    spf=sieve(MAXN)
    cntp=defaultdict(int)
    for i in range(N):
        f=getFactorization(A[i],spf)
        x=[[0,1]]
        for p in f:
            k=len(x)
            for i in range(k):
                nn=x[i][0]+1
                pp=x[i][1]*p
                cntp[pp]+=1
                x.append([nn,pp])
    cnt=0
    for i in range(N):
        f=getFactorization(B[i],spf)
        x=[[0,1]]
        for p in f:
            k=len(x)
            for i in range(k):
                nn=x[i][0]+1
                pp=x[i][1]*p
                x.append([nn,pp])
                if(nn%2==1):
                    cnt+=cntp[pp]
                else:
                    cnt-=cntp[pp]
    return(N*N-cnt)
import random
N=10001
A=[random.randint(1,N) for _ in range(N)]
B=[random.randint(1,N) for _ in range(N)]
print(coprime_pairs(N,A,B))

Variant of Subset-Sum

Given 3 positive integers n, k, and sum, find exactly k number of distinct elements a_i, where
a_i \in S, 1 <= i <= k, and a_i \neq a_j for i \neq j
and, S is the set
S = {1, 2, 3, ..., n}
such that
\sum_{i=1}^{k}{a_i} = sum
I don't want to apply brute force (checking all possible combinations) to solve the problem due to exponential complexity. Can someone give me a hint towards another approach in solving this problem? Also, how can we exploit the fact the set S is sorted?
Is it possible to have complexity of O(k) in this problem?
An idea how to exploit 1..n set properties:
The sum of k consecutive natural numbers starting from a is
sum = k*(2*a + (k-1))/2
To get the sum of such a run close to the needed s, we can solve
a >= s/k - k/2 + 1/2
or
a <= s/k - k/2 + 1/2
then compare s with the resulting sum and make corrections.
For example, having s=173, n=40 and k=5, we can find
a <= 173/5 - 5/2 + 1/2 = 32.6
for starting number 32 we have sequence 32,33,34,35,36 with sum = 170, and for correction by 3 we can just change 36 with 39, or 34,35,36 with 35,36,37 and so on.
Seems that using this approach we get O(1) complexity (of course, there might exist some subtleties that I did miss)
It's possible to modify the pseudo-polynomial algorithm for subset sum.
Prepare a matrix P with dimension k X sum, and initialize all elements to 0. The meaning of P[p, q] == 1 is that there is a subset of p numbers summing to q, and P[p, q] == 0 means that such a subset has not yet been found.
Now iterate over i = 1, ..., n. In each iteration:
If i ≤ sum, set P[1, i] = 1 (there is a subset of size 1 that achieves i).
For any entry P[p, q] == 1, you now know that P[p + 1, q + i] should now be 1 too. If (p + 1, q + i) is within the boundaries of the matrix, set P[p + 1, q + i] = 1.
Finally, check if P[k, sum] == 1.
The complexity, assuming that all integer math operations take constant time, is Θ(n k sum), which is O(n^2 sum) since k ≤ n.
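A minimal Python sketch of this table-filling idea follows. One detail is an assumption of the sketch rather than something stated above: the subset sizes are processed in decreasing order within each iteration, so that the current number i is never used twice in the same subset. The name `has_k_subset_with_sum` and the extra row for the empty subset are also choices made here.

```python
def has_k_subset_with_sum(n, k, total):
    """True if some k distinct numbers from 1..n add up to total."""
    # reachable[p][q]: some p distinct numbers seen so far sum to q.
    reachable = [[False] * (total + 1) for _ in range(k + 1)]
    reachable[0][0] = True  # the empty subset
    for i in range(1, n + 1):
        # Descending p: entries written to row p + 1 in this pass are
        # not read again for the same i.
        for p in range(k - 1, -1, -1):
            for q in range(total - i + 1):
                if reachable[p][q]:
                    reachable[p + 1][q + i] = True
    return reachable[k][total]


print(has_k_subset_with_sum(40, 5, 173))
```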
There is an O(1) (so to speak) solution. What follows is a formal enough (I hope) development of the idea by @MBo.
It is sufficient to assume that S is a set of all integers and find a minimal solution. Solution K is smaller than K' iff max(K) < max(K'). If max(K) <= n, then K is also a solution to the original problem; otherwise, the original problem has no solution.
So we disregard n and find K, a minimal solution. Let g = max(K) = ceil(sum/k + (k - 1)/2) and s = g + (g-1) + (g-2) + ... + (g-k+1) and s' = (g-1) + (g-2) + ... + (g-k). That is, s' is s shifted down by 1. Note s' = s - k.
Obviously s >= sum and (because K is minimal) s' < sum.
If s == sum the solution is K and we're done. Otherwise consider the set K+ = {g, g-1, ..., g-k}. We know that \sum(K+ \setminus {g}) < sum and \sum(K+ \setminus {g-k}) > sum, therefore, there's a single element g_i of K+ such that \sum (K+ \setminus {g_i}) = sum. The solution is K+ \setminus {\sum(K+)-sum}.
The solution in the form of 4 integers a, b, c, d where the actual set is understood to be [a..b] \cup [c..d] can be computed in O(1).
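The construction can be sketched in Python as follows. The function name `k_distinct_with_sum` is hypothetical, and the sketch adds one check that the text leaves implicit: sums below 1 + 2 + ... + k have no solution among positive integers. Finding g and the element to drop takes constant time; only writing out the k elements is O(k).

```python
def k_distinct_with_sum(n, k, total):
    """Return k distinct integers from 1..n summing to total, or None."""
    if total < k * (k + 1) // 2:
        return None  # even the k smallest numbers sum to more
    # g = ceil(total / k + (k - 1) / 2), computed with integers only
    g = -(-(2 * total + k * (k - 1)) // (2 * k))
    if g > n:
        return None  # the smallest possible maximum exceeds n
    # K+ = {g - k, ..., g}; dropping one element must leave sum == total
    window_sum = (k + 1) * g - k * (k + 1) // 2
    drop = window_sum - total
    return [x for x in range(g - k, g + 1) if x != drop]


print(k_distinct_with_sum(40, 5, 173))
```

For n = 40, k = 5, sum = 173 this gives g = 37, K+ = {32, ..., 37} with sum 207, so 34 is dropped and the result is 32, 33, 35, 36, 37.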
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
unsigned long int arithmeticSum(unsigned long int a, unsigned long int k, unsigned long int n, unsigned long int *A);
void printSubset(unsigned long int k, unsigned long int *A);
int main(void)
{
unsigned long int n, k, sum;
// scan the respective values of sum, n, and k
scanf("%lu %lu %lu", &sum, &n, &k);
// find the starting element using the formula for the sum of an A.P. having 'k' terms
// starting at 'a', common difference 'd' ( = 1 in this problem), having 'sum' = sum
// sum = [k/2][2*a + (k-1)*d]
unsigned long startElement = (long double)sum/k - (long double)k/2 + (long double)1/2;
// exit if the arithmetic progression formed at the startElement is not within the required bounds
if(startElement < 1 || startElement + k - 1 > n)
{
printf("-1\n");
return 0;
}
// we now work on the k-element set [startElement, startElement + k - 1]
// create an array to store the k elements
unsigned long int *A = malloc(k * sizeof(unsigned long int));
// calculate the sum of k elements in the arithmetic progression [a, a + 1, a + 2, ..., a + (k - 1)]
unsigned long int currentSum = arithmeticSum(startElement, k, n, A);
// if the currentSum is equal to the required sum, then print the array A, and we are done
if(currentSum == sum)
{
printSubset(k, A);
}
// we enter into this block only if currentSum < sum
// i.e. we need to add 'something' to the currentSum in order to make it equal to sum
// i.e. we need to remove an element from the k-element set [startElement, startElement + k - 1]
// and replace it with an element of higher magnitude
// i.e. we need to replace an element in the set [startElement, startElement + k - 1] and replace
// it with an element in the range [startElement + k, n]
else
{
long int j;
bool done;
// calculate the amount which we need to add to the currentSum
unsigned long int difference = sum - currentSum;
// starting from A[k-1] upto A[0] do the following...
for(j = k - 1, done = false; j >= 0; j--)
{
// check if adding the "difference" to A[j] results in a number in the range [startElement + k, n]
// if it does then replace A[j] with that element, and we are done
if(A[j] + difference <= n && A[j] + difference > A[k-1])
{
A[j] += difference;
printSubset(k, A);
done = true;
break;
}
}
// if no such A[j] is found then, exit with fail
if(done == false)
{
printf("-1\n");
}
}
return 0;
}
unsigned long int arithmeticSum(unsigned long int a, unsigned long int k, unsigned long int n, unsigned long int *A)
{
unsigned long int currentSum;
long int j;
// calculate the sum of the arithmetic progression and store the each member in the array A
for(j = 0, currentSum = 0; j < k; j++)
{
A[j] = a + j;
currentSum += A[j];
}
return currentSum;
}
void printSubset(unsigned long int k, unsigned long int *A)
{
long int j;
for(j = 0; j < k; j++)
{
printf("%lu ", A[j]);
}
printf("\n");
}

Given an integer z<=10^100, find the smallest row of Pascal's triangle that contains z [closed]

How can I find an algorithm to solve this problem using C++: given an integer z<=10^100, find the smallest row of Pascal's triangle that contains the number z.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
For example, if z=6, the result is the 4th row (counting rows from 0).
Another way to describe the problem: given an integer z<=10^100, find the smallest integer n such that there exists an integer k with C(k,n) = z.
C(k,n) is the number of combinations of n things taken k at a time without repetition.
EDIT This solution needs Logarithmic time, it's O(Log z). Or maybe O( (Log z)^2 ).
Say you are looking for n,k where Binomial(n,k)==z for a given z.
1. Each row has its largest value in the middle, so starting from n=0 you increase the row number, n, as long as the middle value is smaller than the given number. Actually, 10^100 isn't that big, so before row 340 you find a position n0,k0=n0/2 where the value from the triangle is larger than or equal to z: Binomial(n0,k0)>=z
2. You walk to the left, i.e. you decrease the column number k, until eventually you find a value smaller than z. If there was a matching value in that row you would have hit it by now. k is not very large, less than 170, so this step won't be executed more often than that and does not present a performance problem.
3. From here you walk down, increasing n. Here you will find a steadily increasing value of Binomial[n,k]. Continue with step 3 until the value gets bigger than or equal to z, then go to step 2.
EDIT: This step 3 can loop for a very long time when the row number n is large, so instead of checking each n linearly you can do a binary search for n with Binomial(n,k) >= z > Binomial(n-1,k), then it only needs Log(n) time.
A Python implementation looks like this, C++ is similar but somewhat more cumbersome because you need to use an additional library for arbitrary precision integers:
# Calculate (n-k+1)* ... *n
def getnk( n, k ):
    a = n
    for u in range( n-k+1, n ):
        a = a * u
    return a
# Find n such that Binomial(n,k) >= z and Binomial(n-1,k) < z
def find_n( z, k, n0 ):
    kfactorial = k
    for u in range(2, k):
        kfactorial *= u
    xk = z * kfactorial
    nk0 = getnk( n0, k )
    n1=n0*2
    nk1 = getnk( n1, k )
    # double n while the value is too small
    while nk1 < xk:
        nk0=nk1
        n0=n1
        n1*=2
        nk1 = getnk( n1, k )
    # do a binary search
    while n1 > n0 + 1:
        n2 = (n0+n1) // 2
        nk2 = getnk( n2, k )
        if nk2 < xk:
            n0 = n2
            nk0 = nk2
        else:
            n1 = n2
            nk1 = nk2
    return n1, nk1 // kfactorial
def find_pos( z ):
    n=0
    k=0
    nk=1
    # start by finding a row where the middle value is bigger than z
    while nk < z:
        # increase n
        n = n + 1
        nk = nk * n // (n-k)
        if nk >= z:
            break
        # increase both n and k
        n = n + 1
        k = k + 1
        nk = nk * n // k
    # check all subsequent rows for a matching value
    while nk != z:
        if nk > z:
            # decrease k
            k = k - 1
            nk = nk * (k+1) // (n-k)
        else:
            # increase n
            # either linearly
            #  n = n + 1
            #  nk = nk * n // (n-k)
            # or using binary search:
            n, nk = find_n( z, k, n )
    return n, k
z = 56476362530291763837811509925185051642180136064700011445902684545741089307844616509330834616
print( find_pos(z) )
It should print
(5864079763474581, 6)
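For small inputs, a brute-force reference can be handy for sanity-checking a faster implementation. The sketch below is not the algorithm described above; it simply scans rows in order using `math.comb` (Python 3.8+), and the name `smallest_row_brute_force` is made up. It always terminates because z itself appears in row z as Binomial(z, 1).

```python
from math import comb


def smallest_row_brute_force(z):
    """Smallest row n of Pascal's triangle containing z (rows from 0)."""
    if z == 1:
        return 0
    n = 2
    while True:
        # By symmetry only the left half of the row needs checking.
        if any(comb(n, k) == z for k in range(1, n // 2 + 1)):
            return n
        n += 1


print(smallest_row_brute_force(6))     # 4
print(smallest_row_brute_force(1365))  # 15
```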
Stirling's approximation for n! can be used to find the first row of the triangle containing a binomial coefficient greater than or equal to a given x. Using this approximation we can derive lower and upper bounds for the central binomial coefficient C(2n, n),
and then observe that this is the maximum coefficient in row 2n of the triangle:
P( 2n, 0), P( 2n, 1), P( 2n, 2), ..., P( 2n, 2n -1), P( 2n, 2n)
so we can find the first row whose maximum binomial coefficient is greater than or equal to a given x. This is the first row in which x can appear; x cannot be found in any earlier row. Note: in some cases this row already gives the answer. At the moment I cannot see any other way than to start a brute-force search from this row.
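As a point of comparison, the same starting row can be found with exact integer arithmetic instead of Stirling's bounds. The sketch below swaps the approximation for `math.comb` (Python 3.8+), which is affordable because fewer than about 340 rows are needed for x up to 10^100; the name `first_candidate_row` is an assumption of this example.

```python
from math import comb


def first_candidate_row(x):
    """First row whose largest (middle) coefficient is >= x."""
    n = 0
    while comb(n, n // 2) < x:
        n += 1
    return n


print(first_candidate_row(1365))
```

For x = 1365 this returns 13, since C(12, 6) = 924 and C(13, 6) = 1716; the search for the value itself then continues from that row.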
template <class T>
T binomial_coefficient(unsigned long n, unsigned long k) {
unsigned long i;
T b;
if (0 == k || n == k) {
return 1;
}
if (k > n) {
return 0;
}
if (k > (n - k)) {
k = n - k;
}
if (1 == k) {
return n;
}
b = 1;
for (i = 1; i <= k; ++i) {
b *= (n - (k - i));
if (b < 0) return -1; /* Overflow */
b /= i;
}
return b;
}
Stirling:
double stirling_lower_bound( int n) {
double n_ = n / 2.0;
double res = pow( 2.0, 2 * n_);
res /= sqrt( n_ * M_PI);
return res * exp( ( -1.0) / ( 6 * n_));
}
double stirling_upper_bound( int n) {
double n_ = n / 2.0;
double res = pow( 2.0, 2 * n_) ;
res /= sqrt( n_ * M_PI);
return res * exp( 1.0 / ( 24 * n_));
}
int stirling_estimate( double x) {
int n = 1;
while ( stirling_lower_bound( n) <= x) {
if ( stirling_upper_bound( n) > x) return n;
++n;
}
return n;
}
usage:
long int search_coefficient( unsigned long int &n, unsigned long int x) {
unsigned long int k = n / 2;
long long middle_coefficient = binomial_coefficient<long long>( n, k);
if( middle_coefficient == x) return k;
unsigned long int right = binomial_coefficient<unsigned long>( n, ++k);
while ( x != right) {
while( x < right || x < ( right * ( n + 1) / ( k + 1))) {
right = right * ( n + 1) / ( ++k) - right;
}
if ( right == x) return k;
right = right * ( ++n) / ( ++k);
if( right > x) return -1;
}
return k;
}
/*
*
*/
int main(int argc, char** argv) {
long long x2 = 1365;
unsigned long int n = stirling_estimate( x2);
long int k = search_coefficient( n, x2);
std::cout << "row:" << n <<", column: " << k;
return 0;
}
output:
row:15, column: 11

Resources