Iterating over all subsets of a given size - performance

I know that iterating over all subsets of a set of size n is a performance nightmare and will take O(2^n) time.
How about iterating over all subsets of size k (for (0 <= k <= n))? Is that a performance nightmare? I know there are (n, k) = n! / k! (n - k)! possibilities. I know that if k is very close to 0 or very close to n, this is a small number.
What is the worst case performance in terms of n and k? Is there a simpler way of saying it other than just O(n! / k! (n - k)!)? Is this asymptotically smaller than O(n!) or the same?

You want Gosper's hack:
int c = (1<<k)-1;
while (c < (1<<n)) {
dostuff(c);
int a = c&-c, b = c+a;
c = (c^b)/4/a|b;
}
Explanation:
Finding the next number with as many bits set basically reduces to the case of numbers with exactly one "block of ones" --- numbers having a bunch of zeroes, then a bunch of ones, then a bunch of zeroes again in their binary expansions.
The way to deal with such a "block of ones" number is to move the highest bit left by one and slam all the others as low as possible. Gosper's hack works by finding the lowest set bit (a), finding the "high part" comprising the bits we don't touch and the "carry bit" (b), then producing a block of ones of the appropriate size that begins at the least-significant bit.

It's easy to show that for a fixed n, (n, k) has a maximum at k = n/2. If I haven't misapplied Sterling's approximation, the asymptotic behavior for (n, n/2) is exponential.
For constant k, (n, k) is O(n^k). Keep in mind that the combinatorial function is symmetric, so it's the same for (n, n-k). It's polynomial, so it's way smaller than O(n!).

Related

How do we find the logsumexp of all subsets (or some approximation to this)?

I have a set of numbers n_1, n_2, ... n_k. I need to find the sum (or mean, its the same) of the logsumexp of all possible subsets of this set of numbers. Is there a way to approximate this or compute it exactly?
Note, the logsumexp of a, b, c is log(e_a + e_b + e_c) (exp, followed by sum, followed by log)
I don’t know if this will be accurate enough, but log sum exp is sort of a smooth analog of max, so one possibility would be to sort so that n1 ≥ n2 ≥ … ≥ nk and return ∑i (2k−i / (2k − 1)) ni, which under-approximates the true mean by an error term between 0 and log k.
You could also use a sample mean of the log sum exp of {ni} ∪ (a random subset of {ni+1, ni+2, …, nk}) instead of ni in the sum. By using enough samples, you can make the approximation as good as you like (though obviously at some point it’s cheaper to evaluate with brute force).
(I’m assuming that the empty set is omitted.)
On a different tack, here’s a deterministic scheme whose additive error is at most ε. As in my other answer, we sort n1 ≥ … ≥ nk, define an approximation f(i) of the mean log sum exp over nonempty subsets where the minimum index is i, and evaluate ∑i (2k−i / (2k − 1)) f(i). If each f(i) is within ε, then so is the result.
Fix i and for 1 ≤ j ≤ k−i define dj = ni+j − ni. Note that dj ≤ 0. We define f(i) = ni + g(i) where g(i) will approximate the mean log one plus sum exp of subsets of {dj | j}.
I don’t want to write the next part formally since I believe that it will be harder to understand, so the idea is that with log sum exp being commutative and associative, we’re going to speed up the brute force algorithm that initializes the list of results as [0] and then, for each j, doubles the size of the list the list by appending the log sum exp with dj of each of its current elements. I claim that, if we round each intermediate result down to the nearest multiple of δ = ε/k, then the result will under-approximate at most ε. By switching from the list to its histogram, we can do each step in time proportional to the number of distinct entries. Finally, we set g(i) to the histogram average.
To analyze the running time, in the worst case, we have dj = 0 for all j, making the largest possible result log k. This means that the list can have at most (log k)/δ = ε−1 k log k + 1 entries, making the total running time O(ε−1 k3 log k). (This can undoubtedly be improved slightly with a faster convolution algorithm and/or by rounding the dj and taking advantage.)

Binary vs Linear searches for unsorted N elements

I try to understand a formula when we should use quicksort. For instance, we have an array with N = 1_000_000 elements. If we will search only once, we should use a simple linear search, but if we'll do it 10 times we should use sort array O(n log n). How can I detect threshold when and for which size of input array should I use sorting and after that use binary search?
You want to solve inequality that rougly might be described as
t * n > C * n * log(n) + t * log(n)
where t is number of checks and C is some constant for sort implementation (should be determined experimentally). When you evaluate this constant, you can solve inequality numerically (with uncertainty, of course)
Like you already pointed out, it depends on the number of searches you want to do. A good threshold can come out of the following statement:
n*log[b](n) + x*log[2](n) <= x*n/2 x is the number of searches; n the input size; b the base of the logarithm for the sort, depending on the partitioning you use.
When this statement evaluates to true, you should switch methods from linear search to sort and search.
Generally speaking, a linear search through an unordered array will take n/2 steps on average, though this average will only play a big role once x approaches n. If you want to stick with big Omicron or big Theta notation then you can omit the /2 in the above.
Assuming n elements and m searches, with crude approximations
the cost of the sort will be C0.n.log n,
the cost of the m binary searches C1.m.log n,
the cost of the m linear searches C2.m.n,
with C2 ~ C1 < C0.
Now you compare
C0.n.log n + C1.m.log n vs. C2.m.n
or
C0.n.log n / (C2.n - C1.log n) vs. m
For reasonably large n, the breakeven point is about C0.log n / C2.
For instance, taking C0 / C2 = 5, n = 1000000 gives m = 100.
You should plot the complexities of both operations.
Linear search: O(n)
Sort and binary search: O(nlogn + logn)
In the plot, you will see for which values of n it makes sense to choose the one approach over the other.
This actually turned into an interesting question for me as I looked into the expected runtime of a quicksort-like algorithm when the expected split at each level is not 50/50.
the first question I wanted to answer was for random data, what is the average split at each level. It surely must be greater than 50% (for the larger subdivision). Well, given an array of size N of random values, the smallest value has a subdivision of (1, N-1), the second smallest value has a subdivision of (2, N-2) and etc. I put this in a quick script:
split = 0
for x in range(10000):
split += float(max(x, 10000 - x)) / 10000
split /= 10000
print split
And got exactly 0.75 as an answer. I'm sure I could show that this is always the exact answer, but I wanted to move on to the harder part.
Now, let's assume that even 25/75 split follows an nlogn progression for some unknown logarithm base. That means that num_comparisons(n) = n * log_b(n) and the question is to find b via statistical means (since I don't expect that model to be exact at every step). We can do this with a clever application of least-squares fitting after we use a logarithm identity to get:
C(n) = n * log(n) / log(b)
where now the logarithm can have any base, as long as log(n) and log(b) use the same base. This is a linear equation just waiting for some data! So I wrote another script to generate an array of xs and filled it with C(n) and ys and filled it with n*log(n) and used numpy to tell me the slope of that least squares fit, which I expect to equal 1 / log(b). I ran the script and got b inside of [2.16, 2.3] depending on how high I set n to (I varied n from 100 to 100'000'000). The fact that b seems to vary depending on n shows that my model isn't exact, but I think that's okay for this example.
To actually answer your question now, with these assumptions, we can solve for the cutoff point of when: N * n/2 = n*log_2.3(n) + N * log_2.3(n). I'm just assuming that the binary search will have the same logarithm base as the sorting method for a 25/75 split. Isolating N you get:
N = n*log_2.3(n) / (n/2 - log_2.3(n))
If your number of searches N exceeds the quantity on the RHS (where n is the size of the array in question) then it will be more efficient to sort once and use binary searches on that.

Finding the modified factorial N modulo some number coprime to N

Given two numbers N and p, let k be the maximum power of p such that p^k divides N! and let d = N!/(p^k). So d and p are coprimes.
How do I find d mod p? Direct iterations will be impractical as N! will be very high when N is high. A more efficient algorithm is required to find the expression.
Here is O(N) algorithm:
int d=1;
for(int i=1;i<=N;++i)
{
d*=i;
while(d%p==0)
d/=p;
d=d%p;
}
It doesn't require storing huge numbers, so may be acceptable. I suspect that O(p) algorithm is possible (because numbers will repeat after every k*p), but the code will be a bit more complex.

Is there a polynomial time algorithm to test whether a number exponent of some number?

Just study the famous paper PRIMES is in P and get confused.
First step of the proposed algorithm is If (n=a^b for nature number a and b>1), output COMPOSITE. Since the whole algorithm runs in polynomial time, this step must also complete in O((log n)^c)(given input size is O(log n). However, I can't figure out any algorithm to hit the target after some googling.
QUESTION:
Is there any algorithm available to test whether a number exponent of some other number in polynomial time?
Thanks and Best Regards!
If n=a^b (for a > 1) then b ≤ log2 n, we can check for all b's smaller than log n to test this, we can iterate for finding b from 2 to log n, and for finding a we should do binary search between 1..sqrt(n). But binary search takes O(logn) time for iteration, finally in each step of search(for any found a for checking) we should check that whether ab == n and this takes O(log n), so total search time will be O(log3n). may be there is a faster way but by knowing that AKS is O(log6n) this O(log3n) doesn't harm anything.
A number n is a perfect power if there exists b and e for which b^e = n. For instance 216 = 6^3 = 2^3 * 3^3 is a perfect power, but 72 = 2^3 * 3^2 is not. The trick to determining if a number is a perfect power is to know that, if the number is a perfect power, then the exponent e must be less than log2 n, because if e is greater then 2^e will be greater than n. Further, it is only necessary to test prime e, because if a number is a perfect power to a composite exponent it will also be a perfect power to the prime factors of the composite component; for instance, 2^15 = 32768 = 32^3 = 8^5 is a perfect cube root and also a perfect fifth root. Thus, the algorithm is to make a list of primes less than log2 n and test each one. Since log2 n is small, and the list of primes is even smaller, this isn't much work, even for large n.
You can see an implementation here.
public boolean isPerfectPower(int a) {
if(a == 1) return true;
for(int i = 2; i <= (int)Math.sqrt(a); i++){
double pow = Math.log10(a)/Math.log10(i);
if(pow == Math.floor(pow) && pow > 1) return true;
}
return false;
}

Time complexity of Euclid's Algorithm

I am having difficulty deciding what the time complexity of Euclid's greatest common denominator algorithm is. This algorithm in pseudo-code is:
function gcd(a, b)
while b ≠ 0
t := b
b := a mod b
a := t
return a
It seems to depend on a and b. My thinking is that the time complexity is O(a % b). Is that correct? Is there a better way to write that?
One trick for analyzing the time complexity of Euclid's algorithm is to follow what happens over two iterations:
a', b' := a % b, b % (a % b)
Now a and b will both decrease, instead of only one, which makes the analysis easier. You can divide it into cases:
Tiny A: 2a <= b
Tiny B: 2b <= a
Small A: 2a > b but a < b
Small B: 2b > a but b < a
Equal: a == b
Now we'll show that every single case decreases the total a+b by at least a quarter:
Tiny A: b % (a % b) < a and 2a <= b, so b is decreased by at least half, so a+b decreased by at least 25%
Tiny B: a % b < b and 2b <= a, so a is decreased by at least half, so a+b decreased by at least 25%
Small A: b will become b-a, which is less than b/2, decreasing a+b by at least 25%.
Small B: a will become a-b, which is less than a/2, decreasing a+b by at least 25%.
Equal: a+b drops to 0, which is obviously decreasing a+b by at least 25%.
Therefore, by case analysis, every double-step decreases a+b by at least 25%. There's a maximum number of times this can happen before a+b is forced to drop below 1. The total number of steps (S) until we hit 0 must satisfy (4/3)^S <= A+B. Now just work it:
(4/3)^S <= A+B
S <= lg[4/3](A+B)
S is O(lg[4/3](A+B))
S is O(lg(A+B))
S is O(lg(A*B)) //because A*B asymptotically greater than A+B
S is O(lg(A)+lg(B))
//Input size N is lg(A) + lg(B)
S is O(N)
So the number of iterations is linear in the number of input digits. For numbers that fit into cpu registers, it's reasonable to model the iterations as taking constant time and pretend that the total running time of the gcd is linear.
Of course, if you're dealing with big integers, you must account for the fact that the modulus operations within each iteration don't have a constant cost. Roughly speaking, the total asymptotic runtime is going to be n^2 times a polylogarithmic factor. Something like n^2 lg(n) 2^O(log* n). The polylogarithmic factor can be avoided by instead using a binary gcd.
The suitable way to analyze an algorithm is by determining its worst case scenarios.
Euclidean GCD's worst case occurs when Fibonacci Pairs are involved.
void EGCD(fib[i], fib[i - 1]), where i > 0.
For instance, let's opt for the case where the dividend is 55, and the divisor is 34 (recall that we are still dealing with fibonacci numbers).
As you may notice, this operation costed 8 iterations (or recursive calls).
Let's try larger Fibonacci numbers, namely 121393 and 75025. We can notice here as well that it took 24 iterations (or recursive calls).
You can also notice that each iterations yields a Fibonacci number. That's why we have so many operations. We can't obtain similar results only with Fibonacci numbers indeed.
Hence, the time complexity is going to be represented by small Oh (upper bound), this time. The lower bound is intuitively Omega(1): case of 500 divided by 2, for instance.
Let's solve the recurrence relation:
We may say then that Euclidean GCD can make log(xy) operation at most.
There's a great look at this on the wikipedia article.
It even has a nice plot of complexity for value pairs.
It is not O(a%b).
It is known (see article) that it will never take more steps than five times the number of digits in the smaller number. So the max number of steps grows as the number of digits (ln b). The cost of each step also grows as the number of digits, so the complexity is bound by O(ln^2 b) where b is the smaller number. That's an upper limit, and the actual time is usually less.
See here.
In particular this part:
Lamé showed that the number of steps needed to arrive at the greatest common divisor for two numbers less than n is
So O(log min(a, b)) is a good upper bound.
Here's intuitive understanding of runtime complexity of Euclid's algorithm. The formal proofs are covered in various texts such as Introduction to Algorithms and TAOCP Vol 2.
First think about what if we tried to take gcd of two Fibonacci numbers F(k+1) and F(k). You might quickly observe that Euclid's algorithm iterates on to F(k) and F(k-1). That is, with each iteration we move down one number in Fibonacci series. As Fibonacci numbers are O(Phi ^ k) where Phi is golden ratio, we can see that runtime of GCD was O(log n) where n=max(a, b) and log has base of Phi. Next, we can prove that this would be the worst case by observing that Fibonacci numbers consistently produces pairs where the remainders remains large enough in each iteration and never become zero until you have arrived at the start of the series.
We can make O(log n) where n=max(a, b) bound even more tighter. Assume that b >= a so we can write bound at O(log b). First, observe that GCD(ka, kb) = GCD(a, b). As biggest values of k is gcd(a,c), we can replace b with b/gcd(a,b) in our runtime leading to more tighter bound of O(log b/gcd(a,b)).
Here is the analysis in the book Data Structures and Algorithm Analysis in C by Mark Allen Weiss (second edition, 2.4.4):
Euclid's algorithm works by continually computing remainders until 0 is reached. The last nonzero remainder is the answer.
Here is the code:
unsigned int Gcd(unsigned int M, unsigned int N)
{
unsigned int Rem;
while (N > 0) {
Rem = M % N;
M = N;
N = Rem;
}
Return M;
}
Here is a THEOREM that we are going to use:
If M > N, then M mod N < M/2.
PROOF:
There are two cases. If N <= M/2, then since the remainder is smaller
than N, the theorem is true for this case. The other case is N > M/2.
But then N goes into M once with a remainder M - N < M/2, proving the
theorem.
So, we can make the following inference:
Variables M N Rem
initial M N M%N
1 iteration N M%N N%(M%N)
2 iterations M%N N%(M%N) (M%N)%(N%(M%N)) < (M%N)/2
So, after two iterations, the remainder is at most half of its original value. This would show that the number of iterations is at most 2logN = O(logN).
Note that, the algorithm computes Gcd(M,N), assuming M >= N.(If N > M, the first iteration of the loop swaps them.)
Worst case will arise when both n and m are consecutive Fibonacci numbers.
gcd(Fn,Fn−1)=gcd(Fn−1,Fn−2)=⋯=gcd(F1,F0)=1 and nth Fibonacci number is 1.618^n, where 1.618 is the Golden ratio.
So, to find gcd(n,m), number of recursive calls will be Θ(logn).
The worst case of Euclid Algorithm is when the remainders are the biggest possible at each step, ie. for two consecutive terms of the Fibonacci sequence.
When n and m are the number of digits of a and b, assuming n >= m, the algorithm uses O(m) divisions.
Note that complexities are always given in terms of the sizes of inputs, in this case the number of digits.
Gabriel Lame's Theorem bounds the number of steps by log(1/sqrt(5)*(a+1/2))-2, where the base of the log is (1+sqrt(5))/2. This is for the the worst case scenerio for the algorithm and it occurs when the inputs are consecutive Fibanocci numbers.
A slightly more liberal bound is: log a, where the base of the log is (sqrt(2)) is implied by Koblitz.
For cryptographic purposes we usually consider the bitwise complexity of the algorithms, taking into account that the bit size is given approximately by k=loga.
Here is a detailed analysis of the bitwise complexity of Euclid Algorith:
Although in most references the bitwise complexity of Euclid Algorithm is given by O(loga)^3 there exists a tighter bound which is O(loga)^2.
Consider; r0=a, r1=b, r0=q1.r1+r2 . . . ,ri-1=qi.ri+ri+1, . . . ,rm-2=qm-1.rm-1+rm rm-1=qm.rm
observe that: a=r0>=b=r1>r2>r3...>rm-1>rm>0 ..........(1)
and rm is the greatest common divisor of a and b.
By a Claim in Koblitz's book( A course in number Theory and Cryptography) is can be proven that: ri+1<(ri-1)/2 .................(2)
Again in Koblitz the number of bit operations required to divide a k-bit positive integer by an l-bit positive integer (assuming k>=l) is given as: (k-l+1).l ...................(3)
By (1) and (2) the number of divisons is O(loga) and so by (3) the total complexity is O(loga)^3.
Now this may be reduced to O(loga)^2 by a remark in Koblitz.
consider ki= logri +1
by (1) and (2) we have: ki+1<=ki for i=0,1,...,m-2,m-1 and ki+2<=(ki)-1 for i=0,1,...,m-2
and by (3) the total cost of the m divisons is bounded by: SUM [(ki-1)-((ki)-1))]*ki for i=0,1,2,..,m
rearranging this: SUM [(ki-1)-((ki)-1))]*ki<=4*k0^2
So the bitwise complexity of Euclid's Algorithm is O(loga)^2.
For the iterative algorithm, however, we have:
int iterativeEGCD(long long n, long long m) {
long long a;
int numberOfIterations = 0;
while ( n != 0 ) {
a = m;
m = n;
n = a % n;
numberOfIterations ++;
}
printf("\nIterative GCD iterated %d times.", numberOfIterations);
return m;
}
With Fibonacci pairs, there is no difference between iterativeEGCD() and iterativeEGCDForWorstCase() where the latter looks like the following:
int iterativeEGCDForWorstCase(long long n, long long m) {
long long a;
int numberOfIterations = 0;
while ( n != 0 ) {
a = m;
m = n;
n = a - n;
numberOfIterations ++;
}
printf("\nIterative GCD iterated %d times.", numberOfIterations);
return m;
}
Yes, with Fibonacci Pairs, n = a % n and n = a - n, it is exactly the same thing.
We also know that, in an earlier response for the same question, there is a prevailing decreasing factor: factor = m / (n % m).
Therefore, to shape the iterative version of the Euclidean GCD in a defined form, we may depict as a "simulator" like this:
void iterativeGCDSimulator(long long x, long long y) {
long long i;
double factor = x / (double)(x % y);
int numberOfIterations = 0;
for ( i = x * y ; i >= 1 ; i = i / factor) {
numberOfIterations ++;
}
printf("\nIterative GCD Simulator iterated %d times.", numberOfIterations);
}
Based on the work (last slide) of Dr. Jauhar Ali, the loop above is logarithmic.
Yes, small Oh because the simulator tells the number of iterations at most. Non Fibonacci pairs would take a lesser number of iterations than Fibonacci, when probed on Euclidean GCD.
At every step, there are two cases
b >= a / 2, then a, b = b, a % b will make b at most half of its previous value
b < a / 2, then a, b = b, a % b will make a at most half of its previous value, since b is less than a / 2
So at every step, the algorithm will reduce at least one number to at least half less.
In at most O(log a)+O(log b) step, this will be reduced to the simple cases. Which yield an O(log n) algorithm, where n is the upper limit of a and b.
I have found it here

Resources