I was trying to analyse karatsuba algorithm for multiplying an m and an n digit integer. As i understand, it will be most efficient if the integers are divided into m/2 and n/2 digit sub problems. The issues are as follows:-
Can we apply gauss trick in this case and do we need some adjustments to apply it. Padding the smaller integer to match the size may be a solution but will it affect my running time.
In karatsuba algorithm, application of gauss trick ie (a+b)*(c+d) requires T(n/2 +1) as the size of the subproblem may increase by 1 digit. Can i some which way limit the size of sub problem to strictly n/2.
Actually, Karatsuba algo complexity, using Master method is T(n) = 3 * T(n / 2) + O(n),
where N is a maximum of number's length.
Actually, even if size of one sub problem is bigger for 1 digit, we don't care. Since it's just small constant.
Also, I'm not sure, what Gauss trick you're talking about? Usually this is a trick where we split first number on two parts (a, b) and second number on parts (c, d).
So (10 ^ (n / 2) * a + b) * (10 ^ (n / 2) * c + d) = 10^n * a * c + 10 ^ (n / 2) * a * d + 10 ^ (n / 2) * b * c + b * d;
So, we need to calculate ac, bd and ad + bc = (a + b) * (c + d) - ac - bd. So we need to calculate only 3 products of smaller size. And, yes - we could apply it in Karatsuba algorithm.
For more information - check this lecture from Standford Univercity (http://openclassroom.stanford.edu/MainFolder/VideoPage.php?course=IntroToAlgorithms&video=CS161L1P9)
Related
We're asked to provide a $ n+4![\sqrt{n}] =O(n) $ with having a good argumentation and a logical build up for it but it's not said how a good argumentation would look like, so I know that $2n+4\sqrt{n}$ always bigger for n=1 but i wouldn't know how to argue about it and how to logically build it since i just thought about it and it happened to be true. Can someone help out with this example so i would know how to do it?
You should look at the following site https://en.wikipedia.org/wiki/Big_O_notation
For the O big notation we would say that if a function is the following: X^3+X^2+100X = O(x^3). This is with idea that if X-> some very big number, the X^3 term will become the dominant factor in the equation.
You can use the same logic to your equation. Which term will become dominant in your equation.
If this is not clear you should try to plot both terms and see how they scale. This could be more clarifying.
A proof is a convincing, logical argument. When in doubt, a good way to write a convincing, logical argument is to use an accepted template for your argument. Then, others can simply check that you have used the template correctly and, if so, the validity of your argument follows.
A useful template for showing asymptotic bounds is mathematical induction. To use this, you show that what you are trying to prove is true for specific simple cases, called base cases, then you assume it is true in all cases up to a certain size (the induction hypothesis) and you finish the proof by showing the hypothesis implies the claim is true for cases of the very next size. If done correctly, you will have shown the claim (parameterized by a natural number n) is true for a fixed n and for all larger n. This is what is exactly what is required for proving asymptotic bounds.
In your case: we want to show that n + 4 * sqrt(n) = O(n). Recall that the (one?) formal definition of big-Oh is the following:
A function f is bound from above by a function g, written f(n) = O(g(n)), if there exist constants c > 0 and n0 > 0 such that for all n > n0, f(n) <= c * g(n).
Consider the case n = 0. We have n + 4 * sqrt(n) = 0 + 4 * 0 = 0 <= 0 = c * 0 = c * n for any constant c. If we now assume the claim is true for all n up to and including k, can we show it is true for n = k + 1? This would require (k + 1) + 4 * sqrt(k + 1) <= c * (k + 1). There are now two cases:
k + 1 is not a perfect square. Since we are doing analysis of algorithms it is implied that we are using integer math, so sqrt(k + 1) = sqrt(k) in this case. Therefore, (k + 1) + 4 * sqrt(k + 1) = (k + 4 * sqrt(k)) + 1 <= (c * k) + 1 <= c * (k + 1) by the induction hypothesis provided that c > 1.
k + 1 is a perfect square. Since we are doing analysis of algorithms it is implied that we are using integer math, so sqrt(k + 1) = sqrt(k) + 1 in this case. Therefore, (k + 1) + 4 * sqrt(k + 1) = (k + 4 * sqrt(k)) + 5 <= (c * k) + 5 <= c * (k + 1) by the induction hypothesis provided that c >= 5.
Because these two cases cover all possibilities and in each case the claim is true for n = k + 1 when we choose c >= 5, we see that n + 4 * sqrt(n) <= 5 * n for all n >= 0 = n0. This concludes the proof that n + 4 * sqrt(n) = O(n).
If n can be as large as 1M and r something like 100, then what is the most efficient way to calculate nPr.
P(n,r) = n! / (n-r)!
We easily can remove (n-r)! from nominator and denominator. Now formula is
P(n,r) = n*(n-1)*..(n-r+1)
.
.
.
Old answer, not for this question, about combinations:
C(n,k) = n! / (k! * (n-k)!)
We easily can remove (n-k)! from nominator and denominator. Now formula is
C(n,k) = n*(n-1)*..(n-k+1) / k! = n*(n-1)*..(n-k+1) / (1 * 2 * ...*k)
If we first calculate full nominator, it will very big.
But we can alternate steps - take n, then divide by the first term of denominator (1), multiplication by (n-1) - division by the second denominator term (2) and so on.
C(n,k) = n / 1 * (n-1) / 2 * (n-2) / 3 .. * (n-k+1) / k
Note that partial nominator product always i divisible by partial denominator product of the same length.
With this approach intermediate result is not very big, and calculations with big numbers (long arithmetics) will be faster.
I am reading this O(n) solution of the 2D peak finding problem. The author says that an important detail is to
split by the maximum direction. For square arrays this means that
split directions will be alternating.
Why is this necessary?
This is not a necessary. The alternating direction gives O(N) for any arrays.
Let's count the number of comparisons for an array M × N.
First iteration gives 3×M, second gives 3×N/2, third gives 3×M/4, fourth gives 3×N/8, i.e.:
3 * (M + M/4 + M/16 + ...) + 3 * (N/2 + N/8 + N/32 + ...)
We got two geometric series. Cause both of these series has common ratio 1/4, we can calculate the upper limit:
3 * (4 * M/3 + 2 * N/3)
Cause O(const × N) = O(N) and O(M + N) = O(N), we have O(N) algorithm.
If we always choose the vertical direction, then performance of algorithm is O(logM × N). If M much more N, then this algorithm will be faster. F.e. if M = 1025 and N = 3, then count of comparisons in the first algorithm is comparable to 1000, and in the second algorithm is comparable to 30.
Splitting an array by the maximum direction, we got the faster algorithm for specific values of M and N. Is this algorithm O(N)? Yes, cause even comparing both vertical and horizontal sections at each step we have 3 × (M + M/2 + M/4 + ...) + 3 × (N + N/2 + N/4 + ...) = 3 * (2 × M + 2 × N) comparisons, i.e. O(M + N) = O(N). But we always choose only one direction at each step.
Splitting the longer side ensures that the length of the split is at most sqrt(area). He could have also gone through the proof noticing that he's halving area with each call and he looks at most 3 sqrt(area) cells to do so.
I was curious if there was a good way to do this. My current code is something like:
def factorialMod(n, modulus):
ans=1
for i in range(1,n+1):
ans = ans * i % modulus
return ans % modulus
But it seems quite slow!
I also can't calculate n! and then apply the prime modulus because sometimes n is so large that n! is just not feasible to calculate explicitly.
I also came across http://en.wikipedia.org/wiki/Stirling%27s_approximation and wonder if this can be used at all here in some way?
Or, how might I create a recursive, memoized function in C++?
n can be arbitrarily large
Well, n can't be arbitrarily large - if n >= m, then n! ≡ 0 (mod m) (because m is one of the factors, by the definition of factorial).
Assuming n << m and you need an exact value, your algorithm can't get any faster, to my knowledge. However, if n > m/2, you can use the following identity (Wilson's theorem - Thanks #Daniel Fischer!)
to cap the number of multiplications at about m-n
(m-1)! ≡ -1 (mod m)
1 * 2 * 3 * ... * (n-1) * n * (n+1) * ... * (m-2) * (m-1) ≡ -1 (mod m)
n! * (n+1) * ... * (m-2) * (m-1) ≡ -1 (mod m)
n! ≡ -[(n+1) * ... * (m-2) * (m-1)]-1 (mod m)
This gives us a simple way to calculate n! (mod m) in m-n-1 multiplications, plus a modular inverse:
def factorialMod(n, modulus):
ans=1
if n <= modulus//2:
#calculate the factorial normally (right argument of range() is exclusive)
for i in range(1,n+1):
ans = (ans * i) % modulus
else:
#Fancypants method for large n
for i in range(n+1,modulus):
ans = (ans * i) % modulus
ans = modinv(ans, modulus)
ans = -1*ans + modulus
return ans % modulus
We can rephrase the above equation in another way, that may or may-not perform slightly faster. Using the following identity:
we can rephrase the equation as
n! ≡ -[(n+1) * ... * (m-2) * (m-1)]-1 (mod m)
n! ≡ -[(n+1-m) * ... * (m-2-m) * (m-1-m)]-1 (mod m)
(reverse order of terms)
n! ≡ -[(-1) * (-2) * ... * -(m-n-2) * -(m-n-1)]-1 (mod m)
n! ≡ -[(1) * (2) * ... * (m-n-2) * (m-n-1) * (-1)(m-n-1)]-1 (mod m)
n! ≡ [(m-n-1)!]-1 * (-1)(m-n) (mod m)
This can be written in Python as follows:
def factorialMod(n, modulus):
ans=1
if n <= modulus//2:
#calculate the factorial normally (right argument of range() is exclusive)
for i in range(1,n+1):
ans = (ans * i) % modulus
else:
#Fancypants method for large n
for i in range(1,modulus-n):
ans = (ans * i) % modulus
ans = modinv(ans, modulus)
#Since m is an odd-prime, (-1)^(m-n) = -1 if n is even, +1 if n is odd
if n % 2 == 0:
ans = -1*ans + modulus
return ans % modulus
If you don't need an exact value, life gets a bit easier - you can use Stirling's approximation to calculate an approximate value in O(log n) time (using exponentiation by squaring).
Finally, I should mention that if this is time-critical and you're using Python, try switching to C++. From personal experience, you should expect about an order-of-magnitude increase in speed or more, simply because this is exactly the sort of CPU-bound tight-loop that natively-compiled code excels at (also, for whatever reason, GMP seems much more finely-tuned than Python's Bignum).
Expanding my comment to an answer:
Yes, there are more efficient ways to do this. But they are extremely messy.
So unless you really need that extra performance, I don't suggest to try to implement these.
The key is to note that the modulus (which is essentially a division) is going to be the bottleneck operation. Fortunately, there are some very fast algorithms that allow you to perform modulus over the same number many times.
Division by Invariant Integers using Multiplication
Montgomery Reduction
These methods are fast because they essentially eliminate the modulus.
Those methods alone should give you a moderate speedup. To be truly efficient, you may need to unroll the loop to allow for better IPC:
Something like this:
ans0 = 1
ans1 = 1
for i in range(1,(n+1) / 2):
ans0 = ans0 * (2*i + 0) % modulus
ans1 = ans1 * (2*i + 1) % modulus
return ans0 * ans1 % modulus
but taking into account for an odd # of iterations and combining it with one of the methods I linked to above.
Some may argue that loop-unrolling should be left to the compiler. I will counter-argue that compilers are currently not smart enough to unroll this particular loop. Have a closer look and you will see why.
Note that although my answer is language-agnostic, it is meant primarily for C or C++.
n! mod m can be computed in O(n1/2 + ε) operations instead of the naive O(n). This requires use of FFT polynomial multiplication, and is only worthwhile for very large n, e.g. n > 104.
An outline of the algorithm and some timings can be seen here: http://fredrikj.net/blog/2012/03/factorials-mod-n-and-wilsons-theorem/
If we want to calculate M = a*(a+1) * ... * (b-1) * b (mod p), we can use the following approach, if we assume we can add, substract and multiply fast (mod p), and get a running time complexity of O( sqrt(b-a) * polylog(b-a) ).
For simplicity, assume (b-a+1) = k^2, is a square. Now, we can divide our product into k parts, i.e. M = [a*..*(a+k-1)] *...* [(b-k+1)*..*b]. Each of the factors in this product is of the form p(x)=x*..*(x+k-1), for appropriate x.
By using a fast multiplication algorithm of polynomials, such as Schönhage–Strassen algorithm, in a divide & conquer manner, one can find the coefficients of the polynomial p(x) in O( k * polylog(k) ). Now, apparently there is an algorithm for substituting k points in the same degree-k polynomial in O( k * polylog(k) ), which means, we can calculate p(a), p(a+k), ..., p(b-k+1) fast.
This algorithm of substituting many points into one polynomial is described in the book "Prime numbers" by C. Pomerance and R. Crandall. Eventually, when you have these k values, you can multiply them in O(k) and get the desired value.
Note that all of our operations where taken (mod p).
The exact running time is O(sqrt(b-a) * log(b-a)^2 * log(log(b-a))).
Expanding on my comment, this takes about 50% of the time for all n in [100, 100007] where m=(117 | 1117):
Function facmod(n As Integer, m As Integer) As Integer
Dim f As Integer = 1
For i As Integer = 2 To n
f = f * i
If f > m Then
f = f Mod m
End If
Next
Return f
End Function
I found this following function on quora:
With f(n,m) = n! mod m;
function f(n,m:int64):int64;
begin
if n = 1 then f:= 1
else f:= ((n mod m)*(f(n-1,m) mod m)) mod m;
end;
Probably beat using a time consuming loop and multiplying large number stored in string. Also, it is applicable to any integer number m.
The link where I found this function : https://www.quora.com/How-do-you-calculate-n-mod-m-where-n-is-in-the-1000s-and-m-is-a-very-large-prime-number-eg-n-1000-m-10-9+7
If n = (m - 1) for prime m then by http://en.wikipedia.org/wiki/Wilson's_theorem n! mod m = (m - 1)
Also as has already been pointed out n! mod m = 0 if n > m
Assuming that the "mod" operator of your chosen platform is sufficiently fast, you're bounded primarily by the speed at which you can calculate n! and the space you have available to compute it in.
Then it's essentially a 2-step operation:
Calculate n! (there are lots of fast algorithms so I won't repeat any here)
Take the mod of the result
There's no need to complexify things, especially if speed is the critical component. In general, do as few operations inside the loop as you can.
If you need to calculate n! mod m repeatedly, then you may want to memoize the values coming out of the function doing the calculations. As always, it's the classic space/time tradeoff, but lookup tables are very fast.
Lastly, you can combine memoization with recursion (and trampolines as well if needed) to get things really fast.
Is there faster algo to calculate (n! modulo m).
faster than reduction at every multiplication step.
And also
Is there faster algo to calculate (a^p modulo m) better than right-left binary method.
here is my code:
n! mod m
ans=1
for(int i=1;i<=n;i++)
ans=(ans*i)%m;
a^p mod m
result=1;
while(p>0){
if(p%2!=0)
result=(result*a)%m;
p=(p>>1);
a=(a*a)%m;
}
Now the a^n mod m is a O(logn), It's the Modular Exponentiation Algorithm.
Now for the other one, n! mod m, the algorithm you proposed is clearly O(n), So obviously the first algorithm is faster.
The standard trick for computing a^p modulo m is to use successive square. The idea is to expand p into binary, say
p = e0 * 2^0 + e1 * 2^1 + ... + en * 2^n
where (e0,e1,...,en) are binary (0 or 1) and en = 1. Then use laws of exponents to get the following expansion for a^p
a^p = a^( e0 * 2^0 + e1 * 2^1 + ... + en * 2^n )
= a^(e0 * 2^0) * a^(e1 * 2^1) * ... * a^(en * 2^n)
= (a^(2^0))^e0 * (a^(2^1))^e1 * ... * (a^(2^n))^en
Remember that each ei is either 0 or 1, so these just tell you which numbers to take. So the only computations that you need are
a, a^2, a^4, a^8, ..., a^(2^n)
You can generate this sequence by squaring the previous term. Since you want to compute the answer mod m, you should do the modular arithmetic first. This means you want to compute the following
A0 = a mod m
Ai = (Ai)^2 mod m for i>1
The answer is then
a^p mod m = A0^e0 + A1^e1 + ... + An^en
Therefore the computation takes log(p) squares and calls to mod m.
I'm not certain whether or not there is an analog for factorials, but a good place to start looking would be at Wilson's Theorem. Also, you should put in a test for m <= n, in which case n! mod m = 0.
For the first computation, you should only bother with the mod operator if ans > m:
ans=1
for(int i=1;i<=n;i++) {
ans *= i;
if (ans > m) ans %= m;
}
For the second computation, using (p & 1) != 0 will probably be a lot faster than using p%2!=0 (unless the compiler recognizes this special case and does it for you). Then the same comment applies about avoiding the % operator unless necessary.