Consider the function y = 1/((1-x^5)(1-x^7)(1-x^11)).
WolframAlpha computes the first 1000 terms of its Maclaurin series expansion in a few seconds: https://www.wolframalpha.com/input/?i=maclaurin+series+1%2F%28%281-x%5E5%29%281-x%5E7%29%281-x%5E11%29%29
Out of curiosity I wrote a very naive Java program to do the same, using BigInteger for the polynomial coefficients. In pseudocode it would be something like:
BigInt next = 1;
BigInt factorial = 1;
while (true) {
    function = function.differentiate();
    factorial *= ++next;
    print("Next coefficient is: " + function(0) / factorial);
}
This program crashes with a java.lang.OutOfMemoryError after computing the first seven or so coefficients, because the numerator and denominator of the fraction become enormously long polynomials. I suppose my code is inefficient, but it still does not look like Wolfram is using the same technique they show you in the first-year calculus class. The question is: what does Wolfram use?
For comparison, Wolfram takes quite a bit more time just to compute the tenth derivative of the same function than it takes to get the first 1000 terms of the series, which, if done naively, would require differentiating the function 1000 times.
https://www.wolframalpha.com/input/?i=tenth+derivative+1%2F%28%281-x%5E5%29%281-x%5E7%29%281-x%5E11%29%29
tl;dr: The coefficient of x^N is the number of ways that N can be partitioned using only 5s, 7s, and 11s.
I'm not sure how Wolfram does it, but for this function, it is possible to compute the coefficients more efficiently (using techniques you would see at the end of your first year in calculus). As a power series, 1/(1-x) = ∑_{k=0}^∞ x^k. But we can replace x with x^n, and the relation will still hold. This means that
1/((1-x^5)(1-x^7)(1-x^11)) = (∑_{k=0}^∞ x^{5k})(∑_{k=0}^∞ x^{7k})(∑_{k=0}^∞ x^{11k})
Multiplying this out would be a pain. But all of the coefficients are 1, so we only need to look at the exponents, which add together. For example, Wolfram shows that the coefficient of x^40 is 4, which comes from x^(5·1)·x^(7·5)·x^(11·0) + x^(5·0)·x^(7·1)·x^(11·3) + x^(5·3)·x^(7·2)·x^(11·1) + x^(5·8)·x^(7·0)·x^(11·0).
But if we only need to add the exponents, then we don't need to care about the coefficients or the variable x. In the end, the coefficient of x^N is the number of ways that N can be written as a sum of 5s, 7s, and 11s. This is a restricted version of the partition problem, but the same ideas still hold. In particular, a dynamic programming approach would be able to calculate the coefficients in linear time and space.
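To make the dynamic-programming remark concrete, here is a minimal Java sketch (my own illustration, not necessarily what Wolfram does): coeff[n] counts the ways to write n as an unordered sum of 5s, 7s and 11s, which by the argument above is exactly the coefficient of x^n.

public class SeriesCoefficients {
    public static void main(String[] args) {
        int N = 1000;
        long[] coeff = new long[N + 1];
        coeff[0] = 1;                          // the empty sum
        for (int part : new int[] {5, 7, 11})  // admit one part size at a time
            for (int n = part; n <= N; n++)
                coeff[n] += coeff[n - part];   // either add another 'part' or stop
        System.out.println(coeff[40]);         // prints 4, matching Wolfram
    }
}

Each part size costs one O(N) pass, so the first 1000 coefficients come out essentially instantly.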
Not sure about the fraction's numerator, but I can see why its denominator is growing much too fast:
factorial*=factorial+1;
is not how you calculate a factorial. That more than squares the "factorial" value on the denominator with each iteration! So you will get 1, 2, 6, 42, 1806, 3263442... By contrast, factorials go 1, 2, 6, 24, 120, 720...
To calculate the factorial incrementally, maintain a loop counter, and multiply factorial by that each time.
Rational functions (and this one in particular) need neither differentiation nor factorials. One way to calculate the series is to expand each factor into a series of its own (e.g. 1/(1 - x^5) = ∑_{n=0}^∞ x^(5n)), and then multiply the results as polynomials.
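As a rough sketch of that approach (my own code, not the poster's), assuming we only ever need terms up to some fixed degree N so every product can be truncated:

public class TruncatedSeriesProduct {
    // Multiply two truncated power series, dropping terms above degree N.
    static long[] multiply(long[] a, long[] b, int N) {
        long[] c = new long[N + 1];
        for (int i = 0; i <= N; i++)
            for (int j = 0; i + j <= N; j++)
                c[i + j] += a[i] * b[j];
        return c;
    }

    // Truncated series of 1/(1 - x^p): 1 + x^p + x^(2p) + ...
    static long[] geometric(int p, int N) {
        long[] s = new long[N + 1];
        for (int k = 0; k <= N; k += p) s[k] = 1;
        return s;
    }

    public static void main(String[] args) {
        int N = 1000;
        long[] series = geometric(5, N);
        series = multiply(series, geometric(7, N), N);
        series = multiply(series, geometric(11, N), N);
        System.out.println(series[40]);  // 4
    }
}

Truncating at degree N keeps every intermediate array at N + 1 entries, which avoids the blow-up the question ran into.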
You can do each operation on formal power series. Given power series for f,g you can find a recurrence relation for power series of f(z)+g(z), f(z)g(z), f(z)/g(z), f(g(z)), and even f^-1(z). Using these methods you can compute the power series of practically any function in polynomial time.
In special cases there are more efficient methods. If f(z) has a power series, then coefficients of the power series of f(z)/(1 - z) are simply the partial sums of the power series of f(z). So if f_n is the series for f, then the series for g(z) = f(z)/(1 - z) is given by g_n = f_n + g_(n-1).
You can extend this to division by any polynomial. The algorithm is basically the same as long division for polynomials. For example, let's compute 1/(1 - z^2). We add and subtract z^2 to the numerator to get (1 - z^2 + z^2)/(1 - z^2) = 1 + z^2/(1 - z^2). Then we add and subtract z^4 to get (z^2 - z^4 + z^4)/(1 - z^2) = z^2 + z^4/(1 - z^2). Going on like that you find 1/(1 - z^2) = 1 + z^2 + z^4 + z^6 and so on.
When you do this for a general polynomial of degree n, the numerator at each step always has degree less than n, so at most n coefficients. You can store those coefficients in an array and use that as your state. From a state you can compute the next term in the power series and the next state in O(n) time. This gives you an O(nk)-time algorithm to find the first k terms in the power series of 1/p(z).
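A minimal sketch of that recurrence (my own, assuming the constant term of p(z) is 1, as it is for (1 - z^5)(1 - z^7)(1 - z^11)): each new coefficient of 1/p(z) is a fixed linear combination of the previous deg(p) coefficients, which is the O(nk) scheme just described.

public class ReciprocalSeries {
    public static void main(String[] args) {
        int N = 1000;
        // Expand p(z) = (1 - z^5)(1 - z^7)(1 - z^11) once up front.
        long[] p = {1};
        for (int f : new int[] {5, 7, 11}) {
            long[] q = new long[p.length + f];
            for (int i = 0; i < p.length; i++) {
                q[i]     += p[i];      // p(z) * 1
                q[i + f] -= p[i];      // p(z) * (-z^f)
            }
            p = q;
        }
        // c[k] are the coefficients of 1/p(z); the "state" is the last deg(p) of them.
        long[] c = new long[N + 1];
        c[0] = 1;                      // p[0] == 1, so c[0] = 1
        for (int k = 1; k <= N; k++) {
            long sum = 0;
            for (int j = 1; j < p.length && j <= k; j++)
                sum += p[j] * c[k - j];
            c[k] = -sum;               // from [z^k](p(z) * c(z)) = 0 for k >= 1
        }
        System.out.println(c[40]);     // 4
    }
}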
Note that computing a power series at a point z=z0 is the same as finding all derivatives at z=z0, so the two problems are equivalent. You can compute power series at a symbolic variable point to find a formula for the derivative, so there is theoretically no reason why Wolfram should be so much slower at finding n-th derivatives.
P(x,y,z){
print x
if(y!=x) print y
if(z!=x && z!=y) print z
}
This is a trivial algorithm; the values x, y, z are chosen randomly from {1, ..., r} with r >= 1.
I'm trying to determine the average case complexity of this algorithm and I measure complexity based on the number of print statements.
The best case here is T(n) = 1 or O(1), when x=y=z and the probability of that is 1/3.
The worst case here is still T(n) = 3 or still O(1) when x!=y!=z and the probability is 2/3.
But when it comes to mathematically deriving the average case:
The sample space has n possible inputs, and each input occurs with probability 1/n.
So, how do I calculate average case complexity? (This is where I draw a blank..)
Your algorithm has three cases (assuming x, y and z are chosen independently and uniformly from {1, ..., r}):
Only x is printed. This requires y == x and z == x, so its probability is (1/r) * (1/r) = 1/r^2. The cost for this case is 1.
Exactly two values are printed. This happens when y == x but z != x (probability (1/r) * ((r-1)/r)), or when y != x but z collides with x or y (probability ((r-1)/r) * (2/r)). Together that is 3(r-1)/r^2. Cost = 2.
All three numbers are distinct. The probability of this is ((r-1)/r) * ((r-2)/r) = (r-1)(r-2)/r^2. Cost = 3.
Thus, the average cost can be computed as:
1 * 1/r^2 + 2 * 3(r-1)/r^2 + 3 * (r-1)(r-2)/r^2 = 3 - 3/r + 1/r^2 == O(1)
Edit: The above expression is O(1), since it is bounded by the constant 3 for every r.
The average case will be somewhere between the best and worst cases; for this particular problem, that's all you need (at least as far as big-O).
1) Can you program the general case at least? Write the (pseudo)code and analyze it; the answer might be readily apparent. You may actually program it suboptimally and there may exist a better solution. This is very typical, and it's part of the puzzle-solving of the mathematics end of computer science; e.g. it's hard to discover quicksort on your own if you're just trying to code up a sort.
2) If you can, then run a Monte Carlo simulation and graph the results (a quick sketch follows below). I.e., for N = 1, 5, 10, 20, ..., 100, 1000, or whatever sample sizes are realistic, run 10000 trials and plot the average time. If you're lucky, X = sample size versus Y = average time over the 10000 runs will graph out a nice line, parabola, or some other easy-to-model curve.
So I'm not sure whether you need help with (1) finding or coding the algorithm or (2) analyzing it; you will probably want to revise your question to specify which.
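For point (2), here is a quick Monte Carlo sketch in Java (my own, with an arbitrary choice of r values) that estimates the average number of print statements:

public class AveragePrints {
    public static void main(String[] args) {
        java.util.Random rng = new java.util.Random();
        int trials = 100000;
        for (int r : new int[] {1, 2, 5, 10, 100, 1000}) {
            long totalPrints = 0;
            for (int t = 0; t < trials; t++) {
                int x = rng.nextInt(r) + 1;    // uniform on {1, ..., r}
                int y = rng.nextInt(r) + 1;
                int z = rng.nextInt(r) + 1;
                int prints = 1;                // x is always printed
                if (y != x) prints++;
                if (z != x && z != y) prints++;
                totalPrints += prints;
            }
            System.out.println("r=" + r + "  average=" + (double) totalPrints / trials);
        }
    }
}

The averages settle just below 3 as r grows, consistent with the expression 3 - 3/r + 1/r^2 derived in the other answer, and always bounded by a constant.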
P(x,y,z){
1.print x
2.if(y!=x)
3. print y
4.if(z!=x && z!=y)
5. print z
}
Line 1: takes a constant time c1 (c1: print x)
Line 2: takes a constant time c2 (c2: condition test)
Line 3: takes a constant time c3 (c3: print y)
Line 4: takes a constant time c4 (c4: condition test)
Line 5: takes a constant time c5 (c5: print z)
Analysis:
As long as your function P(x,y,z) does not depend on the input size r, the program takes a constant amount of time to run, since the time taken is T(c1) + T(c2 + c3) + T(c4 + c5). Summing these up, the big O of the function P(x,y,z) is O(1), where the 1 indicates a constant amount of time, because T(c1), T(c2), ..., T(c5) each take a constant amount of time. If the function P(x,y,z) instead iterated from 1 to r, the complexity of the snippet would change and would be expressed in terms of the input size r.
Best case: O(1)
Average case: O(1)
Worst case: O(1)
The Pollard rho factorization method uses a generating function f(x) = x^2 - a (mod n) or f(x) = x^2 + a (mod n). Does the choice of this (quadratic) function have any significance, or may we use any function (cubic, higher-degree polynomial, or even linear), given that all we have to do is find numbers that fall into the same congruence class modulo a nontrivial divisor of n?
In Knuth Vol II (The Art Of Computer Programming - Seminumerical Algorithms), section 4.5.4, Knuth says:
Furthermore, if f(y) mod p behaves as a random mapping from the set {0, 1, ..., p-1} into itself, exercise 3.1-12 shows that the average value of the least such m will be of order sqrt(p)... From the theory in Chapter 3, we know that a linear polynomial f(x) = ax + c will not be sufficiently random for our purpose. The next simplest case is quadratic, say f(x) = x^2 + 1. We don't know that this function is sufficiently random, but our lack of knowledge tends to support the hypothesis of randomness, and empirical tests show that this f does work essentially as predicted.
The probability theory that says f(x) has a cycle of length about sqrt(p) assumes in particular that there can be two values y and z such that f(y) = f(z), since f is chosen at random. The rho in Pollard rho contains such a junction, with a tail of multiple values leading onto the cycle. For a linear function f(x) = ax + b with gcd(a, p) = 1 (which is likely, since p is prime), f(y) = f(z) implies y = z mod p, so there are no such junctions.
If you look at http://www.agner.org/random/theory/chaosran.pdf you will see that the expected cycle length of a random function is about the sqrt of the state size, but the expected cycle length of a random bijection is about the state size. If you think of generating the random function only as you evaluate it you can see that if the function is entirely random then every value seen so far is available to be chosen again at random to find a cycle, so the odds of closing the cycle increase with the cycle length, but if the function has to be invertible the only way to close the cycle is to generate the starting point, which is much less likely.
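For concreteness, here is a minimal Pollard rho sketch in Java using the customary f(x) = x^2 + c (mod n) and Floyd cycle detection; it is only an illustration (no Brent variant, no batched gcds), and 8051 is just a small example value.

import java.math.BigInteger;

public class PollardRho {
    // One attempt at finding a nontrivial factor of n using f(x) = x^2 + c mod n.
    static BigInteger rho(BigInteger n, long c) {
        BigInteger x = BigInteger.valueOf(2), y = BigInteger.valueOf(2);
        BigInteger add = BigInteger.valueOf(c), d = BigInteger.ONE;
        while (d.equals(BigInteger.ONE)) {
            x = x.multiply(x).add(add).mod(n);   // tortoise: one step of f
            y = y.multiply(y).add(add).mod(n);   // hare: two steps of f
            y = y.multiply(y).add(add).mod(n);
            d = x.subtract(y).gcd(n);
        }
        return d;   // may equal n (failure); then retry with another c or start value
    }

    public static void main(String[] args) {
        System.out.println(rho(new BigInteger("8051"), 1));   // 97 (8051 = 83 * 97)
    }
}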
Knowing that we can use a divide-and-conquer algorithm to compute large powers, for example 2^100 = 2^50 * 2^50, which is much more efficient, is this method also efficient for roots? For example, is 2^(1/100) = 2^(1/50) * 2^(1/50)?
In other words, I'm wondering whether n^(1/x) is more efficient to compute than n^(1/y) for x < y, where x and y are integers.
I don't think a divide and conquer method is used when you have non-integer exponents. I would assume that a Taylor polynomial is used to compute x^y as e^(y ln(x)). You can handle the integer part of y using divide and conquer and then multiply by x raised to the fractional part, but it doesn't make sense to split the exponent in two otherwise. Also:
2^(1/100) = 2^(1/50) * 2^(1/50)
This is not true:
2^(1/50) * 2^(1/50) = 2^(1/50 + 1/50) = 2^(1/25) != 2^(1/100)
You would be doing:
2^(1/100) = 2^(1/200) * 2^(1/200)
As x and y are floating-point numbers, exp(1/x) might not be more efficient than exp(1/y) for all x < y.
But the point of divide and conquer algorithms is that
if we have something like exp(1/x) we won't calculate it again, i.e. we divide 2^N into two identical subproblems of smaller size, 2^(N/2) * 2^(N/2), and calculate 2^(N/2) only once (see the sketch below).
Similarly, exp(2/x) can be divided into exp(1/x) * exp(1/x), and we will have to calculate exp(1/x) only once. This should improve performance.
Also, having a smaller number in the denominator should help.
So I think this should work fine.
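For comparison, here is what the halving idea looks like for an integer exponent (exponentiation by squaring), as a rough sketch; fractional exponents would typically go through exp(y * ln(x)) instead, as the first answer describes.

public class FastPow {
    // Compute base^exp for integer exp >= 0 with O(log exp) multiplications.
    static double power(double base, long exp) {
        if (exp == 0) return 1.0;
        double half = power(base, exp / 2);   // solve the half-size problem once
        double result = half * half;          // reuse it instead of recomputing
        if (exp % 2 == 1) result *= base;     // odd exponent: one extra factor
        return result;
    }

    public static void main(String[] args) {
        System.out.println(power(2.0, 100));  // 1.2676506002282294E30
    }
}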
Multiplying two binary numbers takes n^2 time (with n being the number of bits), yet squaring a number can somehow be done more efficiently. How could that be?
Or is it not possible? This is insanity!
There exist algorithms more efficient than O(N^2) to multiply two numbers (see Karatsuba, Pollard, Schönhage–Strassen, etc.)
The two problems "multiply two arbitrary N-bit numbers" and "Square an arbitrary N-bit number" have the same complexity.
We have
4*x*y = (x+y)^2 - (x-y)^2
So if squaring N-bit integers takes O(f(N)) time, then the product of two arbitrary N-bit integers can be obtained in O(f(N)) too. (that is 2x N-bit sums, 2x N-bit squares, 1x 2N-bit sum, and 1x 2N-bit shift)
And obviously we have
x^2 = x * x
So if multiplying two N-bit integers takes O(f(N)), then squaring a N-bit integer can be done in O(f(N)).
Any algorithm computing the product (resp the square) provides an algorithm to compute the square (resp the product) with the same asymptotic cost.
As noted in other answers, the algorithms used for fast multiplication can be simplified in the case of squaring. The gain will be on the constant in front of the f(N), and not on f(N) itself.
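As a small sketch of that reduction (my own illustration): given any fast squaring routine, a product can be recovered from the identity 4xy = (x+y)^2 - (x-y)^2.

import java.math.BigInteger;

public class MultiplyViaSquares {
    // Stand-in for a fast squaring routine; any squaring algorithm would do here.
    static BigInteger square(BigInteger x) {
        return x.multiply(x);
    }

    // x * y = ((x + y)^2 - (x - y)^2) / 4
    static BigInteger multiply(BigInteger x, BigInteger y) {
        BigInteger sumSq  = square(x.add(y));
        BigInteger diffSq = square(x.subtract(y));
        return sumSq.subtract(diffSq).shiftRight(2);   // exact division by 4
    }

    public static void main(String[] args) {
        System.out.println(multiply(BigInteger.valueOf(1234), BigInteger.valueOf(5678)));
        // 7006652
    }
}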
Squaring an n-digit number may be faster than multiplying two random n-digit numbers. Googling, I found this article. It is about arbitrary-precision arithmetic, but it may be relevant to what you're asking. In it the authors say this:
In squaring a large integer, i.e. X^2 = (x_{n-1}, x_{n-2}, ..., x_1, x_0)^2, many cross-product terms of the form x_i * x_j and x_j * x_i are equivalent. They need to be computed only once and then left shifted in order to be doubled. An n-digit squaring operation is performed using only (n^2 + n)/2 single-precision multiplications.
Like others have pointed out, squaring can only be about 1.5X or 2X faster than regular multiplication between arbitrary numbers. Where does the computational advantage come from? It's symmetry. Let's calculate the square of 1011 and try to spot a pattern that we can exploit. u0:u3 represent the bits in the number from the most significant to the least significant.
1011 // u3 * u0 : u3 * u1 : u3 * u2 : u3 * u3
1011 // u2 * u0 : u2 * u1 : u2 * u2 : u2 * u3
0000 // u1 * u0 : u1 * u1 : u1 * u2 : u1 * u3
1011 // u0 * u0 : u0 * u1 : u0 * u2 : u0 * u3
If you consider the elements ui * ui for i = 0, 1, 2, 3 to form the diagonal and ignore them, you'll see that the elements ui * uj for i ≠ j are repeated twice.
Therefore, all you need to do is calculate the product sum for elements below the diagonal and double it, with a left shift. You'd finally add the diagonal elements. Now you can see where the 2X speed up comes from. In practice, the speed-up is about 1.5X because of the diagonal and extra operations.
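Here is a sketch of that in code (my own, using base-10 digit arrays for readability; real bignum libraries do the same with machine words): only products on or below the diagonal are computed, and the cross terms are doubled, for (n^2 + n)/2 digit multiplications in total.

public class SchoolbookSquare {
    // Square a number given as digits in base 10, least significant digit first.
    static int[] square(int[] d) {
        int n = d.length;
        long[] acc = new long[2 * n];
        for (int i = 0; i < n; i++) {
            acc[2 * i] += (long) d[i] * d[i];        // diagonal term, used once
            for (int j = i + 1; j < n; j++)
                acc[i + j] += 2L * d[i] * d[j];      // cross term, doubled
        }
        int[] result = new int[2 * n];
        long carry = 0;
        for (int k = 0; k < 2 * n; k++) {            // propagate carries
            long v = acc[k] + carry;
            result[k] = (int) (v % 10);
            carry = v / 10;
        }
        return result;                               // a square always fits in 2n digits
    }

    public static void main(String[] args) {
        int[] digits = {4, 3, 2, 1};                 // 1234, least significant first
        int[] sq = square(digits);
        StringBuilder s = new StringBuilder();
        for (int k = sq.length - 1; k >= 0; k--) s.append(sq[k]);
        System.out.println(s);                       // 01522756, i.e. 1234^2 = 1522756
    }
}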
I believe you may be referring to exponentiation by squaring. This technique isn't used for multiplying, but for raising to a power x^n, where n may be large. Rather than multiplying x by itself n times, one performs a series of squaring and multiplying operations that can be mapped to the binary representation of n. The number of multiplication operations (which are more expensive than additions for large numbers) is reduced from n to log(n) with respect to the naive exponentiation algorithm.
Do you mean multiplying a number by a power of 2? This is usually quicker than multiplying two arbitrary numbers, since the result can be calculated by simple bit shifting. However, bear in mind that modern microprocessors dedicate lots of brute-force silicon to these types of calculations, and most arithmetic is performed with blinding speed compared to older microprocessors.
I have it!
2 * 2
is more expensive than
2 << 1
(The caveat being it only works for one case.)
Suppose you want to expand out the multiplication (a+b)×(c+d). It splits up into four individual multiplications: a×c + a×d + b×c + b×d.
But if you want to expand out (a+b)², then it only needs three multiplications (and a doubling): a² + 2ab + b².
(Note also that two of the multiplications are themselves squares.)
Hopefully this just begins to give an insight into some of the speedups that are possible when performing a square over a regular multiplication.
First of all, great question! I wish there were more questions like this.
So it turns out that the method I came up with is O(n log n) for general multiplication, counting arithmetic operations only. You can represent any numbers X and Y as
X = x_{n-1} 2^{n-1} + ... + x_1 2^1 + x_0 2^0
Y = y_{m-1} 2^{m-1} + ... + y_1 2^1 + y_0 2^0
where
x_i, y_i ∈ {0, 1}
Then
XY = sum_{k=0}^{m+n} r_k 2^k
where
r_k = sum_{i=0}^{k} x_i y_{k-i}
which is just a straightforward application of the FFT to find the values of r_k for each k in (n + m) log(n + m) time.
Then for each r_k you must determine how big the overflow is and carry it accordingly. For squaring a number this means O(n log n) arithmetic operations.
You can handle the r_k values more efficiently using the Schönhage–Strassen algorithm to obtain an O(n log n log log n) bound on bit operations.
The exact answer to your question is already posted by Eric Bainville.
However, you can get a much better bound than O(n^2) for squaring a number simply because there exist much better bounds for multiplying integers!
If you assume the operands fit in the machine's word size and that the number to be squared is already in memory, a squaring operation requires only one load from memory instead of two, so it could be faster.
For arbitrary length integers, multiplication is typically O(N²) but there are algorithms which reduce this for large integers.
If you assume the simple O(N²) approach to multiply a by b, then for each bit in a you have to shift b and add it to an accumulator if that bit is one. For each bit in a you need 3N shifts and additions.
Note that
( x - y )² = x² - 2 xy + y²
Hence
x² = ( x - y )² + 2 xy - y²
If each y is the largest power of two not greater than x, this gives a reduction to a lower square, two shifts and two additions. As N is reduced on each iteration, you may get an efficiency gain ( the symmetry means it visits each point in a triangle rather than a rectangle ), but it's still O(N²).
There may be another better symmetry to exploit.
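Here is a little sketch of that reduction (my own reading of it): with y the largest power of two not exceeding x, the 2xy and y^2 terms become shifts, and only the smaller square (x - y)^2 recurses.

public class SquareByPowersOfTwo {
    // x^2 = (x - y)^2 + 2*x*y - y^2, with y = 2^k the largest power of two <= x,
    // so the 2*x*y and y^2 terms are just shifts. Assumes x is small enough
    // (roughly below 2^31) that the shifts do not overflow a long.
    static long square(long x) {
        if (x < 2) return x;                        // 0^2 = 0, 1^2 = 1
        int k = 63 - Long.numberOfLeadingZeros(x);  // position of the top set bit
        return square(x - (1L << k)) + (x << (k + 1)) - (1L << (2 * k));
    }

    public static void main(String[] args) {
        System.out.println(square(1011));           // 1022121
    }
}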
a^2 = (a+b)*(a-b) + b^2, e.g. 66^2 = (66+6)(66-6) + 6^2 = 72*60 + 36 = 4356.
For a^n just apply the same idea repeatedly through the powers: 66^4 = 4356^2.
I would approach the problem via N-bit multiplication.
For a number A, let the bits be A(n-1) A(n-2) ... A(1) A(0), and for a number B let the bits be B(n-1) B(n-2) ... B(1) B(0).
For the square of the number A, the unique bit products that have to be generated are
A(0) with A(0) ... A(n-1),
A(1) with A(1) ... A(n-1), and so on,
so the total number of operations is
OP = n + (n-1) + (n-2) + ... + 1
Therefore OP = (n^2 + n)/2,
and the asymptotic notation is O(n^2).
For the multiplication of A and B, all n^2 bit products are distinct,
so the asymptotic notation is again O(n^2).
The square root of 2^n is 2^(n/2), or 2^(n >> 1), so if your number is a power of two everything is totally simple once you know the power. To multiply is even simpler: 2^4 * 2^8 is 2^(4+8). There's no sense in the statements you've made.
If you have a binary number A, it can (always, proof left to the eager reader) be expressed as (2^n + B); this can be squared as 2^(2n) + 2^(n+1) B + B^2. We can then repeat the expansion until B equals zero. I haven't looked too hard at it, but intuitively it feels as if you should be able to make a squaring function take fewer algorithmic steps than a general-purpose multiplication.
I think that you are completely wrong in your statements
Multiplying two binary numbers takes n^2 time
Multiplying two 32-bit numbers takes exactly one clock cycle. On a 64-bit processor, I would assume that multiplying two 64-bit numbers takes exactly one clock cycle. It wouldn't even surprise me if a 32-bit processor could multiply two 64-bit numbers in one clock cycle.
yet squaring a number can be done more efficiently somehow.
Squaring a number is just multiplying the number with itself, so that is just a simple multiplication. There is no "square" operation in the CPU.
Maybe you are confusing "squaring" with "multiplying by a power of 2". Multiplying by 2 can be implemented by shifting all the bits one position to the left. Multiplying by 4 is shifting all the bits two positions to the left. By 8, three positions. But this trick only applies to powers of two.
I am finding it very hard to understand the way the inverse of the matrix is calculated in the Hill Cipher algorithm. I get the idea of it all being done in modulo arithmetic, but somehow things are not adding up. I would really appreciate a simple explanation!
Consider the following Hill Cipher key matrix:
5 8
17 3
Please use the above matrix for illustration.
You must study the linear congruence theorem and the extended GCD algorithm, which belong to number theory, in order to understand the maths behind modular arithmetic.
The inverse of a matrix K, for example, is (1/det(K)) * adjugate(K), where det(K) != 0.
I assume that you don't understand how to calculate 1/det(K) in modular arithmetic, and this is where linear congruences and the GCD come into play.
Your K has det(K) = -121. Let's say that the modulus m is 26. We want x*(-121) = 1 (mod 26). [a = b (mod m) means that a - b = N*m for some integer N.]
We can easily find that for x = 3 the above congruence holds, because 26 divides (3*(-121) - 1) exactly. Of course, the correct way is to use the GCD computation in reverse (back-substitution) to calculate x, but I don't have time to explain how to do it. Check the extended GCD algorithm :)
Now, inv(K) = 3*([3 -8], [-17 5]) (mod 26) = ([9 -24], [-51 15]) (mod 26) = ([9 2], [1 15]).
Update: check out Basics of Computational Number Theory to see how to calculate modular inverses with the extended Euclidean algorithm. Note that -121 mod 26 = 9 and gcd(9, 26) = 1; the algorithm returns the coefficients (-1, 3), i.e. 26*(-1) + 9*3 = 1, so 3 is the inverse of 9 (and hence of -121) mod 26.
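A small sketch of those steps in code (mine, not taken from the linked notes): the extended Euclidean algorithm supplies the inverse of det(K) mod 26, and the adjugate does the rest for a 2x2 key.

public class HillInverse {
    // Extended Euclid: returns x with a*x = 1 (mod m), assuming gcd(a, m) == 1.
    static int modInverse(int a, int m) {
        int oldR = ((a % m) + m) % m, r = m;
        int oldS = 1, s = 0;
        while (r != 0) {
            int q = oldR / r;
            int t = oldR - q * r; oldR = r; r = t;
            t = oldS - q * s; oldS = s; s = t;
        }
        return ((oldS % m) + m) % m;   // oldR is gcd(a, m), assumed to be 1 here
    }

    public static void main(String[] args) {
        int m = 26;
        int[][] k = {{5, 8}, {17, 3}};
        int det = k[0][0] * k[1][1] - k[0][1] * k[1][0];   // -121
        int detInv = modInverse(det, m);                   // 3, since 9 * 3 = 27 = 1 (mod 26)
        // Inverse of a 2x2 matrix: detInv * adjugate, everything reduced mod 26.
        int[][] inv = {
            { detInv * k[1][1], detInv * -k[0][1] },
            { detInv * -k[1][0], detInv * k[0][0] }
        };
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                inv[i][j] = ((inv[i][j] % m) + m) % m;
        System.out.println(java.util.Arrays.deepToString(inv));   // [[9, 2], [1, 15]]
    }
}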
In my very humble opinion it is much easier to calculate the inverse matrix (modular or otherwise) by using the Gauss-Jordan method. That way you don't have to calculate the determinant, and the method scales very simply to arbitrarily large systems.
Just look up 'Gauss-Jordan Matrix Inverse' - but to summarise, you simply adjoin a copy of the identity matrix to the right of the matrix to be inverted, then use row operations to reduce the left-hand matrix until it becomes the identity. At that point, the adjoined matrix has become the inverse of the original matrix. Voila!