Running time of Euclid's GCD algorithm? - algorithm

I am trying to learn number theory for RSA cryptography by reading the CLR algorithms book. I was looking at exercise 31.2-5 which claims a bound of 1 + logΦ(b / gcd(a,b)).
The full question is:
If a > b >= 0, show that the invocation EUCLID(a,b) makes at most 1 + logΦb recursive calls. Improve this bound to 1 + logΦ(b / gcd(a,b)).
Does anyone know how to show this? There are already several other questions and answers to Euclid's algorithm on this site already but none of them seem to have this exact precise answer.

Refer to the analysis of Euclid's algorithm by Donald Knuth, in TAOCP Vol.2 p.356

This is how I solved the first part. I am still working on the second
We know that the runtime of Euclid algorithm is a function of the number of steps involved (pg of the book)
Let k be the number of recursive steps needed.
Hence b >= F k+1 >= φ k -1
b >= φ k -1
Taking log to find k we have
Log φ b >= k-1
1 + Log φ b > = k
Hence the run time is O (1 + Log φ b)

Show that if Euclid(a,b) takes more than N steps, then
a>=F(n+1) and b>=F(n), where F(i) is the ith Fibonacci
number. This can easily be done by Induction.
Show that F(n) ≥ φn-1, again by
Induction.
Using results of Step 1 and 2, we have b ≥ F(n) ≥
φn-1 Taking logarithm on both sides,
logφb ≥ n-1. Hence proved, n ≤ 1 +
logφb
The second part trivially follows.
No. of recursive calls in EUCLID(ka,kb) is the same as in EUCLID(a,b), where k is some integer.
Hence, the bound is improved to 1 + logφ( b / gcd(a,b) ).

Related

T(n) = T(n/10) + T(an) + n, how to solve this?

Update: I'm still looking for a solution without using something from outside sources.
Given: T(n) = T(n/10) + T(an) + n for some a, and that: T(n) = 1 if n < 10, I want to check if the following is possible (for some a values, and I want to find the smallest possible a):
For every c > 0 there is n0 > 0 such that for every n > n0, T(n) >= c * n
I tried to open the function declaration step by step but it got really complicated and I got stuck since I see no advancement at all.
Here is what I did: (Sorry for adding an image, that's because I wrote a lot on word and I cant paste it as text)
Any help please?
Invoking Akra–Bazzi, g(n) = n¹, so the critical exponent is p = 1, so we want (1/10)¹ + a¹ = 1, hence a = 9/10.
Intuitively, this is like the mergesort recurrence: with linear overhead at each call, if we don't reduce the total size of the subproblems, we'll end up with an extra log in the running time (which will overtake any constant c).
Requirement:
For every c > 0 there is n0 > 0 such that for every n > n0, T(n) >= c*n
By substituting the recurrence in the inequality and solving for a, you will get:
T(n) >= c*n
(c*n/10) + (c*a*n) + n >= c*n
a >= 0.9 - (1/c)
Since our desired outcome is for all c (treat c as infinity), we get a >= 0.9. Therefore, smallest value of a is 0.9 which will satisfy T(n) >= c*n for all c.
Another perspective: draw a recursion tree and count the work done per level. The work done in each layer of the tree will be (1/10 + a) as much as the work of the level above it. (Do you see why?)
If (1/10 + a) < 1, this means the work per level is decaying geometrically, so the total work done will sum to some constant multiple of n (whose leading coefficient depends on a). If (1 / 10 + a) ≥ 1, then the work done per level stays the same or grows from one level to the next, so the total work done now depends (at least) on the number of layers in the tree, which is Θ(log n) because the subproblems sizes are dropping by a constant from one layer to the next and that can’t happen more than Θ(log n) times. So once (1 / 10 + a) = 1 your runtime suddenly is Ω(n log n) = ω(n).
(This is basically the reasoning behind the Master Theorem, just applied to non-uniform subproblems sizes, which is where the Akra-Bazzi theorem comes from.)

Understanding geometric improvement approach for obtaining polynomial-time algorithms

I'm reading Network Flows - Theory, Algorithms, and Applications and I'm stuck on the proof of the following theorem (Ch. 3, Page 67):
Theorem. Suppose that 𝑧^𝑘 is the objective function value of some solution of a minimization problem at the 𝑘𝑡ℎ iteration of an algorithm and 𝑧^∗ is the minimum objective function value. Furthermore, suppose that the algorithm guarantees that for every iteration 𝑘,
(1) (𝑧^𝑘 − 𝑧^(𝑘+1)) ≥ 𝛼(𝑧^𝑘 − 𝑧^∗)
(i.e., the improvement at iteration 𝑘 + 1 is at least 𝛼 times the total possible improvement)for some constant 𝛼 with 0 < 𝛼 < 1 (which is independent of the problem data). Then the algorithm terminates in 𝑂((𝑙𝑜𝑔𝐻)/𝛼) iterations, where 𝐻 is the difference between the maximum and minimum objective function values.
Proof. The quantity (𝑧^𝑘 − 𝑧^∗) represents the total possible improvement in the objective function value after the 𝑘𝑡ℎ iteration. Consider a consecutive sequence of 2/𝑎 iterations starting from iteration 𝑘. If each iteration of the algorithm improves the objective function value by at least 𝑎(𝑧^𝑘 − 𝑧^∗)/2 units, the algorithm would determine an optimal solution within these 2/𝑎 iterations. Suppose, instead, that at some iteration 𝑞 + 1, the algorithm improves the objective function value by less than 𝑎(𝑧^𝑘 − 𝑧^∗)/2 units. In other words,
(2) 𝑧^𝑞 − 𝑧^(𝑞+1) ≤ 𝑎(𝑧^𝑘 − 𝑧^∗)/2.
The inequality (1) implies that
(3) 𝛼(𝑧^𝑞 − 𝑧^∗) ≤ 𝑧^𝑞 − 𝑧^(𝑞+1)
The inequalities (2) and (3) imply that
(𝑧^𝑞 − 𝑧^∗) ≤ (𝑧^𝑘 − 𝑧^∗)/2,
so the algorithm has reduced the total possible improvement (𝑧^𝑘 − 𝑧^∗) by a factor at least 2. We have thus shown that within 2/𝑎 consecutive iterations, the algorithm either obtains an optimal solution or reduces the total possible improvement by a factor of at least 2. Since 𝐻 is the maximum possible improvement and every objective function value is an integer, the algorithm must terminate within 𝑂((𝑙𝑜𝑔𝐻)/𝛼) iterations.
Why does the author focus on 2/a iterations?
I think it's written this way for computer scientists to understand more easily that we're halving an integer quantity every 2/α rounds, hence the number of 2/α-round super-rounds will be O(log). No particular reason we can't do the math more directly:
T(n) is the gap between the nth solution and an optimal solution.
T(n) ≤ (1 - α) T(n-1) [assumption]
n
T(n) ≤ T(0) (1 - α) [by induction]
-α n x
< T(0) (e ) [by 1 + x < e for x ≠ 0]
-α (ln T(0))/α
T((ln T(0))/α) < T(0) e
= 1,
so (ln T(0))/α rounds suffice.

is the following way correct for determines the complexity of Fibonacci function? [duplicate]

I understand Big-O notation, but I don't know how to calculate it for many functions. In particular, I've been trying to figure out the computational complexity of the naive version of the Fibonacci sequence:
int Fibonacci(int n)
{
if (n <= 1)
return n;
else
return Fibonacci(n - 1) + Fibonacci(n - 2);
}
What is the computational complexity of the Fibonacci sequence and how is it calculated?
You model the time function to calculate Fib(n) as sum of time to calculate Fib(n-1) plus the time to calculate Fib(n-2) plus the time to add them together (O(1)). This is assuming that repeated evaluations of the same Fib(n) take the same time - i.e. no memoization is used.
T(n<=1) = O(1)
T(n) = T(n-1) + T(n-2) + O(1)
You solve this recurrence relation (using generating functions, for instance) and you'll end up with the answer.
Alternatively, you can draw the recursion tree, which will have depth n and intuitively figure out that this function is asymptotically O(2n). You can then prove your conjecture by induction.
Base: n = 1 is obvious
Assume T(n-1) = O(2n-1), therefore
T(n) = T(n-1) + T(n-2) + O(1) which is equal to
T(n) = O(2n-1) + O(2n-2) + O(1) = O(2n)
However, as noted in a comment, this is not the tight bound. An interesting fact about this function is that the T(n) is asymptotically the same as the value of Fib(n) since both are defined as
f(n) = f(n-1) + f(n-2).
The leaves of the recursion tree will always return 1. The value of Fib(n) is sum of all values returned by the leaves in the recursion tree which is equal to the count of leaves. Since each leaf will take O(1) to compute, T(n) is equal to Fib(n) x O(1). Consequently, the tight bound for this function is the Fibonacci sequence itself (~θ(1.6n)). You can find out this tight bound by using generating functions as I'd mentioned above.
Just ask yourself how many statements need to execute for F(n) to complete.
For F(1), the answer is 1 (the first part of the conditional).
For F(n), the answer is F(n-1) + F(n-2).
So what function satisfies these rules? Try an (a > 1):
an == a(n-1) + a(n-2)
Divide through by a(n-2):
a2 == a + 1
Solve for a and you get (1+sqrt(5))/2 = 1.6180339887, otherwise known as the golden ratio.
So it takes exponential time.
I agree with pgaur and rickerbh, recursive-fibonacci's complexity is O(2^n).
I came to the same conclusion by a rather simplistic but I believe still valid reasoning.
First, it's all about figuring out how many times recursive fibonacci function ( F() from now on ) gets called when calculating the Nth fibonacci number. If it gets called once per number in the sequence 0 to n, then we have O(n), if it gets called n times for each number, then we get O(n*n), or O(n^2), and so on.
So, when F() is called for a number n, the number of times F() is called for a given number between 0 and n-1 grows as we approach 0.
As a first impression, it seems to me that if we put it in a visual way, drawing a unit per time F() is called for a given number, wet get a sort of pyramid shape (that is, if we center units horizontally). Something like this:
n *
n-1 **
n-2 ****
...
2 ***********
1 ******************
0 ***************************
Now, the question is, how fast is the base of this pyramid enlarging as n grows?
Let's take a real case, for instance F(6)
F(6) * <-- only once
F(5) * <-- only once too
F(4) **
F(3) ****
F(2) ********
F(1) **************** <-- 16
F(0) ******************************** <-- 32
We see F(0) gets called 32 times, which is 2^5, which for this sample case is 2^(n-1).
Now, we want to know how many times F(x) gets called at all, and we can see the number of times F(0) is called is only a part of that.
If we mentally move all the *'s from F(6) to F(2) lines into F(1) line, we see that F(1) and F(0) lines are now equal in length. Which means, total times F() gets called when n=6 is 2x32=64=2^6.
Now, in terms of complexity:
O( F(6) ) = O(2^6)
O( F(n) ) = O(2^n)
There's a very nice discussion of this specific problem over at MIT. On page 5, they make the point that, if you assume that an addition takes one computational unit, the time required to compute Fib(N) is very closely related to the result of Fib(N).
As a result, you can skip directly to the very close approximation of the Fibonacci series:
Fib(N) = (1/sqrt(5)) * 1.618^(N+1) (approximately)
and say, therefore, that the worst case performance of the naive algorithm is
O((1/sqrt(5)) * 1.618^(N+1)) = O(1.618^(N+1))
PS: There is a discussion of the closed form expression of the Nth Fibonacci number over at Wikipedia if you'd like more information.
You can expand it and have a visulization
T(n) = T(n-1) + T(n-2) <
T(n-1) + T(n-1)
= 2*T(n-1)
= 2*2*T(n-2)
= 2*2*2*T(n-3)
....
= 2^i*T(n-i)
...
==> O(2^n)
Recursive algorithm's time complexity can be better estimated by drawing recursion tree, In this case the recurrence relation for drawing recursion tree would be T(n)=T(n-1)+T(n-2)+O(1)
note that each step takes O(1) meaning constant time,since it does only one comparison to check value of n in if block.Recursion tree would look like
n
(n-1) (n-2)
(n-2)(n-3) (n-3)(n-4) ...so on
Here lets say each level of above tree is denoted by i
hence,
i
0 n
1 (n-1) (n-2)
2 (n-2) (n-3) (n-3) (n-4)
3 (n-3)(n-4) (n-4)(n-5) (n-4)(n-5) (n-5)(n-6)
lets say at particular value of i, the tree ends, that case would be when n-i=1, hence i=n-1, meaning that the height of the tree is n-1.
Now lets see how much work is done for each of n layers in tree.Note that each step takes O(1) time as stated in recurrence relation.
2^0=1 n
2^1=2 (n-1) (n-2)
2^2=4 (n-2) (n-3) (n-3) (n-4)
2^3=8 (n-3)(n-4) (n-4)(n-5) (n-4)(n-5) (n-5)(n-6) ..so on
2^i for ith level
since i=n-1 is height of the tree work done at each level will be
i work
1 2^1
2 2^2
3 2^3..so on
Hence total work done will sum of work done at each level, hence it will be 2^0+2^1+2^2+2^3...+2^(n-1) since i=n-1.
By geometric series this sum is 2^n, Hence total time complexity here is O(2^n)
The proof answers are good, but I always have to do a few iterations by hand to really convince myself. So I drew out a small calling tree on my whiteboard, and started counting the nodes. I split my counts out into total nodes, leaf nodes, and interior nodes. Here's what I got:
IN | OUT | TOT | LEAF | INT
1 | 1 | 1 | 1 | 0
2 | 1 | 1 | 1 | 0
3 | 2 | 3 | 2 | 1
4 | 3 | 5 | 3 | 2
5 | 5 | 9 | 5 | 4
6 | 8 | 15 | 8 | 7
7 | 13 | 25 | 13 | 12
8 | 21 | 41 | 21 | 20
9 | 34 | 67 | 34 | 33
10 | 55 | 109 | 55 | 54
What immediately leaps out is that the number of leaf nodes is fib(n). What took a few more iterations to notice is that the number of interior nodes is fib(n) - 1. Therefore the total number of nodes is 2 * fib(n) - 1.
Since you drop the coefficients when classifying computational complexity, the final answer is θ(fib(n)).
It is bounded on the lower end by 2^(n/2) and on the upper end by 2^n (as noted in other comments). And an interesting fact of that recursive implementation is that it has a tight asymptotic bound of Fib(n) itself. These facts can be summarized:
T(n) = Ω(2^(n/2)) (lower bound)
T(n) = O(2^n) (upper bound)
T(n) = Θ(Fib(n)) (tight bound)
The tight bound can be reduced further using its closed form if you like.
It is simple to calculate by diagramming function calls. Simply add the function calls for each value of n and look at how the number grows.
The Big O is O(Z^n) where Z is the golden ratio or about 1.62.
Both the Leonardo numbers and the Fibonacci numbers approach this ratio as we increase n.
Unlike other Big O questions there is no variability in the input and both the algorithm and implementation of the algorithm are clearly defined.
There is no need for a bunch of complex math. Simply diagram out the function calls below and fit a function to the numbers.
Or if you are familiar with the golden ratio you will recognize it as such.
This answer is more correct than the accepted answer which claims that it will approach f(n) = 2^n. It never will. It will approach f(n) = golden_ratio^n.
2 (2 -> 1, 0)
4 (3 -> 2, 1) (2 -> 1, 0)
8 (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
14 (5 -> 4, 3) (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
(3 -> 2, 1) (2 -> 1, 0)
22 (6 -> 5, 4)
(5 -> 4, 3) (4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
(3 -> 2, 1) (2 -> 1, 0)
(4 -> 3, 2) (3 -> 2, 1) (2 -> 1, 0)
(2 -> 1, 0)
The naive recursion version of Fibonacci is exponential by design due to repetition in the computation:
At the root you are computing:
F(n) depends on F(n-1) and F(n-2)
F(n-1) depends on F(n-2) again and F(n-3)
F(n-2) depends on F(n-3) again and F(n-4)
then you are having at each level 2 recursive calls that are wasting a lot of data in the calculation, the time function will look like this:
T(n) = T(n-1) + T(n-2) + C, with C constant
T(n-1) = T(n-2) + T(n-3) > T(n-2) then
T(n) > 2*T(n-2)
...
T(n) > 2^(n/2) * T(1) = O(2^(n/2))
This is just a lower bound that for the purpose of your analysis should be enough but the real time function is a factor of a constant by the same Fibonacci formula and the closed form is known to be exponential of the golden ratio.
In addition, you can find optimized versions of Fibonacci using dynamic programming like this:
static int fib(int n)
{
/* memory */
int f[] = new int[n+1];
int i;
/* Init */
f[0] = 0;
f[1] = 1;
/* Fill */
for (i = 2; i <= n; i++)
{
f[i] = f[i-1] + f[i-2];
}
return f[n];
}
That is optimized and do only n steps but is also exponential.
Cost functions are defined from Input size to the number of steps to solve the problem. When you see the dynamic version of Fibonacci (n steps to compute the table) or the easiest algorithm to know if a number is prime (sqrt(n) to analyze the valid divisors of the number). you may think that these algorithms are O(n) or O(sqrt(n)) but this is simply not true for the following reason:
The input to your algorithm is a number: n, using the binary notation the input size for an integer n is log2(n) then doing a variable change of
m = log2(n) // your real input size
let find out the number of steps as a function of the input size
m = log2(n)
2^m = 2^log2(n) = n
then the cost of your algorithm as a function of the input size is:
T(m) = n steps = 2^m steps
and this is why the cost is an exponential.
Well, according to me to it is O(2^n) as in this function only recursion is taking the considerable time (divide and conquer). We see that, the above function will continue in a tree until the leaves are approaches when we reach to the level F(n-(n-1)) i.e. F(1). So, here when we jot down the time complexity encountered at each depth of tree, the summation series is:
1+2+4+.......(n-1)
= 1((2^n)-1)/(2-1)
=2^n -1
that is order of 2^n [ O(2^n) ].
No answer emphasizes probably the fastest and most memory efficient way to calculate the sequence. There is a closed form exact expression for the Fibonacci sequence. It can be found by using generating functions or by using linear algebra as I will now do.
Let f_1,f_2, ... be the Fibonacci sequence with f_1 = f_2 = 1. Now consider a sequence of two dimensional vectors
f_1 , f_2 , f_3 , ...
f_2 , f_3 , f_4 , ...
Observe that the next element v_{n+1} in the vector sequence is M.v_{n} where M is a 2x2 matrix given by
M = [0 1]
[1 1]
due to f_{n+1} = f_{n+1} and f_{n+2} = f_{n} + f_{n+1}
M is diagonalizable over complex numbers (in fact diagonalizable over the reals as well, but this is not usually the case). There are two distinct eigenvectors of M given by
1 1
x_1 x_2
where x_1 = (1+sqrt(5))/2 and x_2 = (1-sqrt(5))/2 are the distinct solutions to the polynomial equation x*x-x-1 = 0. The corresponding eigenvalues are x_1 and x_2. Think of M as a linear transformation and change your basis to see that it is equivalent to
D = [x_1 0]
[0 x_2]
In order to find f_n find v_n and look at the first coordinate. To find v_n apply M n-1 times to v_1. But applying M n-1 times is easy, just think of it as D. Then using linearity one can find
f_n = 1/sqrt(5)*(x_1^n-x_2^n)
Since the norm of x_2 is smaller than 1, the corresponding term vanishes as n tends to infinity; therefore, obtaining the greatest integer smaller than (x_1^n)/sqrt(5) is enough to find the answer exactly. By making use of the trick of repeatedly squaring, this can be done using only O(log_2(n)) multiplication (and addition) operations. Memory complexity is even more impressive because it can be implemented in a way that you always need to hold at most 1 number in memory whose value is smaller than the answer. However, since this number is not a natural number, memory complexity here changes depending on whether if you use fixed bits to represent each number (hence do calculations with error)(O(1) memory complexity this case) or use a better model like Turing machines, in which case some more analysis is needed.

Calculate big Theta bound for 2 recursive calls

T(m,n) = 2T(m/2,n)+n, assume T(m,n) is constant if either m<2 or n<2
So what I don't understand is, can this problem be solved using Master Theorem? If so how? If not, is this table correct?
level # of instances size cost of each level total cost
0 1 m, n n n
1 2 m/2, n n 2n
2 4 m/4, n n 4n
i 2^i m/(2^i), n n 2^i * n
k m 1, n n n*m
Thanks!
The Master Theorem might be a little bit of an overkill here and your solution method is not bad (log means logarithmus to base 2, c=T(1,n)):
T(m,n)=n+2T(m/2,n)=n+2n+4T(m/4,n)=n*(1+2+4+..+2^log(m))+2^log(m)*c
=n*(2^(log(m)+1)-1)+m*c=Theta(n*m)
If you use the Master Theorem by treating n as a constant, than you would easily get T(m,n)=Theta(m*C(n)) with a constant C depending on n, but the Master Theorem does not tell you much about this constant C. If you get too smart and inattentive you could easily get burned:
T(m,n)=n+2T(m/2,n)=n*(1+2/nT(m/2,n))=n*Theta(2^(log(m/n)))
=n*Theta(m/n)=Theta(m)
And now, because you left out C(n) in the third step, you got a wrong result!

Asymptotic runtime for an algorithm

I've decided to try and do a problem about analyzing the worst possible runtime of an algorithm and to gain some practice.
Since I'm a beginner I only need help in expressing my answer in a right way.
I came accros this problem in a book that uses the following algorithm:
Input: A set of n points (x1, y1), . . . , (xn, yn) with n ≥ 2.
Output: The squared distance of a closest pair of points.
ClosePoints
1. if n = 2 then return (x1 − x2)^2 + (y1 − y2)^2
2. else
3. d ← 0
4. for i ← 1 to n − 1 do
5. for j ← i + 1 to n do
6. t ← (xi − xj)^2 + (yi − yj)^2
7. if t < d then
8. d ← t
9. return d
My question is how can I offer a good proof that T(n) = O(n^2),T(n) = Ω(n^2) and T (n) = Θ(n^2)?,where T(n) represents the worst possible runtime.
I know that we say that f is O(g),
if and only if there is an n0 ∈ N and c > 0 in R such that for all
n ≥ n0 we have
f(n) ≤ cg(n).
And also we say that f is Ω(g) if there is an
n0 ∈ N and c > 0 in R such that for all n ≥ n0 we have
f(n) ≥ cg(n).
Now I know that the algoritm is doing c * n(n - 1) iterations, yielding T(n)=c*n^2 - c*n.
The first 3 lines are executed O(1) times line 4 loops for n - 1 iterations which is O(n) . Line 5 loops for n - i iterations which is also O(n) .Does each line of the inner loop's content
(lines 6-7) takes (n-1)(n-i) or just O(1)?and why?The only variation is how many times 8.(d ← t) is performed but it must be lower than or equal to O(n^2).
So,how should I write a good and complete proof that T(n) = O(n^2),T(n) = Ω(n^2) and T (n) = Θ(n^2)?
Thanks in advance
Count the number of times t changes its value. Since changing t is the innermost operation performed, finding how many times that happens will allow you to find the complexity of the entire algorithm.
i = 1 => j runs n - 1 times (t changes value n - 1 times)
i = 2 => j runs n - 2 times
...
i = n - 1 => j runs 1 time
So the number of times t changes is 1 + 2 + ... + n - 1. This sum is equal n(n - 1) / 2. This is dominated by 0.5 * n^2.
Now just find appropriate constants and you can prove that this is Ω(n^2), O(n^2), Θ(n^2).
T(n)=c*n^2 - c*n approaches c*n^2 for large n, which is the definition of O(n^2).
if you observe the two for loops, each for loop gives an O(n) because each loop is incrementing/decrementing in a linear fashion. hence, two loops combined roughly give a O(n^2) complexity. the whole point of big-oh is to find the dominating term- coeffecients do not matter. i would strongly recommend formatting your pseudocode in a proper manner in which it is not ambiguous. in any case, the if and else loops do no affect the complexity of the algorithm.
lets observe the various definitions:
Big-Oh
• f(n) is O(g(n)) if f(n) is
asymptotically “less than or equal” to
g(n)
Big-Omega
• f(n) is Ω(g(n)) if f(n) is
asymptotically “greater than or equal”
to g(n)
Big-Theta
• f(n) is Θ(g(n)) if f(n) is
asymptotically “equal” to g(n)
so all you need are to find constraints which satisfy the answer.

Resources