Karatsuba Algorithm in Big O n^(lg3) proof by substitution - algorithm

The Karatsuba algorithm involves the recurrence T(n) = 3T(n/2) + n.
By the recursion tree method, I can estimate that T(n) is O(n^(log2 3)).
However, I am having trouble verifying that estimate with the substitution method.
I'll simply write lg 3 to mean log2 3.
Substitution method:
Hypothesis -> T(n) <= cn^(lg 3), where c is a positive constant
Proof -> T(n) <= 3c(n/2)^(lg 3) + n
= cn^(lg 3) + n
But this step shows that I cannot prove my hypothesis, because of the extra n term.
So I modified that step:
T(n) <= cn^(lg 3) + n^(lg 3)
= (c+1)n^(lg 3)
And later realized I had made a mistake, because this does not prove the hypothesis:
T(n) <= cn^(lg 3) has to be proven, not T(n) <= (c+1)n^(lg 3).
But the answer is that T(n) is O(n^(lg 3)).

When using the substitution method, you sometimes have to strengthen the inductive hypothesis and guess a more complex form of the expression that upper-bounds the recurrence.
Try making a guess of the form T(n) ≤ c0 n^(lg 3) - c1 n. Now that you are subtracting some term of the form c1 n, you can probably make the recurrence work out by using some of that linear term to offset the n term added in later.
For example:
T(n) ≤ 3T(n / 2) + n
≤ 3(c0 (n/2)^(lg 3) - c1 (n/2)) + n
= c0 n^(lg 3) - 3c1 n / 2 + n     (since 3(n/2)^(lg 3) = n^(lg 3), because 2^(lg 3) = 3)
Now, choose c1 so that -3c1 n / 2 + n = -c1 n. This solves to
-3c1 n / 2 + n = -c1 n
-3c1 / 2 + 1 = -c1
-3c1 + 2 = -2c1
2 = c1
With c1 = 2 the right-hand side becomes c0 n^(lg 3) - 3n + n = c0 n^(lg 3) - 2n = c0 n^(lg 3) - c1 n, so the +n term is cancelled out and the induction goes through successfully.
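If you want a quick numerical sanity check of the strengthened guess, something like the following sketch works (it assumes a base case T(1) = 1 and exact halving on powers of two; c0 = 4 is just an example constant chosen with some slack):

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        # Karatsuba-style recurrence; assume T(1) = 1 and integer halving.
        if n <= 1:
            return 1
        return 3 * T(n // 2) + n

    LG3 = math.log2(3)
    c0 = 4  # assumed constant, chosen with slack so floating-point rounding doesn't matter

    for k in range(1, 16):
        n = 2 ** k
        bound = c0 * n ** LG3 - 2 * n
        print(n, T(n), round(bound, 1), T(n) <= bound)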
Hope this helps!

Related

Finding these three algorithms' run times

Hi I am having a tough time showing the run time of these three algorithms for T(n). Assumptions include T(0)=0.
1) This one I know is close to Fibonacci, so I think it's close to O(n) time, but I'm having trouble showing that:
T(n) = T(n-1) + T(n-2) + 1
2) This one I am stumped on, but I think it's roughly O(log log n):
T(n) = T([sqrt(n)]) + n, for n >= 1, where [sqrt(n)] denotes the floor of sqrt(n).
3) I believe this one is roughly O(n * log log n):
T(n) = 2T(n/2) + (n/(log n)) + n.
Thanks for the help in advance.
T(n) = T(n-1) + T(n-2) + 1
Assuming T(0) = 0 and T(1) = a, for some constant a, we notice that T(n) - T(n-1) = T(n-2) + 1. That is, the growth rate of the function is given by the function itself, which suggests this function has exponential growth.
Let T'(n) = T(n) + 1. Then T'(n) = T'(n-1) + T'(n-2), by the above recurrence relation, and we have eliminated the troublesome constant term. T(n) and T'(n) differ by an additive constant of 1, so assuming they are both non-decreasing (they are), they have the same asymptotic complexity, albeit possibly for different constants c and n0.
To show T'(n) has asymptotic growth of O(b^n), we would need some base cases, then the hypothesis that the condition holds for all n up to, say, k - 1, and then we'd need to show it for k; that is, cb^(k-2) + cb^(k-1) <= cb^k. We can divide through by cb^(k-2) to simplify this to 1 + b <= b^2. Rearranging, we get b^2 - b - 1 >= 0; the roots are (1 ± sqrt(5))/2, and we must discard the negative one since we cannot use a negative number as the base of our exponential. So for b >= (1+sqrt(5))/2, T'(n) may be O(b^n). A similar thought experiment will show that for b <= (1+sqrt(5))/2, T'(n) may be Omega(b^n). Thus, for b = (1+sqrt(5))/2 only, T'(n) may be Theta(b^n).
Completing the proof by induction that T(n) = O(b^n) is left as an exercise.
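If you want to see the exponential behaviour (and the base (1+sqrt(5))/2) numerically, a small sketch like this works, assuming T(0) = 0 and T(1) = 1; the ratio of consecutive values approaches the golden ratio:

    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        # Recurrence from the question; assume T(0) = 0 and T(1) = 1.
        if n < 2:
            return n
        return T(n - 1) + T(n - 2) + 1

    phi = (1 + 5 ** 0.5) / 2
    for n in (10, 20, 40, 80):
        print(n, T(n) / T(n - 1), phi)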
T(n) = T([sqrt(n)]) + n
Obviously, T(n) is at least linear, assuming the boundary conditions require T(n) to be nonnegative. We might guess that T(n) is Theta(n) and try to prove it.
Base case: let T(0) = a and T(1) = b. Then T(2) = b + 2 and T(4) = b + 6. In both cases, a choice of c >= 1.5 will work to make T(n) <= cn.
Inductive step: suppose that whatever our fixed value of c is works for all n up to and including k. We must show that T([sqrt(k+1)]) + (k+1) <= c(k+1).
We know that T([sqrt(k+1)]) <= c*sqrt(k+1) from the induction hypothesis, so T([sqrt(k+1)]) + (k+1) <= c*sqrt(k+1) + (k+1).
Now c*sqrt(k+1) + (k+1) <= c(k+1) can be rewritten as cx + x^2 <= cx^2 (with x = sqrt(k+1)); dividing through by x (OK since k > 1) we get c + x <= cx, and solving this for c we get c >= x/(x-1) = sqrt(k+1)/(sqrt(k+1)-1). This approaches 1 as k grows, so for large enough n, any constant c > 1 will work.
Making this proof totally rigorous by fixing the following points is left as an exercise:
making sure enough base cases are proven so that all assumptions hold
distinguishing the cases where (a) k + 1 is a perfect square (hence [sqrt(k+1)] = sqrt(k+1)) and (b) k + 1 is not a perfect square (hence sqrt(k+1) - 1 < [sqrt(k+1)] < sqrt(k+1)).
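For a concrete check of the Theta(n) claim, here is a small sketch (assuming T(1) = 1 and using integer floor square roots); the printed ratio T(n)/n heads toward 1:

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        # Assume T(1) = 1; [sqrt(n)] is the floor of the square root.
        if n <= 1:
            return 1
        return T(math.isqrt(n)) + n

    for n in (10, 10**3, 10**6, 10**9, 10**12):
        print(n, T(n), T(n) / n)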
T(n) = 2T(n/2) + (n/(log n)) + n
Here T(n) > 2T(n/2) + n, which we recognize as the recurrence for the running time of Mergesort; by the Master theorem that is Theta(n log n), so we know our complexity is no less than that.
Indeed, by the master theorem: T(n) = 2T(n/2) + (n/(log n)) + n = 2T(n/2) + n(1 + 1/(log n)), so
a = 2
b = 2
f(n) = n(1 + 1/(log n)) is Theta(n) (for n > 2 it always lies between n and 2n)
f(n) = Theta(n) = Theta(n^(log_2 2) * log^0 n)
We're in case 2 of the Master Theorem still, so the asymptotic bound is the same as for Mergesort, Theta(n log n).
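A numerical check is easy here too; this sketch assumes T(1) = 1, powers of two, and log base 2, and prints T(n) / (n log2 n). The ratio stays bounded and drifts slowly downward, consistent with the Theta(n log n) bound (the n/log n term only adds a lower-order n log log n contribution):

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        # Assume T(1) = 1, exact halving on powers of two, and log base 2.
        if n <= 1:
            return 1
        return 2 * T(n // 2) + n / math.log2(n) + n

    for k in (10, 15, 20, 25):
        n = 2 ** k
        print(n, T(n) / (n * math.log2(n)))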

Solving recurrences with iteration, substitution, Master Theorem?

I'm familiar with solving recurrences with iteration:
t(1) = c1
t(2) = t(1) + c2 = c1 + c2
t(3) = t(2) + c2 = c1 + 2c2
...
t(n) = c1 + (n-1)c2 = O(n)
But what if I had a recurrence with no base case? How would I solve it using the three methods mentioned in the title?
t(n) = 2t(n/2) + 1
For Master Theorem I know the first step, find a, b, and f(n):
a = 2
b = 2
f(n) = 1
But not where to go from here. I'm at a standstill because I'm not sure how to approach the question.
I know of 2 ways to solve this:
(1) T(n) = 2T(n/2) + 1
(2) T(n/2) = 2T(n/4) + 1
now replace T(n/2) from (2) into (1)
T(n) = 2[2T(n/4) + 1] + 1
= 2^2T(n/4) + 2 + 1
T(n/4) = 2T(n/8) + 1
T(n) = 2^2[2T(n/8) + 1] + 2 + 1
= 2^3T(n/8) + 4 + 2 + 1
You would just keep doing this until you can generalize. Eventually you will spot that:
T(n) = 2^k T(n/2^k) + (2^(k-1) + ... + 2 + 1)
You want T(1), so set n/2^k = 1 and solve for k. When you do this you will find that k = lg n.
Substituting lg n for k, you end up with
T(n) = 2^(lg n) T(n/2^(lg n)) + (1 - 2^(lg n)) / (1 - 2)
Since 2^(lg n) = n,
T(n) = nT(1) + n - 1
Taking T(1) to be a constant (say 1), T(n) = n + n - 1 = 2n - 1, where n is the dominant term, so T(n) is O(n).
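To sanity-check the unrolled answer, here is a tiny sketch (assuming T(1) = 1 and sticking to powers of two so the halving is exact) that compares the recurrence with the closed form 2n - 1 derived above:

    def T(n):
        # T(n) = 2T(n/2) + 1 with the assumed base case T(1) = 1.
        return 1 if n == 1 else 2 * T(n // 2) + 1

    for k in range(6):
        n = 2 ** k
        print(n, T(n), 2 * n - 1)   # both columns agree on powers of two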
For the Master Theorem it's really fast.
Consider T(n) = aT(n/b) + n^c for n > 1.
There are three cases (note that b is the log base):
(1) if log_b a < c, T(n) = Θ(n^c),
(2) if log_b a = c, T(n) = Θ(n^c log n),
(3) if log_b a > c, T(n) = Θ(n^(log_b a)).
In this case a = 2, b = 2, and c = 0 (n^0 = 1)
A quick check shows case 3.
n^(log_2 2), and note that log_2 2 is 1.
So by the Master Theorem this is Θ(n).
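If it helps, the three cases translate directly into a tiny dispatcher for recurrences of the exact form T(n) = aT(n/b) + n^c; this is only a sketch, and the function name and output strings are made up for illustration:

    import math

    def master_case(a, b, c):
        # Asymptotic bound for T(n) = a*T(n/b) + n^c, per the three cases above.
        log_ba = math.log(a, b)
        if math.isclose(log_ba, c):
            return f"Theta(n^{c} log n)"   # case (2)
        if log_ba < c:
            return f"Theta(n^{c})"         # case (1)
        return f"Theta(n^{log_ba:g})"      # case (3)

    print(master_case(2, 2, 0))  # T(n) = 2T(n/2) + 1  ->  Theta(n^1), i.e. Theta(n)
    print(master_case(4, 2, 1))  # T(n) = 4T(n/2) + n  ->  Theta(n^2)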
Apart from the Master Theorem, the Recursion Tree Method and the Iterative Method, there is also the so-called "Substitution Method".
Often you will find people talking about the substitution method when in fact they mean the iterative method (especially on YouTube).
I guess this stems from the fact that in the iterative method you are also substituting something, namely the (n+1)-th recursive call into the n-th one...
The standard reference work on algorithms (CLRS) defines it as follows:
Substitution Method
Guess the form of the solution.
Use mathematical induction to find the constants and show that the solution works.
As example let's take your recurrence equation: T(n) = 2T(ⁿ/₂)+1
We guess that the solution is T(n) ∈ O(n²), so we have to prove that
T(n) ≤ cn² for some constant c.
Also, let's assume that for n=1 you are doing some constant work c.
Given:
T(1) ≤ c
T(n) = 2T(ⁿ/₂)+1
To prove:
∃c > 0, ∃n₀ ∈ ℕ, ∀n ≥ n₀, such that T(n) ≤ cn² is true.
Base Case:
n=1: T(1) ≤ c
n=2: T(2) = T(1) + T(1) + 1 ≤ 2c + 1 ≤ 4c = c·2²   (using T(1) ≤ c; the last step holds for c ≥ 1/2)
Induction Step:
As inductive hypothesis we assume T(n) ≤ cn² for all positive numbers smaller than n
especially for (ⁿ/₂).
Therefore T(ⁿ/₂) ≤ c(ⁿ/₂)², and hence
T(n) ≤ 2c(ⁿ/₂)² + 1 ⟵ Here we're substituting c(ⁿ/₂)² for T(ⁿ/₂)
= (¹/₂)cn² + 1
≤ cn² (for c ≥ 2, and all n ∈ ℕ)
So we have shown that there is a constant c such that T(n) ≤ cn² holds for all n ∈ ℕ.
This means exactly T(n) ∈ O(n²). ∎
(A matching Ω bound for a tight guess would be proven in the same way. Note that O(n²) is a correct but loose bound here: as the Master Theorem discussion above shows, this recurrence is actually Θ(n).)
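For a quick empirical look at how loose the quadratic guess is, here is a minimal check (again assuming a unit of work at the base case, T(1) = 1) of T(n) ≤ 2n² on powers of two; the actual values 2n - 1 sit far below the bound:

    def T(n):
        # T(n) = 2T(n/2) + 1 with T(1) = 1 assumed.
        return 1 if n == 1 else 2 * T(n // 2) + 1

    c = 2  # the constant found in the induction step above
    for k in range(1, 11):
        n = 2 ** k
        print(n, T(n), c * n * n, T(n) <= c * n * n)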

Solving the following recurrence: T(n) = T(n/3) + T(n/2) + sqrt(n)

I am trying to solve the following recurrence:
T(n) = T(n/3) + T(n/2) + sqrt(n)
I currently have done the following but am not sure if I am on the right track:
T(n) <= 2T(n/2) + sqrt(n)
T(n) <= 4T(n/4) + 2*sqrt(n/2) + sqrt(n)
T(n) <= 8T(n/8) + 4*sqrt(n/4) + 2*sqrt(n/2) + sqrt(n)
so, n/(2^k) = 1, and the sqrt portion is a geometric series a(1 - r^k)/(1 - r) with a = sqrt(n) and r = sqrt(2).
k = log2(n), and the number of leaves is 2^k = 2^(log2(n)) = n, but:
I am not sure how to combine the result of 2^(log2(n)) = n with the sqrt(n) portion.
A good initial attempt would be to identify upper and lower bounds on the time complexity function. These are given by:
2T(n/3) + sqrt(n) <= T(n) <= 2T(n/2) + sqrt(n)
These two bounding recurrences are much easier to solve for than T(n) itself. Consider the slightly more general function:
S(n) = 2S(n/d) + sqrt(n), where d = 3 gives the lower bound and d = 2 the upper bound.
When do we stop recursing? We need a stopping condition. Since it is not given, we can assume it is n = 1 without loss of generality (you'll hopefully see how). Therefore the number of terms, m, is given by:
n / d^m = 1  =>  m = log_d(n)
Unrolling S(n) gives the geometric series sqrt(n) * (1 + 2/sqrt(d) + (2/sqrt(d))^2 + ... + (2/sqrt(d))^(m-1)), plus 2^m leaf terms, which is Theta(n^(log_d 2)). Therefore we can obtain the lower and upper bounds for T(n):
T(n) = Omega(n^(log_3 2)) = Omega(n^0.63)  and  T(n) = O(n^(log_2 2)) = O(n)
Can we do better than this? i.e. obtain the exact relationship between n and T(n)?
From my previous answer, we can derive a binomial summation formula for recurrences of the form T(n) = c1*T(n/a) + c2*T(n/b) + f(n):
T(n) ≈ sum over m from 0 to log_b(n/C) of [ sum over j from 0 to m of (m choose j) * c1^j * c2^(m-j) * f(n / (a^j * b^(m-j))) ]
Where C is such that n = C is the stopping condition for T(n). If not given, we can assume C = 1 without loss of generality.
In your example, f(n) = sqrt(n), c1 = c2 = 1, a = 3, b = 2. Therefore:
T(n) ≈ sqrt(n) * sum over m from 0 to log_2(n) of [ sum over j from 0 to m of (m choose j) * (1/sqrt(3))^j * (1/sqrt(2))^(m-j) ]
How do we evaluate the inner sum? Consider the standard formula for a binomial expansion, with positive exponent m:
(x + y)^m = sum over j from 0 to m of (m choose j) * x^j * y^(m-j)
Thus we replace x, y with 1/sqrt(3) and 1/sqrt(2) in the formula, and get:
T(n) ≈ sqrt(n) * sum over m from 0 to log_2(n) of (1/sqrt(2) + 1/sqrt(3))^m
= sqrt(n) * [(1/sqrt(2) + 1/sqrt(3))^(log_2(n) + 1) - 1] / [(1/sqrt(2) + 1/sqrt(3)) - 1]
= Theta( sqrt(n) * n^(log_2(1/sqrt(2) + 1/sqrt(3))) )
= Theta( n^(1/2 + log_2(1/sqrt(2) + 1/sqrt(3))) )
= Theta( n^0.861 )
Where we used the standard geometric series formula and logarithm rules in the last steps. Note that the exponent is consistent with the bounds we found before.
Some numerical tests to confirm the relationship:
N T(N)
--------------------
500000 118537.6226
550000 121572.4712
600000 135160.4025
650000 141671.5369
700000 149696.4756
750000 165645.2079
800000 168368.1888
850000 181528.6266
900000 185899.2682
950000 191220.0292
1000000 204493.2952
Plot of log T(N) against log N:
The gradient m of such a plot satisfies T(N) ∝ N^m, and we see that m = 0.863, which is quite close to the theoretical value of 0.861.
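To reproduce this kind of numerical test yourself, a short sketch along these lines works (it assumes the recursion bottoms out at n < 2 and contributes sqrt(n) there, which may differ from whatever stopping constants produced the table above):

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        # Assume the recursion stops for n < 2, contributing sqrt(n) there.
        if n < 2:
            return math.sqrt(n)
        return T(n / 3) + T(n / 2) + math.sqrt(n)

    # Slope of log T(N) against log N between two sample points.
    n1, n2 = 5 * 10**5, 10**6
    m = math.log(T(n2) / T(n1)) / math.log(n2 / n1)
    print(round(m, 3))

The slope you get this way depends on the stopping condition and on the range of N sampled, so it need not match the plotted gradient exactly.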

Solving the recurrence T(n) = T(floor[n/2]) + T(ceil[n/2]) + n - 1

I have the following recurrence:
T(n) = c for n = 1.
T(n) = T(floor[n/2]) + T(ceil[n/2]) + n - 1 for n > 1.
It looks like merge sort to me, so I guess that the solution to the recurrence is Θ(n log n). According to the master method I have:
a) Θ(1) for n = 1 (constant time).
b) If we drop the floor and ceil we have (step 1):
T(n) = 2T(n/2) + n - 1 => a = 2, b = 2.
log_b(a) = log_2(2) = 1, so n^(log_b a) = n^1 = n.
Having a closer look we know that we have case 2 of the master method:
if f(n) = Θ(n^(log_b a)), the solution to the recurrence is T(n) = Θ(n^(log_b a) * log n).
The solution is indeed T(n) = Θ(n log n), but our f(n) = n - 1 is off from n by the constant 1.
My first question is:
at step 1 we dropped the ceil and floor. Is this correct? The second question is how do I get rid of the constant 1? Do I just drop it? Or should I name it d and prove that n - 1 is effectively n (and if so, how do I prove it)? Lastly, is it better to prove it with the substitution method?
Edit: if we use the substitution method we get:
We guess that the solution is O(n). We need to show that T(n) <= cn.
Substituting into the recurrence we obtain
T(n) <= c*floor[n/2] + c*ceil[n/2] + n - 1 = cn + n - 1,
which does not give us T(n) <= cn. So is it not like merge sort? What am I missing?
It was a long time ago, but here goes.
At step 1 we dropped the ceil and floor. Is this correct?
I would rather say
T(floor(n/2)) + T(floor(n/2)) <= T(floor(n/2)) + T(ceil(n/2))
T(floor(n/2)) + T(ceil(n/2)) <= T(ceil(n/2)) + T(ceil(n/2))
and in case floor and ceil are not equal, they differ by 1 (and you can ignore any constant).
The second question is how do I get rid of the constant 1?
You ignore it. The reasoning behind it is: even if the constant were huge, say 10^100, it would still be small compared to n once n grows large enough. In real life you can't really ignore big constants, but that is where real life and theory differ. In any case, a constant of 1 makes the smallest possible difference.
Lastly, is it better to prove it with the substitution method?
You can prove it however you like; some methods are just simpler. Simpler is usually better, but other than that, 'better' has no meaning here. So my answer is no.
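If you want to convince yourself numerically that the floors, ceilings and the -1 do not change the Θ(n log n) behaviour, here is a small sketch (taking the base-case constant c to be 1) that evaluates the exact recurrence and divides by n log2 n:

    import math
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def T(n):
        # Exact recurrence from the question, with the base-case constant taken as 1.
        if n == 1:
            return 1
        return T(n // 2) + T((n + 1) // 2) + n - 1

    for n in (10, 10**3, 10**5, 10**7):
        print(n, T(n), T(n) / (n * math.log2(n)))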

Asymptotic notations

From what I have studied: I have been asked to determine the complexity of a function with respect to another function, i.e. given f(n) and g(n), determine whether f(n) is O(g(n)). In such cases, I substitute values, compare the two, and arrive at a complexity using O(), Theta and Omega notations.
However, in the substitution method for solving recurrences, every standard document has the following lines:
• [Assume that T(1) = Θ(1).]
• Guess O(n^3). (Prove O and Ω separately.)
• Assume that T(k) ≤ ck^3 for k < n.
• Prove T(n) ≤ cn^3 by induction.
How am I supposed to find O and Ω when nothing else (apart from f(n)) is given? I might be wrong (I definitely am), and any information on the above is welcome.
Some of the assumptions above are with reference to this problem: T(n) = 4T(n/2) + n, while the basic outline of the steps is for all such problems.
That particular recurrence is solvable via the Master Theorem, but you can get some feedback from the substitution method. Let's try your initial guess of cn^3.
T(n) = 4T(n/2) + n
<= 4c(n/2)^3 + n
= cn^3/2 + n
Assuming that we choose c so that n <= cn^3/2 for all relevant n,
T(n) <= cn^3/2 + n
<= cn^3/2 + cn^3/2
= cn^3,
so T is O(n^3). The interesting part of this derivation is where we used a cubic term to wipe out a linear one. Overkill like that is often a sign that we could guess lower. Let's try cn.
T(n) = 4T(n/2) + n
<= 4cn/2 + n
= 2cn + n
This won't work. The gap between the right-hand side and the bound we want is cn + n, which is big Theta of the bound we want. That usually means we need to guess higher. Let's try cn^2.
T(n) = 4T(n/2) + n
<= 4c(n/2)^2 + n
= cn^2 + n
At first that looks like a failure as well. Unlike our guess of n, though, the deficit is little o of the bound itself. We might be able to close it by considering a bound of the form cn^2 - h(n), where h is o(n^2). Why subtraction? If we used h as the candidate bound, we'd run a deficit; by subtracting h, we run a surplus. Common choices for h are lower-order polynomials or log n. Let's try cn^2 - n.
T(n) = 4T(n/2) + n
<= 4(c(n/2)^2 - n/2) + n
= cn^2 - 2n + n
= cn^2 - n
That happens to be the exact solution to the recurrence, which was rather lucky on my part. If we had guessed cn^2 - 2n instead, we would have had a little credit left over.
T(n) = 4T(n/2) + n
<= 4(c(n/2)^2 - 2n/2) + n
= cn^2 - 4n + n
= cn^2 - 3n,
which is slightly smaller than cn^2 - 2n.
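As a concrete check of that "exact solution" remark, a sketch like the one below (assuming the base case T(1) = 0, which is what makes the constants line up with c = 1) evaluates the recurrence on powers of two and compares it with n^2 - n:

    def T(n):
        # T(n) = 4T(n/2) + n, with the base case T(1) = 0 assumed.
        return 0 if n == 1 else 4 * T(n // 2) + n

    for k in range(1, 11):
        n = 2 ** k
        print(n, T(n), n * n - n)   # the last two columns agree exactly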

Resources