Often in CLRS, when proving recurrences via substitution, Θ(f(n)) is replaced with cf(n).
For example, on page 91, the recurrence
T(n) = 3T(⌊n/4⌋) + Θ(n^2)
is written as follows in the proof:
T(n) <= 3T(⌊n/4⌋) + cn^2
But can't Θ(n^2) stand for, let's say, cn^2 + n? Would that not make such a proof invalid? Further on in the proof, the statement
T(n) <= (3/16)dn^2 + cn^2
<= dn^2
is reached. But if cn^2 + n were used instead, it would become the following:
T(n) <= (3/16)dn^2 + cn^2 + n
Can it still be proven that T(n) <= dn^2 if this is so? Do such lower order terms not matter in proving recurrences via substitution?
Yes, it does not matter.
T(n) <= (3/16)dn^2 + cn^2 + n is still less than or equal to dn^2 if n is big enough, because both sides grow at the same rate (n^2), so the lower-order terms never matter as long as there is a constant number of them in the cost function. If there is not a constant number of them, that is a different story.
Edit: for sufficiently large n, you can find suitable d and c such that (3/16)dn^2 + cn^2 + n is less than or equal to dn^2; for example, d = 2 and c = 1 work for all n >= 2.
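To see this concretely, here is a quick numerical check (a sketch; it assumes the base case T(n) = 1 for n < 4, which the book does not specify, and uses d = 3 so the bound also covers the small base cases — the inductive step alone already works with d = 2 for n >= 2):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 3T(floor(n/4)) + c*n^2 + n with c = 1;
    # the base case T(n) = 1 for n < 4 is an assumption.
    if n < 4:
        return 1
    return 3 * T(n // 4) + n * n + n

# Check the guess T(n) <= d*n^2 over a large range:
d = 3
assert all(T(n) <= d * n * n for n in range(1, 100_000))
print("T(n) <= %d*n^2 holds for 1 <= n < 100000" % d)
```

The lower-order "+ n" term gets absorbed exactly as the answer describes: the slack (d - 3d/16 - c)n^2 eventually dominates any constant number of lower-order terms.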
Related
When using the substitution method to find the time complexity of recursive functions, why do we have to prove the exact form, and why can't we use the asymptotic notations as they are defined? An example from the book "Introduction to Algorithms" at page 86:
T(n) ≤ 2(c⌊n/2⌋) + n
≤ cn + n
= O(n) wrong!!
Why is this wrong?
From the definition of Big-O: O(g(n)) = {f(n): there exist positive constants c2 and n0 such that 0 ≤ f(n) ≤ c2g(n) for all n > n0}, the solution seems right.
Let's say f(n) = cn + n = n(c + 1), and g(n) = n. Then n(c + 1) ≤ c2n if c2 ≥ c + 1. => f(n) = O(n).
Is it wrong because f(n) actually is T(n) and not cn + n, and instead g(n) = cn + n??
I would really appreciate a helpful answer.
On page 2 of this paper the problem is described. The reason this solution is wrong is that the extra n in cn + n adds up at every recursive level. As the paper says, "over many recursive calls, those “plus ones” add up".
Edit (now that I have more time to answer properly):
What I tried to do in the question was to solve the problem by induction. This means showing that the bound still holds at the next level of recursion, which would mean it holds for the level after that, and so on. However, this only works if the expression I compute is at most my guess, in this case cn. Since cn + n is larger than cn, my proof fails: if I let the recursion go one more level starting from cn + n, I get c(cn + n) + n = c^2*n + cn + n. This keeps increasing at every recursive level and does not grow as O(n), so the proof fails. If, however, my calculation had come out to cn or lower (imagine it), then the next level would also be cn or lower, just as the following one and so on. This leads to O(n). In short, the calculated expression needs to be less than or equal to the guess.
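The compounding of the extra n can also be checked numerically (a sketch, assuming the base case T(1) = 1): for n = 2^k, the recurrence T(n) = 2T(n/2) + n solves to exactly n(k + 1), i.e. Θ(n log n), so T(n) is not O(n):

```python
def T(n):
    # T(n) = 2T(n/2) + n with T(1) = 1 (base case assumed here)
    return 1 if n == 1 else 2 * T(n // 2) + n

# For powers of two, T(2^k) = 2^k * (k + 1), i.e. Theta(n log n):
for k in range(1, 20):
    n = 1 << k
    assert T(n) == n * (k + 1)

# The ratio T(n)/n is unbounded, so T(n) is not O(n):
print([T(1 << k) // (1 << k) for k in range(1, 8)])  # -> [2, 3, 4, 5, 6, 7, 8]
```

Each recursion level contributes one more "+1" to the ratio T(n)/n, which is exactly the effect the paper describes.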
I am trying to solve the following recurrence:
T(n) = T(n/3) + T(n/2) + sqrt(n)
I have done the following so far, but am not sure if I am on the right track:
T(n) <= 2T(n/2) + sqrt(n)
T(n) <= 4T(n/4) + sqrt(n/2) + sqrt(n)
T(n) <= 8T(n/8) + sqrt(n/4) + sqrt(n/2) + sqrt(n)
so, n/(2^k) = 1, and the sqrt portion simplifies to the geometric sum (a(1 - r^k))/(1 - r)
k = log2(n), and the number of leaves is 2^k, so 2^(log2(n)) = n, but:
I am not sure how to combine the result of 2^(log2(n)) with the sqrt(n) portion.
A good initial attempt would be to identify upper and lower bounds on the time complexity function. Since T is increasing, these are given by:

T_lower(n) = 2T_lower(n/3) + sqrt(n)
T_upper(n) = 2T_upper(n/2) + sqrt(n)
These two functions are much easier to solve for than T(n) itself. Consider the slightly more general function:
When do we stop recursing? We need a stopping condition. Since it is not given, we can assume it is n = 1 without loss of generality (you'll hopefully see how). Therefore the number of terms, m, is given by:
Therefore we can obtain lower and upper bounds for T(n) (e.g. via the Master Theorem):

Ω(n^(log_3 2)) = Ω(n^0.631) <= T(n) <= O(n)
Can we do better than this? i.e. obtain the exact relationship between n and T(n)?
From my previous answer here, we can derive a binomial summation formula for T(n):
Where
C is such that n = C is the stopping condition for T(n). If not given, we can assume C = 1 without loss of generality.
In your example, f(n) = sqrt(n), c1 = c2 = 1, a = 3, b = 2. Therefore:
How do we evaluate the inner sum? Consider the standard formula for a binomial expansion, with positive exponent m:
Thus we replace x, y with the corresponding values in the formula, and get:
Where we arrived at the last two steps with the standard geometric series formula and logarithm rules. Note that the exponent is consistent with the bounds we found before.
Some numerical tests to confirm the relationship:
N T(N)
--------------------
500000 118537.6226
550000 121572.4712
600000 135160.4025
650000 141671.5369
700000 149696.4756
750000 165645.2079
800000 168368.1888
850000 181528.6266
900000 185899.2682
950000 191220.0292
1000000 204493.2952
Plot of log T(N) against log N:
The gradient m of such a plot satisfies T(N) ∝ N^m, and we see that m = 0.863, which is quite close to the theoretical value of 0.861.
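One way to reproduce a numerical test like this (a sketch; it assumes floor division at each step and the base case T(n) = sqrt(n) for n < 2, neither of which is specified above, so the exact values may differ from the table):

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = T(n/3) + T(n/2) + sqrt(n), with an assumed base case for n < 2
    if n < 2:
        return math.sqrt(n)
    return T(n // 3) + T(n // 2) + math.sqrt(n)

# Estimate the exponent m in T(N) ~ N^m from two sample points:
n1, n2 = 500_000, 1_000_000
m = math.log(T(n2) / T(n1)) / math.log(n2 / n1)
print(f"T({n1}) = {T(n1):.1f}, T({n2}) = {T(n2):.1f}, slope ~ {m:.3f}")
```

The memoization is what makes this feasible: the arguments reachable from n by repeated floor division by 2 and 3 form a small set, so the computation is fast even for n = 10^6.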
Solve: T(n)=T(n-1)+T(n/2)+n.
I tried solving this using recursion trees. There are two branches, T(n-1) and T(n/2) respectively. T(n-1) will go to a greater depth, so we get O(2^n). Is this idea correct?
This is a very strange recurrence for a CS class. This is because from one point of view: T(n) = T(n-1) + T(n/2) + n is bigger than T(n) = T(n-1) + n which is O(n^2).
But from another point of view, the functional equation has an exact solution: T(n) = -2(n + 1). You can easily see that this is an exact solution by substituting it back into the equation: -2(n + 1) = -2((n - 1) + 1) + (-2)(n/2 + 1) + n, i.e. -2n - 2 = -2n + (-n - 2) + n. I am not sure whether this is the only solution.
Here is how I got it: T(n) = T(n-1) + T(n/2) + n. Because you calculate things for very big n, n-1 is almost the same as n. So you can rewrite it as T(n) = T(n) + T(n/2) + n, which gives T(n/2) + n = 0, i.e. T(n) = -2n, so it is linear. This was counterintuitive to me (the minus sign here), but armed with this solution, I tried T(n) = -2n + a and found the value of a (a = -2).
I believe you are right. The recurrence relation will always split into two parts, namely T(n-1) and T(n/2). Looking at these two, it is clear that n-1 decreases more slowly than n/2, or in other words, you will have more branches from the n-1 portion of the tree. When considering big-O, it is useful to just consider the worst-case scenario, which in this case is that both sides of the tree decrease by n-1 (since this decreases more slowly and therefore gives more branches). In all, you would need to split the relation in two a total of n times, hence you are right to say O(2^n).
Your reasoning is correct, but you give away far too much. (For example, it is also correct to say that 2n^3 + 4 = O(2^n), but that's not as informative as 2n^3 + 4 = O(n^3).)
The first thing we want to do is get rid of the inhomogeneous term n. This suggests that we may look for a solution of the form T(n)=an+b. Substituting that in, we find:
an+b = a(n-1)+b + an/2+b + n
which reduces to
0 = (a/2+1)n + (b-a)
implying that a=-2 and b=a=-2. Therefore, T(n)=-2n-2 is a solution to the equation.
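The derived solution T(n) = -2n - 2 can be sanity-checked exactly with rational arithmetic (a minimal sketch):

```python
from fractions import Fraction

def T(n):
    # Candidate exact solution of the functional equation
    return -2 * n - 2

# Verify T(n) = T(n-1) + T(n/2) + n holds identically:
for i in range(1, 1000):
    n = Fraction(i)
    assert T(n) == T(n - 1) + T(n / 2) + n
print("T(n) = -2n - 2 satisfies the functional equation")
```

Using Fraction keeps n/2 exact, so the check is an identity rather than a floating-point approximation.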
We now want to find other solutions by subtracting off the solution we’ve already found. Let’s define U(n)=T(n)+2n+2. Then the equation becomes
U(n)-2n-2 = U(n-1)-2(n-1)-2 + U(n/2)-2(n/2)-2 + n
which reduces to
U(n) = U(n-1) + U(n/2).
U(n)=0 is an obvious solution to this equation, but how do the non-trivial solutions to this equation behave?
Let’s assume that U(n)∈Θ(n^k) for some k>0, so that U(n)=cn^k+o(n^k). This makes the equation
cn^k+o(n^k) = c(n-1)^k+o((n-1)^k) + c(n/2)^k+o((n/2)^k)
Now, (n-1)^k=n^k+Θ(n^{k-1}), so that the above becomes
cn^k+o(n^k) = cn^k+Θ(cn^{k-1})+o(n^k+Θ(n^{k-1})) + cn^k/2^k+o((n/2)^k)
Absorbing the lower order terms and subtracting the common cn^k, we arrive at
o(n^k) = cn^k/2^k
But this is false because the right hand side grows faster than the left. Therefore, U(n-1)+U(n/2) grows faster than U(n), which means that U(n) must grow faster than our assumed Θ(n^k). Since this is true for any k, U(n) must grow faster than any polynomial.
A good example of something that grows faster than any polynomial is an exponential function. Consequently, let’s assume that U(n)∈Θ(c^n) for some c>1, so that U(n)=ac^n+o(c^n). This makes the equation
ac^n+o(c^n) = ac^{n-1}+o(c^{n-1}) + ac^{n/2}+o(c^{n/2})
Rearranging and using some order of growth math, this becomes
c^n = o(c^n)
This is false (again) because the left hand side grows faster than the right. Therefore,
U(n) grows faster than U(n-1)+U(n/2), which means that U(n) must grow slower than our assumed Θ(c^n). Since this is true for any c>1, U(n) must grow more slowly than any exponential.
This puts us into the realm of quasi-polynomials, where ln U(n)∈O(log^c n), and subexponentials, where ln U(n)∈O(n^ε). Either of these means that we want to look at L(n):=ln U(n), where the previous paragraphs imply that L(n)∈ω(ln n)∩o(n). Taking the natural log of our equation, we have
ln U(n) = ln( U(n-1) + U(n/2) ) = ln U(n-1) + ln(1+ U(n/2)/U(n-1))
or
L(n) = L(n-1) + ln( 1 + e^{-L(n-1)+L(n/2)} ) = L(n-1) + e^{-(L(n-1)-L(n/2))} + Θ(e^{-2(L(n-1)-L(n/2))})
So everything comes down to: how fast does L(n-1)-L(n/2) grow? We know that L(n-1)-L(n/2)→∞, since otherwise L(n)∈Ω(n). And it’s likely that L(n)-L(n/2) will be just as useful, since L(n)-L(n-1)∈o(1) is much smaller than L(n-1)-L(n/2).
Unfortunately, this is as far as I’m able to take the problem. I don’t see a good way to control how fast L(n)-L(n/2) grows (and I’ve been staring at this for months). The only thing I can end with is to quote another answer: “a very strange recursion for a CS class”.
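The superpolynomial-but-subexponential behaviour of U(n) = U(n-1) + U(n/2) can at least be observed numerically (a sketch; U(n) = 1 for n <= 1 is an assumed, arbitrary positive base case, and floor division is assumed):

```python
import math

def U_table(N):
    # U(n) = U(n-1) + U(n/2) computed bottom-up to avoid deep recursion
    U = [1] * (N + 1)
    for i in range(2, N + 1):
        U[i] = U[i - 1] + U[i // 2]
    return U

U = U_table(2000)
L = lambda n: math.log(U[n])   # L(n) = ln U(n)

# Superpolynomial: L(n)/ln(n) keeps growing...
print(L(100) / math.log(100), L(1000) / math.log(1000))
# ...but subexponential: L(n)/n keeps shrinking.
print(L(100) / 100, L(1000) / 1000)
```

The first ratio increasing says no fixed polynomial exponent suffices; the second ratio shrinking says no exponential base c > 1 is ever reached — consistent with the two contradiction arguments above.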
I think we can look at it this way:
T(n)=2T(n/2)+n < T(n)=T(n−1)+T(n/2)+n < T(n)=2T(n−1)+n
If we solve the two bounding recurrences (the first via the Master Theorem, the second as a simple linear recurrence), then:
Θ(n log n) <= Θ(T(n)) <= Θ(2^n)
Remember that T(n) = T(n-1) + T(n/2) + n being (asymptotically) bigger than T(n) = T(n-1) + n only applies for functions which are asymptotically positive. In that case, we have T = Ω(n^2).
Note that T(n) = -2(n + 1) is a solution to the functional equation, but it doesn't interest us, since it is not an asymptotically positive solution, hence the O-notations have no meaningful application to it.
You can also easily check that T(n) = O(2^n). (Refer to yyFred's solution, if needed.)
If you try using the definition of O for functions of the type n^a(lg n)^b, with a (>= 2) and b positive constants, you will see via the Substitution Method that these are not possible solutions either.
In fact, the only function that allows a proof with the Substitution Method is exponential, but we know that this recursion doesn't grow as fast as T(n) = 2T(n-1) + n, so if T(n) = O(a^n), we can have a < 2.
Assume that T(m) <= c(a^m), for some constant c, real and positive. Our hypothesis is that this relation is valid for all m < n. Trying to prove this for n, we get:
T(n) <= (1/a+1/a^(n/2))c(a^n) + n
we can get rid of the n easily by adjusting the hypothesis with a lower-order term. What is important here is that:
1/a+1/a^(n/2) <= 1
a^(n/2+1)-a^(n/2)-a >= 0
Changing variables:
a^(N+1)-a^N-a >= 0
We want to find a bound as tight as possible, so we are searching for the lowest a possible. The inequality we found above accepts solutions of a which are pretty close to 1, but is a allowed to get arbitrarily close to 1? The answer is no: let a be of the form a = (1 + 1/N). Substituting this a into the inequality and taking the limit N -> INF gives:
e - e - 1 >= 0
which is absurd (the left-hand side is -1). Hence, the inequality above has some fixed number N* as its maximum solution, which can be found computationally. A quick Python program allowed me to find that a < 1 + 1e-45 (with a little extrapolation), so we can at least be sure that:
T(n) = o((1 + 1e-45)^n)
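A sketch of the kind of numerical search described (a hypothetical reconstruction, not the original program): for a fixed N, the condition a^(N+1) - a^N - a >= 0 is equivalent to N*ln(a) + ln(a - 1) >= ln(a), which is negative near a = 1 and positive for large a, so the smallest feasible a can be found by bisection:

```python
import math

def min_base(N, lo=1.0 + 1e-12, hi=2.0, iters=200):
    # Feasibility of a^(N+1) - a^N - a >= 0, i.e. a^N * (a - 1) >= a,
    # checked in log space to avoid float overflow for large N:
    feasible = lambda a: N * math.log(a) + math.log(a - 1) >= math.log(a)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi

# The feasible base a shrinks toward 1 as N grows, but only slowly:
print(min_base(10), min_base(100), min_base(1000))
```

This only illustrates the bisection step; the extrapolation to a < 1 + 1e-45 mentioned above is not reproduced here.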
T(n) = T(n-1) + T(n/2) + n behaves like T(n) = T(n) + T(n/2) + n when we consider extremely large values of n. T(n) = T(n) + T(n/2) + n can only be true if T(n/2) + n = 0, which means T(n) = -2n, i.e. linear, so in magnitude T(n) ~ O(n).
From what I have studied: I have been asked to determine the complexity of a function with respect to another function, i.e. given f(n) and g(n), determine O(f(n)). In such cases, I substitute values, compare the two, and arrive at a complexity - using O(), Theta and Omega notations.
However, in the substitution method for solving recurrences, every standard document has the following lines:
• [Assume that T(1) = Θ(1).]
• Guess O(n^3). (Prove O and Ω separately.)
• Assume that T(k) ≤ ck^3 for k < n.
• Prove T(n) ≤ cn^3 by induction.
How am I supposed to find O and Ω when nothing else (apart from f(n)) is given? I might be wrong (I definitely am), and any information on the above is welcome.
Some of the assumptions above are with reference to this problem: T(n) = 4T(n/2) + n
, while the basic outline of the steps is for all such problems.
That particular recurrence is solvable via the Master Theorem, but you can get some feedback from the substitution method. Let's try your initial guess of cn^3.
T(n) = 4T(n/2) + n
<= 4c(n/2)^3 + n
= cn^3/2 + n
Assuming that we choose c so that n <= cn^3/2 for all relevant n,
T(n) <= cn^3/2 + n
<= cn^3/2 + cn^3/2
= cn^3,
so T is O(n^3). The interesting part of this derivation is where we used a cubic term to wipe out a linear one. Overkill like that is often a sign that we could guess lower. Let's try cn.
T(n) = 4T(n/2) + n
<= 4cn/2 + n
= 2cn + n
This won't work. The gap between the right-hand side and the bound we want is cn + n, which is big Theta of the bound we want. That usually means we need to guess higher. Let's try cn^2.
T(n) = 4T(n/2) + n
<= 4c(n/2)^2 + n
= cn^2 + n
At first that looks like a failure as well. Unlike our guess of n, though, the deficit is little o of the bound itself. We might be able to close it by considering a bound of the form cn^2 - h(n), where h is o(n^2). Why subtraction? If we used h as the candidate bound, we'd run a deficit; by subtracting h, we run a surplus. Common choices for h are lower-order polynomials or log n. Let's try cn^2 - n.
T(n) = 4T(n/2) + n
<= 4(c(n/2)^2 - n/2) + n
= cn^2 - 2n + n
= cn^2 - n
That happens to be the exact solution to the recurrence, which was rather lucky on my part. If we had guessed cn^2 - 2n instead, we would have had a little credit left over.
T(n) = 4T(n/2) + n
<= 4(c(n/2)^2 - 2n/2) + n
= cn^2 - 4n + n
= cn^2 - 3n,
which is slightly smaller than cn^2 - 2n.
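The "exact solution" claim is easy to confirm numerically (a sketch; the base case T(1) = 1 is an assumption, which corresponds to c = 2 in cn^2 - n):

```python
def T(n):
    # T(n) = 4T(n/2) + n with assumed base case T(1) = 1
    return 1 if n == 1 else 4 * T(n // 2) + n

# With T(1) = 1, the bound cn^2 - n is exact for c = 2 on powers of two:
for k in range(0, 20):
    n = 1 << k
    assert T(n) == 2 * n * n - n
print("T(2^k) = 2*(2^k)^2 - 2^k for k = 0..19")
```

In other words, cn^2 - n satisfies the recurrence with equality at every level, which is why the inductive step closed with no slack to spare.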
I have this exercise:
"use a recursion tree to determine a good asymptotic upper bound on the recurrence T(n)=T(n/2)+n^2. Use a substitution method to verify your answer"
I have made this recursion tree
where I have assumed that k -> infinity (in my book they often stop the recurrence when the input to T reaches 1, but I don't think that is the case here, since I don't have other information).
I have concluded that:
When I have used the substitution method I have assumed that T(n)=O(n^2)
And then done following steps:
When n > 0 and c > 0, I see that the following is true for some choice of c:
And therefore T(n)<=cn^2
And my question is: "Is this the right way to do it? "
First off, as Philip said, your recursion tree is wrong. It didn't affect the complexity in the end, but you got the constants wrong.
T(n) becomes
n^2 + T(n/2) becomes
n^2 + n^2/4 + T(n/4) becomes
...
n^2(1 + 1/4 + 1/16 + ...)
Stopping at one vs. stopping at infinity is mostly a matter of taste and choosing what is more convenient. In this case, I'd do the same as you did and use the infinite sum, because then we can use the geometric series formula to get a good guess that T(n) <= (4/3)n^2.
The only thing that bothers me a bit is that your proof at the end tended towards the informal. It is very easy to get lost in informal proofs, so if I had to grade your assignment, I'd be more comfortable with a traditional proof by induction, like the following one:
statement to prove
We wish to prove that T(n) <= (4/3)*n^2, for n >= 1
Concrete values for c and n0 make the proof more believable, so put them in if you can. Often you will need to run the proof once to actually find the values and then come back and put them in, as if you had already known them in the first place :) In this case, I'm hoping my 4/3 guess from the recursion tree turns out correct.
Proof by induction:
Base case (n = 1):
T(1) = 1
(You did not make the value of T(1) explicit, but I guess this should be in the original exercise)
T(1) = 1
<= 4/3
= (4/3)*1^2
T(1) <= (4/3)*1^2
As we wanted.
Inductive case (n > 1):
(Here we assume the inductive hypothesis T(n') <= 4/3*(n')^2 for all n' < n)
We know that
T(n) = n^2 + T(n/2)
By the inductive hypothesis:
T(n) <= n^2 + (4/3)(n/2)^2
Doing some algebra:
T(n) <= n^2 + (4/3)(n/2)^2
= n^2 + (1/3)n^2
= (4/3)n^2
T(n) <= (4/3)*n^2
As we wanted.
May look boring, but now I can be sure that I got the answer right!
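The inductive proof can also be double-checked numerically (a sketch, using the same assumed base case T(1) = 1 and floor division):

```python
def T(n):
    # T(n) = n^2 + T(n/2) with T(1) = 1 (floor division assumed)
    return 1 if n == 1 else n * n + T(n // 2)

# The bound from the geometric series: T(n) <= (4/3) n^2.
# Compare 3*T(n) against 4*n^2 to keep the check in exact integers:
assert all(3 * T(n) <= 4 * n * n for n in range(1, 100_000))
print("T(n) <= (4/3) n^2 holds for 1 <= n < 100000")
```

Multiplying through by 3 avoids floating-point rounding, so the check is exact.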
Your recursion tree is erroneous. It should look like this:
n^2
|
(n/2)^2
|
(n/4)^2
|
...
|
(n/2^k)^2
You can safely stop once T reaches 1, because you are usually counting discrete "steps", and there's no such thing as "0.34 steps".
Since you're dividing n by 2 in each iteration, k equals log2(n) here.