This might be silly, but I'm stuck on the recurrence T(n) = 5T(n/2) + O(n log n). I know from the Master Theorem that the answer is supposed to be Θ(n^(lg 5)), but I can't really get there.
So far I've gotten to the point of
T(n) = sum from k = 0 to lg n of (5/2)^k · n log n
I just wanted to know if I'm going in the right direction with that.
You're definitely on the right track here! Let's see if we can simplify that summation.
First, notice that you can pull out the log n term from the summation, since it's independent of the sum. That gives us
(n log n) · (sum from k = 0 to lg n of (5/2)^k)
That sum is the sum of a geometric series, so it solves to
((5/2)^(lg n + 1) - 1) / (5/2 - 1)
= O((5/2)^(lg n))
Here, we can use the (lovely) identity that a^(log_b c) = c^(log_b a) to rewrite
O((5/2)^(lg n)) = O(n^(lg 5/2))
= O(n^(lg 5 - 1))
And plugging that back into our original formula gives us
n log n · O(n^(lg 5 - 1)) = O(n^(lg 5) log n).
Hmmm, that didn't quite work. We're really, really close to having something that works here, though! A good question to ask is why this didn't work, and for that, we have to go back to how you got that original summation in the first place.
Let's try expanding out a few terms of the recurrence T(n) using the recursion method. The first expansion gives us
T(n) = 5T(n / 2) + n log n.
The next one is where things get interesting:
T(n) = 5T(n / 2) + n log n
= 5(5T(n / 4) + (n / 2) log (n / 2)) + n log n
= 25T(n / 4) + (5/2)n log (n / 2) + n log n
Then we get
T(n) = 25T(n / 4) + (5/2)n log (n/2) + n log n
= 25(5T(n / 8) + (n / 4) log (n / 4)) + (5/2)n log (n/2) + n log n
= 125T(n / 8) + (25/4)n log (n / 4) + (5/2)n log (n/2) + n log n
The general pattern here seems to be the following sum:
T(n) = sum from k = 0 to lg n of (5/2)^k · n lg(n / 2^k)
= n · sum from k = 0 to lg n of (5/2)^k lg(n / 2^k)
And notice that this is not your original sum! In particular, the log term isn't log n, but rather a function that grows much more slowly than that: as k gets bigger, that logarithmic term gets much, much smaller. In fact, the only time we're really paying the full lg n cost here is when k = 0.
Here's a cute little trick we can use to make this sum easier to work with. The log function grows very, very slowly, so slowly, in fact, that we can say that log n = o(n^ε) for any ε > 0. So what happens if we try upper-bounding this summation by replacing lg(n / 2^k) with (n / 2^k)^ε for some very small but positive ε? Well, then we'd get
T(n) = n · sum from k = 0 to lg n of (5/2)^k lg(n / 2^k)
= O(n · sum from k = 0 to lg n of (5/2)^k (n / 2^k)^ε)
= O(n · sum from k = 0 to lg n of (5/2)^k n^ε (1 / 2^ε)^k)
= O(n^(1+ε) · sum from k = 0 to lg n of (5 / 2^(1+ε))^k)
This might have seemed like some sort of sorcery, but this technique - replacing logs with tiny, tiny polynomials - is a nice one to keep in your back pocket. It tends to come up in lots of contexts!
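Just to put numbers on how lopsided that comparison is (my own illustration, not part of the argument, with ε = 0.1 chosen arbitrarily): log2(n) actually stays ahead for quite a while, but the tiny polynomial eventually overtakes it and then pulls away.

import math

# how slowly log2(n) grows next to even a tiny polynomial n^0.1
eps = 0.1
for k in (10, 20, 40, 60, 80):
    n = 2 ** k
    print(f"n = 2^{k:<2}   log2(n) = {math.log2(n):6.1f}   n^0.1 = {n ** eps:10.1f}")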
The expression we have here might look a heck of a lot worse than the one we started with, but it's about to get a lot better. Let's imagine that we pick ε to be sufficiently small, say, so that 5 / 2^(1+ε) is greater than one. Then that inner summation is, once again, the sum of a geometric series, and we can simplify it to
((5/2^(1+ε))^(lg n + 1) - 1) / (5/2^(1+ε) - 1)
= O((5/2^(1+ε))^(lg n))
= O(n^(lg (5/2^(1+ε)))) (using our trick from before)
= O(n^(lg 5 - 1 - ε))
And that's great, because our overall runtime is then
T(n) = O(n^(1+ε) · n^(lg 5 - 1 - ε))
= O(n^(lg 5)),
and you've got your upper bound!
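If you want a quick sanity check on that bound, here's a small numeric experiment (my own sketch, not part of the derivation; it assumes the driving term is exactly n·log2(n) and a base case of T(1) = 1): evaluate the recurrence directly and watch T(n)/n^(lg 5) settle toward a constant rather than drift to 0 or blow up.

import math
from functools import lru_cache

LG5 = math.log2(5)

@lru_cache(maxsize=None)
def T(n):
    # assumed concrete form of the recurrence: driving term exactly n*log2(n), T(1) = 1
    if n <= 1:
        return 1.0
    return 5 * T(n // 2) + n * math.log2(n)

for k in range(4, 25, 4):
    n = 2 ** k
    print(f"n = 2^{k:<2}   T(n) / n^(lg 5) = {T(n) / n ** LG5:.4f}")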
To summarize:
Your original summation can be simplified using the formula for the sum of a geometric series, along with the weird identity that a^(log_b c) = c^(log_b a).
However, that won't give you a tight upper bound, because your original summation was slightly off from what you'd get from the recursion method.
By repeating the analysis using the recursion method, you get a tighter sum, but one that's harder to evaluate.
We can simplify that summation by using the fact that log n = o(n^ε) for any ε > 0, and use that to rejigger the sum to make it easier to manipulate.
With that simplification in place, we basically redo the analysis using the same techniques as before - sums of geometric series, swapping terms in exponents and logs - to arrive at the result.
Hope this helps!
Update: I'm still looking for a solution that doesn't rely on anything from outside sources.
Given: T(n) = T(n/10) + T(an) + n for some a, with T(n) = 1 if n < 10, I want to check whether the following is possible (for some values of a, and I want to find the smallest such a):
For every c > 0 there is n0 > 0 such that for every n > n0, T(n) >= c * n
I tried unfolding the recurrence step by step, but it got really complicated and I got stuck, since I wasn't making any progress.
Here is what I did: (Sorry for adding an image; I wrote a lot of it in Word and can't paste it as text.)
Any help please?
Invoking Akra–Bazzi: g(n) = n^1, so T(n) becomes ω(n) exactly when the critical exponent p reaches 1, i.e. when (1/10)^1 + a^1 = 1, hence the smallest such a is 9/10.
Intuitively, this is like the mergesort recurrence: with linear overhead at each call, if we don't reduce the total size of the subproblems, we'll end up with an extra log in the running time (which will overtake any constant c).
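If you'd like to see that threshold numerically, here's a rough check of my own (not part of the answer above; it floors both subproblem sizes to integers, uses the stated base case T(n) = 1 for n < 10, and make_T is just a helper name I made up): for a = 0.8 the ratio T(n)/n levels off at a constant, while at a = 0.9 it keeps climbing, consistent with Θ(n log n).

from functools import lru_cache

def make_T(a):
    @lru_cache(maxsize=None)
    def T(n):
        # T(n) = T(n/10) + T(a*n) + n, with both subproblem sizes floored to integers
        if n < 10:
            return 1.0
        return T(n // 10) + T(int(a * n)) + n
    return T

for a in (0.8, 0.9):
    T = make_T(a)
    for k in (4, 5, 6, 7):
        n = 10 ** k
        print(f"a = {a}, n = 10^{k}: T(n)/n = {T(n) / n:.2f}")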
Requirement:
For every c > 0 there is n0 > 0 such that for every n > n0, T(n) >= c*n
By substituting the recurrence into the inequality (assuming the recursive calls already meet the bound, i.e. T(n/10) >= c*(n/10) and T(a*n) >= c*a*n) and solving for a, you will get:
T(n) >= c*n
(c*n/10) + (c*a*n) + n >= c*n
a >= 0.9 - (1/c)
Since we need this to hold for every c (think of c as tending to infinity), we get a >= 0.9. Therefore the smallest value of a is 0.9, which satisfies T(n) >= c*n for all c.
Another perspective: draw a recursion tree and count the work done per level. The work done in each layer of the tree will be (1/10 + a) times as much as the work of the level above it. (Do you see why?)
If (1/10 + a) < 1, the work per level decays geometrically, so the total work sums to some constant multiple of n (whose leading coefficient depends on a). If (1/10 + a) ≥ 1, then the work per level stays the same or grows from one level to the next, so the total work now depends (at least) on the number of layers in the tree, which is Θ(log n), because the subproblem sizes drop by a constant factor from one layer to the next and that can't happen more than Θ(log n) times. So once (1/10 + a) = 1, your runtime suddenly becomes Ω(n log n) = ω(n).
(This is basically the reasoning behind the Master Theorem, just applied to non-uniform subproblem sizes, which is where the Akra–Bazzi theorem comes from.)
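Here's that per-level picture made concrete with a quick sketch of my own (it ignores rounding and slightly over-counts the deepest levels, where some subproblems have already hit the base case, but the trend is what matters): below a = 9/10 the level sums decay geometrically and the total stays within a constant factor of n, while at a = 9/10 every level costs about n and the total is proportional to the number of levels.

n = 10 ** 6
for a in (0.8, 0.9):
    ratio = 1 / 10 + a              # each level does this fraction of the previous level's work
    level, size, total = 0, float(n), 0.0
    while size >= 10:               # depth is governed by the slower-shrinking branch (factor a)
        total += (ratio ** level) * n
        size *= a
        level += 1
    print(f"a = {a}: {level} levels, total work / n = {total / n:.2f}")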
I'm trying to do a time-complexity analysis of bottom-up heap construction and I'm stuck. I've done the mathematical evaluation that shows it's O(n), and I completely understand why. The part I'm stuck on is how the "code" actually achieves this. I know the outer for executes floor(n/2) times, and I believe the while executes log n times, but I don't know how to get from floor(n/2)·log n to O(n).
Pseudo code:                              Time analysis:
for i = n/2-1; i >= 0; i--                n/2 + 1
    k = i                                 n/2
    while (2*k-1 <= n)                    n/2 · (????) + 1   <-- this is where I'm stuck. Should run log n times?
        j = k*2-1                         ...
        if (j < n && H[j] < H[j+1])       ...
            j++                           ...
        if (H[k] < H[j])                  ...
            break                         ...
        swap(H[k], H[j])                  ...
        k = j                             ...
So I can see that the while probably runs log n times, but I can't see how to get from there, i.e. from (n/2)·log n, to O(n). I'm only looking at the worst case, since I know the best case is n/2 + 1 (it breaks as soon as the subtree is already a heap). Any help or pointers to reading material are welcome.
The best advice I have to offer about working out the big-O cost of different loops is this one:
"When in doubt, work inside out!"
In other words, rather than starting with the outermost loop and working inward, start with the innermost loop and work outward.
In this case, we have this code:
for i = n/2-1; i >= 0; i--
    k = i
    while (2*k-1 <= n)
        j = k*2-1
        if (j < n && H[j] < H[j+1])
            j++
        if (H[k] < H[j])
            break
        swap(H[k], H[j])
        k = j
Since we're working inside out, let's start by analyzing the innermost loop:
while (2*k-1 <= n)
    j = k*2-1
    if (j < n && H[j] < H[j+1])
        j++
    if (H[k] < H[j])
        break
    swap(H[k], H[j])
    k = j
I'm going to assume this is a worst-case analysis and that we never trigger the inner break statement. In that case, the loop progresses by having k move to either 2k - 1 or 2k after each step, so k roughly doubles with each iteration. The loop ends when k exceeds n, so the number of iterations is the number of times we have to double k before it exceeds n, which works out to O(log(n / k)) iterations, where k is the starting value. Note that this isn't a constant; the smaller that starting value, the more iterations the inner loop performs.
We can replace the inner loop with the simpler "do O(log(n / k)) work" to get this:
for i = n/2-1; i >= 0; i--
    k = i
    do O(log(n / k)) work
And, since k = i, we can rewrite this as
for i = n/2-1; i >= 0; i--
    do O(log(n / i)) work
Now, how much total work is being done here? Adding up the work done per iteration across all iterations, we get that the work done is
log (n / (n/2)) + log (n / (n/2 - 1)) + log (n / (n/2 - 2)) + ... + log(n / 2) + log(n / 1).
Now, "all" we have to do is simplify this sum. :-)
Using properties of logarithms, we can rewrite this as
(log n - log (n/2)) + (log n - log(n/2 - 1)) + (log n - log(n/2 - 2)) + ... + (log n - log 1)
= (log n + log n + ... + log n) - (log(n/2) + log(n/2 - 1) + ... + log 1)
= (n/2)(log n) - log((n/2)(n/2 - 1)(n/2 - 2) ... 1)
= (n/2)(log n) - log((n/2)!)
Now, we can use Stirling's approximation to rewrite
log((n/2)!) = (n/2)log(n/2) - (n/2)log e + O(log n)
And, therefore, to get this:
(n/2)(log n) - log((n/2)!)
= (n/2)(log n) - (n/2)log(n/2) + (n/2)log e - O(log n)
= (n/2)(log(2 · n/2)) - (n/2) log (n/2) + O(n)
= (n/2)(log 2 + log(n/2)) - (n/2) log (n/2) + O(n)
= (n/2)(1 + log(n/2)) - (n/2) log (n/2) + O(n)
= n/2 + O(n)
= O(n).
So this whole sum works out to O(n).
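If you'd like to confirm that empirically, here's a sketch that simply counts worst-case iterations of the inner loop over the whole build (my own counting code rather than the pseudocode above: I've used the standard 0-based child index 2k+1 and assumed we always sink all the way down, never hitting the break):

def build_heap_iterations(n):
    # count worst-case inner-loop iterations of a bottom-up heap build on n elements
    total = 0
    for i in range(n // 2 - 1, -1, -1):   # every internal node, bottom to top
        k = i
        while 2 * k + 1 < n:              # while k still has a child
            total += 1
            k = 2 * k + 1                 # worst case: keep sinking
    return total

for n in (10**3, 10**4, 10**5, 10**6):
    print(f"n = {n:>8}   iterations / n = {build_heap_iterations(n) / n:.3f}")

The ratio stays pinned at a small constant instead of growing like log n, which is the O(n) bound showing up in practice.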
As you can see, this is a decidedly nontrivial big-O to calculate! Indeed, it's a lot trickier than just counting up the work done per iteration and multiplying by the number of iterations, because the way the work per iteration changes across iterations makes that much harder to do. Instead, we have to do a more nuanced analysis of how much work each loop does, convert things into a summation, and pull out some nontrivial (though not completely unexpected) tricks (Stirling's approximation and properties of logarithms) to get everything to work out as expected.
I would categorize this particular set of loops as a fairly tricky one to work through and not particularly representative of what you'd "normally" see when doing a loop analysis. But hopefully the techniques here give you a sense of how to work through trickier loop analyses and a glimpse of some of the beautiful math that goes into them.
Hope this helps!
Got an 'Essential Algorithms' exam so doing a bit of revision.
Came across this question and unsure whether my answer is right.
This imgur link has the question and my working.
http://imgur.com/SfKUrQO
Could someone verify whether I'm right, or point out where I've gone wrong?
I can't really follow your handwriting to point out where you went wrong, but here's how I would do it:
T(n) = 2T(n^(1/2)) + c
= 2(2T(n^(1/4)) + c) + c
= ...
= 2^k T(n^(1/2^k)) + (2^k - 1)c
So we need to find the smallest k such that:
n^(1/2^k) = 1 (considering the integer part)
We can apply a logarithm to this expression:
(1/(2^k)) log n < 1 (i.e. its integer part is 0)
=> 2^k >= log n | apply a logarithm again
=> k log 2 >= log log n
=> k = O(log log n), because log 2 is a constant
So we have:
2^(O(log log n)) T(1) + (2^(O(log log n)) - 1)c
= O(2^(log log n))
= O(log n)
I see you got O(sqrt(n)), which isn't wrong either, because log n < sqrt n, so if log n is an upper bound, so is sqrt n. It's just not a tight bound.
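For a quick numeric sanity check (my own sketch, with an assumed base case of T(n) = 1 for n <= 2 and c = 1), evaluating the recurrence directly shows T(n) tracking log2(n) rather than sqrt(n):

import math

def T(n, c=1.0):
    # T(n) = 2*T(sqrt(n)) + c, with an assumed base case of 1 for n <= 2
    if n <= 2:
        return 1.0
    return 2 * T(math.isqrt(n), c) + c

for k in (8, 16, 32, 64):
    n = 2 ** k
    print(f"n = 2^{k:<2}   T(n) = {T(n):5.1f}   log2(n) = {k}   sqrt(n) = {n ** 0.5:.3g}")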
I'm learning Big-O notation right now and stumbled across this small algorithm in another thread:
i = n
while (i >= 1)
{
    for j = 1 to i   // NOTE: i instead of n here!
    {
        x = x + 1
    }
    i = i/2
}
According to the author of the post, the complexity is Θ(n), but I can't figure out how. I think the while loop's complexity is Θ(log(n)). From what I was thinking, the for loop's complexity would also be Θ(log(n)), because the number of iterations is halved each time.
So, wouldn't the complexity of the whole thing be Θ(log(n) * log(n)), or am I doing something wrong?
Edit: the segment is in the best answer of this question: https://stackoverflow.com/questions/9556782/find-theta-notation-of-the-following-while-loop#=
Imagine for simplicity that n = 2^k. How many times does x get incremented? It's easy to see that the total is a geometric series:
2^k + 2^(k - 1) + 2^(k - 2) + ... + 1 = 2^(k + 1) - 1 = 2 * n - 1
So this part is Θ(n). Also, i gets halved k = log n times, and that has no asymptotic effect on the Θ(n) bound.
The value of i on each iteration of the while loop, which is also the number of iterations the for loop makes, goes n, n/2, n/4, ..., and the overall complexity is the sum of those. That puts it at roughly 2n, which gets you your Θ(n).
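You can also just run it and count (a direct simulation of the snippet above; count_increments is my own wrapper name):

def count_increments(n):
    # simulate the loop exactly and count how many times x = x + 1 executes
    x = 0
    i = n
    while i >= 1:
        for _ in range(i):   # the inner for-loop runs i times
            x += 1
        i //= 2
    return x

for n in (10**3, 10**4, 10**5, 10**6):
    print(f"n = {n:>8}   increments = {count_increments(n):>9}   2n = {2 * n:>9}")

The count always lands just under 2n, matching the geometric-series argument.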
I have the following recurrence:
T(n) = c for n = 1.
T(n) = T(floor[n/2]) + T(ceil[n/2]) + n - 1 for n > 1.
It looks like merge sort to me, so I guess that the solution to the recurrence is Θ(n log n). According to the master method I have:
a) Θ(1) for n = 1 (constant time).
b) If we drop the floor and ceil we have: (step 1)
T(n) = 2T(n/2) + n - 1 => a = 2, b = 2.
log_b(a) = log_2(2) = 1, so n^(log_b a) = n^1 = n
Having a closer look, we see that we're in case 2 of the master method:
if f(n) = Θ(n^(log_b a)), then the solution to the recurrence is T(n) = Θ(n^(log_b a) · log n)
The solution is indeed T(n) = Θ(n log n), but f(n) = n - 1, so we're off by the constant 1.
My first question is:
At step 1 we dropped the ceil and floor. Is this correct? The second question is: how do I get rid of the constant 1? Do I just drop it, or should I name it d and prove that n - 1 is really Θ(n) (and if so, how do I prove it)? Lastly, is it better to prove this with the substitution method?
Edit: if we use the substitution method we get:
We guess that the solution is O(n). We need to show that T(n) <= cn.
Substituting into the recurrence we obtain
T(n) <= c·floor(n/2) + c·ceil(n/2) + n - 1 = cn + n - 1
So it is not merge sort? What am I missing?
It was a long time ago, but here goes.
Step 1: we dropped the ceil and floor. Is this correct?
I would rather say
T(floor(n/2)) + T(floor(n/2)) <= T(floor(n/2)) + T(ceil(n/2))
T(floor(n/2)) + T(ceil(n/2)) <= T(ceil(n/2)) + T(ceil(n/2))
In case they're not equal, the two arguments differ by just 1 (and you can ignore any constant).
The second question is how do I get rid of the constant 1?
You ignore it. The reasoning behind this: even if the constant were huge, say 10^100, it would still be small compared to the rest of the expression once n grows large enough. In real life you can't always ignore really big constants, but that's where real life and theory differ. In any case, a constant of 1 makes the smallest possible difference.
Lastly, is it better to prove it with the substitution method?
You can prove it however you like; some ways are just simpler. Simpler is usually better, but beyond that, 'better' has no real meaning here. So my answer is no.
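If it helps, here's one last numeric check of the original recurrence (my own sketch, with the constant c taken to be 1): evaluating T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1 exactly and comparing against n·log2(n) shows that the floors, ceilings, and the "-1" all wash out.

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1, with T(1) = 1 (i.e. c = 1)
    if n == 1:
        return 1
    return T(n // 2) + T((n + 1) // 2) + n - 1

for n in (10**2, 10**3, 10**4, 10**5, 10**6):
    print(f"n = {n:>8}   T(n) / (n * log2(n)) = {T(n) / (n * math.log2(n)):.3f}")

The ratio hovers near 1 for every n, which is exactly the Θ(n log n) behavior you guessed.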