Big O Question - Algorithmic Analysis

I am revising for an exam and I have found this problem on the internet and was wondering how I would go about solving it.
(With base 2 logs)
Prove that log(2n) is a member of O(log n).
I have given it a go but am not sure if I am right as no answer has been provided. Could you please help?
Here is my attempt:
log 2n - c log n ≤ 0
log 2 + log n - c log n ≤ 0
1 + (1-c) log n ≤ 0
(I then divided by the log n.)
Example: n = 8 and c = 10 evaluates to less than zero. Therefore it is true.
My questions are:
Am I doing this right?
Can my answer be simplified further?

lg(2n) = lg(2) + lg(n).
lg(2) is a constant. See Wikipedia, Logarithmic identities.

The long answer is that
lim n->infinity log(2n)/log(n) = lim n->infinity (log(2) + log(n))/log(n) = lim n->infinity (log(2)/log(n) + 1) = 0 + 1 = 1
Because the ratio of the two functions in the limit exists (i.e. is bounded), they have the same asymptotic complexity.
In the same way, to prove that n^2 is not O(n), you would do
lim n->infinity (n^2 / n) = lim n->infinity n, which tends to infinity
Doing this for O(n) vs. O(log n) requires more work because
lim n->infinity (n / log n)
needs to be handled somehow. The trick is that you can use the derivatives instead (this is l'Hôpital's rule): if the ratio of the derivatives has a limit, then the ratio of the original functions has the same limit. You take the derivative of n, which is 1, and that of log n, which is 1/n, after which
lim n->infinity (1 / (1/n)) = lim n->infinity n, which tends to infinity
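To make the limit test concrete, here is a small illustrative Python sketch (not part of the proof; base-2 logs assumed) that prints the two ratios discussed above. The first settles near 1, the second keeps growing:

import math

# Illustrative check of the two limits discussed above (base-2 logs).
for n in [2**8, 2**16, 2**32, 2**64]:
    same_order = math.log2(2 * n) / math.log2(n)   # log(2n)/log(n) -> 1
    dominates  = n / math.log2(n)                  # n/log(n) grows without bound
    print(n, round(same_order, 4), round(dominates, 2))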

Related

Why is O(n/2 + 5 log n) O(log n) and not O(n log n)?

For n/2 + 5 log n, I would have thought the lower-order terms of 5 and 2 would be dropped, thus leaving n log n.
Where am I going wrong?
Edit:
Thank you, I believe I can now correct my mistake:
O(n/2 + 5 log n) = O(n/2 + log n) = O(n + log n) = O(n)
n/2 + 5 log n <= 2n, for all n >= 12 (c = 2, n0 = 12)
Let us define the function f as follows for n >= 1:
f(n) = n/2 + 5*log(n)
This function is not O(log n); it grows more quickly than that. To show that, we can show that for any constant c > 0, there is a choice of n0 such that for n > n0, f(n) > c * log(n). For 0 < c <= 5, this is trivial, since f(n) > 5*log(n) >= c*log(n) by definition. For c > 5, we get
n/2 + 5*log(n) > c*log(n)
<=> n/2 > (c - 5)*log(n)
<=> (1/(2(c - 5)))*(n/log(n)) > 1
We can now note that the expression on the LHS is monotonically increasing for n > 1 and find the limit as n grows without bound using l'Hopital:
lim(n->infinity) (1/(2(c - 5)))*(n/log(n))
= (1/(2(c - 5))) * lim(n->infinity) n/log(n)
= (1/(2(c - 5))) * lim(n->infinity) 1/(1/n)
= (1/(2(c - 5))) * lim(n->infinity) n
-> infinity
Using l'Hopital we find there is no limit as n grows without bound; the value of the LHS grows without bound as well. Because the LHS is monotonically increasing and grows without bound, there must be an n0 after which the value of the LHS exceeds the value 1, as required.
This all proves that f is not O(log n).
It is true that f is O(n log n). This is not hard to show at all: choose c = (5+1/2), and it is obvious that
f(n) = n/2 + 5*log(n) <= n*log(n)/2 + 5*n*log(n) = (5+1/2)*n*log(n) for all n >= 2 (where log(n) >= 1).
However, this is not the best bound we can get for your function. Your function is actually O(n) as well. Choosing the same value for c as before, we need only notice that n > log(n) for all n >= 1, so
f(n) = n/2 + 5log(n) <= n/2 + 5n = (5+1/2)n
So, f is also O(n). We can show that f(n) is Omega(n) which proves it is also Theta(n). That is left as an exercise but is not difficult to do either. Hint: what if you choose c = 1/2?
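As a quick numerical sanity check (an illustrative Python sketch with base-2 logs, not part of the proof), the sandwich n/2 <= f(n) <= (5 + 1/2)*n that underlies the Theta(n) claim can be verified over a range of n:

import math

def f(n):
    return n / 2 + 5 * math.log2(n)

# Check the sandwich n/2 <= f(n) <= 5.5*n that makes f(n) Theta(n).
for n in [2, 10, 100, 10_000, 10**9]:
    assert n / 2 <= f(n) <= 5.5 * n
    print(n, round(f(n) / n, 4))   # the ratio f(n)/n drifts toward 1/2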
It's neither O(log n) nor O(n*log n). It'll be O(n), because for larger values of n, log(n) is much smaller than n, hence it gets dropped.
Consider n = 10000: 5*log(n) is about 66 (with base-2 logs), which is far less than n/2 (= 5000).

Which function dominates, f(n) or g(n)?

I am doing an introductory course on algorithms. I've come across this problem which I'm unsure about.
I would like to know which of the 2 are dominant
f(n): 100n + log n or g(n): n + (log n)^2
Given the definitions of each of:
Ω, Θ, O
I assumed f(n), so f(n) = Ω(g(n)).
Reason being that n dominates (log n)^2, is that true?
In this case,
lim (n → ∞) f(n) / g(n) = 100.
If you go over calculus definitions, this means that, for any ε > 0, there exists some m for which
100 (1 - ε) g(n) ≤ f(n) ≤ 100 (1 + ε) g(n)
for any n > m.
From the definition of Θ, you can infer that these two functions are Θ of each other.
In general, if
lim (n → ∞) f(n) / g(n) = c exists, and
0 < c < ∞,
then the two functions have the same order of growth (they are Θ of each other).
n dominates both log(n) and (log n)^2
A little explanation
f(n) = 100n + log n
Here n dominates log n for large values of n.
So f(n) = O(n) .......... [1]
g(n) = n + (log n)^2
Now, (log n)^2 dominates log n.
But n still dominates (log n)^2.
So g(n) = O(n) .......... [2]
Now, taking results [1] and [2] into consideration.
f(n) = Θ(g(n)) and g(n) = Θ(f(n))
since they will grow at the same rate for large values of n.
We can say that f(n) = O(g(n)) if there are constants c > 0 and n0 > 0 such that
f(n) <= c*g(n), n > n0
This is the case for both directions:
# c == 100
100n + log n <= 100(n + (log n)^2)
             = 100n + 100*(log n)^2     (n > 1)
and
# c == 1
n + (log n)^2 <= 100n + log n (n > 1)
Taken together, we've proved that n + (log n)^2 <= 100n + log n <= 100(n + (log n)^2), which proves that f(n) = Θ(g(n)), which is to say that neither dominates the other. Both functions are Θ(n).
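For intuition, here is a small illustrative Python check (base-2 logs assumed; not part of the proof) that both sandwich inequalities hold and that the ratio f(n)/g(n) approaches 100:

import math

def f(n): return 100 * n + math.log2(n)
def g(n): return n + math.log2(n) ** 2

for n in [2, 100, 10**4, 10**8, 10**12]:
    assert g(n) <= f(n) <= 100 * g(n)   # the two-sided bound shown above
    print(n, round(f(n) / g(n), 2))     # tends toward 100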
g(n) dominates f(n), or equivalently, g(n) is Ω(f(n)), and the same holds vice versa.
Considering the definition, you see that you can drop the factor 100 in the definition of f(n) (since you can multiply by any fixed constant), and you can drop the logarithmic addends in both functions, since they are dominated by the linear term n.
The above follows from n being Ω(n + log n) and n being Ω(n + log^2 n).
hope that helps,
fricke

Calculating the Recurrence Relation T(n)=T(n / log n) + Θ(1)

The question comes from Introduction to Algorithms 3rd Edition, P63, Problem 3-6, where it's introduced as Iterated functions. I rewrite it as follows:
int T(int n){
    int count;
    for(count = 0; n > 2; ++count)
    {
        n = n / log₂(n);   // divide by the base-2 log of the current n
    }
    return count;          // number of iterations until n drops to 2 or below
}
Then give as tight a bound as possible on T(n).
I can make it O(log n) and Ω(log n / log log n), but can it be tighter?
PS: Using Mathematica, I've learned that when n=1*10^3281039, T(n)=500000
and at the same time T(n) ≈ 1.072435 * log n / log log n,
and the coefficient declines with n from 1.22943 (n = 2.07126*10^235) to 1.072435 (n = 1*10^3281039).
I hope this information is helpful.
It looks like the lower bound is pretty good, so I tried to prove that the upper bound is O(log n / log log n).
But let me first explain the other bounds (just for a better understanding).
TL;DR
T(n) is in Θ(log n / log log n).
T(n) is in O(log n)
This can be seen by modifying n := n/log₂n to n := n/2.
It needs O(log₂ n) steps until n ≤ 2 holds.
T(n) is in Ω(log n / log log n)
This can be seen by modifying n := n/log₂(n) to n := n/m, where m is the initial value of log n.
Solving the equation
n / (log n)^x < 2 for x leads us to
log n - x log log n < log 2
⇔ log n - log 2 < x log log n
⇔ (log n - log 2) / log log n < x
⇒ x ∈ Ω(log n / log log n)
Improving the upper bound: O(log n) → O(log n / log log n)
Now let us try to improve the upper bound. Instead of dividing n by a fixed constant (namely 2 in the above proof), we divide n by half the initial value of log(n) for as long as the current value of log(n) is bigger than that. To make this clearer, have a look at the modified code:
int T₂(int n){
    int n_old = n;
    int count;
    for(count = 0; n > 2; ++count)
    {
        n = n / (log₂(n_old)/2);
        if(log₂(n) <= log₂(n_old)/2)
        {
            n_old = n;
        }
    }
    return count;
}
The complexity of the function T₂ is clearly an upper bound for the complexity of T, since log₂(n_old)/2 < log₂(n) holds throughout.
Now we need to know how many times we divide by each value of ½⋅log(n_old) before n_old gets updated:
n / (log(sqrt(n)))^x ≤ sqrt(n)
⇔ n / sqrt(n) ≤ (log(sqrt(n)))^x
⇔ log(sqrt(n)) ≤ x ⋅ log(log(sqrt(n)))
⇔ log(sqrt(n)) / log(log(sqrt(n))) ≤ x
So we get the recurrence formula T₂(n) = T(sqrt(n)) + O(log(sqrt(n)) / log(log(sqrt(n)))).
Now we need to know how often this formula has to be expanded until n < 2 holds.
n^(2^(-x)) < 2
⇔ 2^(-x) ⋅ log n < log 2
⇔ -x ⋅ log 2 + log log n < log 2
⇔ log log n < log 2 + x ⋅ log 2
⇔ log log n < (x + 1) ⋅ log 2
So we need to expand the formula about log log n times.
Now it gets a little bit harder. (Also have a look at Mike_Dog's answer.)
T₂(n) = T(sqrt(n)) + log(sqrt(n)) / log(log(sqrt(n)))
      = Σ_{k=1}^{log log n - 1} 2^(-k) ⋅ log(n) / log(2^(-k) ⋅ log n)
      = log(n) ⋅ Σ_{k=1}^{log log n - 1} 2^(-k) / (log log n - k)
(1)   = log(n) ⋅ Σ_{k=1}^{log log n - 1} 2^(k - log log n) / k
      = log(n) ⋅ Σ_{k=1}^{log log n - 1} 2^k ⋅ 2^(-log log n) / k
      = log(n) ⋅ Σ_{k=1}^{log log n - 1} 2^k / (k ⋅ log n)
      = Σ_{k=1}^{log log n - 1} 2^k / k
In the line marked with (1) I reordered the sum.
So, at the end we "only" have to calculate Σ_{k=1}^{t} 2^k / k for t = log log n - 1. At this point Maple solves this to
Σ_{k=1}^{t} 2^k / k = -I⋅π - 2^t⋅LerchPhi(2, 1, t) + 2^t/t
where I is the imaginary unit and LerchPhi is the Lerch transcendent. Since the result for the sum above is a real number for all relevant cases, we can just ignore all imaginary parts. The Lerch transcendent LerchPhi(2,1,t) seems to be in O(-1/t), but I'm not 100% sure about it. Maybe someone will prove this.
Finally this results in
T₂(n) = -2^t⋅O(-1/t) + 2^t/t = O(2^t/t) = O(log n / log log n)
All together we have T(n) ∈ Ω(log n / log log n) and T(n) ∈ O(log n/ log log n),
so T(n) ∈ Θ(log n/ log log n) holds. This result is also supported by your example data.
I hope this is understandable and it helps a little.
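To connect this back to the measurements in the question, here is a short empirical Python sketch (an illustration only; it uses floating-point n far smaller than the Mathematica experiments) that runs the loop from the question and compares the count against log n / log log n:

import math

def T(n):
    # The loop from the question: repeatedly divide n by log2(n).
    count = 0
    while n > 2:
        n = n / math.log2(n)
        count += 1
    return count

for e in [10, 50, 100, 300]:
    n = 10.0 ** e
    estimate = math.log2(n) / math.log2(math.log2(n))
    # The ratio is a moderate constant that slowly declines, consistent with
    # the coefficients reported in the question.
    print(e, T(n), round(T(n) / estimate, 3))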
The guts of the problem of verifying the conjectured estimate is to get a good estimate of plugging the value
n / log(n)
into the function
n --> log(n) / log(log(n))
Theorem:
log( n/log(n) ) / log(log( n/log(n) )) = log(n)/log(log(n)) - 1 + o(1)
(in case of font readability issues, that's little-oh, not big-oh)
Proof:
To save on notation, write
A = n
B = log(n)
C = log(log(n))
The work is based on the first-order approximation to the (natural) logarithm: when 0 < y < x,
log(x) - y/x < log(x - y) < log(x)
The value we're trying to estimate is
log(A/B) / log(log(A/B)) = (B - C) / log(B - C)
Applying the bounds for the logarithm of a difference gives
(B-C) / log(B) < (B-C) / log(B-C) < (B-C) / (log(B) - C/B)
that is,
(B-C) / C < (B-C) / log(B-C) < (B-C)B / (C (B-1))
Both the recursion we're trying to satisfy and the lower bound suggest we should estimate this with B/C - 1. Pulling that off of both sides gives
B/C - 1 < (B-C) / log(B-C) < B/C - 1 + (B-C)/(C(B-1))
and thus we conclude
(B-C) / log(B-C) = B/C - 1 + o(1)
If you take away one idea from this analysis to use on your own, let it be the point of using differential approximations (or even higher order Taylor series) to replace complicated functions with simpler ones. e.g. once you have the idea to use
log(x-y) = log(x) + Θ(y/x) when y = o(x)
then all of the algebraic calculations you need for your problem simply follow directly.
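Here is a small numerical illustration of the theorem (hedged: it uses natural logs and modest n, purely to visualize the o(1) term):

import math

def phi(x):
    # x -> log(x) / log(log(x))
    return math.log(x) / math.log(math.log(x))

for e in [10, 50, 100, 300]:
    n = 10.0 ** e
    lhs = phi(n / math.log(n))
    rhs = phi(n) - 1
    print(e, round(lhs - rhs, 4))   # the difference is the o(1) term; it shrinks slowly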
Thanks for the answer from AbcAeffchen.
I'm the owner of the question. Using the master method, which I learned yesterday, the "a little bit harder" part of the proof can be done simply as follows.
I will start here:
T(n) = T(sqrt(n)) + O(log(sqrt(n)) / log(log(sqrt(n))))
⇔ T(n)=T(sqrt(n)) + O(log n / log log n)
Let
n = 2^k, S(k) = T(2^k)
then we have
T(2^k) = T(2^(k/2)) + O(log 2^k / log log 2^k)
⇔ S(k) = S(k/2) + O(k / log k)
with the master method
S(k) = a*S(k/b) + f(k), where a = 1, b = 2, f(k) = k/log k
= Ω(k^(log_2(1) + ε)) = Ω(k^ε),
as long as ε ∈ (0,1)
so we can apply case 3 (the regularity condition also holds). Then
S(k) = Θ(k / log k)
T(n) = S(k) = Θ(k / log k) = Θ(log n / log log n)
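As a rough check of this master-method conclusion (an illustrative Python sketch; the base case and the constant hidden in the O-term are arbitrary), expanding S(k) = S(k/2) + k/log₂ k numerically shows S(k) staying within a small constant factor of k/log₂ k:

import math

def S(k):
    # Expand S(k) = S(k/2) + k/log2(k) down to an arbitrary small base case.
    if k <= 2:
        return 1.0
    return S(k / 2) + k / math.log2(k)

for k in [2.0**8, 2.0**16, 2.0**32, 2.0**48]:
    print(int(k), round(S(k) / (k / math.log2(k)), 3))   # ratio levels off near a constant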

Can O(log(n*log n)) be considered O(log n)?

Suppose I get f(n) = log(n*log n). Should I say that it's O(log(n*log n))?
Or should I do log(n*log n)=log n + log(log n) and then say that the function f(n) is O(log n)?
First of all, as you have observed:
log(n*log n) = log(n) + log(log(n))
but think about log(log N) as N grows large (as Floris suggests).
For example, let N = 1000; then log N = 3 (base 10, i.e. a small number) and log(3) is even smaller.
This remains true as N gets huge, i.e. far larger than the number of instructions your code could ever execute.
Thus, O(log(n * log n)) = O(log n + log(log n)) = O(log n), since the log(log n) term is dominated.
Another way to look at this: n * log n << n^2, so in the worst case
log(n^2) > log(n * log n)
Since log(n^2) = 2*log(n), 2*log(n) is an upper bound, and therefore O(log(n * log n)) = O(log n).
Use the definition. If f(n) = O(log(n*log(n))), then there must exist a positive constant M and real n0 such that:
|f(n)| ≤ M |log(n*log(n))|
for all n > n0.
Now let's assume (without loss of generality) that n0 > 0. Then
log(n) ≥ log(log(n))
for all n > n0.
From this, we have:
log(n*log(n)) = log(n) + log(log(n)) ≤ 2 * log(n)
Substituting, we find that
|f(n)| ≤ 2*M*|log(n)| for all n > n0
Since 2*M is also a positive constant, it immediately follows that f(n) = O(log(n)).
Of course in this case simple transformations show both functions differ by a constant factor asymptotically, as shown.
However, I feel it is worthwhile to recall a classic test for analyzing how two functions relate to each other asymptotically. So here's a slightly more formal proof.
You can check how f(x) relates to g(x) by analyzing lim f(x)/g(x) as x -> infinity.
There are 3 cases:
lim = infinity <=> O(f(x)) > O(g(x))
inf > lim > 0 <=> O(f(x)) = O(g(x))
lim = 0 <=> O(f(x)) < O(g(x))
So
lim ( log(n * log(n)) / log n )
= lim ( (log n + log log(n)) / log n )
= lim ( 1 + log log(n) / log n )
= 1 + 0 = 1
Note: I treated the limit of log log(n) / log n as trivially 0, but you can verify it with l'Hôpital's rule.
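A tiny Python check of this limit (illustrative only; the convergence is slow because log log n / log n decays slowly):

import math

for n in [10, 10**3, 10**6, 10**12, 10**24]:
    ratio = math.log(n * math.log(n)) / math.log(n)
    print(n, round(ratio, 4))   # approaches 1 from above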

Solving recurrences: Substitution method

I'm trying to follow Cormen's book "Introduction to Algorithms" (page 59, I believe) about substitution method for solving recurrences. I don't get the notation used for MERGE-SORT substitution:
T(n) ≤ 2(c ⌊n/2⌋lg(⌊n/2⌋)) + n
≤ cn lg(n/2) + n
= cn lg n - cn lg 2 + n
= cn lg n - cn + n
≤ cn lg n
The part I don't understand is how you turn ⌊n/2⌋ into n/2, assuming that it denotes recursion. Can you explain the substitution method and its general thought process (especially the mathematical induction part) in a simple and easily understandable way? I know there's a great answer of that sort about big-O notation here on SO.
The idea behind the substitution method is to bound a function defined by a recurrence via strong induction. I'm going to assume that T(n) is an upper bound on the number of comparisons merge sort uses to sort n elements and define it by the following recurrence with boundary condition T(1) = 0.
T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1.
Cormen et al. use n instead of n - 1 for simplicity and cheat by using floor twice. Let's not cheat.
Let H(n) be the hypothesis that T(n) ≤ c n lg n. Technically we should choose c right now, so let's set c = 100. Cormen et al. opt to write down statements that hold for every (positive) c until it becomes clear what c should be, which is an optimization.
The base cases are H(1) and H(2), namely T(1) ≤ 0 and T(2) ≤ 2 c. Okay, we don't need any comparisons to sort one element, and T(2) = T(1) + T(1) + 1 = 1 < 200.
Inductively, when n ≥ 3, assume for all 1 ≤ n' < n that H(n') holds. We need to prove H(n).
T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1
≤ c floor(n/2) lg floor(n/2) + T(ceil(n/2)) + n - 1
by the inductive hypothesis H(floor(n/2))
≤ c floor(n/2) lg floor(n/2) + c ceil(n/2) lg ceil(n/2) + n - 1
by the inductive hypothesis H(ceil(n/2))
≤ c floor(n/2) lg (n/2) + c ceil(n/2) lg ceil(n/2) + n - 1
since 0 < floor(n/2) ≤ n/2 and lg is increasing
Now we have to deal with the consequences of our honesty and bound lg ceil(n/2).
lg ceil(n/2) = lg (n/2) + lg (ceil(n/2) / (n/2))
< lg (n/2) + lg ((n/2 + 1) / (n/2))
since 0 < ceil(n/2) ≤ n/2 + 1 and lg is increasing
= lg (n/2) + log (1 + 2/n) / log 2
≤ lg (n/2) + 2/(n log 2)
by the inequality log (1 + x) ≤ x, which can be proved with calculus
Okay, back to bounding T(n).
T(n) ≤ c floor(n/2) lg (n/2) + c ceil(n/2) (lg (n/2) + 2/(n log 2)) + n - 1
since 0 < floor(n/2) ≤ n/2 and lg is increasing
= c n lg n - c n + n + 2 c ceil(n/2) / (n log 2) - 1
since floor(n/2) + ceil(n/2) = n and lg (n/2) = lg n - 1
≤ c n lg n - (c - 1) n + 2 c/log 2
since ceil(n/2) ≤ n
≤ c n lg n
since, for all n' ≥ 3, we have (c - 1) n' = 99 n' ≥ 297 > 200/log 2 ≈ 288.539.
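To double-check the constant chasing above, here is a small verification sketch in Python (not part of the proof): it computes the honest recurrence with floors and ceilings and confirms T(n) ≤ 100·n·lg n over a modest range of n.

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # Honest comparison count: T(1) = 0, T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1.
    if n == 1:
        return 0
    return T(n // 2) + T((n + 1) // 2) + n - 1

c = 100
assert all(T(n) <= c * n * math.log2(n) for n in range(2, 5000))
print("T(n) <= 100 n lg n holds for 2 <= n < 5000")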
Commentary
I guess this doesn't explain the why very well, but (hopefully) at least the derivations are correct in all of the details. People who write proofs like these often skip the base cases and ignore floor and ceil because, well, the details usually are just an annoyance that affects the constant c (which most computer scientists not named Knuth don't care about).
To me, the substitution method is for confirming a guess rather than formulating one. The interesting question is how one comes up with a guess. Personally, if the recurrence is (i) not something that looks like Fibonacci (e.g., linear homogeneous recurrences) and (ii) not covered by Akra–Bazzi, a generalization of the Master Theorem, then I'm going to have some trouble coming up with a good guess.
Also, I should mention the most common failure mode of the substitution method: if one can't quite choose c large enough to swallow the extra terms from the subproblems, then the bound may be wrong. On the other hand, more base cases might suffice. In the preceding proof, I used two base cases because I couldn't prove the very last inequality unless I knew that n > 2/log 2.
