Proving Big-O runtime of an algorithm

I am trying to learn how to prove Big-O correctly.
What I am trying to do is find some c and n0 for a given function.
The definition given for Big-O is:
Let f(n) and g(n) be functions mapping nonnegative integers to real numbers.
We say that f(n) is O(g(n)) if there is a real constant c > 0 and an integer
constant n0 ≥ 1 such that for all n ≥ n0, f(n) ≤ c g(n).
Given the polynomial (n+1)^5, I need to show that it is O(n^5).
My question is: how do I find such c and n0 from the definition above, and how do I continue the algebra to show the n^5 bound?
So far, by trying induction, I have
(n+1)^5 = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1
then bound each term by n^5, so
n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 <= n^5 + 5n^5 + 10n^5 + 10n^5 + 5n^5 + n^5
n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 <= 32n^5

You want a constant c such that (n + 1)^5 ≤ c n^5. For that, you do not need induction, only a bit of algebra, and it turns out you actually already found such a c but missed the n0 in the process. So let's start from the beginning.
Note that c does not need to be tight; it can be way bigger than necessary and will still prove the time complexity. We will use that to our advantage.
We can first expand the left side as you did:
(n + 1)^5 = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1
For n ≥ 1, we have that n, n^2, n^3, n^4 ≤ n^5, and thus
(n + 1)^5 ≤ (1 + 5 + 10 + 10 + 5 + 1) n^5 = 32n^5
And there you have a c such that (n + 1)^5 ≤ c n^5. That c is 32.
And since we stated above that this holds if n ≥ 1, then we have that n0 = 1.
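If you want a quick sanity check of those constants before moving on, a short script can compare both sides directly. This is only a finite spot check of the inequality, not a substitute for the argument above; the range sampled is arbitrary.

# Spot-check (n + 1)^5 <= 32 * n^5 for n >= 1 over a finite range.
# The algebra above is the actual proof; this only samples n in [1, 10000].

C, N0 = 32, 1

for n in range(N0, 10_001):
    assert (n + 1) ** 5 <= C * n ** 5, f"bound fails at n = {n}"

print("(n + 1)^5 <= 32 n^5 held for every n in [1, 10000]")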
Generalization
This generalizes to any degree. In general, given the polynomial f(n) = (n + a)^b, you know that there exists a constant c, found by summing all the coefficients of the polynomial after expansion. It turns out the exact value of c does not matter, so you do not need to compute it; all that matters is that we proved its existence, and thus (n + a)^b is O(n^b).
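As a small illustration of the generalization (not a proof), note that summing the coefficients of the expansion of (n + a)^b is the same as plugging in n = 1, which gives c = (1 + a)^b. The sample values of a and b below are arbitrary choices.

# Illustrate (n + a)^b <= c * n^b for n >= 1 with c = (1 + a)^b,
# i.e. the sum of all coefficients in the expanded polynomial.
# The sampled values of a, b and the range of n are arbitrary choices.

for a in (1, 2, 7):
    for b in (2, 3, 5, 8):
        c = (1 + a) ** b                      # coefficient sum of (n + a)^b
        for n in range(1, 5_001):
            assert (n + a) ** b <= c * n ** b, f"fails at a={a}, b={b}, n={n}"

print("(n + a)^b <= (1 + a)^b * n^b held for all sampled cases")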

Related

Asymptotic Notation: Proving Big Omega, O, and Theta

I have a few asymptotic notation problems I do not entirely grasp.
So when proving asymptotic complexity, I understand the process of finding a constant c and the term n0 for which the notation holds. So, for example:
Prove 7n+4 = Ω(n)
In such a case we would pick a constant c that is lower than 7, since this is Big Omega. Picking 6 would result in
7n+4 >= 6n
n+4 >= 0
n >= -4
But since n0 cannot be negative, we pick a positive integer, so n0 = 1.
But what about a problem like this:
Prove that n^3 − 91n^2 − 7n − 14 = Ω(n^3).
I picked 1/2 as the constant, reaching
(1/2)n^3 - 91n^2 - 7n - 14 >= 0.
But I am unsure how to continue. Also, what about a problem like this, which I think concerns Theta:
Let g(n) = 27n^2 + 18n and let f(n) = 0.5n^2 − 100. Find positive constants n0, c1 and c2 such
that c1f(n) ≤ g(n) ≤ c2f(n) for all n ≥ n0.
In such a case am I performing two separate operations here, one big O comparison and one Big Omega comparison, so that there is a theta relationship, or tight bound? If so, how would I go about that?
To show n^3 − 91n^2 − 7n − 14 is in Ω(n^3), we need to exhibit some numbers n0 and c such that, for all n ≥ n0:
n^3 − 91n^2 − 7n − 14 ≥ cn^3
You've chosen c = 0.5, so let's go with that. Rearranging gives:
n^3 − 0.5n^3 ≥ 91n^2 + 7n + 14
Multiplying both sides by 2 and simplifying:
182n^2 + 14n + 28 ≤ n^3
For all n ≥ 1, we have:
182n^2 + 14n + 28 ≤ 182n^2 + 14n^2 + 28n^2 = 224n^2
And when n ≥ 224, we have 224n^2 ≤ n^3. Therefore, the choice of n0 = 224 and c = 0.5 demonstrates that the original function is in Ω(n^3).
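If it helps to convince yourself, the chosen constants can be spot-checked numerically over a finite range (the proof above is what actually covers all n ≥ 224):

# Spot-check n^3 - 91*n^2 - 7*n - 14 >= 0.5 * n^3 for n >= 224.
# A finite sample only; the derivation above covers every n >= 224.

c, n0 = 0.5, 224

for n in range(n0, 100_001):
    f = n ** 3 - 91 * n ** 2 - 7 * n - 14
    assert f >= c * n ** 3, f"bound fails at n = {n}"

print("f(n) >= 0.5 n^3 held for every n in [224, 100000]")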

Is it always possible to find a constant K to prove big O or big Omega?

So I have to figure out whether n^(1/2) is Big Omega of log(n)^3. I am pretty sure that it is not, since n^(1/2) is not even within the bounds of log(n)^3; but I do not know how to prove it without limits. I know the definition without limits is
g(n) is Big Omega of f(n) iff there is a constant c > 0 and an
integer constant n0 >= 1 such that f(n) >= c*g(n) for n >= n0
But can I really always find a constant c that will satisfy this?
For instance, for log(n)^3 >= c*n^(1/2): if c = 0.1 and n = 10 then we get 1 >= 0.316.
When comparing sqrt(n) with ln(n)^3 what happens is that
ln(n)^3 <= sqrt(n) ; for all n >= N0
How do I know? Because I printed out enough samples of both expressions to convince myself which one dominates the other.
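Something like the following sketch reproduces that experiment (the sample points are arbitrary). Note that ln(n)^3 is actually the larger of the two for small n; sqrt(n) only overtakes it somewhere between 10^7 and 10^8, which is why N0 has to be fairly large.

# Print samples of ln(n)^3 and sqrt(n) to see which one eventually dominates.
# ln(n)^3 wins for small n; sqrt(n) takes over between 10^7 and 10^8.

import math

for k in range(1, 13):                     # n = 10^1 .. 10^12
    n = 10 ** k
    print(f"n = 10^{k:<2}  ln(n)^3 = {math.log(n) ** 3:12.1f}   sqrt(n) = {math.sqrt(n):14.1f}")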
To see this more formally, let's first assume that we have already found N0 (we will do that later) and let's prove by induction that if the inequality holds for n >= N0, it will also hold for n+1.
Note that I'm using ln in base e for the sake of simplicity.
Since cubing is monotone, ln(n)^3 <= sqrt(n) is equivalent to ln(n) <= n^(1/6), so for the inductive step we have to show that
ln(n + 1) <= (n + 1)^(1/6)
Now
ln(n + 1) = ln(n + 1) - ln(n) + ln(n)
= ln(1 + 1/n) + ln(n)
<= ln(1 + 1/n) + n^(1/6) ; inductive hypothesis
From the definition of e we know
e = limit (1 + 1/n)^n
taking logarithms
1 = limit n*ln(1 + 1/n)
Therefore, there exists N0 such that
n*ln(1 + 1/n) <= 2 ; for all n >= N0
so
ln(1 + 1/n) <= 2/n
Using this above, we get
ln(n + 1) <= 2/n + n^(1/6)
By the mean value theorem, (n + 1)^(1/6) - n^(1/6) >= (1/6)(n + 1)^(-5/6), and 2/n drops below that once n is large enough, so (enlarging N0 if necessary)
ln(n + 1) <= 2/n + n^(1/6) <= (n + 1)^(1/6)
as we wanted.
We are now left with the task of finding some N0 such that
ln(N0) <= N0^(1/6)
Let's take N0 = e^(6k) for some value of k that we are about to find. We get
ln(N0) = 6k
N0^(1/6) = e^k
so, we only need to pick k such that 6k < e^k, which is possible because the right hand side grows much faster than the left.

Finding these three algorithms' running times

Hi, I am having a tough time showing the running time T(n) for these three recurrences. Assumptions include T(0) = 0.
1) This one I know is close to Fibonacci, so I think it's close to O(n) time, but I am having trouble showing that:
T(n) = T(n-1) + T(n-2) +1
2) This one I am stumped on, but I think it's roughly O(log log n):
T(n) = T([sqrt(n)]) + n, for n >= 1, where [sqrt(n)] denotes sqrt(n) rounded down (the floor).
3) I believe this one is roughly O(n*log log n):
T(n) = 2T(n/2) + (n/(log n)) + n.
Thanks for the help in advance.
T(n) = T(n-1) + T(n-2) + 1
Assuming T(0) = 0 and T(1) = a, for some constant a, we notice that T(n) - T(n-1) = T(n-2) + 1. That is, the growth rate of the function is given by the function itself, which suggests this function has exponential growth.
Let T'(n) = T(n) + 1. Then T'(n) = T'(n-1) + T'(n-2), by the above recurrence relation, and we have eliminated the troublesome constant term. T(n) and T'(n) differ by an additive constant of 1, so assuming they are both non-decreasing (they are), they will have the same asymptotic complexity, albeit for different constants n0.
To show T'(n) has asymptotic growth of O(b^n), we need some base cases, then the hypothesis that the condition holds for all n up to, say, k - 1, and then we need to show it for k, that is, cb^(n-2) + cb^(n-1) <= cb^n. We can divide through by cb^(n-2) to simplify this to 1 + b <= b^2. Rearranging, we get b^2 - b - 1 >= 0; the roots are (1 +- sqrt(5))/2, and we must discard the negative one since we cannot use a negative number as the base of our exponential. So for b >= (1+sqrt(5))/2, T'(n) may be O(b^n). A similar thought experiment will show that for b <= (1+sqrt(5))/2, T'(n) may be Omega(b^n). Thus, for b = (1+sqrt(5))/2 only, T'(n) may be Theta(b^n).
Completing the proof by induction that T(n) = O(b^n) is left as an exercise.
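If you want to see the exponential growth before writing the induction, a short script can tabulate T(n) against b^n with b = (1 + sqrt(5))/2. The base cases T(0) = 0 and T(1) = 1 below are just one choice of the constant a from above, and a settling ratio is evidence for the guess, not a proof.

# Tabulate T(n) = T(n-1) + T(n-2) + 1 against phi^n, phi = (1 + sqrt(5)) / 2.
# T(0) = 0 and T(1) = 1 are an arbitrary choice of the constant a above.

import math

phi = (1 + math.sqrt(5)) / 2

T = [0, 1]                                  # T(0), T(1)
for n in range(2, 41):
    T.append(T[n - 1] + T[n - 2] + 1)

for n in range(5, 41, 5):
    print(f"n = {n:2}   T(n) = {T[n]:12}   T(n) / phi^n = {T[n] / phi ** n:.4f}")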
T(n) = T([sqrt(n)]) + n
Obviously, T(n) is at least linear, assuming the boundary conditions require T(n) to be nonnegative. We might guess that T(n) is Theta(n) and try to prove it. Base case: let T(0) = a and T(1) = b. Then T(2) = b + 2 and T(4) = b + 6. In both cases, a choice of c >= 1.5 will work to make T(n) <= cn. Suppose that whatever our fixed value of c is works for all n up to and including k. We must show that T([sqrt(k+1)]) + (k+1) <= c(k+1). We know that T([sqrt(k+1)]) <= c*sqrt(k+1) from the induction hypothesis. So T([sqrt(k+1)]) + (k+1) <= c*sqrt(k+1) + (k+1), and c*sqrt(k+1) + (k+1) <= c(k+1) can be rewritten as cx + x^2 <= cx^2 (with x = sqrt(k+1)); dividing through by x (OK since k > 1) we get c + x <= cx, and solving this for c we get c >= x/(x-1) = sqrt(k+1)/(sqrt(k+1)-1). This approaches 1 as k grows, so for large enough n, any constant c > 1 will work.
Making this proof totally rigorous by fixing the following points is left as an exercise:
making sure enough base cases are proven so that all assumptions hold
distinguishing the cases where (a) k + 1 is a perfect square (hence [sqrt(k+1)] = sqrt(k+1)) and (b) k + 1 is not a perfect square (hence sqrt(k+1) - 1 < [sqrt(k+1)] < sqrt(k+1)).
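The Theta(n) guess is also easy to check numerically before (or after) doing the induction. The sketch below uses T(1) = 1 as an arbitrary base case; the point is only that T(n)/n drifts toward 1, which is consistent with the bound derived above, not a proof of it.

# Evaluate T(n) = T(floor(sqrt(n))) + n with the arbitrary base case T(1) = 1
# and watch T(n) / n approach 1, consistent with the Theta(n) guess.

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n: int) -> int:
    if n <= 1:
        return 1
    return T(math.isqrt(n)) + n

for k in range(1, 9):                       # n = 10^1 .. 10^8
    n = 10 ** k
    print(f"n = 10^{k}   T(n) = {T(n):12}   T(n) / n = {T(n) / n:.6f}")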
T(n) = 2T(n/2) + (n/(log n)) + n
Here T(n) > 2T(n/2) + n, which we recognize as the recurrence for the running time of Mergesort, which by the Master theorem is Theta(n log n), so we know our complexity is no less than that.
Indeed, by the master theorem: T(n) = 2T(n/2) + (n/(log n)) + n = 2T(n/2) + n(1 + 1/(log n)), so
a = 2
b = 2
f(n) = n(1 + 1/(log n)) is O(n) (for n>2, it's always less than 2n)
f(n) = Theta(n) = Theta(n^(log_2 2) * log^0 n)
So we are in case 2 of the Master Theorem, and the asymptotic bound is the same as for Mergesort: Theta(n log n).
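For this one too, a quick numerical check is reassuring. With an arbitrary base case T(1) = 1 and n restricted to powers of two, the ratio T(n)/(n*log2 n) stays bounded and creeps down slowly, which is consistent with the Theta(n log n) bound (the lower-order terms fade only slowly relative to n log n):

# Evaluate T(n) = 2*T(n/2) + n/log2(n) + n on powers of two, with the
# arbitrary base case T(1) = 1, and compare against n * log2(n).

import math

def T(n: int) -> float:
    if n <= 1:
        return 1.0
    return 2 * T(n // 2) + n / math.log2(n) + n

for k in range(2, 25, 2):                   # n = 4, 16, ..., 2^24
    n = 2 ** k
    print(f"n = 2^{k:<2}  T(n) / (n log2 n) = {T(n) / (n * math.log2(n)):.4f}")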

Is my explanation of big O correct in this case?

I'm trying to explain to my friend why 7n - 2 = O(n). I want to do so based on the definition of big O.
Based on the definition of big O, f(n) = O(g(n)) if:
We can find a real value C and integer value n0 >= 1 such that:
f(n) <= C*g(n) for all values of n >= n0.
In this case, is the following explanation correct?
7n - 2 <= C*n
-2 <= C*n - 7n
-2 <= n(C - 7)
-2 / (C - 7) <= n
if we consider C = 7, mathematically, -2 / (C - 7) is equal to negative infinity, so
n >= (negative infinity)
It means that for all values of n >= (negative infinity) the following holds:
7n - 2 <= 7n
Now we have to pick n0 such that for all n >= n0 and n0 >= 1 the following holds:
7n - 2 <= 7n
Since for all values of n >= (negative infinity) the inequality holds, we can simply take n0 = 1.
You're on the right track here. Fundamentally, though, the logic you're using doesn't work. If you are trying to prove that there exist an n0 and c such that f(n) ≤ cg(n) for all n ≥ n0, then you can't start off by assuming that f(n) ≤ cg(n) because that's ultimately what you're trying to prove!
Instead, see if you can start with the initial expression (7n - 2) and massage it into something upper-bounded by cn. Here's one way to do this: since 7n - 2 ≤ 7n, we can (by inspection) just pick n0 = 1 and c = 7 to see that 7n - 2 ≤ cn for all n ≥ n0.
For a more interesting case, let's try this with 7n + 2:
7n + 2
≤ 7n + 2n (for all n ≥ 1)
= 9n
So by inspection we can pick c = 9 and n0 = 1 and we have that 7n + 2 ≤ cn for all n ≥ n0, so 7n + 2 = O(n).
Notice that at no point in this math did we assume the ultimate inequality, which means we never had to risk a divide-by-zero error.
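The constants picked by inspection above are trivial to spot-check over a finite range, which can help when explaining this to someone:

# Spot-check 7n - 2 <= 7n and 7n + 2 <= 9n for n >= 1 over a finite sample.
# The algebra above covers all n >= 1; this only samples [1, 100000].

for n in range(1, 100_001):
    assert 7 * n - 2 <= 7 * n
    assert 7 * n + 2 <= 9 * n

print("both bounds held for every n in [1, 100000]")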

Solving recurrences: Substitution method

I'm trying to follow Cormen's book "Introduction to Algorithms" (page 59, I believe) about substitution method for solving recurrences. I don't get the notation used for MERGE-SORT substitution:
T(n) ≤ 2(c ⌊n/2⌋lg(⌊n/2⌋)) + n
≤ cn lg(n/2) + n
= cn lg n - cn lg 2 + n
= cn lg n - cn + n
≤ cn lg n
The part I don't understand is how you turn ⌊n/2⌋ into n/2 given that it denotes the recursive term. Can you explain the substitution method and its general thought process (especially the mathematical induction part) in a simple and easily understandable way? I know there's a great answer of that sort about big-O notation here on SO.
The idea behind the substitution method is to bound a function defined by a recurrence via strong induction. I'm going to assume that T(n) is an upper bound on the number of comparisons merge sort uses to sort n elements and define it by the following recurrence with boundary condition T(1) = 0.
T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1.
Cormen et al. use n instead of n - 1 for simplicity and cheat by using floor twice. Let's not cheat.
Let H(n) be the hypothesis that T(n) ≤ c n lg n. Technically we should choose c right now, so let's set c = 100. Cormen et al. opt to write down statements that hold for every (positive) c until it becomes clear what c should be, which is an optimization.
The base cases are H(1) and H(2), namely T(1) ≤ 0 and T(2) ≤ 2 c. Okay, we don't need any comparisons to sort one element, and T(2) = T(1) + T(1) + 1 = 1 < 200.
Inductively, when n ≥ 3, assume for all 1 ≤ n' < n that H(n') holds. We need to prove H(n).
T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1
≤ c floor(n/2) lg floor(n/2) + T(ceil(n/2)) + n - 1
by the inductive hypothesis H(floor(n/2))
≤ c floor(n/2) lg floor(n/2) + c ceil(n/2) lg ceil(n/2) + n - 1
by the inductive hypothesis H(ceil(n/2))
≤ c floor(n/2) lg (n/2) + c ceil(n/2) lg ceil(n/2) + n - 1
since 0 < floor(n/2) ≤ n/2 and lg is increasing
Now we have to deal with the consequences of our honesty and bound lg ceil(n/2).
lg ceil(n/2) = lg (n/2) + lg (ceil(n/2) / (n/2))
< lg (n/2) + lg ((n/2 + 1) / (n/2))
since 0 < ceil(n/2) ≤ n/2 + 1 and lg is increasing
= lg (n/2) + log (1 + 2/n) / log 2
≤ lg (n/2) + 2/(n log 2)
by the inequality log (1 + x) ≤ x, which can be proved with calculus
Okay, back to bounding T(n).
T(n) ≤ c floor(n/2) lg (n/2) + c ceil(n/2) (lg (n/2) + 2/(n log 2)) + n - 1
by the bound on lg ceil(n/2) derived above
= c n lg n - c n + n + 2 c ceil(n/2) / (n log 2) - 1
since floor(n/2) + ceil(n/2) = n and lg (n/2) = lg n - 1
≤ c n lg n - (c - 1) n + 2 c/log 2
since ceil(n/2) ≤ n
≤ c n lg n
since, for all n' ≥ 3, we have (c - 1) n' = 99 n' ≥ 297 > 200/log 2 ≈ 288.539.
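If you want to see the inequality that was just proved play out on concrete values, here is a small check of the exact recurrence against c n lg n with c = 100. This is only a finite sample and no part of the proof; the base case T(1) = 0 matches the boundary condition above.

# Evaluate T(n) = T(floor(n/2)) + T(ceil(n/2)) + n - 1 with T(1) = 0
# and check the proved bound T(n) <= 100 * n * lg(n) on a finite range.

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n: int) -> int:
    if n == 1:
        return 0
    return T(n // 2) + T((n + 1) // 2) + n - 1

for n in range(1, 20_001):
    assert T(n) <= 100 * n * math.log2(n), f"bound fails at n = {n}"

print("T(n) <= 100 n lg n held for every n in [1, 20000]")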
Commentary
I guess this doesn't explain the why very well, but (hopefully) at least the derivations are correct in all of the details. People who write proofs like these often skip the base cases and ignore floor and ceil because, well, the details usually are just an annoyance that affects the constant c (which most computer scientists not named Knuth don't care about).
To me, the substitution method is for confirming a guess rather than formulating one. The interesting question is how one comes up with a guess. Personally, if the recurrence is (i) not something that looks like Fibonacci (e.g., linear homogeneous recurrences) and (ii) not covered by Akra–Bazzi, a generalization of the Master Theorem, then I'm going to have some trouble coming up with a good guess.
Also, I should mention the most common failure mode of the substitution method: if one can't quite choose c to be large enough to swallow the extra terms from the subproblems, then the bound may be wrong. On the other hand, more base cases might suffice. In the preceding proof, I used two base cases because I couldn't prove the very last inequality unless I knew that n > 2/log 2.

Resources