In the 3rd edition of CLRS, specifically section 3.1 (page 47 in my copy), they say:
when a > 0, any linear function an + b is in O(n^2), which is easily verified by taking c = a + |b| and n0 = max(1,-b/a).
Here n0 is the value such that, for all n >= n0, we can show an + b <= cn^2 in a proof of the above.
I tried to verify this but I couldn't get very far :(
How did they choose these values of c and n0? I know that the only thing that matters is that there exist some c and n0 for which the inequality holds, which is enough to prove that an + b is O(n^2), but I wonder how they chose those specific values. They don't seem arbitrary; it's as if they applied some technique I have never seen before to obtain them.
Thanks.
Let's take the simple case where a and b are both positive. What the authors are trying to do is find a constant c such that the quadratic function dominates the linear function for n >= 1. That's it. They're just constructing a general recipe for showing that the right parabola dominates any line.
At n = 1, the value of the linear function l(n) = an + b is a + b. A quadratic with no lower-order terms, q(n) = c * n^2, dominates l(n) at n = 1 if we choose c = a + b. So q(n) = (a+b)n^2 dominates l(n) = an + b for all n >= 1, assuming a and b are both positive. You can check examples for yourself by plotting 30x^2 and 10x + 20 on Desmos.
It's a bit trickier when b is negative, but the positive case is basically the point.
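For what it's worth, here is one way to recover the book's constants (a sketch of the verification CLRS leaves to the reader, not necessarily the authors' own derivation). For every n >= 1 we have n <= n^2 and 1 <= n^2, so

an + b <= an + |b| <= a*n^2 + |b|*n^2 = (a + |b|)*n^2 = c*n^2.

That explains c = a + |b| and the "1" inside n0 = max(1, -b/a). The -b/a part handles a negative b: the formal definition requires 0 <= f(n) <= c*g(n), and an + b >= 0 exactly when n >= -b/a (for a > 0). Taking n0 = max(1, -b/a) guarantees both conditions at once.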
I'm having a tough time understanding Big-O time complexity.
Formal definition of Big-O:
f(n) = O(g(n)) means there are positive constants c and k, such that 0 ≤ f(n) ≤ cg(n) for all n ≥ k. The values of c and k must be fixed for the function f and must not depend on n.
And the worst-case time complexity of insertion sort is O(n^2).
I want to understand what f(n), g(n), c, and k are here in the case of insertion sort.
Explanation
It is not that easy to formalize an algorithm so that you can apply Big-O rigorously; Big-O is a mathematical concept and does not translate directly to algorithms. Typically you measure the number of "computation steps" needed to perform the operation as a function of the size of the input.
So f is the function that measures how many computation steps the algorithm performs.
n is the size of the input, for example 5 for a list like [4, 2, 9, 8, 2].
g is the function you measure against, so g = n^2 if you check for O(n^2).
c and k heavily depend on the specific algorithm and on how exactly f looks.
Example
The biggest issue with formalizing an algorithm is that you can not really tell exactly how many computation steps are performed. Let's say we have the following Java code:
public static void printAllEven(int n) {
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0) {
            System.out.println(i);
        }
    }
}
How many steps does it perform? How deep should we go? What about for (int i = 0; i < n; i++)? Those are multiple statements that are executed during the loop. What about i % 2? Can we assume this is a "single operation"? On which level, one CPU cycle? One line of assembly? What about println(i), how many computation steps does it need, 1 or 5 or maybe 200?
This is not practical. We do not know the exact amounts, so we abstract and say the loop bookkeeping, the check, and the print each take some constant number of steps A, B, and C, which is okay since each runs in constant time.
After simplifying the analysis, we can say that we are effectively only interested in how often println(i) is called.
This leads to the observation that we call it roughly n / 2 times (since that is how many even numbers there are between 0 and n).
The exact formula for f, using the constants above, would be something like
n * A + n * B + n/2 * C
But since constants do not really play any role (they vanish in c), we could also just ignore this and simplify.
Now you are left with proving that n / 2 is in O(n^2), for example. By doing that, you will also get concrete numbers for c and k. Example:
n / 2 <= n <= 1 * n^2 // for all n >= 0
So by choosing c = 1 and k = 0 you have proven the claim. Other values for c and k work as well, example:
n / 2 <= 100 * n <= 5 * n^2 // for all n >= 20
Here we have chosen c = 5 and k = 20.
You could play the same game with the full formula as well and get something like
n * A + n * B + n/2 * C
<= n * (A + B + C)
= D * n
<= D * n^2 // for all n > 0
with c = D and k = 0.
As you can see, the exact formula does not really play any role; the constants just vanish into c.
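If it helps to see that abstraction in running code, below is a small instrumented variant of printAllEven (the method name and counter are made up for illustration). It just counts how often the print would fire and compares that count to the n/2 estimate and the n^2 bound used above:

public static void countPrintAllEven(int n) {
    long prints = 0; // how often System.out.println(i) would be called
    for (int i = 0; i < n; i++) {
        if (i % 2 == 0) {
            prints++; // stand-in for the "expensive" println(i)
        }
    }
    // prints equals ceil(n / 2), which is <= n <= 1 * n^2 for all n >= 0,
    // matching the choice c = 1 and k = 0 from the proof above.
    System.out.println("n = " + n + ", prints = " + prints + ", n^2 = " + (long) n * n);
}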
In the case of insertion sort, f(n) is the actual number of operations your processor performs to do the sort, and g(n) = n^2. The minimal values for c and k will be implementation-defined, but they are not that important. The main idea Big-O notation gives you is that if you double the size of the array, the time insertion sort takes will grow approximately by a factor of 4 (2^2). (For insertion sort it can be smaller, but Big-O only gives an upper bound.)
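To make that concrete, here is a rough sketch of an instrumented insertion sort (what exactly gets counted is an arbitrary choice, so treat the numbers as illustrative rather than as "the" f(n)). On a worst-case input, such as a reverse-sorted array, doubling the length roughly quadruples the returned count:

static long insertionSortOps(int[] a) {
    long ops = 0; // a crude f(n): number of inner-loop comparisons/shifts
    for (int i = 1; i < a.length; i++) {
        int key = a[i];
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j]; // shift the larger element one slot to the right
            j--;
            ops++;
        }
        a[j + 1] = key;
    }
    return ops; // worst case is n*(n-1)/2, which is O(n^2)
}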
I was studying the topic of asymptotic notations. I reckon the formula is quite simple, yet it tells me nothing, and there are a couple of things I don't understand.
When we say
f(n) <= c * g(n), where n >= n₀
We don't know the values of c and n₀ at first, but by dividing f(n) or g(n) we get the value of c. (Here is where my confusion lies.)
First question: how do we decide which side of the equation, f(n) or g(n), has to be divided?
Suppose we have to prove:
2n^2 = O(n^3)
Here f(n) = 2n^2 and g(n) = n^3,
which takes the form:
2n^2 = c * n^3
Now, in the notes I have read, they divide by 2n^2 to get the value of c, and by doing that we get c = 1. But if we divide by n^3 [which I don't know whether we can do or not], we get c = 2. How do we know which one to divide by?
Second question: where does n₀ come from, and what is its task?
Well, from the formula we know n >= n₀, which means that whatever n we take must be at least n₀.
But I am confused about where we actually use n₀. Why is it needed?
By just finding c and n, can't we conclude whether
n^2 = O(n^3) or not?
Would anyone like to address this? Many thanks in advance.
Please don't snub me or downvote if I ask anything stupid. Any useful link that covers all of this would be enough as well.
I have gone through the following links before posting this question; this is what I understand so far, and here are those links:
http://openclassroom.stanford.edu/MainFolder/VideoPage.php?course=IntroToAlgorithms&video=CS161L2P8&speed=
http://faculty.cse.tamu.edu/djimenez/ut/utsa/cs3343/lecture3.html
https://sites.google.com/sites/algorithmss15
From the second URL in your question:
Let's define big-Oh more formally:
O(g(n)) = { the set of all f such that there exist positive constants c and n0 satisfying 0 <= f(n) <= cg(n) for all n >= n0 }.
This means that for f(n) = 4*n*n + 135*n*log(n) + 1e8*n, the big-O is O(n*n).
Because for a large enough c and n0 the following is true:
4*n*n + 135*n*log(n) + 1e8*n = f(n) <= c*n*n
In this particular case the pair [c, n0] can be, for example, [6, 1e8], because (this is of course not a valid mathematical proof, but I hope it's "obvious" from it):
f(1e8) = 4*1e16 + 135*8*1e8 + 1e16 = 5*1e16 + 1080*1e8 <= 6*1e16 = 6*1e8*1e8 = c*n*n with c = 6 and n = n0 = 1e8. There are of course many more possible [c, n0] for which f(n) <= c*n*n holds true, but you only need to find one such pair to prove that f(n) is O(n*n).
As you can see, for n = 1 you would need quite a huge c (something like 1e9), so at first glance f(n) may look much bigger than n*n, but in the asymptotic view you don't care about the first few values, as long as the behaviour beyond some boundary is as desired. That boundary is some [c, n0]. If you can find such a boundary ([6, 1e8]), then QED: "f(n) is big-O of n*n".
The n >= n₀ part means that whatever the claim says may be false for the first few (countably many) values n' < n₀, but from some n₀ on, the claim is true for all the remaining (bigger) integers.
It says that you don't care about the first few integers ("first few" can be as "little" as 1e400, or 1e400000, etc., from the theory point of view); you only care about the bigger n values (big enough, i.e. bigger than n₀).
Ultimately it means that in big-O notation you usually write the simplest and lowest-order function having the same asymptotic behaviour as the examined f(n).
For example, for any polynomial f(n) = Σ a_i n^i, i = 0..k, we have O(f(n)) = O(n^k).
So I threw away all the lower powers of n (0..k-1), as they stand no chance against n^k in the long run (for large n). And the coefficient a_k simply gets absorbed into some bigger c.
In case you are lost in that i,k,...:
f(n) = 34n^4 + 23920392n^2 is O(n^4).
For large enough n, that n^4 will "eclipse" any value coming from n^2. And 34n^4 is only 34 times bigger than n^4, so 34 is a constant (it relates to c) and can be omitted from the big-O notation too.
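None of this replaces the proof, but if you want to sanity-check the [6, 1e8] pair numerically, a throwaway snippet like this works (it assumes the log in f(n) is base 10, which is what the arithmetic above suggests; the class name is made up):

public class BigOCheck {
    public static void main(String[] args) {
        // Check f(n) <= 6 * n^2 for a few n >= n0 = 1e8 (an illustration, not a proof).
        for (double n : new double[]{1e8, 2e8, 1e9}) {
            double f = 4 * n * n + 135 * n * Math.log10(n) + 1e8 * n;
            double bound = 6 * n * n;
            System.out.println("n = " + n + ": f(n) = " + f + ", 6*n^2 = " + bound
                    + ", holds = " + (f <= bound));
        }
    }
}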
The great people at MyCodeSchool.com have this introductory video on YouTube, covering the basics of Big-O, Theta, and Omega notation.
The following definition of Big-O notation is provided:
O(g(n)) := { f(n) : f(n) ≤ c·g(n) for all n ≥ n0 }
My casual understanding of the equation is as follows:
Given a function f(), which takes n as its input, there exists another function g() whose output is always greater than or equal to the output of f(), given two conditions:
g() is multiplied by some constant c
n is greater than some lower bound n0
Is my understanding correct?
Furthermore, the following specific example was provided to illustrate Big-O:
Given:
f(n) = 5n^2 + 2n + 1
Because all of the following are true for n ≥ 1:
2n ≤ 2n^2
1 ≤ 1n^2
It follows that:
c = 5 + 2 + 1 = 8
Therefore, the video concludes, f(n) ≤ 8n^2 for all n ≥ 1, and g(n) = 8n^2
I think maybe the video concluded that n0 must be 1, because 1 is the only positive root of the equation 8n^2 = 5n^2 + 2n + 1. (Negative one-third is also a root, but n is limited to whole numbers, so no dice there.)
Is this the standard way of computing n0 for Big-O notation?
Take the largest powered factor in your polynomial
Multiply it by the sum of the coefficients in your time function
Set their product equal to your time function
Solve for zero
Reject all roots that are not in the set of whole numbers
Any help would be greatly appreciated. Thanks in advance.
Your understanding is mostly correct, but from your wording - "I think maybe the video concluded that n0 must be 1" - I have to point out that it is also valid to take n0 to be 2, or 3, etc. In fact, any number greater than or equal to 1 will satisfy the required condition; there are actually infinitely many choices for the pair (c, n0)!
The important point to note is that the values of the constants c and n0 do not really matter; all we care about is the existence of a pair of constants (c, n0).
The Basics
Big-O notation describes the asymptotic behavior of a given function f; essentially, it describes an upper bound on f when its input value is sufficiently large.
Formally, we say that f is big-O of another function g, i.e. f(x) = O(g(x)), if there exists a positive constant c and a constant n0 such that the following inequality holds:
f(n) ≤ c g(n), for all n ≥ n0
Note that the inequality captures the idea of an upper bound: f is upper-bounded by a positive multiple of g. Moreover, the "for all" condition ensures that the upper bound holds once the input n is sufficiently large (i.e. at least n0).
How to Pick (c, n0)?
In order to prove f(x) = O(g(x)) for given functions f, g, all we need to do is pick any pair (c, n0) such that the inequality holds, and then we are done!
There is no standard way of finding (c, n0); just use whatever mathematical tools you find helpful. For example, you may fix n0 and then find c by using calculus to compute the maximum value of f(x) / g(x) on the interval [n0, +∞).
In your case, it appears that you are trying to prove that a polynomial of degree d is big-O of x^d; the proof of the following lemma gives a way to pick (c, n0):
Lemma
If f is a polynomial of degree d, then f(x) = O(x^d).
Proof: We have f(x) = a_d x^d + a_{d-1} x^{d-1} + ... + a_1 x + a_0, and for each coefficient a_i we have a_i ≤ |a_i| (the absolute value of a_i).
Take c = |a_d| + |a_{d-1}| + ... + |a_1| + |a_0| and n0 = 1. Then, for all x ≥ 1:
f(x) = a_d x^d + a_{d-1} x^{d-1} + ... + a_1 x + a_0
≤ |a_d| x^d + |a_{d-1}| x^{d-1} + ... + |a_1| x + |a_0|
≤ (|a_d| + |a_{d-1}| + ... + |a_1| + |a_0|) x^d
= c x^d
Therefore f(x) = O(x^d).
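If you would like to see the lemma in action, here is a small helper (the class and method names are made up) that computes c as the sum of the absolute values of the coefficients and spot-checks the inequality for a few x ≥ 1, using the f(n) = 5n^2 + 2n + 1 example from the question:

import java.util.Arrays;

public class PolyBound {
    // coeffs[i] is the coefficient a_i of x^i, so the degree d is coeffs.length - 1.
    static double boundConstant(double[] coeffs) {
        return Arrays.stream(coeffs).map(Math::abs).sum(); // c = |a_0| + ... + |a_d|
    }

    static double evalPoly(double[] coeffs, double x) {
        double value = 0;
        for (int i = coeffs.length - 1; i >= 0; i--) {
            value = value * x + coeffs[i]; // Horner's rule
        }
        return value;
    }

    public static void main(String[] args) {
        double[] coeffs = {1, 2, 5};      // f(x) = 5x^2 + 2x + 1, so c should be 8
        double c = boundConstant(coeffs);
        int d = coeffs.length - 1;
        for (double x : new double[]{1, 2, 10, 1000}) {
            double f = evalPoly(coeffs, x);
            double bound = c * Math.pow(x, d);
            System.out.println("x = " + x + ": f(x) = " + f + " <= " + bound
                    + " ? " + (f <= bound));
        }
    }
}

This is only a numeric illustration at a handful of points; the proof above is what actually establishes the bound for every x ≥ 1.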
This question already has answers here: Difference between Big-O and Little-O Notation.
What does n^b = o(a^n) (o is little-oh) mean, intuitively? I am just beginning to teach myself algorithms, and I have a hard time interpreting such expressions every time I see one. Here, the way I understood it is that for the function n^b, the rate of growth is a^n. But this is not making sense to me, regardless of whether it is right or wrong.
f(n) = o(g(n)) means that f(n)/g(n) -> 0 as n -> infinity.
For your problem, it must hold that a > 1. Then (n^b)/(a^n) -> 0 as n -> infinity. To see this, write a^n = sqrt(a)^n * sqrt(a)^n, so (n^b)/(a^n) = (n^b / sqrt(a)^n) * (1 / sqrt(a)^n). Let h(n) = n^b / sqrt(a)^n; this function increases at first and then decreases, so it attains some maximum value M. Then (n^b)/(a^n) <= M / sqrt(a)^n. Since a > 1 we have sqrt(a) > 1, so sqrt(a)^n -> infinity as n -> infinity, and therefore M / sqrt(a)^n -> 0. Hence (n^b)/(a^n) -> 0 as n -> infinity, which means n^b = o(a^n) by definition.
(For simplicity I'll assume that all functions always return positive values. This is the case for example for functions measuring run-time of an algorithm, as no algorithm runs in "negative" time.)
First, a recap of big-O notation, to clear up a common misunderstanding:
To say that f is O(g) means that f grows asymptotically at most as fast as g. More formally, treating both f and g as functions of a variable n, to say that f(n) is O(g(n)) means that there is a constant K, so that eventually, f(n) < K * g(n). The word "eventually" here means that there is some fixed value N (which is a function of K, f, and g), so that if n > N then f(n) < K * g(n).
For example, the function f(n) = n + 2 is O(n^2). To see why, let K = 1. Then, if n > 10, we have n + 2 < n^2, so our conditions are satisfied. A few things to note:
For n = 1, we have f(n) = 3 and g(n) = 1, so f(n) < K * g(n) actually fails. That's ok! Remember, the inequality only needs to hold eventually, and it does not matter if the inequality fails for some small finite list of n.
We used K = 1, but we didn't need to. For example, K = 2 would also have worked. The important thing is that there is some value of K which gives us the inequality we want eventually.
We saw that n + 2 is O(n^2). This might look confusing, and you might say, "Wait, isn't n + 2 actually O(n)?" The answer is yes. n + 2 is O(n), O(n^2), O(n^3), O(n/3), etc.
Little-o notation is slightly different. Big-O notation, intuitively, says that if f is O(g), then f grows asymptotically at most as fast as g. Little-o notation says that if f is o(g), then f grows asymptotically strictly slower than g.
Formally, f is o(g) if for any (let's say positive) choice of K, the inequality f(n) < K * g(n) eventually holds. So, for instance:
The function f(n) = n is not o(n). This is because, for K = 1, there is no value of n so that f(n) < K * g(n). Intuitively, f and g grow asymptotically at the same rate, so f does not grow strictly slower than g does.
The function f(n) = n is o(n^2). Why is this? Pick your favorite positive value of K. (To see the actual point, try to make K small, for example 0.001.) Imagine graphing the functions f(n) and K * g(n). One is a straight line through the origin of positive slope, and the other is a concave-up parabola through the origin. Eventually the parabola will be higher than the line, and will stay that way. (If you remember your pre-calc/calculus...)
Now we get to your actual question: let f(n) = n^b and g(n) = a^n. You asked why f is o(g).
Presumably, the author of the original statement treats a and b as constant, positive real numbers, and moreover a > 1 (if a <= 1 then the statement is false).
The statement, in English, is:
For any positive real number b, and any real number a > 1, the function n^b grows asymptotically strictly slower than a^n.
This is an important thing to know if you are ever going to deal with algorithmic complexity. Put more simply, one can say "polynomials grow much more slowly than exponential functions." It isn't immediately obvious that this is true, and the proof is too long to write out here, so here is a reference:
https://math.stackexchange.com/questions/55468/how-to-prove-that-exponential-grows-faster-than-polynomial
Probably you will have to have some comfort with math to be able to read any proof of this fact.
Good luck!
The super-high-level meaning of the statement n^b is o(a^n) is just that exponential functions like a^n grow much faster than polynomial functions like n^b.
The important thing to understand when looking at big-O and little-o notation is that they are both upper bounds. I'm guessing that's why you're confused. n^b is o(a^n) because the growth rate of a^n is much bigger. You could probably find a tighter little-o upper bound on n^b (one where the gap between the bound and the function is smaller), but a^n is still valid. It's also probably worth looking at the difference between big-O and little-o.
Remember that a function f is big-O of a function g if, for some constant k > 0, there is some minimum value of n beyond which f(n) ≤ k * g(n).
A function f is little-o of a function g if, for any constant k > 0, there is some minimum value of n beyond which f(n) ≤ k * g(n).
Note that the little-o requirement is harder to fulfill, meaning that if a function f is little-o of a function g, it is also big-O of g, and it says that g outgrows f by more than big-O alone would imply.
In your example, if b is 3 and a is 2 and we set k to 1, we can work out the minimum value of n so that n^b ≤ k * a^n. In this case, it's between 9 and 10, since
9^3 = 729 and 1 * 2^9 = 512, which means at n = 9, a^n is not yet greater than n^b,
but
10^3 = 1000 and 1 * 2^10 = 1024, which means a^n is now greater than n^b.
You can see by graphing these functions that a^n will be greater than n^b for any value of n >= 10. At this point we've only shown that n^b is big-O of a^n, since big-O only requires that for some value of k > 0 (we picked 1), a^n ≥ n^b beyond some minimum n (in this case it's between 9 and 10).
To show that n^b is little-o of a^n, we would have to show that for any k greater than 0 you can still find a minimum value of n so that k * a^n > n^b. For example, if you picked k = 0.5, the minimum of 10 we found earlier doesn't work, since 10^3 = 1000 and 0.5 * 2^10 = 512. But we can just keep sliding the minimum for n out further and further; the smaller you make k, the bigger the minimum for n will be. Saying n^b is little-o of a^n means that no matter how small you make k, we will always be able to find a big enough value of n so that n^b ≤ k * a^n.
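To see that sliding behaviour numerically, here is a small throwaway sketch (the class and method names are made up) for b = 3 and a = 2: for each k it finds the threshold past which n^3 ≤ k * 2^n holds for good, and that threshold keeps moving right as k shrinks, which is exactly the little-o behaviour described above:

public class LittleOCrossover {
    // Smallest n such that n^b <= k * a^n for all larger n, found by scanning up to a
    // cutoff (fine here, since 2^n dwarfs n^3 long before the cutoff is reached).
    static long threshold(double a, double b, double k, long cutoff) {
        long lastFail = 0;
        for (long n = 1; n <= cutoff; n++) {
            if (Math.pow(n, b) > k * Math.pow(a, n)) {
                lastFail = n;
            }
        }
        return lastFail + 1;
    }

    public static void main(String[] args) {
        // b = 3, a = 2 as in the example above; shrinking k pushes the threshold out.
        for (double k : new double[]{1.0, 0.5, 0.01, 0.0001}) {
            System.out.println("k = " + k + " -> n^3 <= k * 2^n for all n >= "
                    + threshold(2, 3, k, 200));
        }
    }
}

With k = 1 this prints 10, matching the crossover between 9 and 10 found above; smaller k values push the threshold further out.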
I'm starting to learn about Big-Oh notation.
What is an easy way for finding C and N0 for a given function?
Say, for example:
(n+1)^5, or n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1
I know the formal definition for Big-Oh is:
Let f(n) and g(n) be functions mapping
nonnegative integers to real numbers.
We say that f(n) is O(g(n)) if there
is a real constant c > 0 and an
integer constant N0 >= 1
such that f(n) <= c·g(n) for every integer n >= N0.
My question is, what is a good, sure-fire method for picking values for c and N0?
For the given polynomial above, (n+1)^5, I have to show that it is O(n^5). So, how should I pick my c and N0 so that I can make the above definition true without guessing?
You can pick a constant c by adding the coefficients of each term in your polynomial. Since
| n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 | <= | n^5 + 5n^5 + 10n^5 + 10n^5 + 5n^5 + 1n^5 |
and you can simplify both sides to get
| n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 | <= | 32n^5 |
So c = 32, and this will always hold true for any n >= 1.
It's almost always possible to find a lower c by raising N0, but this method works, and you can do it in your head.
(The absolute value operations around the polynomials are to account for negative coefficients.)
Usually the proof is done without picking concrete C and N0. Instead of proving f(n) < C * g(n) you prove that f(n) / g(n) < C.
For example, to prove n^3 + n is O(n^3) you do the following:
(n^3 + n) / n^3 = 1 + (n / n^3) = 1 + (1 / n^2) <= 2 for any n >= 1. Here you can pick any C >= 2 with N0 = 1.
You can check what lim |f(n)/g(n)| is as n -> +infinity, and that would give you the constant (g(n) is n^5 in your example, f(n) is (n+1)^5).
Note that the meaning of Big-O for x->+infinity is that if f(x) = O(g(x)), then f(x) "grows no faster than g(x)", so you just need to prove that lim abs(f(x)/g(x)) exists and is less than +infinity.
It's going to depend greatly on the function you are considering. However, for a given class of functions, you may be able to come up with an algorithm.
For instance, polynomials: if you set C to any value greater than the leading coefficient of the polynomial, then you can solve for N0.
After you understand the magic there, you should also realize that big-O is just a notation. It means that you do not have to look for these coefficients in every problem you solve, once you are sure you understand what's going on behind these letters. You just manipulate the symbols according to the notation's rules.
There's no easy generic rule to determine the actual values of N0 and c. You should recall your calculus knowledge to solve it.
The definition of big-O is entangled with the definition of the limit. It requires c to satisfy:
c > lim |f(n)/g(n)| as n approaches +infinity.
If the ratio f(n)/g(n) is bounded above, such a c exists; if it is not, then f is not O(g). After you have picked a concrete c, you will have no problem finding an appropriate N0.
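As a rough numeric companion to that limit idea (it does not replace the calculus, and the class name is made up), you can estimate lim |f(n)/g(n)| by evaluating the ratio at increasingly large n. For the (n+1)^5 versus n^5 example from the question, the ratio settles toward 1, so any c above 1 works once n is large enough:

public class RatioLimit {
    public static void main(String[] args) {
        // Estimate lim |f(n)/g(n)| for f(n) = (n+1)^5 and g(n) = n^5.
        for (double n : new double[]{10, 100, 1000, 1000000}) {
            double ratio = Math.pow(n + 1, 5) / Math.pow(n, 5);
            System.out.println("n = " + n + ", f(n)/g(n) = " + ratio);
        }
        // The ratio approaches 1, so e.g. c = 2 with a modest N0 already gives f(n) <= c * g(n).
    }
}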