I'm reading a textbook right now for my Java III class. We're reading about Big-Oh and I'm a little confused by its formal definition.
Formal Definition: "A function f(n) is of order at most g(n) - that is, f(n) = O(g(n)) - if a positive real number c and positive integer N exist such that f(n) <= c g(n) for all n >= N. That is, c g(n) is an upper bound on f(n) when n is sufficiently large."
Ok, that makes sense. But hold on, keep reading...the book gave me this example:
"In segment 9.14, we said that an
algorithm that uses 5n + 3 operations
is O(n). We now can show that 5n + 3 =
O(n) by using the formal definition of
Big Oh.
When n >= 3, 5n + 3 <= 5n + n = 6n.
Thus, if we let f(n) = 5n + 3, g(n) =
n, c = 6, N = 3, we have shown that
f(n) <= 6 g(n) for n >= 3, or 5n + 3 =
O(n). That is, if an algorithm
requires time directly proportional to
5n + 3, it is O(n)."
Ok, this kind of makes sense to me. They're saying that once n is 3 or greater, 5n + 3 never exceeds 6n, because 3 <= n gives 5n + 3 <= 5n + n = 6n - right? Makes sense, since if n were 2, 5n + 3 = 13 while 6n = 12, but when n is 3 or greater, 5n + 3 will always be less than or equal to 6n.
Here's where I get confused. They give me another example:
Example 2: "Let's show that 4n^2 + 50n
- 10 = O(n^2). It is easy to see that: 4n^2 + 50n - 10 <= 4n^2 + 50n
for any n. Since 50n <= 50n^2 for n
= 50, 4n^2 + 50n - 10 <= 4n^2 + 50n^2 = 54n^2 for n >= 50. Thus, with c = 54 and N = 50, we have shown that 4n^2
+ 50n - 10 = O(n^2)."
This statement doesn't make sense: 50n <= 50n^2 for n >= 50.
Isn't any n going to make the 50n less than 50n^2? Not just greater than or equal to 50? Why did they even mention that 50n <= 50n^2? What does that have to do with the problem?
Also, 4n^2 + 50n - 10 <= 4n^2 + 50n^2 = 54n^2 for n >= 50 is going to be true no matter what n is.
And how in the world does picking numbers show that f(n) = O(g(n))?
Keep in mind that you're looking for "an upper bound on f(n) when n is sufficiently large." Thus, if you can show that f(n) is less than or equal to some cg(n) for values of n greater than N, this means cg(n) is an upper bound for f(n) and f(n)'s complexity is therefore O(g(n)).
The examples given are intended to show that the given function f(n) can never grow beyond c*g(n) for any n > N. By manipulating an initial upper bound so it can be expressed more simply (if 4n^2 + 50n is an upper bound on f(n) then so is 4n^2 + 50n^2, which is equal to 54n^2, which becomes your 54*g(n) where c = 54 and g(n) = n^2), the authors can show that f(n) is bounded by c*g(n), which has complexity O(g(n)) and therefore so does f(n).
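If it helps to make this concrete, here is a quick numerical sanity check of the book's bound (a finite check over a sample range, not a proof; the constants c = 54 and N = 50 are the ones from the example):

```python
# Finite sanity check of the book's bound: 4n^2 + 50n - 10 <= 54*n^2 for n >= 50.
# Checking a finite range is not a proof, but it makes the claim concrete.

def f(n):
    return 4 * n**2 + 50 * n - 10

def g(n):
    return n**2

c, N = 54, 50

# The inequality must hold for every n >= N; we test a large finite sample.
assert all(f(n) <= c * g(n) for n in range(N, 10_000))
print("bound holds for all tested n >= 50")
```

The definition only requires that *some* valid pair (c, N) exists; this particular bound happens to hold for even smaller n as well, which is fine.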
The whole thing about picking numbers is just this: to make it easier. Because you're allowed to pick any values you like for N and c, the author just picks something where it's easiest to see. And that's what you can do too (when writing an exam, etc.).
So while it would often be possible to use a smaller N, the reasoning would become a little bit harder, often requiring some background from analysis - we all learned years ago that x doesn't grow as fast as x^2... but do you want to write down the full analysis proof?
Keep it simple - that's the message :-) It's just a bit strange to get used to at first.
50n <= 50n^2 for n >= 50
because if n >= 50, then 50 <= n, so 50n <= n * n = n^2 (at n = 50 they are equal, since 50*50 equals 50^2).
Substituting n^2 for 50n we get
n^2 <= 50n^2 for n >= 50
which is obvious.
Probably the reason that they said 50n<=50n^2 for n>=50 is that if n is less than 1, then n^2 < n. Of course, if n is a positive integer, then yes, 50n<=50n^2. In this case, it seems that n is assumed to be a positive integer, although the formal definition they give doesn't state that explicitly.
I can see why saying 50n<=50n^2 for n>=50 may seem a little silly. But it's still true. The book doesn't say 50n<=50n^2 holds ONLY for n>=50; that would be false.
As an analogy, if I say "all of my siblings speak English", that would be true, even though there are a lot of people who speak English who are not my siblings.
Regarding the proof, we might split it into different statements.
(1): 4n^2 + 50n - 10 <= 4n^2 + 50n (for all n)
(2): 4n^2 + 50n <= 4n^2 + 50n^2 (for all n>=50)
(3): 4n^2 + 50n^2 = 54 n^2 (for all n, including all n>=50)
(4): Therefore, 4n^2 + 50n - 10 <= 54n^2 for all n>=50
(5): Therefore, for f(n)=4n^2 + 50n - 10, g(n)=n^2, N=50, and c=54,
the statement f(n) <= c g(n) for all n >= N is true
(6): Therefore, by definition 4n^2 + 50n - 10=O(n^2).
It should be clear that each of these statements is true, either on its own (1,2,3), or as a result of the previous statements.
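If you want to convince yourself numerically, statements (1), (2), and (4) can be spot-checked over a finite range (this doesn't replace the algebra; it just illustrates it):

```python
# Spot-check statements (1), (2), and (4) over a finite range of n.
for n in range(1, 5_000):
    # (1): dropping the -10 only makes the right-hand side bigger.
    assert 4*n**2 + 50*n - 10 <= 4*n**2 + 50*n
    if n >= 50:
        # (2): for n >= 50 we have 50n <= 50n^2, so the middle bound holds.
        assert 4*n**2 + 50*n <= 4*n**2 + 50*n**2
        # (4): chaining (1)-(3) gives the final bound with c = 54, N = 50.
        assert 4*n**2 + 50*n - 10 <= 54*n**2
print("statements (1), (2), (4) hold for all tested n")
```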
Formal definition:
f(n) = O(g(n)) means there exist c > 0 and n0 such that for any n >= n0 f(n) <= c*g(n)
f(n) = o(g(n)) means for any c > 0 there exist n0 such that for any n >= n0 f(n) <= c*g(n)
As you can see, they are slightly different :)
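To see the difference in action, here is a small sketch (with a hypothetical helper `first_n0`, not a standard function) that searches for a workable n0 given a fixed c:

```python
# A rough way to see the difference between O and o: for a GIVEN c, search for
# an n0 that makes f(n) <= c*g(n) hold from n0 up to some finite limit.
def first_n0(f, g, c, limit=10_000):
    """Smallest n0 <= limit such that f(n) <= c*g(n) for all n in [n0, limit]."""
    last_fail = 0
    for n in range(1, limit + 1):
        if f(n) > c * g(n):
            last_fail = n
    return last_fail + 1 if last_fail < limit else None

# f(n) = n, g(n) = n^2: a valid n0 exists for EVERY c > 0, so n = o(n^2).
assert first_n0(lambda n: n, lambda n: n * n, c=1) == 1
assert first_n0(lambda n: n, lambda n: n * n, c=0.01) == 100  # need n >= 1/c

# f(n) = g(n) = n: c = 1 works, but c = 0.5 never does, so n = O(n) but n != o(n).
assert first_n0(lambda n: n, lambda n: n, c=1) == 1
assert first_n0(lambda n: n, lambda n: n, c=0.5) is None
```

Big-O only needs one (c, n0) pair to exist; little-o needs an n0 for every positive c, no matter how small.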
Related
Example-3 Find upper bound for f(n) = n^4 + 100n^2 + 50
Solution: n^4 + 100n^2 + 50 ≤ 2n^4, for all n ≥ 11
∴ n^4 + 100n^2 + 50 = O(n^4 ) with c = 2 and n0 = 11
In the above question the solution says the inequality holds for n ≥ 11 and n-nought (n0) is 11.
Can anybody explain why it is 11?
for reference - this is a problem from the Data Structures and Algorithms Made Easy by Narasimha Karumanchi
f(n) = n^4 + 100n^2 + 50
Intuitively, n^4 grows very fast; n^2 grows less fast than n^4; and 50 doesn't grow at all.
However, for small values of n, n^4 < 50; additionally, the n^2 term has a factor 100 in front of it. Because of this factor, for small values of n, n^4 < 100 n^2.
But because we have the intuition that n^4 grows much faster than n^2, we expect that, for n big enough, 100 n^2 + 50 < n^4.
In order to assert and prove this claim, we need to be more precise on what "for n big enough" means. Your textbook found an exact value; and they claimed: for n ≥ 11, 100 n^2 + 50 < n^4.
How did they find that? Maybe they solved the inequality for n. Or maybe they just intuited it by noticing that:
100 n^2 = 10 * 10 * n * n
n^4 = n * n * n * n
Thus n^4 is going to be the bigger of the two as soon as n is bigger than 10.
In conclusion: as soon as n ≥ 11, f(n) < 2 n^4. Thus, f(n) satisfies the textbook definition for f(n) = O(n^4).
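If you'd rather not solve the inequality by hand, a brute-force search confirms the textbook's threshold (assuming n ranges over positive integers):

```python
def f(n):
    return n**4 + 100 * n**2 + 50

# First n at which f(n) <= 2*n^4; then check it keeps holding beyond that.
n0 = next(n for n in range(1, 1_000) if f(n) <= 2 * n**4)
assert all(f(n) <= 2 * n**4 for n in range(n0, 10_000))
print(n0)  # 11
```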
It doesn't say that n > 11; it says that n^4 + 100n^2 + 50 ≤ 2n^4 for all n ≥ 11.
Is it true? You can substitute n = 11 into the formula and check it yourself.
How was 11 obtained? By solving the inequality.
This is not about finding a tight upper bound for the function; it is an asymptotic analysis of the function with big-O notation. Hence the particular constant n0 = 11 does not matter for the analysis, and if you can show the inequality is valid for all n greater than any other constant, for instance n0 = 100, that will be accepted too. By the way, you can show that it is true for all n ≥ 11 by mathematical induction.
According to this book, big O means:
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus there exists some constant c such that f(n) is always ≤ c · g(n), for large enough n (i.e. , n ≥ n0 for some constant n0).
I have trouble understanding the following big-O claim:
3n^2 - 100n + 6 = O(n^2), because I choose c = 3 and 3n^2 > 3n^2 - 100n + 6.
How can 3 be a factor? In 3n^2 - 100n + 6, if we drop the low-order terms -100n and 6, aren't 3n^2 and 3*n^2 the same? How do I work this out?
I'll take the liberty to slightly paraphrase the question to:
Why do 3n^2 - 100n + 6 and n^2 have the same asymptotic complexity?
For that to be true, the definition has to hold in both directions.
First, 3n^2 - 100n + 6 = O(n^2):
let c = 3. Then 3n^2 - 100n + 6 <= 3n^2 reduces to -100n + 6 <= 0, so for n >= 1 the inequality is always satisfied.
The other way around, n^2 = O(3n^2 - 100n + 6):
let c = 1. Then n^2 <= 3n^2 - 100n + 6 reduces to 2n^2 - 100n + 6 >= 0. We have a parabola opened upwards, therefore there is again some n0 after which the inequality is always satisfied.
Let's look at the definition you posted for f(n) in O(g(n)):
f(n) = O(g(n)) means c · g(n) is an upper bound on f(n). Thus there
exists some constant c such that f(n) is always ≤ c · g(n), for
large enough n (i.e. , n ≥ n0 for some constant n0).
So, we only need to find one set of constants (c, n0) that fulfils
f(n) < c · g(n), for all n > n0, (+)
but this set is not unique. I.e., the problem of finding the constants (c, n0) such that (+) holds is degenerate. In fact, if any such pair of constants exists, there will exist an infinite amount of different such pairs.
Note that here I've switched to strict inequalities, which is really only a matter of taste, but I prefer this latter convention. Now, we can re-state the Big-O definition in possibly more easy-to-understand terms:
... we can say that f(n) is O(g(n)) if we can find a constant c such
that f(n) is less than c·g(n) for all n larger than n0, i.e., for all
n>n0.
Now, let's look at your function f(n)
f(n) = 3n^2 - 100n + 6 (*)
Let's describe your function as the sum of its highest-order term and another function
f(n) = 3n^2 + h(n) (**)
h(n) = 6 - 100n (***)
We now study the behaviour of h(n) and f(n), respectively:
h(n) = 6 - 100n
what can we say about this expression?
=> if n > 6/100, then h(n) < 0, since 6 - 100*(6/100) = 0
=> h(n) < 0, given n > 6/100 (i)
f(n) = 3n^2 + h(n)
what can we say about this expression, given (i)?
=> if n > 6/100, then f(n) = 3n^2 + h(n) < 3n^2
=> f(n) < c*n^2, with c=3, given n > 6/100 (ii)
Ok!
From (ii) we can choose the constant c = 3, given that we choose the other constant n0 larger than 6/100. Let's choose the first integer that fulfils this: n0 = 1.
Hence, we've shown that (+) holds for the constant set (c, n0) = (3, 1), and subsequently, f(n) is in O(n^2).
For a reference on asymptotic behaviour, see e.g.
https://www.khanacademy.org/computing/computer-science/algorithms/asymptotic-notation/a/big-o-notation
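As a numerical sanity check of the constants derived above (finite range only, not a proof):

```python
# Check the constants derived above: f(n) < 3*n^2 for all n >= n0 = 1.
f = lambda n: 3 * n**2 - 100 * n + 6

# h(n) = 6 - 100n is negative for every n >= 1, so the strict bound holds.
assert all(f(n) < 3 * n**2 for n in range(1, 100_000))
print("f(n) < 3n^2 for all tested n >= 1")
```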
[Graph: y = 3n^2 (top curve) vs y = 3n^2 - 100n + 6]
Consider the sketch above. By your definition, 3n^2 only needs to be bigger than 3n^2 - 100n + 6 for large enough n (i.e. , n ≥ n0 for some constant n0). Let that n0 = 5 in this case (it could be something a little smaller, but it's clear which graph is bigger by n=5 so we'll just go with that).
Clearly from the graph, 3n^2 >= 3n^2 - 100n + 6 in the range we've plotted. The only way for 3n^2 - 100n + 6 to get bigger than 3n^2 then is for it to grow more steeply.
But the gradients of 3n^2 and 3n^2 - 100n + 6 are 6n and 6n - 100 respectively, so 3n^2 - 100n + 6 can't grow more steeply, therefore must always be underneath.
So your definition holds - 3n^2 - 100n + 6 <= 3n^2 for all n>=5
I am not an expert, but this looks a lot similar to what we just had in our real analysis course.
Basically if you have something like f(n) = 3n^2 − 100n + 6, the "fastest growing" term "wins" the other terms, when you have really really big n.
So in this case 3n^2 surpasses what ever 100n is, when the n is really big.
Another example would be something like f(n) = n/n^2 or f(n) = n! * n^2.
The first one gets smaller, as n simply cannot "keep up" with n^2. In the second example n! clearly grows faster than n^2, so the n! factor dominates the growth, because the n^2 part technically stops mattering for big n.
And terms like +6, which have no n affecting them are constants and matter even less as they cannot grow even if n grows.
It is all about what happens when n is really big. If your n is 34934854385754385463543856, then n^2 is a hell of a lot bigger than 100n, because n^2 = n * n = 34934854385754385463543856 * 34934854385754385463543856.
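You can watch the lower-order terms fade numerically; this sketch compares 3n^2 - 100n + 6 against its leading term 3n^2:

```python
# Compare f(n) = 3n^2 - 100n + 6 to its leading term as n grows.
for n in [10, 1_000, 1_000_000]:
    ratio = (3 * n**2 - 100 * n + 6) / (3 * n**2)
    print(n, ratio)
# At n = 10 the ratio is even negative (the -100n part still dominates),
# but it approaches 1: asymptotically only the 3n^2 term matters.
```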
How do you work this out? Do you get c first, as the ratio of the two functions, and then use that ratio to find the range of n? How can you tell? Please explain, I'm really lost. Thanks.
Example 1: Prove that running time T(n) = n^3 + 20n + 1 is O(n^3)
Proof: by the Big-Oh definition,
T(n) is O(n^3) if T(n) ≤ c·n^3 for all n ≥ n0.
Let us check this condition:
if n^3 + 20n + 1 ≤ c·n^3 then 1 + 20/n^2 + 1/n^3 ≤ c.
Therefore,
the Big-Oh condition holds for n ≥ n0 = 1 and c ≥ 22 (= 1 + 20 + 1). Larger values of n0 result in smaller factors c (e.g., for n0 = 10, c ≥ 1.201, and so on), but in any case the above statement is valid.
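The n0-versus-c trade-off in this example can be tabulated directly; dividing the inequality through by n^3 shows the smallest workable c at threshold n0 is 1 + 20/n0^2 + 1/n0^3:

```python
# For T(n) = n^3 + 20n + 1 <= c * n^3: dividing by n^3 gives
# c >= 1 + 20/n^2 + 1/n^3, which is largest at n = n0. So the smallest
# workable c for a given threshold n0 is:
def smallest_c(n0):
    return 1 + 20 / n0**2 + 1 / n0**3

print(smallest_c(1))              # 22.0 (the book's c >= 22 for n0 = 1)
print(round(smallest_c(10), 3))   # 1.201 (the book's c >= 1.201 for n0 = 10)
```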
I think the trick you're seeing is that you aren't thinking of LARGE numbers. Hence, let's take a counter example:
T(n) = n^4 + n
and let's assume that we think it's O(N^3) instead of O(N^4). Dividing n^4 + n <= c·n^3 through by n^3, what you would need is
c = n + 1/n^2
which means that c, supposedly a constant, is actually c(n), a function dependent upon n. Taking N to a really big number shows that no matter what, c keeps growing with n, so T(n) can't be O(N^3).
What you want is that in the limit as N goes to infinity, everything but a constant remains:
c = 1 + 1/n^3
You might say this is still c(n)! But as N gets really, really big, 1/n^3 goes to zero. Hence, with very large N, when declaring T(n) to be O(N^4), c approaches 1, a constant!
Does that help?
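Maybe a small sketch helps too: it prints the "constant" each guess would require. The one from the wrong guess keeps growing with n, while the one from the right guess settles toward 1:

```python
# Divide T(n) = n^4 + n <= c*n^k through by n^k to see what "constant" c would be.
# Wrong guess O(n^3): need c >= n + 1/n^2, which grows with n (not a constant).
c_bad = lambda n: n + 1 / n**2
# Right guess O(n^4): need c >= 1 + 1/n^3, which settles toward the constant 1.
c_good = lambda n: 1 + 1 / n**3

for n in [10, 100, 1000]:
    print(n, c_bad(n), c_good(n))
```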
I am reading through Skiena's "Algorithm Design Manual".
The first chapter states a formal definition for Big O notation:
f(n) = O(g(n)) means that c * g(n) is an upper bound on f(n).
i.e. there exists some constant c such that f(n) is always less than or equal to c * g(n) for large enough n. (i.e. n >= n0 for some constant n0)
So that's fine and makes sense.
But then the author goes on to describe the Big O of a particular function: 3n^2 - 100n + 6
He says that 3n^2 - 100n + 6 is NOT O(n). And his reason is that for any c I choose, c * n is always < 3n^2 when n > c. Which is true, but what about the (-100n + 6) part?
Let's say I choose c = 1 and n = 2.
3n^2 - 100n + 6 = 12 - 200 + 6 = -182
and c * n is 1*2 which is 2. -182 is definitely less than 2, so why does Skiena ignore those terms?
Note the n >= n0 in the definition.
If you pick some c and n0, it has to be true for each n >= n0, not just for n = n0.
So if you have c = 1 and n0 = 2, it also has to be true for n = 1000 for example, which it isn't.
3n^2 - 100n + 6
=> 3(1000)^2 - 100(1000) + 6 = 3 000 000 - 100 000 + 6 = 2 900 006
c.n
=> 1(1000) = 1 000
It's a simplification. 3n^2 is greater than 100n - 6 for every n >= (SQRT(2482)+50)/3 ~= 33.2732249428 - please check, it's a simple quadratic inequality. Thus 3n^2 dominates 100n - 6, and that's why it's not worth considering that part - it does not add any value.
Please note that according to the definition you have to find (at least one) c for which c*n < 3n^2 - 100n + 6 for every n greater than or equal to some n0 (at least one such n0). Just pick c = 1000 and n0 = 1000 and you will see that it is always true for those c and n0. Because I have found such c and n0, the statement that 3n^2 - 100n + 6 grows strictly faster than n holds true.
But I agree that this simplification might be misleading.
From what I have studied: I have been asked to determine the complexity of a function with respect to another function, i.e. given f(n) and g(n), determine whether f(n) is O(g(n)). In such cases, I substitute values, compare the two, and arrive at a complexity, using O(), Theta, and Omega notations.
However, in the substitution method for solving recurrences, every standard document has the following lines:
• [Assume that T(1) = Θ(1).]
• Guess O(n^3). (Prove O and Ω separately.)
• Assume that T(k) ≤ ck^3 for k < n.
• Prove T(n) ≤ cn^3 by induction.
How am I supposed to find O and Ω when nothing else (apart from f(n)) is given? I might be wrong (I definitely am), and any information on the above is welcome.
Some of the assumptions above are with reference to this problem: T(n) = 4T(n/2) + n
, while the basic outline of the steps is for all such problems.
That particular recurrence is solvable via the Master Theorem, but you can get some feedback from the substitution method. Let's try your initial guess of cn^3.
T(n) = 4T(n/2) + n
<= 4c(n/2)^3 + n
= cn^3/2 + n
Assuming that we choose c so that n <= cn^3/2 for all relevant n,
T(n) <= cn^3/2 + n
<= cn^3/2 + cn^3/2
= cn^3,
so T is O(n^3). The interesting part of this derivation is where we used a cubic term to wipe out a linear one. Overkill like that is often a sign that we could guess lower. Let's try cn.
T(n) = 4T(n/2) + n
<= 4cn/2 + n
= 2cn + n
This won't work. The gap between the right-hand side and the bound we want is cn + n, which is big Theta of the bound we want. That usually means we need to guess higher. Let's try cn^2.
T(n) = 4T(n/2) + n
<= 4c(n/2)^2 + n
= cn^2 + n
At first that looks like a failure as well. Unlike our guess of n, though, the deficit is little o of the bound itself. We might be able to close it by considering a bound of the form cn^2 - h(n), where h is o(n^2). Why subtraction? If we used h as the candidate bound, we'd run a deficit; by subtracting h, we run a surplus. Common choices for h are lower-order polynomials or log n. Let's try cn^2 - n.
T(n) = 4T(n/2) + n
<= 4(c(n/2)^2 - n/2) + n
= cn^2 - 2n + n
= cn^2 - n
That happens to be the exact solution to the recurrence, which was rather lucky on my part. If we had guessed cn^2 - 2n instead, we would have had a little credit left over.
T(n) = 4T(n/2) + n
<= 4(c(n/2)^2 - 2n/2) + n
= cn^2 - 4n + n
= cn^2 - 3n,
which is slightly smaller than cn^2 - 2n.
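As a check on that claim, you can evaluate the recurrence directly for powers of two. The base case T(1) = 1 is an assumption made here just for illustration; with it, the exact solution comes out as 2n^2 - n, i.e. c = 2 in the cn^2 - n shape derived above:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # The recurrence T(n) = 4T(n/2) + n, with an assumed base case T(1) = 1.
    if n == 1:
        return 1
    return 4 * T(n // 2) + n

# For powers of two (where n/2 stays an integer), the computed values
# match the derived shape c*n^2 - n with c = 2.
for k in range(15):
    n = 2 ** k
    assert T(n) == 2 * n**2 - n
print("T(n) = 2n^2 - n for all tested powers of two")
```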