I found that taking the logarithm of both sides when comparing two functions asymptotically is a common technique (according to some solutions to the problems in the CLRS book).
But does it always hold that the asymptotic relation of two functions after taking their logarithm indicates their original asymptotic relation?
I kind of doubt if it works when comparing two exponential functions.
For example, log(3^n) = n log 3 and log(2^n) = n log 2, which would suggest that O(2^n) and O(3^n) are on the same level of running time, which is not right.
Asymptotic bounds implicitly include a multiplicative constant which is ignored.
Formally, f(n) = O(g(n)) means that you can find N and C such that n > N => f(n) < C.g(n).
When taking the logarithm, the multiplicative constant becomes an additive one, log(f(n)) < log(C) + log(g(n)), and it isn't true that f(n) = O(g(n)) <=> log(f(n)) = O(log(g(n))).
So if you compare two complexities by their logarithms, the constant you are allowed to drop is an additive one, not a multiplicative one, and n.Log(3) indeed differs from n.Log(2) by more than an additive constant.
Similarly, O(n²) and O(n³) differ because 2.Log(n) and 3.Log(n) don't have the same coefficient.
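A quick numeric check of the point above. This is only a minimal sketch, and the function name `ratios` is an illustrative choice. The ratio of the logarithms stays at the constant log(3)/log(2), while the ratio of the functions themselves, (3/2)^n, grows without bound, so no constant C can satisfy 3^n < C.2^n for all large n.

```python
import math

def ratios(n):
    """Return (3^n / 2^n, log(3^n) / log(2^n)) for a positive integer n."""
    direct = 3**n / 2**n                          # equals 1.5^n, unbounded
    logs = (n * math.log(3)) / (n * math.log(2))  # constant, about 1.585
    return direct, logs

for n in (1, 10, 50, 100):
    direct, logs = ratios(n)
    print(f"n={n:3d}  3^n/2^n={direct:.3e}  log-ratio={logs:.3f}")
```

The log-ratio column never changes, which is exactly why comparing logarithms up to a constant factor hides the difference between the two exponentials.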
In big O notation, we always say that we should ignore constant factors for most cases. That is, rather than writing,
3n^2-100n+6
we are almost always satisfied with
n^2
since that term is the fastest growing term in the equation.
But I found that many algorithm courses start by comparing functions with many terms
2n^2+120n+5 = big O of n^2
then finding c and n0 for those long functions, before recommending to ignore low order terms in the end.
My question is: what would I gain from trying to understand and analyse these kinds of functions with many terms? Until this month I was comfortable with understanding what O(1), O(n), O(log(n)) and O(n^3) mean. But am I missing some important concepts if I just rely on these typically used functions? What will I miss if I skip analysing those long functions?
Let's first of all describe what we mean when we say that f(n) is in O(g(n)):
... we can say that f(n) is O(g(n)) if we can find a constant c such
that f(n) is less than c·g(n) for all n larger than n0, i.e., for all
n>n0.
In equation form: we need to find one set of constants (c, n0) that fulfils
f(n) < c · g(n), for all n > n0, (+)
Now, the result that f(n) is in O(g(n)) is sometimes presented in different forms, e.g. as f(n) = O(g(n)) or f(n) ∈ O(g(n)), but the statement is the same. Hence, from your question, the statement 2n^2+120n+5 = big O of n^2 is just:
f(n) = 2n^2 + 120n + 5
a result after some analysis: f(n) is in O(g(n)), where
g(n) = n^2
OK, with this out of the way, let's look at the constant term in the functions we want to analyse asymptotically, and let's do so educationally, using your example.
As the result of any big-O analysis is the asymptotic behaviour of a function, in all but some very unusual cases, the constant term has no effect whatsoever on this behaviour. The constant term can, however, affect how to choose the constant pair (c, n0) used to show that f(n) is in O(g(n)) for some functions f(n) and g(n), i.e., the non-unique constant pair (c, n0) used to show that (+) holds. We can say that the constant term will have no effect on the result of our analysis, but it can affect our derivation of this result.
Let's look at your function as well as another related function
f(n) = 2n^2 + 120n + 5 (x)
h(n) = 2n^2 + 120n + 22500 (xx)
Using a similar approach as in this thread, for f(n), we can show:
linear term:
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (i)
constant term:
5 < n^2 for e.g. n > 3 (verify: 3^2 = 9 > 5) (ii)
This means that if we replace both 120n as well as 5 in (x) by n^2 we can state the following inequality result:
Given that n > 120, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (i), (ii)} > 2n^2 + 120n + 5 = f(n) (iii)
From (iii), we can choose (c, n0) = (4, 120), and (iii) then shows that these constants fulfil (+) for f(n) with g(n) = n^2, and hence
result: f(n) is in O(n^2)
Now, for h(n), we analogously have:
linear term (same as for f(n))
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (I)
constant term:
22500 < n^2 for e.g. n > 150 (verify: 150^2 = 22500) (II)
In this case, we replace 120n as well as 22500 in (xx) by n^2, but we need a larger lower bound on n for both replacements to hold, namely n > 150. Hence, the following holds:
Given that n > 150, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (I), (II)} > 2n^2 + 120n + 22500 = h(n) (III)
In the same way as for f(n), we can, here, choose (c, n0) = (4, 150), and (III) then shows that these constants fulfil (+) for h(n), with g(n) = n^2, and hence
result: h(n) is in O(n^2)
Hence, we have the same result for both functions f(n) and h(n), but we had to use different constants (c,n0) to show these (i.e., somewhat different derivation). Note finally that:
Naturally the constants (c,n0) = (4,150) (used for h(n) analysis) are also valid to show that f(n) is in O(n^2), i.e., that (+) holds for f(n) with g(n)=n^2.
However, not the reverse: (c,n0) = (4,120) cannot be used to show that (+) holds for h(n) (with g(n)=n^2).
The core of this discussion is that:
As long as you look at sufficiently large values of n, you will be able to describe the constant terms in relations as constant < dominantTerm(n), where, in our example, we look at the relation with regard to dominant term n^2.
The asymptotic behaviour of a function will not (in all but some very unusual cases) depend on the constant terms, so we might as well skip looking at them at all. However, for a rigorous proof of the asymptotic behaviour of some function, we need to take into account also the constant terms.
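The choice of constants above can be sanity-checked numerically. This is a sketch only: the helper name `holds` and the finite cut-off `upto` are assumptions made for illustration, and a finite scan supports the derivation but does not replace it.

```python
def f(n):
    return 2 * n**2 + 120 * n + 5

def h(n):
    return 2 * n**2 + 120 * n + 22500

def holds(func, c, n0, upto=10_000):
    """Check func(n) < c * n^2 for every integer n with n0 < n <= upto."""
    return all(func(n) < c * n**2 for n in range(n0 + 1, upto + 1))

print(holds(f, 4, 120))  # True
print(holds(h, 4, 150))  # True
print(holds(f, 4, 150))  # True: the pair chosen for h also works for f
print(holds(h, 4, 120))  # False: h(121) = 66302 > 4 * 121^2 = 58564
```

The last line shows concretely why the larger constant term forces a larger n0 even though the final result, O(n^2), is unchanged.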
Ever have intermediate steps in your work? That is likely what this is: when you are computing a big O, chances are you don't already know for sure what the highest order term is, so you keep track of all the terms and then determine which complexity class makes sense in the end. There is also something to be said for understanding why the lower order terms can be ignored.
Take some graph algorithms like a minimum spanning tree or shortest path. Now, can you know what the highest term will be just by looking at an algorithm? I know I wouldn't, and so I'd trace through the algorithm and collect a bunch of terms.
If you want another example, consider Sorting Algorithms and whether you want to memorize all the complexities or not. Bubble Sort, Shell Sort, Merge Sort, Quick Sort, Radix Sort and Heap Sort are a few of the more common algorithms out there. You could either memorize both the algorithm and complexity or just the algorithm and derive the complexity from the pseudo code if you know how to trace them.
I was reading Intro to Algorithms, by Thomas H. Cormen, when I encountered this statement (in Asymptotic Notations)
when a>0, any linear function an+b is in O(n^2) which is essentially verified by taking c = a + |b| and n0 = max(1, -b/a)
I can't understand why O(n^2) and not O(n). When will the O(n) upper bound fail?
For example, for 3n+2, according to the book
3n+2 <= 5n^2 for n >= 1
but this also holds
3n+2 <= 5n for n >= 1
So why is the upper bound in terms of n^2?
Well I found the relevant part of the book. Indeed the excerpt comes from the chapter introducing big-O notation and relatives.
The formal definition of the big-O is that the function in question does not grow asymptotically faster than the comparison function. It does not say anything about whether the function grows asymptotically slower, so:
f(n) = n is in O(n), O(n^2) and also O(e^n) because n does not grow asymptotically faster than any of these. But n is not in O(1).
Any function in O(n) is also in O(n^2) and O(e^n).
If you want to describe the tight asymptotic bound, you would use the big-Θ notation, which is introduced just before the big-O notation in the book. f(n) ∊ Θ(g(n)) means that f(n) does not grow asymptotically faster than g(n) and the other way around. So f(n) ∊ Θ(g(n)) is equivalent to f(n) ∊ O(g(n)) and g(n) ∊ O(f(n)).
So f(n) = n is in Θ(n) but not in Θ(n^2) or Θ(e^n) or Θ(1).
Another example: f(n) = n^2 + 2 is in O(n^3) but not in Θ(n^3), it is in Θ(n^2).
You need to think of O(...) as a set (which is why the set theoretic "element-of"-symbol is used). O(g(n)) is the set of all functions that do not grow asymptotically faster than g(n), while Θ(g(n)) is the set of functions that neither grow asymptotically faster nor slower than g(n). So a logical consequence is that Θ(g(n)) is a subset of O(g(n)).
Often = is used instead of the ∊ symbol, which really is misleading. It is pure notation and does not share any properties with the actual =. For example 1 = O(1) and 2 = O(1), but not 1 = O(1) = 2. It would be better to avoid using = for the big-O notation. Nonetheless you will later see that the = notation is useful, for example if you want to express the complexity of rest terms, for example: f(n) = 2*n^3 + 1/2*n - sqrt(n) + 3 = 2*n^3 + O(n), meaning that asymptotically the function behaves like 2*n^3 and the neglected part does asymptotically not grow faster than n.
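To make the "rest term" example concrete, here is a small sketch that checks the neglected part of f(n) = 2*n^3 + 1/2*n - sqrt(n) + 3 against a constant multiple of n. The witnesses C = 3 and N0 = 1 and the scan limit are assumptions picked for illustration; many other pairs would work as well.

```python
import math

def remainder(n):
    """Lower-order part of f(n) = 2*n^3 + 1/2*n - sqrt(n) + 3."""
    return 0.5 * n - math.sqrt(n) + 3

C, N0 = 3, 1  # hypothetical witnesses for remainder(n) = O(n)

# Finite scan only: checks |remainder(n)| <= C * n for N0 <= n <= 100000.
ok = all(abs(remainder(n)) <= C * n for n in range(N0, 100_001))
print(ok)  # True
```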
All of this is kind of against the typical usage of big-O notation. You often find the time/memory complexity of an algorithm defined by it, when really it should be defined by big-Θ notation. For example if you have an algorithm in O(n^2) and one in O(n), then the first one could actually still be asymptotically faster, because it might also be in Θ(1). The reason for this may sometimes be that a tight Θ-bound does not exist or is not known for a given algorithm, so at least the big-O gives you a guarantee that things won't take longer than the given bound. By convention you always try to give the lowest known big-O bound, although this is not formally necessary.
The formal definition (from Wikipedia) of the big O notation says that:
f(x) = O(g(x)) as x → ∞
if and only if there is a positive constant M such that for all
sufficiently large values of x, f(x) is at most M multiplied by g(x)
in absolute value. That is, f(x) = O(g(x)) if and only if there exists
a positive real number M and a real number x0 such that
|f(x)| ≤ M|g(x)| for all x > x₀ (meaning for x big enough)
In our case, we can easily show that
|an + b| <= an + |b| < an + n (for n sufficiently big, i.e. when n > |b|, using a > 0)
Then |an + b| < (a+1)|n|
Since a+1 is constant (corresponds to M in the formal definition), definitely
an + b = O(n)
You were right to doubt.
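As a rough check, the constants quoted from the book in the question, c = a + |b| and n0 = max(1, -b/a), can be tried against both n and n^2. The function names `witnesses` and `bounded` and the scan limit are illustrative assumptions, and the scan is finite, so it illustrates the bound but does not prove it.

```python
import math

def witnesses(a, b):
    """Constants from the quoted text: c = a + |b|, n0 = max(1, -b/a)."""
    return a + abs(b), max(1, -b / a)

def bounded(a, b, power, upto=10_000):
    """Check a*n + b <= c * n**power for all integers n0 <= n <= upto."""
    c, n0 = witnesses(a, b)
    start = math.ceil(n0)
    return all(a * n + b <= c * n**power for n in range(start, upto + 1))

print(witnesses(3, 2))   # (5, 1)
print(bounded(3, 2, 1))  # True: 3n + 2 <= 5n
print(bounded(3, 2, 2))  # True: 3n + 2 <= 5n^2
```

Both checks pass with the same constants, which matches the point that an + b is in O(n) and therefore also in O(n^2).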
I have two algorithms.
The complexity of the first one is somewhere between Ω(n^2*(logn)^2) and O(n^3).
The complexity of the second is ω(n*log(logn)).
I know that O(n^3) tells me that it can't be worse than n^3, but I don't know the difference between Ω and ω. Can someone please explain?
Big-O: The asymptotic worst case performance of an algorithm. The function n happens to be the lowest valued function that will always have a higher value than the actual running of the algorithm. [constant factors are ignored because they are meaningless as n reaches infinity]
Big-Ω: The opposite of Big-O. The asymptotic best case performance of an algorithm. The function n happens to be the highest valued function that will always have a lower value than the actual running of the algorithm. [constant factors are ignored because they are meaningless as n reaches infinity]
Big-Θ: The algorithm is so nicely behaved that some function n can describe both the algorithm's upper and lower bounds within the range defined by some constant value c. An algorithm could then have something like this: BigTheta(n), O(c1n), BigOmega(-c2n) where n == n throughout.
Little-o: Is like Big-O but sloppy. Big-O and the actual algorithm performance will actually become nearly identical as you head out to infinity. little-o is just some function that will always be bigger than the actual performance. Example: o(n^7) is a valid little-o for a function that might actually have linear or O(n) performance.
Little-ω: Is just the opposite. ω(1) [constant time] would be a valid little omega for the same above function that might actually exhibit BigOmega(n) performance.
Big omega (Ω) lower bound:
A function f is an element of the set Ω(g) (which is often written as f(n) = Ω(g(n))) if and only if there exists c > 0, and there exists n0 > 0 (probably depending on the c), such that for every n >= n0 the following inequality is true:
f(n) >= c * g(n)
Little omega (ω) lower bound:
A function f is an element of the set ω(g) (which is often written as f(n) = ω(g(n))) if and only if for each c > 0 we can find n0 > 0 (depending on the c), such that for every n >= n0 the following inequality is true:
f(n) >= c * g(n)
You can see that it's actually the same inequality in both cases, the difference is only in how we define or choose the constant c. This slight difference means that the ω(...) is conceptually similar to the little o(...). Even more - if f(n) = ω(g(n)), then g(n) = o(f(n)) and vice versa.
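The difference between "there exists c" and "for each c" can be seen in a small experiment. This is a sketch under assumptions: the helper `find_n0`, the sample functions and the finite `limit` are all illustrative, and a bounded scan can only suggest, not prove, an asymptotic statement.

```python
def find_n0(f, g, c, limit=100_000):
    """Smallest n0 >= 1 with f(n) >= c * g(n) for all n0 <= n <= limit.

    Returns None if the inequality already fails at n = limit.
    """
    n0 = None
    for n in range(limit, 0, -1):  # scan downward from the limit
        if f(n) >= c * g(n):
            n0 = n
        else:
            break
    return n0

def square(n):
    return n * n

def double(n):
    return 2 * n

def linear(n):
    return n

# n^2 is in ω(n): an n0 exists for every c we try.
print([find_n0(square, linear, c) for c in (1, 10, 1000)])  # [1, 10, 1000]

# 2n is in Ω(n) (c = 2 works) but not in ω(n) (c = 3 never works).
print(find_n0(double, linear, 2))  # 1
print(find_n0(double, linear, 3))  # None
```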
Returning to your two algorithms - the algorithm #1 is bounded from both sides, so it looks more promising to me. The algorithm #2 can work longer than c * n * log(log(n)) for any (arbitrarily large) c, so it might eventually lose to the algorithm #1 for some n. Remember, it's only asymptotic analysis - so all depends on actual values of these constants and the problem size which has some practical meaning.
My question refers to the big-Oh notation in algorithm analysis. While Big-Oh seems to be a math question, it's very useful in algorithm analysis.
Suppose two functions are defined below:
f(n) = 2^n when n is even
f(n) = n when n is odd
g(n) = n when n is even
g(n) = 2^n when n is odd.
For the above two functions, which one is big-Oh of the other? Or is neither function Big-Oh of the other?
Thanks!
In this case,
f ∉ O(g), and
g ∉ O(f).
This is because no matter what constants N and k you pick,
there exists i ≥ N such that f(i) > k g(i), and
there exists j ≥ N such that g(j) > k f(j).
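The two existence claims above can be illustrated by searching for the witnesses i and j directly. This is a minimal sketch; the helper name `witness` is an assumption, and the loop is only guaranteed to terminate for functions like these two, where a witness always exists.

```python
def f(n):
    return 2**n if n % 2 == 0 else n

def g(n):
    return n if n % 2 == 0 else 2**n

def witness(big, small, N, k):
    """Return the first i >= N with big(i) > k * small(i).

    Loops forever if no such i exists, so use only where one is known to.
    """
    i = N
    while big(i) <= k * small(i):
        i += 1
    return i

N, k = 10, 1000
print(witness(f, g, N, k))  # 14: f(14) = 16384 > 1000 * 14
print(witness(g, f, N, k))  # 15: g(15) = 32768 > 1000 * 15
```

Raising N or k only pushes the witnesses further out; it never makes them disappear, which is why neither function is big-Oh of the other.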
The Big-Oh relationship is quite specific: beyond some finite n, one function must always stay below a constant multiple of the other.
Is this true here? If so, give such an n. If not, you should prove it.
Usually Big-O and Big-Theta notations get confused.
A layman's attempt at a definition could be that f being in O(g) means that g grows as fast as or faster than f, i.e. that given a large enough n, f(n)<=k*g(n) where k is some constant. That means that if f(x) = 2x^3, then it is in O(x^3), O(x^4), O(2^x), O(x!) etc.
Big-Theta means that one function is growing as fast as another one, with neither one being able to "outgrow" the other, or, k1*g(n)<=f(n)<=k2*g(n) for some k1 and k2. In programming terms that means that these two functions have the same level of complexity. If f(x) = 2x^3, then it is in Θ(x^3), as for example, if k1=1, and k2=3, 1*x^3 < 2*x^3 < 3*x^3
In my experience whenever programmers are talking about Big-O, the discussion is actually about Big-Θ, as we are concerned more with the as fast as part than with the no faster than part.
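That said, if two functions with different Θ's are combined, as in your example, the larger one, Θ(2^n), swallows the smaller, Θ(n), when you look for a single upper bound, so both f and g are in O(2^n). Neither has a Θ(2^n) bound though, because each drops back to n on every other input. Sharing an upper bound is also not enough to bound them by each other: on even n, f(n)/g(n) = 2^n/n is unbounded, and on odd n the same holds for g(n)/f(n). In this case, neither of the following holds
f(n) = O(g(n)), nor f(n) = Θ(g(n))
g(n) = O(f(n)), nor g(n) = Θ(f(n))
so, although they share the upper bound O(2^n), they are not O or Θ bound by each other.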
I have received the assignment to prove 1/O(n) = Ω(n)
However, this would mean that n element of O(n) => 1/n element of Ω(n) which is clearly wrong.
So my question is: Is the statement 1/O(n) = Ω(n) correct?
EDIT: I sent an email to the assistant who wrote the questions, using the example f(n) = 1.
He then said that the statement is indeed incorrect.
The notation 1/O(n) = Ω(n) is a bit vague. There is really no O(n) on its own, there is only f(n) ~ O(n), which is a statement about the values of the function f (there is a constant C so that f(n) < Cn for each sufficiently large n).
And the statement to prove, if I understand it correctly, is "if function f(n) is O(n) then 1/f(n) is Ω(n)", formally:
f(n) ~ O(n) => 1/f(n) ~ Ω(n)
Edit: Except I don't think I understand it correctly, because f(n) = 1 ~ O(n), but 1/f(n) = f(n) = 1 clearly isn't Ω(n). Wasn't the assignment f(n) ~ O(n) => 1/f(n) ~ Ω(1/n) instead?
Edit: Different people tend to use different operators. Most common is f(n) = O(n), but that is confusing because the right hand side is not a function, so it can't be normal equality. We usually used f(n) ~ O(n) in school, which is less confusing, but still inconsistent with common use of that operator for general equivalence relations. Most consistent operator would be f(n) ∈ O(n), because the right hand side can reasonably be treated as set of functions.
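For what it's worth, the alternative statement suggested in the first edit above does follow from the definitions in one step. The sketch below assumes f(n) > 0 for all n > N (an assumption not stated in the assignment) and uses the set notation from the previous paragraph:

```latex
f(n) \le C\,n \ \text{ and } \ f(n) > 0 \quad \text{for all } n > N
\;\Longrightarrow\;
\frac{1}{f(n)} \ge \frac{1}{C}\cdot\frac{1}{n} \quad \text{for all } n > N
\;\Longrightarrow\;
\frac{1}{f(n)} \in \Omega\!\left(\frac{1}{n}\right).
```

Taking reciprocals of positive quantities reverses the inequality, which is why the bound lands on 1/n and not on n.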
O(n) more or less implies the following, for some polynomial function f(x), some polynomial function g(x) and O(f(x)):
In terms of magnitude, we have |f(x)| <= M|g(x)|, for some M. Basically, f is bounded above by a constant times g.
Ω(n) implies that, for some polynomial h(x), some polynomial k(x) and Ω(h(x)):
In terms of magnitude, |h(x)| >= M|k(x)|, for some M. Basically, h is bounded below by a constant times k.
So, for (O(g(x)))^-1, take reciprocals of |f(x)| <= M|g(x)| (assuming neither side is zero); this flips the inequality: 1/|f(x)| >= 1/(M|g(x)|).
So 1/f(x) is bounded below by a constant times 1/g(x), which matches our definition of Ω above with 1/g in place of g. That gives 1/O(n) = Ω(1/n), not Ω(n).
It's a tad scratchy to be a formal proof, but it should get you started.