I understand how to calculate a function's complexity for the most part. The same goes for determining the order of growth for a mathematical function. [I probably don't understand it as much as I think I do, which is why I'm probably asking this.] For instance:
an^3 + bn^2 + cn + d could be written as O(n^3) in big-O notation, since for large enough n the values of the terms bn^2 + cn + d are insignificant in comparison to an^3 (the constant coefficients a, b, c and d are left out as well, since their contribution to the value becomes insignificant, too).
What I don't understand is, how does this work when the leading term is involved in some sort of division? For instance:
a/n^3 + bn^2 or n^3/a + bn^2
Let n=100, a=1000 and b=10 for the latter formula, then we have
n^3/a = 100^3/1000 = 1000 and bn^2 = 10*100^2 = 100,000
or, even more dramatically for the former: in this case the leading term is not only growing slowly as above, it's actually shrinking, isn't it?
a/n^3 = 1000/100^3 = 0.001 and bn^2 = 100,000 as above.
In both cases the second term contributes much more, so isn't it n^2 that actually determines the order of growth?
It gets even more complicated (for me, at least) when the leading term is followed by a subtraction (a/n^3 - bn^2) or when the second term is also a division (n^3/a + n^2/b) or when both are divisions but in mixed order (a/n^3 + n^2/b), etc.
The list seems endless, so my general question is, how to understand and handle formulas that involve division (and subtraction) in order to determine the order of growth for a given function?
A division is just a multiplication by the multiplicative inverse, so n^3/a == n^3 * a^-1, and you can handle it the same way as any other coefficient.
With regard to subtraction: a*n^3 - b*n^2 <= a*n^3, so it is also in O(n^3). Also, a*n^3 - b*n^2 >= a/2 * n^3 for large enough values of n, so it is also in Omega(n^3). A more detailed explanation about subtraction can be found in: Algorithm complexity when faced with substraction in value
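To see the subtraction bound concretely, here is a small Python sketch (the values a = 3 and b = 100 are arbitrary illustrative choices, not from the question):

    # Sketch: a*n^3 - b*n^2 is sandwiched between (a/2)*n^3 and a*n^3
    # once n is large enough (here once n >= 2*b/a).
    a, b = 3, 100

    def f(n):
        return a * n**3 - b * n**2

    for n in [10, 100, 1000, 10000]:
        print(n, (a / 2) * n**3 <= f(n) <= a * n**3)
    # Prints False at n = 10 and True from n = 100 onwards,
    # matching the "for large enough values of n" caveat.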
Big-O notation is generally used for increasing (not necessarily monotonically) functions, and a decreasing function such as a/n is not a good fit for it, though O(1/n) still seems to be perfectly well defined, AFAIK, and it is a subset of O(1) (unless you consider only discrete functions). However, this has very little value for algorithm analysis, as the complexity of an algorithm cannot really shrink.
There's a very simple rule for the type of questions you posted.
Suppose you're trying to find the order of growth of f(n), and you find some simple function g(n) such that
lim {n -> inf} f(n) / g(n) = k
where k is a positive finite constant. Then
f(n) = Theta(g(n))
(It's easy to see this from the calculus definitions.)
Now let's see how this applies to your examples:
lim {n -> inf} (a/n^3 + bn^2) / n^2 = b
so it's Theta(n^2).
lim {n -> inf} (a n^3 - bn^2) / n^3 = a
so it's Theta(n^3).
(of course, assuming a and b are positive.)
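As a quick numerical illustration of this rule (just a sketch; it reuses a = 1000 and b = 10 from the question), the ratios visibly settle towards the constants b and a:

    # Sketch: f(n)/g(n) tends to a finite positive constant, so f(n) = Theta(g(n)).
    a, b = 1000, 10

    for n in [10, 100, 1000, 10000]:
        print(n, (a / n**3 + b * n**2) / n**2,   # tends to b = 10
                 (a * n**3 - b * n**2) / n**3)   # tends to a = 1000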
Related
Are these two equal? I read somewhere that O(2lg n) = O(n). Going by this observation, I'm guessing the answer would be no, but I'm not entirely sure. I'd appreciate any help.
Firstly, O(2log(n)) isn't equal to O(n).
To use big O notation, you would find a function that represents the complexity of your algorithm, then you would find the term in that function with the largest growth rate. Finally, you would eliminate any constant factors you could.
e.g. say your algorithm iterates 4n^2 + 5n + 1 times, where n is the size of the input. First, you would take the term with the highest growth rate, in this case 4n^2, then remove any constant factors, leaving O(n^2) complexity.
In your example, O(2log(n)) can be simplified to O(log(n))
Now on to your question.
In computer science, unless specified otherwise, you can generally assume that log(n) actually means the log of n, base 2.
This means, using log laws, that 2^log(n) simplifies to n, and hence O(2^log(n)) = O(n)
Proof:
y = 2^log(n)
log(y) = log(2^log(n))
log(y) = log(n) * log(2) [Log(2) = 1 since we are talking about base 2 here]
log(y) = log(n)
y = n
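If you want to sanity-check the identity numerically, here is a throwaway Python sketch (not part of the proof above):

    import math

    # Sketch: 2**log2(n) gives back n (up to floating-point rounding),
    # so 2^log(n) grows linearly, not logarithmically.
    for n in [4, 100, 1024, 10**6]:
        print(n, 2 ** math.log2(n))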
In big-O notation, we are told that we can ignore constant factors in most cases. That is, rather than writing
3n^2-100n+6
we are almost always satisfied with
n^2
since that term is the fastest growing term in the equation.
But I found that many algorithm courses start by comparing functions with many terms:
2n^2+120n+5 = big O of n^2
and then finding c and n0 for those long functions, before finally recommending that we ignore the low-order terms.
My question is: what would I get from trying to understand and analyse these kinds of functions with many terms? Until now I have been comfortable understanding what O(1), O(n), O(log n), O(n^3) mean. But am I missing some important concepts if I just rely on these typically used functions? What will I miss if I skip analysing those long functions?
Let's first of all describe what we mean when we say that f(n) is in O(g(n)):
... we can say that f(n) is O(g(n)) if we can find a constant c such that f(n) is less than c·g(n) for all n larger than n0, i.e., for all n > n0.
In equation form: we need to find one set of constants (c, n0) that fulfils
f(n) < c · g(n), for all n > n0, (+)
Now, the result that f(n) is in O(g(n)) is sometimes presented in different forms, e.g. as f(n) = O(g(n)) or f(n) ∈ O(g(n)), but the statement is the same. Hence, from your question, the statement 2n^2+120n+5 = big O of n^2 is just:
f(n) = 2n^2 + 120n + 5
a result after some analysis: f(n) is in O(g(n)), where
g(n) = n^2
OK, with this out of the way, let's look at the constant term in the functions we want to analyse asymptotically, using your example for illustration.
As the result of any big-O analysis is the asymptotic behaviour of a function, in all but some very unusual cases the constant term has no effect whatsoever on this behaviour. The constant term can, however, affect how we choose the constant pair (c, n0) used to show that f(n) is in O(g(n)) for some functions f(n) and g(n), i.e., the non-unique constant pair (c, n0) used to show that (+) holds. We can say that the constant term will have no effect on the result of the analysis, but it can affect our derivation of this result.
Let's look at your function as well as another, related function
f(n) = 2n^2 + 120n + 5 (x)
h(n) = 2n^2 + 120n + 22500 (xx)
Using a similar approach as in this thread, for f(n), we can show:
linear term:
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (i)
constant term:
5 < n^2 for e.g. n > 3 (verify: 3^2 = 9 > 5) (ii)
This means that if we replace both 120n as well as 5 in (x) by n^2 we can state the following inequality result:
Given that n > 120, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (i) and (ii)} > 2n^2 + 120n + 5 = f(n) (iii)
From (iii), we can choose (c, n0) = (4, 120), and (iii) then shows that these constants fulfil (+) for f(n) with g(n) = n^2, and hence
result: f(n) is in O(n^2)
Now, for h(n), we analogously have:
linear term (same as for f(n))
120n < n^2 for n > 120 (verify: 120n = n^2 at n = 120) (I)
constant term:
22500 < n^2 for e.g. n > 150 (verify: 150^2 = 22500) (II)
In this case, we replace 120n as well as 22500 in (xx) by n^2, but we need a larger lower bound on n for these replacements to hold, namely n > 150. Hence, the following holds:
Given that n > 150, we have:
2n^2 + n^2 + n^2 = 4n^2 > {by (I) and (II)} > 2n^2 + 120n + 22500 = h(n) (III)
In the same way as for f(n), we can here choose (c, n0) = (4, 150), and (III) then shows that these constants fulfil (+) for h(n) with g(n) = n^2, and hence
result: h(n) is in O(n^2)
Hence, we have the same result for both functions f(n) and h(n), but we had to use different constants (c, n0) to show it (i.e., a somewhat different derivation). Note finally that:
Naturally the constants (c,n0) = (4,150) (used for h(n) analysis) are also valid to show that f(n) is in O(n^2), i.e., that (+) holds for f(n) with g(n)=n^2.
However, not the reverse: (c,n0) = (4,120) cannot be used to show that (+) holds for h(n) (with g(n)=n^2).
The core of this discussion is that:
As long as you look at sufficiently large values of n, you will be able to bound the constant terms by relations of the form constant < dominantTerm(n), where, in our example, the dominant term is n^2.
The asymptotic behaviour of a function will not (in all but some very unusual cases) depend on the constant terms, so we might as well skip looking at them at all. However, for a rigorous proof of the asymptotic behaviour of some function, we need to take into account also the constant terms.
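If it helps, the two derivations above can be checked mechanically; here is a small Python sketch (my own addition, reusing the constants from the derivations and testing only a finite range of n):

    # Sketch: 4*n^2 bounds f(n) for all n > 120, and h(n) for all n > 150,
    # while (c, n0) = (4, 120) is not enough for h(n).
    f = lambda n: 2 * n**2 + 120 * n + 5
    h = lambda n: 2 * n**2 + 120 * n + 22500

    print(all(f(n) < 4 * n**2 for n in range(121, 10000)))   # True:  (4, 120) works for f
    print(all(h(n) < 4 * n**2 for n in range(151, 10000)))   # True:  (4, 150) works for h
    print(all(h(n) < 4 * n**2 for n in range(121, 10000)))   # False: (4, 120) fails for h

Of course, the finite ranges only illustrate the bounds; the inequalities (iii) and (III) are what prove them for all n.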
Ever have intermediate steps in your work? That is likely what this is: when you are computing a big-O, chances are you don't already know for sure what the highest-order term is, so you keep track of them all and then determine which complexity class makes sense in the end. There is also something to be said for understanding why the lower-order terms can be ignored.
Take some graph algorithms like minimum spanning tree or shortest path. Now, just by looking at the algorithm, can you tell what the highest-order term will be? I know I couldn't, so I'd trace through the algorithm and collect a bunch of terms.
If you want another example, consider sorting algorithms and whether you want to memorize all the complexities or not. Bubble Sort, Shell Sort, Merge Sort, Quick Sort, Radix Sort and Heap Sort are a few of the more common algorithms out there. You could either memorize both the algorithm and its complexity, or just the algorithm and derive the complexity from the pseudocode if you know how to trace them.
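As an illustration of deriving a complexity from code rather than memorising it (a sketch of my own, using bubble sort since it is the simplest of that list), counting the comparisons in the nested loops makes the O(n^2) visible:

    # Sketch: count comparisons in bubble sort; the nested loops do
    # (n-1) + (n-2) + ... + 1 = n*(n-1)/2 comparisons, which is O(n^2).
    def bubble_sort(items):
        items = list(items)
        comparisons = 0
        for i in range(len(items) - 1):
            for j in range(len(items) - 1 - i):
                comparisons += 1
                if items[j] > items[j + 1]:
                    items[j], items[j + 1] = items[j + 1], items[j]
        return items, comparisons

    _, count = bubble_sort(range(100))
    print(count)   # 4950 = 100*99/2, i.e. on the order of n^2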
The informal idea of Big-O is described as "it's the highest order of growth of a function", i.e. f(n) = 3n^2 + 5n + 50 is just O(n^2).
I do understand that Big-O is just a way of saying "guaranteed not to be worse than this, period". Formally, it appears the definition is: f(n) is O(g(n)) iff f(n) <= c * g(n), where c is positive.
First some mathy stuff: if f(n) = 5n^2 and g(n) = n, I should be able to show 5n^2 isn't O(g(n)) by doing
5n^2 <= cn
5n <= c
If the idea is that c isn't a constant (I have no idea if that's a requirement), and that this is proof that f(n) isn't in O(g(n)), what about if g(n) were n^3 (in which it surely should be contained)?
5n^2 <= cn^3
5/n <= c
I assume I have some misunderstanding of how the math works out for all of this, so I ask:
How does all this fancy stuff work?
How does it connect to the simple definition given in my data structures class?
Thanks for any help
n is a positive integer, which means that 1<=n and therefore 5/n<=5/1=5, so you can pick c=5.
A more complete definition also allows you to pick n0 and a, both constants, and only prove that f(n) <= a + c*g(n) for all n > n0.
c is a constant (i.e. independent of n)
In your first example (it's a proof by contradiction):
i.e. assume
5n^2 <= cn
5n <= c
But for any fixed constant c, we can find a value of n that makes it untrue.
For example, pick c = 1000000; then a value of n = 200001 gives a contradiction.
In your second example, we know that f(n) is O(n^2), and therefore it is also O(n^3) and above. If you are bounded by k*n^2, you are also bounded by j*n^3.
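A rough numerical way to see the difference (only a sketch; the ranges tested are arbitrary):

    # Sketch: for any fixed c, 5n^2 <= c*n breaks as soon as n > c/5,
    # while 5n^2 <= 5*n^3 holds for every n >= 1.
    for c in [10, 1000, 1_000_000]:
        n = c                                # large enough to break the bound
        print(c, 5 * n**2 <= c * n)          # always False
    print(all(5 * n**2 <= 5 * n**3 for n in range(1, 10_000)))   # True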
The informal idea of Big-O is described as "it's the highest order of growth of a function", i.e. f(n) = 3n^2 + 5n + 50 is just O(n^2).
I wouldn't say that this is the idea behind Big-O. Informally, Big-O is a rough estimate of what a given function cannot exceed, and its usage is mostly for approximating how something will grow for big inputs.
For example, if we take a 6-digit number, we can definitely say that it's less than a million without looking at its digits. There are a lot of cases when this is enough and we don't need to analyse all the digits.
For analysing function growth, two factors play a role:
we are only interested in a function's behaviour for very large numbers
if f is bigger than g, but we can fix that by multiplying g by some big constant, that means f's advantage is not due to growth
This leads us to two parts of the definition: (1) some constant and (2) for big enough n
And for polynomials, indeed, the highest-order term determines the growth speed.
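A small sketch of the second point (the functions here are my own illustrative choices): f(n) = 100n is always bigger than g(n) = n, but a constant factor fixes that, whereas no constant fixes n^2 against n:

    # Sketch: a constant-factor gap can be "fixed" by scaling g,
    # a genuine growth gap cannot.
    g = lambda n: n
    f = lambda n: 100 * n      # bigger than g, but only by a constant factor
    h = lambda n: n**2         # genuinely faster-growing

    print(all(f(n) <= 100 * g(n) for n in range(1, 10_000)))        # True: c = 100 works
    c = 1_000_000
    print(all(h(n) <= c * g(n) for n in range(1, 2_000_000)))       # False: fails once n > c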
I am new to the algorithm analysis domain. I read here in the Stack Overflow question
"What is a plain English explanation of "Big O" notation?" that O(2n^2) and O(100 n^2) are the same as O(n^2). I don't understand this, because if we take n = 4, the number of operations will be:
O(2 n^2) = 32 operations
O(100 n^2) = 1600 operations
O(n^2) = 16 operations
Can anyone explain why we are supposed to treat these different operation counts as equivalent?
Why this is true can be derived directly from the formal definition. More specifically, f(x) = O(g(x)) if and only if |f(x)| <= M|g(x)| for all x >= x0, for some M and x0. Here you're free to pick M as you wish, so if M = 5 works for f(x) = O(n^2) to be true, then you can just pick M = 5*100 for f(x) = O(100 n^2) to be true.
Why this is useful is a bit of a different story.
Some concerns with having constants matter:
What operations are we measuring? Array accesses? Arithmetic operations? Multiplications only? Arithmetic with multiplication weighted twice as much as addition? You may want to compare algorithms (that have the same big-O complexity) using such a metric, when in fact there can be some subtle difference in the number of operations that even the most experienced computer scientists can miss.
Let's say you can assign a reasonable weight to each operation. Now there has to be across-the-board agreement on this, otherwise you'll have some near-meaningless analyses of algorithms done by people using different weights (beyond what big-O would have told you anyway).
The weights may be time-bound, as the speed of operations improve with time, and some operations may improve faster than others.
The weights may be environment-bound, as the speed of operations can differ on different environments. For example, disk read is a lot slower than memory read.
Big-O (which is part of asymptotic complexity) avoids all of these issues. You only check how many times some piece of code that takes a constant amount of time (i.e. independent of input size) is executed. As an example:
# input is a list of n numbers
c = 0
for i in range(n):
    for j in range(n):
        for k in range(n):
            x = input[i] * input[j]
            y = input[j] * input[k]
            z = input[i] * input[k]
            c += (x - y) * z
So there are 4 multiplications, 1 subtraction and 1 addition, each executed n^3 times, but here we just say that this code:
x = input[i]*input[j]
y = input[j]*input[k]
z = input[i]*input[k]
c += (x-y)*z
runs in constant time (it will always take the same amount of time, regardless of how many elements there are in the array) and will be executed O(n^3) times, thus the running time is O(n^3).
Because O(f(n)) means that the said function is bounded by some constant times f(n). If g is bounded by a multiple of 100 f(n), it is also bounded by a multiple of f(n). Specifically, O(2 n^2) does not mean it's not greater than 16 for n = 4, but that for all n it's not greater than C * 2n^2 for some fixed C, independent of n.
Because it is a classification: it places algorithms in some complexity class. The classes are O(1), O(n), O(n log n), O(n^2), O(n^3), O(n^n), etc. By definition, two algorithms are in the same complexity class if the difference between them is a constant factor when n goes to infinity (the big-O notation is for comparing algorithmic complexity for large values of n).
Suppose I have the following: T(n) = 5n^2 + 2n
The asymptotic tight bound of this is Theta(n^2). I want to understand the reason behind dropping the 5. I understand why we ignore the lower-order terms.
Consult the definition of big-O.
Keeping things simple[*], let's define that a function g is O(f) if there exist constants C and M such that for n > M, 0 <= g(n) < Cf(n).
The presence of a positive constant multiplier in f doesn't affect this; just choose C appropriately. Your example T is O(n^2) by choosing a value of C greater than 5, and a value of M big enough that the +2n is irrelevant. For example, for n > 2 it's a fact that 5n^2 + 2n < 6n^2 (because n^2 > 2n), so with C = 6 and M = 2 we see that T(n) is O(n^2).
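A quick numerical sanity check of that choice of constants (a throwaway sketch, testing only a finite range):

    # Sketch: with C = 6 and M = 2, the bound 5n^2 + 2n < 6*n^2 holds for all n > 2.
    print(all(5 * n**2 + 2 * n < 6 * n**2 for n in range(3, 100_000)))   # True
    print(5 * 2**2 + 2 * 2 < 6 * 2**2)                                   # False: n = 2 is exactly the boundary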
So it's true that T(n) is O(n^2), and also true that it's O(5n^2), and O(5n^2 + 2n). The most interesting of those facts is that it's O(n^2), since it's the simplest expression and the other two are logically equivalent. If we want to compare the complexities of different functions, then we want to use simple expressions.
For big-Theta just note that we can play the same trick when f and g are the other way around. The relation "g is Theta(f)" is an equivalence relation, so what are we going to choose as the representative member of the equivalence class of T? The simplest one.
[*] Keeping things less simple, we cope with negative numbers by using a limsup rather than a plain limit. My definition above is actually sufficient but not necessary.
Everything comes back to the concept of "Order of Magnitude". Given something like
5n^2 +2n
You think that the 5 is significant; however, when you break it down and think about the numbers, the constant really doesn't matter (graph it and you will see why). For example, say n is 50.
5 * 50^2 + 2 * 50 --> 5 * 2500 + 2 * 50 --> 12,600
As you mentioned, the 2*n is insignificant when compared to n^2. This same concept applies when viewing the constant as well... you might think that 2,500 vs 12,500 is a big difference; however, consider if the algorithm were n^3... you are now looking at 12,600 vs 625,100.
So the factor that makes the most significant difference to the cost of an algorithm is the n^2 growth itself, not the constant.
The constant is dropped because it does not affect which complexity class the function belongs to.
If you have two functions f(n) = c1 * n^2 and g(n) = c2 * n^3, where c1 and c2 are constants, it doesn't matter how large c1 is and how small c2 is, g(n) will ALWAYS overtake and outgrow f(n) at some value of n.
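As a closing sketch (with deliberately lopsided constants of my own choosing), you can even locate the crossover point:

    # Sketch: a huge constant on n^2 and a tiny constant on n^3 only delay
    # the crossover; they cannot prevent it.
    c1, c2 = 1_000_000, 0.001
    f = lambda n: c1 * n**2
    g = lambda n: c2 * n**3

    n = 1
    while g(n) <= f(n):
        n *= 2             # doubling search for the crossover
    print(n)               # 1073741824: the first power of two past c1/c2 = 1e9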