I am reading an example where the following are equivalent to O(N):
O(N + P), where P < N/2
O(N + log N)
Can someone explain in laymen terms how it is that the two examples above are the same thing as O(N)?
We always take the greater one in case of addition.
In both the cases N is bigger than the other part.
In first case P < N/2 < N
In second case log N < N
Hence the complexity is O(N) in both the cases.
Let f and g be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x -> infinite
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M |g(x)| for all x > x0
So in your case 1:
f(N) = N + P <= N + N/2
We could set M = 2 Then:
|f(N)| <= 3/2|N| <= 2|N| (N0 could any number)
So:
N+p = O(N)
In your second case, we could also set M=2 and N0=1 to satify that:
|N + logN| <= 2 |N| for N > 1
Big O notation usually only provides an upper bound on the growth rate of the function, wiki. Meaning for your both cases, as P < N and logN < N. So that O(N + P) = O(2N) = O(N), The same to O(N + log N) = O(2N) = O(N). Hope that can answer your question.
For the sake of understanding you can assume that O(n) represents that the complexity is of the order of n and also that O notation represents the upper bound(or the complexity in worst case). So, when I say that O(n+p) it represents that the order of n+p.
Let's assume that in worst case p = n/2, then what would be order of n+n/2? It would still be O(n), that is, linear because constants do form a part of the Big-O notation.
Similary, for O(n+logn) because logn can never be greater than n. So, overall complexity turns out to be linear.
In short
If N is a function and C is a constant:
O(N+N/2):
If C=2, then for any N>1 :
(C=2)*N > N+N/2,
2*N>3*N/2,
2> 3/2 (true)
O(N+logN):
If C=2, then for any N>2 :
(C=2)*N > N+logN,
2*N > N+logN,
2>(N+logN)/N,
2> 1 + logN/N (limit logN/N is 0),
2>1+0 (true)
Counterexample O(N^2):
No C exists such that C*N > N^2 :
C > N^2/N,
C>N (contradiction).
Boring mathematical part
I think the source of confusion is that equals sign in O(f(x))=O(N) does not mean equality! Usually if x=y then y=x. However consider O(x)=O(x^2) which is true, but reverse is false: O(x^2) != O(x)!
O(f(x)) is an upper bound of how fast a function is growing.
Upper bound is not an exact value.
If g(x)=x is an upper bound for some function f(x), then function 2*g(x) (and in general anything growing faster than g(x)) is also an upper bound for f(x).
The formal definition is: for function f(x) to be bound by some other function g(x) if you chose any constant C then, starting from some x_0 g(x) is always greater than f(x).
f(x)=N+N/2 is the same as 3*N/2=1.5*N. If we take g(x)=N and our constant C=2 then 2*g(x)=2*N is growing faster than 1.5*N:
If C=2 and x_0=1 then for any n>(x_0=1) 2*N > 1.5*N.
same applies to N+log(N):
C*N>N+log(N)
C>(N+logN)/N
C>1+log(N)/N
...take n_0=2
C>1+1/2
C>3/2=1.5
use C=2: 2*N>N+log(N) for any N>(n_0=2),
e.g.
2*3>3+log(3), 6>3+1.58=4.68
...
2*100>100+log(100), 200>100+6.64
...
Now interesting part is: no such constant exist for N & N^2. E.g. N squared grows faster than N:
C*N > N^2
C > N^2/N
C > N
obviously no single constant exists which is greater than a variable. Imagine such a constant exists C=C_0. Then starting from N=C_0+1 function N is greater than constant C, therefore such constant does not exist.
Why is this useful in computer science?
In most cases calculating exact algorithm time or space does not make sense as it would depend on hardware speed, language overhead, algorithm implementation details and many other factors.
Big O notation provides means to estimate which algorithm is better independently from real world complications. It's easy to see that O(N) is better than O(N^2) starting from some n_0 no matter which constants are there in front of two functions.
Another benefit is ability to estimate algorithm complexity by just glancing at program and using Big O properties:
for x in range(N):
sub-calc with O(C)
has complexity of O(N) and
for x in range(N):
sub-calc with O(C_0)
sub-calc with O(C_1)
still has complexity of O(N) because of "multiplication by constant rule".
for x in range(N):
sub-calc with O(N)
has complexity of O(N*N)=O(N^2) by "product rule".
for x in range(N):
sub-calc with O(C_0)
for y in range(N):
sub-calc with O(C_1)
has complexity of O(N+N)=O(2*N)=O(N) by "definition (just take C=2*C_original)".
for x in range(N):
sub-calc with O(C)
for x in range(N):
sub-calc with O(N)
has complexity of O(N^2) because "the fastest growing term determines O(f(x)) if f(x) is a sum of other functions" (see explanation in the mathematical section).
Final words
There is much more to Big-O than I can write here! For example in some real world applications and algorithms beneficial n_0 might be so big that an algorithm with worse complexity works faster on real data.
CPU cache might introduce unexpected hidden factor into otherwise asymptotically good algorithm.
Etc...
Related
Can anybody explain which one of them has highest asymptotic complexity and why,
10000000n vs 1.000001^n vs n^2
You can use standard domination rules from asymptotic analysis.
Domination rules tell you that when n -> +Inf, n = o(n^2). (Note the difference between the notations O(.) and o(.), the latter meaning f(n) = o(g(n)) iff there exists a sequence e(n) which converges to 0 as n -> +Inf such that f(n) = e(n)g(n). With f(n) = n, g(n) = n^2, you can see that f(n)/g(n) = 1/n -> 0 as n -> +Inf.)
Furthermore, you know that for any integer k and real x > 1, we have n^k/x^n -> 0 as n -> +Inf. x^n (exponential) complexity dominates n^k (polynomial) complexity.
Therefore, in order of increasing complexity, you have:
n << n^2 << 1.000001^n
Note:10000000n could be written O(n) with the loose written conventions used for asymptotic analysis in computer science. Recall that the complexity C(n) of an algorithm is O(n) (C(n) = O(n)) if and only if (iff) there exists an integer p >= 0 and K >= 0 such that for all n >= p the relation |C(n)| <= K.n holds.
When calculating asymptotic time complexity, you need to ignore all coefficients of n and just focus on its exponent.
The higher the exponent, the higher the time complexity.
In this case
We ignore the coefficients of n, leaving n^2, x^n and n.
However, we ignore the second one as it has an exponent of n. As n^2 is higher than n, the answer to your question is n^2.
Why is ω(n) smaller than O(n)?
I know what is little omega (for example, n = ω(log n)), but I can't understand why ω(n) is smaller than O(n).
Big Oh 'O' is an upper bound and little omega 'ω' is a Tight lower bound.
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n ≥ n0}
ω(g(n)) = { f(n): for all constants c > 0, there exists a constant n0 such that 0 ≤ cg(n) < f(n) for all n ≥ n0}.
ALSO: infinity = lim f(n)/g(n)
n ∈ O(n) and n ∉ ω(n).
Alternatively:
n ∈ ω(log(n)) and n ∉ O(log(n))
ω(n) and O(n) are at the opposite side of the spectrum, as is illustrated below.
Formally,
For more details, see CSc 345 — Analysis of Discrete Structures
(McCann), which is the source of the graph above. It also contains a compact representation of the definitions, which makes them easy to remember:
I can't comment, so first of all let me say that n ≠ Θ(log(n)). Big Theta means that for some positive constants c1, c2, and k, for all values of n greater than k, c1*log(n) ≤ n ≤ c2*log(n), which is not true. As n approaches infinity, it will always be larger than log(n), no matter log(n)'s coefficient.
jesse34212 was correct in saying that n = ω(log(n)). n = ω(log(n)) means that n ≠ Θ(log(n)) AND n = Ω(log(n)). In other words, little or small omega is a loose lower bound, whereas big omega can be loose or tight.
Big O notation signifies a loose or tight upper bound. For instance, 12n = O(n) (tight upper bound, because it's as precise as you can get), and 12n = O(n^2) (loose upper bound, because you could be more precise).
12n ≠ ω(n) because n is a tight bound on 12n, and ω only applies to loose bounds. That's why 12n = ω(log(n)), or even 12n = ω(1). I keep using 12n, but that value of the constant does not affect the equality.
Technically, O(n) is a set of all functions that grow asymptotically equal to or slower than n, and the belongs character is most appropriate, but most people use "= O(n)" (instead of "∈ O(n)") as an informal way of writing it.
Algorithmic complexity has a mathematic definition.
If f and g are two functions, f = O(g) if you can find two constants c (> 0) and n such as f(x) < c * g(x) for every x > n.
For Ω, it is the opposite: you can find constants such as f(x) > c * g(x).
f = Θ(g) if there are three constants c, d and n such as c * g(x) < f(x) < d * g(x) for every x > n.
Then, O means your function is dominated, Θ your function is equivalent to the other function, Ω your function has a lower limit.
So, when you are using Θ, your approximation is better for you are "wrapping" your function between two edges ; whereas O only set a maximum. Ditto for Ω (minimum).
To sum up:
O(n): in worst situations, your algorithm has a complexity of n
Ω(n): in best case, your algorithm has a complexity of n
Θ(n): in every situation, your algorithm has a complexity of n
To conclude, your assumption is wrong: it is Θ, not Ω. As you may know, n > log(n) when n has a huge value. Then, it is logic to say n = Θ(log(n)), according to previous definitions.
I am just a bit confused. If time complexity of an algorithm is given by
what is that in big O notation? Just or we keep the log?
If that's the time-complexity of the algorithm, then it is in big-O notation already, so, yes, keep the log. Asymptotically, there is a difference between O(n^2) and O((n^2)*log(n)).
A formal mathematical proof would be nice here.
Let's define following variables and functions:
N - input length of the algorithm,
f(N) = N^2*ln(N) - a function that computes algorithm's execution time.
Let's determine whether growth of this function is asymptotically bounded by O(N^2).
According to the definition of the asymptotic notation [1], g(x) is an asymptotic bound for f(x) if and only if: for all sufficiently large values of x, the absolute value of f(x) is at most a positive constant multiple of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M*g(x) for all x >= x0 (1)
In our case, there must exists a positive real number M and a real number N0 such that:
|N^2*ln(N)| <= M*N^2 for all N >= N0 (2)
Obviously, such M and x0 do not exist, because for any arbitrary large M there is N0, such that
ln(N) > M for all N >= N0 (3)
Thus, we have proved that N^2*ln(N) is not asymptotically bounded by O(N^2).
References:
1: - https://en.wikipedia.org/wiki/Big_O_notation
A simple way to understand the big O notation is to divide the actual number of atomic steps by the term withing the big O and validate you get a constant (or a value that is smaller than some constant).
for example if your algorithm does 10n²⋅logn steps:
10n²⋅logn/n² = 10 log n -> not constant in n -> 10n²⋅log n is not O(n²)
10n²⋅logn/(n²⋅log n) = 10 -> constant in n -> 10n²⋅log n is O(n²⋅logn)
You do keep the log because log(n) will increase as n increases and will in turn increase your overall complexity since it is multiplied.
As a general rule, you would only remove constants. So for example, if you had O(2 * n^2), you would just say the complexity is O(n^2) because running it on a machine that is twice more powerful shouldn't influence the complexity.
In the same way, if you had complexity O(n^2 + n^2) you would get to the above case and just say it's O(n^2). Since O(log(n)) is more optimal than O(n^2), if you had O(n^2 + log(n)), you would say the complexity is O(n^2) because it's even less than having O(2 * n^2).
O(n^2 * log(n)) does not fall into the above situation so you should not simplify it.
if complexity of some algorithm =O(n^2) it can be written as O(n*n). is it O(n)?absolutely not. so O(n^2*logn) is not O(n^2).what you may want to know is that O(n^2+logn)=O(n^2).
A simple explanation :
O(n2 + n) can be written as O(n2) because when we increase n, the difference between n2 + n and n2 becomes non-existent. Thus it can be written O(n2).
Meanwhile, in O(n2logn) as the n increases, the difference between n2 and n2logn will increase unlike the above case.
Therefore, logn stays.
If I understand Big-O notation correctly, k should be a constant time for the efficiency of an algorithm. Why would a constant time be considered O(1) rather than O(k), considering it takes a variable time? Linear growth ( O(n + k) ) uses this variable to shift the time right by a specific amount of time, so why not the same for constant complexity?
There is no such linear growth asymptotic O(n + k) where k is a constant. If k were a constant and you went back to the limit representation of algorithmic growth rates, you'd see that O(n + k) = O(n) because constants drop out in limits.
Your answer may be O(n + k) due to a variable k that is fundamentally independent of the other input set n. You see this commonly in compares vs moves in sorting algorithm analysis.
To try to answer your question about why we drop k in Big-O notation (which I think is taught poorly, leading to all this confusion), one definition (as I recall) of O() is as follows:
Read: f(n) is in O( g(n) ) iff there exists d and n_0 where for all n > n_0,
f(n) <= d * g(n)
Let's try to apply it to our problem here where k is a constant and thus f(x) = k and g(x) = 1.
Is there a d and n_0 that exist to satisfy these requirements?
Trivially, the answer is of course yes. Choose d > k and for n > 0, the definition holds.
In n-element array sorting processing takes;
in X algorithm: 10-8n2 sec,
in Y algoritm 10-6n log2n sec,
in Z algoritm 10-5 sec.
My question is how do i compare them. For example for y works faster according to x, Which should I choose the number of elements ?
When comparing Big-Oh notations, you ignore all constants:
N^2 has a higher growth rate than N*log(N) which still grows more quickly than O(1) [constant].
The power of N determines the growth rate.
Example:
O(n^3 + 2n + 10) > O(200n^2 + 1000n + 5000)
Ignoring the constants (as you should for pure big-Oh comparison) this reduces to:
O(n^3 + n) > O(n^2 + n)
Further reduction ignoring lower order terms yields:
O(n^3) > O(n^2)
because the power of N 3 > 2.
Big-Oh follows a hierarchy that goes something like this:
O(1) < O(log[n]) < O(n) < O(n*log[n]) < O(n^x) < O(x^n) < O(n!)
(Where x is any amount greater than 1, even the tiniest bit.)
You can compare any other expression in terms of n via some rules which I will not post here, but should be looked up in Wikipedia. I list O(n*log[n]) because it is rather common in sorting algorithms; for details regarding logarithms with different bases or different powers, check a reference source (did I mention Wikipedia?)
Give the wiki article a shot: http://en.wikipedia.org/wiki/Big_O_notation
I propose this different solution since there is not an accepted answer yet.
If you want to see at what value of n does one algorithm perform better than another, you should set the algorthim times equal to each other and solve for n.
For Example:
X = Z
10^-8 n^2 = 10^-5
n^2 = 10^3
n = sqrt(10^3)
let c = sqrt(10^3)
So when comparing X and Z, choose X if n is less than c, and Z if n is greater than c. This can be repeating between the other two pairs.