n^2 log n complexity - big-o

I am just a bit confused. If the time complexity of an algorithm is given by n^2 log n, what is that in big O notation? Just O(n^2), or do we keep the log?

If that's the time-complexity of the algorithm, then it is in big-O notation already, so, yes, keep the log. Asymptotically, there is a difference between O(n^2) and O((n^2)*log(n)).

A formal mathematical proof would be nice here.
Let's define the following variables and functions:
N - the input length of the algorithm,
f(N) = N^2*ln(N) - the function that gives the algorithm's execution time.
Let's determine whether the growth of this function is asymptotically bounded by O(N^2).
According to the definition of the asymptotic notation [1], g(x) is an asymptotic bound for f(x) if and only if: for all sufficiently large values of x, the absolute value of f(x) is at most a positive constant multiple of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M*g(x) for all x >= x0 (1)
In our case, there must exist a positive real number M and a real number N0 such that:
|N^2*ln(N)| <= M*N^2 for all N >= N0 (2)
Obviously, such M and N0 do not exist, because for any arbitrarily large M there is an N0 such that
ln(N) > M for all N >= N0 (3)
Thus, we have proved that N^2*ln(N) is not asymptotically bounded by O(N^2).
References:
[1] https://en.wikipedia.org/wiki/Big_O_notation
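As a quick numerical illustration of inequality (3), here is a minimal Python sketch (assuming the natural logarithm used in the proof; the choice N0 = ceil(e^M) + 1 is just one convenient witness):
import math

# For any fixed M, ln(N) eventually exceeds M, so no constant M can satisfy (2).
for M in (2, 10, 20):
    N0 = math.ceil(math.exp(M)) + 1   # one valid choice of N0 for this M
    print(M, N0, math.log(N0) > M)    # prints True for every M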

A simple way to understand big O notation is to divide the actual number of atomic steps by the term within the big O and check that you get a constant (or a value that is smaller than some constant).
For example, if your algorithm does 10n²⋅log n steps:
10n²⋅logn/n² = 10 log n -> not constant in n -> 10n²⋅log n is not O(n²)
10n²⋅logn/(n²⋅log n) = 10 -> constant in n -> 10n²⋅log n is O(n²⋅logn)
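A minimal Python sketch of that ratio test (assuming log base 2; any other base only changes the constant factor):
import math

for n in (10, 100, 1000, 10000):
    steps = 10 * n * n * math.log2(n)          # hypothetical step count 10n^2*log n
    print(n, steps / (n * n), steps / (n * n * math.log2(n)))
    # the first ratio grows like 10*log2(n) -> not O(n^2); the second stays at 10.0 -> O(n^2*log n)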

You do keep the log because log(n) will increase as n increases and will in turn increase your overall complexity since it is multiplied.
As a general rule, you would only remove constants. So, for example, if you had O(2 * n^2), you would just say the complexity is O(n^2), because running it on a machine that is twice as powerful shouldn't influence the complexity.
In the same way, if you had complexity O(n^2 + n^2), you would get back to the above case and just say it's O(n^2). Since log(n) grows more slowly than n^2, if you had O(n^2 + log(n)), you would say the complexity is O(n^2), because n^2 + log(n) is even less than 2 * n^2.
O(n^2 * log(n)) does not fall into the above situation so you should not simplify it.

If the complexity of some algorithm is O(n^2), it can be written as O(n*n). Is it O(n)? Absolutely not. So O(n^2*logn) is not O(n^2). What you may want to know is that O(n^2+logn) = O(n^2).

A simple explanation:
O(n^2 + n) can be written as O(n^2), because as n increases, the extra n becomes negligible next to n^2. Thus it can be written as O(n^2).
Meanwhile, in O(n^2*logn), the gap between n^2 and n^2*logn keeps growing as n increases, unlike the above case.
Therefore, the logn stays.

Related

How are the following equivalent to O(N)

I am reading an example where the following are equivalent to O(N):
O(N + P), where P < N/2
O(N + log N)
Can someone explain in layman's terms how it is that the two examples above are the same thing as O(N)?
We always take the greater one in case of addition.
In both the cases N is bigger than the other part.
In first case P < N/2 < N
In second case log N < N
Hence the complexity is O(N) in both the cases.
Let f and g be two functions defined on some subset of the real numbers. One writes
f(x) = O(g(x)) as x -> infinite
if and only if there is a positive constant M such that for all sufficiently large values of x, the absolute value of f(x) is at most M multiplied by the absolute value of g(x). That is, f(x) = O(g(x)) if and only if there exists a positive real number M and a real number x0 such that
|f(x)| <= M |g(x)| for all x > x0
So in your case 1:
f(N) = N + P <= N + N/2
We could set M = 2. Then:
|f(N)| <= 3/2|N| <= 2|N| (N0 could be any number)
So:
N + P = O(N)
In your second case, we could also set M = 2 and N0 = 1 to satisfy:
|N + logN| <= 2 |N| for N > 1
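To make that concrete, here is a small Python check of both bounds with M = 2 (a sketch only: it assumes log base 2 and samples a finite range rather than proving the inequality):
import math

# Case 2: N + log N <= 2*N for all N > 1
assert all(n + math.log2(n) <= 2 * n for n in range(2, 100000))
# Case 1 with the largest allowed P (just under N/2): N + N/2 <= 2*N
assert all(n + n / 2 <= 2 * n for n in range(1, 100000))
print("both bounds hold on the sampled range")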
Big O notation only provides an upper bound on the growth rate of a function (see the definition quoted above). For both of your cases, P < N and log N < N, so O(N + P) = O(2N) = O(N), and likewise O(N + log N) = O(2N) = O(N). Hope that answers your question.
For the sake of understanding, you can assume that O(n) means the complexity is of the order of n, and also that O notation represents an upper bound (or the complexity in the worst case). So when I say O(n+p), it represents the order of n+p.
Let's assume that in the worst case p = n/2; then what would be the order of n+n/2? It would still be O(n), that is, linear, because constant factors do not form a part of Big-O notation.
Similarly for O(n+logn): since logn can never be greater than n, the overall complexity turns out to be linear.
In short
If N is a function and C is a constant:
O(N+N/2):
If C=2, then for any N>1:
(C=2)*N > N+N/2,
2*N > 3*N/2,
2 > 3/2 (true)
O(N+logN):
If C=2, then for any N>2:
(C=2)*N > N+logN,
2*N > N+logN,
2 > (N+logN)/N,
2 > 1 + logN/N (the limit of logN/N is 0),
2 > 1+0 (true)
Counterexample O(N^2):
No constant C exists such that C*N > N^2 for all large N:
C > N^2/N,
C > N (contradiction).
Boring mathematical part
I think the source of confusion is that the equals sign in O(f(x))=O(N) does not mean equality! Usually if x=y then y=x. However, consider O(x)=O(x^2), which is true, while the reverse is false: O(x^2) != O(x)!
O(f(x)) is an upper bound of how fast a function is growing.
Upper bound is not an exact value.
If g(x)=x is an upper bound for some function f(x), then function 2*g(x) (and in general anything growing faster than g(x)) is also an upper bound for f(x).
The formal definition is: f(x) is bounded by some other function g(x) if there exists a constant C such that, starting from some x_0, C*g(x) is always greater than f(x).
f(N)=N+N/2 is the same as 3*N/2=1.5*N. If we take g(N)=N and our constant C=2, then 2*g(N)=2*N grows faster than 1.5*N:
If C=2 and x_0=1 then for any n>(x_0=1) 2*N > 1.5*N.
same applies to N+log(N):
C*N>N+log(N)
C>(N+logN)/N
C>1+log(N)/N
...take n_0=2
C>1+1/2
C>3/2=1.5
use C=2: 2*N>N+log(N) for any N>(n_0=2),
e.g.
2*3>3+log(3), 6>3+1.58=4.58
...
2*100>100+log(100), 200>100+6.64
...
Now the interesting part: no such constant exists for N and N^2, because N squared grows faster than N:
C*N > N^2
C > N^2/N
C > N
Obviously no single constant exists which is greater than a variable for all N. Imagine such a constant C=C_0 existed. Then starting from N=C_0+1 the function N is greater than the constant C_0, therefore such a constant does not exist.
Why is this useful in computer science?
In most cases calculating exact algorithm time or space does not make sense as it would depend on hardware speed, language overhead, algorithm implementation details and many other factors.
Big O notation provides means to estimate which algorithm is better independently from real world complications. It's easy to see that O(N) is better than O(N^2) starting from some n_0 no matter which constants are there in front of two functions.
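For example (a hypothetical pair of step-count functions: 1000*n with a huge constant versus n*n with no constant), past the crossover point n_0 = 1000 the O(N) algorithm always wins:
for n in (10, 1000, 2000, 100000):
    print(n, 1000 * n, n * n, 1000 * n < n * n)   # False, False, True, True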
Another benefit is the ability to estimate algorithm complexity just by glancing at a program and using Big O properties:
for x in range(N):
    sub-calc with O(C)
has complexity of O(N) and
for x in range(N):
    sub-calc with O(C_0)
    sub-calc with O(C_1)
still has complexity of O(N) because of "multiplication by constant rule".
for x in range(N):
    sub-calc with O(N)
has complexity of O(N*N)=O(N^2) by "product rule".
for x in range(N):
    sub-calc with O(C_0)
for y in range(N):
    sub-calc with O(C_1)
has complexity of O(N+N)=O(2*N)=O(N) by "definition (just take C=2*C_original)".
for x in range(N):
    sub-calc with O(C)
for x in range(N):
    sub-calc with O(N)
has complexity of O(N^2) because "the fastest growing term determines O(f(x)) if f(x) is a sum of other functions" (see explanation in the mathematical section).
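A small runnable Python sketch of those rules, counting "atomic steps" for each loop shape (the function names and the sizes tried are illustrative, not from the original answer):
def single_loop(N):
    steps = 0
    for x in range(N):
        steps += 1              # O(C) sub-calc
    return steps                # grows like N -> O(N)

def sequential_loops(N):
    steps = 0
    for x in range(N):
        steps += 1
    for y in range(N):
        steps += 1
    return steps                # grows like 2*N -> still O(N)

def nested_loop(N):
    steps = 0
    for x in range(N):
        for y in range(N):      # an O(N) sub-calc inside an O(N) loop
            steps += 1
    return steps                # grows like N*N -> O(N^2)

for N in (10, 100, 1000):
    print(N, single_loop(N), sequential_loops(N), nested_loop(N))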
Final words
There is much more to Big-O than I can write here! For example, in some real-world applications the crossover point n_0 might be so big that an algorithm with worse complexity works faster on real data.
CPU caches might introduce an unexpected hidden factor into an otherwise asymptotically good algorithm.
Etc...

Is the complexity of 3 logn and 2logn same?

Do they have the same complexity, since they vary only by a constant multiplier, or should they be rewritten as n^3 and n^2 and compared?
For 'BigOh' notation, the constant multiplier really doesn't matter. All it gives you is the order of the running-time complexity.
You can consider this small example:
Say you have 3 * 100 = 300 apples and 2 * 100 = 200 apples. Surely, 300 != 200, but the order of both are same, that is in order of hundreds.
So by the same means, 3(log n) != 2(log n), but both 3(log n) and 2(log n) are in the order of log n, that is O(log n).
Yes. Multiplying by a constant doesn't matter. They are both just O(log n).
In fact, this is part of the definition of big-o notation. If a function may be bounded by a polynomial in n, then as n tends to infinity, you may disregard lower-order terms of the polynomial.
Both are equivalent to O(log n). The constant does not alter the complexity.
Big O notation is defined in terms of sets of functions that share an upper bound.
With that being said, it is also important to note that Big O can be defined mathematically as:
O(g(n)) = {f(n): there exist positive constants c and n0 such that f(n) <= c·g(n) for all n >= n0}
So as you can see, the particular constant doesn't really matter in Big O; all we care about is that some constant works. So both 3logn and 2logn can be described as O(logn).
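A minimal numeric sketch of that point: divide each function by log n (the term inside the big O) and you get the constants 3 and 2 back for every n, which is exactly what the definition asks for.
import math

for n in (10, 1000, 10**6, 10**9):
    print(n, 3 * math.log(n) / math.log(n), 2 * math.log(n) / math.log(n))
    # prints 3.0 and 2.0 (up to float rounding) for every n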

Algorithm Running Time for O(n.m^2)

I would like to know, because I couldn't find any information online, how is an algorithm like O(n * m^2) or O(n * k) or O(n + k) supposed to be analysed?
Does only the n count?
The other terms are superfluous?
So O(n * m^2) is actually O(n)?
No, here the k and m terms are not superfluous; they have a valid role and are essential for computing the time complexity. Together they give a concrete complexity for the code.
It may seem like the terms n and k are independent of each other in the code, but both of them together determine the complexity of the algorithm.
Say you have to iterate over a loop of n elements, and inside it you have another loop of k iterations; then the overall complexity is O(nk).
Complexity of order O(nk), you can't dump/discard k here.
for(i=0;i<n;i++)
    for(j=0;j<k;j++)
        // do something
Complexity of order O(n+k), you can't dump/discard k here.
for(i=0;i<n;i++)
    // do something
for(j=0;j<k;j++)
    // do something
Complexity of order O(nm^2), you can't dump/discard m here.
for(i=0;i<n;i++)
    for(j=0;j<m;j++)
        for(k=0;k<m;k++)
            // do something
Answer to the last question: So O(n.m^2) is actually O(n)?
No, O(nm^2) complexity can't be reduced further to O(n), as that would mean m doesn't have any significance, which is not actually the case.
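A small Python sketch with iteration counters for the same three loop shapes (the sizes n=100, k=7, m=7 are arbitrary examples):
def product(n, k):        # nested loops: n*k iterations -> O(n*k)
    return sum(1 for i in range(n) for j in range(k))

def sequential(n, k):     # back-to-back loops: n + k iterations -> O(n + k)
    return sum(1 for i in range(n)) + sum(1 for j in range(k))

def triple(n, m):         # one loop over n, two nested loops over m: n*m*m -> O(n*m^2)
    return sum(1 for i in range(n) for j in range(m) for k in range(m))

print(product(100, 7), sequential(100, 7), triple(100, 7))   # 700, 107, 4900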
FORMALLY: O(f(n)) is the SET of ALL functions T(n) that satisfy:
There exist positive constants c and N such that, for all n >= N,
T(n) <= c f(n)
Here are some examples of when and why factors other than n matter.
[1] 1,000,000 n is in O(n). Proof: set c = 1,000,000, N = 0.
Big-Oh notation doesn't care about (most) constant factors. We generally leave constants out; it's unnecessary to write O(2n), because O(2n) = O(n). (The 2 is not wrong; just unnecessary.)
[2] n is in O(n^3). [That's n cubed]. Proof: set c = 1, N = 1.
Big-Oh notation can be misleading. Just because an algorithm's running time is in O(n^3) doesn't mean it's slow; it might also be in O(n). Big-Oh notation only gives us an UPPER BOUND on a function.
[3] n^3 + n^2 + n is in O(n^3). Proof: set c = 3, N = 1.
Big-Oh notation is usually used only to indicate the dominating (largest and most displeasing) term in the function. The other terms become insignificant when n is really big.
These aren't generalizable, and each case may be different. That's the answer to the questions: "Does only the n count? The other terms are superfluous?"
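A quick sanity check of the three bounds above with the stated constants (a sketch that only samples a finite range of n, so it illustrates rather than proves them):
# [1] 1,000,000*n <= c*n with c = 1,000,000, N = 0
assert all(1_000_000 * n <= 1_000_000 * n for n in range(0, 1000))
# [2] n <= c*n^3 with c = 1, N = 1
assert all(n <= 1 * n**3 for n in range(1, 1000))
# [3] n^3 + n^2 + n <= c*n^3 with c = 3, N = 1
assert all(n**3 + n**2 + n <= 3 * n**3 for n in range(1, 1000))
print("all three bounds hold on the sampled range")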
Although there is already an accepted answer, I'd still like to provide the following inputs:
O(n * m^2): can be viewed as n*m*m; assuming that the bounds for n and m are similar, the complexity would be O(n^3).
Similarly,
O(n * k): would be O(n^2) (with the bounds for n and k being similar)
and
O(n + k): would be O(n) (again, with the bounds for n and k being similar).
PS: It would be better not to assume similarity between the variables and to first understand how the variables relate to each other (e.g. m = n/2, k = 2n) before attempting to conclude.

Why does Big-O Notation use O(1) instead of O(k)?

If I understand Big-O notation correctly, k should be a constant time for the efficiency of an algorithm. Why would a constant time be considered O(1) rather than O(k), considering it takes a variable time? Linear growth ( O(n + k) ) uses this variable to shift the time right by a specific amount of time, so why not the same for constant complexity?
There is no such linear growth asymptotic O(n + k) where k is a constant. If k were a constant and you went back to the limit representation of algorithmic growth rates, you'd see that O(n + k) = O(n) because constants drop out in limits.
Your answer may be O(n + k) due to a variable k that is fundamentally independent of the other input set n. You see this commonly in compares vs moves in sorting algorithm analysis.
To try to answer your question about why we drop k in Big-O notation (which I think is taught poorly, leading to all this confusion), one definition (as I recall) of O() is as follows:
Read: f(n) is in O( g(n) ) iff there exists d and n_0 where for all n > n_0,
f(n) <= d * g(n)
Let's try to apply it to our problem here, where k is a constant and thus f(n) = k and g(n) = 1.
Is there a d and n_0 that exist to satisfy these requirements?
Trivially, the answer is of course yes. Choose d > k and for n > 0, the definition holds.
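As a concrete (made-up) instance: take k = 7, so f(n) = 7 and g(n) = 1. Picking d = 8 and n_0 = 0 gives
f(n) = 7 <= 8 * 1 = d * g(n) for all n > 0
so f(n) is in O(1); the particular value of k only changes which d you pick, never the growth rate.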

Big-Oh Notation

If T(n) is O(n), then is it also correct to say T(n) is O(n^2)?
Yes; because O(n) is a subset of O(n^2).
Assuming
T(n) = O(n), n > 0
Then both of the following are true
T(n) = O(2n)
T(n) = O(n^2)
This is because both 2n and n^2 grow as quickly as or more quickly than just plain n. EDIT: As Philip correctly notes in the comments, even a value smaller than 1 can be the multiplier of n, since constant factors may be dropped (they become insignificant for large values of n; EDIT 2: as Oli says, all constants are insignificant per the definition of O). Thus the following is also true:
T(n) = O(0.2n)
In fact, n^2 grows so quickly that you can also say
T(n) = o(n^2)
But not
T(n) = Θ(n^2)
because the functions given provide an asymptotic upper bound, not an asymptotically tight bound.
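A concrete (made-up) example of all three statements: let T(n) = 4n. Then T(n) = O(n^2), because 4n <= 4*n^2 for all n >= 1; T(n) = o(n^2), because 4n / n^2 = 4/n tends to 0; but T(n) is not Θ(n^2), because there is no positive c with c*n^2 <= 4n for all large n (that would require c <= 4/n, which goes to 0).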
If you mean O(2 * N), then yes, O(n) == O(2n). The time taken is a linear function of the input data in both cases.
I disagree with the other answer that says O(N) = O(N*N). It is true that the O(N) function will finish in less time than O(N*N), but the completion time is not a function of n*n, so it really isn't true.
I suppose the answer depends on why you are asking the question.
O, also known as Big-Oh, is an upper bound. We can say that there exists a C such that, for all n > N, T(n) < C g(n), where C is a constant.
So as long as the dominant term of T(n) is smaller than or equal to a constant multiple of g(n), that statement is valid.
