Prove O(n) is not a subset of O(n log n) - algorithm

I saw a proof that O(2n) is the same as O(n) in this post => Which algorithm is faster O(N) or O(2N)?
Which means O(n) is the same as O(4n).
Can someone show me how O(n) is not a subset of O(n log n)?
Because, if n = 16 and base = 2, O(n log n) will be O(n * 4), which should make it O(n)?
I know above statement is wrong. But not sure which part. Kindly clarify.

Because, if n = 16 and base = 2, O(n log n) will be O(n * 4), which should make it O(n)?
This is a fundamental misunderstanding of what O(n log n) means.
O(n log n) is a set of functions. Intuitively, it is the set of all functions {g(n)} where g(n) is proportional to f(n) = n log n.
(There is a rigorous mathematical definition of what "proportional" means that deals with awkward edge cases, but you need to understand "limits" ... which is relatively advanced mathematics ... to comprehend the definition.)
You are substituting a value for the argument ... which is mathematically meaningless. Facially, you are evaluating O(n log n) as a function for some value of n. That might make sense if O(...) denoted a function. But it doesn't.
Big O is a mathematical notation for a set of functions that are related to a given function in a particular way. And (intuitively) the relationship is about what happens when n gets larger. You cannot substitute a specific value for n and still preserve the meaning of the notation.
(What you have done makes about as much mathematical sense as canceling out the x in:
d(x.x)
------
dx
... or one of those schoolboy "proofs" that one is zero that entails division by zero.)
To gain a deeper understanding of why your substitution is meaningless, review the more formal definition of Big Oh notation, e.g. on Wikipedia (if you know what limits are).

You cannot say that n=16. Then you're treating it as a constant. n is a variable.
Look at O(n²). If n=16, then O(n²)=O(16*16)=O(256)=O(1)
It works for any complexity. Consider O(n!), as it is for traveling salesman. If n=16, then O(n!)=O(16!)=O(huge constant)=O(1)
Besides, as chepner pointed out, O(n) IS a subset of O(nlog n). Your real question is if the sets O(n) and O(nlog n) are equal, which they are not.
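The difference between the two sets can be illustrated numerically. The sketch below uses the usual definition (g is in O(f) if g(n) <= C*f(n) for all n > n0) and assumes base-2 logarithms. The helper names holds_on_range and counterexample are made up for this illustration, and a check over a finite range is only a sanity test, not a proof.

```python
import math

def n_log_n(n):
    return n * math.log2(n)

def linear(n):
    return n

def holds_on_range(g, f, C, n0, upto):
    """Check g(n) <= C * f(n) for every integer n0 < n <= upto (finite sanity check only)."""
    return all(g(n) <= C * f(n) for n in range(n0 + 1, upto + 1))

def counterexample(C):
    """Return an n with n*log2(n) > C*n, showing that C cannot bound n log n by n."""
    # log2(n) > C as soon as n > 2**C, so n = 2**(C + 1) always works.
    return 2 ** (C + 1)

# n is in O(n log n): the witness C = 1, n0 = 1 works, since log2(n) >= 1 for n >= 2.
print(holds_on_range(linear, n_log_n, C=1, n0=1, upto=10_000))

# n log n is not in O(n): every candidate constant C is eventually exceeded.
for C in (4, 16, 64):
    n = counterexample(C)
    print(C, n_log_n(n) > C * linear(n))
```

So O(n) is contained in O(n log n), but no single constant makes n log n fit under C*n, which is why the two sets are not equal.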

Related

Why O(n log n) is greater than O(n)?

I read that O(n log n) is greater than O(n), I need to know why is it so?
For instance taking n as 1, and solving O(n log n) will be O(1 log 1) = O(0). On the same hand O(n) will be O(1)?
Which actually contradicts O(n log n) > O(n)
Let us start by clarifying what is Big O notation in the current context. From (source) one can read:
Big O notation is a mathematical notation that describes the limiting
behavior of a function when the argument tends towards a particular
value or infinity. (..) In computer science, big O notation is used to classify algorithms
according to how their run time or space requirements grow as the
input size grows.
The following statement is not accurate:
For instance taking n as 1, solving O(n log n) will be O(1 log 1) =
O(0). On the same hand O(n) will be O(1)?
One cannot simply perform "O(1 log 1)" since the Big O notation does not represent a function but rather a set of functions with a certain asymptotic upper-bound; as one can read from source:
Big O notation characterizes functions according to their growth
rates: different functions with the same growth rate may be
represented using the same O notation.
Informally, in computer-science time-complexity and space-complexity theories, one can think of the Big O notation as a categorization of algorithms with a certain worst-case scenario concerning time and space, respectively. For instance, O(n):
An algorithm is said to take linear time/space, or O(n) time/space, if its time/space complexity is O(n). Informally, this means that the running time/space increases at most linearly with the size of the input (source).
and O(n log n) as:
An algorithm is said to run in quasilinear time/space if T(n) = O(n log^k n) for some positive constant k; linearithmic time/space is the case k = 1 (source).
Mathematically speaking the statement
I read that O(n log n) is greater than O(n) (..)
is not accurate, since as mentioned before Big O notation represents a set of functions. Hence, it is more accurate to say that O(n log n) contains O(n). Nonetheless, such relaxed phrasing is commonly used to quantify (for the worst-case scenario) how one set of algorithms behaves compared with another as their input sizes increase. To compare two classes of algorithms (e.g., O(n log n) and O(n)) instead of
For instance taking n as 1, solving O(n log n) will be O(1 log 1) =
O(0). On the same hand O(n) will be O(1)?
Which actually contradicts O(n log n) > O(n)
you should analyze how both classes of algorithms behave as their input size (i.e., n) increases for the worst-case scenario, that is, analyze n as it tends to infinity.
As @cem rightly points out, in the image "big-O denote one of the asymptotically least upper-bounds of the plotted functions, and does not refer to the sets O(f(n))"
As you can see in the image after a certain input, O(n log n) (green line) grows faster than O(n) (yellow line). That is why (for the worst-case) O(n) is more desirable than O(n log n) because one can increase the input size, and the growth rate will increase slower with the former than with the latter.
I'm going to give you the real answer, even though it seems to be more than one step away from the way you're currently thinking about it...
O(n) and O(n log n) are not numbers, or even functions, and it doesn't quite make sense to say that one is greater than the other. It's sloppy language, but there are actually two accurate statements that might be meant by saying that "O(n log n) is greater than O(n)".
Firstly, O(f(n)), for any function f(n) of n, is the infinite set of all functions that asymptotically grow no faster than f(n). A formal definition would be:
A function g(n) is in O(f(n)) if and only if there are constants n0 and C such that g(n) <= Cf(n) for all n > n0.
So O(n) is a set of functions and O(n log n) is a set of functions, and O(n log n) is a superset of O(n). Being a superset is kind of like being "greater", so if one were to say that "O(n log n) is greater than O(n)", they might be referring to the superset relationship between them.
Secondly, the definition of O(f(n)) makes f(n) an upper bound on the asymptotic growth of functions in the set. And the upper bound is greater for O(n log n) than it is for O(n). In more concrete terms, there is a constant n0 such that n log n > n, for all n > n0. The bounding function itself is asymptotically greater, and this is another thing that one might mean when saying "O(n log n) is greater than O(n)".
Finally, both of these things are mathematically equivalent. If g(n) is asymptotically greater than f(n), it follows mathematically that O(g(n)) is a superset of O(f(n)), and if O(g(n)) is a proper superset of O(f(n)), it follows mathematically that g(n) is asymptotically greater than f(n).
Therefore, even though the statement "O(n log n) is greater than O(n)" does not strictly make any sense, it has a clear and unambiguous meaning if you're willing to read it charitably.
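The n = 1 substitution from the question becomes meaningful once it is applied to the bounding functions themselves rather than to the O(...) notation. A small sketch, assuming base-2 logarithms (another base only changes a constant factor); the function names are invented for illustration:

```python
import math

def f_linear(n):
    return n

def f_linearithmic(n):
    return n * math.log2(n)

# For small n the linearithmic function is not larger: it is 0 at n = 1
# and equal to n at n = 2. From n = 3 on, log2(n) > 1, so it stays ahead.
for n in (1, 2, 3, 16, 1024):
    print(n, f_linear(n), f_linearithmic(n))

# n0 = 2 is a valid threshold here: n*log2(n) > n for all n > 2
# (verified below on a finite range only).
n0 = 2
print(all(f_linearithmic(n) > f_linear(n) for n in range(n0 + 1, 100_000)))
```

The values at n = 1 and n = 2 do not contradict anything, because the definition only constrains what happens beyond some n0.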
The big O notation only has an asymptotic meaning, that is it makes sense only when n goes to infinity.
For example, a time complexity of O(100000) just means your code runs in constant time, which is asymptotically faster (smaller) than O(log n).

Are O(n log n) algorithms always better than all O(n^2) algorithms?

When trying to properly understand Big-O, I am wondering whether it's true that O(n log n) algorithms are always better than all O(n^2) algorithms.
Are there any particular situations where O(n^2) would be better?
I've read multiple times that in sorting for example, a O(n^2) algorithm like bubble sort can be particularly quick when the data is almost sorted, so would it be quicker than a O(n log n) algorithm, such as merge sort, in this case?
No, O(n log n) algorithms are not always better than O(n^2) ones.
The Big-O notation describes an upper bound of the asymptotic behavior of an algorithm, i.e. for n that tends towards infinity.
In this definition you have to consider some aspects:
The Big-O notation is an upper bound of the algorithm complexity, meaning that for some inputs (like the one you mentioned about sorting algorithms) an algorithm with worse Big-O complexity may actually perform better (bubble sort with an early-exit check runs in O(n) for an already sorted array, while mergesort and quicksort always take at least O(n log n));
The Big-O notation only describes the class of complexity, hiding all the constant factors that in real-case scenarios may be relevant. For example, an algorithm with cost 1000000 x, which is in class O(n), performs worse than an algorithm with cost 0.5 x^2 (class O(n^2)) for inputs smaller than 2000000. Basically the Big-O notation tells you that for big enough input n, the O(n) algorithms will perform better than O(n^2), but if you work with small inputs you may still prefer the latter solution.
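The crossover point in the constant-factor example can be checked directly. This is a minimal sketch using the two cost functions from the example above (1000000 x and 0.5 x^2); the function names and the search bounds are assumptions made for illustration.

```python
def linear_cost(x):
    return 1_000_000 * x

def quadratic_cost(x):
    return 0.5 * x ** 2

def crossover(lo=1, hi=10**9):
    """Smallest x in [lo, hi] where the linear algorithm is no more expensive.

    Binary search is valid because, for x >= 1, the condition
    linear_cost(x) <= quadratic_cost(x) is false up to some point and true afterwards.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if linear_cost(mid) <= quadratic_cost(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

print(crossover())  # 2000000
```

Below that input size the O(n^2) algorithm is the cheaper one, even though it belongs to the worse complexity class.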
O(n log n) is better than O(n^2) asymptotically.
Big-O, Big-Theta, Big-Omega, all those measure the asymptotic behavior of functions, i.e., how functions behave when their argument goes toward a certain limit.
O(n log n) functions grow slower than O(n^2) functions, that's what Big-O notation essentially says. However, this does not mean that O(n log n) is always faster. It merely means that at some point, the O(n log n) function will always be cheaper for an ever-rising value of n.
In that image, f(n) = O(g(n)). Note that there is a range where f(n) is actually more costly than g(n), even though it is bounded asymptotically by g(n). However, when talking limits, or asymptotics for that matter, f(n) outperforms g(n) "in the long run," so to say.
In addition to @cadaniluk's answer:
If you restrict the inputs to the algorithms to a very special type, this can also affect the running time. E.g. if you run sorting algorithms only on already sorted lists, BubbleSort will run in linear time, but MergeSort will still need O(n log n).
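The linear-time behaviour on sorted input depends on the bubble sort variant that stops after a pass without swaps. The sketch below assumes that variant and counts comparisons to make the difference visible; the function name and return shape are choices made for this illustration.

```python
def bubble_sort(items):
    """Bubble sort with an early exit; returns (sorted_list, comparison_count)."""
    data = list(items)
    comparisons = 0
    n = len(data)
    for end in range(n - 1, 0, -1):
        swapped = False
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            break  # a full pass without swaps means the list is sorted
    return data, comparisons

# Already sorted input: one pass, n - 1 comparisons.
print(bubble_sort(range(1000))[1])
# Reversed input: every pass swaps, n * (n - 1) / 2 comparisons.
print(bubble_sort(range(999, -1, -1))[1])
```

Without the swapped flag, the same algorithm performs a quadratic number of comparisons on every input, sorted or not.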
There are also algorithms that have a bad worst-case complexity, but a good average case complexity. This means that there are bad input instances such that the algorithm is slow, but in total it is very unlikely that you have such a case.
Also, never forget that Big-O notation hides constants and additive functions of lower order. So an algorithm with worst-case complexity O(n log n) could actually have a complexity of 2^10000 * n * log n, and your O(n^2) algorithm could actually run in 1/2^1000 n^2. So for n < 2^10000 you really want to use the "slower" algorithm.
Here is a practical example.
The GCC implementations of sorting functions have O(n log n) complexity. Still, they employ O(n^2) algorithms as soon as the size of the part being sorted is less than some small constant.
That's because for small sizes, they tend to be faster in practice.
See here for some of the internal implementation.
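The general idea can be sketched as a merge sort that hands short sublists to insertion sort. This is not GCC's actual code (which is C++ and structured differently); it is a Python illustration of the hybrid approach, and the cutoff value of 16 is a hypothetical constant chosen for the example.

```python
SMALL_CUTOFF = 16  # hypothetical threshold, chosen only for illustration

def insertion_sort(data):
    """O(n^2) in the worst case, but with very little overhead on short lists."""
    for i in range(1, len(data)):
        key = data[i]
        j = i - 1
        while j >= 0 and data[j] > key:
            data[j + 1] = data[j]
            j -= 1
        data[j + 1] = key
    return data

def hybrid_sort(items):
    """Merge sort that switches to insertion sort for short sublists."""
    data = list(items)
    if len(data) <= SMALL_CUTOFF:
        return insertion_sort(data)
    mid = len(data) // 2
    left = hybrid_sort(data[:mid])
    right = hybrid_sort(data[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(hybrid_sort([5, 2, 9, 1, 5, 6]))
```

The overall complexity stays O(n log n), because the quadratic algorithm only ever runs on pieces of bounded size.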

Why is the worst case time complexity of this simple algorithm T(n/2) +1 as opposed to n^2+T(n-1)?

The following question was on a recent assignment in University. I would have thought the answer would be n^2+T(n-1) as I thought the n^2 would make its asymptotic time complexity O(n^2). Whereas with T(n/2)+1 its asymptotic time complexity would be O(log2(n)).
The answers were returned and it turns out the correct answer is T(n/2)+1 however I can't get my head around why this is the case.
Could someone possibly explain to me why that's the worst case time complexity of this algorithm? It's possible my understanding of time complexity is just wrong.
Asymptotic time complexity is about what happens for large n. In the case of your example, since the question specifies that k is fixed, the only complexity relevant is the last one. See the Wikipedia formal definition, specifically:
As n grows to infinity, the recursion that dominates is T(n) = T(n / 2) + 1. You can prove this as well using the formal definition, basically picking x_0 = 10 * k and showing that a finite M can be found using the first two cases. It should be clear that both log(n) and n^2 satisfy the definition, so the tighter bound is the asymptotic complexity.
What does O (f (n)) mean? It means the time is at most c * f (n), for some unknown and possibly large c.
kevmo claimed a complexity of O (log2 n). Well, you can check all the values n ≤ 10k, and let the largest value of T (n) be X. X might be quite large (about 167 k^3 in this case, I think, but it doesn't actually matter). For larger n, the time needed is at most X + log2 (n). Choose c = X, and this is always less than c * log2 (n).
Of course people usually assume that an O (log n) algorithm would be quick, and this one most certainly isn't if, say, k = 10,000. So you learned as well that O notation must be handled with care.
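The assignment's algorithm is not reproduced here, so the sketch below uses a hypothetical stand-in recurrence with the same shape as the discussion: an expensive rule T(n) = n^2 + T(n - 1) for n <= 10k and the rule T(n) = T(n/2) + 1 for larger n. The base case T(0) = 0, the integer halving, and the function names are all assumptions. It only illustrates the argument that, for fixed k, T(n) is at most X + log2(n).

```python
import math

def T(n, k):
    """Hypothetical cost recurrence (not the assignment's actual algorithm)."""
    threshold = 10 * k
    halvings = 0
    while n > threshold:      # T(n) = T(n // 2) + 1 for n > 10k
        n //= 2
        halvings += 1
    # T(n) = n^2 + T(n - 1) for n <= 10k, with the assumed base case T(0) = 0
    return halvings + sum(i * i for i in range(1, n + 1))

def worst_small_case(k):
    """X in the argument above: the largest T(n) over all n <= 10k."""
    return max(T(n, k) for n in range(0, 10 * k + 1))

k = 2
X = worst_small_case(k)
# For n > 10k the cost never exceeds X + log2(n) (checked on a finite range).
print(all(T(n, k) <= X + math.log2(n) for n in range(10 * k + 1, 5000)))
```

X grows quickly with k, but for any fixed k it is a constant, which is all the O(log n) bound requires.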

Big oh notation for heaps

I am trying to understand big oh notations. Any help would be appreciated.
Say there is a program that creates a max heap and then pushes and removes the item.
Say there is n items.
To create a heap, it takes O(n) to heapify if you have read the items into an array and then heapify it.
To push an item, it takes O(1) and to remove it, it takes O(1)
To heapify it after that, it takes log n for each remove and n log n for n items
So the big oh notation is O(n + n log n)
OR, is it O(n log n) only because we choose the biggest one.
The complexity to heapify the new element in the heap is O(logN), not O(1) (unless you use a Fibonacci heap, which it seems is not the case).
Also there is no notation O(N + NlogN) as NlogN grows faster than N so this notation is simply written as O(NlogN).
EDIT: The big-oh notation only describes the asymptotic behavior of a function, that is - how fast it grows. As you get close to infinity 2*f(x) and 11021392103*f(x) behave similarly and that is why when writing big-oh notation, we ignore any constants in front of the function.
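The build-then-remove sequence from the question can be sketched with Python's heapq module. heapq provides a binary min-heap, so the values are negated here to simulate a max-heap; that trick assumes numeric items, and the function name is made up for this example.

```python
import heapq

def drain_max_heap(items):
    """Build a heap in O(n), then remove all n items at O(log n) each."""
    # heapq is a min-heap, so store negated values to get max-heap behaviour.
    heap = [-x for x in items]
    heapq.heapify(heap)                    # O(n) bottom-up construction
    removed = []
    while heap:
        removed.append(-heapq.heappop(heap))  # O(log n) per removal
    return removed                         # total work: O(n + n log n) = O(n log n)

print(drain_max_heap([3, 1, 4, 1, 5, 9, 2, 6]))  # [9, 6, 5, 4, 3, 2, 1, 1]
```

A push with heapq.heappush on such a binary heap is likewise O(log n), since the new element may have to sift up to the root.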
Formally speaking, O(N + N log N) is equivalent to O(N log N).
That said, it's assumed that there are coefficients buried in each of these, ala: O( aN + bN log(cN) ). If you have very large N values, these coefficients become unimportant and the algorithm is bounded only by its largest term, which, in this case, is N log(N).
But it doesn't mean the coefficients are entirely unimportant. This is why in discussions of graph algorithms you'll often see authors say something like "the Floyd-Warshall algorithm runs in O(N^3) time, but has small coefficients".
If we could somehow write O(0.5N^3) in this case, we would. But it turns out that the coefficients vary depending on how you implement an algorithm and which computer you run it on. Thus, we settle for asymptotic comparisons, not necessarily because it is the best way, but because there isn't really a good alternative.
You'll also see things like "Worst-case: O(N^2), Average case: O(N)". This is an attempt to capture how the behavior of the algorithm varies with the input. Often times, presorted or random inputs can give you that average case, whereas an evil villain can construct inputs that produce the worst case.
Ultimately, what I am saying is this: O(N + N log N)=O(N log N). This is true, and it's the right answer for your homework. But we use this big-O notation to communicate and, in the fullness of time, you may find situations where you feel that O(N + N log N) is more expressive, perhaps if your algorithm is generally used for small N. In this case, do not worry so much about the formalism - just be clear about what it is you are trying to convey with it.
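The identity O(N + N log N) = O(N log N) follows from a short sandwich bound. A sketch of the derivation, assuming base-2 logarithms so that log N >= 1 once N >= 2:

```latex
\[
  N \log_2 N \;\le\; N + N \log_2 N \;\le\; 2\,N \log_2 N
  \qquad \text{for all } N \ge 2 .
\]
```

The left inequality shows that N log N is in O(N + N log N), and the right one (with constant C = 2) shows that N + N log N is in O(N log N), so the two sets contain exactly the same functions.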

When would O(n*n) be quicker than O (log n)?

I have this question on a practice test and I'm not sure of when code would run quicker on O(n*n) over O(log n).
Big oh notation gives upper bounds. Not more.
If algorithm A is O(n ^ 2), it could require exactly n ^ 2 steps.
If algorithm B is O(log n), it could require exactly 10000 * log n steps.
Algorithm A is a lot faster than algorithm B for small n.
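The two step counts above can be compared directly to see where the logarithmic algorithm starts to win. A minimal sketch, assuming base-2 logarithms and starting the search at n = 2 (at n = 1 the logarithm is 0); the function names are invented for this example.

```python
import math

def steps_a(n):
    return n ** 2                   # algorithm A: exactly n^2 steps

def steps_b(n):
    return 10_000 * math.log2(n)    # algorithm B: exactly 10000 * log n steps

def first_n_where_b_wins(start=2, limit=1_000_000):
    """Smallest n >= start at which B needs fewer steps than A, or None if not found."""
    for n in range(start, limit + 1):
        if steps_b(n) < steps_a(n):
            return n
    return None

n = first_n_where_b_wins()
print(n, steps_a(n), steps_b(n))
```

For every input size from 2 up to that point, the O(n^2) algorithm is the faster one, and only beyond it does the asymptotic ordering take over.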
Remember that Big-O is the upper bound. It's quite possible that, because of constants, the O(n^2) algorithm runs faster than the O(log n) one for smaller input sizes. It could also be that in most cases the n^2 algorithm runs faster, and that it is classified as n^2 only because certain input sets cause it to do a lot of work.
I am retracting my previous answer of never because technically it is possible for a O(n*n) algorithm to be faster than a O(log n) algorithm, though highly improbable. See my discussion with Jesus under his answer for more details. The graph below shows that an algorithm that has a time complexity of exactly log n is always faster than an algorithm that has a time complexity of exactly n*n.

Resources