Today I was reading a great article from Julienne Walker about sorting - Eternally Confuzzled - The Art of Sorting - and one thing caught my eye. I don't quite understand the part where the author proves that for sorting by comparison we are limited by an Ω(N·log N) lower bound.
Lower bounds aren't as obvious. The lowest possible bound for most sorting algorithms is Ω(N·log N). This is because most sorting algorithms use item comparisons to determine the relative order of items. Any algorithm that sorts by comparison will have a minimum lower bound of Ω(N·log N) because a comparison tree is used to select a permutation that's sorted. A comparison tree for the three numbers 1, 2, and 3 can be easily constructed:
                1 < 2
              /       \
         1 < 3         1 < 3
        /     \       /     \
    2 < 3    3,1,2  2,1,3   2 < 3
    /   \                   /   \
1,2,3   1,3,2           2,3,1   3,2,1
Notice how every item is compared with every other item, and that each path results in a valid permutation of the three items. The height of the tree determines the lower bound of the sorting algorithm. Because there must be as many leaves as there are permutations for the algorithm to be correct, the smallest possible height of the comparison tree is log N!, which is equivalent to Ω(N·log N).
It seems very reasonable until the last part (bolded), which I don't quite understand - how is log N! equivalent to Ω(N·log N)? I must be missing something from my CompSci courses and can't get the last transition. I'm looking forward to help with this, or to a link to some other proof that we are limited to Ω(N·log N) if we sort by comparison.
You didn't miss anything from CompSci class. What you missed was math class. The Wikipedia page for Stirling's Approximation shows that log n! is asymptotically n log n + lower order terms.
N! < N^N
∴ log N! < log (N^N)
∴ log N! < N * log N
With this, you can prove log N! = O(N log N). Proving the matching Ω(N log N) bound is left as an exercise for the reader, or as a question for Mathematics Stack Exchange or Theoretical Computer Science Stack Exchange.
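A quick numeric sanity check of that upper-bound direction (a minimal Python sketch; math.lgamma is only used to get log(N!) without computing huge factorials):

import math

# Verify log2(N!) <= N * log2(N) for a few sample values of N.
for n in [4, 16, 256, 4096]:
    log2_fact = math.lgamma(n + 1) / math.log(2)   # log2(N!) via the log-gamma function
    print(f"N={n:5d}  log2(N!)={log2_fact:10.1f}  N*log2(N)={n * math.log2(n):10.1f}")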
My favorite proof of this is very elementary.
N! = 1 * 2 * ... * (N-1) * N
We can get a very easy lower bound by pretending the first half of those factors don't exist, and then pretending that each factor in the second half is only N/2.
(N/2)^(N/2) <= N!
log((N/2)^(N/2)) = (N/2) * log(N/2) = (N/2) * (log(N) - 1) = Ω(N log N)
So even when you take only the second half of the product, and shrink every one of those factors down to N/2, you are still in N log N territory for a lower bound, and this is super elementary. I could convince an average high school student of this. I can't even derive Stirling's formula by myself.
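The lower-bound half, (N/2)^(N/2) <= N!, can be sanity-checked the same way (again just a sketch):

import math

# Verify (N/2) * log2(N/2) <= log2(N!) for a few sample values of N.
for n in [4, 16, 256, 4096]:
    log2_fact = math.lgamma(n + 1) / math.log(2)   # log2(N!)
    lower = (n / 2) * math.log2(n / 2)             # log2((N/2)^(N/2))
    assert lower <= log2_fact
    print(f"N={n:5d}  (N/2)*log2(N/2)={lower:10.1f}  <=  log2(N!)={log2_fact:10.1f}")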
Related
I just came across this weird discovery: in normal maths, n*log n would be less than n, because log n is usually less than 1.
So why is O(n log(n)) greater than O(n)? (i.e. why is n log n considered to take more time than n)
Does Big-O follow a different system?
It turned out I had misunderstood log n to be less than 1.
As I asked a few of my seniors, I got to know today that if the value of n is large (which it usually is when we are considering Big O, i.e. the worst case), log n can be greater than 1.
So yeah,
O(1) < O(log n) < O(n) < O(n log n) holds true.
(I thought this was a dumb question and was about to delete it as well, but then realised no question is a dumb question and there might be others with the same confusion, so I left it here.)
...because log n is always less than 1.
This is a faulty premise. In fact, log_b(n) > 1 for all n > b. For example, log_2(32) = 5.
Colloquially, you can think of log n as the number of digits in n. If n is an 8-digit number then log n ≈ 8. Logarithms are usually bigger than 1 for most values of n, because most numbers have multiple digits.
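A tiny sketch of that digit-count intuition (base-10 logs; the sample numbers are arbitrary):

import math

# log10(n) is roughly the number of decimal digits in n.
for n in [7, 42, 1000, 99999999]:
    print(f"n={n:>10}  digits={len(str(n))}  log10(n)={math.log10(n):.2f}")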
Plot both graphs (on Desmos, https://www.desmos.com/calculator, or any other site) and look at the result for large values of n (y = f(n)). I say large values because for small values of n the program will not have a time issue. For convenience I attached a graph below; you can also try other bases for the log.
The red curve represents time = n and the blue curve represents time = n*log(n).
Here is a graph of the popular time complexities
n*log(n) is clearly greater than n for n>2 (log base 2)
An easy way to remember might be to take two examples:
Imagine the binary search algorithm, which has log N time complexity: O(log(N)).
If, for each step of the binary search, you had to iterate over the array of N elements,
the time complexity of that task would be O(N*log(N)),
which is more work than iterating the array once: O(N).
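A rough sketch of that scenario in Python, counting work units rather than wall-clock time (the full pass per binary-search step is the hypothetical part of the analogy):

# One linear pass over N elements: O(N) work units.
def linear_pass(arr):
    steps = 0
    for _ in arr:
        steps += 1
    return steps

# A binary search that (hypothetically) re-scans the whole array at every
# step: log N steps * N work units per step = O(N * log N).
def binary_search_with_full_pass(arr, target):
    steps = 0
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        steps += len(arr)                  # the imagined full pass per step
        mid = (lo + hi) // 2
        if arr[mid] == target:
            break
        elif arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

arr = list(range(1_000_000))
print("O(N) work:      ", linear_pass(arr))                        # ~1,000,000
print("O(N log N) work:", binary_search_with_full_pass(arr, -1))   # ~20,000,000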
In computers, it's log base 2 and not base 10. So log(2) is 1 and log(n), where n > 2, is a positive number greater than 1.
Only for n below 2 (for example log(1), which is 0) is the value less than 1; otherwise it's at least 1.
log(n) can be greater than 1 if n is greater than the base b. But this alone doesn't answer your question of why O(n*log n) is greater than O(n).
Usually the base is less than 4. So for higher values of n, n*log(n) becomes greater than n. And that is why O(n log n) > O(n).
This graph may help. n*log(n) rises faster than n, and log(n) is greater than 1 for n greater than the logarithm's base. https://stackoverflow.com/a/7830804/11617347
No matter how two functions behave for small values of n, they are compared against each other when n is large enough. Theoretically, there is an N such that for every n > N, n*log n >= n. For example, if you choose N = 10, n*log n is greater than n for every larger n.
The assertion is not accurate for every n. For small values of n the actual running times can be ordered differently than the growth rates suggest, so an n^2 algorithm may even finish faster than a log n one in practice. But as n increases, n^2 grows dramatically, whereas log n has a growth rate smaller than both n and n^2, so for large n the log n algorithm is the more efficient one.
For larger values of n, log n becomes greater than 1. Since Big O is about the behaviour for large n, log n is greater than 1 for all but a few small values. Hence we can say O(n log n) > O(n) (assuming large values of n).
Remember "big O" is not about values, it is about the shape of the function I can have an O(n2) function that runs faster, even for values of n over a million than a O(1) function...
We have 4 algorithms, all of them with complexity depending on m and n, like:
Alg1: O(m+n)
Alg2: O(mlogm + nlogn)
Alg3: O(mlogn + nlogm)
Alg4: O(m+n!) (ouch, this one sucks, but whatever)
Now, how do we handle this if we know that n > m? My first thought is: Big O notation "discards" constants and smaller terms because they don't matter - sooner or later the "bigger term" will overwhelm all the others, making them irrelevant to the computation cost.
So, can we rewrite Alg1 as O(n), or Alg2 as O(mlogm)? If so, what about the others?
Yes, you can rewrite it if you know that it is always the case that n > m. Formally, have a look at this:
if we know that n>m (always) then it follows that
O(m+n) < O(n+n) which is O(2n) = O(n) (we don't really care about the 2)
We can also say the same thing about the other algorithms:
O(mlogm + nlogn) < O(nlogn + nlogn) = O(2nlogn) = O(nlogn)
I think you can see where the rest of them are going. But if you do not know that n > m then you cannot say the above.
EDIT: as @rici nicely pointed out, you also need to be careful, since it will always depend on the given function. (Note that O((2n)!) can not be simplified to O(n!).)
With a bit of playing around you can see how this is not true
(2n)! = (2n) * (2n-1) * (2n-2) * ... * 2 * 1 >= (2n) * (2n-2) * (2n-4) * ... * 2 (keep only the n even factors and drop the rest)
=> (2n)! >= 2^n * n! (after pulling a factor of 2 out of each remaining term)
Thus we can see that O((2n)!) is at least as big as O(2^n * n!), which grows strictly faster than n!, so it cannot be collapsed to O(n!). To get a more accurate comparison you can see this thread here: Are the two complexities O((2n + 1)!) and O(n!) equal?
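A quick numeric check of that inequality (illustrative only):

import math

# Keeping only the n even factors of (2n)! gives 2^n * n!, so (2n)! >= 2^n * n!,
# and the ratio between the two keeps growing.
for n in range(1, 8):
    lhs = math.factorial(2 * n)
    rhs = (2 ** n) * math.factorial(n)
    print(f"n={n}  (2n)! = {lhs:>14,}  2^n * n! = {rhs:>10,}  ratio = {lhs // rhs:,}")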
Consider the problem of finding the k largest elements of an array. There's a nice O(n log k)-time algorithm for solving this using only O(k) space that works by maintaining a binary heap of at most k elements. In this case, since k is guaranteed to be no bigger than n, we could have rewritten these bounds as O(n log n) time with O(n) memory, but that ends up being less precise. The direct dependency of the runtime and memory usage on k makes clear that this algorithm takes more time and uses more memory as k changes.
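A possible Python sketch of that O(n log k)-time, O(k)-space approach (heapq-based; one way to do it, not necessarily the exact implementation the answer has in mind):

import heapq

def k_largest(values, k):
    # Min-heap holding the k largest elements seen so far; O(k) extra space.
    heap = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)        # O(log k)
        elif v > heap[0]:                  # beats the current k-th largest
            heapq.heapreplace(heap, v)     # pop the min, push v: O(log k)
    return sorted(heap, reverse=True)

print(k_largest([5, 1, 9, 3, 7, 8, 2], k=3))   # [9, 8, 7]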
Similarly, consider many standard graph algorithms, like Dijkstra's algorithm. If you implement Dijkstra's algorithm using a Fibonacci heap, the runtime works out to O(m + n log n), where n is the number of nodes and m is the number of edges. If you assume your input graph is connected, then the runtime also happens to be O(m + m log m), but that's a less precise bound than the one that we had.
On the other hand, if you implement Dijkstra's algorithm with a binary heap, then the runtime works out to O(m log n + n log n). In this case (again, assuming the graph is connected), the m log n term strictly dominates the n log n term, and so rewriting this as O(m log n) doesn't lose any precision.
Generally speaking, you'll want to give the most precise bounds that you can in the course of documenting the runtime and memory usage of a piece of code. If you have multiple terms where one clearly strictly dominates the other, you can safely discard those lower terms. But otherwise, I wouldn't recommend discarding one of the variables, since that loses precision in the bound you're giving.
I have n numbers between 0 and n^4 - 1. What is the fastest way I can sort them?
Of course, n log n is trivial, but I thought about the option of Radix Sort with base n, and then it would be linear time, but I am not sure because of the -1.
Thanks for help!
I think you are misunderstanding the efficiency of Radix Sort. From Wikipedia:
Radix sort complexity is O(wn) for n keys which are integers of word size w. Sometimes w is presented as a constant, which would make radix sort better (for sufficiently large n) than the best comparison-based sorting algorithms, which all perform O(n log n) comparisons to sort n keys. However, in general w cannot be considered a constant: if all n keys are distinct, then w has to be at least log n for a random-access machine to be able to store them in memory, which gives at best a time complexity O(n log n).
I personally would implement quicksort choosing an intelligent pivot. Using this method you can achieve about 1.188 n log n efficiency.
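For concreteness, here is a sketch of one common "intelligent pivot" strategy, median-of-three (illustrative only; not a claim about which pivot rule gives the 1.188 n log n figure):

def quicksort(a, lo=0, hi=None):
    # In-place quicksort using a median-of-three pivot.
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    mid = (lo + hi) // 2
    # Order a[lo], a[mid], a[hi] so that a[mid] holds their median.
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    pivot = a[mid]
    # Hoare-style partition around the pivot value.
    i, j = lo, hi
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1
    quicksort(a, lo, j)
    quicksort(a, i, hi)

data = [5, 3, 8, 1, 9, 2, 7]
quicksort(data)
print(data)   # [1, 2, 3, 5, 7, 8, 9]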
If we use Radix Sort in base n we get the desired linear time complexity, the -1 doesn't matter.
We will represent the numbers in base n:
Then we get: time <= (number of base-n digits) * (cost of one counting pass) <= (log base n of (n^4 - 1)) * (n + n) <= 4 * (2n) = O(n).
One n is for the n numbers, the other n is for the n possible digit values (an overestimate of the digit span), and log base n of (n^4 - 1) is less than log base n of n^4, which is 4. Overall, linear time complexity.
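A minimal Python sketch of that base-n radix sort, assuming the inputs are non-negative integers strictly less than n^4 (four stable passes, one per base-n digit):

def radix_sort_base_n(a):
    # Sorts n non-negative integers, each < n^4, in O(n) total time.
    n = len(a)
    if n <= 1:
        return list(a)
    for digit in range(4):                        # ceil(log_n(n^4)) = 4 passes
        shift = n ** digit
        buckets = [[] for _ in range(n)]          # one bucket per base-n digit value
        for x in a:
            buckets[(x // shift) % n].append(x)   # stable distribution by current digit
        a = [x for bucket in buckets for x in bucket]
    return a

print(radix_sort_base_n([15, 3, 255, 0, 7, 128, 64, 1]))   # [0, 1, 3, 7, 15, 64, 128, 255]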
Thanks for the help anyway! If I did something wrong please notify me!
I came to know that in the case of Randomized quicksort, if we choose the pivot in such a way that it always gives at least a 25%-75% split, then the running time is O(n log n).
Now I also came to know that we can prove this with the Master theorem.
But my problem is: if we split the array 25%-75% at each step, then how do I define my T(n), and how can I prove that the runtime is O(n log n)?
You can use the Master theorem to find the complexity of this kind of algorithm. In this particular case assume that, when you divide the array into two parts, each of these parts is not greater than 3/4 of the initial array. Then T(n) < 2 * T(3/4 * n) + O(n), or T(n) = 2 * T(3/4 * n) + O(n) if you are looking for an upper bound. The Master theorem gives you the solution for this equation.
Update: though the Master theorem may solve such recurrence equations, in this case it gives us a result which is worse than the expected O(n log n). Nevertheless, it can be solved in another way. If we assume that the pivot always splits the array so that the smaller part is >= 1/4 of the size, then we can limit the recursion depth to log_{4/3} n (because at each level the size of the array decreases by a factor of at least 4/3). The total time spent on each recursion level is O(n), thus we have O(n) * log_{4/3} n = O(n log n) overall complexity.
Furthermore, if you want some more strict analysis, you may consider a Wikipedia article, there are some good proofs.
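If it helps to see the numbers, here is a small sketch of the recurrence for a guaranteed 25%/75% split, T(n) = T(n/4) + T(3n/4) + n; the ratio against n*log2(n) staying roughly constant is what the O(n log n) bound predicts (the particular constant is illustrative):

import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # Work for a split into parts of size n/4 and 3n/4, plus O(n) for partitioning.
    if n <= 1:
        return 0
    return T(n // 4) + T(3 * n // 4) + n

for n in [1000, 10000, 100000]:
    print(f"n={n:>7}  T(n)={T(n):>10}  T(n) / (n*log2(n)) = {T(n) / (n * math.log2(n)):.2f}")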
I have read quite a bit on big-O notation and I have a basic understanding. This is a specific question that I hope will help me understand it better.
If I have an array of 100 integers (no duplicates, randomly generated) and I use heapsort to sort it, I know that the big-O notation for heapsort is n lg n. For n = 100, this works out to 100 × 6.64, which is roughly 664.
While I know this is the upper bound on the number of comparisons and my count can be less than 664, if I am trying to figure out the number of comparisons for heapsorting an array of 100 random numbers, it should always be less than or equal to 664?
I am trying to add counters to my heapsort to get the comparison count, and I am coming up with crazy numbers. I will continue to work it out, but I wanted to verify that I was thinking of the upper bound properly.
Thanks!
Big-O notation does not give you an exact upper bound on a function's runtime - instead, it tells you asymptotically how the function's runtime grows. If a function has runtime O(n log n), it means that the function grows at roughly the same rate as the function f(n) = n log n. That means, for example, that the actual runtime could be 23 n log n + 17 n, or it could be 0.05 n log n. Consequently, you can't use the fact that heapsort is O(n log n) to count the number of comparisons made. You'd need a more precise analysis.
It just so happens that you can get a very precise analysis of heapsort, but it requires you to do a more meticulous analysis of the algorithm. You can show, for example, that the number of comparisons required to call make-heap is at most 3n, and that the number of comparisons made during the repeated calls to extract-max is at most 2n log (n + 1) (the binary heap has log (n + 1) layers, and during each of the n extract-max operations, at most two comparisons are made at each layer). This gives an overall number of comparisons upper-bounded by 2n log (n + 1) + 3n.
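One way to sanity-check this is to instrument a textbook heapsort with a comparison counter and print it next to the 2n log(n + 1) + 3n figure (a Python sketch; the exact count depends on implementation details):

import math
import random

comparisons = 0

def sift_down(a, start, end):
    # Restore the max-heap property for the subtree rooted at a[start], within a[0..end].
    global comparisons
    root = start
    while 2 * root + 1 <= end:
        child = 2 * root + 1
        if child + 1 <= end:
            comparisons += 1
            if a[child] < a[child + 1]:    # pick the larger child
                child += 1
        comparisons += 1
        if a[root] < a[child]:
            a[root], a[child] = a[child], a[root]
            root = child
        else:
            return

def heapsort(a):
    n = len(a)
    for start in range(n // 2 - 1, -1, -1):   # build the heap (make-heap)
        sift_down(a, start, n - 1)
    for end in range(n - 1, 0, -1):           # repeated extract-max
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end - 1)

n = 100
a = random.sample(range(10000), n)
heapsort(a)
print("comparisons made:       ", comparisons)
print("2n*log2(n+1) + 3n bound:", round(2 * n * math.log2(n + 1) + 3 * n))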
The famous Ω(n log n) sorting barrier can be used to get a matching lower bound. Any comparison-based sorting algorithm, of which heapsort is one, must make at least log n! = n log n - n + O(log n) (this is Stirling's approximation) comparisons on average, and so heapsort is required to make at least n log n - n comparisons in the worst-case. (Note that this is actually n log n, not some constant multiple of n log n. You can read over the proof of the Ω(n log n) barrier for why this is.)
Hope this helps!
Let's say that you know that your algorithm requires O( n log_2 n ) comparisons when sorting n elements.
This tells you the following, and only the following: there exists a constant C such that, for all sufficiently large n, the algorithm never requires more than C * n * log_2 n comparisons.
It does not tell you anything about the specific number of comparisons that might be required for any value of n -- it tells you about how the number of comparisons required grows in the limit as the number of elements grows.
You can not use the Big-O complexity of your sorting algorithm to prove anything about the behaviour of a particular finite n, such as 100 elements. Sorting 100 elements might require 64 comparisons, or 664, or 664 million. The latter is clearly not reasonable, but Big-O simply provides no information here.