Why is a list sort O(n log n)?

I have tried googling this but I am still confused. This is the very start of an online course, and we have not yet been introduced to concepts such as merge sort.
We are given the pseudocode below and told it takes n log n operations.
MaxPairwiseProductBySorting(A[1 . . . n]):
Sort(A)
return A[n − 1] · A[n]
I understand why something like the code below could take n^2 operations, but I am totally lost as to where the n log n in the former comes from.
MaxPairwiseProductNaive(A[1 . . . n]):
product ← 0
for i from 1 to n:
    for j from i + 1 to n:
        product ← max(product, A[i] · A[j])
return product

There are lots of ways to sort lists. Under certain conditions a list can be sorted as quickly as O(n), but generally it will take O(n log n). The exact analysis depends on the specific sort, but the gist is that most of these sorts work like this:
Break the problem of sorting the list into 2 smaller sorting problems
Repeat
... with some way of handling very small sorts.
The log(n) comes from repeatedly splitting the problem. The n comes from the fact that we have to sort all of the parts, which will total to n since we haven't gotten rid of anything.
It would help you to read up on a specific sort to understand this better. Mergesort & quicksort are two common sorts, and Wikipedia has good articles on both.
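For concreteness, here is a minimal top-down mergesort sketch in Python (just an illustration of the split/recurse/merge pattern described above, not code from the course):
def merge_sort(a):
    """Sort a list by splitting it in half, sorting each half, and merging."""
    if len(a) <= 1:                  # a list of 0 or 1 elements is already sorted
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])       # sort the first half
    right = merge_sort(a[mid:])      # sort the second half
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge: take the smaller front element
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]      # append whatever is left over

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]
There are about log2(n) levels of halving, and the merging at each level touches all n elements in total, which is where the n log n comes from.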

The assumption in this code is that you're dealing with a so-called comparison sort, which orders the elements of the array by comparing two of them at a time.
Now, for sorting n elements that way, we can draw a decision tree that captures every possible sequence of comparison outcomes, starting from an arbitrary input permutation and ending when the array is sorted. The minimum achievable height of that decision tree is then a lower bound for any comparison-sort algorithm.
E.g. for n = 3:
            ------------1:2------------
                <=               >
     ---2:3---                      ---2:3---
     <=      >                      <=      >
{1,2,3}      --1:3--           --1:3--      {3,2,1}
            <=     >          <=     >
         {1,3,2} {3,1,2}   {2,1,3} {2,3,1}
The decision tree must obviously contain all possible permutations of n values as leaves; otherwise there would exist input permutations that our algorithm couldn't sort properly. This means the tree must have at least n! leaves. Since a binary tree of height h has at most 2^h leaves, we have
n! <= #leaves <= 2^h
where h is the height of our tree. Taking the logarithm of both sides, we obtain
h >= lg(n!) = lg 1 + lg 2 + ... + lg n >= (n/2) lg(n/2)
So h = Omega(n lg n). Since h is the length of the longest path in the decision tree, it is also a lower bound on the number of comparisons in the worst case. So any comparison sort is Omega(n lg n).
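To get a feel for how lg(n!) compares to n lg n, here is a quick numerical illustration in Python (not part of the proof, just a sanity check):
import math

for n in (4, 16, 64, 128):
    lower_bound = math.log2(math.factorial(n))   # lg(n!), the decision-tree lower bound
    print(n, round(lower_bound), round(n * math.log2(n)))
# lg(n!) stays below n*lg(n) but grows at the same rate,
# which is exactly what Omega(n lg n) expresses.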

Related

Proof of Ω(n log k) worst-case complexity in a comparison sort algorithm

I'm writing a comparison algorithm that takes n numbers and a number k as input.
It separates the n numbers into k groups so that all numbers in group 1 are smaller than all numbers in group 2, ..., smaller than group k.
The numbers within the same group are not necessarily sorted.
I'm using selection(A[], left, right, k) to find the k-th element, which in my case is the (n/k)-th element (to divide the whole array into 2 pieces), and then I repeat for each piece, until the initial array is divided into k parts of n/k numbers each.
It has a complexity of Θ(n log k), as it is a tree of log k levels (depth) that costs at most cn operations per level. This is linear time if log k is considered a constant.
I am asked to prove that all comparison algorithms that sort an Array[n] into k groups in this way cost Ω(n log k) in the worst case.
I've searched around here, Google, and my algorithms book (Jon Kleinberg, Éva Tardos), but I only find proofs for comparison algorithms that sort ALL the elements. Those proofs are not applicable in my case because they rest on assumptions that do not match my problem, nor can they be adapted to it. (Also consider that regular quicksort with random pivot selection is Θ(n log n), which is not linear the way Ω(n log k) is.)
You can find the general algorithm proof here:
https://www.cs.cmu.edu/~avrim/451f11/lectures/lect0913.pdf
where it is also clearly explained why my problem does not belong to the comparison-sort case of O(n log n).
Sorting requires lg(n!) = Omega(n log n) comparisons because there are n! different output permutations.
For this problem there are
n! / ((n/k)!)^k
equivalence classes of output permutations, because the order within the k independent groups of n/k elements does not matter. We compute
lg( n! / ((n/k)!)^k ) = lg(n!) - k lg((n/k)!)
= n lg n - n lg e - k ((n/k) lg (n/k) - (n/k) lg e) ± O(lg n + k lg (n/k))
(write lg (...!) as a sum, bound with two integrals;
see https://en.wikipedia.org/wiki/Stirling's_approximation)
= n (lg n - lg (n/k)) ± O(lg n + k lg (n/k))
= n lg k ± O(lg n + k lg (n/k))
= Omega(n lg k).
(O(lg n + k lg (n/k)) = O(n), since k <= n)
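For intuition, here is a small numerical check of the counting argument in Python (illustrative only; the values of n and k are arbitrary and chosen so that k divides n):
import math

n, k = 120, 8                                      # arbitrary example with k dividing n
lg = math.log2
classes = lg(math.factorial(n)) - k * lg(math.factorial(n // k))   # lg(n!/((n/k)!)^k)
print(round(classes), round(n * lg(k)))            # both are on the order of n lg k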
prove that all comparison algorithms that sort an Array[n] into k groups in this way cost Ω(n log k) in the worst case.
I think the statement is false. If using quickselect with a poor pivot choice (such as always using first or last element), then the worst case is probably O(n^2).
Only some comparison algorithms will have a worst case of O(n log(k)). Using median of medians (the n/5 version) for the pivot solves quickselect's pivot issue. There are other algorithms that would also be O(n log(k)).
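For concreteness, here is one way to realise the split into k ordered groups with quickselect, sketched in Python: recursively halve the set of groups, so there are about log k levels of splitting. This version uses random pivots, so the O(n log k) bound holds in expectation; as noted above, a median-of-medians pivot would give a worst-case guarantee. The function names are mine, not from any textbook.
import random

def select(a, lo, hi, t):
    """Rearrange a[lo:hi] so that index t holds the element of rank t - lo,
    with a[lo:t] <= a[t] <= a[t+1:hi].  Randomised quickselect, expected O(hi - lo)."""
    while hi - lo > 1:
        pivot = a[random.randrange(lo, hi)]
        lt, i, gt = lo, lo, hi             # three-way partition: < pivot, == pivot, > pivot
        while i < gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                gt -= 1
                a[i], a[gt] = a[gt], a[i]
            else:
                i += 1
        if t < lt:
            hi = lt                        # target lies in the '< pivot' block
        elif t < gt:
            return                         # target landed in the '== pivot' block
        else:
            lo = gt                        # target lies in the '> pivot' block

def split_into_groups(a, lo, hi, k):
    """Rearrange a[lo:hi] into k consecutive blocks so that every element of
    block i is <= every element of block i+1; the blocks stay unsorted.
    About log k levels of splitting, expected O(n) work per level => O(n log k)."""
    if k <= 1:
        return
    left = k // 2
    cut = lo + (hi - lo) * left // k       # boundary after the first 'left' groups
    select(a, lo, hi, cut)                 # now a[lo:cut] <= a[cut:hi]
    split_into_groups(a, lo, cut, left)
    split_into_groups(a, cut, hi, k - left)

a = [9, 1, 8, 2, 7, 3, 6, 4, 5, 0, 11, 10]
split_into_groups(a, 0, len(a), 3)
print(a)   # first 4 elements are {0..3}, next 4 are {4..7}, last 4 are {8..11}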

What if we split the array in merge sort into 4 parts? Or eight parts?

I came across this question in one of Stanford's slides: what would be the effect on the complexity of merge sort if we split the array into 4 or 8 parts instead of 2?
It would be the same: O(n log n). You will have a shorter tree and the base of the logarithm will change, but that doesn't matter for big-oh, because a logarithm in base a differs from a logarithm in base b only by a constant factor:
log_a(x) = log_b(x) / log_b(a)
1 / log_b(a) = constant
And big-oh ignores constants.
You will still have to do O(n) work per tree level in order to merge the 4 or 8 or however many parts, which, combined with more recursive calls, might just make the whole thing even slower in practice.
In general, you can split your array into equal size subarrays of any size and then sort the subarrays recursively, and then use a min-heap to keep extracting the next smallest element from the collection of sorted subarrays. If the number of subarrays you break into is constant, then the execution time for each min-heap per operation is constant, so you arrive at the same O(n log n) time.
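A minimal sketch of that idea in Python (illustrative; the split factor k = 4 is arbitrary):
import heapq

def merge_sort_kway(a, k=4):
    """Mergesort that splits the list into k parts, sorts each recursively,
    and k-way merges them with a min-heap holding one element per part."""
    if len(a) <= 1:
        return a
    step = -(-len(a) // k)                       # ceiling of len(a) / k
    runs = [merge_sort_kway(a[i:i + step], k)
            for i in range(0, len(a), step)]
    heap = [(run[0], idx, 0) for idx, run in enumerate(runs)]
    heapq.heapify(heap)
    out = []
    while heap:                                  # pop the overall smallest, refill from its run
        value, run_idx, pos = heapq.heappop(heap)
        out.append(value)
        if pos + 1 < len(runs[run_idx]):
            heapq.heappush(heap, (runs[run_idx][pos + 1], run_idx, pos + 1))
    return out

print(merge_sort_kway([5, 3, 8, 1, 9, 2, 7, 4, 6]))   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
Since k is a constant, each heap operation costs O(log k) = O(1), and you still do O(n) heap operations per level over O(log_k n) levels, so the total stays O(n log n). (The standard library's heapq.merge performs this k-way merge step for you.)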
Intuitively it would be the same, as there is not much difference between splitting the array into two parts and then doing it again, and splitting it into 4 parts from the beginning.
A more formal proof by induction based on this (I'll assume the array is split into k parts):
Definitions:
Let T(N) be the number of array stores that mergesort performs on an input of size N.
Then the mergesort recurrence is T(N) = k·T(N/k) + N (for N > 1, with T(1) = 0).
Claim:
If T(N) satisfies the recurrence above, then T(N) = N·lg(N).
Note: all the logarithms below are in base k.
Proof:
Base case: N = 1, where T(1) = 0 = 1·lg(1).
Inductive hypothesis: T(N) = N·lg(N)
Goal: show that T(kN) = kN·lg(kN)
T(kN) = k·T(N) + kN              [mergesort recurrence]
      = kN·lg(N) + kN            [inductive hypothesis]
      = kN·lg(kN/k) + kN         [algebra]
      = kN·(lg(kN) - lg(k)) + kN [algebra]
      = kN·(lg(kN) - 1) + kN     [algebra: for base k, lg(k) = 1]
      = kN·lg(kN)                [QED]
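A quick sanity check of this recurrence in Python (illustrative only): evaluate T(N) directly and compare it with N·log_k(N) for powers of k.
import math

def T(N, k):
    """Mergesort recurrence T(N) = k*T(N/k) + N, with T(1) = 0."""
    if N <= 1:
        return 0
    return k * T(N // k, k) + N

for k in (2, 4, 8):
    for p in range(1, 6):
        N = k ** p
        assert T(N, k) == N * round(math.log(N, k))
    print(f"k = {k}: T(N) = N*log_{k}(N) for N = k^1 .. k^5")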

Prove 3-Way Quicksort Big-O Bound

For 3-way Quicksort (dual-pivot quicksort), how would I go about finding the Big-O bound? Could anyone show me how to derive it?
There's a subtle difference between finding the complexity of an algorithm and proving it.
To find the complexity of this algorithm, you can do as amit said in the other answer: you know that on average you split your problem of size n into three smaller problems of size n/3, so you will get, in log_3(n) steps on average, to problems of size 1. With experience, you will start getting a feeling for this approach and be able to deduce the complexity of algorithms just by thinking about them in terms of the subproblems involved.
To prove that this algorithm runs in O(n log n) in the average case, you can use the Master Theorem. To use it, you have to write the recurrence giving the time spent sorting your array. As we said, sorting an array of size n can be decomposed into sorting three arrays of size n/3, plus the time spent building them. This can be written as follows:
T(n) = 3T(n/3) + f(n)
Where T(n) is a function giving the resolution "time" for an input of size n (actually the number of elementary operations needed), and f(n) gives the "time" needed to split the problem into subproblems.
For 3-way quicksort, f(n) = c·n, because you go through the array, check where to place each item and possibly make a swap. This places us in Case 2 of the Master Theorem, which states that if f(n) = Θ(n^(log_b(a)) · log^k(n)) for some k >= 0 (in our case k = 0), then
T(n) = Θ(n^(log_b(a)) · log^(k+1)(n))
As a = 3 and b = 3 (we get these from the recurrence relation, T(n) = aT(n/b)), this simplifies to
T(n) = O(n log n)
And that's a proof.
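For reference, here is a minimal (not in-place) sketch in Python of a quicksort that splits into three sublists around two pivots, the kind of 3-way split discussed above; choosing the first and last elements as pivots is just for illustration:
def dual_pivot_quicksort(a):
    """Quicksort that partitions around two pivots p <= q into three sublists:
    elements < p, elements between p and q, and elements > q."""
    if len(a) <= 1:
        return a
    p, q = min(a[0], a[-1]), max(a[0], a[-1])
    lows  = [x for x in a[1:-1] if x < p]
    mids  = [x for x in a[1:-1] if p <= x <= q]
    highs = [x for x in a[1:-1] if x > q]
    return (dual_pivot_quicksort(lows) + [p]
            + dual_pivot_quicksort(mids) + [q]
            + dual_pivot_quicksort(highs))

print(dual_pivot_quicksort([5, 1, 4, 2, 8, 5, 7]))   # [1, 2, 4, 5, 5, 7, 8]
The partitioning pass is the f(n) = c·n term: each element is compared against the two pivots once per level, and on average the three sublists shrink by a constant factor, which is where T(n) = 3T(n/3) + c·n comes from.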
Well, the same proof actually holds.
Each iteration splits the array into 3 sublists, on average the size of these sublists is n/3 each.
Thus the number of iterations needed is log_3(n), because you need to find how many times you can do (((n/3)/3)/3)... until you get down to one. This gives you the formula:
n/(3^i) = 1
Which is satisfied for i = log_3(n).
Each iteration is still going over all the input (but in a different sublist) - same as quicksort, which gives you O(n*log_3(n)).
Since log_3(n) = log(n)/log(3) = log(n) * CONSTANT, you get that the run time is O(nlogn) on average.
Note that even if you take a more pessimistic approach to the sizes of the sublists, by taking the minimum and maximum of the uniform distribution - which gives a first sublist of size 1/4, a second sublist of size 1/2, and a last sublist of size 1/4 - the recursion still bottoms out after log_k(n) iterations (for some other constant base k), which again yields O(n log n) overall.
Formally, the proof will be something like:
Each iteration takes at most c_1·n ops to run, for each n > N_1, for some constants c_1, N_1. (This is the definition of big O notation, together with the claim that each iteration is O(n) excluding the recursion. Convince yourself why this is true. Note that here, "iteration" means all the work done by the algorithm at a certain "level" of the recursion, and not in a single recursive invocation.)
As seen above, you have log_3(n) = log(n)/log(3) iterations on average case (taking the optimistic version here, same principles for pessimistic can be used)
Now, we get that the running time T(n) of the algorithm is:
for each n > N_1:
T(n) <= c_1 * n * log(n)/log(3)
T(n) <= c_1 * nlogn
By definition of big O notation, it means T(n) is in O(nlogn) with M = c_1 and N = N_1.
QED

Number of Comparisons in Merge-Sort

I was studying merge sort when I ran into this concept: the number of comparisons in merge sort (in the worst case, and according to Wikipedia) equals (n⌈lg n⌉ - 2^⌈lg n⌉ + 1); in fact, it's between (n lg n - n + 1) and (n lg n + n + O(lg n)). The problem is that I cannot figure out what these counts are trying to tell me. I know O(n log n) is the complexity of merge sort, but the number of comparisons?
Why to count comparisons
There are basically two operations to any sorting algorithm: comparing data and moving data. In many cases, comparing will be more expensive than moving. Think about long strings in a reference-based typing system: moving data will simply exchange pointers, but comparing might require iterating over a large common part of the strings before the first difference is found. So in this sense, comparison might well be the operation to focus on.
Why an exact count
The numbers appear to be more detailed: instead of simply giving some Landau symbol (big-Oh notation) for the complexity, you get an actual number. Once you have decided what a basic operation is, like a comparison in this case, this approach of actually counting operations becomes feasible. This is particularly important when comparing the constants hidden by the Landau symbol, or when examining the non-asymptotic case of small inputs.
Why this exact count formula
Note that throughout this discussion, lg denotes the logarithm with base 2. When you merge-sort n elements, you have ⌈lg n⌉ levels of merges. Assume you place ⌈lg n⌉ coins on each element to be sorted, and that each element pays one coin for every merge it takes part in. This will certainly be enough to pay for all the comparisons, as each element will be included in ⌈lg n⌉ merges, and a merge never takes more comparisons than the number of elements involved. So this is the n⌈lg n⌉ from your formula.
As a merge of two arrays of length m and n takes at most m + n − 1 comparisons, you still have coins left at the end, one from each merge. Let us for the moment assume that all our array lengths are powers of two, i.e. that you always have m = n. Then the total number of merges is n − 1 (sum of powers of two). Using the fact that n is a power of two, this can also be written as 2^⌈lg n⌉ − 1, and subtracting that number of returned coins from the number of all coins yields n⌈lg n⌉ − 2^⌈lg n⌉ + 1 as required.
If n is 1 less than a power of two, then there are ⌈lg n⌉ merges where one element fewer is involved. This includes a merge of two one-element lists which used to take one coin and which now disappears altogether. So the total cost reduces by ⌈lg n⌉, which is exactly the number of coins you'd have placed on the last element if n were a power of two. So you have to place fewer coins up front, but you get back the same number of coins. This is the reason why the formula has 2^⌈lg n⌉ instead of n: the value remains the same unless you drop to a smaller power of two. The same argument holds if the difference between n and the next power of two is greater than 1.
On the whole, this results in the formula given in Wikipedia:
n⌈lg n⌉ − 2^⌈lg n⌉ + 1
Note: I'm pretty happy with the above proof. For those who like my formulation, feel free to distribute it, but don't forget to attribute it to me as the license requires.
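As a sanity check of that closed form (illustrative only, assuming the usual top-down split into halves): the worst-case comparison count satisfies C(n) = C(⌈n/2⌉) + C(⌊n/2⌋) + n − 1 with C(1) = 0, and evaluating that recurrence matches the formula.
import math

def worst_case_comparisons(n):
    """Worst-case comparisons of top-down mergesort:
    merging the two halves costs at most n - 1 comparisons."""
    if n <= 1:
        return 0
    return worst_case_comparisons(n // 2) + worst_case_comparisons(n - n // 2) + n - 1

for n in range(1, 1025):
    formula = n * math.ceil(math.log2(n)) - 2 ** math.ceil(math.log2(n)) + 1
    assert worst_case_comparisons(n) == formula
print("n*ceil(lg n) - 2^ceil(lg n) + 1 matches the recurrence for n = 1 .. 1024")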
Why this lower bound
To prove the lower-bound formula, let's write ⌈lg n⌉ = lg n + d with 0 ≤ d < 1. Now the formula above can be written as
n(lg n + d) − 2^(lg n + d) + 1 =
n lg n + nd − n·2^d + 1 =
n lg n − n(2^d − d) + 1 ≥
n lg n − n + 1
where the inequality holds because 2^d − d ≤ 1 for 0 ≤ d < 1.
Why this upper bound
I must confess, I'm rather confused why anyone would name n lg n + n + O(lg n) as an upper bound. Even if you wanted to avoid the floor function, the computation above suggests something like n lg n − 0.9n + 1 as a much tighter upper bound for the exact formula. 2^d − d has its minimum (ln(ln(2)) + 1)/ln(2) ≈ 0.914 at d = −ln(ln(2))/ln(2) ≈ 0.529.
I can only guess that the quoted formula occurs in some publication, either as a rather loose bound for this algorithm, or as the exact number of comparisons for some other algorithm which is compared against this one.
(Two different counts)
This issue has been resolved by the comment below; one formula was originally quoted incorrectly.
equals (n lg n - n + 1); in fact it's between (n lg n - n + 1) and (n lg n + n + O(lg n))
If the first part is true, the second is trivially true as well, but explicitly stating the upper bound seems kind of pointless. I haven't looked at the details myself, but these two statements appear strange when taken together like this. Either the first one really is true, in which case I'd omit the second one as it is only confusing, or the second one is true, in which case the first one is wrong and should be omitted.

Lower bounds on comparison sorts for a small fraction of inputs?

Can someone please walk me through the mathematical part of the solution to the following problem.
Show that there is no comparison sort whose running time is linear for at least half
of the n! inputs of length n. What about a fraction of 1/n of the inputs of length n?
What about a fraction of 1/2^n?
Solution:
If the sort runs in linear time for m input permutations, then the height h of the
portion of the decision tree consisting of the m corresponding leaves and their
ancestors is linear.
Use the same argument as in the proof of Theorem 8.1 to show that this is impossible
for m = n!/2, n!/n, or n!/2^n.
We have 2^h ≥ m, which gives us h ≥ lg m. For all the possible values of m given here,
lg m = Ω(n lg n), hence h = Ω(n lg n).
In particular,
lg(n!/2)   = lg(n!) − 1    ≥ n lg n − n lg e − 1
lg(n!/n)   = lg(n!) − lg n ≥ n lg n − n lg e − lg n
lg(n!/2^n) = lg(n!) − n    ≥ n lg n − n lg e − n
Each of these proofs are a straightforward modification of the more general proof that you can't have a comparison sort that sorts any faster than Ω(n log n) (you can see this proof in this earlier answer). Intuitively, the argument goes as follows. In order for a sorting algorithm to work correctly, it has to be able to determine what the initial ordering of the elements is. Otherwise, it can't reorder the values to put them in ascending order. Given n elements, there are n! different permutations of those elements, meaning that there are n! different inputs to the sorting algorithm.
Initially, the algorithm knows nothing about the input sequence, and it can't distinguish between any of the n! different permutations. Every time the algorithm makes a comparison, it gains a bit more information about how the elements are ordered. Specifically, it can tell whether the input permutation is in the group of permutations where the comparison yields true or in the group of permutations where the comparison yields false. You can visualize how the algorithm works as a binary tree, where each node corresponds to some state of the algorithm, and the (up to) two children of a particular node indicate the states of the algorithm that would be entered if the comparison yields true or false.
In order for the sorting algorithm to be able to sort correctly, it has to be able to enter a unique state for each possible input, since otherwise the algorithm couldn't distinguish between two different input sequences and would therefore sort at least one of them incorrectly. This means that if you consider the number of leaf nodes in the tree (parts where the algorithm has finished comparing and is going to sort), there must be at least one leaf node per input permutation. In the general proof, there are n! permutations, so there must be at least n! leaf nodes. In a binary tree, the only way to have k leaf nodes is to have height at least Ω(log k), meaning that you have to do at least Ω(log k) comparisons. Thus the general sorting lower bound is Ω(log n!) = Ω(n log n) by Stirling's approximation.
In the cases that you're considering, we're restricting ourselves to a subset of those possible permutations. For example, suppose that we want to be able to sort n!/2 of the permutations. This means that our tree must have height at least lg(n!/2) = lg n! − 1 = Ω(n log n). As a result, you can't sort in time O(n), because no linear function grows at the rate Ω(n log n). For the second part, seeing if you can get n!/n sorted in linear time, again the decision tree would have to have height lg(n!/n) = lg n! − lg n = Ω(n log n), so you can't sort in O(n) comparisons. For the final one, we have that lg(n!/2^n) = lg n! − n = Ω(n log n) as well, so again it can't be sorted in O(n) time.
However, you can sort 2^n permutations in linear time, since lg(2^n) = n = O(n).
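To make those bounds concrete, a small numerical illustration in Python (not part of the textbook solution; n = 64 is arbitrary):
import math

n = 64
lg_fact = math.log2(math.factorial(n))     # lg(n!)
print(round(lg_fact - 1))                  # lg(n!/2)   = lg(n!) - 1
print(round(lg_fact - math.log2(n)))       # lg(n!/n)   = lg(n!) - lg(n)
print(round(lg_fact - n))                  # lg(n!/2^n) = lg(n!) - n
print(round(n * math.log2(n)))             # n lg n: all three values above are of this order
print(n)                                   # lg(2^n) = n: the only one that is linear in n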
Hope this helps!
