A friend and I were discussing the following problem and had two differing opinions. Who is correct here and why?
The problem is as follows:
Consider a modification of the radix-sort algorithm in which we use quicksort to sort by each of the digits. Is this a valid sorting algorithm? If the answer is no, explain why. If the answer is yes, what is the best-case time complexity of the modification? Your answer should depend on d, the number of digits needed to represent each number in the array.
My friend believes that it is not a valid sorting algorithm because quicksort is unstable and radix sort is stable.
I believe it's possible (although pointless) and would have a best-case runtime of log d of n.
In the i-th pass, radix sort would place numbers in buckets based on the i-th digit of each number and so arrive at an order of the numbers by their i-th digit.
There are at least two versions of radix sort:
One where the digits are visited from least significant to most significant, each time (re)distributing them over the ten buckets and collecting them again in the order in which they appear from bucket 0 to bucket 9
One where the digits are visited from most significant to least significant, where -- if a bucket has more than one number -- the procedure is repeated separately for the numbers in that bucket, using the next digit.
The first variant only works correctly if the distribution into buckets is stable, i.e. numbers that arrive in the same bucket maintain their relative order with respect to the previous pass.
There is no such requirement for the second variant.
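To make the stability requirement concrete, here is a minimal Python sketch of the first (least-significant-digit-first) variant; the function name and example values are mine, not from the problem. Python's built-in sorted() is stable, so it can stand in for the stable bucket pass:

```python
# Minimal sketch of the first (LSD) variant, assuming non-negative base-10 integers.
# The per-digit sort must be stable; Python's sorted() is stable, so it qualifies.
# Function name and example values are illustrative only.

def lsd_radix_sort(nums, num_digits):
    for i in range(num_digits):                               # pass i, least significant digit first
        nums = sorted(nums, key=lambda x: (x // 10**i) % 10)  # stable sort by the i-th digit
    return nums

print(lsd_radix_sort([31, 32, 170, 45, 75], 3))               # [31, 32, 45, 75, 170]
```

Replacing sorted() here with an unstable sort by the same digit key is exactly the modification discussed below.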
Secondly, there exist quicksort variants that are stable, but if we assume that an unstable variant of quicksort is used to sort the numbers by their i-th digit, instead of using the buckets, then we can state the following:
If the first version of radix sort is used, then an unstable method for the sorting by the i-th digit will not be reliable, and can produce wrong results:
For instance, if we have the two-digit input 31, 32, then the first pass will sort the numbers by their final digit, which gives 31, 32. The second pass will sort the numbers by their first digit, and if the instability of the chosen comparison sort kicks in, this may produce 32, 31 as output, which obviously is not the desired result.
If the second version of radix sort is used, then even an unstable comparison-based sorting method can be used for the sorting by the i-th digit, and the results will be correct, but not necessarily stable.
The time complexity will suffer, as in the first pass all n numbers are sorted by a comparison-based sorting algorithm. Quicksort has an average (not worst-case) time complexity of O(n log n). So the first pass has an average time complexity of O(n log n), the second O(10((n/10)log(n/10))), then O(10²((n/10²)log(n/10²))), ... and so on for d passes.
Summing up gives O(n(log n + log(n/10) + ... + log(n/10^(d-1)))).
Let's say n = 1000 and for simplicity we choose the base of the log as 10; then the inner expression is 3+2+1+0 (or fewer non-zero terms if n is relatively small compared to 10^d). We can see that this is a triangular number, and so we can arrive at O(log²n) for that inner expression, giving a total average time complexity of O(n log²n).
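Spelling out that inner sum with base-10 logarithms (a sketch of the arithmetic; negative terms are treated as zero, since a pass over single-element buckets costs nothing):

$$\sum_{i=0}^{d-1}\log\frac{n}{10^{i}} \;\le\; \log n + (\log n - 1) + \cdots + 1 + 0 \;=\; \frac{\log n\,(\log n + 1)}{2} \;=\; O(\log^{2} n)$$

Multiplying by the factor of n in front gives the O(n log²n) total.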
How do the elements of an array get sorted by using counting sort two times?
Is it fixed that counting sort is used exactly two times?
I know it is related to radix sort (which uses counting sort as a subroutine), in which elements are sorted by considering one digit at a time.
For more detail: https://www.geeksforgeeks.org/sort-n-numbers-range-0-n2-1-linear-time/
(Please don't declare it as a duplicate; I have already checked posts related to this.)
How do the elements become two-digit numbers after converting them to base n? Can you tell me this, please?
Thanks in advance.
The trick described in the article is to consider the elements of the array as 2-digit numbers in base n, and then sort them using two passes of counting sort.
The idea is that sorting the numbers by the bottom digit (using a stable sorting algorithm), and then sorting the numbers by the higher digit (using the same stable sorting algorithm), results in the numbers being sorted. These two steps can be considered a form of radix sort, except that rather than sorting by bits, it sorts by digits in base n.
Given that the digits are all in the range 0..n-1, it's possible to use counting sort to do each of the two sorting steps in linear time. There are some fiddly details to get right so that the counting sort is stable, but those details (and code) are provided in the article.
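Here is a rough Python sketch of that idea (my own illustration, not the article's code): each value below n² is treated as a two-digit number in base n, and a stable counting sort is run once per digit.

```python
# Sketch: sort n values in the range 0 .. n^2 - 1 in O(n) by treating each value
# as a two-digit base-n number and counting-sorting by each digit (stable).
# Illustration of the idea only, not the code from the linked article.

def counting_sort_by_digit(arr, base, exp):
    count = [0] * base
    for v in arr:                            # O(n): tally the digit of each value
        count[(v // exp) % base] += 1
    for d in range(1, base):                 # O(base): prefix sums give final positions
        count[d] += count[d - 1]
    out = [0] * len(arr)
    for v in reversed(arr):                  # reversed scan keeps the sort stable
        d = (v // exp) % base
        count[d] -= 1
        out[count[d]] = v
    return out

def sort_range_n_squared(arr):
    n = len(arr)
    arr = counting_sort_by_digit(arr, n, 1)   # low digit: value mod n
    return counting_sort_by_digit(arr, n, n)  # high digit: value div n

print(sort_range_n_squared([40, 12, 45, 32, 33, 1, 22]))  # all values < 7^2 = 49
```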
I am trying to determine the running time in Big O of merge sort for:
(A) sorted input
(B) reverse-ordered input
(C) random input
My answer is that it would take O(n lgn) for all three scenarios, since regardless of the default order of the input, merge sort will always divide the input into the smallest unit of 1 element. Then it will compare each element with each element in the adjacent list to sort and merge the two adjacent lists. It will continue to do this until finally all the elements are sorted and merged.
That said, all we really need to find then is the Big O complexity of merge sort in general, since the worst, average, and best cases will all take the same time.
My question is, can somebody tell me if my answers are correct, and if so, explain why the Big O complexity of merge sort ends up being O(n lgn)?
The answer to this question depends on your implementation of Merge Sort.
When naively implemented, merge sort indeed uses O(n * log n) time, as it will always divide the input down to the smallest unit. However, there is a specific variant called Natural Merge Sort that exploits numbers that are already in the correct order in the input array: it essentially first looks at the given input and decides which parts actually need to be divided and later merged again.
Natural Merge Sort will only take O(n) time for an ordered input and in general be faster for a random input than for a reverse-ordered input. In the latter two cases, runtime will be O(n * log n).
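A rough Python sketch of the run-detection idea behind Natural Merge Sort (my own illustration, not a reference implementation): already-sorted runs are found first and then merged pairwise, so a fully sorted input is a single run and the O(n) scan is all the work there is.

```python
# Rough sketch of Natural Merge Sort: detect already-sorted runs, then merge
# them pairwise until one run remains. On sorted input the scan finds a single
# run and no merging is needed, giving O(n). Illustrative code only.

def find_runs(a):
    runs, start = [], 0
    for i in range(1, len(a)):
        if a[i] < a[i - 1]:                # a run ends where the order breaks
            runs.append(a[start:i])
            start = i
    runs.append(a[start:])
    return runs

def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def natural_merge_sort(a):
    runs = find_runs(list(a))
    while len(runs) > 1:                   # each round halves the number of runs
        runs = [merge(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0]

print(natural_merge_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]
print(natural_merge_sort([1, 2, 3, 4, 5]))   # a single run, so no merges at all
```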
To answer your last question, I'll look at the "normal" Mergesort; the explanation is easier that way.
Note that Mergesort can be visualized as a binary tree where in the root we have the whole input, on the next layer the two halves you get from dividing the input once, on the third layer we have four quarters and so on... On the last layer we finally have individual numbers.
Then note that the whole tree is O(log n) deep (this can also be proved mathematically). On each layer we have to make some comparisons and swaps on n numbers in total - this is because the total amount of numbers on a layer doesn't decrease when we go down the tree. For an input of 8 numbers, for example, we need to do comparisons and swaps on 8 numbers on each layer. The way Mergesort works, we'll actually have to do up to 8 comparisons and up to 8 swaps per layer. If we have an input of length n instead of 8, we'll need up to n comparisons and up to n swaps per layer (this is O(n)). We have O(log n) layers, so the whole runtime will be O(n * log n).
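The same layer-by-layer picture can be written as the usual recurrence (a sketch, assuming the input splits evenly): two half-size subproblems plus a linear-time merge.

$$T(n) = 2\,T\!\left(\frac{n}{2}\right) + \Theta(n), \qquad T(1) = \Theta(1) \quad\Longrightarrow\quad T(n) = \Theta(n \log n)$$

Each of the O(log n) layers of the tree contributes Θ(n) work, which is exactly the argument above.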
Consider two k-bit numbers (in binary representation):
$$A = A_1 A_2 A_3 A_4 ... A_k $$
$$B = B_1 B_2 B_3 B_4 ... B_k $$
To compare them, we scan from left to right looking for the first position where one number has a 0 and the other has a 1; the number holding the 0 in that position is the smaller one. But what if the numbers are:
111111111111
111111111110
clearly this will require scanning the whole number, and if we are told nothing about the numbers ahead of time and are simply given them, then:
Comparisons take $O(k)$ time.
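A small sketch of that scan, assuming the two numbers are given as equal-length bit strings (names are mine):

```python
# Sketch: comparing two k-bit numbers given as equal-length bit strings.
# In the worst case (e.g. '111111111111' vs '111111111110') we must scan all
# k positions before finding a difference, hence O(k) per comparison.

def less_than(a_bits: str, b_bits: str) -> bool:
    for a_bit, b_bit in zip(a_bits, b_bits):   # left to right, up to k steps
        if a_bit != b_bit:
            return a_bit < b_bit               # '0' < '1': the number holding the 0 is smaller
    return False                               # the numbers are equal

print(less_than("111111111110", "111111111111"))  # True, but only after scanning all 12 bits
```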
Therefore when we look at the code for a sorting method such as high-performance quick sort:
    HPQuicksort(list):                                               T(n)
        check if list is sorted: if so, return list
        compute the median                                           O(n) time (or technically: O(nk))
        create empty lists L1, L2, and L3                            O(1) time
        scan through the list                                        O(n)
            if the element is less than the median, place it in L1   O(k)
            if the element is greater, place it in L2                O(k)
            if the element is equal, place it in L3                  O(k)
        return concatenation of HPQuicksort(L1), L3, HPQuicksort(L2) 2 T(n/2)
Thus: T(n) = O(n) + O(nk) + 2T(n/2), which gives T(n) = O(nk log(n)).
Which means quicksort is slower than radix sort.
Why do we still use it then?
There seem to be two independent questions here:
1) Why do we claim that comparisons take time O(1) when analyzing sorting algorithms, when in reality they might not?
2) Why would we use quicksort on large integers instead of radix sort?
For (1), typically, the runtime analysis of sorting algorithms is measured in terms of the number of comparisons made rather than in terms of the total number of operations performed. For example, the famous sorting lower bound gives a lower bound in terms of number of comparisons, and the analyses of quicksort, heapsort, selection sort, etc. all work by counting comparisons. This is useful for a few reasons. First, typically, a sorting algorithm will be implemented by being given an array and some comparison function used to compare them (for example, C's qsort or Java's Arrays.sort). From the perspective of the sorting algorithm, this is a black box. Therefore, it makes sense to analyze the algorithm by trying to minimize the number of calls to the black box. Second, if we do perform our analyses of sorting algorithms by counting comparisons, it's easy to then determine the overall runtime by multiplying the number of comparisons by the cost of a comparison. For example, you correctly determined that sorting n k-bit integers will take expected time O(kn log n) using quicksort, since you can just multiply the number of comparisons by the cost of a comparison.
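As a toy illustration of treating the comparator as a black box and counting calls to it (my own example; the counter and names are not from any library):

```python
# Toy illustration: wrap the comparison function and count how many times the
# sorting routine calls it. Counter and names are illustrative only.

import functools
import random

calls = 0

def compare(a, b):
    global calls
    calls += 1                   # every comparison costs one call to the black box
    return (a > b) - (a < b)

data = random.sample(range(10_000), 1_000)
sorted(data, key=functools.cmp_to_key(compare))

# For n = 1000, expect very roughly n * log2(n) ≈ 10,000 comparisons; multiplying
# the count by the cost of one comparison (O(k) for k-bit keys) gives the total time.
print(calls)
```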
For your second question - why would we use quicksort on large integers instead of radix sort? - typically, you would actually use radix sort in this context, not quicksort, for the specific reason that you pointed out. Quicksort is a great sorting algorithm for sorting objects that can be compared to one another and has excellent performance, but radix sort frequently outperforms it on large arrays of large strings or integers.
Hope this helps!
First, I know the lower bound is O(nlogn) and how to prove it.
And I agree the lower bound should be O(nlogn).
What I don't quite understand is:
For some special cases, the # of comparisons could actually be even lower than the lower bound. For example, use bubble sort to sort an already sorted array. The # of comparisons is O(n).
So how to actually understand the idea of lower bound?
The classical definition on Wikipedia: http://en.wikipedia.org/wiki/Upper_and_lower_bounds does not help much.
My current understanding of this is:
lower bound of the comparison-based sorting is actually the upper bound for the worst case.
namely, how well you can do in the worst case.
Is this correct? Thanks.
lower bound of the comparison-based sorting is actually the upper bound for the best case.
No.
The function that you are bounding is the worst-case running time of the best possible sorting algorithm.
Imagine the following game:
We choose some number n.
You pick your favorite sorting algorithm.
After looking at your algorithm, I pick some input sequence of length n.
We run your algorithm on my input, and you give me a dollar for every executed instruction.
The O(n log n) upper bound means you can limit your cost to at most O(n log n) dollars, no matter what input sequence I choose.
The Ω(n log n) lower bound means that I can force you to pay at least Ω(n log n) dollars, no matter what sorting algorithm you choose.
Also: "The lower bound is O(n log n)" doesn't make any sense. O(f(n)) means "at most a constant times f(n)". But "lower bound" means "at least ...". So saying "a lower bound of O(n log n)" is exactly like saying "You can save up to 50% or more!" - it's completely meaningless! The correct notation for lower bounds is Ω(...).
The problem of sorting can be viewed as following.
Input: A sequence of n numbers a1, a2, ..., an.
Output: A permutation (reordering) a'1, a'2, ..., a'n of the input sequence such that a'1 <= a'2 <= ... <= a'n.
A sorting algorithm is comparison based if it uses comparison operators to find the order between two numbers. Comparison sorts can be viewed abstractly in terms of decision trees. A decision tree is a full binary tree that represents the comparisons between elements that are performed by a particular sorting algorithm operating on an input of a given size. The execution of the sorting algorithm corresponds to tracing a path from the root of the decision tree to a leaf. At each internal node, a comparison ai <= aj is made. The left subtree then dictates subsequent comparisons for ai <= aj, and the right subtree dictates subsequent comparisons for ai > aj. When we come to a leaf, the sorting algorithm has established the ordering. So we can say the following about the decision tree.
1) Each of the n! permutations on n elements must appear as one of the leaves of the decision tree for the sorting algorithm to sort properly.
2) Let x be the maximum number of comparisons in a sorting algorithm. The maximum height of the decision tree would be x. A tree with maximum height x has at most 2^x leaves.
After combining the above two facts, we get the following relation:
n! <= 2^x
Taking log on both sides:
log2(n!) <= x
Since log2(n!) = Θ(n log n), we can say
x = Ω(n log n)
Therefore, any comparison-based sorting algorithm must make Ω(n log n) comparisons to sort the input array, and Heapsort and merge sort are asymptotically optimal comparison sorts.
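The step from log2(n!) to Θ(n log n) follows from bounding n! on both sides (a quick sketch):

$$\left(\frac{n}{2}\right)^{n/2} \le n! \le n^{n} \quad\Longrightarrow\quad \frac{n}{2}\log_2\frac{n}{2} \;\le\; \log_2(n!) \;\le\; n\log_2 n \quad\Longrightarrow\quad \log_2(n!) = \Theta(n \log n)$$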
When you do asymptotic analysis you derive an O or Θ or Ω for all inputs.
But you can also make analysis on whether properties of the input affect the runtime.
For example, algorithms that take as input something almost sorted perform better than the formal asymptotic formula suggests, due to the input characteristics and the structure of the algorithm. Examples are bubble sort and quicksort.
It is not that you can go below the lower bounds; it is only the behavior of the implementation on specific inputs.
Imagine all the possible arrays of things that could be sorted. Let's say they are arrays of length n, and ignore stuff like arrays with one element (which, of course, are always already sorted).
Imagine a long list of all possible value combinations for that array. Notice that we can simplify this a bit, since the values in the array always have some sort of ordering. So if we replace the smallest one with the number 1, the next one with 1 or 2 (depending on whether it's equal or greater), and so forth, we end up with the same sorting problem as if we allowed any value at all. (This means an array of length n will need, at most, the numbers 1-n. Maybe less if some are equal.)
Then put a number beside each one telling how much work it takes to sort that array with those values in it. You could put several numbers. For example, you could put the number of comparisons it takes. Or you could put the number of element moves or swaps it takes. Whatever number you put there indicates how many operations it takes. You could put the sum of them.
One thing you have to do is ignore any special information. For example, you can't know ahead of time that the arrangement of values in the array is already sorted. Your algorithm has to do the same steps with that array as with any other. (But the first step could be to check if it's sorted. Usually that doesn't help in sorting, though.)
So. The largest number, measured by comparisons, is the typical number of comparisons when the values are arranged in a pathologically bad way. The smallest number, similarly, is the number of comparisons needed when the values are arranged in a really good way.
For a bubble sort, the best case (shortest or fastest) is if the values are in order already. But that's only if you use a flag to tell whether you swapped any values. In that best case, you look at each adjacent pair of elements one time, find they are already sorted, and when you get to the end, you find you haven't swapped anything, so you are done. That's n-1 comparisons total and forms the lowest number of comparisons you could ever do.
It would take me a while to figure out the worst case. I haven't looked at a bubble sort in decades. But I would guess it's a case where they are reverse ordered. You do the 1st comparison and find the 1st element needs to move. You slide it up towards the top, comparing it to each one, and finally swap it with the last element. So you did n-1 comparisons in that pass. The 2nd pass starts at the 2nd element and does n-2 comparisons, and so forth. So you do (n-1)+(n-2)+(n-3)+...+1 comparisons in this case, which is about (n**2)/2.
Maybe your variation on bubble sort is better than the one I described. No matter.
For bubble sort, then, the lower bound is n-1 comparisons and the upper bound is about (n**2)/2 comparisons.
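A quick Python sketch of the flagged bubble sort described above (illustrative, not tied to any particular variant), which makes both bounds easy to check:

```python
# Bubble sort with an early-exit flag, counting comparisons. On sorted input one
# pass of n-1 comparisons suffices; on reverse-ordered input the count grows to
# about n^2/2. Illustrative sketch of the description above.

def bubble_sort(a):
    a = list(a)
    comparisons = 0
    for end in range(len(a) - 1, 0, -1):
        swapped = False
        for i in range(end):
            comparisons += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:                  # no swaps in this pass: already sorted, stop early
            break
    return a, comparisons

print(bubble_sort(range(1, 9)))          # already sorted: 7 comparisons (n-1)
print(bubble_sort(range(8, 0, -1)))      # reverse ordered: 28 comparisons (about n**2/2)
```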
Other sort algorithms have better performance.
You might want to remember that there are other operations that cost besides comparisons. We use comparisons because much sorting is done with strings and a string comparison is costly in compute time.
You could count element swaps instead (or the sum of comparisons and element swaps), but swaps are typically cheaper than comparisons with strings. If you have numbers, they are similar in cost.
You could also use more esoteric things like branch prediction failures or memory cache misses for measuring.
I know that counting and radix sorts are generally considered to run in O(n) time, and I believe I understand why. I'm being asked in an assignment, however, to explain why these sorts may not necessarily sort a list of distinct, positive integers in O(n) time. I can't come up with any reason.
Any help is appreciated!
To say that counting or radix sort is O(n) is actually not correct.
Counting sort is O(n+k) where k is the maximum value of any element in the array.
The reason is that you have to step through the entire list to populate the counts (O(n)), then step through the counts (O(k)).
Radix sort is O(mn) where m is the maximum number of digits of a number.
The reason is that you have to step through the array once for each digit (O(n) m times).
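A short Python sketch of where those extra factors come from (illustrative only, not tied to a particular source): counting sort has one loop over the n elements and one over the k+1 possible values, and radix sort repeats a stable pass once per digit.

```python
# Sketch: the k in O(n + k) is the loop over the count array; the m in O(mn)
# is the number of digit passes. Illustrative code only.

def counting_sort(arr, k):
    count = [0] * (k + 1)
    for v in arr:                        # O(n): tally the values
        count[v] += 1
    out = []
    for value in range(k + 1):           # O(k): even empty counts are visited
        out.extend([value] * count[value])
    return out

def radix_sort(arr, base=10):
    m = len(str(max(arr)))               # digits needed for the largest value
    for i in range(m):                   # m passes, each a stable O(n + base) pass
        buckets = [[] for _ in range(base)]
        for v in arr:
            buckets[(v // base**i) % base].append(v)
        arr = [v for bucket in buckets for v in bucket]
    return arr

print(counting_sort([3, 1, 4, 1, 5], k=5))            # [1, 1, 3, 4, 5]
print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))  # [2, 24, 45, 66, 75, 90, 170, 802]
```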