A basic confusion on quicksort - algorithm

Suppose we choose the pivot as the first element of the array in quicksort. Then the best/worst case complexity is O(n^2), whereas the average case is O(nlogn). Isn't that weird (the best case complexity being greater than the average case complexity)?

The best case complexity is O(nlogn), the same as the average case. The worst case is O(n^2). Check http://en.wikipedia.org/wiki/Quick_sort.
While other algorithms like Merge Sort and Heap Sort have a better worst case complexity (O(nlogn)), Quick Sort is usually faster in practice - this is why it is the most commonly used sorting algorithm. An interesting answer about this can be found at Why is quicksort better than mergesort?.

The best case of quicksort, O(nlogn), occurs when the chosen pivot splits the subarray into two roughly equally sized parts in every iteration.
The worst case of quicksort occurs when the chosen pivot is the smallest element in the subarray, so that the subarray is split into two parts, one consisting of just the pivot and the other containing all the remaining elements.
So choosing the first element as the pivot in an already sorted array will get you O(n^2). ;)
Therefore it is important to choose a good pivot, for example by using the median of the first, middle and last elements of the subarray as the pivot.
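For illustration, here is a minimal Python sketch of the median-of-three idea (the function name and layout are mine, not from the answer above); a real quicksort would combine this with a partition step.

def median_of_three(a, lo, hi):
    # Return the index of the median of a[lo], a[mid], a[hi].
    # Sorting the three candidates in place leaves the median at mid,
    # which can then be used as the pivot.
    mid = (lo + hi) // 2
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    return mid  # a[mid] now holds the median of the three candidates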

Related

Time complexity while choosing various pivots in quick sort for sorted, reverse sorted, and repeated elements array

I have to solve the following question in an assignment:
Calculate the time and space complexities of the Quick Sort for following input. Also, discuss the method of calculating the complexity.
(a) When input array is already sorted.
(b) When input array is reverse sorted.
(c) When all the elements in the input array are the same.
I am having trouble calculating the time complexities of the different cases. The following table shows my answers for each pivot choice and input case.
TIME     | First  | Middle    | Last
Sorted   | O(n^2) | O(n*logn) | O(n^2)
Same     | O(n^2) | O(n^2)    | O(n^2)
Reverse  | O(n^2) | O(n*logn) | O(n^2)
Are these right? If not, what am I doing wrong?
When all elements are the same, the classic Lomuto partition scheme hits its worst case, O(n^2), while the Hoare partition scheme with the middle value as pivot achieves its best case, O(n log(n)).
https://en.wikipedia.org/wiki/Quicksort#Repeated_elements
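For reference, here is a small Python sketch of the Hoare scheme with the middle element as pivot (names and layout are mine, based on the description linked above). Because both scans stop on elements equal to the pivot, equal keys end up spread across both sides and the splits stay roughly balanced, which is why it behaves well on arrays of identical elements.

def hoare_partition(a, lo, hi):
    # Pivot on the middle value; return an index j such that
    # every element of a[lo..j] is <= every element of a[j+1..hi].
    pivot = a[(lo + hi) // 2]
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]

def quicksort(a, lo, hi):
    if lo < hi:
        p = hoare_partition(a, lo, hi)
        quicksort(a, lo, p)      # with Hoare, the left call includes index p
        quicksort(a, p + 1, hi)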

Finding the ratio of objects to keys for which Counting Sort's worst case is faster than Quicksort's worst case

I have a question which I am trying to solve for my own understanding of how to compare algorithms. I am given n as the number of objects and m as the number of keys. I want to find the ratios m/n for which CountingSort is faster in the worst case than QuickSort, when quicksort chooses the last element as the pivot.
So I have that the worst case running time for CountingSort is O(n+m), and the worst case when quicksort chooses the last element as the pivot is Theta(n^2). I'm confused about how to approach this question and would welcome some guidance so I could reach a solution.
My idea is that it will be when the number of keys is double the number of objects? We have that the worst case runtime of quicksort is O(n^2), so we want counting sort to be less than this. Is that the case if we have ratios of m/n < m^2/n?
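One way to frame it, treating the worst-case bounds as if they were exact operation counts (a simplification, since the hidden constants matter in practice):

n + m < n^2
m < n^2 - n = n(n - 1)
m/n < n - 1

So under that simplification, counting sort wins in the worst case whenever there are fewer than roughly n - 1 keys per object, which is a much weaker condition than "keys double the objects". The exact crossover point depends on the constants hidden in the O()/Theta() bounds.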

O(nlogn) in-place sorting algorithm

This question was in the preparation exam for my midterm in introduction to computer science.
There exists an algorithm which can find the kth element in a list in
O(n) time, and suppose that it is in place. Using this algorithm,
write an in place sorting algorithm that runs in worst case time
O(n*log(n)), and prove that it does. Given that this algorithm exists,
why is mergesort still used?
I assume I must write some alternate form of the quicksort algorithm, which has a worst case of O(n^2), since merge sort is not an in-place algorithm. What confuses me is the given algorithm to find the kth element in a list. Isn't a simple loop iteration through the elements of an array already an O(n) algorithm?
How can the provided algorithm make any difference to the running time of the sorting algorithm if it does not change anything in the execution time? I don't see how, used with either quicksort, insertion sort or selection sort, it could lower the worst case to O(nlogn). Any input is appreciated!
Check wiki, namely the "Selection by sorting" section:
Similarly, given a median-selection algorithm or general selection algorithm applied to find the median, one can use it as a pivot strategy in Quicksort, obtaining a sorting algorithm. If the selection algorithm is optimal, meaning O(n), then the resulting sorting algorithm is optimal, meaning O(n log n). The median is the best pivot for sorting, as it evenly divides the data, and thus guarantees optimal sorting, assuming the selection algorithm is optimal. A sorting analog to median of medians exists, using the pivot strategy (approximate median) in Quicksort, and similarly yields an optimal Quicksort.
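A minimal Python sketch of that idea (all names are mine). The select helper below just sorts a copy, so it is NOT the O(n) algorithm the quote assumes; it only stands in for one (e.g. median of medians) so the sorting structure is runnable:

def select(a, lo, hi, k):
    # Placeholder: the value that would sit at index k if a[lo..hi] were sorted.
    # A real implementation (e.g. median of medians) would do this in O(n).
    return sorted(a[lo:hi + 1])[k - lo]

def median_pivot_quicksort(a, lo=0, hi=None):
    # Pivot on the exact median of the current range, so every level splits
    # the range roughly in half: O(log n) levels of O(n) work each,
    # i.e. worst-case O(n log n) if select really were O(n).
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    median = select(a, lo, hi, (lo + hi) // 2)
    # Three-way partition around the median value.
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if a[i] < median:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > median:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    median_pivot_quicksort(a, lo, lt - 1)
    median_pivot_quicksort(a, gt + 1, hi)

The three-way partition keeps the algorithm in place and excludes everything equal to the median from both recursive calls, so repeated elements are handled gracefully as well.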
The short answer to why mergesort is preferred over quicksort in some cases is that it is stable (while quicksort is not).
Reasons for merge sort: Merge sort is stable. Merge sort does more moves but fewer compares than quick sort. If the compare overhead is greater than the move overhead, then merge sort is faster. One situation where compare overhead may be greater is sorting an array of indices or pointers to objects, like strings.
If sorting a linked list, then merge sort using an array of pointers to the first nodes of working lists is the fastest method I'm aware of. This is how the HP / Microsoft std::list::sort() is implemented: in the array of pointers, array[i] is either NULL or points to a list of length pow(2,i) (except that the last pointer points to a list of unlimited length).
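For concreteness, here is a small Python sketch of that array-of-lists scheme (an illustrative reconstruction of the idea, not the actual library code; names are mine):

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def merge(a, b):
    # Merge two sorted singly linked lists; on ties, take from a first.
    dummy = tail = Node(None)
    while a and b:
        if b.value < a.value:
            tail.next, b = b, b.next
        else:
            tail.next, a = a, a.next
        tail = tail.next
    tail.next = a if a else b
    return dummy.next

def list_merge_sort(head, slots=32):
    # array[i] is either None or holds a sorted run of length 2^i
    # (the last slot may hold an arbitrarily long run).
    array = [None] * slots
    while head:
        node, head = head, head.next
        node.next = None
        run, i = node, 0
        while i < slots - 1 and array[i] is not None:
            run = merge(array[i], run)
            array[i] = None
            i += 1
        if array[i] is not None:          # only possible in the last slot
            run = merge(array[i], run)
        array[i] = run
    result = None
    for run in array:                      # merge the remaining runs together
        if run is not None:
            result = merge(run, result) if result else run
    return result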
I found the solution (assuming the O(n) selection algorithm is used to pick the median as the pivot, so each call splits the array roughly in half):
if(start>stop) 2 op.
pivot<-partition(A, start, stop) 2 op. + n
quickSort(A, start, pivot-1) 2 op. + T(n/2)
quickSort(A, pivot+1, stop) 2 op. + T(n/2)
T(n) = 8 + 2T(n/2) + n                          k=1
     = 8 + 2(8 + 2T(n/4) + n/2) + n
     = 24 + 4T(n/4) + 2n                        k=2
     ...
     = (2^k - 1)*8 + 2^k*T(n/2^k) + k*n
Recursion finishes when n = 2^k <==> k = log2(n)
T(n) = (2^(log2(n)) - 1)*8 + 2^(log2(n))*T(1) + log2(n)*n
     = 8n - 8 + 2n + n*log2(n)                  (with T(1) = 2)
     = 10n + n*log2(n) - 8
     = n*(10 + log2(n)) - 8
which is O(nlogn)
Quicksort has worst case O(n^2), but that only occurs if you have bad luck when choosing the pivot. If you can select the kth element in O(n), that means you can choose a good pivot by doing O(n) extra steps. That yields a worst-case O(nlogn) algorithm. There are a couple of reasons why mergesort is still used. First, this selection algorithm is more or less cumbersome to implement in place, and it also adds several extra operations to the regular quicksort, so this variant is not as fast as one might expect.
Nevertheless, MergeSort does not survive because of its worst-case time complexity: HeapSort achieves the same worst-case bound and is also in place, yet it did not replace MergeSort, and it has other disadvantages against quicksort too. The main reason MergeSort survives is that it is the fastest stable sorting algorithm known so far. There are several applications in which it is paramount to have a stable sorting algorithm, and that is the strength of MergeSort.
A stable sort is one in which equal items preserve their original relative order. For example, this is very useful when you have two keys and you want to sort by one key and then by the other, with ties preserving the earlier ordering.
The problem with HeapSort compared to quicksort is that it is cache inefficient, since it swaps/compares elements that are far apart in the array, while quicksort compares consecutive elements, which are more likely to be in the cache at the same time.

What is the cut-off point where "quick sort" goes from nlgn to n^2?

I know the worst case of the algorithm - which is when the elements are already sorted or when all the elements are the same - but I want to know the point at which the algorithm moves from a complexity of nlgn to n^2.
It depends on how we choose the pivot.
One common answer is: when the elements are already sorted. Well, that is not 100% right. In that situation, if we choose the first element as the pivot, the complexity becomes N^2.
In that case we have the recurrence
T(N) = T(N-1) + cN for N > 1, and solving it gives:
T(N) = O(N^2)
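Spelling out that step, repeatedly substituting the recurrence into itself gives:

T(N) = T(N-1) + cN
     = T(N-2) + c(N-1) + cN
     = ...
     = T(1) + c(2 + 3 + ... + N)
     = T(1) + c(N(N+1)/2 - 1)

which is O(N^2).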
As mentioned above, it depends on how we choose the pivot. Although some textbooks mainly choose the first element as the pivot, that is not recommended.
One popular method is median-of-three partitioning: choose the median of a[left], a[right] and a[(left+right)/2] as the pivot.
It will perform worst, i.e. O(n^2), in the following cases:
If the list is already sorted and the pivot is the first element.
If the list is sorted in reverse order and the pivot is the last element.
If all elements in the list are the same. In this case the pivot selection does not matter.
Note: an already sorted list cannot be the worst case if the pivot is selected as the median.
The worst case time for quick sort occurs when the chosen pivot does not divide the array. For example, if we choose the first element as the pivot every time and the array is already sorted, then the array is not divided at all. Hence the complexity is O(n^2).
To avoid this we randomize the index for the pivot. Assuming that the pivot splits the array in two equal sized parts we have a complexity of O(n log n).
For exact analysis see Formal analysis section in https://en.wikipedia.org/wiki/Quicksort
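A minimal Python sketch of that randomized-pivot variant (Lomuto partition; names are mine). Note that with many equal keys Lomuto still degrades to O(n^2), as discussed earlier; the random pivot only removes the dependence on any fixed input order:

import random

def randomized_quicksort(a, lo=0, hi=None):
    # Expected O(n log n) on distinct keys: no fixed input (sorted,
    # reverse sorted, ...) is reliably the worst case.
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = random.randint(lo, hi)        # pick a random pivot index
    a[p], a[hi] = a[hi], a[p]         # move the pivot to the end
    pivot, i = a[hi], lo
    for j in range(lo, hi):           # Lomuto partition
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]         # place the pivot at its final index
    randomized_quicksort(a, lo, i - 1)
    randomized_quicksort(a, i + 1, hi)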

Sorting algorithm best case/worst case scenario

This is a practice exam question I'm working on. I have a general idea of what the answer is but would like some clarification.
The following is a sorting algorithm for n integers in an array. In step 1, you iterate through the array, and compare each pair of adjacent integers and swap each pair if they are in the wrong order. In step 2, you repeat step 1 as many times as necessary until there is an iteration where no swaps are made (in which case the list is sorted and you can stop).
What is the worst case complexity of this algorithm?
What is the best case complexity of this algorithm?
Basically the algorithm presented here is a bubble sort.
The worst case complexity here is O(n^2).
The best case complexity is O(n).
Here is the explanation:
The best case situation here is an already sorted array: all you need is n comparisons (to be precise, n-1), so the complexity is O(n).
The worst case situation is a reverse ordered array.
To better understand why it's O(n^2), consider the smallest element of a reverse ordered array, which starts at the last index. To sort the array, that element has to reach the first index, but each pass of the algorithm moves it only one position toward the front, and each pass takes O(n) comparisons. So on the order of n passes of O(n) comparisons each are required, hence O(n^2) comparisons in total.
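For concreteness, a short Python sketch of the algorithm described in the question (bubble sort with the "stop when a pass makes no swaps" rule from step 2):

def bubble_sort(a):
    n = len(a)
    while True:
        swapped = False
        for i in range(n - 1):             # step 1: one pass over adjacent pairs
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:                    # step 2: no swaps -> sorted, stop
            return a

On an already sorted array this makes exactly one pass of n-1 comparisons (best case O(n)); on a reverse sorted array it needs on the order of n passes of n-1 comparisons each (worst case O(n^2)).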
In the best case, no swapping is required and a single pass over the array suffices. So the complexity is O(n).
In the worst case, the elements of the array are in reverse order. So the first pass requires (n-1) swaps, the next one (n-2), and so on...
So it leads to O(n^2) complexity.
As others have said, this is bubble sort. But if you are measuring complexity in terms of comparisons, you can easily be more precise than big-O.
In the best case, you need only compare n-1 pairs to verify they're all in the right order.
In the worst case, the last element is the one that should be in the first position, so n-1 passes are needed, each advancing that element one more position toward the front of the list. Each pass requires n-1 comparisons. In all, then, (n-1)^2 comparisons are needed.
