I've noticed a discrepancy in the way quicksort is called recursively.
One way is
quicksort(Array, left, right)
x = partition(Array, left, right)
quicksort(Array, left, x-1)
quicksort(Array, x+1, right)
partition(array, left, right)
pivotIndex := choose-pivot(array, left, right)
pivotValue := array[pivotIndex]
swap array[pivotIndex] and array[right]
storeIndex := left
for i from left to right - 1
if array[i] ≤ pivotValue
swap array[i] and array[storeIndex]
storeIndex := storeIndex + 1
swap array[storeIndex] and array[right] // Move pivot to its final place
return storeIndex
[EXAMPLE]
This makes sense: quicksort works by partitioning the other elements around the pivot, so the element Array[x] is already in its final position. Therefore only the ranges [left, x-1] and [x+1, right] remain to be sorted.
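This first scheme is essentially Lomuto's partition. A minimal runnable Python sketch of it (my own transcription of the pseudocode above; choose-pivot is assumed to have already placed the chosen pivot at array[right]):

```python
def partition(array, left, right):
    # Lomuto-style partition; the pivot is assumed to sit at array[right]
    pivot_value = array[right]
    store_index = left
    for i in range(left, right):
        if array[i] <= pivot_value:
            array[i], array[store_index] = array[store_index], array[i]
            store_index += 1
    # Move pivot to its final place
    array[store_index], array[right] = array[right], array[store_index]
    return store_index

def quicksort(array, left, right):
    if left < right:
        x = partition(array, left, right)
        quicksort(array, left, x - 1)   # array[x] is already in its final position
        quicksort(array, x + 1, right)

data = [5, 2, 9, 1, 5, 6]
quicksort(data, 0, len(data) - 1)
# data is now [1, 2, 5, 5, 6, 9]
```

Because partition places the pivot at index x, both recursive calls can safely exclude it.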
The other way
quicksort(Array, left, right)
x = partition(Array, left, right)
quicksort(Array, left, x)
quicksort(Array, x+1, right)
PARTITION(A,p,r)
    x := A[p]
    i := p - 1
    j := r + 1
    while TRUE
        repeat j := j - 1
        until A[j] ≤ x
        repeat i := i + 1
        until A[i] ≥ x
        if i < j
            then exchange A[i] and A[j]
        else return j
[EXAMPLE]
Notice the -1 is missing. This seems to suggest that the array was partitioned correctly, but that no single element is guaranteed to be in its final position. The two ways are not interchangeable: if I put a -1 into the second way, some input arrays come out improperly sorted.
What causes the difference? It's obviously somewhere in the partition method; does it have to do with whether Hoare's or Lomuto's algorithm was used?
There is not actually that much difference in efficiency between the two versions, except when operating on the smallest arrays. The majority of the work is done in separating one large array of size n, whose values can be as many as n positions away from their proper places, into two smaller arrays which, being smaller, cannot have values displaced as far from their proper positions, even in the worst case. The "one way" essentially creates three partitions at each step, but since the third one is just one element large, it only makes an O(1) contribution towards the progress of the algorithm.
That being said, it's very easy to implement that final swap, so I'm not sure why the code in your "other way" example doesn't take that step. They even point out a pitfall (if the last rather than the first element is chosen for the pivot, the recursion never ends) which would be avoided entirely by implementing the final swap that takes the pivot element out of play. The only situation I can imagine where the swap-free code would be preferable is one where code space was at an absolute premium.
If nothing else, excluding or passing along the partition index might be the difference between closed and half-open intervals: right might be the first index not to be touched. There's no telling from incomplete snippets without references.
The difference is caused because the return value of partition() means different things.
In One way, the return value of partition() is where the pivot that was used for the partition ended up, i.e. Array[x] after partition() is the pivot that was used in partition().
In the Other way, the return value of partition() is NOT where the pivot ended up, i.e. Array[x] after partition() is an element that was less than or equal to the pivot used in partition(), but we don't know much more than that. The actual pivot could be located anywhere in the upper half of the array.
From this it follows that the first recursive call with x-1 instead of x in the Other way could quite easily give incorrect results e.g. pivot = 8, Array[x] = 5 and Array[x-1] = 7.
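This can be checked concretely. Below is my own Python transcription of the CLRS-style PARTITION from the "other way" (its exact behaviour here is an assumption based on the reconstructed pseudocode), run on a small array whose pivot is 8:

```python
def hoare_partition(A, p, r):
    # CLRS-style Hoare partition: the pivot is the first element's value
    x = A[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while A[j] > x:          # repeat j := j - 1 until A[j] <= x
            j -= 1
        i += 1
        while A[i] < x:          # repeat i := i + 1 until A[i] >= x
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j             # split point, NOT necessarily the pivot's index

A = [8, 1, 9, 5, 7]
x = hoare_partition(A, 0, 4)
# A is now [7, 1, 5, 9, 8] and x == 2: A[x] is 5, not the pivot 8.
# Recursing on (left, x-1), i.e. (0, 1), would leave A[2] == 5 outside
# both halves, so it would never reach its sorted position.
```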
If you think about it, the other way does not change the algorithm's behaviour when the partition routine is the first one: including the pivot in one of the sub-arrays has no effect on correctness, since the pivot is already in its final position and none of the other elements would swap places with it within that sub-array.
At most it increases the number of comparisons somewhat, though I'm unsure whether it would noticeably affect the sorting time for large arrays.
According to Wikipedia, Hoare's partition (partial code) looks like:
// Sorts a (portion of an) array, divides it into partitions, then sorts those
algorithm quicksort(A, lo, hi) is
if lo >= 0 && hi >= 0 && lo < hi then
p := partition(A, lo, hi)
quicksort(A, lo, p) // Note: the pivot is now included
quicksort(A, p + 1, hi)
I was curious why the pivot is included in the lo...p call but not in the p + 1...hi call (whereas they are both excluded in Lomuto's partitioning).
Wikipedia wrote:
With this formulation it is possible that one sub-range turns out to be the whole original range, which would prevent the algorithm from advancing. Hoare therefore stipulates that at the end, the sub-range containing the pivot element (which still is at its original position) can be decreased in size by excluding that pivot, after (if necessary) exchanging it with the sub-range element closest to the separation; thus, termination of quicksort is ensured.
Why are we allowed to include the pivot in the lo...p subrange, but not in the p + 1...hi subrange? By the same logic in the above Wikipedia page, if the lo...p subrange is exactly the original range, wouldn't we run into the same infinite recursion problems?
The index p may not be the pivot index. Elements equal to the pivot or the pivot itself can end up anywhere during a partition step. After a partition step, elements <= pivot are to the left or at p, elements >= pivot are to the right of p. By allowing the pivot or elements equal to the pivot to be swapped, the inner loops do not need to do bounds checking. Another advantage of Hoare partition scheme is it becomes faster as the number of duplicates increases (despite often swapping equal elements), while Lomuto becomes slower, degrading to O(n^2) time complexity if all elements are equal.
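To make the duplicate-heavy behaviour concrete, here is a rough instrumented sketch (the counter plumbing and helper names are my own) that counts element comparisons for both schemes while sorting an all-equal array; Lomuto's splits are maximally unbalanced there, while Hoare's stay balanced:

```python
def lomuto(A, lo, hi, stats):
    pivot = A[hi]
    i = lo
    for j in range(lo, hi):
        stats["cmp"] += 1
        if A[j] <= pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[i], A[hi] = A[hi], A[i]
    return i

def hoare(A, lo, hi, stats):
    pivot = A[lo]
    i, j = lo - 1, hi + 1
    while True:
        while True:              # repeat j := j - 1 until A[j] <= pivot
            j -= 1
            stats["cmp"] += 1
            if A[j] <= pivot:
                break
        while True:              # repeat i := i + 1 until A[i] >= pivot
            i += 1
            stats["cmp"] += 1
            if A[i] >= pivot:
                break
        if i >= j:
            return j
        A[i], A[j] = A[j], A[i]

def qs_lomuto(A, lo, hi, stats):
    if lo < hi:
        p = lomuto(A, lo, hi, stats)
        qs_lomuto(A, lo, p - 1, stats)
        qs_lomuto(A, p + 1, hi, stats)

def qs_hoare(A, lo, hi, stats):
    if lo < hi:
        p = hoare(A, lo, hi, stats)
        qs_hoare(A, lo, p, stats)      # pivot index is not excluded
        qs_hoare(A, p + 1, hi, stats)

l_stats, h_stats = {"cmp": 0}, {"cmp": 0}
qs_lomuto([7] * 256, 0, 255, l_stats)  # every split is (n-1, 0): quadratic
qs_hoare([7] * 256, 0, 255, h_stats)   # splits stay near the middle

check = [5, 3, 8, 1, 9, 2]
qs_hoare(check, 0, len(check) - 1, {"cmp": 0})
# check is now [1, 2, 3, 5, 8, 9]
```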
Can you give an example where the 2 partition schemes give different results?
With Lomuto's we have to write:
quicksort(A,l,p)
quicksort(A,p+1,h)
While with Hoare's:
quicksort(A,l,p+1)
quicksort(A,p+1,h)
(Operations performed in [low,high))
What's the difference?
The basic Lomuto partition scheme swaps the pivot out of the way, does the partition, swaps the pivot into place and then returns an index to the pivot at its sorted position. In this case, the pivot can be excluded from the recursive calls:
The basic Hoare partition scheme scans from both ends towards some point within the partition, putting all elements less than the pivot to the left of all elements greater than the pivot. Any elements equal to the pivot, including the pivot itself, can end up anywhere in the partition, and the index returned is the split point between the left part (elements <= pivot) and the right part (elements >= pivot). The calling code therefore cannot exclude the element at the returned index from the recursive calls. If the Hoare scheme is modified to be similar to Lomuto (swap the pivot to one end, do the partition, then swap the pivot to the split index), then the calling code can exclude the pivot, but this ends up being slower.
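For illustration, that modified variant could look roughly like this Sedgewick-style sketch of mine (an assumption about the shape of such code, not a quote from any particular implementation): the pivot is parked at the left end, the crossing scan runs, and a final swap drops the pivot at the split index so the recursion can exclude it:

```python
def partition_pivot_final(A, lo, hi):
    pivot = A[lo]                    # pivot parked at the left end
    i, j = lo, hi + 1
    while True:
        i += 1
        while i < hi and A[i] < pivot:
            i += 1
        j -= 1
        while j > lo and A[j] > pivot:
            j -= 1
        if i >= j:
            break
        A[i], A[j] = A[j], A[i]
    A[lo], A[j] = A[j], A[lo]        # final swap: pivot into its sorted slot
    return j                         # A[j] now holds the pivot

def quicksort(A, lo, hi):
    if lo < hi:
        p = partition_pivot_final(A, lo, hi)
        quicksort(A, lo, p - 1)      # the pivot can now be excluded
        quicksort(A, p + 1, hi)

data = [5, 1, 4, 2, 8, 5, 7, 5]
quicksort(data, 0, len(data) - 1)
# data is now [1, 2, 4, 5, 5, 5, 7, 8]
```

The extra bookkeeping (parking and re-placing the pivot) is what the answer above means by "this ends up being slower".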
The difference between these partitions is not in the recursive calls.
Really, any partition (that supports the correct interface) can be used with the same implementation of the main routine.
The partition function usually returns the index of the pivot. This pivot already stands at its final place, so there is no need to treat that index again.
So for the case when low is included in the treated range but high is not, we can write
pivotindex = partition(arr, low, high);
// Separately sort elements before pivotindex and after pivotindex
quickSort(arr, low, pivotindex);
quickSort(arr, pivotindex + 1, high);
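A runnable sketch of that half-open convention, using a Lomuto-style partition adapted so that high is exclusive (the names and details are my own illustration):

```python
def partition(arr, low, high):
    # Operates on the half-open range [low, high); pivot is the last element
    pivot = arr[high - 1]
    store = low
    for i in range(low, high - 1):
        if arr[i] <= pivot:
            arr[i], arr[store] = arr[store], arr[i]
            store += 1
    arr[store], arr[high - 1] = arr[high - 1], arr[store]
    return store

def quick_sort(arr, low, high):
    if high - low > 1:
        p = partition(arr, low, high)
        quick_sort(arr, low, p)       # [low, p) excludes the pivot at p
        quick_sort(arr, p + 1, high)  # [p + 1, high) skips it too

data = [4, 2, 7, 1, 4]
quick_sort(data, 0, len(data))
# data is now [1, 2, 4, 4, 7]
```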
To understand the difference, we also need to focus on the partition method and not just on the calls to quicksort.
Lomuto partition scheme:
algorithm partition(A, lo, hi) is
pivot := A[hi]
i := lo
for j := lo to hi - 1 do
if A[j] < pivot then
swap A[i] with A[j]
i := i + 1
swap A[i] with A[hi]
return i
Hoare partition scheme:
algorithm partition(A, lo, hi) is
pivot := A[lo + (hi - lo) / 2]
i := lo - 1
j := hi + 1
loop forever
do
i := i + 1
while A[i] < pivot
do
j := j - 1
while A[j] > pivot
if i >= j then
return j
swap A[i] with A[j]
(A side-by-side comparison table was attached here as an image.)
Also, Hoare’s scheme is more efficient than Lomuto’s partition scheme because it does three times fewer swaps on average, and it creates efficient partitions even when all values are equal.
I have just mentioned the key differentiating points; I suggest you read the two links above to gain more knowledge on this topic.
Comment if you have any further doubts and we will help you resolve them.
I'm having a hard time understanding quicksort, most of the demonstrations and explanations leave out what actually happens (http://me.dt.in.th/page/Quicksort/ for example).
Wikipedia says:
Pick an element, called a pivot, from the array. Partitioning: reorder
the array so that all elements with values less than the pivot come
before the pivot, while all elements with values greater than the
pivot come after it (equal values can go either way). After this
partitioning, the pivot is in its final position. This is called the
partition operation. Recursively apply the above steps to the
sub-array of elements with smaller values and separately to the
sub-array of elements with greater values.
How would that work with an array of 9,1,7,8,8, for example, with 7 as the pivot? The 9 needs to move to the right of the pivot. All quicksort implementations seem to operate in place, so we can't simply insert it after the 8,8; the only option is to swap the 9 with the 7.
Now the array is 7,1,9,8,8. The idea behind quicksort is that now we have to recursively sort the parts to the left and right of the pivot. The pivot is now at position 0 of the array, meaning there's no left part, so we can only sort the right part. This is of no use as 7>1 so the pivot ended up in the wrong place.
In this image 4 is the pivot, then why is 5 going almost all the way to the left? It's bigger than 4! After a lot of swapping it ends up being sorted but I don't understand how that happened.
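Part of the confusion is that in Hoare-style partitioning the pivot is not required to land in its final position after one pass; only the recursion as a whole guarantees sortedness. A small Python sketch (middle-element pivot, in the style of Wikipedia's Hoare scheme; my own transcription) traces the array from the question:

```python
def hoare_partition(A, lo, hi):
    pivot = A[(lo + hi) // 2]        # middle element: 7 for the array below
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while A[i] < pivot:
            i += 1
        j -= 1
        while A[j] > pivot:
            j -= 1
        if i >= j:
            return j                 # split point, not the pivot's final spot
        A[i], A[j] = A[j], A[i]

def quicksort(A, lo, hi):
    if lo < hi:
        p = hoare_partition(A, lo, hi)
        quicksort(A, lo, p)          # the pivot stays inside one of the halves
        quicksort(A, p + 1, hi)

A = [9, 1, 7, 8, 8]
p = hoare_partition(A, 0, 4)
# One pass swaps 9 and 7: A == [7, 1, 9, 8, 8], p == 1.
# The pivot 7 is NOT yet in its final place; the recursion on [7, 1] fixes it.
B = [9, 1, 7, 8, 8]
quicksort(B, 0, 4)
# B is now [1, 7, 8, 8, 9]
```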
Quicksort
The Quicksort steps are:
Pick an element, called a pivot, from the list.
Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
The base case of the recursion are lists of size zero or one, which never need to be sorted.
Lomuto partition scheme
This scheme chooses a pivot which is typically the last element in
the array.
The algorithm maintains index i as the place for the pivot; each time it finds an element less than or equal to the pivot, this index is incremented and that element is placed before the pivot.
As this scheme is more compact and easy to understand, it is frequently used in introductory material.
It is less efficient than Hoare's original scheme.
Partition algorithm (using Lomuto partition scheme)
algorithm partition(A, lo, hi) is
pivot := A[hi]
i := lo // place for swapping
for j := lo to hi - 1 do
if A[j] ≤ pivot then
swap A[i] with A[j]
i := i + 1
swap A[i] with A[hi]
return i
Quicksort algorithm (using Lomuto partition scheme)
algorithm quicksort(A, lo, hi) is
if lo < hi then
p := partition(A, lo, hi)
quicksort(A, lo, p - 1)
quicksort(A, p + 1, hi)
Hoare partition scheme
Uses two indices that start at the ends of the array being
partitioned, then move toward each other, until they detect an
inversion: a pair of elements, one greater than the pivot, one
smaller, that are in the wrong order relative to each other. The
inverted elements are then swapped.
There are many variants of this algorithm, for example selecting the pivot from A[hi] instead of A[lo] (though care is needed: with the recursion pattern below, a last-element pivot can make the recursion fail to terminate).
partition algorithm (using Hoare partition scheme)
algorithm partition(A, lo, hi) is
pivot := A[lo]
i := lo - 1
j := hi + 1
loop forever
do
i := i + 1
while A[i] < pivot
do
j := j - 1
while A[j] > pivot
if i >= j then
return j
swap A[i] with A[j]
quicksort algorithm(using Hoare partition scheme)
algorithm quicksort(A, lo, hi) is
if lo < hi then
p := partition(A, lo, hi)
quicksort(A, lo, p)
quicksort(A, p + 1, hi)
Hoare partition scheme vs Lomuto partition scheme
The pivot selection
The execution speed of the algorithm depends largely on how this mechanism is implemented; a poor implementation can make the algorithm run slowly.
The choice of pivot determines how the data list is partitioned; therefore, it is the most critical part of the implementation of the Quicksort algorithm. It is important to select the pivot so that the left and right partitions are as close in size as possible.
Best and worst case
Worst case
The most unbalanced partition occurs when the pivot divides the list into two sublists of sizes 0 and n − 1. This may occur if the pivot happens to be the smallest or largest element in the list, or in some implementations when all the elements are equal.
Best Case
In the most balanced case, each time we perform a partition we divide the list into two nearly equal pieces. This means each recursive call processes a list of half the size.
Formal analysis
Worst-case analysis = O(n²)
Best-case analysis = O(n log n)
Average-case analysis = O(n log n)
Examples source
Using additional memory
def quicksort(array):
less = []
equal = []
greater = []
if len(array) > 1:
pivot = array[0]
for x in array:
if x < pivot:
less.append(x)
if x == pivot:
equal.append(x)
if x > pivot:
greater.append(x)
return quicksort(less)+equal+quicksort(greater)
else:
return array
Usage:
quicksort([12,4,5,6,7,3,1,15])
Without additional memory
def partition(array, begin, end):
pivot = begin
for i in range(begin+1, end+1):
if array[i] <= array[begin]:
pivot += 1
array[i], array[pivot] = array[pivot], array[i]
array[pivot], array[begin] = array[begin], array[pivot]
return pivot
def quicksort(array, begin=0, end=None):
if end is None:
end = len(array) - 1
if begin >= end:
return
pivot = partition(array, begin, end)
quicksort(array, begin, pivot-1)
quicksort(array, pivot+1, end)
Usage:
quicksort([97, 200, 100, 101, 211, 107])
In your example
(A step-by-step debug trace of the Lomuto partition was shown here as an image.)
References:
http://www.cs.bilkent.edu.tr/~atat/473/lecture05.pdf
http://codefap.com/2012/08/the-quick-sort-algorithm/
http://visualgo.net/sorting
https://en.wikipedia.org/wiki/Quicksort
One day I found this jewel, which animates the different sorting algorithms and helped me a lot in understanding them! But this is just a graphical explanation; the poster prior to me (#Hydex) already answered in an academic way ;-)
Here is my implementation of an in-place quicksort algorithm, an adaptation from this video:
import math
import random

def partition(arr, start, size):
if (size < 2):
return
index = int(math.floor(random.random()*size))
L = start
U = start+size-1
pivot = arr[start+index]
while (L < U):
while arr[L] < pivot:
L = L + 1
while arr[U] > pivot:
U = U - 1
temp = arr[L]
arr[L] = arr[U]
arr[U] = temp
partition(arr, start, L-start)
partition(arr, L+1, size-(L-start)-1)
There seem to be a few implementations of the scanning step where the array (or the current portion of the array) is divided into 3 segments: elements lower than the pivot, the pivot, and elements greater than the pivot. I am scanning from the left for elements greater than or equal to the pivot, and from the right for elements less than or equal to the pivot. Once one of each is found, the swap is made, and the loop continues until the left marker is equal to or greater than the right marker. However, there is another method, following this diagram, that results in fewer partition steps in many cases. Can someone verify which method is actually more efficient for the quicksort algorithm?
Both the methods you used are basically the same. In the above code
index = int(math.floor(random.random()*size))
the index is chosen randomly, so it can be the first element or the last element. In the link https://s3.amazonaws.com/hr-challenge-images/quick-sort/QuickSortInPlace.png they initially take the last element as the pivot and move in the same way as you do in the code.
So both methods are the same. In your code you randomly select the pivot; in the image, the pivot is stated.
Suppose I had an unsorted array A of size n.
How can I find the (n/2)th, (n/2−1)th, and (n/2+1)th smallest elements of the original unsorted list in linear time?
I tried to use the selection algorithm in wikipedia (Partition-based general selection algorithm is what I am implementing).
function partition(list, left, right, pivotIndex)
pivotValue := list[pivotIndex]
swap list[pivotIndex] and list[right] // Move pivot to end
storeIndex := left
for i from left to right-1
if list[i] < pivotValue
swap list[storeIndex] and list[i]
increment storeIndex
swap list[right] and list[storeIndex] // Move pivot to its final place
return storeIndex
function select(list, left, right, k)
if left = right // If the list contains only one element
return list[left] // Return that element
select pivotIndex between left and right // What value of pivotIndex should I select?
pivotNewIndex := partition(list, left, right, pivotIndex)
pivotDist := pivotNewIndex - left + 1
// The pivot is in its final sorted position,
// so pivotDist reflects its 1-based position if list were sorted
if pivotDist = k
return list[pivotNewIndex]
else if k < pivotDist
return select(list, left, pivotNewIndex - 1, k)
else
return select(list, pivotNewIndex + 1, right, k - pivotDist)
But I have not understood 3 or 4 of the steps, and I have the following doubts:
Did I pick the correct algorithm, and will it really work in linear time for my program? I am a bit confused since it resembles quicksort.
When calling the select function from the main function, what will be the values of left, right and k? Consider my array to be list[1...N].
Do I have to call the select function three times (once for the (n/2)th smallest, once for the (n/2+1)th smallest, and once for the (n/2−1)th smallest), or can it be done in a single call? If yes, how?
Also, in the select function (third step), "select pivotIndex between left and right", what value of pivotIndex should I select for my program/purpose?
Thanks!
It is like quicksort, but it runs in (expected) linear time because quicksort must then handle both the left and right sides of the pivot, while quickselect only handles one side.
The initial call should be select(A, 0, N - 1, (N + 1)/2) if N is odd (using the pseudocode's 1-based k); you'll need to decide exactly what you want to do if N is even.
To find the median and its left/right neighbours, you probably want to call it once to find the median, and then just take the max of the elements to the left of it and the min of the elements to the right, because once the median selection is done, all elements to the left of the median are less than (or equal to) it and all elements to the right are greater (or equal). This is O(n) + n/2 + n/2 = O(n) total time.
There are lots of ways to choose pivot indices. For casual purposes, either the middle element or a random index will probably suffice.
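Putting that together, here is a rough runnable sketch of the whole recipe (random pivot; the array, helper names and the 1-based rank convention are my own illustration, not part of the original answer):

```python
import random

def partition(A, left, right, pivot_index):
    # Lomuto-style partition, as in the pseudocode in the question
    pivot_value = A[pivot_index]
    A[pivot_index], A[right] = A[right], A[pivot_index]   # move pivot to end
    store = left
    for i in range(left, right):
        if A[i] < pivot_value:
            A[store], A[i] = A[i], A[store]
            store += 1
    A[right], A[store] = A[store], A[right]               # pivot to final place
    return store

def select(A, left, right, k):
    # k is the 1-based rank relative to `left`
    if left == right:
        return A[left]
    p = partition(A, left, right, random.randint(left, right))
    dist = p - left + 1
    if dist == k:
        return A[p]
    elif k < dist:
        return select(A, left, p - 1, k)
    else:
        return select(A, p + 1, right, k - dist)

A = [9, 3, 7, 1, 8, 5, 2, 6]           # n = 8 (illustrative data)
n = len(A)
k = n // 2                              # here: the 4th smallest
median = select(A, 0, n - 1, k)
# After the call, the k-th smallest sits at index k-1 and the array is
# partitioned around it, so the neighbours are one max/min scan away:
below = max(A[:k - 1])                  # the (n/2 - 1)-th smallest
above = min(A[k:])                      # the (n/2 + 1)-th smallest
```

After select returns, the k-th smallest value sits at index k-1 with everything smaller to its left, which is what makes the single max/min scan for the neighbours valid.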