I was reading about the quickselect algorithm on Wikipedia: https://en.wikipedia.org/wiki/Quickselect
function select(list, left, right, k)
    if left = right        // If the list contains only one element,
        return list[left]  // return that element
    pivotIndex := ...      // select a pivotIndex between left and right,
                           // e.g., left + floor(rand() % (right - left + 1))
    pivotIndex := partition(list, left, right, pivotIndex)
    // The pivot is in its final sorted position
    if k = pivotIndex
        return list[k]
    else if k < pivotIndex
        return select(list, left, pivotIndex - 1, k)
    else
        return select(list, pivotIndex + 1, right, k - pivotIndex)
Isn't the last recursive call incorrect? I believe the last argument should just be k rather than k - pivotIndex. Am I missing something here?
You are right; the last correction from September 20 introduced this error.
The top comment says
// Returns the k-th smallest element of list within left..right inclusive
// (i.e. left <= k <= right).
so k is defined over the whole index range: it is absolute, not relative to the local lower bound, exactly as you noticed in your comment.
I also checked my implementation of kselect; it uses k in the second call.
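As a check, here is the corrected pseudocode transcribed into a runnable Python sketch (my transcription, not the asker's code; the random pivot and Lomuto partition are one common instantiation). Note the last call passes k through unchanged:

```python
import random

def partition(lst, left, right, pivot_index):
    """Lomuto partition: returns the pivot's final (absolute) index."""
    pivot_value = lst[pivot_index]
    lst[pivot_index], lst[right] = lst[right], lst[pivot_index]  # pivot to end
    store = left
    for i in range(left, right):
        if lst[i] < pivot_value:
            lst[store], lst[i] = lst[i], lst[store]
            store += 1
    lst[right], lst[store] = lst[store], lst[right]              # pivot to place
    return store

def select(lst, left, right, k):
    """k-th smallest of lst[left..right]; k is an absolute index into lst."""
    if left == right:
        return lst[left]
    pivot_index = random.randint(left, right)
    pivot_index = partition(lst, left, right, pivot_index)
    if k == pivot_index:
        return lst[k]
    if k < pivot_index:
        return select(lst, left, pivot_index - 1, k)
    # k stays absolute, so no subtraction here
    return select(lst, pivot_index + 1, right, k)
```

Because k indexes the whole list, both recursive branches forward it untouched; only the search window [left, right] shrinks.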
Related
I am studying quickselect for a midterm in my algorithms analysis course and the algorithm I have been working with is the following:
Quickselect(A[L...R], k)
    // Input: Array indexed from 0 to n-1 and an index of the kth smallest element
    // Output: Value of the kth position
    s = LomutoPartition(A[L...R]) // works by taking the first index and value as the
                                  // pivot and returns its index in the sorted position
    if(s == k-1)        // we have our k-th element; it's k-1 because arrays are 0-indexed
        return A[s]
    else if(s > L+k-1)  // this is my question below
        Quickselect(L...s-1, k)   // basically the element we want is somewhere to the left
                                  // of our pivot so we search that side
    else
        Quickselect(s+1...R, k-1-s)
        /* the element we want is greater than our pivot so we search the right side;
         * however, if we do, we must scale the k-th position accordingly by subtracting
         * 1 and s so that the new value will not push the subarray out of bounds
         */
My question is why we need L + k - 1 in that else if. Doing a few examples on paper, I have come to the conclusion that, no matter the context, L is always an index and that index is always 0, which does nothing for the algorithm, right?
There seems to be a discrepancy between the line
if(s == k-1)
and the line
else if(s> L+k-1)
The interpretations are incompatible.
As Trincot correctly notes, from the second recursive call on, it's possible that L is not 0. Your Lomuto subroutine doesn't take an array, a low index, and a high index (as the one in Wikipedia does, for example). Instead it just takes an array (which happens to be a subarray between low and high of some other array). The index s it returns is thus relative to the subarray, and to translate it to the position within the original array, you need to add L. This is consistent with your first line, except that the line following it should read
return A[L + s]
Your second line should therefore also compare to k - 1, not L + k - 1.
Edit
Following the comment, here is the pseudo-code from Wikipedia:
// Returns the n-th smallest element of list within left..right inclusive
// (i.e. left <= n <= right).
// The search space within the array is changing for each round - but the list
// is still the same size. Thus, n does not need to be updated with each round.
function select(list, left, right, n)
    if left = right        // If the list contains only one element,
        return list[left]  // return that element
    pivotIndex := ...      // select a pivotIndex between left and right,
                           // e.g., left + floor(rand() % (right - left + 1))
    pivotIndex := partition(list, left, right, pivotIndex)
    // The pivot is in its final sorted position
    if n = pivotIndex
        return list[n]
    else if n < pivotIndex
        return select(list, left, pivotIndex - 1, n)
    else
        return select(list, pivotIndex + 1, right, n)
Note the conditions
if n = pivotIndex
and
else if n < pivotIndex
which are consistent in their interpretation of the indexing returned in partitioning.
Once again, it's possible to define the partitioning sub-routine either as returning the index relative to the start of the sub-array, or as returning the index relative to the original array, but there must be consistency in this.
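To make the two conventions concrete, here is a small Python sketch (the function names are mine): a Lomuto partition that takes low/high indices and reports an absolute index, and a wrapper that, given only the subarray, can only report a relative one:

```python
def lomuto_absolute(a, left, right):
    """Partition a[left..right] around a[left]; return the pivot's final
    index measured in the full array (the convention Wikipedia uses)."""
    pivot = a[left]
    s = left
    for i in range(left + 1, right + 1):
        if a[i] < pivot:
            s += 1
            a[s], a[i] = a[i], a[s]
    a[left], a[s] = a[s], a[left]
    return s

def lomuto_relative(sub):
    """Partition a standalone (sub)array around sub[0]; the returned index
    is relative to the subarray, as in the question's LomutoPartition."""
    return lomuto_absolute(sub, 0, len(sub) - 1)
```

If Quickselect hands the partition routine only the slice A[L..R], the returned s is relative, and every later use of it (the comparison with k - 1, the recursive bounds) must first translate it to L + s; mixing the two conventions produces exactly the discrepancy described above.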
My question is very similar to Q1 and Q2, except that I want to deal with the case where the array may have duplicate entries.
Assume the array A consists of integers sorted in increasing order. If its entries are all distinct, you can do this easily in O(log n) with binary search. But if there are duplicate entries, it's more complicated. Here's my approach:
#include <vector>
using std::vector;

int binarySearchHelper(const vector<int>& A, int left, int right);

int search(const vector<int>& A) {
    int left = 0, right = (int)A.size() - 1;
    return binarySearchHelper(A, left, right);
}

int binarySearchHelper(const vector<int>& A, int left, int right) {
    int indexFound = -1;
    if (left <= right) {
        int mid = left + (right - left) / 2;
        if (A[mid] == mid) {
            return mid;
        } else {
            // a match in the right half requires some value there to be <= right
            if (A[mid] <= right) {
                indexFound = binarySearchHelper(A, mid + 1, right);
            }
            // a match in the left half requires some value there to be <= mid
            if (indexFound == -1 && A[left] <= mid) {
                indexFound = binarySearchHelper(A, left, mid - 1);
            }
        }
    }
    return indexFound;
}
In the worst case (A has no element equal to its index), binarySearchHelper makes 2 recursive calls with input size halved at each level of recursion, meaning it has a worst-case time complexity of O(n). That's the same as the O(n) approach where you just read through the array in order. Is this really the best you can do? Also, is there a way to measure the algorithm's average time complexity? If not, is there some heuristic for deciding when to use the basic O(n) read-through approach and when to try a recursive approach such as mine?
If A has negative integers, then the check if (left <= right) in binarySearchHelper is necessary: for example, if A = [-1], the algorithm would recurse from bsh(A, 0, 0) to bsh(A, 1, 0) and to bsh(A, 0, -1). My intuition leads me to believe the check if (left <= right) is necessary if and only if A has some negative integers. Can anyone help me verify this?
I would take a different approach. First I would eliminate all negative numbers in O(log n), simply by binary searching for the first non-negative number; this is allowed because no negative number can be equal to its (non-negative) index. Let's say the index of the first non-negative element is i.
Now I will keep doing the following until I find the element or find that it doesn't exist:

1. If i is not inside A, return false.
2. If i < A[i], set i = A[i]. It would take A[i] - i duplicates for i to 'catch up' to A[i], so we can increment i by A[i] - i, which is equivalent to setting i to A[i]. Go to 1.
3. If i == A[i], return true (and the index, if you want it).
4. Otherwise A[i] < i: find the first index j > i with i <= A[j], set i = j, and go to 1. You can do this with a 'binary search from the left': increment i by 1, 2, 4, 8, etc., and then do a binary search on the last interval you found it in. If no such index exists, return false.

In the worst case the above is still O(n), but it has many tricks to speed it up way beyond that in better cases.
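A Python sketch of these steps (the function name is mine; for brevity, the 'binary search from the left' galloping is replaced with a plain binary search over the remaining suffix, which does not change the worst case):

```python
import bisect

def find_fixed_point(a):
    """Return some index i with a[i] == i in a sorted int array (duplicates
    allowed), or -1 if none exists. Sketch of the jump strategy above."""
    n = len(a)
    i = bisect.bisect_left(a, 0)         # skip negatives: a[i] == i needs a[i] >= 0
    while i < n:
        if a[i] == i:
            return i
        if a[i] > i:
            i = a[i]                     # i cannot catch up before index a[i]
        else:
            # a[i] < i: find the first j > i with a[j] >= i; every value in
            # between is below i < j, so it can never equal its index
            j = bisect.bisect_left(a, i, i + 1, n)
            if j == n:
                return -1
            i = j
    return -1
```

Each step either jumps i forward to a[i] or past a run of too-small values, so the index only ever moves right.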
I am revising the quicksort algorithm; however, it is proving to be a bit more complex than I thought.
Suppose my array is the following: A = {7,1,5,8,2,0}
Now I select the pivot at index 2 of the array, which has the value 5. (Eventually all elements less than 5 will be on the LHS and all greater elements on the RHS.)
Now I start moving from the left (index 0) towards the right (index 2) until I reach a value greater than 5. If a value on the left side is greater than the pivot value 5, it needs to move to the right side. For it to move, it requires an empty slot so that the two values can be interchanged. So the first interchange gives me the array
A = {0,1,5,8,2,7}
Now two elements still remain on the left side, the 2 and 7. (The right side also moves towards the pivot, leftwards, and if an element there is less than the pivot it is supposed to move to the other side.)
Now here is the question: what happens if there is no slot on the right side and an element on the left side needs to be moved to the right side of the pivot? Am I missing something?
Well, the "partition" step you're talking about can be implemented in various ways.
The easiest way to implement it, imo, is this:
1) Pick a pivot element.
2) Move the pivot element to the rightmost position.
3) Do a left-to-right scan and stack all the elements that are smaller than the pivot sequentially.
4) Now you know how many elements are smaller, so do the final swap to make sure the pivot element ends up in the correct place.
I've taken this from the wiki and added the step numbers to the code, just to make it clear.
// left is the index of the leftmost element of the subarray
// right is the index of the rightmost element of the subarray (inclusive)
// number of elements in subarray = right-left+1
partition(array, left, right)
    pivotIndex := choosePivot(array, left, right) // step 1
    pivotValue := array[pivotIndex]
    swap array[pivotIndex] and array[right]       // step 2
    storeIndex := left
    for i from left to right - 1                  // step 3
        if array[i] < pivotValue
            swap array[i] and array[storeIndex]
            storeIndex := storeIndex + 1
    swap array[storeIndex] and array[right]       // step 4
    return storeIndex
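Transcribed into Python as a sketch (choosePivot is assumed here to pick the middle index, just so the run is reproducible), this also shows that every move is a swap, so no empty slot is ever required:

```python
def partition(array, left, right):
    """The four steps above, transcribed directly; the pivot choice
    (middle index) is an assumption, not part of the algorithm."""
    pivot_index = (left + right) // 2                                    # step 1
    pivot_value = array[pivot_index]
    array[pivot_index], array[right] = array[right], array[pivot_index]  # step 2
    store_index = left
    for i in range(left, right):                                         # step 3
        if array[i] < pivot_value:
            array[i], array[store_index] = array[store_index], array[i]
            store_index += 1
    array[store_index], array[right] = array[right], array[store_index]  # step 4
    return store_index
```

On the asker's array [7, 1, 5, 8, 2, 0] this picks 5 as the pivot, leaves the array as [1, 0, 2, 5, 7, 8], and returns index 3.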
The basic idea of quicksort is this: you choose a pivot element and try to place all the elements less than the pivot to its left and all the elements greater than or equal to it to its right. This process happens recursively.
As you have chosen 5, one pointer moves from the left and another from the right towards each other, comparing each element with the pivot; when the two pointers cross over, you swap the left pointer's element with the pivot.
In the first step you swapped 0 and 7, which is fine. Now the pointers advance by one: the left pointer points to element 1 and the right pointer to 2. The right pointer stops at 2, as it is less than the pivot 5; the left pointer comes up to 8 and swaps 8 and 2. The pointers advance one more time, the left pointer crosses over the right pointer, and so the pivot is swapped with 2.
Now, if you look, 5 is in its correct place.
The array would look like
0,1,2,5,8,7
A useful link: https://www.youtube.com/watch?v=8hHWpuAPBHo
Algorithm:
// left is the index of the leftmost element of the subarray
// right is the index of the rightmost element of the subarray (inclusive)
// number of elements in subarray = right-left+1
partition(array, left, right)
    pivotIndex := choosePivot(array, left, right)
    pivotValue := array[pivotIndex]
    swap array[pivotIndex] and array[right]
    storeIndex := left
    for i from left to right - 1
        if array[i] < pivotValue
            swap array[i] and array[storeIndex]
            storeIndex := storeIndex + 1
    swap array[storeIndex] and array[right] // Move pivot to its final place
    return storeIndex
Given an array, such as [7,8,9,0,1,2,3,4,5,6], is it possible to determine the index around which a rotation has occurred faster than O(n)?
With O(n), simply iterate through all the elements and mark the first decreasing element as the index.
A possibly better solution would be to iterate from both ends towards the middle, but this still has a worst case of O(n).
(EDIT: The below assumes that elements are distinct. If they aren't distinct, I don't think there's anything better than just scanning the array.)
You can binary search it. I won't post any code, but here's the general idea: (I'll assume that a >= b for the rest of this. If a < b, then we know it's still in its sorted order)
Take the first element, call it a, the last element b, and the middle element, calling it c.
If a < c, then you know that the pivot is between c and b, and you can recurse (with c and b as your new ends). If a > c, then you know that the pivot is somewhere between the two, and recurse in that half (with a and c as ends) instead.
ADDENDUM: To extend to cases with repeats, if we have a = c > b then we recurse with c and b as our ends, while if a = c = b, we scan from a to c to see if there is some element d such that it differs. If it doesn't exist, then all of the numbers between a and c are equal, and thus we recurse with c and b as our ends. If it does, there are two scenarios:
a > d < b: Here, d is then the smallest element since we scanned from the left, and we're done.
a < d > b: Here, we know the answer is somewhere between d and b, and so we recurse with those as our ends.
In the best case scenario, we never have to use the equality case, giving us O(log n). Worst case, those scans encompass almost all of the array, giving us O(n).
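For the distinct-element case, the a/b/c recursion described above can be sketched iteratively in Python (rotation_index is a name I'm introducing; it returns the index of the smallest element, i.e. the break point; lo/hi play the roles of a and b, mid of c):

```python
def rotation_index(a):
    """Index of the smallest element of a rotated sorted array with
    distinct elements; a sketch of the binary-search idea above."""
    lo, hi = 0, len(a) - 1
    if a[lo] <= a[hi]:          # a < b: the array is still in sorted order
        return 0
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if a[lo] < a[mid]:      # a < c: the break lies between c and b
            lo = mid
        else:                   # a > c: the break lies between a and c
            hi = mid
    return hi
```

Each round keeps exactly one endpoint on each side of the break, so the window halves until only the break remains, giving O(log n).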
For an array of size N, if the array has been rotated at least once and fewer than N times, I think this will work fine:
int low = 0, high = n - 1;
int mid = (low + high) / 2;
while (mid != low && mid != high)
{
    if (a[low] < a[mid])
        low = mid;
    else
        high = mid;
    mid = (low + high) / 2;
}
return high;
You can use a binary search. If you pick 1 as the central value, you know the break is in the first half because 7 > 1 < 6.
One observation is that the shift equals the index of the minimal element, so all you have to do is use binary search to find the minimal element. The only catch is equal elements: with duplicates you cannot achieve better than O(N) time, because for an input like [0, 0, 0, 0, ..., 100, 0, 0, 0, ..., 0] you cannot find the single non-zero element faster than linearly. Still, the following algorithm achieves O(Mins + log(N)), where Mins is the number of minimal elements if array[0] is one of the minima (otherwise Mins = 0, giving no penalty).
l = 0;
r = len(array) - 1;
while (l < r && array[l] == array[r]) {
    l = l + 1;
}
while (l < r) {
    m = (l + r) / 2;
    if (array[m] > array[r]) {
        l = m + 1;
    } else {
        r = m;
    }
}
// Here l is the answer: shifting the array l elements left will make it sorted
This works in O(log N) for arrays with unique elements and O(N) for arrays with non-unique elements (but it is still faster than a naive solution for the majority of inputs).
Precondition
The array is sorted in ascending order
The array is left-rotated
Approach
First we need to find the index of the smallest element.
The number of times the array has been rotated equals the difference between the length of the array and the index of the smallest element.
So the task is to find the index of the smallest element; we can do this in two ways.
Method 1 -
Just traverse the array, and when the current element is greater than the next element, the next index is the index of the smallest element. In the worst case this takes O(n).
Method 2
Find the middle element using (lowIndex + highIndex) / 2, and then determine on which side of the middle element the smallest element lies, since it can be found either left or right of it. Compare the first element to the middle element: if the first element is greater than the middle element, the smallest element lies on the left side of the middle element; if the first element is smaller than the middle element, the smallest element lies on the right side. Applied like a binary search, this finds the index of the smallest element in O(log(n)).
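Method 2 as a Python sketch (min_index is a name I'm introducing; it assumes distinct elements and the preconditions above):

```python
def min_index(a):
    """Index of the smallest element of a left-rotated ascending array
    with distinct elements, via the binary search of Method 2."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        if a[lo] <= a[hi]:      # this stretch is already sorted: minimum at lo
            break
        mid = (lo + hi) // 2
        if a[lo] <= a[mid]:     # first <= middle: smallest is right of mid
            lo = mid + 1
        else:                   # first > middle: smallest is at mid or left of it
            hi = mid
    return lo
```

Per the relation stated above, the rotation count is then (len(a) - min_index(a)) % len(a).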
Using a recursive method:
static void Main(string[] args)
{
var arr = new int[]{7,8,9,0,1,2,3,4,5,6};
Console.WriteLine(FindRotation(arr));
}
private static int FindRotation(int[] arr)
{
var mid = arr.Length / 2;
return CheckRotation(arr, 0, mid, arr.Length-1);
}
private static int CheckRotation(int[] arr, int start, int mid, int end)
{
var returnVal = 0;
if (start < end && end - start > 1)
{
if (arr[start] > arr[mid])
{
returnVal = CheckRotation(arr, start, start + ((mid - start) / 2), mid);
}
else if (arr[end] < arr[mid])
{
returnVal = CheckRotation(arr, mid, mid + ((end - mid) / 2), end);
}
}
else
{
returnVal = end;
}
return returnVal;
}
Suppose I have an unsorted array A of size n.
How can I find the n/2, n/2−1, and n/2+1-th smallest elements of the original unsorted list in linear time?
I tried to use the selection algorithm from Wikipedia (the partition-based general selection algorithm is what I am implementing).
function partition(list, left, right, pivotIndex)
    pivotValue := list[pivotIndex]
    swap list[pivotIndex] and list[right] // Move pivot to end
    storeIndex := left
    for i from left to right-1
        if list[i] < pivotValue
            swap list[storeIndex] and list[i]
            increment storeIndex
    swap list[right] and list[storeIndex] // Move pivot to its final place
    return storeIndex

function select(list, left, right, k)
    if left = right // If the list contains only one element
        return list[left] // Return that element
    select pivotIndex between left and right // What value of pivotIndex should I select?
    pivotNewIndex := partition(list, left, right, pivotIndex)
    pivotDist := pivotNewIndex - left + 1
    // The pivot is in its final sorted position,
    // so pivotDist reflects its 1-based position if list were sorted
    if pivotDist = k
        return list[pivotNewIndex]
    else if k < pivotDist
        return select(list, left, pivotNewIndex - 1, k)
    else
        return select(list, pivotNewIndex + 1, right, k - pivotDist)
But I have not understood 3 or 4 of the steps, and I have the following doubts:
Did I pick the correct algorithm, and will it really run in linear time for my program? I am a bit confused, as it resembles quicksort.
When calling select from the main function, what should the values of left, right, and k be? Consider that my array is list[1...N].
Do I have to call select three times (once for the n/2-th smallest, once for the n/2+1-th, and once for the n/2-1-th), or can it be done in a single call? If yes, how?
Also, in select (third step), "select pivotIndex between left and right": what value of pivotIndex should I select for my program/purpose?
Thanks!
It is like quicksort, but it's linear because quicksort must then handle both the left and right sides of the pivot, while quickselect only handles one side.
The initial call for your 1-indexed list[1...N] should be Select(A, 1, N, (N+1)/2) if N is odd (k in your pseudocode is a 1-based rank); you'll need to decide exactly what you want to do if N is even.
To find the median and its left/right neighbors, you probably want to call it once to find the median, and then just take the max of the elements to the left of it and the min of the elements to the right, because you know that once the median selection is done, all elements to the left of the median are less than it and all elements to the right are greater (or equal). This is O(n) + n/2 + n/2 = O(n) total time.
There are lots of ways to choose pivot indices. For casual purposes, either the middle element or a random index will probably suffice.
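Putting the answer above together, here is a Python sketch (the names are mine; it assumes n >= 3 and uses a random pivot, so the linear time is expected rather than worst-case; a median-of-medians pivot would make it guaranteed):

```python
import random

def quickselect(a, left, right, k):
    """In-place quickselect: afterwards a[k] holds the k-th smallest
    (0-based); smaller elements sit left of index k, larger ones right."""
    while left < right:
        p = random.randint(left, right)
        a[p], a[right] = a[right], a[p]          # move pivot to the end
        store = left
        for i in range(left, right):
            if a[i] < a[right]:
                a[i], a[store] = a[store], a[i]
                store += 1
        a[store], a[right] = a[right], a[store]  # pivot to its final place
        if k == store:
            break
        if k < store:
            right = store - 1
        else:
            left = store + 1
    return a[k]

def middle_three(a):
    """The (n/2 - 1), n/2, (n/2 + 1)-th smallest elements (0-based k = n // 2)
    via one selection plus one linear scan, as described above."""
    b = list(a)
    k = len(b) // 2
    med = quickselect(b, 0, len(b) - 1, k)
    # everything left of k is <= med, everything right of k is >= med
    return max(b[:k]), med, min(b[k + 1:])
```

A single call suffices: the partitioned layout around index k means the neighbors are just the max of the left part and the min of the right part, with no further selection needed.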