QuickSelect with Hoare partition scheme - algorithm

Is it possible to implement the QuickSelect algorithm using Hoare partitioning?
At first glance it seems that it cannot be done, because Hoare partitioning does not necessarily return the index of the pivot.
Am I missing something?

With the Hoare partition scheme, the pivot and elements equal to the pivot can end up anywhere after a partition step, so the base (terminating) case occurs only when the partition has been reduced to a single element. In the example code, QuickSelectr is the actual function and QuickSelect validates the parameters.
#include <utility> // std::swap
int QuickSelectr(int a[], int lo, int hi, int k )
{
if (lo == hi) // recurse until lo == hi
return a[lo];
int p = a[(lo+hi)/2]; // Hoare partition
int i = lo - 1;
int j = hi + 1;
while (1){
while (a[++i] < p);
while (a[--j] > p);
if (i >= j)
break;
std::swap(a[i], a[j]);
}
if(k <= j)
return QuickSelectr(a, lo, j-0, k); // include a[j]
else
return QuickSelectr(a, j+1, hi, k); // exclude a[j]
}
// parameter check
int QuickSelect(int *a, int lo, int hi, int k)
{
if(a == (int *)0 || k < lo || k > hi || lo > hi)
return 0;
return QuickSelectr(a, lo, hi, k);
}
Using i instead of j for the split:
int QuickSelectr(int a[], int lo, int hi, int k )
{
if (lo == hi) // recurse until lo == hi
return a[lo];
int p = a[(lo+hi+1)/2]; // Carefully note the +1 compared
// to the variant where we use j
int i = lo - 1;
int j = hi + 1;
while (1){
while (a[++i] < p);
while (a[--j] > p);
if (i >= j)
break;
std::swap(a[i], a[j]);
}
if(k < i)
return QuickSelectr(a, lo, i-1, k); // exclude a[i]
else
return QuickSelectr(a, i+0, hi, k); // include a[i]
}
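Since each call above recurses into only one side, the recursion can also be written as a loop. The following is a minimal sketch under that assumption, not a replacement for the code above; quickSelectIter is a hypothetical name, it takes a std::vector by value so the caller's data is left untouched, and k is a 0-based rank.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: returns the k-th smallest element (0-based) of a.
// Same Hoare partition as the j-based variant, with the recursion
// replaced by narrowing [lo, hi] in a loop.
int quickSelectIter(std::vector<int> a, int k)
{
    assert(!a.empty() && k >= 0 && k < static_cast<int>(a.size()));
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo < hi) {
        int p = a[lo + (hi - lo) / 2];   // middle element as pivot
        int i = lo - 1;
        int j = hi + 1;
        while (true) {
            while (a[++i] < p) {}
            while (a[--j] > p) {}
            if (i >= j)
                break;
            std::swap(a[i], a[j]);
        }
        if (k <= j)
            hi = j;        // k lies in the left part, which includes a[j]
        else
            lo = j + 1;    // k lies in the right part
    }
    return a[lo];          // single element left: this is the answer
}
```

For example, quickSelectIter({3, 2, 1, 5, 6, 4}, 4) returns 5, the second largest of the six values.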

I believe the existing answer presents a sub-optimal solution. You can simply amend Hoare's algorithm to return the index of the pivot, regarding this point from the question:
"because Hoare partitioning does not return the index of the pivot necessarily."
To do this, you select the first element of your array as the pivot and then essentially ignore it, partitioning the remaining sub-array arr[1:] as you normally would. Then, at the end, you swap arr[0] with the element at the index you would normally return.
This works because the partition step stops at an index idx such that:
for all j in [lo+1, idx], arr[j] <= pivot
for all j in [idx+1, hi], pivot <= arr[j]
Swapping the pivot (arr[lo]) with arr[idx] puts the pivot at its final position and keeps both parts on the correct side of it.
Here's an example implementation written in Solidity (since I've had to implement such a thing in a smart contract in the past):
function partition
(
uint256[] memory arr,
uint256 lo,
uint256 hi
)
public
pure
returns (uint256)
{
uint pivot = arr[lo];
uint i = lo;
uint j = hi + 1;
while (true) {
do {
i++;
} while (i < arr.length && arr[i] < pivot);
do {
j--;
} while (arr[j] > pivot);
if (i >= j) {
// swap with pivot
(arr[lo], arr[j]) = (arr[j], arr[lo]);
return j;
}
(arr[i], arr[j]) = (arr[j], arr[i]);
}
}
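For readers who do not use Solidity, here is a rough C++ sketch of the same idea, together with a selection loop that relies on the returned pivot index. The names partitionPivotFirst and quickSelectPivotIndex are made up for this illustration, and the left scan is bounded by hi rather than by the array length, which is an assumption on my part.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hoare-style partition that leaves the pivot at its final index and returns it.
// Afterwards a[lo..idx-1] <= a[idx] <= a[idx+1..hi].
int partitionPivotFirst(std::vector<int>& a, int lo, int hi)
{
    int pivot = a[lo];
    int i = lo;
    int j = hi + 1;
    while (true) {
        do { ++i; } while (i <= hi && a[i] < pivot);
        do { --j; } while (a[j] > pivot);   // stops at lo at the latest
        if (i >= j) {
            std::swap(a[lo], a[j]);         // move the pivot into place
            return j;
        }
        std::swap(a[i], a[j]);
    }
}

// Returns the k-th smallest element (0-based); works on a copy of the input.
int quickSelectPivotIndex(std::vector<int> a, int k)
{
    assert(!a.empty() && k >= 0 && k < static_cast<int>(a.size()));
    int lo = 0;
    int hi = static_cast<int>(a.size()) - 1;
    while (lo < hi) {
        int idx = partitionPivotFirst(a, lo, hi);
        if (k == idx)
            break;             // pivot landed exactly on rank k
        if (k < idx)
            hi = idx - 1;
        else
            lo = idx + 1;
    }
    return a[k];
}
```

Because the pivot index is known, the search can stop as soon as the pivot lands on k instead of shrinking the range to a single element. Always taking the first element as the pivot makes already-sorted input the slow case; swapping a randomly chosen element into a[lo] before partitioning is one way around that.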

Related

Quick sort -- What am i doing wrong?

I am trying to implement quick sort.
Logic: maintain two variables to place the pivot element at its correct index, taking the first element as the pivot. int i is for the RHS of the pivot and int j for the LHS; if they cross each other, then j is the correct index for the pivot.
#include<iostream>
using namespace std;
int partition(int arr[], int low, int high){
int pivot = arr[low];
int i = low+1;
int j = high;
while (i<j)
{
while(arr[i]<=pivot) i++;
while(arr[j]> pivot) j--;
if(i<j) {
swap(arr[i], arr[j]);
}
swap(arr[j], arr[low]);
return j;
}
}
void QuickSort(int arr[], int low , int high){
if(low >= high ) return;
if(high>low){
int pivotindx = partition(arr, low , high);
QuickSort(arr,low, pivotindx-1);
QuickSort( arr, pivotindx+1, high);
}
}
void printquicksort(int arr[] , int n){
cout << " Quick SORT IS HERE BROOOO " << endl;
for (int i = 0; i < n; i++)
{
cout << " " << arr[i] << " " ;
}
}
int main()
{
int arr []={3,4,5,1};
int n= sizeof (arr)/ sizeof (arr[0]);
QuickSort(arr,0,n-1);
printquicksort(arr,n);
return 0;
}
Using i and j for LHS and RHS is a type of Hoare partition scheme. The code has a potential issue when using low for the pivot: while(arr[i]<=pivot) i++; may never encounter an element > pivot and scan past the end of the array. In addition, the final swap and the return sit inside the outer while loop, so the function returns after a single pass and returns nothing at all when the loop body never runs. With the Hoare partition scheme, the pivot and elements equal to the pivot can end up anywhere, and the partition index only separates elements <= pivot from elements >= pivot, so the index needs to be included in one of the recursive calls. Example of a post-increment and post-decrement version of Hoare with the partition code included in QuickSort:
void QuickSort(int *a, int lo, int hi)
{
int i, j;
int p, t;
if(lo >= hi)
return;
p = a[lo + (hi-lo)/2];
i = lo;
j = hi;
while (i <= j){
while (a[i] < p)i++;
while (a[j] > p)j--;
if (i > j)
break;
t = a[i]; // swap
a[i] = a[j];
a[j] = t;
i++;
j--;
}
QuickSort(a, lo, j);
QuickSort(a, i, hi);
}
Example of a classic pre-increment and pre-decrement version of Hoare with the partition code included in QuickSort:
void QuickSort(int a[], int lo, int hi)
{
if(lo >= hi)
return;
int p = a[lo+(hi-lo)/2];
int i = lo-1;
int j = hi+1;
int t;
while(1){
while (a[++i] < p);
while (a[--j] > p);
if(i >= j)
break;
t = a[i]; // swap
a[i] = a[j];
a[j] = t;
}
i = j++;
QuickSort(a, lo, i);
QuickSort(a, j, hi);
}

Find a hash function to malfunction insertion sort

Below is the original pseudocode for Insertion Sort:
function INSERTIONSORT(A[0..n−1])
for i←1 to n−1 do
j←i−1
while j≥0 and A[j+1]<A[j] do
SWAP(A[j+1],A[j])
j←j−1
A company is using Insertion Sort inside one of their products. You are a cybersecurity expert that was hired by this company to assess any security flaws with their code. After a few tries, you managed to attack their Insertion Sort code and modify it in the following way:
function INSERTIONSORT(A[0..n−1])
for i←1 to n−1 do
j←i−1
while j≥0 and HASH(A,j+1) < HASH(A,j) do
SWAP(A[j+1],A[j])
j←j−1
In other words, instead of indexing the array as A[j] and A[j+1] inside the “while” condition, you now have a hash function that takes the array and an index as arguments and returns an integer. Your job is to implement specific hash functions that will cause the algorithm to malfunction in different ways.
a) Implement a hash function that causes Insertion Sort to keep the original array unchanged. Explain why your solution works.
b) Implement a hash function that causes Insertion Sort to always run in the worst case complexity, even if the resulting array does not end up getting sorted. Explain why your solution works.
c) Implement a hash function that causes Insertion Sort to sort the array in reverse. Explain why your solution works.
I think (a) and (b) are hash(A,j)=j and hash(A,j)=-j, but I have no idea whether that is correct and have no clue about (c).
Part a) Original array unchanged
#include <stdio.h>
int hash(int arr[], int i) {
return i;
}
void insertionSort(int arr[], int n) {
int i, j, temp;
for (i = 1 ; i <= n - 1; i++)
{
j = i-1;
while ( j >= 0 && hash(arr, j+1) < hash(arr, j))
{
temp = arr[j];
arr[j] = arr[j+1];
arr[j+1] = temp;
j--;
}
}
}
int main()
{
int i;
int arr[] = {5, 6, 7, 3, 2 , 9, 4};
int n = sizeof(arr)/sizeof(arr[0]);
insertionSort(arr, n);
printf("Original array unchanged:\n");
for (i = 0; i <= n - 1; i++)
{
printf("%d\n", arr[i]);
}
return 0;
}
Part b) Worst Case insertion sort
#include <stdio.h>
int hash(int arr[], int i) {
return -i;
}
void insertionSort(int arr[], int n) {
int i, j, temp;
for (i = 1 ; i <= n - 1; i++)
{
j = i-1;
while ( j >= 0 && hash(arr, j+1) < hash(arr, j))
{
temp = arr[j];
arr[j] = arr[j+1];
arr[j+1] = temp;
j--;
}
}
}
int main()
{
int i;
int arr[] = {5, 6, 7, 3, 2 , 9, 4};
int n = sizeof(arr)/sizeof(arr[0]);
insertionSort(arr, n);
printf("In worst case(number of swaps maximum)\n");
for (i = 0; i <= n - 1; i++)
{
printf("%d\n", arr[i]);
}
return 0;
}
Part c) Sorted in reverse order
#include <stdio.h>
int hash(int arr[], int i) {
return -arr[i];
}
void insertionSort(int arr[], int n) {
int i, j, temp;
for (i = 1 ; i <= n - 1; i++)
{
j = i-1;
while ( j >= 0 && hash(arr, j+1) < hash(arr, j))
{
temp = arr[j];
arr[j] = arr[j+1];
arr[j+1] = temp;
j--;
}
}
}
int main()
{
int i;
int arr[] = {5, 6, 7, 3, 2 , 9, 4};
int n = sizeof(arr)/sizeof(arr[0]);
insertionSort(arr, n);
printf("Sorted in reverse order:\n");
for (i = 0; i <= n - 1; i++)
{
printf("%d\n", arr[i]);
}
return 0;
}
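The question also asks why each hash works, so here is a compact sketch that runs all three against one sort routine and counts swaps. The function names and the swap counter are my own additions for illustration, and hashReverse assumes no element equals INT_MIN, since negating it would overflow.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Hash = int (*)(const std::vector<int>&, int);

// a) j+1 < j is never true, so the while loop never swaps anything.
int hashKeep(const std::vector<int>&, int i)      { return i; }
// b) -(j+1) < -j is always true, so every element is swapped down to index 0.
int hashWorst(const std::vector<int>&, int i)     { return -i; }
// c) -A[j+1] < -A[j] is the same as A[j+1] > A[j]: the comparison is flipped.
int hashReverse(const std::vector<int>& a, int i) { return -a[i]; }

// Runs the modified insertion sort and returns the number of swaps performed.
long long insertionSortWith(std::vector<int>& a, Hash hash)
{
    long long swaps = 0;
    int n = static_cast<int>(a.size());
    for (int i = 1; i < n; ++i) {
        int j = i - 1;
        while (j >= 0 && hash(a, j + 1) < hash(a, j)) {
            std::swap(a[j + 1], a[j]);
            ++swaps;
            --j;
        }
    }
    return swaps;
}
```

With hashWorst the inner loop only stops when j drops below 0, so the sort always performs n(n-1)/2 swaps regardless of the input; as a side effect the array ends up reversed rather than sorted.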

Kth largest number, why the runtime of this is O(n) not O(nlogn)

I came across kth largest number problem in Leetcode
Input: [3,2,1,5,6,4] and k = 2, Output: 5
Suggested Solution:
public int findKthLargest(int[] nums, int k) {
shuffle(nums);
k = nums.length - k;
int lo = 0;
int hi = nums.length - 1;
while (lo < hi) {
final int j = partition(nums, lo, hi);
if(j < k) {
lo = j + 1;
} else if (j > k) {
hi = j - 1;
} else {
break;
}
}
return nums[k];
}
private int partition(int[] a, int lo, int hi) {
int i = lo;
int j = hi + 1;
while(true) {
while(i < hi && less(a[++i], a[lo]));
while(j > lo && less(a[lo], a[--j]));
if(i >= j) {
break;
}
exch(a, i, j);
}
exch(a, lo, j);
return j;
}
private void exch(int[] a, int i, int j) {
final int tmp = a[i];
a[i] = a[j];
a[j] = tmp;
}
private boolean less(int v, int w) {
return v < w;
}
Doesn't partition take O(n) and the while loop in the main function take O(log n) so it should be O(nlog n)? This looks like it uses Quicksort but the runtime for quicksort is O(nlogn). If quicksort takes O(n), this makes sense but it does not. Please help me understand what is going on?
This is a randomized algorithm with average/expected O(n) runtime. Unlike quicksort, it continues into only one of the two parts after each partition. After randomly shuffling the input list, the pivots are typically good enough that each call to partition that does not find the target still cuts the list to be searched next roughly in half. So even if we are unlucky and have to keep calling partition, the list keeps shrinking by about half each time, and the average runtime is still only O(n), since O(n) + O(n/2) + O(n/4) + ... + O(1) is still O(n).
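To make the last step concrete, here is the sum written out under the idealized assumption that every partition call costs at most c times the current length (c is just an illustrative constant) and halves the range exactly:

```latex
\[
  \sum_{i=0}^{\infty} \frac{c\,n}{2^{i}}
  \;=\; c\,n \sum_{i=0}^{\infty} \frac{1}{2^{i}}
  \;=\; 2\,c\,n
  \;=\; O(n)
\]
```

Quicksort, by contrast, has to process both parts, so every one of the roughly log n levels of recursion still touches about n elements in total, which is where its O(n log n) comes from.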

Find a subsequence of length k whose sum is equal to given sum

Given an array A and a sum, I want to find out if there exists a subsequence of length K such that the sum of all elements in the subsequence equals the given sum.
Code:
for i in(1,N):
for len in (i-1,0):
for sum in (0,Sum of all element)
Possible[len+1][sum] |= Possible[len][sum-A[i]]
The time complexity is O(N^2 * Sum). Is there any way to improve the time complexity to O(N * Sum)?
My function shifts a window of k adjacent array items across the array A and keeps the sum up to date until it matches or the search fails.
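#include <assert.h>
#include <stddef.h>
int getSubSequenceStart(int A[], size_t len, int sum, size_t k)
{
    int sumK = 0;
    assert(len > 0);
    assert(k > 0 && k <= len);
    // compute sum for first k items
    for (size_t i = 0; i < k; i++)
    {
        sumK += A[i];
    }
    // shift k-window up to end of A, testing each window before it moves
    for (size_t j = k; j < len; j++)
    {
        if (sumK == sum)
        {
            return (int)(j - k);
        }
        sumK += A[j] - A[j - k];
    }
    // the last window is never tested inside the loop, so test it here
    return (sumK == sum) ? (int)(len - k) : -1;
}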
Complexity is linear with the length of array A.
Update for the general, non-contiguous case:
To find a possibly non-contiguous subsequence, you could transform your problem into a subset sum problem by subtracting sum/k from every element of A and looking for a subset of k elements with sum zero. The subset sum problem is NP-complete, and no polynomial-time algorithm is known for it in general. Therefore, you cannot hope for a linear algorithm, unless your array A has special properties.
Edit:
This could actually be solved without the queue in linear time (negative numbers allowed).
C# code:
bool SubsequenceExists(int[] a, int k, int sum)
{
    int currentSum = 0;
    if (a.Length < k) return false;
    for (int i = 0; i < a.Length; i++)
    {
        if (i < k)
        {
            currentSum += a[i];
            continue;
        }
        if (currentSum == sum) return true;
        currentSum += a[i] - a[i-k];
    }
    // the final window is not tested inside the loop
    return currentSum == sum;
}
Original answer:
Assuming you can use a queue of length K, something like this should do the job in linear time.
C# code:
bool SubsequenceExists(int[] a, int k, int sum)
{
    int currentSum = 0;
    if (a.Length < k) return false;
    var queue = new Queue<int>();
    for (int i = 0; i < a.Length; i++)
    {
        if (i < k)
        {
            queue.Enqueue(a[i]);
            currentSum += a[i];
            continue;
        }
        if (currentSum == sum) return true;
        currentSum -= queue.Dequeue();
        queue.Enqueue(a[i]);
        currentSum += a[i];
    }
    // the final window is not tested inside the loop
    return currentSum == sum;
}
The logic behind that is pretty straightforward:
1. We populate a queue with the first K elements while also storing their sum somewhere.
2. If the resulting sum is not equal to sum, then we dequeue an element from the queue and add the next one from A (while updating the sum).
3. We repeat step 2 until we either reach the end of the sequence or find the matching subsequence.
Ta-daa!
Let is_subset_sum(int set[], int n, int sum) be the function to find whether there is a subset of set[] with sum equal to sum. n is the number of elements in set[].
The is_subset_sum problem can be divided into two subproblems
Include the last element, recur for n = n-1, sum = sum – set[n-1]
Exclude the last element, recur for n = n-1.
If any of the above subproblems return true, then return true.
Following is the recursive formula for is_subset_sum() problem.
is_subset_sum(set, n, sum) = is_subset_sum(set, n-1, sum) || is_subset_sum(set, n-1, sum-set[n-1])
Base Cases:
is_subset_sum(set, n, sum) = false, if sum > 0 and n == 0
is_subset_sum(set, n, sum) = true, if sum == 0
We can solve the problem in pseudo-polynomial time using dynamic programming. We create a boolean 2D table subset[][] and fill it in a bottom-up manner. The value of subset[i][j] will be true if there is a subset of set[0..j-1] with sum equal to i, otherwise false. Finally, we return subset[sum][n].
The time complexity of the solution is O(sum*n).
Implementation in C
// A Dynamic Programming solution for subset sum problem
#include <stdbool.h>
#include <stdio.h>
// Returns true if there is a subset of set[] with sum equal to given sum
bool is_subset_sum(int set[], int n, int sum) {
// The value of subset[i][j] will be true if there is a
// subset of set[0..j-1] with sum equal to i
bool subset[sum+1][n+1];
// If sum is 0, then answer is true
for (int i = 0; i <= n; i++)
subset[0][i] = true;
// If sum is not 0 and set is empty, then answer is false
for (int i = 1; i <= sum; i++)
subset[i][0] = false;
// Fill the subset table in bottom-up manner
for (int i = 1; i <= sum; i++) {
for (int j = 1; j <= n; j++) {
subset[i][j] = subset[i][j-1];
if (i >= set[j-1])
subset[i][j] = subset[i][j] || subset[i - set[j-1]][j-1];
}
}
/* // uncomment this code to print table
for (int i = 0; i <= sum; i++) {
for (int j = 0; j <= n; j++)
printf ("%4d", subset[i][j]);
printf("\n");
} */
return subset[sum][n];
}
// Driver program to test above function
int main() {
int set[] = {3, 34, 4, 12, 5, 2};
int sum = 9;
int n = sizeof(set)/sizeof(set[0]);
if (is_subset_sum(set, n, sum) == true)
printf("Found a subset with given sum");
else
printf("No subset with given sum");
return 0;
}
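The table above ignores the length constraint from the question. A rough sketch that also tracks how many elements were chosen, along the lines of the question's Possible[len][sum] recurrence, might look like this in C++. The function name is hypothetical, and the code assumes non-negative elements and a non-negative target.

```cpp
#include <cassert>
#include <vector>

// possible[len][s] is true when some subsequence of exactly len elements sums to s.
// Assumption: all elements and the target are non-negative.
bool hasSubsequenceOfLengthK(const std::vector<int>& a, int k, int target)
{
    if (k < 0 || target < 0 || k > static_cast<int>(a.size()))
        return false;
    std::vector<std::vector<bool>> possible(
        k + 1, std::vector<bool>(target + 1, false));
    possible[0][0] = true;               // zero elements sum to zero
    for (int x : a) {
        assert(x >= 0);                  // negative values would index out of range
        // Walk len downwards so each element is used at most once.
        for (int len = k - 1; len >= 0; --len)
            for (int s = x; s <= target; ++s)
                if (possible[len][s - x])
                    possible[len + 1][s] = true;
    }
    return possible[k][target];
}
```

This runs in O(N * K * Sum) time and O(K * Sum) space, so it does not reach the O(N * Sum) the question asks about; it only shows how the length dimension fits into the table.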

Quick sort logic

I am trying to implement quick sort in java and I have one doubt. So here's my quick sort code:
package com.sorting;
public class QuickSort implements Sort {
@Override
public int [] sort(int[] arr) {
return quickSort(arr, 0, arr.length - 1);
}
private int [] quickSort(int[] numbers, int low, int high) {
if (low < high) {
int q = partitionTheArrayAroundPivot(numbers, low, high);
if (low < q)
quickSort(numbers, low, q);
if ((q+1) < high)
quickSort(numbers, q + 1, high);
}
return numbers;
}
private int partitionTheArrayAroundPivot(int[] numbers, int low, int high) {
int pivot = selectPivot(numbers, low, high);
int i = low;
int j = high;
while (true) {
while (numbers[i] < pivot) {
i++;
}
while (numbers[j] > pivot) {
j--;
}
if ( i <= j) {
swap(numbers, i, j);
i++;
j--;
} else {
return j;
}
}
}
private int selectPivot(int[] numbers, int low, int high) {
return numbers[high];
}
private void swap(int[] numbers, int i, int j) {
int temp = numbers[i];
numbers[i] = numbers[j];
numbers[j] = temp;
}
}
Doubt 1: We keep increasing the index i till we hit a number which is >= pivot
while (numbers[i] < pivot)
i++;
Similarly we keep decreasing the index j till we hit a number which is <= pivot
while (numbers[j] > pivot)
j--;
So, this means that both indexes will also come out of the loop if both hits pivots at two different places e.g. 1,0,1 here if pivot is 1, then i will be 0 and j will be 2. And the below condition will be satisfied
if (i <= j) {
....
}
but in that case it won't be able to sort the above array (1,0,1) because after swapping we are increasing i and decreasing j so the value become i = j = 1. After that i will hit the third element i.e 1 and will again come out of the loop with value i = 2 and similarly j = 0 and we will not be able to sort the array.
So where's the problem? Am I missing something?
I would rewrite the code a little so that selectPivot returns the index instead:
private int selectPivotIndex(int[] numbers, int low, int high) {
return high;
}
Then the partitioning function can move the pivot aside and sort the remaining items according to the pivot value. A single loop will do it; in this implementation duplicate pivots will end up on the right side:
private int partitionTheArrayAroundPivot(int[] numbers, int low, int high) {
    int pivotIndex = selectPivotIndex(numbers, low, high);
    swap(numbers, pivotIndex, high); // Not needed if selectPivotIndex always returns high
    int pivot = numbers[high]; // the pivot now sits at index high
    int newPivotIndex = low;
    for(int i = low; i < high; i++)
    {
        if(numbers[i] < pivot)
        {
            swap(numbers, i, newPivotIndex);
            newPivotIndex++;
        }
    }
    swap(numbers, newPivotIndex, high); // put the pivot in its final place
    return newPivotIndex;
}
Finally, a small adjustment is needed in the quickSort method so that we don't end up in infinite recursion; the pivot at index q is already in its final place and must be left out of the left call:
if (low < q)
quickSort(numbers, low, q - 1);
This approach is IMHO easier to understand and debug, hope it works for you.
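As a quick way to check the approach against the (1,0,1) example from the question, here is a small C++ sketch of the same single-loop partition (commonly known as the Lomuto scheme) with the last element as pivot. The function names are mine and only for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Single-loop partition with v[high] as the pivot; returns the pivot's final index.
int partitionLastAsPivot(std::vector<int>& v, int low, int high)
{
    int pivot = v[high];
    int store = low;                     // next slot for an element < pivot
    for (int i = low; i < high; ++i) {
        if (v[i] < pivot) {
            std::swap(v[i], v[store]);
            ++store;
        }
    }
    std::swap(v[store], v[high]);        // put the pivot in its final place
    return store;
}

void quickSortLomuto(std::vector<int>& v, int low, int high)
{
    if (low >= high)
        return;
    int q = partitionLastAsPivot(v, low, high);
    quickSortLomuto(v, low, q - 1);      // the pivot at q is excluded from both calls
    quickSortLomuto(v, q + 1, high);
}
```

Calling quickSortLomuto on {1, 0, 1} with low = 0 and high = 2 leaves the vector as {0, 1, 1}.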
Use
while (numbers[i] <= pivot) and
while (numbers[j] >= pivot) and your code will work

Resources