Binary search bounds

Binary search bounds - algorithm

I always have the hardest time with this and I have yet to see a definitive explanation for something that is supposedly so common and highly-used.
We already know the standard binary search. Given starting lower and upper bounds, find the middle point at (lower + higher)/2, and then compare it against your array, and then re-set the bounds accordingly, etc.
However what are the needed differences to adjust the search to find (for a list in ascending order):
Smallest value >= target
Smallest value > target
Largest value <= target
Largest value < target
It seems like each of these cases requires very small tweaks to the algorithm but I can never get them to work right. I try changing inequalities, return conditions, I change how the bounds are updated, but nothing seems consistent.
What are the definitive ways to handle these four cases?

I had exactly the same issue until I figured out loop invariants along with predicates are the best and most consistent way of approaching all binary problems.
Point 1: Think of predicates
In general, for all four cases (and also the normal binary search for equality), think of the comparison as a predicate. Some of the values satisfy the predicate and some fail it. Consider, for example, this array with a target of 5:
[1, 2, 3, 4, 6, 7, 8]. Finding the first number greater than 5 is equivalent to finding the first 1 in this array: [0, 0, 0, 0, 1, 1, 1].
Point 2: Search boundaries inclusive
I like to have both ends always inclusive. Some people prefer the start to be inclusive and the end exclusive (len instead of len - 1). I like to keep both bounds inside the array, so that when referring to a[mid] I don't have to think about whether it will go out of bounds. So my preference: go inclusive!!!
Point 3: While loop condition <=
We want to process even a subarray of size 1 inside the while loop, so that when the loop finishes there is no unprocessed element left. I really like this logic. It's always solid as a rock. Initially none of the elements have been inspected; they are all unknown. That is, everything in the range [start = 0, end = len - 1] is uninspected. When the while loop finishes, the range of uninspected elements should be empty!
Point 4: Loop invariants
Since we defined start = 0, end = len - 1, invariants will be like this:
Anything left of start is smaller than target.
Anything right of end is greater than or equal to the target.
Point 5: The answer
Once the loop finishes, basically based on the loop invariants anything to the left of start is smaller. So that means that start is the first element greater than or equal to the target.
Equivalently, anything to the right of end is greater than or equal to the target. So that means the answer is also equal to end + 1.
The code:
public int find(int[] a, int target) {
    int start = 0;
    int end = a.length - 1;
    while (start <= end) {
        int mid = (start + end) / 2; // or, to avoid overflow: start + (end - start) / 2
        if (a[mid] < target)
            start = mid + 1;
        else // a[mid] >= target
            end = mid - 1;
    }
    return start; // or end + 1;
}
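Variations:
< (largest value < target)
The if condition stays the same; you now want the last 0 instead of the first 1. So basically only the return changes.
return end; // or return start - 1;
> (smallest value > target)
Change the if condition to <= and the else will be >. No other change.
<= (largest value <= target)
Same as >, but return end; // or return start - 1;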
So in general with this model for all the 5 variations (<=, <, >, >=, normal binary search) only the condition in the if changes and the return statement. And figuring those small changes is super easy when you consider the invariants (point 4) and the answer (point 5).
Hope this clarifies things for whoever reads this. If anything is unclear or feels like magic, please ping me to explain. After understanding this method, everything about binary search should be as clear as day!
Extra point: It would be a good practice to also try including the start but excluding the end. So the array would be initially [0, len). If you can write the invariants, new condition for the while loop, the answer and then a clear code, it means you learnt the concept.
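As a minimal sketch of the model above (written in Python for brevity; the function names are illustrative and not from the original answer), all four queries reduce to finding the first index where a monotone predicate becomes true. Each function returns an index, with len(a) or -1 signalling that no element qualifies.

```python
def first_index_where(a, pred):
    """Return the first index i with pred(a[i]) true, or len(a) if there is none.

    Assumes pred is monotone over a: false on a prefix, true on the rest.
    """
    start, end = 0, len(a) - 1            # inclusive bounds; everything inside is uninspected
    while start <= end:
        mid = start + (end - start) // 2
        if pred(a[mid]):
            end = mid - 1                 # invariant: everything right of end satisfies pred
        else:
            start = mid + 1               # invariant: everything left of start fails pred
    return start                          # equals end + 1 when the loop exits


def smallest_ge(a, target):
    """Index of the smallest value >= target, or len(a)."""
    return first_index_where(a, lambda x: x >= target)


def smallest_gt(a, target):
    """Index of the smallest value > target, or len(a)."""
    return first_index_where(a, lambda x: x > target)


def largest_le(a, target):
    """Index of the largest value <= target, or -1."""
    return first_index_where(a, lambda x: x > target) - 1


def largest_lt(a, target):
    """Index of the largest value < target, or -1."""
    return first_index_where(a, lambda x: x >= target) - 1
```

For a = [1, 2, 3, 4, 6, 7, 8] and target 5, smallest_ge and smallest_gt both give index 4 (value 6), while largest_le and largest_lt both give index 3 (value 4).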

Binary search (at least the way I implement it) relies on a simple property: a predicate holds true for one end of the interval and does not hold true for the other end. I always consider my interval to be closed at one end and open at the other. So let's take a look at this code snippet:
int beg = 0; // pred(beg) should hold true
int end = n; // length of the array, or a value that is guaranteed to be outside the interval we are interested in
while (end - beg > 1) {
    int mid = (end + beg) / 2;
    if (pred(a[mid])) {
        beg = mid;
    } else {
        end = mid;
    }
}
// answer is at a[beg]
This will work for any of the comparisons you define. Simply replace pred with <=target or >=target or <target or >target.
After the cycle exits, a[beg] will be the last element for which the given inequality holds.
So let's assume (as suggested in the comments) that we want to find the largest number for which a[i] <= target. Then, if we use the predicate a[i] <= target, the code will look like:
int beg = 0; // pred(beg) should hold true
int end = n; // length of the array, or a value that is guaranteed to be outside the interval we are interested in
while (end - beg > 1) {
    int mid = (end + beg) / 2;
    if (a[mid] <= target) {
        beg = mid;
    } else {
        end = mid;
    }
}
And after the cycle exits, the index that you are searching for will be beg.
Also, depending on the comparison, you may have to start from the right end of the array. E.g. if you are searching for the smallest value >= target, you will do something like:
beg = -1;
end = n - 1;
while (end - beg > 1) {
    int mid = (end + beg) / 2;
    if (a[mid] >= target) {
        end = mid;
    } else {
        beg = mid;
    }
}
And the value that you are searching for will be with index end. Note that in this case I consider the interval (beg, end] and thus I've slightly modified the starting interval.
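A sketch of the same idea in Python, assuming the predicate is monotone over the array. Unlike the snippets above, it starts with sentinel bounds beg = -1 and end = n that are never dereferenced, so neither end has to satisfy the predicate up front; that choice and the function names are assumptions made for illustration.

```python
def last_true(a, pred):
    """Index of the last element satisfying pred, or -1 if none does.

    Assumes pred is true on a prefix of a and false on the rest.
    """
    beg, end = -1, len(a)        # pred is treated as true at -1 and false at len(a)
    while end - beg > 1:
        mid = (beg + end) // 2   # strictly between beg and end, so always a valid index
        if pred(a[mid]):
            beg = mid
        else:
            end = mid
    return beg


def first_true(a, pred):
    """Index of the first element satisfying pred, or len(a) if none does.

    Assumes pred is false on a prefix of a and true on the rest.
    """
    beg, end = -1, len(a)        # pred is treated as false at -1 and true at len(a)
    while end - beg > 1:
        mid = (beg + end) // 2
        if pred(a[mid]):
            end = mid
        else:
            beg = mid
    return end
```

With a = [3, 5, 5, 7, 10, 15], last_true(a, lambda x: x <= 5) gives 2 and first_true(a, lambda x: x >= 6) gives 3.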

The basic binary search looks for the position/value that equals the target key. It can be extended to find the minimal position/value that satisfies some condition, or the maximal position/value that satisfies some condition.
Suppose the array is in ascending order; if no position/value satisfies the condition, return -1.
Code sample:
// find the minimal position which satisfy some condition
private static int getMinPosition(int[] arr, int target) {
int l = 0, r = arr.length - 1;
int ans = -1;
while(l <= r) {
int m = (l + r) >> 1;
// feel free to replace the condition
// here it means find the minimal position that the element not smaller than target
if(arr[m] >= target) {
ans = m;
r = m - 1;
} else {
l = m + 1;
}
}
return ans;
}
// find the maximal position which satisfy some condition
private static int getMaxPosition(int[] arr, int target) {
int l = 0, r = arr.length - 1;
int ans = -1;
while(l <= r) {
int m = (l + r) >> 1;
// feel free to replace the condition
// here it means find the maximal position that the element less than target
if(arr[m] < target) {
ans = m;
l = m + 1;
} else {
r = m - 1;
}
}
return ans;
}
int[] a = {3, 5, 5, 7, 10, 15};
System.out.println(BinarySearchTool.getMinPosition(a, 5)); // 1 (first element >= 5)
System.out.println(BinarySearchTool.getMinPosition(a, 6)); // 3 (first element >= 6 is 7)
System.out.println(BinarySearchTool.getMaxPosition(a, 8)); // 3 (last element < 8 is 7)

What you need is a binary search that lets you participate in the process at the last step. The typical binary search would receive (array, element) and produce a value (normally the index, or not found). But if you have a modified binary search that accepts a function to be invoked at the end of the search, you can cover all cases.
For example, in Javascript to make it easy to test, the following binary search
function binarySearch(array, el, fn) {
function aux(left, right) {
if (left > right) {
return fn(array, null, left, right);
}
var middle = Math.floor((left + right) / 2);
var value = array[middle];
if (value > el) {
return aux(left, middle - 1);
} if (value < el) {
return aux(middle + 1, right);
} else {
return fn(array, middle, left, right);
}
}
return aux(0, array.length - 1);
}
would allow you to cover each case with a particular return function.
default
function(a, m) { return m; }
Smallest value >= target
function(a, m, l, r) { return m != null ? a[m] : r + 1 >= a.length ? null : a[r + 1]; }
Smallest value > target
function(a, m, l, r) { var i = (m != null ? m : r) + 1; return i >= a.length ? null : a[i]; }
Largest value <= target
function(a, m, l, r) { return m != null ? a[m] : l - 1 >= 0 ? a[l - 1] : null; }
Largest value < target
function(a, m, l, r) { var i = (m != null ? m : l) - 1; return i < 0 ? null : a[i]; }
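For comparison, Python's standard bisect module covers the same four cases without a callback. This is a sketch rather than a translation of the JavaScript above; the wrapper names are made up for illustration, and each returns None when no element qualifies.

```python
from bisect import bisect_left, bisect_right


def value_ge(a, target):
    """Smallest value >= target in sorted list a, or None."""
    i = bisect_left(a, target)       # first index with a[i] >= target
    return a[i] if i < len(a) else None


def value_gt(a, target):
    """Smallest value > target in sorted list a, or None."""
    i = bisect_right(a, target)      # first index with a[i] > target
    return a[i] if i < len(a) else None


def value_le(a, target):
    """Largest value <= target in sorted list a, or None."""
    i = bisect_right(a, target)
    return a[i - 1] if i > 0 else None


def value_lt(a, target):
    """Largest value < target in sorted list a, or None."""
    i = bisect_left(a, target)
    return a[i - 1] if i > 0 else None
```

Because bisect_left and bisect_right differ only in how they treat elements equal to the target, duplicates are handled without any special casing.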


Binary search, when to use right = mid - 1, and when to use right = mid?

I was working through this problem on leetcode https://leetcode.com/problems/leftmost-column-with-at-least-a-one/ and I can't think of an intuitive answer.
Why is the right (or high) pointer sometimes set to mid - 1, and why is it sometimes correct to set to mid?
I am aware that we must always set left = mid + 1 because of integer division. When only two elements remain, we need to set mid + 1 to avoid an infinite loop.
But what are the cases to use right = mid - 1, vs right = mid?
Thanks.

Let's say you are doing binary search on a sequence like below
.....0, 0, 0, 0, 1, 1, 1, 1, ....
Your decision function fn returns true when the value at a position is 1.
Now consider your target is to find the last position for 0. In each step of binary search, we will reduce the search space such that we are certain the position is within the range.
If fn returns true for mid, you know that the last position for 0 will be less than mid (because you want the last occurrence of 0, which must be before the first occurrence of 1). So you will update right=mid-1. If fn returns false, you update left=mid, since mid may itself be the last 0 (round mid up in this variant so the range still shrinks when only two elements remain).
Now consider your target is to find the first occurrence for 1.
Now if fn returns true you will update right=mid because you know the first occurrence of 1 will be on this position or left of it. In this case, if fn returns false, you will need to update left=mid+1.
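The two update rules can be sketched side by side in Python. The function names and the sentinel return values (len(bits) and -1 for "not present") are assumptions for this example. Note how the midpoint rounds down when right = mid is used and rounds up when left = mid is used, so the range shrinks on every step.

```python
def first_one(bits):
    """Index of the first 1 in a 0...01...1 sequence, or len(bits) if there is no 1."""
    left, right = 0, len(bits)                  # right = len(bits) stands for "no 1 found"
    while left < right:
        mid = left + (right - left) // 2        # lower middle, so mid < right
        if bits[mid] == 1:
            right = mid                         # mid may itself be the answer: keep it
        else:
            left = mid + 1                      # mid is a 0: the answer is strictly to the right
    return left


def last_zero(bits):
    """Index of the last 0 in a 0...01...1 sequence, or -1 if there is no 0."""
    left, right = -1, len(bits) - 1             # left = -1 stands for "no 0 found"
    while left < right:
        mid = left + (right - left + 1) // 2    # upper middle, so mid > left
        if bits[mid] == 1:
            right = mid - 1                     # mid is a 1: the answer is strictly to the left
        else:
            left = mid                          # mid may itself be the answer: keep it
    return left
```

The rule of thumb this illustrates: use `= mid` on the side where mid could still be the answer, and `mid ± 1` on the side where mid has been ruled out.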

There are three templates to implement binary search:
Template 1: Use when the decision at each step does not depend on other elements or on neighboring elements.
while(left <= right){
int mid = (int)Math.floor((left + right)/2);
if(nums[mid] == target) return mid;
else if(nums[mid] < target) {
left = mid + 1;
} else {
right = mid - 1;
}
}
Template 2: Use when searching a value requires accessing the current index and its immediate right neighbor's index
while(left < right){
// Prevent (left + right) overflow
int mid = left + (right - left) / 2;
if(nums[mid] == target){ return mid; }
else if(nums[mid] < target) {
left = mid + 1; }
else {
right = mid; }
}
Template 3: Use when searching a value requires accessing the current index and both its immediate left and right neighbor's index
while (left + 1 < right){
// Prevent (left + right) overflow
int mid = left + (right - left) / 2;
if (nums[mid] == target) {
return mid;
} else if (nums[mid] < target) {
left = mid;
} else {
right = mid;
}
}
Source: LeetCode Explore

Linear search on a small array outperforms binary search. Thus it should not really matter if you are using or not using -1, because your search should break out when right-left < N and then you can do a linear search between the two (N is a parameter which you can find for your particular application by running a benchmark).
Iirc N for integer numbers was ~700 when I measured it last.
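A rough sketch of that hybrid in Python: narrow the window with binary search, then scan linearly once it is small. The threshold parameter plays the role of N above; its default here is arbitrary and would need to be benchmarked for a real workload.

```python
def hybrid_search(a, target, threshold=8):
    """Return an index of target in sorted list a, or -1.

    Binary search shrinks the half-open window [left, right) until it holds
    at most `threshold` elements; a linear scan finishes the job.
    """
    threshold = max(1, threshold)          # a window of size 1 can always be scanned
    left, right = 0, len(a)
    while right - left > threshold:
        mid = left + (right - left) // 2   # left < mid < right because the window has >= 2 elements
        if a[mid] <= target:
            left = mid                     # if target is present, an occurrence sits at mid or later
        else:
            right = mid                    # everything from mid onward is too large
    for i in range(left, right):
        if a[i] == target:
            return i
    return -1
```

Because neither bound ever uses mid - 1 or mid + 1, the question of which one to pick disappears; the linear scan absorbs the off-by-one handling.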

Binary Search Explanation

Link: https://leetcode.com/problems/first-bad-version/discuss/71386/An-clear-way-to-use-binary-search
I am doing a question wherein, given a string like this "FFTTTT", I have to find either the rightmost F or the leftmost T.
The following is the code:
To find the leftmost T
public int firstBadVersionLeft(int n) {
int i = 1;
int j = n;
while (i < j) {
int mid = i + (j - i) / 2;
if (isBadVersion(mid)) {
j = mid;
} else {
i = mid + 1;
}
}
return i;
}
I have the following doubts:
I am unable to understand the intuition behind returning i. I mean, why didn't we return j. I did a trial run of the code in my mind, and it works out, but how do we know we have to return i.
Why didn't we do while (i<=j) and just while(i<j). I mean, how do we determine this?

You stop when i==j as far as I can tell. So you can return either of them since they have the same value.
The difference between the two conditions is i == j . In that case mid == i + 0/2 == i. So isBadVersion can return true or false. If it returns true then you do j= mid, but we already had i == j == mid, so you don't do anything and you have an infinite loop. If isBadVersion returns false then you will make i == j+1 and the loop will end. Since the boundaries are inclusive (they include the i'th and j'th element) this can only happen if there's no 'T' in the string.
So you do (i < j) to avoid that infinite loop case.
P.S. This code will return n if there's no 'T' in the string or if only the last character is a 'T'. Not sure if that's intended or not.
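To make the termination argument concrete, here is a small Python sketch that mirrors firstBadVersionLeft on a string of 'F' and 'T' characters. The function name and the use of a string in place of isBadVersion are assumptions for illustration; positions are 1-based as in the question.

```python
def first_t(s):
    """1-based position of the leftmost 'T' in s, mirroring firstBadVersionLeft.

    Assumes s is non-empty and looks like F...FT...T.
    """
    i, j = 1, len(s)
    while i < j:                     # stops exactly when i == j
        mid = i + (j - i) // 2       # rounds down, so mid < j and j = mid always shrinks the range
        if s[mid - 1] == 'T':        # plays the role of isBadVersion(mid)
            j = mid
        else:
            i = mid + 1
    return i                         # i == j here, so returning j would be the same
```

first_t("FFTTTT") returns 3. first_t("FFFF") returns 4 even though there is no 'T', which is the ambiguity mentioned in the P.S.; a caller that cannot rule this out has to check the returned position.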

SUM exactly using K elements solution

Problem: Given an array with N numbers, find a subset of size M (exactly M elements) that sums to SUM.
I am looking for a Dynamic Programming (DP) solution for this problem. Basically, I want to understand the matrix-filling approach. I wrote the program below but didn't add memoization, as I am still wondering how to do that.
#include <stdio.h>
#define SIZE(a) (sizeof(a)/sizeof((a)[0]))
int binary[100];
int a[] = {1, 2, 5, 5, 100};
void show(int* p, int size) {
int j;
for (j = 0; j < size; j++)
if (p[j])
printf("%d\n", a[j]);
}
void subset_sum(int target, int i, int sum, int *a, int size, int K) {
if (sum == target && !K) {
show(binary, size);
} else if (sum < target && i < size) {
binary[i] = 1;
subset_sum(target, i + 1, sum + a[i], a, size, K-1);
binary[i] = 0;
subset_sum(target, i + 1, sum, a, size, K);
}
}
int main() {
int target = 10;
int K = 2;
subset_sum(target, 0, 0, a, SIZE(a), K);
}
Does the recurrence below make sense?
Let DP[i][j][k] be true if some k elements picked from a[0..j] sum up to exactly i.
DP[i][j][k] = DP[i][j-1][k] || DP[i-a[j]][j-1][k-1] { input array a[0....j] }
Base cases are:
DP[0][0][0] = DP[0][j][0] = DP[0][0][k] = 1
DP[i][0][0] = DP[i][j][0] = 0
It means we can either consider this element ( DP[i-a[j]][j-1][k-1] ) or we don't consider the current element (DP[i][j-1][k]). If we consider current element, k is reduced by 1 which reduces the elements that needs to be considered and same goes when current element is not considered i.e. K is not reduced by 1.
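One way to fill the table bottom-up, sketched in Python under the assumption that all numbers and the target are non-negative integers. It drops the j dimension by processing the elements one at a time and updating the counts in decreasing order, so dp[c][s] plays the role of DP[s][j][c] for the elements seen so far. The function name is illustrative.

```python
def exact_k_subset_sum(a, target, k):
    """Return True if some k elements of a sum to exactly target.

    Assumes non-negative integers in a and target >= 0.
    """
    # dp[c][s] is True when some c of the elements processed so far sum to s.
    dp = [[False] * (target + 1) for _ in range(k + 1)]
    dp[0][0] = True                          # zero elements sum to zero
    for x in a:
        # Walk the counts downward so dp[c - 1] still describes the state
        # before x was considered, i.e. x is used at most once.
        for c in range(k, 0, -1):
            for s in range(target, x - 1, -1):
                if dp[c - 1][s - x]:         # take x: the DP[i - a[j]][j - 1][k - 1] branch
                    dp[c][s] = True          # skipping x leaves dp[c][s] as it was
    return dp[k][target]
```

With a = [1, 2, 5, 5, 100], exact_k_subset_sum(a, 10, 2) is True (5 + 5), while exact_k_subset_sum(a, 10, 3) is False.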

Your solution looks right to me.
Right now, you're basically backtracking over all possibilities and printing each solution. If you only want one solution, you could add a flag that you set when one solution was found and check before continuing with recursive calls.
For memoization, you should first get rid of the binary array, after which you can do something like this:
int memo[NUM_ELEMENTS + 1][MAX_SUM + 1][MAX_K + 1]; // must be filled with -1 (e.g. via memset) before the first call
bool subset_sum(int target, int i, int sum, int *a, int size, int K) {
    if (sum == target && !K) {
        memo[i][sum][K] = true;
        return memo[i][sum][K];
    } else if (sum < target && i < size && K > 0) {
        if (memo[i][sum][K] != -1)
            return memo[i][sum][K];
        memo[i][sum][K] = subset_sum(target, i + 1, sum + a[i], a, size, K - 1) ||
                          subset_sum(target, i + 1, sum, a, size, K);
        return memo[i][sum][K];
    }
    return false;
}
Then, look at memo[_all indexes_][target][K]. If this is true, there exists at least one solution. You can store additional information to get you that solution, or you can iterate with an i from found_index - 1 to 0 and check for which i you have memo[i][sum - a[i]][K - 1] == true. Then recurse on that, and so on. This will allow you to reconstruct the solution using just the memo array.
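A sketch of the same memoize-then-reconstruct idea in Python, using functools.lru_cache in place of the explicit memo array. It tracks the remaining sum rather than the accumulated sum and assumes non-negative integers; both are choices made for this example, and the function names are illustrative.

```python
from functools import lru_cache


def find_k_subset(a, target, k):
    """Return the indices of k elements of a that sum to target, or None.

    Assumes non-negative integers, so a negative remainder can be pruned.
    """
    n = len(a)

    @lru_cache(maxsize=None)
    def feasible(i, remaining, picks):
        # Can exactly `picks` elements of a[i:] sum to `remaining`?
        if picks == 0:
            return remaining == 0
        if i == n or remaining < 0:
            return False
        return (feasible(i + 1, remaining - a[i], picks - 1)   # take a[i]
                or feasible(i + 1, remaining, picks))          # skip a[i]

    if not feasible(0, target, k):
        return None

    # Walk forward through the memoized answers: take a[i] whenever doing so
    # still leaves a feasible sub-problem.
    chosen, remaining, picks = [], target, k
    for i in range(n):
        if picks and feasible(i + 1, remaining - a[i], picks - 1):
            chosen.append(i)
            remaining -= a[i]
            picks -= 1
    return chosen
```

For a = [1, 2, 5, 5, 100], find_k_subset(a, 10, 2) returns [2, 3] (the two fives), and find_k_subset(a, 10, 3) returns None.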

To my understanding, if only the feasibility of the input has to be checked, the problem can be solved with a two-dimensional state space
bool[][] IsFeasible = new bool[n][k]
where IsFeasible[i][j] is true if and only if there is a subset of the elements 1 to i which sum up to exactly j for every
1 <= i <= n
1 <= j <= k
and for this state space, the recurrence relation
IsFeasible[i][j] = IsFeasible[i-1][j-a[i]] || IsFeasible[i-1][j]
can be used, where the left-hand side of the or-operator || corresponds to selecting the i-th item and the right-hand side corresponds to not selecting the i-th item. The actual choice of items could be obtained by backtracking or auxiliary information saved during evaluation.

Given a bitonic array and element x in the array, find the index of x in 2log(n) time

First, a bitonic array for this question is defined as one such that for some index K in an array of length N where 0 < K < N - 1 and 0 to K is a monotonically increasing sequence of integers, and K to N - 1 is a monotonically decreasing sequence of integers.
Example: [1, 3, 4, 6, 9, 14, 11, 7, 2, -4, -9]. It monotonically increases from 1 to 14, then decreases from 14 to -9.
The precursor to this question is to solve it in 3log(n), which is much easier: one altered binary search to find the index of the max, then two binary searches for 0 to K and K + 1 to N - 1 respectively.
I presume the solution in 2log(n) requires you solve the problem without finding the index of the max. I've thought about overlapping the binary searches, but beyond that, I'm not sure how to move forward.

The algorithms presented in other answers (this and this) are unfortunately incorrect, they are not O(logN) !
The recursive formula f(L) = f(L/2) + log(L/2) + c doesn't lead to f(L) = O(log(N)) but leads to f(L) = O((log(N))^2) !
Indeed, assume k = log(L), then log(2^(k-1)) + log(2^(k-2)) + ... + log(2^1) = log(2)*(k-1 + k-2 + ... + 1) = O(k^2). Hence, log(L/2) + log(L/4) + ... + log(2) = O((log(L)^2)).
The right way to solve the problem in time ~ 2log(N) is to proceed as follows (assuming the array is first in ascending order and then in descending order):
Take the middle of the array
Compare the middle element with one of its neighbors to see if the max is on the right or on the left
Compare the middle element with the desired value
If the middle element is smaller than the desired value AND the max is on the left side, then do bitonic search on the left subarray (we are sure that the value is not in the right subarray)
If the middle element is smaller than the desired value AND the max is on the right side, then do bitonic search on the right subarray
If the middle element is bigger than the desired value, then do descending binary search on the right subarray and ascending binary search on the left subarray.
In the last case, it might be surprising to do a binary search on a subarray that may be bitonic but it actually works because we know that the elements that are not in the good order are all bigger than the desired value. For instance, doing an ascending binary search for the value 5 in the array [2, 4, 5, 6, 9, 8, 7] will work because 7 and 8 are bigger than the desired value 5.
Here is a fully working implementation (in C++) of the bitonic search in time ~2logN:
#include <iostream>
using namespace std;
const int N = 10;
void descending_binary_search(int (&array) [N], int left, int right, int value)
{
// cout << "descending_binary_search: " << left << " " << right << endl;
// empty interval
if (left == right) {
return;
}
// look at the middle of the interval
int mid = (right+left)/2;
if (array[mid] == value) {
cout << "value found" << endl;
return;
}
// interval is not splittable
if (left+1 == right) {
return;
}
if (value < array[mid]) {
descending_binary_search(array, mid+1, right, value);
}
else {
descending_binary_search(array, left, mid, value);
}
}
void ascending_binary_search(int (&array) [N], int left, int right, int value)
{
// cout << "ascending_binary_search: " << left << " " << right << endl;
// empty interval
if (left == right) {
return;
}
// look at the middle of the interval
int mid = (right+left)/2;
if (array[mid] == value) {
cout << "value found" << endl;
return;
}
// interval is not splittable
if (left+1 == right) {
return;
}
if (value > array[mid]) {
ascending_binary_search(array, mid+1, right, value);
}
else {
ascending_binary_search(array, left, mid, value);
}
}
void bitonic_search(int (&array) [N], int left, int right, int value)
{
// cout << "bitonic_search: " << left << " " << right << endl;
// empty interval
if (left == right) {
return;
}
int mid = (right+left)/2;
if (array[mid] == value) {
cout << "value found" << endl;
return;
}
// not splittable interval
if (left+1 == right) {
return;
}
if(array[mid] > array[mid-1]) {
if (value > array[mid]) {
return bitonic_search(array, mid+1, right, value);
}
else {
ascending_binary_search(array, left, mid, value);
descending_binary_search(array, mid+1, right, value);
}
}
else {
if (value > array[mid]) {
bitonic_search(array, left, mid, value);
}
else {
ascending_binary_search(array, left, mid, value);
descending_binary_search(array, mid+1, right, value);
}
}
}
int main()
{
int array[N] = {2, 3, 5, 7, 9, 11, 13, 4, 1, 0};
int value = 4;
int left = 0;
int right = N;
// prints "value found" if the desired value is in the bitonic array
bitonic_search(array, left, right, value);
return 0;
}
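For comparison, the steps described before the C++ listing can be sketched iteratively in Python, returning an index instead of printing. This is an illustrative variant, not a line-by-line port: it assumes strictly increasing then strictly decreasing values, and the helper names are made up for the example.

```python
def bitonic_search(a, value):
    """Return an index of value in the bitonic list a, or -1 if absent."""

    def ascending(lo, hi):
        # Plain binary search on a[lo:hi], treating it as ascending.
        while lo < hi:
            mid = (lo + hi) // 2
            if a[mid] == value:
                return mid
            if a[mid] < value:
                lo = mid + 1
            else:
                hi = mid
        return -1

    def descending(lo, hi):
        # Plain binary search on a[lo:hi], treating it as descending.
        while lo < hi:
            mid = (lo + hi) // 2
            if a[mid] == value:
                return mid
            if a[mid] > value:
                lo = mid + 1
            else:
                hi = mid
        return -1

    lo, hi = 0, len(a)                       # half-open window [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] == value:
            return mid
        if a[mid] > value:
            # The value can sit on either slope. Any out-of-order elements in
            # each half are larger than a[mid], hence larger than value, so
            # the plain searches still work.
            found = ascending(lo, mid)
            return found if found != -1 else descending(mid + 1, hi)
        # a[mid] < value: only the side containing the max can hold it.
        if mid + 1 < len(a) and a[mid] < a[mid + 1]:
            lo = mid + 1                     # still climbing: max is to the right
        else:
            hi = mid                         # already falling: max is to the left
    return -1
```

On the array from the C++ example, bitonic_search([2, 3, 5, 7, 9, 11, 13, 4, 1, 0], 4) returns 7.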

The algorithm works recursively by combining bitonic and binary searches:
def bitonic_search (array, value, lo = 0, hi = array.length - 1)
if array[lo] == value then return lo
if array[hi] == value then return hi
mid = (hi + lo) / 2
if array[mid] == value then return mid
if (mid > 0 & array[mid-1] < array[mid])
| (mid < array.length-1 & array[mid+1] > array[mid]) then
# max is to the right of mid
bin = binary_search(array, value, lo, mid-1)
if bin != -1 then return bin
return bitonic_search(array, value, mid+1, hi)
else # max is to the left of mid
bin = binary_search(array, value, mid+1, hi)
if bin != -1 then return bin
return bitonic_search(array, value, lo, mid-1)
So the recursive formula for the time is f(l) = f(l/2) + log(l/2) + c where log(l/2) comes from the binary search and c is the cost of the comparisons done in the function body.

The answers provided so far have a time complexity of (N/2)*logN, because the worst case may include too many unnecessary sub-searches. A modification is to compare the target value with the leftmost and rightmost elements of each subsequence before searching it. If the target value is not between the two ends of a monotonic subsequence, or is less than both ends of a bitonic subsequence, searching it is redundant. This modification leads to 2lgN complexity.

There are 5 main cases depending on where the max element of array is, and whether middle element is greater than desired value
Calculate middle element.
Compare the middle element with the desired value; if it matches, the search ends. Otherwise proceed to the next step.
Compare middle element with neighbors to see if max element is on left or right. If both of the neighbors are less than middle element, then element is not present in the array, hence exit.(Array mentioned in the question will hit this case first as 14, the max element, is in middle)
If middle element is less than desired value and max element is on right, do bitonic search in right subarray
If middle element is less than desired value and max element is on left, do bitonic search in left subarray
If middle element is greater than desired value and max element is on left, do descending binary search in right subarray
If middle element is greater than desired value and max element is on right, do ascending binary search in left subarray
In the worst case we will be doing two comparisons each time array is divided in half, hence complexity will be 2*logN

public int FindLogarithmicGood(int value)
{
int lo = 0;
int hi = _bitonic.Length - 1;
int mid;
while (hi - lo > 1)
{
mid = lo + ((hi - lo) / 2);
if (value < _bitonic[mid])
{
return DownSearch(lo, hi - lo + 1, mid, value);
}
else
{
if (_bitonic[mid] < _bitonic[mid + 1])
lo = mid;
else
hi = mid;
}
}
return _bitonic[hi] == value
? hi
: _bitonic[lo] == value
? lo
: -1;
}
where DownSearch is
public int DownSearch(int index, int count, int mid, int value)
{
int result = BinarySearch(index, mid - index, value);
if (result < 0)
result = BinarySearch(mid, index + count - mid, value, false);
return result;
}
and BinarySearch is
/// <summary>
/// Exactly log(n) on average and worst cases.
/// Note: System.Array.BinarySearch uses 2*log(n) in the worst case.
/// </summary>
/// <returns>array index</returns>
public int BinarySearch(int index, int count, int value, bool asc = true)
{
if (index < 0 || count < 0)
throw new ArgumentOutOfRangeException();
if (_bitonic.Length < index + count)
throw new ArgumentException();
if (count == 0)
return -1;
// "lo minus one" trick
int lo = index - 1;
int hi = index + count - 1;
int mid;
while (hi - lo > 1)
{
mid = lo + ((hi - lo) / 2);
if ((asc && _bitonic[mid] < value) || (!asc && _bitonic[mid] > value))
lo = mid;
else
hi = mid;
}
return _bitonic[hi] == value ? hi : -1;
}

Finding the change of sign among the first order differences, by standard dichotomic search, will take 2Lg(n) array accesses.
You can do slightly better by using the search strategy for the maximum of a unimodal function known as Fibonacci search. After n steps each involving a single lookup, you reduce the interval size by a factor Fn, corresponding to about Log n/Log φ ~ 1.44Lg(n) accesses to find the maximum.
This marginal gain makes a little more sense when array accesses are instead costly function evaluations.

When it comes to searching algorithms in O(log N) time, you've got to think of binary search.
The concept here is to first find the peak point,
for ex: Array = [1 3 5 6 7 12 6 4 2] -> Here, 12 is the peak. Once it is detected, mark it as mid, then simply do a binary search in Array[0:mid] and Array[mid:len(Array)].
Note: The second array, from mid to len, is descending, so the binary search needs a small variation.
For finding the Bitonic Point :-) [ Written in Python ]
def find_bitonic_point(arr):
    n = len(arr)
    start, end = 0, n - 1
    while start <= end:
        mid = start + (end - start) // 2  # parentheses needed: // binds tighter than -
        if (mid == 0 or arr[mid-1] < arr[mid]) and (mid == n-1 or arr[mid+1] < arr[mid]):
            return mid
        if mid > 0 and arr[mid-1] > arr[mid]:
            end = mid - 1
        else:
            start = mid + 1
    return -1
Once the index is found, do the respective binary searches. Voilà, all done :-)
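Putting the pieces together, a self-contained Python sketch of the peak-then-search approach might look like this. It assumes a non-empty, strictly bitonic array; the function names and the shared helper with an ascending flag are choices made for the example. This is the three-search variant, so it costs roughly 3 log N comparisons rather than 2 log N.

```python
def find_peak(arr):
    """Index of the maximum of a strictly bitonic, non-empty list."""
    start, end = 0, len(arr) - 1
    while start < end:
        mid = start + (end - start) // 2
        if arr[mid] < arr[mid + 1]:
            start = mid + 1          # still climbing: the peak is to the right
        else:
            end = mid                # falling: the peak is mid or to the left
    return start


def binary_search(arr, lo, hi, x, ascending=True):
    """Search arr[lo..hi] (inclusive) for x; return its index or -1."""
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if arr[mid] == x:
            return mid
        # Move right when x lies beyond mid in the direction of the ordering.
        if (arr[mid] < x) == ascending:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1


def search_via_peak(arr, x):
    """Return an index of x in the bitonic list arr, or -1."""
    if not arr:
        return -1
    peak = find_peak(arr)
    found = binary_search(arr, 0, peak, x, ascending=True)
    if found != -1:
        return found
    return binary_search(arr, peak + 1, len(arr) - 1, x, ascending=False)
```

For arr = [1, 3, 5, 6, 7, 12, 6, 4, 2], find_peak(arr) is 5 and search_via_peak(arr, 4) is 7. The value 6 appears once on each slope; this sketch reports the ascending occurrence, index 3.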

For a binary split, there are three cases:
max item is at right: binary search left, and bitonic search right.
max item is at left: binary search right, and bitonic search left.
max item is exactly at the split point: binary search both left and right.
Caution: the binary searches used on the left and on the right are different because of the increasing/decreasing order.
public static int bitonicSearch(int[] a, int lo, int hi, int key) {
int mid = (lo + hi) / 2;
int now = a[mid];
if (now == key)
return mid;
// deal with edge cases
int left = (mid == 0)? a[mid] : a[mid - 1];
int right = (mid == a.length-1)? a[mid] : a[mid + 1];
int leftResult, rightResult;
if (left < now && now < right) { // max item is at right
leftResult = binarySearchIncreasing(a, lo, mid - 1, key);
if (leftResult != -1)
return leftResult;
return bitonicSearch(a, mid + 1, hi, key);
}
else if (left > now && now > right) { // max item is at left
rightResult = binarySearchDecreasing(a, mid + 1, hi, key);
if (rightResult != -1)
return rightResult;
return bitonicSearch(a, lo, mid - 1, key);
}
else { // max item stands at the split point exactly
leftResult = binarySearchIncreasing(a, lo, mid - 1, key);
if (leftResult != -1)
return leftResult;
return binarySearchDecreasing(a, mid + 1, hi, key);
}
}

Searching for an element in a circular sorted array

We want to search for a given element in a circular sorted array in complexity not greater than O(log n).
Example: Search for 13 in {5,9,13,1,3}.
My idea was to convert the circular array into a regular sorted array and then do a binary search on the resulting array, but the algorithm I came up with takes O(n) in the worst case:
for(i = 1; i < a.length; i++){
if (a[i] < a[i-1]){
minIndex = i; break;
}
}
Then the corresponding index of the ith element will be determined from the following relation:
(i + minIndex - 1) % a.length
It is clear that my conversion (from circular to regular) algorithm may take O(n), so we need a better one.
According to ire_and_curses idea, here is the solution in Java:
public int circularArraySearch(int[] a, int low, int high, int x){
//instead of (low + high) / 2, which can overflow for large indices,
//we will use the unsigned right shift to get the average
int mid = (low + high) >>> 1;
if(a[mid] == x){
return mid;
}
//a variable to indicate which half is sorted
//1 for left, 2 for right
int sortedHalf = 0;
if(a[low] <= a[mid]){
//the left half is sorted
sortedHalf = 1;
if(x <= a[mid] && x >= a[low]){
//the element is in this half
return binarySearch(a, low, mid, x);
}
}
if(a[mid] <= a[high]){
//the right half is sorted
sortedHalf = 2;
if(x >= a[mid] && x<= a[high] ){
return binarySearch(a, mid, high, x);
}
}
// repeat the process on the unsorted half
if(sortedHalf == 1){
//left is sorted, repeat the process on the right one
return circularArraySearch(a, mid, high, x);
}else{
//right is sorted, repeat the process on the left
return circularArraySearch(a, low, mid, x);
}
}
Hopefully this will work.

You can do this by taking advantage of the fact that the array is sorted, except for the special case of the pivot value and one of its neighbours.
Find the middle value of the array a.
If a[0] < a[mid], then all values in the first half of the array are sorted.
If a[mid] < a[last], then all values in the second half of the array are sorted.
Take the sorted half, and check whether your value lies within it (compare to the maximum idx in that half).
If so, just binary search that half.
If it doesn't, it must be in the unsorted half. Take that half and repeat this process, determining which half of that half is sorted, etc.
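The steps above can be sketched as a single loop in Python. This version assumes distinct values and folds the "binary search the sorted half" step into the same loop, since an ordinary sorted range is just a special case; the function name is illustrative.

```python
def search_rotated(a, x):
    """Return the index of x in a rotated ascending list of distinct values, or -1."""
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if a[mid] == x:
            return mid
        if a[low] <= a[mid]:
            # a[low..mid] is sorted: check whether x lies inside it.
            if a[low] <= x < a[mid]:
                high = mid - 1
            else:
                low = mid + 1
        else:
            # Otherwise a[mid..high] must be sorted.
            if a[mid] < x <= a[high]:
                low = mid + 1
            else:
                high = mid - 1
    return -1
```

For the example from the question, search_rotated([5, 9, 13, 1, 3], 13) returns 2.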

Not very elegant, but off the top of my head: just use binary search to find the pivot of the rotated array, and then perform binary search again, compensating for the offset of the pivot. Kind of silly to perform two full searches, but it does fulfill the condition, since O(log n) + O(log n) == O(log n). Keep it simple and stupid(tm)!
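A minimal Python sketch of this two-pass idea, assuming distinct values. The first function finds the index of the smallest element (the pivot); the second runs an ordinary binary search over "virtual" sorted positions and maps each one back to a real index by adding the pivot offset modulo the length. Both function names are made up for the example.

```python
def find_min_index(a):
    """Index of the smallest element of a rotated ascending, non-empty list."""
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] > a[hi]:
            lo = mid + 1          # the wrap-around point is to the right of mid
        else:
            hi = mid              # mid could itself be the minimum
    return lo


def search_with_offset(a, x):
    """Return the index of x in a rotated ascending list of distinct values, or -1."""
    n = len(a)
    if n == 0:
        return -1
    shift = find_min_index(a)
    lo, hi = 0, n - 1                 # positions in the virtual, un-rotated array
    while lo <= hi:
        mid = (lo + hi) // 2
        real = (mid + shift) % n      # map the virtual position to the real index
        if a[real] == x:
            return real
        if a[real] < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

For a = [5, 9, 13, 1, 3], find_min_index(a) is 3 and search_with_offset(a, 13) is 2.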

This is an example that works in Java. Since this is a sorted array you take advantage of this and run a Binary Search, however it needs to be slightly modified to cater for the position of the pivot.
The method looks like this:
private static int circularBinSearch ( int key, int low, int high )
{
if (low > high)
{
return -1; // not found
}
int mid = (low + high) / 2;
steps++;
if (A[mid] == key)
{
return mid;
}
else if (key < A[mid])
{
return ((A[low] <= A[mid]) && (A[low] > key)) ?
circularBinSearch(key, mid + 1, high) :
circularBinSearch(key, low, mid - 1);
}
else // key > A[mid]
{
return ((A[mid] <= A[high]) && (key > A[high])) ?
circularBinSearch(key, low, mid - 1) :
circularBinSearch(key, mid + 1, high);
}
}
Now to ease any worries, here's a nice little class that verifies the algorithm:
import java.util.Arrays;
public class CircularSortedArray
{
public static final int[] A = {23, 27, 29, 31, 37, 43, 49, 56, 64, 78,
91, 99, 1, 4, 11, 14, 15, 17, 19};
static int steps;
// ---- Private methods ------------------------------------------
private static int circularBinSearch ( int key, int low, int high )
{
... copy from above ...
}
private static void find ( int key )
{
steps = 0;
int index = circularBinSearch(key, 0, A.length-1);
System.out.printf("key %4d found at index %2d in %d steps\n",
key, index, steps);
}
// ---- Static main -----------------------------------------------
public static void main ( String[] args )
{
System.out.println("A = " + Arrays.toString(A));
find(44); // should not be found
find(230);
find(-123);
for (int key: A) // should be found at pos 0..18
{
find(key);
}
}
}
That gives you an output of:
A = [23, 27, 29, 31, 37, 43, 49, 56, 64, 78, 91, 99, 1, 4, 11, 14, 15, 17, 19]
key 44 found at index -1 in 4 steps
key 230 found at index -1 in 4 steps
key -123 found at index -1 in 5 steps
key 23 found at index 0 in 4 steps
key 27 found at index 1 in 3 steps
key 29 found at index 2 in 4 steps
key 31 found at index 3 in 5 steps
key 37 found at index 4 in 2 steps
key 43 found at index 5 in 4 steps
key 49 found at index 6 in 3 steps
key 56 found at index 7 in 4 steps
key 64 found at index 8 in 5 steps
key 78 found at index 9 in 1 steps
key 91 found at index 10 in 4 steps
key 99 found at index 11 in 3 steps
key 1 found at index 12 in 4 steps
key 4 found at index 13 in 5 steps
key 11 found at index 14 in 2 steps
key 14 found at index 15 in 4 steps
key 15 found at index 16 in 3 steps
key 17 found at index 17 in 4 steps
key 19 found at index 18 in 5 steps

You have three values, l, m, h, for the values at the low, mid and high indexes of your search. Think about where you would continue searching for each possibility:
// normal binary search
l < t < m - search(t,l,m)
m < t < h - search(t,m,h)
// search over a boundary
l > m, t < m - search(t,l,m)
l > m, t > l - search(t,l,m)
m > h, t > m - search(t,m,h)
m > h, t < h - search(t,m,h)
It's a question of considering where the target value could be, and searching that half of the space. At most one half of the space will have the wrap-over in it, and it is easy to determine whether or not the target value is in that half or the other.
It's sort of a meta question: do you think of binary search in terms of how it is often presented, as finding a value between two points, or more generally as a repeated division of an abstract search space?
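The case analysis above can be folded into a single loop. A minimal sketch, assuming distinct values in an ascending array rotated by an unknown amount; `search_rotated` is a hypothetical name, and it checks which half is free of the wrap-over rather than listing all six cases.

```python
def search_rotated(arr, target):
    """Return the index of target in a rotated ascending array, or -1.

    Assumes all elements are distinct.
    """
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[lo] <= arr[mid]:
            # Left half has no wrap-over, so a plain range check decides.
            if arr[lo] <= target < arr[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:
            # The wrap-over is in the left half, so the right half is sorted.
            if arr[mid] < target <= arr[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1
```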

You can use binary search to find the location of the smallest element, which reduces it to O(log n).
You can find the location like this (this is just a sketch of the algorithm, assuming distinct elements, but you can get the idea from it):
1. i <- 1
2. j <- n
3. while i < j
3.1. k <- floor((i+j) / 2)
3.2. if arr[k] > arr[j] then i <- k+1
3.3. else j <- k
4. return i
After finding the location of the smallest element you can treat the array as two sorted arrays.

You just use a simple binary search as if it were a regular sorted array. The only trick is you need to rotate the array indexes:
(index + start-index) mod array-size
where the start-index is the offset of the first element in the circular array.

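// Assumes two static fields defined elsewhere; presumably dip is the index of
// the smallest element (the rotation offset) and N is buff.length.
public static int _search(int[] buff, int query){
    if(buff.length == 0) return -1;
    int s = 0;                // logical (un-rotated) bounds, both inclusive
    int e = buff.length - 1;
    while(e-s>1){
        int m = (s+e)/2;
        if(buff[offset(m)] == query){
            return offset(m);
        } else if(query < buff[offset(m)]){
            e = m;
        } else{
            s = m;
        }
    }
    // At most two candidates remain; return physical indexes, as above.
    if(buff[offset(e)]==query) return offset(e);
    if(buff[offset(s)]==query) return offset(s);
    return -1;
}
// Maps a logical index to its physical position in the rotated array.
public static int offset(int j){
    return (dip+j) % N;
}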

Check this code:
def findkey():
    key = 3
    A = [10, 11, 12, 13, 14, 1, 2, 3]
    l = 0
    h = len(A) - 1
    while l <= h:  # stop once the range is empty instead of looping forever
        mid = l + (h - l) // 2  # integer division
        if A[mid] == key:
            return mid
        if A[l] == key:
            return l
        if A[h] == key:
            return h
        if A[l] <= A[mid]:  # left half is sorted
            if A[l] < key < A[mid]:
                h = mid - 1
            else:
                l = mid + 1
        else:  # right half is sorted
            if A[mid] < key < A[h]:
                l = mid + 1
            else:
                h = mid - 1
    return -1  # key not present
if __name__ == '__main__':
    print(findkey())

Here is an idea, related to binary search. Just keep backing up your index for the right array-index bound, the left index bound is stored in the step size:
step = n
pos = n
while( step > 0 ):
test_idx = pos - step #back up your current position
if arr[test_idx-1] < arr[pos-1]:
pos = test_idx
if (pos == 1) break
step /= 2 #floor integer division
return arr[pos]
To avoid the (pos==1) thing, we could back up circularly (go into negative numbers) and take (pos-1) mod n.

I think you can find the offset using this code:
public static int findOffset(int [] arr){
return findOffset(arr,0,arr.length-1);
}
private static int findOffset(int[] arr, int start, int end) {
if(arr[start]<=arr[end]){ // not rotated; <= also stops the recursion on a single element
return -1;
}
if(end-start==1){
return end;
}
int mid = start + ((end-start)/2);
if(arr[mid]<arr[start]){
return findOffset(arr,start,mid);
}else return findOffset(arr,mid,end);
}

Below is an implementation in C using binary search.
int rotated_sorted_array_search(int arr[], int low, int high, int target)
{
while(low<=high)
{
int mid = (low+high)/2;
if(target == arr[mid])
return mid;
if(arr[low] <= arr[mid])
{
if(arr[low]<=target && target < arr[mid])
{
high = mid-1;
}
else
low = mid+1;
}
else
{
if(arr[mid]< target && target <=arr[high])
{
low = mid+1;
}
else
high = mid-1;
}
}
return -1;
}

Here's a solution in javascript. Tested it with a few different arrays and it seems to work. It basically uses the same method described by ire_and_curses:
function search(array, query, left, right) {
if (left > right) {
return -1;
}
var midpoint = Math.floor((left + right) / 2);
var val = array[midpoint];
if(val == query) {
return midpoint;
}
// Look in left half if it is sorted and value is in that
// range, or if right side is sorted and it isn't in that range.
if((array[left] < array[midpoint] && query >= array[left] && query <= array[midpoint])
|| (array[midpoint] < array[right]
&& !(query >= array[midpoint] && query <= array[right]))) {
return search(array, query, left, midpoint - 1);
} else {
return search(array, query, midpoint + 1, right);
}
}

Simple binary search with a little change.
Index of rotating array= (i+pivot)%size
pivot is the index i+1 where a[i]>a[i+1].
#include <stdio.h>
#define size 5
#define k 3 /* pivot: index of the smallest element */
#define value 13
int binary_search(int l,int h,int arr[]){
    if(l>h)
        return -1; /* not found */
    int mid=(l+h)/2;
    if(arr[(mid+k)%size]==value)
        return (mid+k)%size;
    if(arr[(mid+k)%size]<value)
        return binary_search(mid+1,h,arr); /* propagate the result */
    else
        return binary_search(l,mid-1,arr);
}
int main() {
    int arr[]={5,9,13,1,3};
    printf("found at: %d\n", binary_search(0,4,arr));
    return 0;
}

A simple method in Ruby
def CircularArraySearch(a, x)
low = 0
high = (a.size) -1
while low <= high
mid = (low+high)/2
if a[mid] == x
return mid
end
if a[mid] <= a[high]
if (x > a[mid]) && (x <= a[high])
low = mid + 1
else
high = mid - 1
end
else
if (a[low] <= x) && (x < a[mid])
high = mid -1
else
low = mid +1
end
end
end
return -1
end
a = [12, 14, 18, 2, 3, 6, 8, 9]
x = gets.to_i
p CircularArraySearch(a, x)

Though the accepted answer is optimal, we can also use a similar and cleaner algorithm:
1. Run a binary search to find the pivot element (where the array is rotated). O(log n)
2. The part to the left of the pivot is sorted in increasing order; run a binary search there for the key. O(log n)
3. The part from the pivot onward is also sorted in increasing order; run a binary search there for the key. O(log n)
4. Return the index found in step 2 or 3.
Total time complexity: O(log n)
Thoughts welcome.
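Steps 2 to 4 could be sketched as follows, assuming the pivot index from step 1 is already known and taking both parts as ascending, which is the case for a rotated ascending array. The function name `search_around_pivot` is hypothetical; `bisect_left` comes from Python's standard `bisect` module.

```python
from bisect import bisect_left


def search_around_pivot(arr, pivot, key):
    """Search arr[:pivot] and arr[pivot:], each ascending, for key.

    pivot is assumed to be the index of the smallest element (step 1).
    Returns the index of key, or -1 if it is absent.
    """
    for start, end in ((0, pivot), (pivot, len(arr))):
        i = bisect_left(arr, key, start, end)  # binary search within arr[start:end]
        if i < end and arr[i] == key:
            return i
    return -1
```

Comparing the key with `arr[0]` first would tell you which part to search, so one of the two calls could be skipped.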

guirgis: It is lame to post an interview question, guess you didn't get the job :-(
Use a special cmp function and you need only one pass with a regular binary search. Something like:
def rotatedcmp(x, y):
    # cmp is the Python 2 built-in; in Python 3 use (x > y) - (x < y)
    if x < a[0] and y < a[0]:
        return cmp(x, y)
    elif x >= a[0] and y >= a[0]:
        return cmp(x, y)
    elif x < a[0]:
        return 1   # x is greater
    else:
        return -1  # y is greater
If you can depend on int underflow, subtract a[0] - MIN_INT from each element as it is accessed and use a regular compare.
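The special-comparison idea might be sketched like this, using a sort key in place of a cmp function. It assumes distinct values in a rotated ascending list; `search_rotated_by_key` and `rank` are made-up names.

```python
def search_rotated_by_key(a, x):
    """One ordinary binary search over a rotated ascending list a.

    Values are compared through rank(): values >= a[0] sort before
    values < a[0], which matches their order in the rotated list.
    Returns the index of x, or -1 if it is absent.
    """
    if not a:
        return -1

    def rank(v):
        return (v < a[0], v)  # False sorts before True

    lo, hi = 0, len(a) - 1
    target = rank(x)
    while lo <= hi:
        mid = (lo + hi) // 2
        current = rank(a[mid])
        if current == target:
            return mid
        if current < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1
```

Under `rank`, the rotated list is already in sorted order, so the loop itself needs no special cases.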
