What does the condition actually mean? - algorithm

Could anyone explain how the condition inside this if statement works?
if ((mid == 0 || arr[mid - 1] <= arr[mid]) &&
(mid == n - 1 || arr[mid + 1] <= arr[mid]))
return mid;
The above code is a part of the binary search algorithm.
Below is the full code -
function findPeakUtil(arr, low, high, n)
{
    let l = low;
    let r = high - 1;
    let mid;
    while (l <= r)
    {
        // finding the mid index by right shifting
        mid = (l + r) >> 1;
        // first case: mid is the answer
        if ((mid == 0 || arr[mid - 1] <= arr[mid]) &&
            (mid == n - 1 || arr[mid + 1] <= arr[mid]))
            break;
        // left neighbour is greater: move the right pointer to mid - 1
        if (mid > 0 && arr[mid - 1] > arr[mid])
            r = mid - 1;
        // otherwise move the left pointer to mid + 1
        else
            l = mid + 1;
    }
    return mid;
}

Next time, please add more detail about the code and the question. This code is not a plain binary search; it comes from a problem that asks for a peak element in an array: https://www.geeksforgeeks.org/find-a-peak-in-a-given-array/
What you are asking about is the base case of that recursive idea.
The base case is when the middle element is not smaller than (that is, equal to or bigger than) its neighbours, which is when arr[mid - 1] <= arr[mid] AND arr[mid + 1] <= arr[mid].
The other conditions (mid == 0 and mid == n - 1) take care of the cases where mid is one of the corner elements.
To understand why there is always a peak, look at how high and low are picked in the recursive version:
// If middle element is not peak and its
// left neighbour is greater than it,
// then left half must have a peak element
else if (mid > 0 && arr[mid - 1] > arr[mid])
    return findPeakUtil(arr, low, (mid - 1), n);
// If middle element is not peak and its
// right neighbour is greater than it,
// then right half must have a peak element
else
    return findPeakUtil(arr, (mid + 1), high, n);
This makes sure that there is always a peak element in the sub-array that is searched next.

The condition should be clear when simplified to
arr[mid - 1] <= arr[mid] && arr[mid] >= arr[mid + 1]
But when mid is one of the extreme positions in the array (0 or n-1), there are two issues:
the expression cannot be evaluated,
you need to give the expression an appropriate meaning anyway.
This is worked around by "guarding" the comparisons with tests on the value of mid. I leave it to you as an exercise to evaluate the condition in these two special cases, by substituting mid with 0 or n-1.
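To make the guarding concrete, here is a minimal sketch of an iterative peak search. The function name findPeak and the split into leftOk and rightOk are illustrative choices, not taken from the question's code. Because || short-circuits, arr[mid - 1] is never read when mid is 0, and arr[mid + 1] is never read when mid is n - 1.

```javascript
// Sketch: returns the index of some peak, or -1 for an empty array.
function findPeak(arr) {
  const n = arr.length;
  let lo = 0;
  let hi = n - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    // Each guard short-circuits, so arr[-1] and arr[n] are never read.
    const leftOk = mid === 0 || arr[mid - 1] <= arr[mid];
    const rightOk = mid === n - 1 || arr[mid + 1] <= arr[mid];
    if (leftOk && rightOk) return mid;
    if (!leftOk) hi = mid - 1; // left neighbour is larger: a peak lies to the left
    else lo = mid + 1;         // right neighbour is larger: a peak lies to the right
  }
  return -1; // only reached for an empty array
}
```

Substituting mid = 0 makes leftOk true without touching the array, so a first element only has to beat its right neighbour; the mirror case holds for mid = n - 1.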

Related

Binary search, when to use right = mid - 1, and when to use right = mid?

I was working through this problem on leetcode https://leetcode.com/problems/leftmost-column-with-at-least-a-one/ and I can't think of an intuitive answer.
Why is the right (or high) pointer sometimes set to mid - 1, and why is it sometimes correct to set to mid?
I am aware that we must always set left = mid + 1 because of integer division. When only two elements remain, we need to set mid + 1 to avoid an infinite loop.
But what are the cases to use right = mid - 1, vs right = mid?
Thanks.
Let's say you are doing binary search on a sequence like below
.....0, 0, 0, 0, 1, 1, 1, 1, ....
Your decision function fn returns true when the value at the given position is 1.
Now consider that your target is to find the last position of 0. In each step of the binary search, we reduce the search space such that we are certain the position is within the range.
If fn returns true for mid, you know that the last position of 0 is less than mid (because the last occurrence of 0 must come before the first occurrence of 1). So you update right = mid - 1. If fn returns false, you update left = mid, because mid may itself be the last 0. (With left = mid, mid has to be rounded up, otherwise a two-element range never shrinks.)
Now consider that your target is to find the first occurrence of 1.
Now if fn returns true, you update right = mid, because the first occurrence of 1 is at this position or to the left of it. In this case, if fn returns false, you update left = mid + 1.
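The two cases can be sketched side by side. The names firstOne and lastZero and the up-front "no answer" checks are assumptions added for this illustration; flags is assumed to be an array of 0s followed by 1s.

```javascript
// First index holding 1, or -1 if there is none.
function firstOne(flags) {
  let left = 0;
  let right = flags.length - 1;
  if (right < 0 || flags[right] !== 1) return -1;
  while (left < right) {
    const mid = left + Math.floor((right - left) / 2); // rounds down
    if (flags[mid] === 1) right = mid; // mid may be the answer, so keep it
    else left = mid + 1;               // mid is 0, so it can be discarded
  }
  return left;
}

// Last index holding 0, or -1 if there is none.
function lastZero(flags) {
  let left = 0;
  let right = flags.length - 1;
  if (right < 0 || flags[0] !== 0) return -1;
  while (left < right) {
    const mid = left + Math.ceil((right - left) / 2); // rounds up so left = mid makes progress
    if (flags[mid] === 1) right = mid - 1; // mid is 1, so it can be discarded
    else left = mid;                       // mid may be the answer, so keep it
  }
  return left;
}
```

The rule of thumb this suggests: the pointer that can still hold the answer is set to mid, the other one steps past mid, and the rounding direction is chosen so that the "= mid" update always shrinks the range.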
There are three templates for implementing binary search.
Template 1: Use when the search does not depend on neighbouring elements.
while (left <= right) {
    int mid = (left + right) / 2;
    if (nums[mid] == target) {
        return mid;
    } else if (nums[mid] < target) {
        left = mid + 1;
    } else {
        right = mid - 1;
    }
}
Template 2: Use when searching for a value requires accessing the current index and its immediate right neighbour's index.
while (left < right) {
    // Prevent (left + right) overflow
    int mid = left + (right - left) / 2;
    if (nums[mid] == target) {
        return mid;
    } else if (nums[mid] < target) {
        left = mid + 1;
    } else {
        right = mid;
    }
}
Template 3: Use when searching for a value requires accessing the current index and both its immediate left and right neighbours' indices.
while (left + 1 < right) {
    // Prevent (left + right) overflow
    int mid = left + (right - left) / 2;
    if (nums[mid] == target) {
        return mid;
    } else if (nums[mid] < target) {
        left = mid;
    } else {
        right = mid;
    }
}
Source: LeetCode Explore
Linear search on a small array outperforms binary search. So it should not really matter whether you use -1 or not: your binary search should stop once right - left < N, and you can then do a linear search between the two bounds (N is a parameter that you can find for your particular application by running a benchmark).
IIRC, N for integers was ~700 when I last measured it.
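A minimal sketch of that hybrid idea follows. The name hybridSearch and the default threshold of 8 are placeholders for illustration only; the right cut-off depends on the data and the machine and has to be measured.

```javascript
// Binary search until the range is small, then scan linearly.
// threshold is a hypothetical tuning parameter (the N from the text).
function hybridSearch(sorted, target, threshold = 8) {
  const cutoff = Math.max(1, threshold); // a cutoff below 1 would never terminate
  let left = 0;
  let right = sorted.length - 1;
  // Binary phase: shrink the inclusive range [left, right].
  while (right - left >= cutoff) {
    const mid = left + Math.floor((right - left) / 2);
    if (sorted[mid] < target) left = mid + 1;
    else right = mid; // keep mid: it may be the target
  }
  // Linear phase: scan what remains.
  for (let i = left; i <= right; i++) {
    if (sorted[i] === target) return i;
  }
  return -1;
}
```

Since the final scan looks at every remaining element, an off-by-one in the binary phase that keeps one element too many costs a single extra comparison rather than a wrong answer.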

Binary search bounds

I always have the hardest time with this and I have yet to see a definitive explanation for something that is supposedly so common and highly-used.
We already know the standard binary search. Given starting lower and upper bounds, find the middle point at (lower + higher)/2, and then compare it against your array, and then re-set the bounds accordingly, etc.
However what are the needed differences to adjust the search to find (for a list in ascending order):
Smallest value >= target
Smallest value > target
Largest value <= target
Largest value < target
It seems like each of these cases requires very small tweaks to the algorithm but I can never get them to work right. I try changing inequalities, return conditions, I change how the bounds are updated, but nothing seems consistent.
What are the definitive ways to handle these four cases?
I had exactly the same issue until I figured out loop invariants along with predicates are the best and most consistent way of approaching all binary problems.
Point 1: Think of predicates
In general, think of all 4 of these cases (and also the normal binary search for equality) as a predicate. This means that some of the values meet the predicate and some fail it. Consider, for example, this array with a target of 5:
[1, 2, 3, 4, 6, 7, 8]. Finding the first number greater than 5 is basically equivalent of finding the first one in this array: [0, 0, 0, 0, 1, 1, 1].
Point 2: Search boundaries inclusive
I like to have both ends always inclusive. But I can see some people like start to be inclusive and end exclusive (on len instead of len -1). I like to have all the elements inside of the array, so when referring to a[mid] I don't think whether that will give me an array out of bound. So my preference: Go inclusive!!!
Point 3: While loop condition <=
So we even want to process the subarray of size 1 in the while loop, and when the while loop finishes there should be no unprocessed element. I really like this logic. It's always solid as a rock. Initially all the elements are not inspected, basically they are unknown. Meaning that everything in the range of [st = 0, to end = len - 1] are not inspected. Then when the while loop finishes, the range of uninspected elements should be array of size 0!
Point 4: Loop invariants
Since we defined start = 0, end = len - 1, invariants will be like this:
Anything left of start is smaller than target.
Anything right of end is greater than or equal to the target.
Point 5: The answer
Once the loop finishes, basically based on the loop invariants anything to the left of start is smaller. So that means that start is the first element greater than or equal to the target.
Equivalently, anything to the right of end is greater than or equal to the target. So that means the answer is also equal to end + 1.
The code:
public int find(int a[], int target){
int start = 0;
int end = a.length - 1;
while (start <= end){
int mid = (start + end) / 2; // or for no overflow start + (end - start) / 2
if (a[mid] < target)
start = mid + 1;
else // a[mid] >= target
end = mid - 1;
}
return start; // or end + 1;
}
Variations:
< (largest value < target)
This is equivalent to finding the last 0, so only the return changes:
return end; // or return start - 1;
> (smallest value > target)
Change the if condition to <=, so the else branch means >. No other change.
<= (largest value <= target)
Same condition as for >, but return end; // or return start - 1;
So in general with this model for all the 5 variations (<=, <, >, >=, normal binary search) only the condition in the if changes and the return statement. And figuring those small changes is super easy when you consider the invariants (point 4) and the answer (point 5).
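As a sketch of how little changes between the cases, the shared loop can be factored out and the four searches expressed through it. The helper name partitionPoint and the wrapper names are illustrative and not part of the answer; the results are indices, and they can be a.length or -1 when no element qualifies.

```javascript
// Inclusive bounds, loop while start <= end, as in the answer.
// isBefore(x) is true for elements that lie left of the boundary.
function partitionPoint(a, isBefore) {
  let start = 0;
  let end = a.length - 1;
  while (start <= end) {
    const mid = start + Math.floor((end - start) / 2);
    if (isBefore(a[mid])) start = mid + 1;
    else end = mid - 1;
  }
  return start; // here end === start - 1
}

const firstGreaterOrEqual = (a, t) => partitionPoint(a, (x) => x < t);      // may be a.length
const firstGreater        = (a, t) => partitionPoint(a, (x) => x <= t);     // may be a.length
const lastLess            = (a, t) => partitionPoint(a, (x) => x < t) - 1;  // may be -1
const lastLessOrEqual     = (a, t) => partitionPoint(a, (x) => x <= t) - 1; // may be -1
```

Only the comparison inside the predicate and the choice between start and start - 1 differ, which matches the "condition plus return statement" summary above.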
Hope this clarifies things for whoever reads this. If anything is unclear or feels like magic, please ping me to explain. After understanding this method, everything about binary search should be as clear as day!
Extra point: It would be a good practice to also try including the start but excluding the end. So the array would be initially [0, len). If you can write the invariants, new condition for the while loop, the answer and then a clear code, it means you learnt the concept.
Binary search(at least the way I implement it) relies on a simple property - a predicate holds true for one end of the interval and does not hold true for the other end. I always consider my interval to be closed at one end and opened at the other. So let's take a look at this code snippet:
int beg = 0; // pred(a[beg]) should hold true
int end = n; // length of the array, or a value that is guaranteed to be outside the interval we are interested in
while (end - beg > 1) {
    int mid = (end + beg) / 2;
    if (pred(a[mid])) {
        beg = mid;
    } else {
        end = mid;
    }
}
// answer is at a[beg]
This will work for any of the comparisons you define. Simply replace pred with <=target or >=target or <target or >target.
After the cycle exits, a[beg] will be the last element for which the given inequality holds.
So let's assume(like suggested in the comments) that we want to find the largest number for which a[i] <= target. Then if we use predicate a[i] <= target the code will look like:
int beg = 0; // a[beg] <= target should hold true
int end = n; // length of the array, or a value that is guaranteed to be outside the interval we are interested in
while (end - beg > 1) {
    int mid = (end + beg) / 2;
    if (a[mid] <= target) {
        beg = mid;
    } else {
        end = mid;
    }
}
After the cycle exits, the index that you are searching for is beg.
Also, depending on the comparison, you may have to start from the right end of the array. E.g. if you are searching for the smallest value >= target, you will do something of the sort of:
beg = -1;
end = n - 1; // a[end] >= target should hold true
while (end - beg > 1) {
    int mid = (end + beg) / 2;
    if (a[mid] >= target) {
        end = mid;
    } else {
        beg = mid;
    }
}
The value that you are searching for is at index end. Note that in this case I consider the interval (beg, end] and thus I've slightly modified the starting interval.
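The snippets above assume that the predicate already holds at the closed end of the starting interval. A variant that drops this assumption, sketched here with sentinel bounds on both sides, can report "no such element" as well. The name lastTrue and the -1 return convention are choices made for this illustration.

```javascript
// Returns the last index i with pred(a[i]) true, or -1 if there is none.
// Assumes pred is true on a prefix of a and false on the rest.
function lastTrue(a, pred) {
  let beg = -1;       // sentinel treated as "pred holds"; never dereferenced
  let end = a.length; // sentinel treated as "pred fails"; never dereferenced
  while (end - beg > 1) {
    const mid = beg + Math.floor((end - beg) / 2); // strictly between beg and end
    if (pred(a[mid])) beg = mid;
    else end = mid;
  }
  return beg; // end === beg + 1 is the first index where pred fails
}
```

With this shape, "largest value <= target" is lastTrue(a, x => x <= target), and "smallest value >= target" is at index lastTrue(a, x => x < target) + 1 when that index is inside the array.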
The basic binary search looks for the position/value that equals the target key. It can be extended to find the minimal position/value that satisfies some condition, or the maximal position/value that satisfies some condition.
Suppose the array is in ascending order; if no position/value satisfies the condition, return -1.
Code sample:
// find the minimal position which satisfy some condition
private static int getMinPosition(int[] arr, int target) {
int l = 0, r = arr.length - 1;
int ans = -1;
while(l <= r) {
int m = (l + r) >> 1;
// feel free to replace the condition
// here it means find the minimal position that the element not smaller than target
if(arr[m] >= target) {
ans = m;
r = m - 1;
} else {
l = m + 1;
}
}
return ans;
}
// find the maximal position which satisfy some condition
private static int getMaxPosition(int[] arr, int target) {
int l = 0, r = arr.length - 1;
int ans = -1;
while(l <= r) {
int m = (l + r) >> 1;
// feel free to replace the condition
// here it means find the maximal position that the element less than target
if(arr[m] < target) {
ans = m;
l = m + 1;
} else {
r = m - 1;
}
}
return ans;
}
Usage:
int[] a = {3, 5, 5, 7, 10, 15};
System.out.println(BinarySearchTool.getMinPosition(a, 5)); // 1, the first element >= 5
System.out.println(BinarySearchTool.getMinPosition(a, 6)); // 3, the first element >= 6 is 7
System.out.println(BinarySearchTool.getMaxPosition(a, 8)); // 3, the last element < 8 is 7
What you need is a binary search that lets you participate in the process at the last step. The typical binary search receives (array, element) and produces a value (normally the index, or not found). But a modified binary search that accepts a function to be invoked at the end of the search can cover all cases.
For example, in Javascript to make it easy to test, the following binary search
function binarySearch(array, el, fn) {
function aux(left, right) {
if (left > right) {
return fn(array, null, left, right);
}
var middle = Math.floor((left + right) / 2);
var value = array[middle];
if (value > el) {
return aux(left, middle - 1);
} if (value < el) {
return aux(middle + 1, right);
} else {
return fn(array, middle, left, right);
}
}
return aux(0, array.length - 1);
}
would allow you to cover each case with a particular return function.
default
function(a, m) { return m; }
Smallest value >= target
function(a, m, l, r) { return m != null ? a[m] : r + 1 >= a.length ? null : a[r + 1]; }
Smallest value > target
// m can be 0, which is falsy, so it is compared against null explicitly
function(a, m, l, r) { var i = (m != null ? m : r) + 1; return i >= a.length ? null : a[i]; }
Largest value <= target
// index 0 is a valid answer, hence >= 0
function(a, m, l, r) { return m != null ? a[m] : l - 1 >= 0 ? a[l - 1] : null; }
Largest value < target
function(a, m, l, r) { var i = (m != null ? m : l) - 1; return i < 0 ? null : a[i]; }
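Callbacks of this kind are easy to get subtly wrong at index 0, because 0 is falsy in JavaScript and an expression such as m || r silently skips a match found at the first element. The sketch below repeats the search function and shows two callbacks that test m against null explicitly. The names smallestGreater and largestAtMost are illustrative, and the array is assumed to contain no duplicates.

```javascript
function binarySearch(array, el, fn) {
  function aux(left, right) {
    if (left > right) return fn(array, null, left, right);
    const middle = Math.floor((left + right) / 2);
    const value = array[middle];
    if (value > el) return aux(left, middle - 1);
    if (value < el) return aux(middle + 1, right);
    return fn(array, middle, left, right);
  }
  return aux(0, array.length - 1);
}

// Smallest value > target: step one past the match, or one past right on a miss.
function smallestGreater(a, m, l, r) {
  const i = (m != null ? m : r) + 1;
  return i >= a.length ? null : a[i];
}

// Largest value <= target: the match itself, or the element just before left.
function largestAtMost(a, m, l, r) {
  if (m != null) return a[m];
  return l - 1 >= 0 ? a[l - 1] : null;
}
```

For instance, binarySearch([1, 2], 1, smallestGreater) finds the match at index 0 and returns 2, while a callback based on m || r would look one past the end of the array and return null.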

Longest slice of a binary array that can be split into two parts

How can I find the longest slice of a binary array that can be split into two parts such that 0 is the leader in the left part and 1 is the leader in the right part?
For example:
[1,1,0,1,0,0,1,1] should return 7, so that the first part is [1,0,1,0,0] and the second part is [1,1].
I tried the following solution. It succeeds in some test cases, but I think it is not efficient:
public static int solution(int[] A)
{
int length = A.Length;
if (length <2|| length>100000)
return 0;
if (length == 2 && A[0] != A[1])
return 0;
if (length == 2 && A[0] == A[1])
return 2;
int zerosCount = 0;
int OnesCount = 0;
int start = 0;
int end = 0;
int count=0;
//left hand side
for (int i = 0; i < length; i++)
{
end = i;
if (A[i] == 0)
zerosCount++;
if (A[i] == 1)
OnesCount++;
count = i;
if (zerosCount == OnesCount )
{
start++;
break;
}
}
int zeros = 0;
int ones = 0;
//right hand side
for (int j = end+1; j < length; j++)
{
count++;
if (A[j] == 0)
zeros++;
if (A[j] == 1)
ones++;
if (zeros == ones)
{
end--;
break;
}
}
return count;
}
I agree that brute force has O(n^3) time complexity.
But this can be solved in linear time. I've implemented it in C; here is the code:
int f4(int* src, int n)
{
    int i;
    int sum;
    int min = 0;  // only read once mid has been set
    int sta = 0;
    int mid;
    int end = 0;
    // Find middle: the 0 -> 1 transition with the lowest running sum
    sum = 0;
    mid = -1;
    for (i = 0; i < n - 1; i++)
    {
        if (src[i]) sum++;
        else sum--;
        if (src[i] == 0 && src[i + 1] == 1)
        {
            if (mid == -1 || sum < min)
            {
                min = sum;
                mid = i + 1;
            }
        }
    }
    if (mid == -1) return 0;
    // Find start: leftmost index where 0 still leads in [i, mid)
    sum = 0;
    for (i = mid - 1; i >= 0; i--)
    {
        if (src[i]) sum++;
        else sum--;
        if (sum < 0) sta = i;
    }
    // Find end: rightmost position where 1 still leads in [mid, i]
    sum = 0;
    for (i = mid; i < n; i++)
    {
        if (src[i]) sum++;
        else sum--;
        if (sum > 0) end = i + 1;
    }
    return end - sta;
}
This code is tested: brute force results vs. this function. They have same results. I tested all valid arrays of 10 elements (1024 combinations).
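A brute-force reference of the kind used for such a comparison might look like the sketch below. It assumes that "leader" means a value occurring in strictly more than half of the part's elements and that both parts must be non-empty; the function name longestSliceBrute is made up for this illustration.

```javascript
// O(n^3) reference: try every (start, split, end).
// Left part is bits[start..split-1], right part is bits[split..end-1].
function longestSliceBrute(bits) {
  const n = bits.length;
  // ones[i] = number of 1s in bits[0..i-1]
  const ones = [0];
  for (const b of bits) ones.push(ones[ones.length - 1] + b);
  const onesIn = (from, to) => ones[to] - ones[from]; // count over [from, to)

  let best = 0;
  for (let start = 0; start < n; start++) {
    for (let split = start + 1; split < n; split++) {
      const leftLen = split - start;
      const leftZeros = leftLen - onesIn(start, split);
      if (2 * leftZeros <= leftLen) continue; // 0 is not the leader on the left
      for (let end = split + 1; end <= n; end++) {
        const rightLen = end - split;
        if (2 * onesIn(split, end) > rightLen) {
          best = Math.max(best, end - start);
        }
      }
    }
  }
  return best;
}
```

On the example from the question, [1,1,0,1,0,0,1,1], this returns 7, with the slice starting at index 1 and split after [1,0,1,0,0].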
As promised, here's the update:
I've found a simple algorithm with linear time complexity to solve the problem.
The math:
Defining the input as int[] bits, we can define this function:
f(x) = -1 if bits[x] = 0; 1 if bits[x] = 1
Next step would be to create a basic integral of this function for the given input:
F(x) = f(x) + F(x - 1)
F(-1) = 0
This integral is from 0 to x.
F(x) simply represents count(bits, 1, 0, x + 1) - count(bits, 0, 0, x + 1), the number of 1s minus the number of 0s in bits[0..x]. This can be used to define the following function: F(x, y) = F(y) - F(x - 1), which is the same as count(bits, 1, x, y + 1) - count(bits, 0, x, y + 1) (the number of 1s minus the number of 0s in the range [x, y]; this is just to show how the algorithm basically works).
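As a small illustration of the running sum, the sketch below stores it in an array shifted by one position, so that index 0 plays the role of F(-1) = 0. The names prefixBalance and rangeBalance are assumptions for this example.

```javascript
// F[i + 1] holds F(i) from the text; F[0] stands for F(-1) = 0.
function prefixBalance(bits) {
  const F = [0];
  for (const b of bits) F.push(F[F.length - 1] + (b === 1 ? 1 : -1));
  return F;
}

// (number of 1s) - (number of 0s) in bits[x..y], both ends inclusive.
function rangeBalance(F, x, y) {
  return F[y + 1] - F[x];
}
```

For bits = [1,1,0,1,0,0,1,1] the array is [0,1,2,1,2,1,0,1,2]; the range [1, 5] has balance -1 (0 leads) and the range [6, 7] has balance 2 (1 leads).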
Since the slice we are looking for must satisfy the following: in the range [start, mid] 0 must be the leader, in the range [mid, end] 1 must be the leader, and end - start + 1 must be as large as possible, the mid we are looking for must satisfy F(mid) < F(start) AND F(mid) < F(end). So the first step is to search for the minimum of F(x), which is the mid (every other point must be greater than the minimum, and thus results in a smaller or equally big range [end - start + 1]). NOTE: this search can be optimized by taking the following into account: f(x) is always either 1 or -1. Thus, if f(x) returns 1 for the next n steps, the next possible index with a minimum is n * 2 steps away (n 1s since the last minimum mean that n -1s, or at least n steps, are required afterwards to reach a minimum).
Given the x for the minimum of F(x), we can simply find start and end (the smallest/biggest values s, b ∈ [0, length(bits) - 1] such that F(s) > F(mid) and F(b) > F(mid)), which can be done in linear time.
Pseudocode:
input: int[] bits
output: int
//input verification left out
//transform the input into F(x), stored in place in bits
int temp = 0;
for int i in [0 , length(bits) - 1]
    if bits[i] == 0
        --temp;
    else
        ++temp;
    bits[i] = temp
//search the minimum of F(x)
int midIndex = -1
int mid = length(bits)
for int i in [0 , length(bits) - 1]
    if bits[i] > mid
        i += bits[i] - mid //leave out next n steps (see above)
    else if bits[i - 1] > bits[i] AND bits[i + 1] > bits[i]
        midIndex = i
        mid = bits[i]
if midIndex == -1
    return 0 //only 1s in the array
//search for the endindex
int end
for end in [length(bits) - 1 , midIndex]
    if bits[end] > mid
        break
    else
        end -= mid - bits[end] //leave out next n searchsteps
//search for the startindex
int start
for start in [0 , midIndex]
    if bits[start] > mid
        break
    else
        start += mid - bits[start]
return end - start

Neatly checking if a nearby cell is empty

I have a matrix of cells (buttons in my case), if I click on one button, I need to check if a nearby (plus shape) cell is empty, and if a cell is empty (only one can be), I need to swap the two cells (the empty one and the clicked one).
What I do now is:
if(j < 3)
if (!fbarr[i, j + 1].Visible)
swap(fbarr[i, j], fbarr[i, j + 1]);
if(j > 0)
if (!fbarr[i, j - 1].Visible)
swap(fbarr[i, j], fbarr[i, j - 1]);
if(i < 3)
if (!fbarr[i + 1, j].Visible)
swap(fbarr[i, j], fbarr[i + 1, j]);
if(i > 0)
if (!fbarr[i - 1, j].Visible)
swap(fbarr[i, j], fbarr[i - 1, j]);
Now personally I think this is ugly as hell.
Is there a nicer way to do this? (This is C# if it matters)
Thanks
Your current technique isn't necessarily bad, it just isn't DRY enough. You can also make the search space more explicit by getting the offsets into some kind of data structure. Here's an example using Tuples:
var offsets = new List<Tuple<int, int>>
{
    Tuple.Create( 0,  1),
    Tuple.Create( 0, -1),
    Tuple.Create( 1,  0),
    Tuple.Create(-1,  0)
};
foreach (var offset in offsets) {
    int newI = i + offset.Item1;
    int newJ = j + offset.Item2;
    // New position must be within range
    if (newI >= 0 && newI <= 3 && newJ >= 0 && newJ <= 3) {
        if (!fbarr[newI, newJ].Visible) {
            swap(fbarr[i, j], fbarr[newI, newJ]);
        }
    }
}
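The same offset-table idea can be sketched independently of the UI framework. The version below is in JavaScript, uses a plain array of arrays with null marking the single empty cell (an assumption made for this sketch, standing in for the invisible button), and the name moveIntoEmptyNeighbour is illustrative.

```javascript
// The four plus-shaped neighbours as (row, column) offsets.
const OFFSETS = [[0, 1], [0, -1], [1, 0], [-1, 0]];

// Swaps grid[i][j] with an adjacent empty cell, if there is one.
function moveIntoEmptyNeighbour(grid, i, j) {
  for (const [di, dj] of OFFSETS) {
    const ni = i + di;
    const nj = j + dj;
    // The row check comes first, so grid[ni] is only read when it exists.
    const inRange = ni >= 0 && ni < grid.length && nj >= 0 && nj < grid[ni].length;
    if (inRange && grid[ni][nj] === null) {
      [grid[i][j], grid[ni][nj]] = [grid[ni][nj], grid[i][j]];
      return true; // at most one neighbour is empty, so stop here
    }
  }
  return false;
}
```

Returning after the first swap is a deliberate choice in this sketch: once the clicked cell has moved, continuing the loop would test neighbours of a position that is now empty.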

When to use low < high or low + 1 < high for loop invariant

I've read multiple articles, including Jon Bentley's chapter on binary search. This is what I understand about CORRECT binary search logic, and it works in the simple tests I did:
binarysearch (arr, low, high, k)
1. while (low < high)
2. mid = low + (high - low)/2
3. if (arr[mid] == k)
return mid
4. if (arr[mid] < k )
high = mid -1
5. else
low = mid + 1
Now, to find the first occurrence with sorted duplicates, you'd change the if condition on line 3 to continue
instead of returning mid, as in
binarysearch_get_first_occur_with_duplicates (arr, low, high, k)
1. while (low < high)
2. mid = low + (high - low)/2
3. if (arr[mid] == k)
high = mid - 1
low_so_far = arr[mid]
4. if (arr[mid] < k )
high = mid -1
5. else
low = mid + 1
return low_so_far
Similarly to get highest index of repeated element, you'd do low = mid + 1 and continue if arr[mid]==k
This logic seems to be working but in multiple places I see the loop invariant as
while (low + 1 < high)
I am confused and want to understand when you might want to use low + 1 < high instead
of low < high.
In the logic I described above low + 1 < high condition leads to errors if you test with simple example.
Can someone clarify why and when we might want to use low + 1 < high in the while loop instead of low < high?
If your invariant is that the target must lie in low <= i <= high, then you use while (low < high); if your invariant is that the target must lie in low <= i < high then you use while (low + 1 < high). [Thanks to David Eisenstat for confirming this.]
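The two invariants can be compared on one concrete task. The sketch below finds the last index i with a[i] <= target in both styles; it assumes a is sorted in ascending order and that a[0] <= target, so an answer exists. The function names are made up for this illustration.

```javascript
// Invariant: the answer lies in low <= i <= high.
function lastAtMostClosed(a, target) {
  let low = 0;
  let high = a.length - 1;
  while (low < high) {
    const mid = low + Math.ceil((high - low) / 2); // round up so low = mid shrinks the range
    if (a[mid] <= target) low = mid;
    else high = mid - 1;
  }
  return low; // low === high
}

// Invariant: the answer lies in low <= i < high.
function lastAtMostHalfOpen(a, target) {
  let low = 0;
  let high = a.length;
  while (low + 1 < high) {
    const mid = low + Math.floor((high - low) / 2);
    if (a[mid] <= target) low = mid;
    else high = mid;
  }
  return low; // the range [low, high) holds exactly one index
}
```

In both versions the loop stops when exactly one candidate index is left: that is low == high for the closed range and low + 1 == high for the half-open one, which is where the two loop conditions come from.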

Resources