Binary search for first occurrence of k - algorithm

I have code that searches a sorted array and returns the index of the first occurrence of k.
I am wondering whether it's possible to write this code using
while(left<right)
instead of
while(left<=right)
Here is the full code:
public static int searchFirstOfK(List<Integer> A, int k) {
    int left = 0, right = A.size() - 1, result = -1;
    // A.subList(left, right + 1) is the candidate set.
    while (left <= right) {
        int mid = left + ((right - left) / 2);
        if (A.get(mid) > k) {
            right = mid - 1;
        } else if (A.get(mid) == k) {
            result = mid;
            // Nothing to the right of mid can be the first occurrence of k.
            right = mid - 1;
        } else { // A.get(mid) < k
            left = mid + 1;
        }
    }
    return result;
}
How do I know when to use "left is less than or equal to right" and when to use just "left is less than right"?

Building on this answer to another binary search question: How can I simplify this working Binary Search code in C?
If you want to find the position of the first occurrence, you can't stop when you find a matching element. Your search should look like this (of course this assumes that the list is sorted):
int findFirst(List<Integer> list, int valueToFind)
{
    int pos=0;
    int limit=list.size();
    while(pos<limit)
    {
        int testpos = pos+((limit-pos)>>1);
        if (list.get(testpos)<valueToFind)
            pos=testpos+1;
        else
            limit=testpos;
    }
    if (pos < list.size() && list.get(pos)==valueToFind)
        return pos;
    else
        return -1;
}
Note that we only need to do one comparison per iteration. The binary search finds the unique position where all the preceding elements are less than valueToFind and all the following elements are greater or equal, and then it checks to see if the value you're looking for is actually there.
The linked answer highlights several advantages of writing a binary search this way.
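The half-open search described above can also be sketched in Python. This is only an illustrative port, not the answerer's code: the name find_first is an assumption, and the input is assumed to be a sorted list.

```python
def find_first(values, target):
    """Return the index of the first occurrence of target in sorted values, or -1."""
    pos, limit = 0, len(values)      # candidates are values[pos:limit]
    while pos < limit:
        mid = pos + (limit - pos) // 2
        if values[mid] < target:
            pos = mid + 1            # everything up to and including mid is too small
        else:
            limit = mid              # mid may still be the first occurrence
    # pos is now the first index whose value is >= target (or len(values)).
    if pos < len(values) and values[pos] == target:
        return pos
    return -1
```

Because the loop only narrows a half-open range, the condition pos < limit is the natural one here, and the equality check happens exactly once, after the loop.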

Simply put: no.
Consider an array with only one element, i.e. {0}, where the element being searched for is also 0.
In this case left == right, so with the condition while(left<right) the loop body never runs and searchFirstOfK returns -1.
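To make the counterexample concrete, here is a hedged Python port of the question's loop with the loop condition made switchable. The function name search_first_of_k and the strict flag are assumptions introduced only for this illustration.

```python
def search_first_of_k(a, k, strict):
    """Port of the question's loop; strict=True uses `left < right` instead of `left <= right`."""
    left, right, result = 0, len(a) - 1, -1
    while (left < right) if strict else (left <= right):
        mid = left + (right - left) // 2
        if a[mid] > k:
            right = mid - 1
        elif a[mid] == k:
            result = mid
            right = mid - 1          # keep looking further left
        else:
            left = mid + 1
    return result
```

With a = [0] and k = 0, the strict variant never enters the loop and returns -1, while the original condition returns 0.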
This answer is in the context of the posted code. If we are talking about alternatives that allow while(left<right), then Matt Timmermans's answer is correct and is an even better approach.
[Figure: comparison of Matt's approach (the OP; call it Normal Binary) and Matt Timmermans's approach (call it Optimized Binary) for a list containing values between 0 and 5000000]

This is an extremely interesting question. There is a way to write your binary search so that it is always right: determine the correct ranges and avoid getting stuck on a single element.
while(left+1<right)
{
    m = (left+right)/2;
    if(check condition is true)
        left = m;
    else
        right = m;
}
The only key thing to remember is the invariant: left always refers to an element that satisfies the condition and right to one that does not. That way you won't get stuck. Once you understand how this method divides the range, you will never fail at binary search.
With this initialization, the loop gives you the largest element that satisfies the condition (in left).
By changing the initialization you can get other elements (such as the smallest element that satisfies the condition).
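A minimal Python sketch of this template follows, under the assumption that the condition is true on a prefix of the indices and false afterwards. The names last_true and first_occurrence are hypothetical, and the sentinels -1 and n stand in for "an index known to satisfy the condition" and "an index known not to".

```python
def last_true(n, pred):
    """Largest i in [0, n) with pred(i) true, or -1 if pred is false everywhere.

    Assumes pred is true on a prefix of the indices and false afterwards.
    """
    left, right = -1, n              # virtual sentinels just outside the array
    while left + 1 < right:
        m = (left + right) // 2      # always strictly between left and right
        if pred(m):
            left = m
        else:
            right = m
    return left


def first_occurrence(a, k):
    """First index of k in sorted list a, or -1."""
    # a[i] < k holds on a prefix; the index right after that prefix is the candidate.
    i = last_true(len(a), lambda j: a[j] < k) + 1
    return i if i < len(a) and a[i] == k else -1
```

Since m is never equal to left or right, the range shrinks on every iteration, which is why this form does not get stuck on one- or two-element ranges.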

Related

How can I stop recursion after I succeed my insertion?

The problem definition is that
Given an additional digit 0 ≤ x ≤ 9, write a function that returns
the integer that results from inserting x in n, such that its digits
also appear in ascending order from left to right. For instance, if n
= 24667 and x = 5, the function should return 245667.
My code
// the divisions are integer division, no floating point
int x(int n, int insertValue)
{
    if (n == 0) return 0;
    int val = x(n/10, insertValue);
    if((n%10) > insertValue)
    {
        int q = insertValue * 10 + (n%10);
        return val * 100 + q;
    }
    return val*10 + (n%10);
}
For example, x(2245,3) outputs 223435. But the insertion was already done while processing 224. It shouldn't keep adding the value to be inserted; I mean the 3 shouldn't be there before the 5.
One solution I can come up with is to pass a boolean flag through each recursion step that records whether the insertion has already happened, found by taking modulo 10 and dividing by 10 until the single-digit case is reached. If the flag is not set, go into that if block, otherwise not. But it sounds too silly.
When you're dividing recursively, you are actually going right to left, not left to right, so you should check whether a digit is smaller than the one being inserted, not greater (unless you always let the recursion reach the n==0 condition and make your comparisons on the way out of it, but that would be inefficient).
The second thing is that you do not break the recursion once you have inserted the digit (which you were aware of, as I can now see from the question title), so it gets inserted repeatedly before every digit that is larger than insertValue. As to how to do it: you were already stopping the recursion with the if(n==0) condition, i.e. if n==0 the function stops calling itself (it returns immediately). When inserting the digit, the difference is that you need to use the original value (n) to return from the function instead of passing it further.
At this point it works well for your example, but there is also one edge case you need to consider if you want the function to work properly. When you have nothing left to divide [if(n==0)] you need to insert your digit anyway (return insertValue) so that it does not get lost on the left edge, as in the x(2245,1) call.
Improvements for brevity:
% has the same precedence as * and /, so brackets around it are not needed here.
I removed val variable as it was now used only once and its calculation was not always necessary.
Here's the working code:
int x(int n, int insertValue){
    if(n == 0) return insertValue;
    //if insertion point was found, use original value (n)
    if(n%10 <= insertValue)
        return n*10 + insertValue;
    //if not there yet, keep calling x()
    return x(n/10, insertValue)*10 + n%10;
}
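For a quick check of the three cases discussed above, the same recursion can be sketched in Python. The name insert_digit is an assumption, and n is assumed to be a non-negative integer whose digits are already in ascending order.

```python
def insert_digit(n, digit):
    """Insert digit into n so that the digits stay in ascending order."""
    if n == 0:
        return digit                 # ran off the left edge: digit goes first
    if n % 10 <= digit:
        return n * 10 + digit        # insertion point found: stop recursing
    # Last digit is larger than the one to insert: set it aside and keep going left.
    return insert_digit(n // 10, digit) * 10 + n % 10
```

The recursion stops at the first digit (from the right) that is not larger than the inserted one, which is what prevents the repeated insertion seen in the question.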

How to optimise my solution to HackerRank's Largest Rectangle problem? [duplicate]

I have a histogram with integer heights and constant width 1. I want to maximize the rectangular area under a histogram.
e.g.:
 _
| |
| |_
|   |
|   |_
|     |
The answer for this would be 6, 3 * 2, using col1 and col2.
The O(n^2) brute force is clear to me; I would like an O(n log n) algorithm. I've been trying to think of dynamic programming along the lines of the O(n log n) longest increasing subsequence algorithm, but I'm not making progress. Should I use a divide and conquer algorithm?
PS: People with enough reputation are requested to remove the divide-and-conquer tag if there is no such solution.
After mho's comments: I mean the area of largest rectangle that fits entirely. (Thanks j_random_hacker for clarifying :) ).
The above answers have given the best O(n) solution in code; however, their explanations are quite tough to comprehend. The O(n) algorithm using a stack seemed like magic to me at first, but now it makes perfect sense to me. OK, let me explain it.
First observation:
To find the maximal rectangle, if for every bar x we know the first smaller bar on each side of it, say l and r, we are certain that height[x] * (r - l - 1) is the best we can get by using the height of bar x. In the figure below, 1 and 2 are the first smaller bars on either side of 5.
OK, let's assume we can do this in O(1) time for each bar; then we can solve this problem in O(n) by scanning each bar!
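Before worrying about the O(1) lookup, the observation itself can be checked with a direct scan. The sketch below is a deliberately slow O(n^2) illustration; the function name largest_rectangle_bruteforce is an assumption, and l and r follow the meaning used in the text (index of the first smaller bar on each side, or one step outside the array if there is none).

```python
def largest_rectangle_bruteforce(heights):
    """Apply height[x] * (r - l - 1) to every bar, finding l and r by scanning."""
    n = len(heights)
    best = 0
    for x in range(n):
        l = x - 1
        while l >= 0 and heights[l] >= heights[x]:
            l -= 1                   # stops at the first smaller bar on the left, or -1
        r = x + 1
        while r < n and heights[r] >= heights[x]:
            r += 1                   # stops at the first smaller bar on the right, or n
        best = max(best, heights[x] * (r - l - 1))
    return best
```

The stack-based method computes the same l and r for every bar without the inner scans.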
Then the question becomes: for every bar, can we really find the first smaller bar on its left and on its right in O(1) time? That seems impossible, right? ... It is possible (in amortized O(1) time per bar) by using an increasing stack.
Why can an increasing stack keep track of the first smaller bar on the left and right?
Maybe just telling you that an increasing stack can do the job is not convincing at all, so I will walk you through it.
Firstly, to keep the stack increasing, we need one operation:
while x < stack.top():
    stack.pop()
stack.push(x)
Then you can check that in the increasing stack (as depicted below), for stack[x], stack[x-1] is the first smaller element on its left, and a new element that pops stack[x] out is the first smaller element on its right.
Still can't believe stack[x-1] is the first smaller on the left of stack[x]?
I will prove it by contradiction.
First of all, stack[x-1] < stack[x] is certain. But let's assume stack[x-1] is not the first smaller on the left of stack[x].
So where is the first smaller, fs?
If fs < stack[x-1]:
    stack[x-1] would have been popped out by fs,
else (fs >= stack[x-1]):
    fs would have been pushed onto the stack,
Either case results in fs lying between stack[x-1] and stack[x], which contradicts the fact that there is no item between stack[x-1] and stack[x].
Therefore stack[x-1] must be the first smaller.
Summary:
An increasing stack can keep track of the first smaller element on the left and right for each element. By using this property, the maximal rectangle in a histogram can be found with a stack in O(n).
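As a rough illustration of the summary, the following Python sketch records, for every bar, the neighbours that the stack reveals. The names nearest_smaller and largest_rectangle are assumptions. One detail is chosen here for simplicity and is not taken from the answer: bars are popped only by strictly smaller bars, so right[i] is the first strictly smaller bar, while left[i] is the nearest bar to the left that is smaller or equal. For the maximum area this still works, because the leftmost bar in a run of equal heights gets the full width.

```python
def nearest_smaller(heights):
    """Return (left, right) index lists computed with one pass over an increasing stack."""
    n = len(heights)
    left = [-1] * n      # nearest index to the left with height <= heights[i], or -1
    right = [n] * n      # nearest index to the right with height < heights[i], or n
    stack = []           # indices whose heights are non-decreasing from bottom to top
    for i, h in enumerate(heights):
        while stack and heights[stack[-1]] > h:
            right[stack.pop()] = i   # bar i is the first smaller bar to the right
        left[i] = stack[-1] if stack else -1
        stack.append(i)
    return left, right


def largest_rectangle(heights):
    left, right = nearest_smaller(heights)
    return max((h * (right[i] - left[i] - 1) for i, h in enumerate(heights)), default=0)
```

Every index is pushed once and popped at most once, which is where the O(n) bound comes from.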
Congratulations! This is really a tough problem, I'm glad my prosaic explanation didn't stop you from finishing. Attached is my proved solution as your reward :)
def largestRectangleArea(A):
    ans = 0
    A = [-1] + A
    A.append(-1)
    n = len(A)
    stack = [0]  # store index
    for i in range(n):
        while A[i] < A[stack[-1]]:
            h = A[stack.pop()]
            area = h*(i-stack[-1]-1)
            ans = max(ans, area)
        stack.append(i)
    return ans
There are three ways to solve this problem in addition to the brute force approach. I will write down all of them. The Java code has passed the tests on an online judge site called LeetCode: http://www.leetcode.com/onlinejudge#question_84, so I am confident the code is correct.
Solution 1: dynamic programming + n*n matrix as cache
time: O(n^2), space: O(n^2)
Basic idea: use the n*n matrix dp[i][j] to cache the minimal height between bar[i] and bar[j]. Start filling the matrix from rectangles of width 1.
public int solution1(int[] height) {
int n = height.length;
if(n == 0) return 0;
int[][] dp = new int[n][n];
int max = Integer.MIN_VALUE;
for(int width = 1; width <= n; width++){
for(int l = 0; l+width-1 < n; l++){
int r = l + width - 1;
if(width == 1){
dp[l][l] = height[l];
max = Math.max(max, dp[l][l]);
} else {
dp[l][r] = Math.min(dp[l][r-1], height[r]);
max = Math.max(max, dp[l][r] * width);
}
}
}
return max;
}
Solution 2: dynamic programming + 2 arrays as cache.
time: O(n^2), space: O(n)
Basic idea: this solution is like solution 1, but saves some space. The idea is that in solution 1 we build the matrix from row 1 to row n. But in each iteration, only the previous row contributes to the building of the current row. So we use two arrays as previous row and current row by turns.
public int Solution2(int[] height) {
int n = height.length;
if(n == 0) return 0;
int max = Integer.MIN_VALUE;
// dp[0] and dp[1] take turns to be the "previous" line.
int[][] dp = new int[2][n];
for(int width = 1; width <= n; width++){
for(int l = 0; l+width-1 < n; l++){
if(width == 1){
dp[width%2][l] = height[l];
} else {
dp[width%2][l] = Math.min(dp[1-width%2][l], height[l+width-1]);
}
max = Math.max(max, dp[width%2][l] * width);
}
}
return max;
}
Solution 3: use stack.
time: O(n), space:O(n)
This solution is tricky; I learnt how to do it from an explanation without graphs and an explanation with graphs. I suggest you read the two links before reading my explanation below. It's hard to explain without graphs, so my explanation might be hard to follow.
Following is my explanation:
For each bar, we must be able to find the biggest rectangle containing this bar. The biggest of these n rectangles is what we want.
To get the biggest rectangle for a certain bar (say bar[i], the (i+1)th bar), we just need to find the biggest interval that contains this bar. All the bars in this interval must be at least as high as bar[i]. So if we figure out how many consecutive same-height-or-higher bars there are immediately to the left of bar[i], and how many immediately to its right, we know the length of the interval, which is the width of the biggest rectangle for bar[i].
To count the consecutive same-height-or-higher bars immediately to the left of bar[i], we only need to find the closest bar on the left that is shorter than bar[i], because all the bars between that bar and bar[i] are consecutive same-height-or-higher bars.
We use a stack to dynamically keep track of the bars on the left that are shorter than a certain bar. In other words, if we iterate from the first bar to bar[i], when we have just arrived at bar[i] and haven't yet updated the stack, the stack should store all the bars that are no higher than bar[i-1], including bar[i-1] itself. We compare bar[i]'s height with each bar on the stack, popping as we go, until we find one that is shorter than bar[i]; that is the closest shorter bar.
If no bar on the stack is shorter than bar[i] (the stack becomes empty), all bars to the left of bar[i] are at least as high as bar[i].
We can do the same thing on the right side of the i-th bar. Then we know, for bar[i], how many bars there are in the interval.
public int solution3(int[] height) {
int n = height.length;
if(n == 0) return 0;
Stack<Integer> left = new Stack<Integer>();
Stack<Integer> right = new Stack<Integer>();
int[] width = new int[n];// widths of intervals.
Arrays.fill(width, 1);// all intervals should at least be 1 unit wide.
for(int i = 0; i < n; i++){
// count # of consecutive higher bars on the left of the (i+1)th bar
while(!left.isEmpty() && height[i] <= height[left.peek()]){
// while there are bars stored in the stack, we check the bar on the top of the stack.
left.pop();
}
if(left.isEmpty()){
// all elements on the left are larger than height[i].
width[i] += i;
} else {
// bar[left.peek()] is the closest shorter bar.
width[i] += i - left.peek() - 1;
}
left.push(i);
}
for (int i = n-1; i >=0; i--) {
while(!right.isEmpty() && height[i] <= height[right.peek()]){
right.pop();
}
if(right.isEmpty()){
// all elements to the right are larger than height[i]
width[i] += n - 1 - i;
} else {
width[i] += right.peek() - i - 1;
}
right.push(i);
}
int max = Integer.MIN_VALUE;
for(int i = 0; i < n; i++){
// find the maximum value of all rectangle areas.
max = Math.max(max, width[i] * height[i]);
}
return max;
}
A Python implementation of the O(n) solution from @IVlad's answer:
from collections import namedtuple

Info = namedtuple('Info', 'start height')

def max_rectangle_area(histogram):
    """Find the area of the largest rectangle that fits entirely under
    the histogram.
    """
    stack = []
    top = lambda: stack[-1]
    max_area = 0
    pos = 0  # current position in the histogram
    for pos, height in enumerate(histogram):
        start = pos  # position where rectangle starts
        while True:
            if not stack or height > top().height:
                stack.append(Info(start, height))  # push
            elif stack and height < top().height:
                max_area = max(max_area, top().height*(pos-top().start))
                start, _ = stack.pop()
                continue
            break  # height == top().height goes here

    pos += 1
    for start, height in stack:
        max_area = max(max_area, height*(pos-start))

    return max_area
Example:
>>> f = max_rectangle_area
>>> f([5,3,1])
6
>>> f([1,3,5])
6
>>> f([3,1,5])
5
>>> f([4,8,3,2,0])
9
>>> f([4,8,3,1,1,0])
9
Linear search using a stack of incomplete subproblems
Copy-paste algorithm's description (in case the page goes down):
We process the elements in left-to-right order and maintain a stack of information about started but yet unfinished subhistograms. Whenever a new element arrives it is subjected to the following rules. If the stack is empty we open a new subproblem by pushing the element onto the stack. Otherwise we compare it to the element on top of the stack. If the new one is greater we again push it. If the new one is equal we skip it. In all these cases, we continue with the next new element. If the new one is less, we finish the topmost subproblem by updating the maximum area w.r.t. the element at the top of the stack. Then, we discard the element at the top, and repeat the procedure keeping the current new element. This way, all subproblems are finished until the stack becomes empty, or its top element is less than or equal to the new element, leading to the actions described above. If all elements have been processed, and the stack is not yet empty, we finish the remaining subproblems by updating the maximum area w.r.t. the elements at the top.
For the update w.r.t. an element, we find the largest rectangle that includes that element. Observe that an update of the maximum area is carried out for all elements except for those skipped. If an element is skipped, however, it has the same largest rectangle as the element on top of the stack at that time that will be updated later. The height of the largest rectangle is, of course, the value of the element. At the time of the update, we know how far the largest rectangle extends to the right of the element, because then, for the first time, a new element with smaller height arrived. The information, how far the largest rectangle extends to the left of the element, is available if we store it on the stack, too.
We therefore revise the procedure described above. If a new element is pushed immediately, either because the stack is empty or it is greater than the top element of the stack, the largest rectangle containing it extends to the left no farther than the current element. If it is pushed after several elements have been popped off the stack, because it is less than these elements, the largest rectangle containing it extends to the left as far as that of the most recently popped element.
Every element is pushed and popped at most once and in every step of the procedure at least one element is pushed or popped. Since the amount of work for the decisions and the update is constant, the complexity of the algorithm is O(n) by amortized analysis.
The other answers here have done a great job presenting the O(n)-time, O(n)-space solution using two stacks. There's another perspective on this problem that independently provides an O(n)-time, O(n)-space solution to the problem, and might provide a little bit more insight as to why the stack-based solution works.
The key idea is to use a data structure called a Cartesian tree. A Cartesian tree is a binary tree structure (though not a binary search tree) that's built around an input array. Specifically, the root of the Cartesian tree is built above the minimum element of the array, and the left and right subtrees are recursively constructed from the subarrays to the left and right of the minimum value.
For example, here's a sample array and its Cartesian tree:
                 +----------------------- 23 ------+
                 |                                 |
  +------------- 26 --+                        +-- 79
  |                   |                        |
  31 --+              53 --+                   84
       |                   |
       41 --+              58 -------+
            |                        |
            59                   +-- 93
                                 |
                                 97
+----+----+----+----+----+----+----+----+----+----+----+
| 31 | 41 | 59 | 26 | 53 | 58 | 97 | 93 | 23 | 84 | 79 |
+----+----+----+----+----+----+----+----+----+----+----+
The reason that Cartesian trees are useful in this problem is that the question at hand has a really nice recursive structure to it. Begin by looking at the lowest rectangle in the histogram. There are three options for where the maximum rectangle could end up being placed:
It could pass right under the minimum value in the histogram. In that case, to make it as large as possible, we'd want to make it as wide as the entire array.
It could be entirely to the left of the minimum value. In that case, we recursively want the answer formed from the subarray purely to the left of the minimum value.
It could be entirely to the right of the minimum value. In that case, we recursively want the answer formed from the subarray purely to the right of the minimum value.
Notice that this recursive structure - find the minimum value, do something with the subarrays to the left and the right of that value - perfectly matches the recursive structure of a Cartesian tree. In fact, if we can create a Cartesian tree for the overall array when we get started, we can then solve this problem by recursively walking the Cartesian tree from the root downward. At each point, we recursively compute the optimal rectangle in the left and right subarrays, along with the rectangle you'd get by fitting right under the minimum value, and then return the best option we find.
In pseudocode, this looks like this:
function largestRectangleUnder(int low, int high, Node root) {
    /* Base case: If the range is empty, the biggest rectangle we
     * can fit is the empty rectangle.
     */
    if (low == high) return 0;
    /* Assume the Cartesian tree nodes are annotated with their
     * positions in the original array.
     */
    return max {
        (high - low) * root.value, // Widest rectangle under the minimum
        largestRectangleUnder(low, root.index, root.left),
        largestRectangleUnder(root.index + 1, high, root.right)
    }
}
Once we have the Cartesian tree, this algorithm takes time O(n), since we visit each node exactly once and do O(1) work per node.
It turns out that there's a simple, linear-time algorithm for building Cartesian trees. The "natural" way you'd probably think to build one would be to scan across the array, find the minimum value, then recursively build a Cartesian tree from the left and right subarrays. The problem is that the process of finding the minimum value is really expensive, and this can take time Θ(n^2).
The "fast" way to build a Cartesian tree is by scanning the array from the left to the right, adding in one element at a time. This algorithm is based on the following observations about Cartesian trees:
First, Cartesian trees obey the heap property: every element is less than or equal to its children. The reason for this is that the Cartesian tree root is the smallest value in the overall array, and its children are the smallest elements in their subarrays, etc.
Second, if you do an inorder traversal of a Cartesian tree, you get back the elements of the array in the order in which they appear. To see why this is, notice that if you do an inorder traversal of a Cartesian tree, you first visit everything to the left of the minimum value, then the minimum value, then everything to the right of the minimum value. Those visitations are recursively done the same way, so everything ends up being visited in order.
These two rules give us a lot of information about what happens if we start with a Cartesian tree of the first k elements of the array and want to form a Cartesian tree for the first k+1 elements. That new element will have to end up on the right spine of the Cartesian tree - the part of the tree formed by starting at the root and only taking steps to the right - because otherwise something would come after it in an inorder traversal. And, within that right spine, it has to be placed in a way that makes it bigger than everything above it, since we need to obey the heap property.
The way that you actually add a new node to the Cartesian tree is to start at the rightmost node in the tree and walk upwards until you either hit the root of the tree or find a node that has a smaller value. You then make the new value have as its left child the last node it walked up on top of.
Here's a trace of that algorithm on a small array:
+---+---+---+---+
| 2 | 4 | 3 | 1 |
+---+---+---+---+
2 becomes the root.
  2 --+
      |
      4
4 is bigger than 2, we can't move upwards. Append to right.
+---+---+---+---+
| 2 | 4 | 3 | 1 |
+---+---+---+---+
  2 ------+
          |
      --- 3
      |
      4
3 is less than 4, climb over it. Can't climb further over 2, as it is smaller than 3. The climbed-over subtree rooted at 4 goes to the left of the new value 3, and 3 becomes the rightmost node now.
+---+---+---+---+
| 2 | 4 | 3 | 1 |
+---+---+---+---+
  +---------- 1
  |
  2 ------+
          |
      --- 3
      |
      4
1 climbs over the root 2, the entire tree rooted at 2 is moved to the left of 1, and 1 is now the new root - and also the rightmost value.
+---+---+---+---+
| 2 | 4 | 3 | 1 |
+---+---+---+---+
Although this might not seem to run in linear time - wouldn't you potentially end up climbing all the way to the root of the tree over and over and over again? - you can show that this runs in linear time using a clever argument. If you climb up over a node in the right spine during an insertion, that node ends up getting moved off the right spine and therefore can't be rescanned in a future insertion. Therefore, every node is only ever scanned over at most once, so the total work done is linear.
And now the kicker - the standard way that you'd actually implement this approach is by maintaining a stack of the values that correspond to the nodes on the right spine. The act of "walking up" and over a node corresponds to popping a node off the stack. Therefore, the code for building a Cartesian tree looks something like this:
Stack s;
for (each array element x) {
    pop s until it's empty or s.top <= x
    push x onto the stack.
    do some sort of pointer rewiring based on what you just did.
}
The stack manipulations here might seem really familiar, and that's because these are the exact stack operations that you would do in the answers shown elsewhere here. In fact, you can think of what those approaches are doing as implicitly building the Cartesian tree and running the recursive algorithm shown above in the process of doing so.
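The construction and the recursive walk described above can be sketched together in Python. This is an illustrative version under stated assumptions, not the answerer's implementation: the tree is stored as index arrays rather than node objects, the walk uses an explicit work list instead of recursion, and the names build_cartesian_tree and largest_rectangle_via_tree are hypothetical.

```python
def build_cartesian_tree(values):
    """Return (root, left, right) for the min-Cartesian tree of values, using indices.

    left[i] and right[i] are child indices, with -1 meaning "no child".
    """
    n = len(values)
    left = [-1] * n
    right = [-1] * n
    spine = []                           # indices on the right spine, root at the bottom
    for i, v in enumerate(values):
        last_popped = -1
        while spine and values[spine[-1]] > v:
            last_popped = spine.pop()    # climb over larger nodes
        left[i] = last_popped            # the climbed-over subtree hangs to the left
        if spine:
            right[spine[-1]] = i         # the new node becomes the rightmost child
        spine.append(i)
    root = spine[0] if spine else -1
    return root, left, right


def largest_rectangle_via_tree(values):
    root, left, right = build_cartesian_tree(values)
    best = 0
    work = [(0, len(values), root)]      # (low, high, node) over the half-open range
    while work:
        low, high, node = work.pop()
        if node == -1:
            continue                     # empty range: nothing to fit
        best = max(best, (high - low) * values[node])
        work.append((low, node, left[node]))
        work.append((node + 1, high, right[node]))
    return best
```

For the array 2 4 3 1 from the trace, the sketch yields root index 3 (the value 1), whose left child is the node for 2, matching the final diagram.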
The advantage, I think, of knowing about Cartesian trees is that it provides a really nice conceptual framework for seeing why this algorithm works correctly. If you know that what you're doing is running a recursive walk of a Cartesian tree, it's easier to see that you're guaranteed to find the largest rectangle. Plus, knowing that the Cartesian tree exists gives you a useful tool for solving other problems. Cartesian trees show up in the design of fast data structures for the range minimum query problem and are used to convert suffix arrays into suffix trees.
Here's some Java code that implements this idea, courtesy of @Azeem!
import java.util.Stack;

public class CartesianTreeMakerUtil {

    private static class Node {
        int val;
        Node left;
        Node right;
    }

    public static Node cartesianTreeFor(int[] nums) {
        Node root = null;
        Stack<Node> s = new Stack<>();
        for (int curr : nums) {
            Node lastJumpedOver = null;
            // Climb over every node on the right spine that is larger than curr.
            while (!s.empty() && s.peek().val > curr) {
                lastJumpedOver = s.pop();
            }
            // Node is a static nested class, so no enclosing instance is needed.
            Node currNode = new Node();
            currNode.val = curr;
            if (s.isEmpty()) {
                root = currNode;
            } else {
                s.peek().right = currNode;
            }
            currNode.left = lastJumpedOver;
            s.push(currNode);
        }
        return root;
    }

    public static void printInOrder(Node root) {
        if (root == null) return;
        if (root.left != null) {
            printInOrder(root.left);
        }
        System.out.println(root.val);
        if (root.right != null) {
            printInOrder(root.right);
        }
    }

    public static void main(String[] args) {
        int[] nums = new int[args.length];
        for (int i = 0; i < args.length; i++) {
            nums[i] = Integer.parseInt(args[i]);
        }
        Node root = cartesianTreeFor(nums);
        printInOrder(root);
    }
}
The easiest solution in O(N)
#include <stack>

long long getMaxArea(long long hist[], long long n)
{
    std::stack<long long> s; // indices of bars with non-decreasing heights
    long long max_area = 0;
    long long tp;
    long long area_with_top;
    long long i = 0;
    while (i < n)
    {
        if (s.empty() || hist[s.top()] <= hist[i])
            s.push(i++);
        else
        {
            tp = s.top(); // store the top index
            s.pop(); // pop the top
            area_with_top = hist[tp] * (s.empty() ? i : i - s.top() - 1);
            if (max_area < area_with_top)
            {
                max_area = area_with_top;
            }
        }
    }
    // Bars still on the stack extend to the right end of the histogram.
    while (!s.empty())
    {
        tp = s.top();
        s.pop();
        area_with_top = hist[tp] * (s.empty() ? i : i - s.top() - 1);
        if (max_area < area_with_top)
            max_area = area_with_top;
    }
    return max_area;
}
There is also another solution using Divide and Conquer. The algorithm for it is :
1) Divide the array into 2 parts with the smallest height as the breaking point
2) The maximum area is the maximum of :
a) Smallest height * size of the array
b) Maximum rectangle in left half array
c) Maximum rectangle in right half array
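The time complexity comes to O(n log n) when the splits are balanced, or when the minimum is found with a range-minimum structure such as a segment tree. With a plain linear scan for the minimum, the worst case (e.g. sorted heights) is O(n^2).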
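A minimal Python sketch of these steps is given below. The function name largest_rectangle_dc is an assumption, and the minimum is found with a simple linear scan, so this version takes quadratic time on sorted input and recurses as deep as the array is long in that case.

```python
def largest_rectangle_dc(heights, low=0, high=None):
    """Divide and conquer over the half-open range heights[low:high]."""
    if high is None:
        high = len(heights)
    if low >= high:
        return 0                         # empty range: no rectangle
    # Step 1: split at the position of the smallest height (linear scan).
    m = min(range(low, high), key=heights.__getitem__)
    # Step 2: best of (a) full width at the smallest height, (b) left part, (c) right part.
    return max(
        heights[m] * (high - low),
        largest_rectangle_dc(heights, low, m),
        largest_rectangle_dc(heights, m + 1, high),
    )
```

Replacing the linear scan with a range-minimum query structure is what changes the overall cost, not the shape of the recursion.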
The stack solution is one of the cleverest solutions I've seen to date, and it can be a little hard to understand why it works.
I've taken a stab at explaining it in some detail here.
Summary points from the post:
The way our brain usually thinks is:
Create every situation and try to find the value of the constraint that is needed to solve the problem.
And we happily convert that to code as: find the value of the constraint (min) for each situation (pair (i,j)).
The clever solution flips the problem: for each constraint/min value of the area, what are the best possible left and right extremes?
So, if we traverse over each possible min in the array, what are the left and right extremes for each value?
A little thought says: the first value to the left that is less than the current min, and similarly the first value to the right that is less than the current min.
So now we need a clever way to find the first values on the left and right that are less than the current value.
To think about: if we have traversed the array partially, say up to min_i, how can the solution for min_i+1 be built?
We need the first value less than min_i to its left.
Inverting the statement: we need to ignore all values to the left of min_i that are greater than min_i. We stop when we find the first value smaller than min_i (i). The peaks in the curve hence become useless once we have crossed them. In the histogram (2 4 3), if 3 is min_i, then 4, being larger, is not of interest.
Corollary: in a range (i,j), with j being the min value we are considering, all values between j and its left value i are useless, even for further calculations.
Any histogram on the right with a min value larger than j will be bounded at j. The values of interest on the left form a monotonically increasing sequence with j being the largest value. (Values of interest here are the values that may still matter for the rest of the array.)
Since we are travelling from left to right, for each min value/current value we do not know whether the right side of the array will have an element smaller than it.
So we have to keep it in memory until we learn that this value is useless (because a smaller value has been found).
All this leads to the use of our very own stack structure.
We keep a value on the stack as long as we don't know that it's useless.
We remove it from the stack once we know it's useless.
So, for each min value, to find its left smaller value we do the following:
Pop the elements larger than it (useless values).
The first element smaller than the value is the left extreme: the i to our min.
We can do the same thing from the right side of the array to get the j to our min.
It's quite hard to explain this, but if this is making sense then I'd suggest reading the complete article here, since it has more insights and details.
I don't understand the other entries, but I think I know how to do it in O(n) as follows.
A) for each index find the largest rectangle inside the histogram ending at that index where the index column touches the top of the rectangle and remember where the rectangle starts. This can be done in O(n) using a stack based algorithm.
B) Similarly for each index find the largest rectangle starting at that index where the index column touches the top of the rectangle and remember where the rectangle ends. Also O(n) using the same method as (A) but scanning the histogram backwards.
C) For each index combine the results of (A) and (B) to determine the largest rectangle where the column at that index touches the top of the rectangle. O(n) like (A).
D) Since the largest rectangle must be touched by some column of the histogram the largest rectangle is the largest rectangle found in step (C).
The hard part is implementing (A) and (B), which I think is what JF Sebastian may have solved rather than the general problem stated.
I coded this one and felt a little better about it:
import java.util.Stack;
class StackItem{
public int sup;
public int height;
public int sub;
public StackItem(int a, int b, int c){
sup = a;
height = b;
sub =c;
}
public int getArea(){
return (sup - sub)* height;
}
@Override
public String toString(){
return " from:"+sup+
" to:"+sub+
" height:"+height+
" Area ="+getArea();
}
}
public class MaxRectangleInHistogram {
Stack<StackItem> S;
StackItem curr;
StackItem maxRectangle;
public StackItem getMaxRectangleInHistogram(int A[], int n){
int i = 0;
S = new Stack<>();
S.push(new StackItem(0,0,-1));
maxRectangle = new StackItem(0,0,-1);
while(i<n){
curr = new StackItem(i,A[i],i);
if(curr.height > S.peek().height){
S.push(curr);
}else if(curr.height == S.peek().height){
S.peek().sup = i+1;
}else if(curr.height < S.peek().height){
while((S.size()>1) && (curr.height<=S.peek().height)){
curr.sub = S.peek().sub;
S.peek().sup = i;
decideMaxRectangle(S.peek());
S.pop();
}
S.push(curr);
}
i++;
}
while(S.size()>1){
S.peek().sup = i;
decideMaxRectangle(S.peek());
S.pop();
}
return maxRectangle;
}
private void decideMaxRectangle(StackItem s){
if(s.getArea() > maxRectangle.getArea() )
maxRectangle = s;
}
}
Note:
Time complexity: O(n), since each bar is pushed onto and popped from the stack at most once.
Space complexity: O(n) for the stack.
I would like to thank @templatetypedef for his/her extremely detailed and intuitive answer. The Java code below is based on his suggestion to use Cartesian trees and solves the problem in O(N) time and O(N) space. I suggest that you read @templatetypedef's answer above before reading the code below. The code is given in the format of the solution to the problem at leetcode: https://leetcode.com/problems/largest-rectangle-in-histogram/description/ and passes all 96 test cases.
class Solution {
private class Node {
int val;
Node left;
Node right;
int index;
}
public Node getCartesianTreeFromArray(int [] nums) {
Node root = null;
Stack<Node> s = new Stack<>();
for(int i = 0; i < nums.length; i++) {
int curr = nums[i];
Node lastJumpedOver = null;
while(!s.empty() && s.peek().val >= curr) {
lastJumpedOver = s.pop();
}
Node currNode = this.new Node();
currNode.val = curr;
currNode.index = i;
if(s.isEmpty()) {
root = currNode;
}
else {
s.peek().right = currNode;
}
currNode.left = lastJumpedOver;
s.push(currNode);
}
return root;
}
public int largestRectangleUnder(int low, int high, Node root, int [] nums) {
/* Base case: If the range is empty, the biggest rectangle we
* can fit is the empty rectangle.
*/
if(root == null) return 0;
if (low == high) {
if(0 <= low && low <= nums.length - 1) {
return nums[low];
}
return 0;
}
/* Assume the Cartesian tree nodes are annotated with their
* positions in the original array.
*/
int leftArea = -1 , rightArea= -1;
if(root.left != null) {
leftArea = largestRectangleUnder(low, root.index - 1 , root.left, nums);
}
if(root.right != null) {
rightArea = largestRectangleUnder(root.index + 1, high,root.right, nums);
}
return Math.max((high - low + 1) * root.val,
Math.max(leftArea, rightArea));
}
public int largestRectangleArea(int[] heights) {
if(heights == null || heights.length == 0 ) {
return 0;
}
if(heights.length == 1) {
return heights[0];
}
Node root = getCartesianTreeFromArray(heights);
return largestRectangleUnder(0, heights.length - 1, root, heights);
}
}
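For comparison, the same Cartesian-tree idea can be sketched in Python without recursion. This is my own adaptation, not a translation of the Java above: the tree is stored in two index arrays (`left`, `right`, both hypothetical names), and the tree walk uses an explicit work list so that a long increasing histogram does not hit the recursion limit.

```python
def largest_rectangle_cartesian(heights):
    """Largest rectangle in a histogram using a Cartesian tree (min at the root)."""
    n = len(heights)
    if n == 0:
        return 0
    left = [-1] * n   # left child index of each node, -1 if none
    right = [-1] * n  # right child index of each node, -1 if none
    stack = []
    for i in range(n):
        last_popped = -1
        while stack and heights[stack[-1]] >= heights[i]:
            last_popped = stack.pop()
        left[i] = last_popped          # everything jumped over hangs to the left
        if stack:
            right[stack[-1]] = i       # new node becomes right child of the stack top
        stack.append(i)
    root = stack[0]                    # bottom of the stack is the overall minimum

    best = 0
    todo = [(root, 0, n - 1)]          # (node, low, high): node is the minimum of low..high
    while todo:
        node, low, high = todo.pop()
        best = max(best, heights[node] * (high - low + 1))
        if left[node] != -1:
            todo.append((left[node], low, node - 1))
        if right[node] != -1:
            todo.append((right[node], node + 1, high))
    return best


print(largest_rectangle_cartesian([2, 1, 5, 6, 2, 3]))  # 10
```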
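python-3
a=[3,4,7,4,6]
# note: sorting discards the original order of the bars, so this finds the best
# area over any rearrangement of the bars, e.g. it gives 10 for [5,1,5]
a.sort()
n=len(a)
r=0
for i in range(n):
    if a[i]*(n-i) > r:
        r = a[i]*(n-i)
print(r)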
output:
16
I came across this question in an interview. While trying to solve it, I observed the following:
For each element, count the consecutive elements to its left that are greater than or equal to it
Count the consecutive elements to its right that are greater than or equal to it
Calculate the area: (left count + right count + 1) * current element
Replace maxArea if the calculated area is greater than maxArea
The following JS code implements the steps above (it is O(n^2) in the worst case):
function maxAreaCovered(arr) {
let maxArea = 0;
for (let index = 0; index < arr.length; index++) {
let l = index - 1;
let r = index + 1;
let maxEleCount = 0
while (l > -1) {
if (arr[l] >= arr[index]) {
maxEleCount++;
} else {
break;
}
l--;
}
while (r < arr.length) {
if (arr[r] >= arr[index]) {
maxEleCount++;
} else {
break;
}
r++;
}
let area = (maxEleCount + 1) * arr[index];
maxArea = Math.max(area, maxArea);
}
return maxArea
}
console.log(maxAreaCovered([6, 2, 5, 4, 5, 1, 6]));
You can use an O(n) method that uses a stack to calculate the maximum area under the histogram.
long long histogramArea(vector<int> &histo){
stack<int> s;
long long maxArea=0;
long long area= 0;
int i =0;
for (i = 0; i < histo.size();) {
if(s.empty() || histo[s.top()] <= histo[i]){
s.push(i++);
}
else{
int top = s.top(); s.pop();
area= histo[top]* (s.empty()?i:i-s.top()-1);
if(area >maxArea)
maxArea= area;
}
}
while(!s.empty()){
int top = s.top();s.pop();
area= histo[top]* (s.empty()?i:i-s.top()-1);
if(area >maxArea)
maxArea= area;
}
return maxArea;
}
For an explanation, see http://www.geeksforgeeks.org/largest-rectangle-under-histogram/

Quick Sort Time Complexity Best Case Input

I have to find the time complexity of quick sort for the BEST CASE INPUT in a C program, and I have selected the last element of the array as the pivot.
I know what input values I have to enter for the best case, i.e., put the middle element in the last place (the pivot), and then the pivot of each sub-array should again be its middle element.
But I have to generate this kind of best-case input array for very big sizes like 1000, 5000, 100000, ...
I can code, but can anyone please help me understand how to generate that kind of best-case input array for quick sort with the last element as pivot, using C?
I just need the logic for generating that kind of array.
Basically you need a divide-and-conquer approach akin to quicksort itself. Do it with a function that, given a range of indices in the output:
generates the first-half partition by recursively calling itself
generates the second-half partition by recursively calling itself
inserts the pivot value after the second-half partition.
One thing to note is that since you are just generating output not sorting anything, you don't actually have to have any values as input -- you can just represent ranges logically as a start value at some index in the array and a count.
Some C# code is below; this is untested -- don't look if you want to do this yourself.
static int[] GenerateBestCaseQuickSort(int n)
{
var ary = new int[n];
GenerateBestCaseQuickSortAux(ary, 0, n, 1);
return ary;
}
static void GenerateBestCaseQuickSortAux(int[] ary, int start_index, int count, int start_value)
{
if (count == 0)
return;
if (count == 1)
{
ary[start_index] = start_value;
return;
}
int partition1_count = count / 2;
int partition2_count = count - partition1_count - 1; // need to save a spot for the pivot so -1...
int pivot_value_index = start_index + partition1_count;
int pivot_value = start_value + partition1_count;
GenerateBestCaseQuickSortAux(ary, start_index, partition1_count, start_value);
GenerateBestCaseQuickSortAux(ary, pivot_value_index, partition2_count, pivot_value+1);
ary[start_index + count - 1] = pivot_value;
}
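Here is a hedged Python sketch of the same generator, together with a small checker. Two things in it are my assumptions rather than part of the answer above: the quicksort is taken to use the Lomuto partition scheme (which swaps the pivot with the first element of the larger-than-pivot block), and to compensate for that swap the second half is rotated by one position. The names `best_case_input` and `quicksort_depth` are made up; the logic ports to C directly.

```python
def best_case_input(lo, hi):
    """Arrange the values lo..hi-1 so that quicksort with Lomuto partitioning
    and a last-element pivot splits every sub-array as evenly as possible."""
    n = hi - lo
    if n <= 0:
        return []
    if n == 1:
        return [lo]
    pivot = lo + n // 2                      # median value goes last
    left = best_case_input(lo, pivot)
    right = best_case_input(pivot + 1, hi)
    if right:
        # Lomuto moves the first element of this block to its end when it
        # places the pivot, so pre-rotate to undo that (assumption, see above).
        right = [right[-1]] + right[:-1]
    return left + right + [pivot]


def quicksort_depth(a, lo=0, hi=None, depth=1):
    """Sort a in place (Lomuto, last-element pivot) and return the deepest
    recursion level that had at least two elements to partition."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return depth - 1
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return max(depth,
               quicksort_depth(a, lo, i - 1, depth + 1),
               quicksort_depth(a, i + 1, hi, depth + 1))


data = best_case_input(0, 1023)
print(quicksort_depth(data))  # recursion depth stays logarithmic in the array size
```

If your partition function is a different scheme (e.g. Hoare), the rotation step would need to be adapted to whatever that scheme does to the right-hand block.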

Code Complexity in 3 array case

You are given three sorted arrays (in ascending order), and you are required to find a triplet (one element from each array) such that the distance is minimum.
Distance is defined like this :
If a[i], b[j] and c[k] are three elements then
distance = max{abs(a[i]-b[j]),abs(a[i]-c[k]),abs(b[j]-c[k])}
Please give a solution in O(n) time complexity
Linear time algorithm:
static double MinimalDistance(double[] A, double[] B, double[] C)
{
int i = 0, j = 0, k = 0;
double min_value = Double.POSITIVE_INFINITY;
double current_val;
int[] opt_indexes = {0, 0, 0};
// Stop once any array is exhausted: the array that ran out held the smallest
// value, and pairing its last element with larger values cannot reduce the distance.
while (i < A.length && j < B.length && k < C.length)
{
// The largest pairwise difference equals max minus min of the three values.
current_val = Math.max(A[i], Math.max(B[j], C[k])) - Math.min(A[i], Math.min(B[j], C[k]));
if (current_val < min_value)
{
min_value = current_val;
opt_indexes[0] = i;
opt_indexes[1] = j;
opt_indexes[2] = k;
}
// Advance the index that points at the smallest of the three values.
if (A[i] < B[j] && A[i] < C[k])
i++;
else if (B[j] < C[k])
j++;
else
k++;
}
return min_value;
}
In each step you check the current distance, then increment the index of the array currently pointing to the minimal value. Each array is traversed at most once, which means the running time is O(|A| + |B| + |C|).
If you want the optimal indexes instead of the minimal value, you can return opt_indexes instead of min_value.
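A compact Python sketch of the same three-pointer walk, for anyone who wants to experiment with it. The function name and the returned `(distance, indexes)` pair are my own choices, and the loop stops as soon as one array is exhausted.

```python
def minimal_distance(a, b, c):
    """Return (best distance, (i, j, k)) for three ascending lists,
    or (inf, None) if any list is empty."""
    i = j = k = 0
    best = float("inf")
    best_idx = None
    while i < len(a) and j < len(b) and k < len(c):
        x, y, z = a[i], b[j], c[k]
        smallest = min(x, y, z)
        dist = max(x, y, z) - smallest   # equals the largest pairwise difference
        if dist < best:
            best, best_idx = dist, (i, j, k)
        # Only moving past the smallest value can shrink the spread.
        if x == smallest:
            i += 1
        elif y == smallest:
            j += 1
        else:
            k += 1
    return best, best_idx


print(minimal_distance([1, 4, 10], [2, 15, 20], [10, 12]))  # (5, (2, 1, 0))
```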
Suppose we have just one sorted array; then the three consecutive elements with the smallest spread are the desired solution. Now that we have three arrays, just merge them all into one big sorted array ABC (this can be done in O(n) with the merge operation of merge sort), keeping a flag that records which original array each element belongs to. Now you have to find three consecutive elements in an array like this:
a1,a2,b1,b2,b3,c1,b4,c2,c3,c4,b5,b6,a3,a4,a5,....
and here consecutive means that they belong to three different groups in consecutive order, e.g. a2,b3,c1 or c4,b6,a3.
Finding these three elements is not hard. In any such triple, the smallest element should be the last element of the first group and the greatest should be the first element of the last group. E.g. in the groups [c2,c3,c4],[b5,b6],[a3,a4,a5] we don't need to check a4,a5,c2,c3; clearly the possible solution in this case is among c4,[b5,b6],a3. We also don't need to compare c4 with b5,b6, or a3 with b5,b6: the distance is determined by a3-c4 (in this group). So we can start from the left, keep only the last visited value of each group, and update the best solution so far in each iteration.
Example (I should say first that I didn't write the code, because I think that is the OP's task, not mine):
Suppose we have this sequence after merging the arrays:
a1,a2,b1,b2,b3,c1,b4,c2,c3,c4,b5,b6,a3,a4,a5,....
Let's iterate step by step:
We just need to keep track of the last item seen from each of our arrays: a holds the current a_i, b the current b_i, and c the current c_i. Suppose that at first a=b=c=-1.
In the first step a will be a1; in the next steps:
a=a2,b=-1,c=-1
a=a2,b=b1,c=-1
a=a2,b=b2,c=-1
a=a2,b=b3,c=-1,
a=a2,b=b3,c=c1,
At this point we save the current pointers (a2,b3,c1) as the best value for the difference.
In the next step:
a=a2,c=c1,b=b4
Now we compare the difference b4-a2 with the previous best option; if it is better, we save these pointers as the solution so far and proceed:
a=a2,b=b4,c=c2 (again compare and if needed update the best solution),
a=a2,b=b4,c=c3 (again ....)
a=a2,b=b4,c=c4 (again ....)
a=a2, b=b5,c=c4, ....
OK, if it is not clear from the text, after the merge we have (I'll suppose every array has at least one element):
solution = infinite;
a=b=c=-1,
bestA=bestB=bestC=1;
for (int i=0;i<ABC.Length;i++)
{
if(ABC[i].type == "a") // type is a flag determines
// who is the owner of this element
{
a=ABC[i].Value;
if (b!=-1&&c!=-1)
{
if (max(|a-b|,|b-c|,|a-c|) < solution)
{
solution = max(|a-b|,|b-c|,|a-c|);
bestA= a,bestB = b,bestC = c;
}
}
}
// and two more if for type "b" and "c"
}
Surely there is a more elegant algorithm than this, but I see you had a problem with your link, so I guess this simple way of looking at the problem makes it easier; afterward you can understand your own link.
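The merge-and-track idea can be sketched in Python roughly as follows. This is an illustration under my own assumptions: `heapq.merge` stands in for the hand-written merge step, the owner flag is an integer 0/1/2 instead of "a"/"b"/"c", and `None` replaces the -1 sentinel so that arrays containing -1 still work.

```python
import heapq


def min_distance_by_merge(a, b, c):
    """Merge three ascending lists and track the last value seen from each.
    Returns (best distance, (value from a, value from b, value from c))."""
    merged = heapq.merge(
        ((v, 0) for v in a),   # tag every value with the list it came from
        ((v, 1) for v in b),
        ((v, 2) for v in c),
    )
    last = [None, None, None]  # last visited value of each list
    best = float("inf")
    best_triplet = None
    for value, owner in merged:
        last[owner] = value
        if None not in last:
            # The merged stream is ascending, so `value` is the largest of the three.
            dist = value - min(last)
            if dist < best:
                best, best_triplet = dist, tuple(last)
    return best, best_triplet


print(min_distance_by_merge([1, 4, 10], [2, 15, 20], [10, 12]))
```

Several triplets can share the best distance, so the triplet reported here may differ from the one another method returns.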

Algorithm to find duplicate in an array

I have an assignment to create an algorithm to find duplicates in an array of number values, but it does not say which kind of numbers, integers or floats. I have written the following pseudocode:
FindingDuplicateAlgorithm(A) // A is the array
mergeSort(A);
for int i <- 0 to i<A.length
if A[i] == A[i+1]
i++
return A[i]
else
i++
Have I created an efficient algorithm?
I think there is a problem in my algorithm: it returns duplicate numbers several times. For example, if the array includes 2 at several indexes, I will have ...2, 2,... in the output. How can I change it to return each duplicate only once?
I think it is a good algorithm for integers, but does it work well for float numbers too?
To handle duplicates, you can do the following:
if A[i] == A[i+1]:
    result.append(A[i]) # collect found duplicates in a list
    while i+1 < A.length and A[i] == A[i+1]: # skip the entire range of duplicates
        i++ # until a new value is found
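Put together as runnable Python, the sort-then-skip idea might look like the sketch below. The function name is hypothetical and `sorted()` stands in for the merge sort from the question.

```python
def find_duplicates(values):
    """Return each duplicated value once, in ascending order."""
    ordered = sorted(values)
    result = []
    i = 0
    # Only len - 1 neighbour comparisons are possible, hence the bound.
    while i < len(ordered) - 1:
        if ordered[i] == ordered[i + 1]:
            result.append(ordered[i])
            # Skip the whole run of equal values so it is reported once.
            while i < len(ordered) - 1 and ordered[i] == ordered[i + 1]:
                i += 1
        i += 1
    return result


print(find_duplicates([2, 2, 2, 3, 1, 1]))  # [1, 2]
```

The comparison is exact equality, so with floats only values that are exactly equal count as duplicates; values that differ by rounding error do not.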
Do you want to find Duplicates in Java?
You may use a HashSet.
HashSet<Object> h = new HashSet<>();
for (Object a : A) {
boolean b = h.add(a);
boolean duplicate = !b;
if (duplicate) {
// do something with a
}
}
The return-Value of add() is defined as:
true if the set did not already
contain the specified element.
EDIT:
I know HashSet is optimized for insert and contains operations, but I'm not sure if it's fast enough for your needs.
EDIT2:
I've seen you recently added the homework tag. I would not recommend my answer if it's homework, because it may be too "high-level" for an algorithms lesson.
http://download.oracle.com/javase/1.4.2/docs/api/java/util/HashSet.html#add%28java.lang.Object%29
Your answer seems pretty good. First sorting and then simply checking neighboring values gives you O(n log(n)) complexity, which is quite efficient.
Merge sort is O(n log(n)) while checking neighboring values is simply O(n).
One thing though (as mentioned in one of the comments): your pseudocode is going to read past the end of the array. The loop should be (in Java):
for (int i = 0; i < array.length - 1; i++) {
...
}
Then also, if you actually want to display which numbers (and/or indexes) are the duplicates, you will need to store them in a separate list.
I'm not sure what language you need to write the algorithm in, but there are some really good C++ solutions in response to my question here. Should be of use to you.
O(n) algorithm (expected time): traverse the array and try to insert each element into a hashtable/set with the number as the hash key. If you cannot insert it, then that's a duplicate.
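As a small illustration of the hashing idea in Python (the function name and the second set that keeps each duplicate from being reported twice are my additions):

```python
def duplicates_by_hashing(values):
    """Return each duplicated value once, in order of first repeat."""
    seen = set()        # every value encountered so far
    reported = set()    # duplicates already added to the result
    duplicates = []
    for v in values:
        if v in seen:
            if v not in reported:
                reported.add(v)
                duplicates.append(v)
        else:
            seen.add(v)
    return duplicates


print(duplicates_by_hashing([3, 1, 3, 2, 1, 3]))  # [3, 1]
```

This avoids sorting altogether, at the cost of O(n) extra memory for the sets.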
Your algorithm contains a buffer overrun. i starts with 0, so I assume the indexes into array A are zero-based, i.e. the first element is A[0], the last is A[A.length-1]. Now i counts up to A.length-1, and in the loop body accesses A[i+1], which is out of the array for the last iteration. Or, simply put: If you're comparing each element with the next element, you can only do length-1 comparisons.
If you only want to report duplicates once, I'd use a bool variable firstDuplicate, that's set to false when you find a duplicate and true when the number is different from the next. Then you'd only report the first duplicate by only reporting the duplicate numbers if firstDuplicate is true.
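public void printDuplicates(int[] inputArray) {
if (inputArray == null) {
throw new IllegalArgumentException("Input array can not be null");
}
int length = inputArray.length;
// Assumes every value v satisfies 1 <= v < length, so that v can be used as
// an index and a negative sign can serve as a "seen" mark.
// The array is modified in place.
if (length <= 1) {
return; // fewer than two elements: no duplicates possible
}
for (int i = 0; i < length; i++) {
if (inputArray[Math.abs(inputArray[i])] >= 0) {
// first time this value is seen: mark its slot by negating it
inputArray[Math.abs(inputArray[i])] = -inputArray[Math.abs(inputArray[i])];
} else {
// slot already marked: the value has been seen before
// (a value occurring k times is printed k-1 times)
System.out.print(Math.abs(inputArray[i]) + " ");
}
}
}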
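The trick in the Java method above is to use the sign of `array[v]` as a "value v already seen" flag, which needs no extra set but only works under a strong assumption. A Python sketch that makes the assumption explicit (every value v must satisfy 1 <= v < len(values); the function name and the defensive copy are mine):

```python
def duplicates_by_sign_marking(values):
    """Report repeated values using the sign of marks[v] as a 'seen' flag.
    Requires 1 <= v < len(values) for every value v."""
    marks = list(values)          # work on a copy so the caller's list is untouched
    n = len(marks)
    if any(not (1 <= v < n) for v in marks):
        raise ValueError("every value must be in the range 1..len(values)-1")
    found = []
    for i in range(n):
        idx = abs(marks[i])       # original value, even if this slot was negated
        if marks[idx] > 0:
            marks[idx] = -marks[idx]   # first sighting: mark the slot
        else:
            found.append(idx)          # slot already marked: a repeat
    return found


print(duplicates_by_sign_marking([1, 3, 2, 3, 1]))  # [3, 1]
```

A value that occurs k times is reported k-1 times, so deduplicate the result if each duplicate should appear only once.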

Resources