The problem: in a 2D histogram with N columns, count the number of rectangles with area ≥ K. The columns have width 1, and I know the number of unit squares in the i-th column.
I've come up with the following O(N²) algorithm: let h_i be the height of the i-th column. When I fix columns i..j as the bottom side of the rectangle, I find the greatest possible height h of the rectangle (the minimum of h_i, ..., h_j) and add max(0, h - ceil(K/(j-i+1)) + 1) to the answer.
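The O(N²) approach described above can be sketched as follows. This is only an illustration under stated assumptions: rectangles rest on the base of the histogram, K ≥ 1, and the function name `count_rectangles` is made up for this example.

```python
def count_rectangles(heights, k):
    """Count rectangles resting on the base with area >= k (assumes k >= 1)."""
    n = len(heights)
    total = 0
    for i in range(n):
        h = heights[i]  # running minimum height over columns i..j
        for j in range(i, n):
            h = min(h, heights[j])
            width = j - i + 1
            min_height = -(-k // width)  # integer ceil(k / width)
            # every height from min_height up to h gives a valid rectangle
            total += max(0, h - min_height + 1)
    return total
```

Keeping the running minimum while j grows is what avoids a third nested loop.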
I heard there is an O(N log N) algorithm, and I tried to derive it by using the fact
∑_{i=1}^{N} N/i ~ N log N
However, that's all I have and I can't make further progress. Can you give a hint on the algorithm?
You are given a grid of 0's and 1's with dimensions 1 ≤ N,M ≤ 2500 and a number 0 ≤ K ≤ 6. The task is to count the number of rectangles in the grid that contain exactly K ones.
It has to be quicker than O(N^2*M); something like O(NMlog(N+M)) will work. My best approach was a DP with complexity O(N^2*M), but this is too slow. I've been told that the answer is divide and conquer, but I can't get it. Any ideas?
One way to get the log factor is to divide horizontally: add the number of rectangles with K 1s above the dividing line, the number of rectangles with K 1s below the dividing line (both computed recursively), and, for each horizontal extent (there are O(|columns|^2) of them), the number of rectangles with that fixed extent that lie partly above and partly below the dividing line. We compute the last term by splitting K into two-part partitions (at most 7 of them, since K ≤ 6 and one part may be zero). For a fixed width and a fixed bottom or top line, we can binary search upward or downward on matrix prefix sums, which we can precalculate in O(M * N) and query in O(1).
f(vertical_range) =
    f(top_part) +
    f(bottom_part) +
    sum of NumTopRecs(w, i) * NumBottomRecs(w, k - i)
        where i <- [0...k],
              w <- all O(M^2) widths
The trick is that in each recursive call, we rotate the half that's passed to f so that the horizontal divider line becomes vertical, which means we've cut the M for our call (from which we draw O(M^2) widths) in half.
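The matrix prefix sums mentioned in this answer can be sketched as below. This covers only the O(M * N) precomputation and the O(1) lookup, not the divide-and-conquer itself; the names `build_prefix` and `ones_in` are hypothetical.

```python
def build_prefix(grid):
    """pre[r][c] = number of ones in rows 0..r-1 and columns 0..c-1."""
    rows, cols = len(grid), len(grid[0])
    pre = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            pre[r + 1][c + 1] = (grid[r][c] + pre[r][c + 1]
                                 + pre[r + 1][c] - pre[r][c])
    return pre


def ones_in(pre, r1, c1, r2, c2):
    """Ones in rows r1..r2-1 and columns c1..c2-1 (half-open), in O(1)."""
    return pre[r2][c2] - pre[r1][c2] - pre[r2][c1] + pre[r1][c1]
```

Because the count of ones is monotone as a rectangle grows in one direction, a binary search over `ones_in` can locate where the count first reaches a given value.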
Could anyone suggest a dynamic programming approach to solve the SPOJ problem "A STANDARD PROBLEM"? Link: http://www.spoj.com/problems/ASTDPROB/
Problem statement:
Given a boolean matrix of size N×M, answer Q queries of the form (int low, int high): for each query, find the largest-area subrectangle that contains only zeros and lies between the rows numbered low and high.
1 ≤ N, M ≤ 1000
1 ≤ Q ≤ 10^6
I need an O(n^2) or O(n^2 * log n) DP algorithm.
My approach so far goes like this:
I precompute the sides of the maximum subrectangle starting at each cell (i,j) in almost O(n^2) time using DP.
I store the answer for each query in a grid ans[M][M] (currently I do this in O(n^3), around 10^9 atomic operations, which is impossible in 1 s).
Then I answer all queries in O(1) each.
Can you suggest any optimization for the second step?
If anyone has a more efficient approach, please share that as well.
Thanks in advance.
Let M be the matrix of 0s and 1s.
Compute a matrix S, where S[k][l] is the number of consecutive zeros in column l starting at M[k][l] and running downward. This will take O(n^2).
Now for a given query (lo,hi) you can go from line lo to line hi, and for each line find the maximum rectangle from that line to hi in the following way:
- go with a pointer p through the S[line] and keep track of possible heights.
For example, suppose S[line] = [1,2,2,1,5,6,9,2,1,4]. When p = 5 you should have a list of candidate start positions like:
W = [0,4,5]
and from this you can compute the sizes of the rectangles ending at p = 5 (the height is capped by the number of rows left in the query, hi-line+1; the values below assume that cap is at least 6):
min(S[line][W[0]], hi-line+1) * (p-W[0] + 1) = 6
min(S[line][W[1]], hi-line+1) * (p-W[1] + 1) = 10
min(S[line][W[2]], hi-line+1) * (p-W[2] + 1) = 6
EDIT: Well, it seems there are more sophisticated solutions, at least after S is computed. You can consider it as the problem H from:
http://www.informatik.uni-ulm.de/acm/Locals/2003/html/judge.html
There is also a related SO question: Maximize the rectangular area under Histogram
EDIT: How to use the histogram idea.
Let M have following structure
1010100101
0001001001
0001000010
0100000000
then S can be computed bottom-up and in this case is
0301041020
3230330310
2120222202
1011111111
Now to find a rectangle starting from some line till the end we use the 'histogram problem'. For the second line we have: 3230330310 and this corresponds to the histogram of the form
X X XX X
XXX XX X
XXX XX XX
Finding the largest rectangle here gives the largest rectangle in the starting problem.
Complexity: the histogram algorithm is O(n) per line. For each query we check at most n lines, and there are q queries, so the total is O(n^2 q).
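The histogram idea above can be sketched like this. It is a minimal illustration, not the answer's own code: S is filled bottom-up as in the worked example, the largest rectangle uses the usual stack-based O(n) scan, and the names `zeros_below`, `largest_rectangle` and `query` (with 0-based row indices) are assumptions made for this example.

```python
def zeros_below(matrix):
    """S[r][c] = consecutive zeros starting at (r, c) and going down."""
    rows, cols = len(matrix), len(matrix[0])
    s = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1, -1, -1):
        for c in range(cols):
            if matrix[r][c] == 0:
                s[r][c] = 1 + (s[r + 1][c] if r + 1 < rows else 0)
    return s


def largest_rectangle(heights):
    """Largest rectangle under a histogram, O(n) with a stack."""
    stack = []  # indices of bars with strictly increasing heights
    best = 0
    for i, h in enumerate(list(heights) + [0]):  # trailing 0 flushes the stack
        while stack and heights[stack[-1]] >= h:
            top = stack.pop()
            left = stack[-1] + 1 if stack else 0
            best = max(best, heights[top] * (i - left))
        stack.append(i)
    return best


def query(s, lo, hi):
    """Largest all-zero rectangle within rows lo..hi (0-based, inclusive)."""
    best = 0
    for line in range(lo, hi + 1):
        limit = hi - line + 1  # rows available below and including this line
        clipped = [min(v, limit) for v in s[line]]
        best = max(best, largest_rectangle(clipped))
    return best
```

For the second row of the example, 3230330310, `largest_rectangle` returns 6. Each call to `query` costs O(n) per line, which matches the O(n^2 q) total stated above.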
Let's say that a point at coordinate (x1,y1) dominates another point (x2,y2) if x1 ≤ x2 and y1 ≤ y2;
I have a set of points (x1,y1), ..., (xn,yn) and I want to find the total number of dominating pairs. I can do this using brute force by comparing all points against one another, but this takes time O(n²). Instead, I'd like to use a divide-and-conquer approach to solve this in time O(n log n).
Right now, I have the following algorithm:
Draw a vertical line dividing the set of points into two equal subsets Pleft and Pright. As a base case, if there are just two points left, I can compare them directly.
Recursively count the number of dominating pairs in Pleft and Pright
Some conquer step?
The problem is that I can't see what the "conquer" step should be here. I want to count how many dominating pairs there are that cross from Pleft into Pright, but I don't know how to do that without comparing all the points in both parts, which would take time O(n²).
Can anyone give me a hint about how to do the conquer step?
So the two halves of y coordinates are {1,3,4,5,5} and {5,8,9,10,12}.
I draw the division line.
Suppose you sort the points in both halves separately in ascending order by their y coordinates. Now, look at the lowest y-valued point in both halves. If the lowest point on the left has a lower y value than the lowest point on the right, then that point is dominated by all points on the right. Otherwise, the bottom point on the right doesn't dominate anything on the left.
In either case, you can remove one point from one of the two halves and repeat the process with the remaining sorted lists. This does O(1) work per point, so if there are n total points, this does O(n) work (after sorting) to count the number of dominating pairs across the two halves. (If you've seen it before, this is similar to the algorithm for counting inversions in an array.)
Factoring in the time required to sort the points (O(n log n)), this conquer step takes O(n log n) time, giving the recurrence
T(n) = 2T(n / 2) + O(n log n)
This solves to O(n log² n) according to the Master Theorem.
However, you can speed this up. Suppose that before you start the divide-and-conquer step you presort the points by their y coordinates, doing one pass of O(n log n) work. Using tricks similar to the closest pair of points problem, you can then get the points in each half sorted in O(n) time on each subproblem of size n (see the discussion at the bottom of this page for details). That changes the recurrence to
T(n) = 2T(n / 2) + O(n)
Which solves to O(n log n), as required.
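A minimal sketch of an O(n log n) divide-and-conquer count follows. It is a variant of what is described above rather than the exact presorting trick: the points are sorted by x once, and the y-sorted order of each half is produced by the merge itself, as in merge sort. It assumes the points are distinct and counts unordered pairs in which one point has both coordinates less than or equal to the other's; the function name is hypothetical.

```python
def count_dominating_pairs(points):
    """Count pairs (p, q) with p.x <= q.x and p.y <= q.y, points distinct."""
    # After sorting by (x, y), only pairs (i, j) with i < j can qualify,
    # and they qualify exactly when ys[i] <= ys[j].
    ys = [y for _, y in sorted(points)]

    def solve(seq):
        # Returns (seq sorted by y, number of qualifying pairs inside seq).
        if len(seq) <= 1:
            return seq, 0
        mid = len(seq) // 2
        left, count_left = solve(seq[:mid])
        right, count_right = solve(seq[mid:])
        merged = []
        count = count_left + count_right
        i = 0
        for y in right:
            # advance past every left value that is <= y
            while i < len(left) and left[i] <= y:
                merged.append(left[i])
                i += 1
            count += i  # all i left points pair with this right point
            merged.append(y)
        merged.extend(left[i:])
        return merged, count

    return solve(ys)[1]
```

The conquer step is linear because both halves come back already sorted by y, which gives the recurrence T(n) = 2T(n / 2) + O(n).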
Hope this helps!
Well in this way you have O(n^2) just for division to subsets...
My approach would be different
sort points by X ... O(n.log(n))
now check for Y
but check only points with bigger X (if you sort them ascending then with larger index)
so now you have O(n.log(n)+(n.n/2))
You can also speed things up further by doing separate X and Y tests and then combining the results, which leads to O(n + 3.n.log(n))
add index attribute to your points
where index = 0xYYYYXXXXh is unsigned integer type
YYYY is index of point in Y-sorted array
XXXX is index of point in X-sorted array
if you have more than 2^16 points, use a data type bigger than 32 bits.
sort points by ascending X and set XXXX part of their index O1(n.log(n))
sort points by ascending Y and set YYYY part of their index O2(n.log(n))
sort points by ascending index O3(n.log(n))
now point i dominates any point j if (i < j)
but if you actually need to create all the pairs for every point
that would take O4(n.n/2), so this approach will not save any time
if you need just a single pair for any point, then a simple loop will suffice: O4(n-1)
so in this case O(n-1+3.n.log(n)) -> ~O(n+3.n.log(n))
Hope it helped. Of course, if you are stuck with the subdivision approach, then I have no better solution for you.
PS. For this you do not need any additional recursion, just three sorts and a single uint per point, so the memory requirements are not that big, and it should generally be faster than the recursive subdivision approach.
This algorithm runs in O(N*log(N)) where N is the size of the list of points and it uses O(1) extra space.
Perform the following steps:
Sort the list of points by y-coordinate (ascending order), break ties by
x-coordinate (ascending order).
Go through the sorted list in reverse order to count the dominating points:
if the current x-coordinate >= max x-coordinate value encountered so far
then increment the result and update the max.
This works because if all points with a greater y-coordinate have a smaller x-coordinate than the current point, you have found a dominating point. The sorting step makes it really efficient.
Here's the Python code:
def count_dom_points(points):
    # Sort by y ascending, breaking ties by x ascending.
    # (Python 3 removed the cmp= argument of sort, so a key tuple is used.)
    points.sort(key=lambda p: (p[1], p[0]))
    maxi = float('-inf')
    count = 0
    # Scan from the largest y down, tracking the largest x seen so far.
    for x, y in reversed(points):
        if x >= maxi:
            count += 1
            maxi = x
    return count
We are given an N-dimensional matrix of order [m][m][m]... (n times), where each position contains the sum of its indices.
For example in 6x6 matrix A, value at position A[3][4] will be 7.
We have to find out the total number of counts of elements greater than x. For 2 dimensional matrix we have following approach:
If we know one index, say [i][j] with i+j = x, then we can trace a diagonal by repeatedly doing [i++][j--] or [i--][j++], with the constraint that i and j always stay in the range 0 to m.
For example in two dimensional matrix A[6][6] for value A[3][4] (x = 7), diagonal can be created via:
A[1][6] -> A[2][5] -> A[3][4] -> A[4][3] -> A[5][2] -> A[6][1]
Here we have converted our problem into another problem which is count the element below the diagonal including the diagonal.
We can easily count these in O(m) instead of spending O(m^2), where 2 is the order of the matrix.
But how do we do this for an N-dimensional matrix? Suppose we know the index of a location
where the sum of the indices is x, say A[i1][i2][i3][i4]....[in].
Then there may be multiple diagonals that satisfy the condition: after doing i1-- we can increment any of {i2, i3, i4....in}.
So the approach used above for the 2-dimensional matrix is useless here, because it relies on there being only two variable quantities, i1 and i2.
Please help me to find solution
For 2D: count of the elements below diagonal is triangular number.
For 3D: count of the elements below diagonal plane is tetrahedral number
Note that Kth tetrahedral number is the sum of the first K triangular numbers.
For nD: the n-simplexial number (I don't know the exact English term), which is the sum of the first K (n-1)-simplexial numbers.
The value of Kth n-simplexial is
S(k, n) = k * (k+1) * (k+2).. (k + n - 1) / n! = BinomialCoefficient(k+n-1, n)
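The closed form above can be checked with a few lines of Python. This is just a sketch; `simplex_number` is a hypothetical name and `math.comb` needs Python 3.8 or later.

```python
from math import comb


def simplex_number(k, n):
    """K-th n-simplex number: k*(k+1)*...*(k+n-1) / n! = C(k+n-1, n)."""
    return comb(k + n - 1, n)


# n = 2 gives triangular numbers, n = 3 gives tetrahedral numbers
triangular = [simplex_number(k, 2) for k in range(1, 6)]
tetrahedral = [simplex_number(k, 3) for k in range(1, 6)]
```

With 0-based indices, `simplex_number(k, n)` equals the number of index tuples whose sum is at most k - 1, as long as that bound does not exceed the largest single index (otherwise the simplex is clipped by the matrix boundary).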
Edit: this method works "as is" for limited values of X below main anti-diagonal (hyper)plane.
Generating function approach:
Suppose we have the polynomial
A(s)=1+s+s^2+s^3+..+s^m
then its nth power
B(s) = A^n(s) has an important property: the coefficient of s^k is the number of ways to compose k from n summands (each between 0 and m). So the sum of the nth through kth coefficients gives us the count of the elements below the kth diagonal.
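The generating-function idea can be sketched with plain list convolution. The assumptions here are mine: each of the n indices ranges over 0..m (matching A(s) as written), and the function names are hypothetical.

```python
def sum_counts(m, n):
    """Coefficients of (1 + s + ... + s^m)^n.

    coeffs[k] is the number of n-tuples with entries in 0..m that sum to k.
    """
    coeffs = [1]  # the polynomial 1, i.e. A(s)^0
    for _ in range(n):
        nxt = [0] * (len(coeffs) + m)
        for i, a in enumerate(coeffs):
            for j in range(m + 1):  # multiply by 1 + s + ... + s^m
                nxt[i + j] += a
        coeffs = nxt
    return coeffs


def count_greater_than(m, n, x):
    """Number of matrix elements whose index sum exceeds x (assumes x >= -1)."""
    return sum(sum_counts(m, n)[x + 1:])
```

Unlike the closed-form simplex count, this handles diagonals that are clipped by the matrix boundary, at a cost of roughly O(n² m²) additions for the repeated multiplication.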
For a 2-dimensional matrix, you converted the problem into another problem: counting the elements below the diagonal, including the diagonal.
Try to visualize it for a 3-D matrix. For a 3-dimensional matrix, the problem reduces to counting the elements below the diagonal plane, including the diagonal plane itself.
Provided a set of N connected lines on a 2D axis, I am looking for an algorithm which will determine the X minimal bounding rectangles.
For example, suppose I am given 10 lines and I would like to bound them with at most 3 (potentially intersecting) rectangles. So if 8 of the lines are clustered closely together, they may use 1 rectangle, and the other two may use a 2nd or perhaps also a 3rd rectangle depending on their proximity to each other.
Thanks.
If the lines are actually a path, then perhaps you wouldn't be averse to the requirement that each rectangle cover a contiguous portion of the path. In this case, there's a dynamic program that runs in time O(n² r), where n is the number of segments and r is the number of rectangles.
Compute a table with entries C(i, j) denoting the cost of covering segments 1, …, i with j rectangles. The base cases and the recurrence are
C(0, 0) = 0
C(i, 0) = ∞ for i > 0
C(i, j) = min over i' < i of (C(i', j - 1) + [cost of the rectangle covering segments i' + 1, …, i]) for i, j > 0
There are O(n r) entries, each of which is computed in time O(n). Recover the optimal collection of rectangles at the end by, e.g., storing the best i' for each entry.
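Here is a rough sketch of that dynamic program. The answer leaves the cost function open, so this example assumes the cost of a rectangle is the area of the axis-aligned bounding box of the segments it covers. The path is given by its vertices, with segment i joining vertex i-1 and vertex i, and C(0, j) for j > 0 is treated as ∞ (every rectangle covers at least one segment). All names are hypothetical.

```python
import math


def min_cover_cost(points, r):
    """Minimum total bounding-box area covering a path with at most r rectangles."""
    n = len(points) - 1  # number of segments
    if n <= 0:
        return 0
    # C[i][j] = cost of covering segments 1..i with exactly j rectangles
    C = [[math.inf] * (r + 1) for _ in range(n + 1)]
    C[0][0] = 0
    for i in range(1, n + 1):
        # Grow the bounding box of segments i'+1..i as i' decreases.
        min_x = max_x = points[i][0]
        min_y = max_y = points[i][1]
        for ip in range(i - 1, -1, -1):
            x, y = points[ip]
            min_x, max_x = min(min_x, x), max(max_x, x)
            min_y, max_y = min(min_y, y), max(max_y, y)
            area = (max_x - min_x) * (max_y - min_y)
            for j in range(1, r + 1):
                C[i][j] = min(C[i][j], C[ip][j - 1] + area)
    return min(C[n][1:])  # "at most r": best over 1..r rectangles
```

Extending the bounding box incrementally keeps each table row at O(n r) work, for O(n² r) overall. Storing the best i' per entry, as suggested above, would let you recover the rectangles themselves. Note that with area as the cost, axis-parallel segments cost nothing on their own, so a different cost (for example perimeter) may suit some inputs better.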
I don't know of a simple, optimal algorithm for the general case. Since there are “only” O(n⁴) rectangles whose edges each contain a segment endpoint, I would be tempted to formulate this problem as an instance of generalized set cover.