You are given a grid of 0s and 1s with dimensions 1 ≤ N, M ≤ 2500 and a number 0 ≤ K ≤ 6. The task is to count the number of rectangles in the grid that contain exactly K ones.
It has to be quicker than O(N^2 * M); something like O(NM log(N+M)) will work. My best approach was a DP with complexity O(N^2 * M), but this is too slow. I've been told the answer is divide and conquer, but I can't get it. Any ideas?
One way to get the log factor is to divide horizontally: add the number of rectangles with K 1s entirely above the dividing line, the number entirely below it (both computed recursively), and, for each horizontal width (there are O(|columns|^2) of them), the number of rectangles that extend partly above and partly below the dividing line with that fixed width. We compute the last count by splitting K into two-part partitions (at most 7 of them, since K ≤ 6 and one part may be zero). For a fixed width and a fixed top or bottom line, we can binary search up or down on matrix prefix sums, which we precompute in O(M * N) and query in O(1).
f(vertical_range) =
f(top_part) +
f(bottom_part) +
NumTopRecs(w, i) * NumBottomRecs(w, k - i)
where i <- [0...k],
w <- all O(M^2) widths
The trick is that in each recursive call, we rotate the half that's passed to f such that the horizontal divider line becomes vertical, which means we've cut the M for our call (from which we draw O(M^2) widths) in 2.
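For reference, the prefix-sum table mentioned above can be built in O(NM) and queried in O(1). A minimal Python sketch (function and variable names are mine):

```python
def build_prefix(grid):
    """pre[i][j] = number of 1s in grid[0..i-1][0..j-1]."""
    n, m = len(grid), len(grid[0])
    pre = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            pre[i + 1][j + 1] = (grid[i][j] + pre[i][j + 1]
                                 + pre[i + 1][j] - pre[i][j])
    return pre

def rect_sum(pre, r1, c1, r2, c2):
    """Number of 1s in rows r1..r2, cols c1..c2 (inclusive, 0-based)."""
    return (pre[r2 + 1][c2 + 1] - pre[r1][c2 + 1]
            - pre[r2 + 1][c1] + pre[r1][c1])
```

With this in hand, "how many 1s does this candidate rectangle contain?" is a constant-time question, which is what makes the binary searches in the merge step cheap.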
The problem: in a 2D histogram with N columns, count the number of rectangles with area ≥ K. The columns have width 1, and I know the number of unit squares in the i-th column.
I've come up with the following O(N²) algorithm: let h_i be the height of the i-th column. When I fix columns i..j as the bottom side of the rectangle, I find the highest possible height h of the rectangle (the minimum of h_i..h_j) and add max(0, h - ceil(K/(j-i+1)) + 1) to the answer.
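In code, that O(N²) approach reads roughly like this (a Python sketch with names of my choosing; `h` is the list of column heights, and the minimum over columns i..j is maintained incrementally rather than recomputed):

```python
import math

def count_rects(h, K):
    """Count rectangles of area >= K in the histogram with column
    heights h (each column has width 1), per the formula above."""
    n = len(h)
    total = 0
    for i in range(n):
        m = h[i]
        for j in range(i, n):
            m = min(m, h[j])          # min height over columns i..j
            w = j - i + 1             # width of the fixed bottom side
            total += max(0, m - math.ceil(K / w) + 1)
    return total
```

For example, with heights [2, 2] and K = 2 there are four qualifying rectangles: two 1×2 columns, one 2×1 strip, and the full 2×2 block.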
I heard there is an O(N log N) algorithm, and I tried to derive it by using the fact
∑_{i=1}^{N} N/i ~ N log N
However, that's all I have and I can't make further progress. Can you give a hint on the algorithm?
This is from an old Olympiad practice problem:
Imagine you have a 1000x1000 grid, in which the cell (i,j) contains the number i*j. (Rows and columns are numbered starting at 1.)
At each step, we build a new grid from the old one, in which each cell (i,j) contains the "neighborhood average" of (i,j) in the old grid. The "neighborhood average" is defined as the floor of the average of the values of the cell and its up to 8 neighbors. So, for example, if the 4 numbers in the corner of the grid were 1, 2, 5, 7, in the next step the corner would be calculated as floor((1+2+5+7)/4) = 3.
Eventually we'll reach a point where all the numbers are the same and the grid doesn't change anymore. The goal is to figure out how many steps it takes to reach this point.
I tried simply simulating it but that doesn't work, because it seems that the answer is O(n^2) steps and each simulation step takes O(n^2) to process, resulting in O(n^4) which is too slow for n=1000.
Is there a faster way to do it?
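For reference, the straightforward simulation being described looks like this (fine for testing small grids, but O(n⁴) overall for exactly the reason above):

```python
def step(g):
    """One neighborhood-average step: each cell becomes the floor of
    the mean of itself and its up-to-8 in-bounds neighbors."""
    n = len(g)
    new = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = cnt = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < n and 0 <= nj < n:
                        s += g[ni][nj]
                        cnt += 1
            new[i][j] = s // cnt
    return new

def steps_until_stable(n):
    """Count steps until the n x n grid with cell (i,j) = i*j
    (1-based) stops changing."""
    g = [[(i + 1) * (j + 1) for j in range(n)] for i in range(n)]
    count = 0
    while True:
        nxt = step(g)
        if nxt == g:
            return count
        g, count = nxt, count + 1
```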
A slightly faster way can be as follows:
Notice that for any cell (x,y) that is not on the matrix border, its original value is x*y.
Also, the value of the cell after 1st iteration shall be:
V1 = ( xy + x(y+1) + x(y-1)
+(x+1)y + (x+1)(y+1) + (x+1)(y-1)
+(x-1)y + (x-1)(y+1) + (x-1)(y-1)
) / 9
= xy
For the elements on the left vertical edge (not on the corners)
v2 = ( xy + (x-1)y + (x+1)y + x(y+1) + (x-1)(y+1) + (x+1)(y+1) ) / 6
= xy + x/2.
For the elements on the right vertical edge (not on the corners)
v3 = ( xy + (x-1)y + (x+1)y + x(y-1) + (x-1)(y-1) + (x+1)(y-1) ) / 6
= xy - x/2.
Similarly for top and bottom horizontal edges and corners.
Hence after the 1st iteration, only the border elements shall change their value, the non-border elements shall remain the same.
For subsequent iterations, this change shall be propagated from the borders inwards in the matrix.
So one obvious way you can reduce your computations a little is to only change those elements that you expect to get changed in the first N/2 iterations. Note: by doing this the complexity shall not change IMO but the constant factor shall reduce.
Another possible way that you can consider is as follows:
You know that the centre-most element shall be unchanged till N/2 iterations.
So you may think of a way to jump-start your iterations by starting from the centre-most element and working outwards.
That is, if you can find an incremental mathematical formula for the change in elements after N/2 iterations, you may reduce the complexity of your algorithm by a factor of N.
The "floor" step makes me suspect an analytical solution is unlikely, and that this is actually a micro-optimization exercise. Here is my idea.
Let's ignore the corners and edges for a moment. There are only 3996 of them and they will need special treatment anyway.
For an interior cell, you need to add 9 elements to get its next state. But turn that around, and say: Each interior cell has to be part of 8 additions.
Or does it? Start with three consecutive rows A[i], B[i], and C[i], and compute three new rows:
A'[i] = A[i-1] + A[i] + A[i+1]
B'[i] = B[i-1] + B[i] + B[i+1]
C'[i] = C[i-1] + C[i] + C[i+1]
(Note that you can compute each of these slightly faster with a "sliding window", since A'[i+1] = A'[i] - A[i-1] + A[i+2]. Same number of arithmetic operations but fewer loads.)
Now, to get the new value at location B[j], you just compute A'[j] + B'[j] + C'[j].
So far, we have not saved any work; we have just reordered the additions.
But now, having computed the updated row B, you can throw away A' and compute the next row:
D'[i] = D[i-1] + D[i] + D[i+1]
...which you can use with arrays B' and C' to compute the new values for row C without recomputing B' or C'. (You would implement this by shifting row B' and C' to become A' and B', of course... But this way was easier to explain. Maybe. I think.)
For each row, say B, we scan it once to produce B' doing 2n arithmetic operations, and a second time to compute the updated B which also takes 2n operations, so in total we do four additions/subtractions per element instead of eight.
Of course, in practice, you would compute C' while updating B for the same number of operations but better locality.
That's the only structural idea I have. A SIMD optimization expert might have other suggestions...
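Here is a rough Python sketch of that row-sum reordering for the interior cells (border handling and the final division/floor are omitted; this only computes the 3x3 neighborhood sums):

```python
def box3_interior(g):
    """3x3 neighborhood sums for the interior cells of g, computed
    via per-row horizontal sums: roughly 4 adds per cell, not 8."""
    n, m = len(g), len(g[0])
    # Horizontal pass: row_sum[i][j] = g[i][j] + g[i][j+1] + g[i][j+2]
    row_sum = [[row[j - 1] + row[j] + row[j + 1] for j in range(1, m - 1)]
               for row in g]
    # Vertical pass: add three consecutive horizontal sums.
    # out[i][j] is the 3x3 sum centered at g[i+1][j+1].
    return [[row_sum[i - 1][j] + row_sum[i][j] + row_sum[i + 1][j]
             for j in range(m - 2)] for i in range(1, n - 1)]
```

In a tight implementation you would keep only three horizontal-sum rows alive at a time, as described above, rather than materializing all of them.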
If you look at the initial matrix you'll notice that it's symmetric i.e. m[i][j] = m[j][i]. Therefore the neighbors of m[i][j] will have the same values as the neighbors of m[j][i], so you only need to calculate values for a little more than half the matrix for each step.
This optimization reduces the # of calculations per grid from N^2 to ((N^2)+N)/2.
Given a 15*15 symmetric matrix, each row containing all the numbers from 1 to 15 and each column containing all the numbers from 1 to 15, how do you go on to prove that all the diagonal elements will be different?
I tried to prove that no two diagonal elements can be the same, but couldn't come up with anything solid. I even tried it for a 5*5 matrix, but couldn't find a proof there either.
Any help would be appreciated!
This is a problem of symmetric Latin squares. The first observation (which requires a short proof) is that each of the numbers 1 to 15 occurs an even number of times in the off-diagonal positions: by symmetry, an occurrence at (i,j) with i ≠ j is paired with one at (j,i). Since each number occurs 15 times in total and 15 is odd, each number must occur at least once in the diagonal positions. But there are only 15 diagonal positions, so each number must occur exactly once on the diagonal.
If by 'prove' you mean demonstrate for a particular matrix, see below. If by 'prove' you mean mathematically prove, well, all diagonal matrices are symmetric matrices, and a diagonal matrix isn't required to have unique elements, so not all symmetric matrices have unique elements on the diagonal.
One way to test a particular matrix is to make a new array containing all the diagonal elements, then eliminate duplicates in that array, and test the length. Another is to take each diagonal element and compare it against those elements on the diagonal with a higher index. Here's the latter with some pseudocode using 0 based arrays
unique = TRUE
for i = 0 to 14 {
    value = matrix[i][i]
    for j = i+1 to 14 {        // doesn't loop if i+1 > 14
        if (value == matrix[j][j])
            unique = FALSE
    }
}
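In a language with sets, the first approach mentioned above (collect the diagonal, deduplicate, compare lengths) collapses to a one-liner; a small Python sketch:

```python
def diagonal_unique(matrix):
    """True iff all diagonal elements of a square matrix are distinct."""
    diag = [matrix[i][i] for i in range(len(matrix))]
    return len(set(diag)) == len(diag)
```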
ADDED: The OP points out I missed the restriction on the contents of each row and column! All symmetric NxN matrices consisting of N unique values with no duplicated values in each row and column must have an antidiagonal consisting of only one value, by the definition of symmetry. If N is odd, the resulting matrix has a single element that is in both the diagonal and antidiagonal (and of course, if N is even, no element is in common). Given this, you can see the diagonal values must differ in each position from the antidiagonal, except in the common element. Combine that with the requirement that each row and each column has N values, and you'll see that the diagonal element must be different for each row. This isn't formal, but I hope it helps.
We can assume the given matrix is m * m, and we should fill the matrix with m distinct numbers: N1, N2 ... Nm.
Because each number appears exactly once in each row, each number shows up m times in the matrix.
Because the matrix is symmetric, each number appears some x times in the section above the diagonal and, by symmetry, the same x times in the section below it. So, excluding the diagonal, each number appears 2*x times, which is an even count.
Therefore, if m is odd, each number must appear an odd number of times on the diagonal, hence at least once; and since there are only m diagonal cells for m numbers, exactly once. If m is even, a number need not appear on the diagonal at all, since 2*x is already even.
Provided a set of N connected lines on a 2D axis, I am looking for an algorithm which will determine the X minimal bounding rectangles.
For example, suppose I am given 10 lines and I would like to bound them with at most 3 (potentially intersecting) rectangles. So if 8 of the lines are clustered closely together, they may use 1 rectangle, and the other two may use a 2nd or perhaps also a 3rd rectangle depending on their proximity to each other.
Thanks.
If the lines actually form a path, then perhaps you wouldn't be averse to the requirement that each rectangle cover a contiguous portion of the path. In this case, there's a dynamic program that runs in time O(n² r), where n is the number of segments and r is the number of rectangles.
Compute a table with entries C(i, j) denoting the cost of covering segments 1, …, i with j rectangles. The recurrence is, for i, j > 0,
C(0, 0) = 0
C(i, 0) = ∞
C(i, j) = min over i' < i of (C(i', j - 1) + [cost of the rectangle covering segments i' + 1, …, i])
There are O(n r) entries, each of which is computed in time O(n). Recover the optimal collection of rectangles at the end by, e.g., storing the best i' for each entry.
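A sketch of this DP in Python, under assumptions of mine: each segment is a pair of endpoints, and the cost of a rectangle is taken to be the area of its group's bounding box (substitute whatever cost you actually want to minimize):

```python
import math

def min_cost(segments, r):
    """segments: list of ((x0,y0),(x1,y1)); r: max rectangle count.
    Minimum total cost covering the segments in contiguous groups,
    per the recurrence C(i, j) above."""
    n = len(segments)

    def rect_cost(lo, hi):
        # Bounding-box area of segments lo..hi-1 (0-based, half-open).
        xs = [x for s in segments[lo:hi] for (x, _) in s]
        ys = [y for s in segments[lo:hi] for (_, y) in s]
        return (max(xs) - min(xs)) * (max(ys) - min(ys))

    INF = math.inf
    C = [[INF] * (r + 1) for _ in range(n + 1)]
    C[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, r + 1):
            for ip in range(i):
                if C[ip][j - 1] < INF:
                    C[i][j] = min(C[i][j], C[ip][j - 1] + rect_cost(ip, i))
    return min(C[n][1:r + 1])   # best over at most r rectangles
```

To recover the rectangles themselves, store the minimizing i' alongside each entry and walk backwards from C(n, j).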
I don't know of a simple, optimal algorithm for the general case. Since there are “only” O(n⁴) rectangles whose edges each contain a segment endpoint, I would be tempted to formulate this problem as an instance of generalized set cover.
Each rectangle is comprised of 4 doubles like this: (x0,y0,x1,y1)
The edges are parallel to the x and y axes
They are randomly placed - they may be touching at the edges, overlapping , or not have any contact
I need to find the area that is formed by their overlap - all the area in the canvas that more than one rectangle "covers" (for example with two rectangles, it would be the intersection)
I understand I need to use sweep line algorithm. Do I have to use a tree structure? What is the easiest way of using sweep line algorithm for this problem?
At first blush it seems that an O(n^2) algorithm should be straightforward, since we can just intersect all pairs of rectangles. However, that creates a double-counting problem: any region covered by 3 rectangles would get counted 3 times! After realizing that, an O(n^2) algorithm doesn't look so easy to me anymore. If you can think of a trivial O(n^2) algorithm, please post it.
Here is an O(n^2 log^2 n) algorithm.
Data structure: Point (p) {x_value, isBegin, isEnd, y_low, y_high, rectid}
[For each point, we have a single x_value, two y_values, and the ID of the rectangle which this point came from]
Given n rectangles, first create 2n points as above using the x_left and x_right values of the rectangle.
Create a list of points, and sort it on x_value. This takes O(n log n) time
Start from the left (index 0) and use a map: insert a rectangle when you see its begin point and remove it when you see its end point.
In other words:
Map m = new HashMap(); // rectangles currently overlapping on the x-axis
for (Point p : sortedPoints) {
    if (p.isBegin()) {
        m.put(p.rectid, p); // m is keyed off of rectangle id
        if (m.size() >= 2) {
            checkOverlappingRectangles(m.values());
        }
    } else {
        m.remove(p.rectid); // O(1) expected for a HashMap
    }
}
Next, we need a function that takes a list of rectangles, knowing that all of them overlap on the x-axis but may or may not overlap on the y-axis. That is in fact the same algorithm again; we just use the transposed data structures, since we are now interested in the y-axis.
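If a slower baseline is acceptable (say, to test a sweep line against), coordinate compression gives a much simpler way to get the same area, at roughly O(n³) cost: sort the distinct x and y values, and for each compressed cell count how many rectangles cover it. A Python sketch:

```python
def multi_covered_area(rects):
    """Total area covered by two or more rectangles.
    rects: list of (x0, y0, x1, y1) with x0 < x1 and y0 < y1."""
    xs = sorted({x for r in rects for x in (r[0], r[2])})
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    area = 0.0
    for i in range(len(xs) - 1):
        for j in range(len(ys) - 1):
            # The midpoint of a compressed cell decides its coverage,
            # since no rectangle edge passes through a cell's interior.
            cx = (xs[i] + xs[i + 1]) / 2
            cy = (ys[j] + ys[j + 1]) / 2
            cover = sum(1 for (x0, y0, x1, y1) in rects
                        if x0 < cx < x1 and y0 < cy < y1)
            if cover >= 2:
                area += (xs[i + 1] - xs[i]) * (ys[j + 1] - ys[j])
    return area
```

The sweep line described above removes the inner loops by maintaining the active set incrementally, which is where the log factors come from.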