Calculate the sum of elements in a matrix efficiently - algorithm

In an interview I was asked how, given an n*m matrix, I would calculate the sum of the values in a given sub-matrix (defined by top-left and bottom-right coordinates).
I was told I could pre-process the matrix.
I was told the matrix could be massive and so could the sub-matrix, so the algorithm had to be efficient. I stumbled a bit and wasn't told the best answer.
Does anyone have a good answer?

This is what Summed Area Tables are for. http://en.wikipedia.org/wiki/Summed_area_table
Your "preprocessing" step is to build a new matrix of the same size, where each entry is the sum of the sub-matrix to the upper-left of that entry. Any arbitrary sub-matrix sum can be calculated by looking up and mixing only 4 entries in the SAT.
EDIT: Here's an example.
For the initial matrix
0 1 4
2 3 2
1 2 7
The SAT is
0 1 5
2 6 12
3 9 22
The SAT is obtained using S(x,y) = a(x,y) + S(x-1,y) + S(x,y-1) - S(x-1,y-1),
where S is the SAT matrix, a is the initial matrix, and any S term with a negative index counts as 0.
If you want the sum of the lower-right 2x2 sub-matrix, the answer is 22 + 0 - 3 - 5 = 14, which is the same as 3 + 2 + 2 + 7. Regardless of the size of the matrix, the sum of a sub-matrix can be found with 4 lookups and 3 arithmetic ops. Building the SAT is linear in the number of cells, O(n*m), and likewise needs only 4 lookups and 3 arithmetic ops per cell.
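As a rough illustration of the recurrence and the four-lookup query, here is a minimal Python sketch. The names `build_sat` and `region_sum` are made up for this example, and indices are 0-based with out-of-range terms treated as 0.

```python
def build_sat(a):
    """Return the summed-area table of a rectangular list of lists."""
    rows, cols = len(a), len(a[0])
    sat = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            up = sat[y - 1][x] if y > 0 else 0
            left = sat[y][x - 1] if x > 0 else 0
            diag = sat[y - 1][x - 1] if y > 0 and x > 0 else 0
            sat[y][x] = a[y][x] + up + left - diag
    return sat


def region_sum(sat, top, left, bottom, right):
    """Sum of a[top..bottom][left..right], inclusive, using at most four lookups."""
    total = sat[bottom][right]
    if top > 0:
        total -= sat[top - 1][right]
    if left > 0:
        total -= sat[bottom][left - 1]
    if top > 0 and left > 0:
        total += sat[top - 1][left - 1]
    return total


matrix = [[0, 1, 4],
          [2, 3, 2],
          [1, 2, 7]]
sat = build_sat(matrix)
print(sat)                          # [[0, 1, 5], [2, 6, 12], [3, 9, 22]]
print(region_sum(sat, 1, 1, 2, 2))  # 14
```

The table is built once; every later query costs the same no matter how large the sub-matrix is.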

You can do it with dynamic programming. Create a matrix dp of size (n+1)*(m+1), with row 0 and column 0 set to 0.
For each i, j where
1 <= i <= n, 1 <= j <= m
dp[i][j] will be:
dp[i][j] = dp[i - 1][j] + dp[i][j - 1] - dp[i - 1][j - 1] + values[i][j]
For each query we have lx, ly, rx, ry, where (lx, ly) is the top-left corner and (rx, ry) is the bottom-right corner of the sub-matrix:
1 ≤ lx ≤ rx ≤ n, 1 ≤ ly ≤ ry ≤ m
sum = dp[rx][ry] - dp[lx - 1][ry] - dp[rx][ly - 1] + dp[lx - 1][ly - 1]
To see how the algorithm works, let O be the top-left corner of the matrix and treat each dp entry as the sum of the rectangle from O to that cell:
OD = dp[rx][ry], OB = dp[lx - 1][ry], OC = dp[rx][ly - 1], OA = dp[lx - 1][ly - 1]
The query is OD - OB - OC + OA: the rectangle OA is removed twice (once as part of OB, once as part of OC), so it is added back once.
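A small Python sketch of this 1-based variant, where an extra zero row and column remove the need for boundary checks. It reads (lx, ly) as the top-left and (rx, ry) as the bottom-right corner, which is what the bounds and the formula imply. The helper names `build_dp` and `query` are only for illustration.

```python
def build_dp(values):
    """dp[i][j] = sum of values over rows 1..i and columns 1..j (1-based)."""
    n, m = len(values), len(values[0])
    dp = [[0] * (m + 1) for _ in range(n + 1)]  # row 0 and column 0 stay 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = (dp[i - 1][j] + dp[i][j - 1]
                        - dp[i - 1][j - 1] + values[i - 1][j - 1])
    return dp


def query(dp, lx, ly, rx, ry):
    """Sum of rows lx..rx and columns ly..ry, all 1-based and inclusive."""
    return dp[rx][ry] - dp[lx - 1][ry] - dp[rx][ly - 1] + dp[lx - 1][ly - 1]


values = [[0, 1, 4],
          [2, 3, 2],
          [1, 2, 7]]
dp = build_dp(values)
print(query(dp, 2, 2, 3, 3))  # 14
```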

Create a new matrix where entry (i,j) is the sum of the elements in the original matrix whose row and column indices are less than or equal to i and j. Then, to find the sum of the elements in the submatrix, you need only a constant number of basic operations on the corners of the submatrix in your sum matrix.
In particular, find the corners top_left, bottom_left, top_right and bottom_right of your sum matrix, where the first three are just outside the submatrix and bottom_right is just inside. Then, your sum will be
bottom_right + top_left - bottom_left - top_right

Below is a sample implementation in C using the Summed Area Table concept explained in one of the answers above.
A Python implementation of the same idea can be found at the link below:
http://www.ardendertat.com/2011/09/20/programming-interview-questions-2-matrix-region-sum/
#include <stdio.h>

int pre[3][3];
int arr[3][3] = {
    {0, 1, 4},
    {2, 3, 2},
    {1, 2, 7}
};

/* Build the summed-area table once, before any queries. */
void preprocess(void)
{
    for (int i = 0; i < 3; i++)
    {
        for (int j = 0; j < 3; j++)
        {
            if (i > 0 && j > 0)
                pre[i][j] = arr[i][j] + pre[i-1][j] + pre[i][j-1] - pre[i-1][j-1];
            else if (i > 0)
                pre[i][j] = arr[i][j] + pre[i-1][j];
            else if (j > 0)
                pre[i][j] = arr[i][j] + pre[i][j-1];
            else
                pre[i][j] = arr[i][j];
        }
    }
}

/* Sum of arr[x1..x2][y1..y2], inclusive; x is the row, y the column. */
int subsum(int x1, int y1, int x2, int y2)
{
    int ans = pre[x2][y2];
    if (x1 > 0) ans -= pre[x1-1][y2];
    if (y1 > 0) ans -= pre[x2][y1-1];
    if (x1 > 0 && y1 > 0) ans += pre[x1-1][y1-1];
    return ans;
}

int main(void)
{
    preprocess();
    printf("%d\n", subsum(1, 1, 2, 2)); /* prints 14 */
    return 0;
}

This should work. Without pre-processing, you have to go through each element in the submatrix to do the addition, and this is the simplest way.
*Note that the following code may not compile, but it is right as pseudocode.
struct Coords{
    int x,y;
};
int SumSubMatrix(Coords topleft, Coords bottomright, int** matrix){
    int localsum = 0;
    for( int i = topleft.x; i <= bottomright.x; i++ ){
        for(int j = topleft.y; j <= bottomright.y; j++){
            localsum += matrix[i][j];
        }
    }
    return localsum;
}
Edit: An alternative pre-processing method is to create another matrix from the original containing the row or column sums. Here's an example:
Original:
0 1 4
2 3 2
1 2 7
Row Matrix:
0 1 5
2 5 7
1 3 10
Column Matrix:
0 1 4
2 4 6
3 6 13
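Now, for each column in the range, take the value at the bottom row and subtract the value just above the top row, like so (here matrix2 is the column matrix; the row matrix works the same way with the roles of x and y swapped):
for( int i = topleft.y; i <= bottomright.y; i++ ){
    localsum += matrix2[bottomright.x][i];
    if( topleft.x > 0 ) localsum -= matrix2[topleft.x - 1][i];
}
Now each query is either O( n ) or O( m ), depending on which matrix you use, instead of O( n * m ).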
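For comparison, a minimal Python sketch of the row-based variant: one running sum per row, then one subtraction per row of the query. The function names are invented for this example, and `top`, `left`, `bottom`, `right` are 0-based inclusive bounds.

```python
def row_prefix_sums(matrix):
    """rows[i][j] = matrix[i][0] + ... + matrix[i][j]."""
    rows = []
    for row in matrix:
        running, out = 0, []
        for v in row:
            running += v
            out.append(running)
        rows.append(out)
    return rows


def sum_with_row_sums(rows, top, left, bottom, right):
    """One lookup pair per row, so the cost is O(number of rows in the query)."""
    total = 0
    for i in range(top, bottom + 1):
        total += rows[i][right] - (rows[i][left - 1] if left > 0 else 0)
    return total


matrix = [[0, 1, 4],
          [2, 3, 2],
          [1, 2, 7]]
rows = row_prefix_sums(matrix)
print(rows)                                 # [[0, 1, 5], [2, 5, 7], [1, 3, 10]]
print(sum_with_row_sums(rows, 1, 1, 2, 2))  # 14
```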

Related

Number of ways to write n as sum of k numbers with restrictions on each part

Title says it all.
I need to split n as sum of k parts where each part ki should be in the range of
1 <= ki <= ri for given array r.
for example -
n = 4, k = 3 and r = [2, 2, 1]
ans = 2
#[2, 1, 1], [1, 2, 1]
Order matters. (2, 1, 1) and (1, 2, 1) are different.
I thought of solving it using the stars and bars method, but because of the upper bound ri I don't know how to approach it.
I implemented a direct recursive function, and it works fine for small values only.
Constraints of the original problem are
1 <= n <= 10^7
1 <= k <= 10^5
1 <= ri <= 51
All calculations will be done modulo a prime.
I found a similar problem elsewhere, but I don't know how to implement it in a program.
My brute-force recursive function -
#include <iostream>
#include <map>
#include <vector>
using namespace std;

#define MAX 1000
const int md = 1e9 + 7;
vector <int> k;                 // upper bound for each part
vector <map<int, int>> mapper;  // memo: mapper[cur][sum]
vector <int> hold;

int solve(int sum, int cur){
    if(cur == (k.size() - 1) && sum >= 1 && sum <= k[cur]) return 1;
    if(cur == (k.size() - 1) && (sum < 1 || sum > k[cur])) return 0;
    if(mapper[cur].find(sum) != mapper[cur].end())
        return mapper[cur][sum];
    int ans = 0;
    int start = 1;
    for(int i=start; i<=k[cur]; ++i){
        int remain = sum - i;
        int seg = (k.size() - cur) - 1;
        if(remain < seg) break;  // not enough left for the remaining parts
        int res = solve(sum - i, cur + 1);
        ans = (1LL * ans + res) % md;
    }
    mapper[cur][sum] = ans;
    return ans;
}

int main(){
    for(int i=0; i<MAX; ++i) k.push_back(51); // restriction for each part default 51
    mapper.resize(MAX);
    cout << solve(MAX + MAX, 0) << endl;
}
Instead of using a map for storing the results of the computation I used a two-dimensional array, and it gave a very good performance boost, but I cannot use it because of the large n and k values.
How could I improve my recursive function, or what are other ways of solving this problem?
That's an interesting problem.
First, for convenience, let r_i = r_i - 1 and n = n - k, so that each part lies in [0, r_i]. It is also possible to add some fictitious parts with r_i = 0 to make k a power of 2 without changing the answer.
Now let's represent each interval [0, r_i] as the polynomial 1 * x ^ 0 + 1 * x ^ 1 + ... + 1 * x ^ r_i. If we multiply all these polynomials, the coefficient at x ^ n will be the answer.
There is a technique called the Number Theoretic Transform (NTT) which allows multiplying two polynomials modulo p in O(size * log(size)).
If you just multiply them one after another using NTT, the code will run in something like O(n * k * log (k * max(r))). That's very slow.
But now our fictitious parts help. Let's use the divide and conquer technique. We'll make O(log k) steps; on each step we multiply the 2 * i-th and (2 * i + 1)-th polynomials. In the next step we multiply the resulting polynomials of this step.
Each step works in O(k * log(k)), treating max(r) as a constant, and there are O(log(k)) steps, so the algorithm works in O(k * log^2 (k)). It's fast asymptotically, but I'm not sure it fits the time limit for this problem. I think it will take about 20 seconds on the max test.
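To make the polynomial idea concrete, here is a small Python sketch of the divide-and-conquer product. It is only an illustration: it uses plain schoolbook multiplication in place of NTT, so it does not reach the complexity quoted above. The names `multiply`, `product` and `count_ways` and the modulus are assumptions for the example. It also splits the list in halves recursively rather than padding k to a power of 2, which gives the same product.

```python
MOD = 10**9 + 7  # assumed prime modulus, taken from the question's code


def multiply(p, q, limit):
    """Schoolbook product of two coefficient lists, truncated to degree `limit`."""
    out = [0] * min(len(p) + len(q) - 1, limit + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > limit:
                break
            out[i + j] = (out[i + j] + a * b) % MOD
    return out


def product(polys, lo, hi, limit):
    """Multiply polys[lo:hi] by splitting the range in half."""
    if hi - lo == 1:
        return polys[lo]
    mid = (lo + hi) // 2
    return multiply(product(polys, lo, mid, limit),
                    product(polys, mid, hi, limit), limit)


def count_ways(n, r):
    """Number of ordered ways to write n as k parts with 1 <= part_i <= r[i]."""
    k = len(r)
    target = n - k                   # shift every part down by 1
    if target < 0:
        return 0
    polys = [[1] * ri for ri in r]   # 1 + x + ... + x^(ri - 1)
    result = product(polys, 0, k, target)
    return result[target] if target < len(result) else 0


print(count_ways(4, [2, 2, 1]))  # 2
```

Swapping `multiply` for an NTT-based convolution would be the step that brings the running time down.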

What is the maximum water collected between two histograms?

I recently came across this problem:
You are given height of n histograms each of width 1. You have to choose any two histograms such that if it starts raining and all other histograms(except the two you have selected) are removed, then the water collected between the two histograms is maximised.
Input:
9
3 2 5 9 7 8 1 4 6
Output:
25
Between third and last histogram.
This is a variant of Trapping rain water problem.
I tried two solutions but both had worst case complexity of N^2. How can we optimise further.
Sol1: Brute force for every pair.
int maxWaterCollected(vector<int> hist, int n) {
    int ans = 0;
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            ans = max(ans, min(hist[i], hist[j]) * (j - i - 1));
        }
    }
    return ans;
}
Sol2: Keep a sequence of histograms in increasing order of height. For every histogram, find its best histogram in this sequence. now, if all histograms are in increasing order then this solution also becomes N^2.
int maxWaterCollected(vector<int> hist, int n) {
    vector< pair<int, int> > increasingSeq(1, make_pair(hist[0], 0)); // initialised with 1st element.
    int ans = 0;
    for (int i = 1; i < n; i++) {
        // compute best result from current increasing sequence
        for (int j = 0; j < increasingSeq.size(); j++) {
            ans = max(ans, min(hist[i], increasingSeq[j].first) * (i - increasingSeq[j].second - 1));
        }
        // add this histogram to sequence
        if (hist[i] > increasingSeq.back().first) {
            increasingSeq.push_back(make_pair(hist[i], i));
        }
    }
    return ans;
}
Use 2 iterators, one from begin() and one from end() - 1.
Until the 2 iterators are equal:
    Compare the current result with the max, and keep the max.
    Move the iterator with the smaller value (begin -> end or end -> begin).
Complexity: O(n).
Jarod42 has the right idea, but it's unclear from his terse post why his algorithm, described below in Python, is correct:
def candidates(hist):
    l = 0
    r = len(hist) - 1
    while l < r:
        yield (r - l - 1) * min(hist[l], hist[r])
        if hist[l] <= hist[r]:
            l += 1
        else:
            r -= 1

def maxwater(hist):
    return max(candidates(hist))
The proof of correctness is by induction: the optimal solution either (1) belongs to the candidates yielded so far or (2) chooses histograms inside [l, r]. The base case is simple, because all histograms are inside [0, len(hist) - 1].
Inductively, suppose that we're about to advance either l or r. These cases are symmetric, so let's assume that we're about to advance l. We know that hist[l] <= hist[r], so the value of the pair (l, r) is (r - l - 1) * hist[l], and it has just been yielded. Given any other right endpoint r1 with l < r1 < r, the value is (r1 - l - 1) * min(hist[l], hist[r1]), which is no greater because r - l - 1 > r1 - l - 1 and hist[l] >= min(hist[l], hist[r1]). So no solution that uses l as its left endpoint can beat a candidate we already have, and it's safe to advance l.
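The argument can also be sanity-checked empirically. The sketch below compares the two-pointer scan against the brute-force pair search on small random inputs. The function names are made up for this example, and the random test is a check, not a proof.

```python
import random


def max_water_two_pointer(hist):
    """Two-pointer scan: always advance the side with the smaller bar."""
    l, r = 0, len(hist) - 1
    best = 0
    while l < r:
        best = max(best, (r - l - 1) * min(hist[l], hist[r]))
        if hist[l] <= hist[r]:
            l += 1
        else:
            r -= 1
    return best


def max_water_brute(hist):
    """Try every pair (i, j); O(n^2), used only as a reference."""
    n = len(hist)
    return max((min(hist[i], hist[j]) * (j - i - 1)
                for i in range(n) for j in range(i + 1, n)), default=0)


print(max_water_two_pointer([3, 2, 5, 9, 7, 8, 1, 4, 6]))  # 25

rng = random.Random(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    hist = [rng.randint(0, 20) for _ in range(rng.randint(2, 12))]
    assert max_water_two_pointer(hist) == max_water_brute(hist)
```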

Minimize total area using K rectangles in less than O(N^4)

Given an increasing sequence of N numbers (up to T), we can use at most K rectangles (placed starting at position 0) such that, for the i-th value v in the sequence (counting from 0), the positions [v, T) are covered by rectangles with height at least i + 1.
The total area of the rectangles should be the minimum that satisfies the condition above.
Example: given the sequence [0, 3, 4], T = 5 and K = 2 we can use:
a rectangle from 0 to 2 with height 1 (thus having an area of 3)
a rectangle from 3 to 4 with height 3 (thus having an area of 6).
Using at most 2 rectangles, we cannot get a total area smaller than 9.
This problem can be solved using DP.
int dp[MAXK+1][MAXN][MAXN+1];   // cur_value ranges up to N, hence MAXN+1
int seq[MAXN];
int filldp(int cur_idx, int cur_value, int cur_K) {
    int res = dp[cur_K][cur_idx][cur_value];
    if (res != -1) return res;
    res = INF;
    if (cur_idx == N - 1 && cur_value >= N)
        res = min(res, (T - seq[cur_idx]) * cur_value);
    else {
        if (cur_idx < N - 1 && cur_value >= cur_idx + 1) {
            int cur_cost = (seq[cur_idx + 1] - seq[cur_idx]) * cur_value;
            res = min(res, cur_cost + filldp(cur_idx + 1, cur_value, cur_K));
        }
        // Try every possible height for a new rectangle
        if (cur_K < K)
            for (int new_value = cur_value + 1; new_value <= N; new_value++)
                res = min(res, filldp(cur_idx, new_value, cur_K + 1));
    }
    dp[cur_K][cur_idx][cur_value] = res;
    return res;
}
Unsurprisingly, this DP approach is not really fast, probably due to the for loop. However, as far as I can understand, this code should not make more than MAXK * MAXN * MAXN significant calls (i.e., not more than one per cell in dp). MAXK and MAXN are both 200, so dp has 8 million cells, which is not too much.
Am I missing anything?
UPDATE: As pointed out by Saeed Amiri (thank you!), the code makes N^2*K significant calls, but each one is O(N). The whole algorithm is then O(N^3*K), which is O(N^4) when K is comparable to N.
Can we do better?
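One possible direction, offered as a sketch under my own reading of the problem rather than a confirmed answer: assume the rectangles tile [seq[0], T) from left to right, and that a rectangle ending at the j-th value (or at T for j = N) must have height j. Under that reading, cuts only ever need to fall on sequence values, and a DP over (rectangles used, last cut) runs in O(N^2 * K). The name `min_area` is invented for this example.

```python
def min_area(seq, T, K):
    """Minimum total area with at most K rectangles, under the reading above."""
    n = len(seq)
    cuts = list(seq) + [T]   # cuts[n] = T closes the last rectangle
    INF = float("inf")
    # best[j] = min area covering [seq[0], cuts[j]) with the rectangles used so far
    best = [INF] * (n + 1)
    best[0] = 0
    answer = INF
    for _ in range(K):       # add one more rectangle per round
        nxt = [INF] * (n + 1)
        for j in range(1, n + 1):
            for i in range(j):
                if best[i] < INF:
                    # a rectangle over [cuts[i], cuts[j]) needs height j
                    cand = best[i] + (cuts[j] - cuts[i]) * j
                    if cand < nxt[j]:
                        nxt[j] = cand
        answer = min(answer, nxt[n])
        best = nxt
    return answer


print(min_area([0, 3, 4], 5, 2))  # 9
```

On the example from the question this reproduces the area of 9. If the intended problem differs from the reading stated above, the recurrence would need adjusting.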

Perfect square or not?

This is code to check whether a number is a perfect square or not. Why does it work?
static bool IsSquare(int n)
{
    int i = 1;
    for (; ; )
    {
        if (n < 0)
            return false;
        if (n == 0)
            return true;
        n -= i;
        i += 2;
    }
}
Because all perfect squares are sums of consecutive odd numbers:
1 = 1
4 = 1 + 3
9 = 1 + 3 + 5
16 = 1 + 3 + 5 + 7
and so on. Your program attempts to subtract consecutive odd numbers from n, and see if it drops to zero or goes negative.
You can make an informal proof of this by drawing squares with sides of {1,2,3,4,...} and observing that constructing square k+1 from square k requires adding 2k+1 unit squares, since (k+1)^2 - k^2 = 2k+1.
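For anyone who wants to check the claim rather than take it on faith, here is a short Python port of the same loop, compared against an integer square root. The name `is_square` is just for this sketch, and `math.isqrt` needs Python 3.8 or newer.

```python
import math


def is_square(n):
    """Subtract 1, 3, 5, ... from n; n is a perfect square iff we land exactly on 0."""
    odd = 1
    while n > 0:
        n -= odd
        odd += 2
    return n == 0


# Cross-check against the integer square root for a range of inputs.
assert all(is_square(n) == (math.isqrt(n) ** 2 == n) for n in range(10_000))
print([n for n in range(30) if is_square(n)])  # [0, 1, 4, 9, 16, 25]
```

The loop runs about sqrt(n) times, because the k-th partial sum of odd numbers is k^2.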

How to fill a 2D array diagonally based on coordinates

I'm building a heatmap-like rectangular array interface and I want the 'hot' location to be at the top left of the array, and the 'cold' location to be at the bottom right. Therefore, I need an array to be filled diagonally like this:
       0    1    2    3
    |----|----|----|----|
  0 |  0 |  2 |  5 |  8 |
    |----|----|----|----|
  1 |  1 |  4 |  7 | 10 |
    |----|----|----|----|
  2 |  3 |  6 |  9 | 11 |
    |----|----|----|----|
So actually, I need a function f(x,y) such that
f(0,0) = 0
f(2,1) = 7
f(1,2) = 6
f(3,2) = 11
(or, of course, a similar function f(n) where f(7) = 10, f(9) = 6, etc.).
Finally, yes, I know this question is similar to the ones asked here, here and here, but the solutions described there only traverse and don't fill a matrix.
Interesting problem if you are limited to going through the array row by row.
I divided the rectangle into three regions: the top left triangle, the bottom right triangle and the rhomboid in the middle.
For the top left triangle, the values in the first column (x=0) can be calculated using the common arithmetic series 1 + 2 + 3 + .. + n = n*(n+1)/2. Fields in that triangle with the same x+y value are on the same diagonal, and their value is that sum for the first column + x.
The same approach works for the bottom right triangle, but instead of x and y, w-x and h-y are used, where w is the width and h the height of the rectangle. That value has to be subtracted from the highest value w*h-1 in the array.
There are two cases for the rhomboid in the middle. If the width of the rectangle is greater than (or equal to) the height, then the bottom left field of the rectangle is the field with the lowest value in the rhomboid, and it can be calculated using the sum from before for h-1. From there on you can imagine that the rhomboid is a rectangle with an x-value of x+y and a y-value of y from the original rectangle. So calculating the remaining values in that new rectangle is easy.
In the other case, when the height is greater than the width, the field at x=w-1 and y=0 can be calculated using that arithmetic sum, and the rhomboid can be imagined as a rectangle with x-value x and y-value y-(w-x-1).
The code can be optimised, for example by precalculating values. I think there is also a single formula covering all these cases. Maybe I'll think about it later.
inline static int diagonalvalue(int x, int y, int w, int h) {
    if (h > x+y+1 && w > x+y+1) {
        // top/left triangle
        return ((x+y)*(x+y+1)/2) + x;
    } else if (y+x >= h && y+x >= w) {
        // bottom/right triangle
        return w*h - (((w-x-1)+(h-y-1))*((w-x-1)+(h-y-1)+1)/2) - (w-x-1) - 1;
    }
    // rhomboid in the middle
    if (w >= h) {
        return (h*(h+1)/2) + ((x+y+1)-h)*h - y - 1;
    }
    return (w*(w+1)/2) + ((x+y)-w)*w + x;
}

for (y=0; y<h; y++) {
    for (x=0; x<w; x++) {
        array[x][y] = diagonalvalue(x,y,w,h);
    }
}
Of course if there is not such a limitation, something like that should be way faster:
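n = w*h;
x = 0;
y = 0;
for (i=0; i<n; i++) {
    array[x][y] = i;
    if (y <= 0 || x+1 >= w) {
        // end of a diagonal: jump to the start of the next one
        y = x+y+1;
        if (y >= h) {
            x = (y-h)+1;
            y -= x;
        } else {
            x = 0;
        }
    } else {
        // step up and to the right along the current diagonal
        x++;
        y--;
    }
}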
What about this (having an NxN matrix):
int count = 1;
for( int k = 0; k < 2*N-1; ++k ) {
    int max_i = std::min(k,N-1);
    int min_i = std::max(0,k-N+1);
    for( int i = max_i, j = min_i; i >= min_i; --i, ++j ) {
        M.at(i).at(j) = count++;
    }
}
Follow the steps in the 3rd example -- this gives the indexes (in order to print out the slices) -- and just set the value with an incrementing counter:
int x[3][3];
int n = 3;
int pos = 1;
for (int slice = 0; slice < 2 * n - 1; ++slice) {
    int z = slice < n ? 0 : slice - n + 1;
    for (int j = z; j <= slice - z; ++j)
        x[j][slice - j] = pos++;
}
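The square-matrix loops above can be generalised to a w x h rectangle in a few lines if speed is not critical. The sketch below (in Python, with an invented name `diagonal_fill`) orders the cells by diagonal x + y and then by x, and numbers them in that order. It costs O(w*h*log(w*h)) because of the sort, so it is an illustration of the ordering rather than the fastest way to fill the array.

```python
def diagonal_fill(w, h):
    """Return grid[y][x] numbered along anti-diagonals, starting at the top left."""
    cells = sorted(((x, y) for y in range(h) for x in range(w)),
                   key=lambda c: (c[0] + c[1], c[0]))  # diagonal first, then x
    grid = [[0] * w for _ in range(h)]
    for value, (x, y) in enumerate(cells):
        grid[y][x] = value
    return grid


for row in diagonal_fill(4, 3):
    print(row)
# [0, 2, 5, 8]
# [1, 4, 7, 10]
# [3, 6, 9, 11]
```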
In an M*N matrix, the values, when traversed as in your stated example, seem to increase by N as the first index increases, except for border cases, so
f(0,0)=0
f(1,0)=f(0,0)+2
f(2,0)=f(1,0)+3
...and so on up to f(N,0). Then
f(0,1)=1
f(0,2)=3
and then
f(m,n)=f(m-1,n)+N, where m,n are index variables
and
f(M,N)=f(M-1,N)+2, where M,N are the last indexes of the matrix
This is not conclusive, but it should give you something to work with. Note, that you only need the value of the preceding element in each row and a few starting values to begin.
If you want a simple function, you could use a recursive definition.
H = height
def get_point(x,y)
    if x == 0
        if y == 0
            return 0
        else
            return get_point(y-1,0)+1
        end
    else
        return get_point(x-1,y) + H
    end
end
This takes advantage of the fact that, away from the corners where the diagonals are shorter than H, any value is H + the value of the item to its left. If the item is already in the leftmost column, then you find the cell at the far upper-right end of its diagonal, move left from there, and add 1.
This is a good chance to use dynamic programming, and "cache" or memoize the values you've already computed.
If you want something "strictly" done by f(n), you could use the relationship:
(x, y) = ( n % W , n / W ) [integer division, with no remainder/decimal]
And work your function from there.
Alternatively, if you want a purely array-populating-by-rows method, with no recursion, you could follow these rules:
If you are on the first cell of the row, "remember" the item in the cell (R-1) (where R is your current row) of the first row, and add 1 to it.
Otherwise, simply add H to the cell you last computed (ie, the cell to your left).
Pseudo-Code: (Assuming array is indexed by arr[row,column])
arr[0,0] = 0
for R from 0 to H-1
    if R > 0
        arr[R,0] = arr[0,R-1] + 1
    end
    for C from 1 to W-1
        arr[R,C] = arr[R,C-1] + H
    end
end

Resources