Fastest way to find sum of any rectangle in matrix - algorithm

I have an m x n matrix and want to be able to calculate sums of arbitrary rectangular submatrices. This will happen several times for the given matrix. What data structure should I use?
For example, I want to find the sum of a rectangle (here the second and third rows) in the matrix
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Sum is 68.
What I'll do is accumulate it down the columns, row by row:
1 2 3 4
6 8 10 12
15 18 21 24
28 32 36 40
And then, if I want to find the sum of the whole matrix, I just add up 28, 32, 36, 40 = 136. Only four operations instead of 15.
If I want to find the sum of the second and third rows, I just add up 15, 18, 21, 24 and subtract 1, 2, 3, 4: (15+18+21+24) - (1+2+3+4) = 78 - 10 = 68.
But in this case I can use another matrix, accumulating the original one from left to right along each row:
1 3 6 10
5 11 18 26
9 19 30 42
13 27 42 58
and in this case I just sum 26 and 42 = 68. Only 2 operations instead of 8. For wider submatrices it is more efficient to use the second method and matrix; for taller ones, the first. Can I somehow merge these two methods into one matrix?
So I just sum two corners and subtract the other two?

You're nearly there with your method. The solution is to use a summed area table (aka Integral Image):
http://en.wikipedia.org/wiki/Summed_area_table
The key idea is that you make one pass through your matrix and accumulate so that "the value at any point (x, y) in the summed area table is just the sum of all the pixels above and to the left of (x, y), inclusive".
Then you can compute the sum inside any rectangle in constant time with four lookups.
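To make the four lookups concrete, here is a minimal sketch in Python, assuming the matrix is a plain list of lists. The names build_sat and rect_sum are made up for illustration, and the extra zero row and column are just one convenient way to avoid special-casing the top and left borders:

```python
def build_sat(matrix):
    """Build a summed-area table with an extra leading row and column of zeros.

    sat[r + 1][c + 1] holds the sum of matrix[0..r][0..c], inclusive.
    """
    rows, cols = len(matrix), len(matrix[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = (matrix[r][c]
                                 + sat[r][c + 1]   # everything above
                                 + sat[r + 1][c]   # everything to the left
                                 - sat[r][c])      # overlap counted twice
    return sat


def rect_sum(sat, r0, c0, r1, c1):
    """Sum of matrix[r0..r1][c0..c1] (0-based, inclusive) using four lookups."""
    return (sat[r1 + 1][c1 + 1]
            - sat[r0][c1 + 1]
            - sat[r1 + 1][c0]
            + sat[r0][c0])


m = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
sat = build_sat(m)
print(rect_sum(sat, 1, 0, 2, 3))  # second and third rows of the example: 68
print(rect_sum(sat, 0, 0, 3, 3))  # whole matrix: 136
```

Building the table costs one pass over the matrix; every query after that is constant time regardless of the rectangle's size.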

Why can't you just add them using for loops?
int total = 0;
for (int i = startRow; i <= endRow; i++)
{
    for (int j = startColumn; j <= endColumn; j++)
    {
        total += array[i][j];
    }
}
Where your subarray ("rectangle") would go from startRow to endRow (height) and startColumn to endColumn (width).

Related

Matlab - Algorithm for calculating 1d consecutive line segment edges from midpoints?

So I have a rectilinear grid that can be described with 2 vectors. 1 for the x-coordinates of the cell centres and one for the y-coordinates. These are just points with spacing like x spacing is 50 scaled to 10 scaled to 20 (55..45..30..10,10,10..10,12..20,20,20) and y spacing is 60 scaled to 40 scaled to 60 (60,60,60,55..42,40,40,40..40,42..60,60) and the grid is made like this
e.g. x = 1 2 3 and y = 10 11 12 give
gridx = 1 2 3    gridy = 10 10 10
        1 2 3            11 11 11
        1 2 3            12 12 12
so cell centre 1 is (1,10), cell centre 2 is (2,10), etc.
Now I'm trying to formulate an algorithm to calculate the positions of the cell edges in the x and y directions. My first idea was to get the first edge using x(1)-[x(2)-x(1)]/2; in the real case x(2)-x(1) is equal to 60 and x(1) = 16348.95, so celledge1 = x(1)-30 = 16318.95. After calculating the first one I go through a loop and calculate the rest like this:
for aa = 2:length(x)+1
    celledge1(aa) = x(aa-1) + [x(aa-1)-celledge1(aa-1)];
end
And I did the same for y. This however does not work: in the area where the edge spacing should be 40, my y vector goes 35,45,35,45... approx.
Does anyone have any idea why this doesn't work, and can you point me in the right direction? Cheers
Edit: Tried to find a solution using geometric algebra:
We are trying to find the points A,B,C,....H. From basic geometry we know:
c1 (centre 1) = [A+B]/2 and c2 = [B+C]/2 etc. etc.
So we have 7 equations and 8 variables. We also know that the first few distances between centres are equal (60,60,60,60), therefore the first segment is 60 too.
B - A = 60
So now we have 8 equations and 8 variables so I made this algorithm in Matlab:
edgex = zeros(length(DATA2.x)+1,1);
edgey = zeros(length(DATA2.y)+1,1);
edgex(1) = (DATA2.x(1)*2-diffx(1))/2;
edgey(1) = (DATA2.y(1)*2-diffy(1))/2;
for aa = 2:length(DATA2.x)+1
    edgex(aa) = DATA2.x(aa-1)*2-edgex(aa-1);
end
for aa = 2:length(DATA2.y)+1
    edgey(aa) = DATA2.y(aa-1)*2-edgey(aa-1);
end
And I still got the same answer as before, with the y spacing going 35,45,35,45 where it should be 40,40,40... Could it be an accuracy error?
Edit: here are the numbers if you're interested; I did the same computation as above, only in Excel: http://www.filedropper.com/workoutedges
It seems you're just trying to interpolate your data. You can do this with the built-in interp1
x = [30 24 19 16 8 7 16 22 29 31];
xi = interp1(2:2:numel(x)*2, x, 1:(numel(x)*2+1), 'linear', 'extrap');
This just sets up the original data as the even-indexed elements and interpolates the odd indices, including extrapolation for the two end points.
Results:
xi =
Columns 1 through 11:
33.0000 30.0000 27.0000 24.0000 21.5000 19.0000 17.5000 16.0000 12.0000 8.0000 7.5000
Columns 12 through 21:
7.0000 11.5000 16.0000 19.0000 22.0000 25.5000 29.0000 30.0000 31.0000 32.0000
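For readers who want to see what the interpolation amounts to for the edges alone, here is a minimal pure-Python sketch. It assumes each interior edge is the midpoint of two neighbouring centres and each end edge is linearly extrapolated by half the nearest spacing; the name edges_from_centres is made up for illustration:

```python
def edges_from_centres(centres):
    """Cell edges from cell centres: midpoints inside, half-spacing extrapolation at the ends."""
    if len(centres) < 2:
        raise ValueError("need at least two centres to infer a spacing")
    # Each interior edge lies halfway between two neighbouring centres.
    inner = [(a + b) / 2 for a, b in zip(centres, centres[1:])]
    # Extrapolate the outer edges by half of the nearest spacing.
    first = centres[0] - (centres[1] - centres[0]) / 2
    last = centres[-1] + (centres[-1] - centres[-2]) / 2
    return [first] + inner + [last]


x = [30, 24, 19, 16, 8, 7, 16, 22, 29, 31]
print(edges_from_centres(x))
# [33.0, 27.0, 21.5, 17.5, 12.0, 7.5, 11.5, 19.0, 25.5, 30.0, 32.0]
```

These are the odd-indexed entries of xi above. Each edge depends only on its two neighbouring centres, so an error in the first edge is not carried along the grid the way it is in the recursive formula from the question.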

Make an n x n-1 matrix from 1 x n vector where the i-th row is the vector without the i-th element, without a for loop

I need this for Lagrange polynomials. I'm curious how one would do this without a for loop. The code currently looks like this:
tj = 1:n;
ti = zeros(n,n-1);
for i = 1:n
ti(i,:) = tj([1:i-1, i+1:end]);
end
My tj is not really just a 1:n vector but that's not important. While this for loop gets the job done, I'd rather use some matrix operation. I tried looking for some appropriate matrices to multiply it with, but no luck so far.
Here's a way:
v = [10 20 30 40]; %// example vector
n = numel(v);
M = repmat(v(:), 1, n);
M = M(~eye(n));
M = reshape(M,n-1,n).';
gives
M =
20 30 40
10 30 40
10 20 40
10 20 30
This should generalize to any n:
ti = flipud(reshape(repmat(1:n, [n-1 1]), [n n-1]));
Taking a closer look at what's going on: if you look at the resulting matrix closely, you'll see that it's n-1 1's, n-1 2's, etc. from the bottom up.
For the case where n is 3:
ti =
2 3
1 3
1 2
So we can flip this vertically and get
f = flipud(ti);
1 2
1 3
2 3
Really this is [1, 2, 3; 1, 2, 3] reshaped to be 3 x 2 rather than 2 x 3.
In that line of thinking
a = repmat(1:3, [2 1])
1 2 3
1 2 3
b = reshape(a, [3 2]);
1 2
1 3
2 3
c = flipud(b);
2 3
1 3
1 2
We are now back to where you started when we bring it all together and replace 3's with n and 2's with n-1.
Here's another way. First create a matrix in which every row is the vector tj, with the rows stacked on top of one another. Next, extract the lower and upper triangular parts of the matrix without the diagonal, then add the results together, making sure to remove the last column of the lower triangular matrix and the first column of the upper triangular matrix.
n = numel(tj);
V = repmat(tj, n, 1);
L = tril(V,-1);
U = triu(V,1);
ti = L(:,1:end-1) + U(:,2:end);
numel finds the total number of values in tj which we store in n. repmat facilitates the stacking of the vector tj to create a matrix that is n x n large. After, we use tril and triu so that we extract the lower and upper triangular parts of the matrices without the diagonal. In addition, the rest of the matrix is all zero except for the relevant triangular parts. The -1 and 1 flags for tril and triu respectively extract this out successfully while ensuring that the diagonal is all zero. This creates a column of extra zeroes appearing at the last column when calling tril and the first column when calling triu. The last part is to simply add these two matrices together ignoring the last column of the tril result and the first column of the triu result.
Given that tj = [10 20 30 40]; (borrowed from Luis Mendo's example), we get:
ti =
20 30 40
10 30 40
10 20 40
10 20 30
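The tril/triu bookkeeping can also be checked outside MATLAB. Here is a minimal pure-Python sketch, assuming nested lists stand in for the matrices; the name rows_without_ith is made up for illustration, and the comprehensions play the role of repmat, tril and triu:

```python
def rows_without_ith(tj):
    """Return an n x (n-1) list of lists whose i-th row is tj without its i-th element."""
    n = len(tj)
    # V: every row is a copy of tj (the repmat step).
    V = [list(tj) for _ in range(n)]
    # L: strictly lower triangular part, zeros elsewhere (like tril(V, -1)).
    L = [[V[i][j] if j < i else 0 for j in range(n)] for i in range(n)]
    # U: strictly upper triangular part, zeros elsewhere (like triu(V, 1)).
    U = [[V[i][j] if j > i else 0 for j in range(n)] for i in range(n)]
    # Drop the last column of L and the first column of U, then add them.
    return [[L[i][j] + U[i][j + 1] for j in range(n - 1)] for i in range(n)]


for row in rows_without_ith([10, 20, 30, 40]):
    print(row)
# [20, 30, 40]
# [10, 30, 40]
# [10, 20, 40]
# [10, 20, 30]
```

Adding the two shifted triangles works because, in every position of the result, exactly one of them is non-zero.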

Matlab best match of a sequence within a matrix

I want to find the best match of a sequence of integers within an NxN matrix. The problem is that I don't know how to extract the position of this best match. The following code should calculate the edit distance, but I would like to know where in my grid that edit distance is shortest!
function res = searchWordDistance(word,grid)
% wordsize = length(word); % extract the actual size
% [x ,y] = find(word(1) == grid);
D(1,1,1)=0;
for i=2:length(word)+1
    D(i,1,1) = D(i-1,1,1)+1;
end
for j=2:length(grid)
    D(1,1,j) = D(1,1,j-1)+1;
    D(1,j,1) = D(1,j-1,1)+1;
end
% inspect the grid for best match
for i=2:length(word)
    for j=2:length(grid)
        for z=2:length(grid)
            if(word(i-1)==grid(j-1,z-1))
                d = 0;
            else
                d=1;
            end
            c1=D(i-1,j-1,z-1)+d;
            c2=D(i-1,j,z)+1;
            c3=D(i,j-1,z-1)+1;
            D(i,j,z) = min([c1 c2 c3]);
        end
    end
end
I have used this code (in one less dimension) to compare two strings.
EDIT: Using a 5x5 matrix as an example:
15 17 19 20 22
14 8 1 15 24
11 4 17 3 2
14 2 1 14 8
19 23 5 1 22
Now if I have the sequences [4,1,1] and [15,14,12,14], they should be found using the algorithm. The first one is a perfect match (the diagonal starting at (3,2)). The second one is in the first column and is the closest match for that sequence, since only one number is wrong.

Cumulative summation of value in each row

I have something like the following:
a = [1 11; 2 16; 3 9; 4 13; 5 8; 6 14];
b = a;
n = length(a);
Sum = [];
for i=1:1:n,
Sum = b(i,2)+b(i+1:1:n,2)
end
b =
1 11
2 16
3 9
4 13
5 8
6 14
For the first iteration I am looking to find the first combination of values in the second column whose sum is between 19 and 25.
Sum =
27
20
24
19
25
Since 20 is that first combination (rows 1 & 3), I would like to remove that data and start a new matrix, or signify that it is the first combination (i.e. place a 1 next to it by creating a third column).
The next step would be to sum the values which are still in the matrix with row 2 value:
Sum =
29
24
30
Then 2&5 would be combined.
However, I would like to allow not only pairs to be combined but also several rows if possible.
Is there something I am overlooking that may simplify this problem?
I don't think you're going to simplify this very much. It's a variation on the knapsack problem, which is NP-hard. The best algorithm to use might depend on the size of your inputs.
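The pairwise procedure described in the question can at least be written down compactly. Here is a minimal pure-Python sketch, assuming the bounds 19 and 25 are inclusive and that only pairs are combined (the general multi-row case is the hard part mentioned above); the name first_fit_pairs is made up for illustration:

```python
def first_fit_pairs(values, low, high):
    """Greedily pair each unused row with the first later unused row whose sum is in [low, high]."""
    used = [False] * len(values)
    pairs = []
    for i in range(len(values)):
        if used[i]:
            continue
        for j in range(i + 1, len(values)):
            if not used[j] and low <= values[i] + values[j] <= high:
                used[i] = used[j] = True
                pairs.append((i + 1, j + 1))  # 1-based row numbers, as in the question
                break
    return pairs


second_column = [11, 16, 9, 13, 8, 14]
print(first_fit_pairs(second_column, 19, 25))  # [(1, 3), (2, 5)]
```

This reproduces the two combinations from the question (rows 1 & 3, then rows 2 & 5) and leaves rows 4 and 6 unpaired, since 13 + 14 = 27 is out of range. Being greedy, it makes no attempt to maximise the number of groups.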

Summation of difference between matrix elements

I am in the process of building a function in MATLAB. As a part of it I have to calculate differences between elements in two matrices and sum them up.
Let me explain considering two matrices,
1 2 3 4 5 6
13 14 15 16 17 18
and
7 8 9 10 11 12
19 20 21 22 23 24
The calculations in the first row - only four elements in both matrices are considered at once (zero indicates padding):
(1-8)+(2-9)+(3-10)+(4-11): This replaces 1 in initial matrix.
(2-9)+(3-10)+(4-11)+(5-12): This replaces 2 in initial matrix.
(3-10)+(4-11)+(5-12)+(6-0): This replaces 3 in initial matrix.
(4-11)+(5-12)+(6-0)+(0-0): This replaces 4 in initial matrix. And so on
I am unable to decide how to code this in MATLAB. How do I do it?
I use the following equation.
Here i ranges from 1 to n(h), where n(h) is the number of distant pairs. It depends on the lag distance chosen. So if I choose a lag distance of 1, n(h) will be the number of elements - 1.
When I use a 7 X 7 window, considering the central value, n(h) = 4 - 1 = 3 which is the case here.
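You may want to look at the circshift() function:
a = [1 2 3 4; 9 10 11 12];
b = [5 6 7 8; 12 14 15 16];
for k = 1:3
    b = circshift(b, [0 -1]);  % shift every row one column to the left
    b(:, end) = 0;             % zero-fill the column that wrapped around
    diff = sum(a - b, 2)       % row sums of the differences (displayed, no semicolon)
end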
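The windowed calculation written out in the question can also be expressed directly. Here is a minimal pure-Python sketch for a single row, assuming a window of four elements, a lag of one column, and zero padding past the end of either row, as in the four sums listed in the question; the name lagged_window_sums and its parameters are made up for illustration:

```python
def lagged_window_sums(a_row, b_row, window=4, lag=1):
    """For each start index i, sum a_row[i + k] - b_row[i + k + lag] over k = 0..window-1.

    Indices past the end of a row are treated as zero (padding).
    """
    def at(row, idx):
        return row[idx] if 0 <= idx < len(row) else 0

    return [
        sum(at(a_row, i + k) - at(b_row, i + k + lag) for k in range(window))
        for i in range(len(a_row))
    ]


a_row = [1, 2, 3, 4, 5, 6]
b_row = [7, 8, 9, 10, 11, 12]
result = lagged_window_sums(a_row, b_row)
print(result[:4])  # [-28, -28, -15, -8]
```

The first four values correspond to the four sums spelled out in the question. How the remaining positions should be padded is only implied by "and so on", so the later entries are an extrapolation of that pattern.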
