Summation of difference between matrix elements - matrix

I am in the process of building a function in MATLAB. As a part of it I have to calculate differences between elements in two matrices and sum them up.
Let me explain considering two matrices,
1 2 3 4 5 6
13 14 15 16 17 18
and
7 8 9 10 11 12
19 20 21 22 23 24
The calculations in the first row - only four elements in both matrices are considered at once (zero indicates padding):
(1-8)+(2-9)+(3-10)+(4-11): This replaces 1 in initial matrix.
(2-9)+(3-10)+(4-11)+(5-12): This replaces 2 in initial matrix.
(3-10)+(4-11)+(5-12)+(6-0): This replaces 3 in initial matrix.
(4-11)+(5-12)+(6-0)+(0-0): This replaces 4 in initial matrix. And so on
I am unable to decide how to code this in MATLAB. How do I do it?
I use the following equation.
Here i ranges from 1 to n(h), where n(h) is the number of distant pairs. It depends on the lag distance chosen. So if I choose a lag distance of 1, n(h) will be the number of elements - 1.
When I use a 7 X 7 window, considering the central value, n(h) = 4 - 1 = 3 which is the case here.

You may want to look at the circshift() function:
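a = [1 2 3 4; 9 10 11 12];
b = [5 6 7 8; 12 14 15 16];
for k = 1:3
    b = circshift(b, [0 -1]); % shift every row of b one column to the left
    b(:, end) = 0;            % zero-pad the column that wrapped around
    d = sum(a - b, 2)         % row-wise sum of differences (named d so the built-in diff is not shadowed)
end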
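For comparison, here is a minimal Python sketch of the windowed sums as I read them from the question: the value at position i is replaced by the sum of a(i+k) - b(i+k+lag) over a window of four elements, with zeros outside the row. The function name `lagged_window_sums` and the `window` and `lag` parameters are illustrative assumptions, not something from the original posts.

```python
def lagged_window_sums(a_row, b_row, window=4, lag=1):
    """Replace each a_row[i] by sum_k (a_row[i+k] - b_row[i+k+lag]), zero-padded."""
    def at(row, i):
        # Treat anything outside the row as padding (zero).
        return row[i] if 0 <= i < len(row) else 0

    return [
        sum(at(a_row, i + k) - at(b_row, i + k + lag) for k in range(window))
        for i in range(len(a_row))
    ]


first_rows = lagged_window_sums([1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12])
print(first_rows)
```

For the first rows of the two example matrices, the third entry is (3-10)+(4-11)+(5-12)+(6-0) = -15, matching the hand calculation in the question.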


Generic triangular numbers sequence formula

I know that I can get the nth element of the following sequence
1 3 6 10 15 21
With the formula
(n * (n + 1)) / 2
where n is the index of the element I want. How can I generalise the formula to get the nth element of each of the following sequences? By "following sequences" I mean
1 -> 1 3 6 10 15 21
2 -> 2 5 9 14 20
3 -> 4 8 13 19
4 -> 7 12 18
5 -> 11 17
6 -> 16
It is not quite clear what you mean by the n-th element of a (potentially infinite) 2D table.
A simple formula for the element at row r and column c (both numbered from 1):
(r+c-1)*(r+c)/2 - (r-1)
Possible intuition for this formula:
Key observation: the element with coordinates r, c lies on diagonal number d, where d = r + c - 1
There are s = d*(d+1)/2 elements in the first d filled diagonals, so the last element of the d-th diagonal (rightmost, top) has value s, and the element in the r-th row of the same diagonal is
v(r,c) = s-(r-1) = (d)*(d+1)/2 -(r-1) = (r+c-1)*(r+c)/2 - (r-1)
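A quick way to check the formula is to regenerate the table from the question. The sketch below is in Python; the function name `table_value` is an illustrative choice.

```python
def table_value(r, c):
    """Element at row r, column c (both 1-based) of the diagonal-filled table."""
    d = r + c - 1                        # index of the anti-diagonal holding (r, c)
    return d * (d + 1) // 2 - (r - 1)    # last value on that diagonal, minus the row offset


# Rebuild the triangular layout from the question: row r has 7 - r entries.
for r in range(1, 7):
    print(r, "->", [table_value(r, c) for c in range(1, 8 - r)])
```

With r = 1 the formula reduces to c*(c+1)/2, the ordinary triangular numbers.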

Fastest way to find sum of any rectangle in matrix

I have a m x n matrix and want to be able to calculate sums of arbitrary rectangular submatrices. This will happen several times for the given matrix. What data structure should I use?
For example, I want to find the sum of a rectangle (here, the second and third rows) in the matrix
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Sum is 68.
What I do is accumulate it row by row:
1 2 3 4
6 8 10 12
15 18 21 24
28 32 36 40
And then, if I want to find the sum of the whole matrix, I just accumulate 28, 32, 36, 40 = 136. Only four operations instead of 15.
If I want to find the sum of the second and third rows, I just accumulate 15, 18, 21, 24 and subtract 1, 2, 3, 4: (15+18+21+24) - (1+2+3+4) = 78 - 10 = 68.
But in this case I can use another matrix, accumulating this one by columns:
1 3 6 10
5 11 18 26
9 19 30 42
13 27 42 58
and in this case I just sum 26 and 42 = 68. Only 2 operations instead of 8. For wider sub-matrices it is more efficient to use the second method and matrix, for taller ones the first. Can I somehow merge these two methods into one matrix?
So I just sum two corners and subtract the other two?
You're nearly there with your method. The solution is to use a summed area table (aka Integral Image):
http://en.wikipedia.org/wiki/Summed_area_table
The key idea is you do one pass through your matrix and accumulate such that "the value at any point (x, y) in the summed area table is just the sum of all the pixels above and to the left of (x, y), inclusive.".
Then you can compute the sum inside any rectangle in constant time with four lookups.
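A minimal Python sketch of a summed area table follows. The function names `build_summed_area_table` and `rect_sum` are illustrative, and the extra zero row and column are a convenience I am assuming so that no boundary checks are needed; other layouts work equally well.

```python
def build_summed_area_table(matrix):
    """sat[r + 1][c + 1] holds the sum of matrix[0..r][0..c] (inclusive)."""
    rows, cols = len(matrix), len(matrix[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            sat[r + 1][c + 1] = (matrix[r][c] + sat[r][c + 1]
                                 + sat[r + 1][c] - sat[r][c])
    return sat


def rect_sum(sat, r0, c0, r1, c1):
    """Sum of rows r0..r1 and columns c0..c1 (0-based, inclusive), four lookups."""
    return sat[r1 + 1][c1 + 1] - sat[r0][c1 + 1] - sat[r1 + 1][c0] + sat[r0][c0]


m = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
sat = build_summed_area_table(m)
print(rect_sum(sat, 1, 0, 2, 3))  # second and third rows of the example
```

Building the table takes one pass over the matrix; after that every rectangle query costs the same four lookups regardless of its size.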
Why can't you just add them using For loops?
int total = 0;
for (int i = startRow; i <= endRow; i++)
{
    for (int j = startColumn; j <= endColumn; j++)
    {
        total += array[i][j];
    }
}
Where your subarray ("rectangle") would go from startRow to endRow (height) and startColumn to endColumn (width), both inclusive.

Identify gaps in repeated sequences

I have a vector that should contain n sequences from 00 to 11
A = [00;01;02;03;04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]
and I would like to check that the sequence "00 - 11 " is always respected (no missing values).
for example if
A =[00;01;02; 04;05;06;07;08;09;10;11;00;01;02;03;04;05;06;07;08;09;10;11]
(missing 03 in the 3rd position)
For each missing value I would like to have back this information in another vector
missing=
[value_1,position_1;
value_2, position_2;
etc, etc]
Can you help me?
For sure we know that the last element must be 11, so we can already check for this and make our life easier for testing all previous elements. We ensure that A is 11-terminated, so an "element-wise change" approach (below) will be valid. Note that the same is true for the beginning, but changing A there would mess with indices, so we better take care of that later.
missing = [];
if A(end) ~= 11
    missing = [missing; 11, length(A) + 1];
    A(end + 1) = 11; % works for both row and column vectors
end
Then we can calculate the change dA = A(2:end) - A(1:end-1); from one element to another, and identify the gap positions idx_gap = find((dA~=1) & (dA~=-11));. Now we need to expand all missing indices and expected values, using ev for the expected value. ev can be obtained from the previous value, as in
for k = 1 : length(idx_gap)
    ev = A(idx_gap(k));
Now, the number of elements to fill in is the change dA in that position minus one (because one means no gap). Note that this can wrap over if there is a gap at the boundary between segments, so we use the modulus.
    for n = 1 : mod(dA(idx_gap(k)) - 1, 12)
        ev = mod(ev + 1, 12);
        missing = [missing; ev, idx_gap(k) + 1];
    end
end
As a test, consider A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]. That's a case where the special initialization from the beginning will fire, memorizing the missing 11 already, and changing A to [5 6 ... 7 8 11]. missing then will yield
11 24 % recognizes improper termination of A.
11 7
0 7 % properly handles wrap-over here.
1 7
2 7
5 21 % recognizes single element as missing.
9 24
10 24
which should be what you are expecting. Now what's missing still is the beginning of A, so let's say missing = [(0 : A(1) - 1)', ones(A(1), 1); missing]; to complete the list (each leading value is reported at position 1).
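The same idea can be sketched compactly in Python. This is not a translation of the MATLAB above: it is a variant that assumes a complete cycle ended just before the first element and that a new cycle would start right after the last one, so the leading and trailing gaps need no special cases. The name `find_missing` and the `period` parameter are illustrative. Positions are 1-based indices into the given sequence, as in the MATLAB output.

```python
def find_missing(seq, period=12):
    """Return (missing_value, position) pairs for a sequence cycling 0..period-1."""
    missing = []
    prev = period - 1  # pretend a full cycle ended just before seq starts
    # A virtual 0 after the end makes an unfinished last cycle show up as a gap.
    for pos, cur in enumerate(list(seq) + [0], start=1):
        gap = (cur - prev - 1) % period  # number of values skipped before cur
        for step in range(1, gap + 1):
            missing.append(((prev + step) % period, pos))
        prev = cur
    return missing


A = [5, 6, 7, 8, 9, 10, 3, 4, 5, 6, 7, 8, 9, 10, 11, 0, 1, 2, 3, 4, 6, 7, 8]
for value, position in find_missing(A):
    print(value, position)
```

One caveat of the modulus trick: a repeated value, or a gap spanning a whole cycle or more, cannot be told apart from a shorter gap, so the sketch assumes fewer than `period` consecutive values are ever missing.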
This will give you the missing values and their positions in the full sequence:
N = 11; % specify the repeating 0:N sub-sequence
n = 3; % reps of sub-sequence
A = [5 6 7 8 9 10 3 4 5 6 7 8 9 10 11 0 1 2 3 4 6 7 8]'; %' column from s.bandara
da = diff([A; N+1]); % EDITED to include missing end
skipLocs = find(~(da==1 | da==-N));
skipLength = da(skipLocs)-1;
skipLength(skipLength<0) = N + skipLength(skipLength<0) + 1;
firstSkipVal = A(skipLocs)+1;
patchFun = @(x,y)(0:y)'+x - (N+1)*(((0:y)'+x)>N);
patches = arrayfun(patchFun,firstSkipVal,skipLength-1,'uni',false);
locs = arrayfun(@(x,y)(x:x+y)',skipLocs+cumsum([A(1); skipLength(1:end-1)])+1,...
    skipLength-1,'uni',false);
Then putting them together, including any missing values at the beginning:
>> gapMap = [vertcat(patches{:}) vertcat(locs{:})-1]; % not including lead
>> gapMap = [repmat((0 : A(1) - 1)',1,2); gapMap] %' including lead
gapMap =
0 0
1 1
2 2
3 3
4 4
11 11
0 12
1 13
2 14
5 29
9 33
10 34
11 35
The first column contains the missing values. The second column is the 0-based location in the hypothetical full sequence.
>> Afull = repmat(0:N,1,n)
>> isequal(gapMap(:,1), Afull(gapMap(:,2)+1)')
ans =
1
Although this doesn't solve your problem completely, you can identify the position of missing values, or of groups of contiguous missing values, like this:
ind = 1+find(~ismember(diff(A),[1 -11]));
ind gives the position with respect to the current sequence A, not to the completed sequence.
For example, with
A =[00;01;02; 04;05;06;07;08;09;10;11;00;01;02;03; ;06;07;08;09;10;11];
this gives
>> ind = 1+find(~ismember(diff(A),[1 -11]))
ind =
4
16

Partitioning a circular buffer while keeping order

I've got a circular buffer with positive natural values, e.g.
   1  5
 4      2
11      7
   2  9
We're going to partition it into exactly two continuous parts, while keeping this order. These two parts in this example could be:
(4 1 5) and (2 7 9 2 11),
(7 9 2 11 4) and (1 5 2),
etc.
The idea is to keep order and take two continuous subsequences.
And the problem now is to partition it so that the sums of these subsequences are closest to each other, i.e. the difference between the sums must be closest to zero.
In this case, I believe the solution would be: (2 7 9 2) and (11 4 1 5) with sums, respectively, 20 and 21.
How to do this optimally?
Algorithm:
1. Calculate the total sum.
2. Let the current sum = 0.
3. Start off with 2 pointers at any point (both starting off at the same point).
4. Increase the second pointer, adding the number it passed, until the current sum is more than half of the total sum.
5. Increase the first pointer, subtracting the number it passed, until the current sum is less than half of the total sum.
6. Stop if either:
   - The first pointer is back where it started, or
   - The best sum is 0.5 or 0 from half the total sum (in which case the difference will be 1 or 0). The difference can be 1 only if the total sum is odd, in which case the difference can never be 0. (Thanks Artur!)
   Otherwise repeat from step 4.
7. Check all the current sums we got in this process and keep the one that's closest to half, along with indices of the partition that got that sum.
Running time:
The running time will be O(n), since we only ever increase the pointers and the first one only goes around once, and the second one can't go around more than twice.
Example:
Input:
   1  5
 4      2
11      7
   2  9
Total sum = 41.
Half of sum = 20.5.
So, let's say we start off at 1. (I just put it on a straight line to make it easier to draw)
p1,p2
V
1   5   2   7   9   2   11  4
sum = 0
p1  p2
V   V
1   5   2   7   9   2   11  4
sum = 1
p1      p2
V       V
1   5   2   7   9   2   11  4
sum = 6
p1          p2
V           V
1   5   2   7   9   2   11  4
sum = 8
p1              p2
V               V
1   5   2   7   9   2   11  4
sum = 15
p1                  p2
V                   V
1   5   2   7   9   2   11  4
sum = 24
    p1              p2
    V               V
1   5   2   7   9   2   11  4
sum = 23
        p1          p2
        V           V
1   5   2   7   9   2   11  4
sum = 18
        p1              p2
        V               V
1   5   2   7   9   2   11  4
sum = 20
Here the sum (20) is 0.5 from half the total sum (20.5), so we can stop.
The above corresponds to (11 4 1 5) (2 7 9 2), with a difference in sums of 1.
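A minimal Python sketch of the two-pointer sweep is given below. It is one possible reading of the algorithm, not the answerer's code: the name `best_circular_split` is illustrative, the early exit is left out for brevity, and it compares `2 * current` with the total to stay in integers instead of working with half sums.

```python
def best_circular_split(values):
    """Split a ring of positive numbers into two contiguous arcs with closest sums.

    Returns (difference, arc, rest); both arcs are listed in ring order.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to form two parts")
    total = sum(values)
    best = None       # (difference, start index, arc length)
    current = 0       # sum of the arc values[start .. end-1], indices taken mod n
    end = 0           # second pointer; only ever moves forward

    def record(start, length):
        nonlocal best
        diff = abs(total - 2 * current)   # |arc sum - rest sum|
        if length > 0 and (best is None or diff < best[0]):
            best = (diff, start % n, length)

    for start in range(n):                # first pointer goes around once
        # Grow the arc until it reaches half the total (never the whole ring).
        while end - start < n - 1 and 2 * current < total:
            current += values[end % n]
            end += 1
            record(start, end - start)
        # Shrink from the front and look at the arc that remains.
        current -= values[start]
        record(start + 1, end - start - 1)

    diff, start, length = best
    arc = [values[(start + i) % n] for i in range(length)]
    rest = [values[(start + length + i) % n] for i in range(n - length)]
    return diff, arc, rest


print(best_circular_split([1, 5, 2, 7, 9, 2, 11, 4]))
```

Because both pointers only move forward and the second one never gets more than a full turn ahead of the first, the loop body runs O(n) times in total, in line with the running-time argument above.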

Cumulative summation of value in each row

I have something like the following:
a = [1 11; 2 16; 3 9; 4 13; 5 8; 6 14];
b = a;
n = length(a);
Sum = [];
for i=1:1:n,
Sum = b(i,2)+b(i+1:1:n,2)
end
b =
1 11
2 16
3 9
4 13
5 8
6 14
For the first iteration I am looking to find the first combination of values in the second column which are between 19 and 25.
Sum =
27
20
24
19
25
Since 20 is that first combination (rows 1 & 3), I would like to remove that data and start a new matrix, or flag it as the first combination (i.e. place a 1 next to it by creating a third column).
The next step would be to sum the values which are still in the matrix with row 2 value:
Sum =
29
24
30
Then 2&5 would be combined.
However, I would like to allow not only pairs to be combined but also several rows if possible.
Is there something I am overlooking that may simplify this problem?
I don't think you're going to simplify this very much. It's a variation on the knapsack problem, which is NP-hard. The best algorithm to use might depend on the size of your inputs.
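For small inputs, a brute-force greedy pass may be enough. The Python sketch below is only one interpretation of the question: it takes the first remaining row, tries combinations with the other remaining rows in order of increasing size, and accepts the first one whose sum falls in the range. The name `greedy_groups` is illustrative, rows are 0-based, and the result is not guaranteed to be an optimal grouping; the search is exponential in the worst case, as expected for a knapsack-style problem.

```python
from itertools import combinations


def greedy_groups(values, low, high):
    """Greedily group row indices so each group's sum lies in [low, high]."""
    remaining = list(range(len(values)))
    groups = []
    while remaining:
        first, others = remaining[0], remaining[1:]
        found = None
        # Try pairs first, then triples, and so on.
        for size in range(1, len(others) + 1):
            for partners in combinations(others, size):
                total = values[first] + sum(values[p] for p in partners)
                if low <= total <= high:
                    found = (first,) + partners
                    break
            if found:
                break
        if found is None:
            remaining.pop(0)   # no valid combination: leave this row ungrouped
        else:
            groups.append(found)
            remaining = [i for i in remaining if i not in found]
    return groups


second_column = [11, 16, 9, 13, 8, 14]
print(greedy_groups(second_column, 19, 25))
```

On the example data this pairs rows 1 & 3 and rows 2 & 5 (0-based indices 0, 2 and 1, 4), the same combinations the question arrives at by hand.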
