If I have a 5x5 matrix called MATRIX1 like this:
12 13 14 15 16
21 23 24 25 26
31 43 52 23 43
63 36 74 47 45
21 23 32 34 43
How can I write a for loop (or something similar) that will give me a new matrix containing the average of each column of the 5x5 matrix?
That is, I want another matrix, named MATRIX2, consisting of a single row with the five column averages of MATRIX1.
Thanks
First, you need to declare an array of size 5 to hold the column averages:
int[] avg = new int[5];
Second, you need to go through all the values of each column and calculate its average:
for(int i=0;i<5;++i){
    int sum = 0;
    for(int j=0;j<5;++j){
        sum += matrix1[j][i]; // j walks down the rows of column i
    }
    avg[i] = sum / 5; // integer division; use a double[] if you need exact averages
}
I assumed you're using Java, as you didn't say which language you use.
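For comparison, here is a minimal sketch of the same column-average computation in Python, using the matrix from the question (the names MATRIX1/MATRIX2 follow the question; this is just an illustration):

```python
# MATRIX1 is the 5x5 matrix from the question.
MATRIX1 = [
    [12, 13, 14, 15, 16],
    [21, 23, 24, 25, 26],
    [31, 43, 52, 23, 43],
    [63, 36, 74, 47, 45],
    [21, 23, 32, 34, 43],
]

rows = len(MATRIX1)
cols = len(MATRIX1[0])

# MATRIX2 is a single row holding the average of each column.
MATRIX2 = [sum(MATRIX1[i][j] for i in range(rows)) / rows for j in range(cols)]
print(MATRIX2)  # -> [29.6, 27.6, 39.2, 28.8, 34.6]
```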
Here is an example of the matrix in Excel. But calculating this in MATLAB is already a challenge for me.
I want to compute the mean and standard deviation of the subregion created by a window (dashed line) centred at an identified pixel (red), i.e. the local mean and standard deviation. This is the figure describing it.
We can do it by convolving the image with a mask. However, that takes a long time, because I only care about the mean and standard deviation at a few points, while convolution computes them for every point in the image. Is there a faster way that computes the mean and standard deviation only at the identified pixels? I am doing this in MATLAB. This is my code using the convolution function:
I=[18 36 70 33 64 40 62 76 71 37 5
82 49 86 45 96 29 74 7 60 56 45
25 32 55 48 25 30 12 82 95 77 8
24 18 78 74 19 57 67 59 16 46 78
28 9 59 2 29 11 7 31 75 15 25
83 26 96 8 82 26 85 12 11 28 19
81 64 78 70 26 33 17 72 81 16 54
75 39 78 34 59 31 77 31 61 81 89
89 84 29 99 79 25 26 35 65 56 76
93 90 45 7 61 13 34 24 11 34 92
88 82 91 81 100 4 88 70 85 8 19];
identified_position=[30 36 84 90] %linear indices of pixels 78, 48, 72, 60
mask=1/9.*ones(3,3);
mean_all=imfilter(I,mask,'same');
%Mean of identified pixels
mean_all(identified_position)
% Compute the variance
std_all=stdfilt(I,ones(3));
%std of identified pixels
std_all(identified_position)
This is the comparison code
function compare_mean(dimx,dimy)
I=randi(100,[dimx,dimy]);
rad=3;
identified_position=randi(max(I(:)),[1,5]);% Get 5 random position
function way1()
mask=ones(rad,rad);
mask=mask./sum(mask(:));
mean_all=conv2(I,mask,'same');
mean_out =mean_all(identified_position);
end
function way2()
box_size = rad; %// Edit your window size here (an odd number is preferred)
bxr = floor(box_size/2); %// box radius
%// Get neighboring indices and those elements for all identified positions
off1 = bsxfun(@plus,[-bxr:bxr]',[-bxr:bxr]*size(I,1)); %// neighborhood offsets
idx = bsxfun(@plus,off1(:),identified_position); %// all absolute offsets
I_selected_neigh = I(idx); %// all offsetted elements
mean_out = mean(I_selected_neigh,1); %// mean output
end
way2()
time_way1=@()way1();timeit(time_way1)
time_way2=@()way2();timeit(time_way2)
end
Sometimes way2 fails with this error:
Subscript indices must either be real positive integers or logicals.
Error in compare_mean/way2 (line 18)
I_selected_neigh = I(idx); %// all offsetted elements
Error in compare_mean (line 22)
way2()
Discussion & Solution Codes
Given I as the input image, identified_position as the linear indices of the selected points and bxsz as the window/box size, the approach listed next must be pretty efficient -
%// Get XY coordinates
[X,Y] = ind2sub(size(I),identified_position);
pts = [X(:) Y(:)];
%// Parameters
bxr = (bxsz-1)/2;
Isz = size(I);
%// XY coordinates of neighboring elements
[offx,offy] = ndgrid(-bxr:bxr,-bxr:bxr);
x_idx = bsxfun(@plus,offx(:),pts(:,1)');
y_idx = bsxfun(@plus,offy(:),pts(:,2)');
%// Outside image boundary elements
invalids = x_idx>Isz(1) | x_idx<1 | y_idx>Isz(2) | y_idx<1;
%// All neighboring indices
all_idx = (y_idx-1)*size(I,1) + x_idx;
all_idx(invalids) = 1;
%// All neighboring elements
all_vals = I(all_idx);
all_vals(invalids) = 0;
mean_out = mean(all_vals,1); %// final mean output
stdfilts = stdfilt(all_vals,ones(bxsz^2,1))
std_out = stdfilts(ceil(size(stdfilts,1)/2),:) %// final stdfilt output
Basically, it gets all the neighbouring indices for all identified positions in one go with bsxfun, and thus gets all those neighbouring elements. Those selected elements are then used to compute the mean and stdfilt outputs. The whole idea is to keep the memory requirement to a minimum while doing everything in a vectorized fashion within those selected elements. Hopefully, this is faster!
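As a rough illustration of the same indexing idea outside MATLAB, here is a plain-Python sketch (loops instead of bsxfun, and out-of-bounds neighbours are simply dropped rather than zero-filled, so border behaviour differs slightly from the code above):

```python
def gather_means(I_flat, isz, points, bxsz):
    """I_flat: image flattened column-major (MATLAB order); isz: (rows, cols);
    points: 0-based (row, col) pairs; bxsz: odd window size."""
    rows, cols = isz
    bxr = (bxsz - 1) // 2
    means = []
    for (x, y) in points:
        vals = []
        for dy in range(-bxr, bxr + 1):
            for dx in range(-bxr, bxr + 1):
                xi, yi = x + dx, y + dy
                if 0 <= xi < rows and 0 <= yi < cols:    # drop out-of-bounds neighbours
                    vals.append(I_flat[yi * rows + xi])  # column-major linear index
        means.append(sum(vals) / len(vals))
    return means

# 2x2 image [[1 2] [3 4]] stored column-major; mean of the window around (0, 0)
print(gather_means([1, 3, 2, 4], (2, 2), [(0, 0)], 3))  # -> [2.5]
```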
Benchmarking
Benchmarking Code
dx = 10000; %// x-dimension of input image
dy = 10000; %// y-dimension of input image
npts = 1000; %// number of points
I=randi(100,[dx,dy]); %// create input image of random intensities
identified_position=randi(max(I(:)),[1,npts]);
rad=5; %// blocksize (rad x rad)
%// Run the approaches fed with the inputs
func1 = @() way1(I,identified_position,rad); %// original approach
time1 = timeit(func1);
clear func1
func2 = @() way2(I,identified_position,rad); %// proposed approach
time2 = timeit(func2);
clear func2
disp(['Input size: ' num2str(dx) 'x' num2str(dy) ' & Points: ' num2str(npts)])
disp(['With Original Approach: Elapsed Time = ' num2str(time1) '(s)'])
disp(['With Proposed Approach: Elapsed Time = ' num2str(time2) '(s)'])
disp(['**Speedup w/ Proposed Approach : ' num2str(time1/time2) 'x!**'])
Associated function codes
%// OP's stated approach
function mean_out = way1(I,identified_position,rad)
mask=ones(rad,rad);
mask=mask./sum(mask(:));
mean_all=conv2(I,mask,'same');
mean_out =mean_all(identified_position);
return;
function mean_out = way2(I,identified_position,rad)
%//.... code from proposed approach stated earlier until mean_out %//
Runtime results
Input size: 10000x10000 & Points: 1000
With Original Approach: Elapsed Time = 0.46394(s)
With Proposed Approach: Elapsed Time = 0.00049403(s)
**Speedup w/ Proposed Approach : 939.0778x!**
I have a m x n matrix and want to be able to calculate sums of arbitrary rectangular submatrices. This will happen several times for the given matrix. What data structure should I use?
For example, I want to find the sum of a rectangle in the matrix
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Sum is 68.
What I'll do is accumulate it row by row:
1 2 3 4
6 8 10 12
15 18 21 24
28 32 36 40
And then, if I want to find the sum of the whole matrix, I just accumulate 28, 32, 36, 40 = 136. Only four operations instead of 15.
If I want to find the sum of the second and third rows, I accumulate 15, 18, 21, 24 and subtract 1, 2, 3, 4: (15+18+21+24) - (1+2+3+4) = 68.
But in this case I can use another matrix, accumulating this one by columns:
1 3 6 10
5 11 18 26
9 19 30 42
13 27 42 58
and in this case I just sum 26 and 42 = 68. Only 2 operations instead of 8. For wider sub-matrices the second method and matrix are more efficient; for taller ones, the first. Can I somehow merge these two methods into one matrix?
So that I'd just sum two corners and subtract the other two?
You're nearly there with your method. The solution is to use a summed area table (aka Integral Image):
http://en.wikipedia.org/wiki/Summed_area_table
The key idea is you do one pass through your matrix and accumulate such that "the value at any point (x, y) in the summed area table is just the sum of all the pixels above and to the left of (x, y), inclusive.".
Then you can compute the sum inside any rectangle in constant time with four lookups.
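A minimal summed-area-table sketch (in Python, purely as an illustration): one pass builds the table, and any rectangle sum is then four lookups.

```python
def build_sat(m):
    """Summed area table with a one-cell zero border, so no edge cases below."""
    rows, cols = len(m), len(m[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            sat[i + 1][j + 1] = (m[i][j] + sat[i][j + 1]
                                 + sat[i + 1][j] - sat[i][j])
    return sat

def rect_sum(sat, r0, c0, r1, c1):
    """Sum of m[r0..r1][c0..c1], inclusive and 0-based, in O(1): four lookups."""
    return (sat[r1 + 1][c1 + 1] - sat[r0][c1 + 1]
            - sat[r1 + 1][c0] + sat[r0][c0])

m = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
sat = build_sat(m)
print(rect_sum(sat, 1, 0, 2, 3))  # rows 2-3 of the example: prints 68
```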
Why can't you just add them using For loops?
int total = 0;
for(int i = startRow; i <= endRow; i++)
{
for(int j = startColumn; j <= endColumn; j++)
{
total += array[i][j];
}
}
Where your subarray ("rectangle") runs from startRow to endRow (its height) and startColumn to endColumn (its width).
Say for example that I have the following matrix representing some image:
I=[1 2; 5 7; 7 5];
Getting the vector for the above matrix, we can do the following:
I_vector=I(:);
At the same time, say that we have the following matrix, obtained after applying some operations to I:
f=[5 65 65; 65 67 98; 7 7 9; 87 34 86; 65 87 87; 86 23 07; 76 89 13];
Say that for each element in I, I want to assign a vector of values. So, instead of having I(1)=1, I want it to be I(1)=[5 65 65], so that calling I(1) returns that vector.
Is that possible in matlab?
Thanks.
If the vectors you want to place inside I are all of the same length, then store them as a matrix and access them by row:
I(1,:)
If the vectors are not of the same length, then store them in a cell array and access the contents of each cell with { }:
I = {1:10, 1:20}
I{2}
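For what it's worth, the same two options look like this in Python (a list of equal-length rows versus a list of uneven lists playing the role of a cell array; the names here are illustrative):

```python
# Equal-length vectors: a plain list of rows, like I(1,:) in MATLAB.
I_rows = [[5, 65, 65], [65, 67, 98], [7, 7, 9]]
print(I_rows[0])        # -> [5, 65, 65]

# Uneven vectors: a list of lists stands in for the cell array I = {1:10, 1:20}.
I_cells = [list(range(1, 11)), list(range(1, 21))]
print(len(I_cells[1]))  # -> 20, like numel(I{2})
```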
I want to understand "median of medians" algorithm on the following example:
We have 45 distinct numbers divided into 9 group with 5 elements each.
48 43 38 33 28 23 18 13 8
49 44 39 34 29 24 19 14 9
50 45 40 35 30 25 20 15 10
51 46 41 36 31 26 21 16 53
52 47 42 37 32 27 22 17 54
The first step is sorting every group (in this case they are already sorted)
The second step is to recursively find the "true" median of the medians (50 45 40 35 30 25 20 15 10), i.e. the set will be divided into 2 groups:
50 25
45 20
40 15
35 10
30
sorting these 2 groups
30 10
35 15
40 20
45 25
50
the medians are 40 and 15 (when the number of elements is even, we take the left median),
so the returned value is 15. However, the "true" median of the medians (50 45 40 35 30 25 20 15 10) is 30. Moreover, there are only 5 elements less than 15, which is far fewer than the 30% of 45 mentioned in Wikipedia,
and so T(n) <= T(n/5) + T(7n/10) + O(n) fails.
By the way, in the Wikipedia example I get 36 as the result of the recursion, whereas the true median is 47.
So I think in some cases this recursion may not return the true median of medians. I want to understand where my mistake is.
The problem is in the step where you say to find the true median of the medians. In your example, you had these medians:
50 45 40 35 30 25 20 15 10
The true median of this data set is 30, not 15. You don't find this median by splitting the groups into blocks of five and taking the median of those medians, but instead by recursively calling the selection algorithm on this smaller group. The error in your logic is assuming that the median of this group is found by splitting the above sequence into two blocks
50 45 40 35 30
and
25 20 15 10
then finding the median of each block. Instead, the median-of-medians algorithm will recursively call itself on the complete data set 50 45 40 35 30 25 20 15 10. Internally, this will split the group into blocks of five and sort them, etc., but it does so to determine the partition point for the partitioning step, and it's in this partitioning step that the recursive call will find the true median of the medians, which in this case will be 30. If you use 30 as the median as the partitioning step in the original algorithm, you do indeed get a very good split as required.
Hope this helps!
Here is the pseudocode for the median-of-medians algorithm (slightly modified to suit your example). The pseudocode on Wikipedia fails to portray the inner workings of the selectIdx function call.
I've added comments to the code for explanation.
// L is the array on which median of medians needs to be found.
// k is the expected median position. E.g. first select call might look like:
// select (array, N/2), where 'array' is an array of numbers of length N
select(L,k)
{
if (L has 5 or fewer elements) {
sort L
return the element in the kth position
}
partition L into subsets S[i] of five elements each
(there will be n/5 subsets total).
for (i = 1 to n/5) do
x[i] = select(S[i],3)
M = select({x[i]}, n/10)
// The code to follow ensures that even if M turns out to be the
// smallest/largest value in the array, we'll get the kth smallest
// element in the array
// Partition array into three groups based on their value as
// compared to median M
partition L into L1<M, L2=M, L3>M
// Compare the expected median position k with length of first array L1
// Run recursive select over the array L1 if k is less than length
// of array L1
if (k <= length(L1))
return select(L1,k)
// Check if k falls in L3 array. Recurse accordingly
else if (k > length(L1)+length(L2))
return select(L3,k-length(L1)-length(L2))
// Simply return M since k falls in L2
else return M
}
Taking your example:
The median-of-medians function will first be called over the entire array of 45 elements, with k = ceil(45/2) = 23:
median = select({48 49 50 51 52 43 44 45 46 47 38 39 40 41 42 33 34 35 36 37 28 29 30 31 32 23 24 25 26 27 18 19 20 21 22 13 14 15 16 17 8 9 10 53 54}, 23)
The first time M = select({x[i]}, n/10) is called, array {x[i]} will contain the nine group medians: 50 45 40 35 30 25 20 15 10.
In this call, n = 45, and the middle position among these nine medians is 5, so the select call will be M = select({50 45 40 35 30 25 20 15 10}, 5).
The second time M = select({x[i]}, n/10) is called, array {x[i]} will contain the numbers 40 and 20 (the medians of the blocks {50 45 40 35 30} and {25 20 15 10}).
In this call, n = 9, and hence the call will be M = select({40 20}, 1).
This select call will return and assign the value M = 20.
Now, coming to the point where you had a doubt, we partition the array L around M = 20 with k = 5.
Remember, array L here is: 50 45 40 35 30 25 20 15 10.
The array will be partitioned into L1, L2 and L3 according to the rules L1 < M, L2 = M and L3 > M. Hence:
L1: 10 15
L2: 20
L3: 25 30 35 40 45 50
Since k = 5, it's greater than length(L1) + length(L2) = 3. Hence, the search will be continued with the following recursive call now:
return select(L3,k-length(L1)-length(L2))
which translates to:
return select({25 30 35 40 45 50}, 2)
which will return 30 as a result (the 2nd smallest element of that set, reached after one more level of recursion).
Now, M = 30 will be received in the first select function call over the entire array of 45 elements, and the same partitioning logic, which separates the array L around M = 30, will apply to finally find the kth smallest element of the whole array, i.e. its median.
Phew! I hope I was verbose and clear enough to explain median of medians algorithm.
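For reference, the pseudocode above translates into a short runnable sketch (Python here, purely as an illustration; k is 1-based, and the pivot is chosen as the middle of the group medians):

```python
def select(L, k):
    """Return the k-th smallest element of L (k is 1-based), median-of-medians style."""
    if len(L) <= 5:
        return sorted(L)[k - 1]
    # Median of each group of five (the last group may be shorter).
    medians = [sorted(L[i:i + 5])[len(L[i:i + 5]) // 2]
               for i in range(0, len(L), 5)]
    # Recurse to pick the pivot: the middle of the group medians.
    M = select(medians, (len(medians) + 1) // 2)
    L1 = [x for x in L if x < M]
    L2 = [x for x in L if x == M]
    L3 = [x for x in L if x > M]
    if k <= len(L1):
        return select(L1, k)
    if k > len(L1) + len(L2):
        return select(L3, k - len(L1) - len(L2))
    return M

# The nine medians from the example: their 5th smallest (true median) is 30.
assert select([50, 45, 40, 35, 30, 25, 20, 15, 10], 5) == 30
```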
I have a set of N^2 numbers and N bins. Each bin is supposed to have N numbers from the set assigned to it. The problem I am facing is finding a set of distributions that map the numbers to the bins, satisfying the constraint, that each pair of numbers can share the same bin only once.
A distribution can nicely be represented by an NxN matrix, in which each row represents a bin. Then the problem is finding a set of permutations of the matrix' elements, in which each pair of numbers shares the same row only once. It's irrelevant which row it is, only that two numbers were both assigned to the same one.
Example set of 3 permutations satisfying the constraint for N=8:
0 1 2 3 4 5 6 7
8 9 10 11 12 13 14 15
16 17 18 19 20 21 22 23
24 25 26 27 28 29 30 31
32 33 34 35 36 37 38 39
40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55
56 57 58 59 60 61 62 63
0 8 16 24 32 40 48 56
1 9 17 25 33 41 49 57
2 10 18 26 34 42 50 58
3 11 19 27 35 43 51 59
4 12 20 28 36 44 52 60
5 13 21 29 37 45 53 61
6 14 22 30 38 46 54 62
7 15 23 31 39 47 55 63
0 9 18 27 36 45 54 63
1 10 19 28 37 46 55 56
2 11 20 29 38 47 48 57
3 12 21 30 39 40 49 58
4 13 22 31 32 41 50 59
5 14 23 24 33 42 51 60
6 15 16 25 34 43 52 61
7 8 17 26 35 44 53 62
A permutation that doesn't belong in the above set:
0 10 20 30 32 42 52 62
1 11 21 31 33 43 53 63
2 12 22 24 34 44 54 56
3 13 23 25 35 45 55 57
4 14 16 26 36 46 48 58
5 15 17 27 37 47 49 59
6 8 18 28 38 40 50 60
7 9 19 29 39 41 51 61
Because it collides with the second permutation several times: for example, both pair the numbers 0 and 32 in the same row.
Enumerating three is easy: take one arbitrary permutation, its transpose, and a matrix whose rows are made of the previous matrix's diagonals.
I can't find a way to produce a set consisting of more, though. It seems to be either a very complex problem, or a simple problem with an unobvious solution. Either way, I'd be thankful if somebody had any ideas how to solve it in reasonable time for the N=8 case, or could identify the proper academic name of the problem, so I could google for it.
In case you were wondering what it's useful for: I'm looking for a scheduling algorithm for a crossbar switch with 8 buffers which serves traffic to 64 destinations. This part of the scheduling algorithm is input-traffic agnostic, and cycles through a number of hardwired destination-buffer mappings. The goal is to have each pair of destination addresses compete for the same buffer only once in the cycling period, and to maximize that period's length. In other words, each pair of addresses should compete for the same buffer as seldom as possible.
EDIT:
Here's some code I have.
CODE
It's greedy, and it usually terminates after finding the third permutation. But there should exist a set of at least N permutations satisfying the constraint.
The alternative would require that choosing permutation I involved looking for permutations (I+1..N), to check if permutation I is part of the solution consisting of the maximal number of permutations. That'd require enumerating all permutations to check at each step, which is prohibitively expensive.
What you want is a combinatorial block design. Using the nomenclature on the linked page, you want a design with parameters (n^2, n, 1) and the maximum number of blocks. This will give you n+1 permutations (n(n+1) blocks of n elements each), using your nomenclature. This is the maximum theoretically possible by a counting argument (see the explanation in the article for the derivation of b from v, k, and lambda). Such designs exist for n = p^k, for p a prime and k an integer, via an affine plane. It is conjectured that affine planes exist only for these orders. Therefore, if you can select n, maybe this answer will suffice.
However, if instead of the maximum theoretically possible number of permutations you just want to find a large number (the most you can for a given n^2), I am not sure what the study of these objects is called.
Make a 64 x 64 x 8 array, bool forbidden[i][j][k], which indicates whether the pair (i,j) has appeared in row k. Each time you use the pair (i,j) in row k, set the associated value in this array to one. Note that you will only use the half of this array for which i < j.
To construct a new permutation, start by trying the member 0, and verify that at least seven of forbidden[0][j][0] are unset. If there are not seven left, increment and try again. Repeat to fill out the rest of the row, and repeat this whole process to fill the entire NxN permutation.
There are probably optimizations you should be able to come up with as you implement this, but this should do pretty well.
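A quick sketch of that greedy construction in Python (illustrative names; the forbidden-pair table is collapsed into a set of used pairs rather than the full 64 x 64 x 8 array, since the row index doesn't matter for the constraint):

```python
from itertools import combinations

def greedy_permutations(n):
    """Greedily build NxN permutations of 0..n*n-1 in which no pair of
    numbers shares a row twice; stop when no new permutation can be completed."""
    used = set()                      # unordered pairs that already shared a row
    perms = []
    while True:
        remaining = set(range(n * n))
        rows = []
        ok = True
        for _ in range(n):
            row = []
            for v in sorted(remaining):
                # Accept v only if it never shared a row with anything already placed.
                if all((min(v, u), max(v, u)) not in used for u in row):
                    row.append(v)
                    if len(row) == n:
                        break
            if len(row) < n:          # row couldn't be completed: we're stuck
                ok = False
                break
            remaining -= set(row)
            rows.append(row)
        if not ok:
            return perms
        for row in rows:              # record the newly created pairs
            for a, b in combinations(row, 2):
                used.add((a, b))
        perms.append(rows)

perms = greedy_permutations(4)
print(len(perms))  # greedy finds a handful of permutations before getting stuck
```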
Possibly you could reformulate your problem into graph theory. For example, you start with the complete graph with N×N vertices. At each step, you partition the graph into N N-cliques, and then remove all edges used.
For this N=8 case, K64 has 64×63/2 = 2016 edges, and sixty-four lots of K8 have 1792 edges, so your problem may not be impossible :-)
Right, the greedy style doesn't work because you run out of numbers.
It's easy to see that there can't be more than 63 permutations before you violate the constraint: on the 64th, you'd have to pair at least one of the numbers with another it's already been paired with, by the pigeonhole principle.
In fact, if you use the table of forbidden pairs I suggested earlier, you find that a maximum of only N+1 = 9 permutations is possible before you run out. The table has N^2 x (N^2-1)/2 = 2016 non-redundant constraints, and each new permutation creates N x (N choose 2) = 8 x 28 = 224 new pairings. So all the pairings will be used up after 2016/224 = 9 permutations. Realizing that there are so few permutations seems to be the key to solving the problem.
You can generate a list of N permutations numbered n = 0 ... N-1 as
A_ij = (i * N + j + j * n * N) mod N^2
which generates each new permutation by shifting the columns of the previous one. The top row of the nth permutation is the diagonal of the (n-1)th permutation. EDIT: Oops... this only appears to work when N is prime.
This misses one last permutation, which you can get by transposing the matrix:
A_ij = j * N + i
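The two formulas above can be checked directly. This Python sketch (illustrative, assuming prime N) builds the N shifted permutations plus the transpose and verifies that no pair of numbers ever shares a row twice:

```python
from itertools import combinations

def shifted_perms(N):
    """The N column-shift permutations A_ij = (i*N + j + j*n*N) mod N^2,
    plus the transpose A_ij = j*N + i, for N + 1 permutations in total."""
    perms = [[[(i * N + j + j * n * N) % (N * N) for j in range(N)]
              for i in range(N)] for n in range(N)]
    perms.append([[j * N + i for j in range(N)] for i in range(N)])
    return perms

# For prime N the construction is collision-free: every unordered pair of
# numbers 0..N^2-1 shares a row exactly once across the N + 1 permutations.
perms = shifted_perms(5)
pairs = [tuple(sorted(p)) for perm in perms for row in perm for p in combinations(row, 2)]
assert len(pairs) == len(set(pairs)) == 300  # C(25, 2) = 300: each pair used once
```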