How to save output of for loop operation in matlab - for-loop

I have a matrix A of size 54x100. For a specific condition, I perform an operation on each row of A and need to save the output of the for loop. I tried the following, but it did not work.
S = zeros(54,100);
for i = 1:54
    Ri = A(i,:);
    answer = mean(reshape(Ri,5,20),1);
    S(i) = answer;
end

Firstly, judging by your question I'd recommend some basic Matlab tutorials like this or just detailed documentation like this.
To actually help you with your issue though, you can do this:
%% Make up A (since I don't know what it actually is)
n = 54; m = 100;
A = randn(n,m); % n x m matrix of random numbers
%% Loop over each row of A
S = cell(n,1);
for j = 1:n
    Rj = A(j,:);                       % j'th row
    answer = mean(reshape(Rj,5,20),1); % some operation
    S{j} = answer;                     % store the answer in cell S
end
The problem was that your answer was not a single number (1x1 matrix) but a vector and so you got a dimension mismatch error. Above I'm putting the answers into a cell object of size n. The result of your operation on j'th row can then be retrieved by calling S{j}.
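Since in this particular case every answer happens to be a 1x20 row vector, an alternative sketch (assuming the operation always returns a vector of the same length) is to preallocate an n x 20 numeric matrix and assign whole rows:
S = zeros(n, 20);                              % one 1x20 result per row of A
for j = 1:n
    S(j,:) = mean(reshape(A(j,:), 5, 20), 1);  % assign the whole row at once
end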
Also:
Do not use i as an iterator variable, since in Matlab it also represents the imaginary unit.
Do not hard-code values; reference existing variables instead. For example, here I referenced n in the for-loop declaration rather than writing for j = 1:54, because otherwise the code would break the moment I wanted to use it on, say, a 53x100 array.
When you post your code I recommend adding a minimal working example - a piece of code which people can just copy and paste into their Matlab (or whatever interpreter of whatever language) and run to reproduce your problem. Here you have not included anything which tells the code what A is, for example.
This is quite a good read in general and should help you in the future

Related

Summing a number of matrices a number of times

I wrote the Matlab code below, which is supposed to solve the following system of equations:
A(n)*U(n) = U(n-1)*(1-Walpha(2)) + Walpha(2)*U(0) + SUM over j=1..n-1 of (Walpha(j)-Walpha(j+1))*U(n-j)
The code below runs, but I suspect it is only keeping WU{n} from the last j step, whereas it is supposed to compute the value at each j step, add these values together for each n step, and save the result in WU.
U = cell(N,1);
RHS = cell(N,1);
WU = cell(N,1);
RHS{1} = Um0;
U{1} = U1;
WU{1} = zeros(M-1,1);
for n = 2:N-1
    for j = 1:n-1
        WU{n} = (Walpha(j)-Walpha(j+1))*U{n-j};
    end
    RHS{n} = (1-Walpha(2)).*U{n-1} + Walpha(n).*U0 + WU{n} - UU{n};
    U{n} = inv(A{n})*RHS{n};
end
Can somebody please explain how I should rewrite my code so that the summation part of the system is evaluated correctly?
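Not a definitive rewrite, but one way to make the summation accumulate (a sketch that keeps UU and the other variables exactly as they appear in your script) is to reset WU{n} before the inner loop and add each j term to it:
for n = 2:N-1
    WU{n} = zeros(M-1,1);                                % reset the running sum for this n
    for j = 1:n-1
        WU{n} = WU{n} + (Walpha(j)-Walpha(j+1))*U{n-j};  % accumulate each j term
    end
    RHS{n} = (1-Walpha(2)).*U{n-1} + Walpha(n).*U0 + WU{n} - UU{n};
    U{n} = A{n}\RHS{n};                                  % backslash is preferred over inv() for solving
end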

Optimizing MATLAB work on N dim array(512,512,400)

I am working on images that are 512x512 pixels. I have written code that analyzes my images and gives me the values I need in matrices of dimensions (512,512,400), in roughly 10 minutes, using pre-allocation.
My problem is when I want to work with these matrices: it takes hours to see results, and I want a script that does what I need in much less time. Can you help me?
% meanm is a matrix (512,512,400) that contains the mean of every inputmatrix
% sigmam is a matrix (512,512,400) that contains the std of every inputmatrix
% Basically what I want is that for every inputmatrix (512x512), that is stored inside
% an array of dimensions (512,512,400),
% if a value is higher than meanm + sigmam, it has to be replaced with
% the corresponding value of the meanm matrix.
p = 400;
for h = 1:p
    if (inputmatrix(:,:,h) > meanm(:,:,h) + sigmam(:,:,h))
        inputmatrix(:,:,h) = meanm(:,:,h);
    end
end
I know that Matlab performs better on matrix calculations, but I have no idea how to translate this for loop over my 400 images into something better suited to it.
Try using the condition from your for loop to make a logical matrix:
logical_mask = (meanm + sigmam) < inputmatrix;
inputmatrix(logical_mask) = meanm(logical_mask);
This should improve your performance by using two features of Matlab:
Vectorization uses matrix operations instead of loops. To quote the linked site "Vectorized code often runs much faster than the corresponding code containing loops."
Logical Indexing allows you to access all elements in your array that meet a condition simultaneously.
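If you would rather keep a loop over the 400 images, note that your original if only triggers when every pixel in the slice satisfies the condition, which is why it did not do what you wanted. An equivalent per-slice sketch of the same logical-indexing idea is:
for h = 1:size(inputmatrix,3)
    slice = inputmatrix(:,:,h);
    mu    = meanm(:,:,h);
    mask  = slice > mu + sigmam(:,:,h);  % 512x512 logical mask for this image
    slice(mask) = mu(mask);              % replace only the flagged pixels
    inputmatrix(:,:,h) = slice;
end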

I need help optimizing this compression algorithm I came up with on my own

I tried coming up with a compression algorithm. I know a little bit about compression theory, so I am aware that this scheme I have come up with may very well never achieve compression at all.
Currently it works only for a string with no consecutive repeating letters/digits/symbols. Once properly established, I hope to extend it to binary data etc. But first, the algorithm:
Assuming there are only 4 letters a, b, c, d, we create an array with one entry per letter. Whenever a letter is encountered, its entry is updated so that the entry of the most recently seen letter is always the largest. If the entry was originally zero, it is set to (current largest entry) + 2; if it was not zero, it is set to (current largest entry) + (its old value) + 2. An example to clarify:
Array = [a,b,c,d]
Initial state = [0,0,0,0]
Letter = a
New state = [2,0,0,0]
Letter = b
New state = [2,4,0,0]
Letter = c
New state = [2,4,6,0]
Letter = d
New state = [2,4,6,8]
Letter = a
New state = [12,4,6,8]
//Explanation for the above state: a's old value was 2 and the largest entry was 8, so the new value is 8 + 2 + 2 = 12. In reverse: largest - second largest - 2 = 12 - 8 - 2 = 2, recovering the old value.
Letter = d
New state = [12,4,6,22]
and so on...
Decompression is just this logic in reverse.
A rudimentary implementation of compression (in python):
(This function is very rudimentary so not the best kind of code...I know. I can optimize it once I get the core algorithm correct.)
import copy

def compress(text):
    matrix = [0]*95  # we are concerned with 95 printable chars for now
    for i in text:
        temp = copy.deepcopy(matrix)
        temp.sort()
        largest = temp[-1]  # current largest entry
        if matrix[ord(i)-32] == 0:
            matrix[ord(i)-32] = largest + 2
        else:
            matrix[ord(i)-32] = largest + matrix[ord(i)-32] + 2
    return matrix
The returned matrix is then used for decompression. Now comes the tricky part:
I can't really call this compression at all because each number in the matrix generated by the function is of the order of 10**200 for a string of length 50000. So storing the matrix actually takes more space than storing the original string. I know...totally useless. But I had hoped, prior to doing all this, that I could use the mathematical properties of a matrix to represent it effectively in some kind of mathematical shorthand. I have tried many possibilities and failed. Some things that I tried:
Rank of the matrix. Failed because not unique.
Denote using the mod function. Failed because either the quotient or the remainder
Store each integer as a generator using pickle.
Store the matrix as a bitmap file but then the integers are too large to be able to store as color codes.
Let me reiterate that the algorithm could be optimized, e.g. instead of adding 2 we could add 1 and proceed, but such tweaks don't really result in any compression. Same for the code. Minor optimizations later...first I want to improve the main algorithm.
Furthermore, it is very likely that this product of a mediocre and idle mind like mine may never achieve compression at all. In that case, I would like your help and ideas on what it could be useful for.
TL;DR: Check coded parts which depict a compression algorithm. The compressed result is longer than the original string. Can this be fixed? If yes, how?
PS: I have the entire code on my PC. Will create a repo on github and upload in some time.
Compression is essentially a predictive process. Look for patterns in the input and use them to encode the more likely next character(s) more efficiently than the less likely. I can't see anything in your algorithm that tries to build a predictive model.

Vectorization of matlab code

I'm kinda new to vectorization. I have tried it myself but couldn't manage. Can somebody help me vectorize this code and give a short explanation of how you do it, so that I can adopt the thinking process too? Thanks.
function [result] = newHitTest (point,Polygon,r,tol,stepSize)
%This function calculates whether a point is allowed.
%First is a quick test is done by calculating the distance from point to
%each point of the polygon. If that distance is smaller than range "r",
%the point is not allowed. This will slow down the algorithm at some
%points, but will greatly speed it up in others because less calls to the
%circleTest routine are needed.
polySize = size(Polygon,1);
testCounter = 0;
for i = 1:polySize
    d = sqrt(sum((Polygon(i,:)-point).^2));
    if d < tol*r
        testCounter = 1;
        break
    end
end
if testCounter == 0
    circleTestResult = circleTest(point,Polygon,r,tol,stepSize);
    testCounter = circleTestResult;
end
result = testCounter;
Given the information that Polygon is 2 dimensional, point is a row vector and the other variables are scalars, here is the first version of your new function (scroll down to see that there are lots of ways to skin this cat):
function [result] = newHitTest (point,Polygon,r,tol,stepSize)
result = 0;
linDiff = Polygon-repmat(point,size(Polygon,1),1);
testLogicals = sqrt( sum( ( linDiff ).^2 ,2 )) < tol*r;
if any(testLogicals); result = circleTest (point,Polygon,r,tol,stepSize); end
The thought process for vectorization in Matlab involves trying to operate on as much data as possible using a single command. Most of the basic builtin Matlab functions operate very efficiently on multi-dimensional data. Using a for loop is the reverse of this, as you are breaking your data down into smaller segments for processing, each of which must be interpreted individually. By resorting to data decomposition using for loops, you potentially lose some of the massive performance benefits associated with the highly optimised code behind the Matlab builtin functions.
The first thing to think about in your example is the conditional break in your main loop. You cannot break from a vectorized process. Instead, calculate all possibilities, make an array of the outcome for each row of your data, then use the any keyword to see if any of your rows have signalled that the circleTest function should be called.
NOTE: It is not easy to efficiently conditionally break out of a calculation in Matlab. However, as you are just computing a form of Euclidean distance in the loop, you'll probably see a performance boost by using the vectorized version and calculating all possibilities. If the computation in your loop were more expensive, the input data were large, and you wanted to break out as soon as you hit a certain condition, then a matlab extension made with a compiled language could potentially be much faster than a vectorized version where you might be performing needless calculation. However this is assuming that you know how to program code that matches the performance of the Matlab builtins in a language that compiles to native code.
Back on topic ...
The first thing to do is to take the linear difference (linDiff in the code example) between Polygon and your row vector point. To do this in a vectorized manner, the dimensions of the 2 variables must be identical. One way to achieve this is to use repmat to copy each row of point to make it the same size as Polygon. However, bsxfun is usually a superior alternative to repmat (as described in this recent SO question), making the code ...
function [result] = newHitTest (point,Polygon,r,tol,stepSize)
result = 0;
linDiff = bsxfun(@minus, Polygon, point);
testLogicals = sqrt( sum( ( linDiff ).^2 ,2 )) < tol*r;
if any(testLogicals); result = circleTest (point,Polygon,r,tol,stepSize); end
I rolled your scalar d into a column vector of distances by summing across the 2nd dimension (note the removal of the row index from Polygon and the addition of ,2 in the sum command). I then went further and evaluated the logical array testLogicals inline with the calculation of the distance measure. You will quickly see that a downside of heavy vectorisation is that it can make the code less readable to those not familiar with Matlab, but the performance gains are worth it. Comments are pretty necessary.
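As a quick sanity check (a sketch on made-up data; only the cheap distance test is compared here, since circleTest is not defined in this snippet), you can confirm that the loop and the vectorized test agree:
Polygon = randn(200,2);    % hypothetical 2-D polygon vertices
point   = randn(1,2);
r = 0.5; tol = 1.2;
loopHit = false;
for k = 1:size(Polygon,1)
    if sqrt(sum((Polygon(k,:)-point).^2)) < tol*r
        loopHit = true;
        break
    end
end
vecHit = any(sqrt(sum(bsxfun(@minus, Polygon, point).^2, 2)) < tol*r);
isequal(loopHit, vecHit)   % should display 1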
Now, if you want to go completely crazy, you could argue that the test function is so simple now that it warrants use of an 'anonymous function' or 'lambda' rather than a complete function definition. The test for whether or not it is worth doing the circleTest does not require the stepSize argument either, which is another reason for perhaps using an anonymous function. You can roll your test into an anonymous function and then just use circleTest in your calling script, making the code self-documenting to some extent . . .
doCircleTest = @(point,Polygon,r,tol) any(sqrt( sum( bsxfun(@minus, Polygon, point).^2, 2 )) < tol*r);
if doCircleTest(point,Polygon,r,tol)
    result = circleTest (point,Polygon,r,tol,stepSize);
else
    result = 0;
end
Now everything is vectorised, the use of function handles gives me another idea . . .
If you plan on performing this at multiple points in the code, the repetition of the if statements would get a bit ugly. To stay DRY (don't repeat yourself), it seems sensible to put the test with the conditional function into a single function, just as you did in your original post. However, the utility of that function would be very narrow - it would only test if the circleTest function should be executed, and then execute it if needs be.
Now imagine that after a while, you have some other conditional functions, just like circleTest, with their own equivalent of doCircleTest. It would be nice to reuse the conditional switching code maybe. For this, make a function like your original that takes a default value, the boolean result of the computationally cheap test function, and the function handle of the expensive conditional function with its associated arguments ...
function result = conditionalFun( default, cheapFunResult, expensiveFun, varargin )
if cheapFunResult
    result = expensiveFun(varargin{:});
else
    result = default;
end
end %//of function
You could call this function from your main script with the following . . .
result = conditionalFun(0, doCircleTest(point,Polygon,r,tol), @circleTest, point,Polygon,r,tol,stepSize);
...and the beauty of it is you can use any test, default value, and expensive function. Perhaps a little overkill for this simple example, but it is where my mind wandered when I brought up the idea of using function handles.
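For instance (a purely hypothetical sketch; polygonTest is a made-up function name, not part of the original code), the same helper could gate a different expensive check behind a cheap bounding-box test:
inBox  = all(point >= min(Polygon,[],1) & point <= max(Polygon,[],1)); % cheap bounding-box test
result = conditionalFun(0, inBox, @polygonTest, point, Polygon, tol);  % polygonTest is hypothetical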

Details of the "New Yale" sparse matrix format?

There's some Netlib code written in Fortran which performs transposes and multiplication on sparse matrices. The library works with Bank-Smith (sort of), "old Yale", and "new Yale" formats.
Unfortunately, I haven't been able to find much detail on "new Yale." I implemented what I think matches the description given in the paper, and I can get and set entries appropriately.
But the results are not correct, leading me to wonder if I've implemented something which matches the description in the paper but is not what the Fortran code expects.
So a couple of questions:
Should row lengths include diagonal entries? e.g., if you have M=[1,1;0,1], it seems that it should look like this:
IJA = [3,4,4,1]
A = [1,1,X,1] // where X=NULL
It seems that if diagonal entries are included in row lengths, you'd get something like this:
IJA = [3,5,6,1]
A = [1,1,X,1]
That doesn't make much sense because IJA[2]=6 should be the size of the IJA/A arrays, but it is what the paper seems to say.
Should the matrices use 1-based indexing?
It is Fortran code after all. Perhaps instead my IJA and A should look like this:
IJA = [4,5,5,2]
A = [1,1,X,1] // still X=NULL
Is there anything else I'm missing?
Yes, that's vague, but I throw that out there in case someone who has messed with this code before would like to volunteer any additional information. Anyone else can feel free to ignore this last question.
I know these questions may seem rather trivial, but I thought perhaps some Fortran folks could provide me with some insight. I'm not used to thinking in a one-based system, and though I've converted the code to C using f2c, it's still written like Fortran.
I can't see how you deduced those vectors from that paper. First the Old Yale format:
M = [7,16;0,-12]
Then, A contains all non-zero values of M in row-form:
A = [7,16,-12]
and IA stores the position in A of the first element of each row, while JA stores the column indices of all the values in A:
IA = [1,3,4]
JA = [1,2,2]
New format: A has the diagonal values first, then a zero, and then the remaining non-zero elements (I have put | to clarify the separation between diagonal and non-diagonal):
A = [7,-12,0 | 16]
IA and JA are combined in IJA, but as far as I can tell from the paper you need to take into account the new ordering of A (I have put | to clarify the separation between IA and JA):
IJA = [1,2,3 | 2]
So, applied to your case M = [1,1;0,1], I get
A = [1,1,0 | 1]
IJA = [1,2,3 | 2]
The first element of the first row is the first in A and the first element of the second row is the second in A; then I put 3, since they say the length of a row is determined by IA(I+1)-IA(I), so I make sure the difference is 1. Then the column indices of the non-zero non-diagonal elements follow, and that is 2.
So, first of all, the reference given in the SMMP paper is possibly not the correct one. I checked it out (the ref) from the library last night. It appears to give the "old Yale" format. It does mention, on pp. 49-50, that the diagonal can be separated out from the rest of the matrix -- but doesn't so much as mention an IJA vector.
I was able to find the format described in the 1992 edition of Numerical Recipes in C on pp. 78-79.
Of course, there is no guarantee that this is the format accepted by the SMMP library from Netlib.
NR seems to have IA giving positions relative to IJA, not relative to JA. The last position in the IA portion points one past the end of the IJA and A vectors (their size plus 1), because the vectors are indexed starting at 1 (per the Fortran standard).
Row lengths do not include non-zero diagonal entries.
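For reference, here is a minimal MATLAB sketch of my reading of the Numerical Recipes row-indexed construction (not necessarily what the Netlib SMMP code expects). For M = [1,1;0,1] it returns ija = [4,5,5,2] and sa = [1,1,0,1], matching the 1-based IJA/A guess in the question:
function [sa, ija] = nr_sprsin(M)
% Row-indexed ("new Yale") sparse storage, following the Numerical Recipes
% description: diagonal first in sa, position n+1 unused, off-diagonals after.
n       = size(M,1);
sa      = diag(M).';
sa(n+1) = 0;                 % unused slot between diagonal and off-diagonal parts
ija     = zeros(1, n+1);
ija(1)  = n + 2;             % IA block points just past itself by convention
k       = n + 1;
for i = 1:n
    for j = 1:n
        if i ~= j && M(i,j) ~= 0
            k = k + 1;
            sa(k)  = M(i,j); % store the off-diagonal value
            ija(k) = j;      % and its column index
        end
    end
    ija(i+1) = k + 1;        % marks where row i+1's off-diagonals start
end
end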
