I saw some similar issues in this forum but I didn't find a real solution to this problem.
I have the following matlab code, in which I work with very big images (182MP):
%step 1: read the image
image=single(imread('image.tif'));
%step 2: read the image segmentation
regions=imread('image_segmentation.ppm');
%step 3: count the number of segments
number_of_regions=numel(unique(regions));
%step 4: regions label
label_regions=unique(regions);
for i=1:number_of_regions
%pick the pixel indexes of the i'th region
[x_region,y_region]=find(regions==label_regions(i));
%the problem starts here
ndvi_region=(image(x_region,y_region,1)-image(x_region,y_region,3))./(image(x_region,y_region,1)+image(x_region,y_region,3));
end
Every time I run the code with specific regions, MATLAB returns the error: Maximum variable size allowed by the program is exceeded.
I'm running the code with 48GB of RAM on my college's cluster. The problem occurs only for region number 43 and below; the other regions run fine.
Is there a smart way for me to run this code?
I believe the problem is in your use of
image(x_region,y_region,1)
I suspect that you think that accesses the N elements of the region; but in fact, you access NxN elements! For a large region, that can easily blow up on you. In general, A(vec1, vec2) creates a section of numel(vec1) by numel(vec2). To solve this, you need to use the sub2ind function to find just the indices you need (or use a single parameter in your find command, and shape your matrix accordingly):
% option 1
[x_region, y_region]=find(regions==label_regions(i));
indx1 = sub2ind(size(image), x_region, y_region, 1*ones(size(x_region)));
indx3 = sub2ind(size(image), x_region, y_region, 3*ones(size(x_region)));
ndvi_region = (image(indx1) - image(indx3))./(image(indx1) + image(indx3));
% option 2
indx = find(regions==label_regions(i));
r_image = reshape(image, [], 3); % assuming you have XxYx3 image
ndvi_region = (r_image(indx, 1) - r_image(indx, 3))./(r_image(indx,1) + r_image(indx, 3));
The second option does make a complete copy of the image, so option 1 is probably quicker.
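To make the difference concrete, here is a minimal sketch on a small matrix (the 5x5 magic square is only a stand-in for one band of the image, and the variable names are made up for the example). It shows that subscript vectors select a full block, while linear indices from sub2ind select exactly one element per pixel:

```matlab
% Stand-in for a single image band; magic(5) holds the values 1..25.
A = magic(5);

% Pixels of a hypothetical "region": the five entries greater than 20.
[r, c] = find(A > 20);

% Subscript vectors give every row in r paired with every column in c.
block = A(r, c);                 % numel(r)-by-numel(c), here 5x5

% Linear indices give one element per (r(k), c(k)) pair.
idx  = sub2ind(size(A), r, c);
vals = A(idx);                   % numel(r)-by-1, here 5x1

fprintf('block has %d elements, vals has %d\n', numel(block), numel(vals));
```

For a region with a million pixels, the first form asks for 1e12 elements, which is consistent with the error appearing only for the largest regions.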
I am running simulations in Anylogic and I'm trying to calibrate the following distribution:
Jump = normal(coef1, coef2, -1, 1);
However, I keep getting the following message as soon as I start the calibration (experimentation):
Random number generation requires too many iterations (> 10000)
I tried to replace -1 and 1 by other values and keep getting the same thing.
I also tried to change the bounds of coef1 and coef2 and put things like [0,1], but I still get the same error.
I don't get it.
Any ideas?
The four-parameter normal method is not deprecated and is not a "calibration where coef1 and coef2 are the coefficients to be solved for". Where did you get that understanding from? Or are you saying that you're using your AnyLogic experiment (possibly a multi-run or optimisation experiment) to 'calibrate' that distribution? In that case you need to explain what you mean by 'calibrate' here: what is your desired outcome?
If you look in the API reference (AnyLogic classes and functions --> API Reference --> com.xj.anylogic.engine --> Utilities), you'll see that it's a method to use a truncated normal distribution.
public double normal(double min,
double max,
double shift,
double stretch)
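The first two parameters are the min and max (it samples repeatedly and discards values outside the [min, max] range); the last two are effectively the mean and standard deviation. So you will get the error you mentioned whenever min and max are set such that a sample almost never lands in range, because the method gives up after too many rejected samples. Note that in your call normal(coef1, coef2, -1, 1), coef1 and coef2 are therefore the bounds, while -1 and 1 are the mean and standard deviation.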
API reference details below:
Generates a sample of truncated Normal distribution. Distribution
normal(1, 0) is stretched by stretch coefficient, then shifted to the
right by shift, after that it is truncated to fit in [min, max]
interval. Truncation is performed by discarding every sample outside
this interval and taking subsequent try. For more details see
normal(double, double)
Parameters:
min - the minimum value that this function will return. The distribution is truncated to return values above this. If the sample
(stretched and shifted) is below this value it will be discarded and
another sample will be drawn. Use -infinity for "No limit".
max - the maximum value that this function will return. The distribution is truncated to return values below this. If the sample
(stretched and shifted) is bigger than this value it will be discarded
and another sample will be drawn. Use +infinity for "No limit".
shift - the shift parameter that indicates how much the (stretched) distribution will shifted to the right = mean value
stretch - the stretch parameter that indicates how much the distribution will be stretched = standard deviation
Returns:
the generated sample
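The rejection scheme described in the API text can be sketched as follows. This is written in MATLAB purely for illustration and is not AnyLogic's actual implementation; the variable names and the retry limit of 10000 (taken from the error message) are assumptions. The example uses bounds where min is greater than max, which is one way an optimiser that varies both bounds freely could make every sample fail:

```matlab
% Hypothetical re-creation of "sample, discard if out of range, retry".
lo       = 0.8;     % min (first argument in the AnyLogic call)
hi       = 0.2;     % max (second argument); lo > hi means an empty interval
shift    = -1;      % mean
stretch  = 1;       % standard deviation
maxTries = 10000;   % assumed limit, matching the "> 10000" in the message

sample = NaN;
tries  = 0;
while tries < maxTries
    tries = tries + 1;
    candidate = shift + stretch * randn();   % stretched, then shifted
    if candidate >= lo && candidate <= hi
        sample = candidate;                  % accepted
        break
    end
end

if isnan(sample)
    disp('No sample accepted: too many iterations');
end
```

The same exhaustion happens, with high probability, when the interval is valid but lies many standard deviations away from the mean.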
According to AnyLogic's documentation, there is no version of normal which takes 4 arguments. Also note that if you specify a mean and standard deviation, the order is unusual (to probabilists/statisticians) by putting the standard deviation before the mean.
I am working on images that are 512x512 pixels; I have written a code that analyzes my images and gives me the values that I need in matrices that have dimensions (512,512,400) in 10 minutes more or less, using pre-allocation.
My problem is when I want to work with these matrices: it takes me hours to see results and I want to implement some script that does what I want in much less time. Can you help me?
% meanm is a matrix (512,512,400) that contains the mean of every inputmatrix
% sigmam is a matrix (512,512,400) that contains the std of every inputmatrix
% Basically what I want is that for every inputmatrix (512x512), that is stored inside
% an array of dimensions (512,512,400),
% if a value is higher than the meanm + sigmam it has to be changed with
% the corresponding value of meanm matrix.
p=400;
for h=1:p
if (inputmatrix(:,:,h) > meanm(:,:,h) + sigmam(:,:,h))
inputmatrix(:,:,h) = meanm(:,:,h);
end
end
I know that MATLAB performs better on matrix calculations but I have no idea how to translate this for loop over my 400 images into something easier for it.
Try using the condition from the if statement inside your for loop to build a logical matrix:
logical_mask = (meanm + sigmam) < inputmatrix;
inputmatrix(logical_mask) = meanm(logical_mask);
This should improve your performance by using two features of MATLAB:
Vectorization uses matrix operations instead of loops. To quote the linked site "Vectorized code often runs much faster than the corresponding code containing loops."
Logical Indexing allows you to access all elements in your array that meet a condition simultaneously.
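A small worked example may help check the masking approach by hand. The 2x2x2 arrays below are made-up stand-ins for the 512x512x400 data. One point worth noting: an if statement applied to a whole matrix is true only when every element of the condition is true, so the original loop replaces a slice entirely or not at all, whereas the mask works element by element, which is what the description asks for.

```matlab
% Tiny stand-ins for the real (512,512,400) arrays.
inputmatrix = cat(3, [1 9; 2 3], [4 4; 10 4]);
meanm       = cat(3, 3*ones(2),  5*ones(2));
sigmam      = cat(3, 2*ones(2),  1*ones(2));

% Element-wise condition over all slices at once.
logical_mask = inputmatrix > (meanm + sigmam);

% Replace only the flagged elements with the corresponding mean.
inputmatrix(logical_mask) = meanm(logical_mask);

disp(inputmatrix(:,:,1));   % the 9 (above 3+2) has become 3
disp(inputmatrix(:,:,2));   % the 10 (above 5+1) has become 5
```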
I would like to introduce an interesting MATLAB programming problem I’ve encountered in my research. The solution may be of use to people doing computations on very large data sets. It involves striking a balance between RAM and CPU usage using parfor. Because my data is so large, files must be read in over and over again to be processed. The other issue it introduces is finding an optimal algorithm for multiplication, summation and averaging of very large vectors and matrices.
I have found a solution, but it’s time intensive and I would like to see if the community sees any room for improvement. Here’s the general form of the problem.
Suppose we have about 30,000 functions that we’ve taken the Fourier transforms of. Each transform has the form e(k)=a(k)+b(k)*i where k is the magnitude of a wavevector, a is the real component and b is the imaginary component. Each of these transforms is saved to file as a 2-column table with the structure below. The number of elements in each vector is about N=1e6. This means that each of these files is 1/64 GB in size. Note that the values of k_i are not necessarily in order.
k | Re(e) Im(e)
k_1 | a(1) b(1)
k_2 | a(2) b(2)
... |
k_N | a(N) b(N)
The goal is to cross-multiply each pair of modes and average the results over a set of about 50 fixed k-bands. So for example, if we let the elements of vectors 5 and 7 be represented respectively as e5=a5+b5*i and e7=a7+b7*i we need
a5(1)*a7(1) + b5(1)*b7(1)
a5(2)*a7(2) + b5(2)*b7(2)
...
a5(N)*a7(N) + b5(N)*b7(N)
Each element of the above N-dimensional vector belongs within a single k-bin. All the elements in each bin must be averaged and returned. So at the end of one mode-mode comparison we end up with just 50 numbers.
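For a single pair of modes, the computation described above can be sketched in a few lines. The four-element vectors and the bin assignment binIdx are invented for illustration (the real vectors have about 1e6 elements and 50 bins), and accumarray is used here as one possible way to do the per-bin averaging, not as the method used in the code below:

```matlab
% Toy real and imaginary parts for two modes (N = 4 instead of ~1e6).
a5 = [1; 2; 3; 4];   b5 = [0; 1; 0; 1];
a7 = [2; 2; 2; 2];   b7 = [1; 1; 1; 1];

% Hypothetical k-bin index for each row (2 bins instead of ~50).
binIdx = [1; 2; 1; 2];
nBins  = 2;

% Element-wise cross product: a5.*a7 + b5.*b7.
prod57 = a5 .* a7 + b5 .* b7;        % [2; 5; 6; 9]

% Average the products that fall into each k-bin.
binMean = accumarray(binIdx, prod57, [nBins 1], @mean);

disp(binMean);                       % one number per bin
```

Because the bin assignment depends only on k, binIdx would be computed once and reused for every mode pair, assuming all files share the same k ordering.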
I have at my disposal a computer with 512GB of RAM and 48 processors. My version of MATLAB 2013a limits me to only opening 12 processors with parfor simultaneously on each instance of MATLAB that I run. So what I’ve been doing is opening 4 versions of MATLAB, allocating 10 processors each, and sending the maximum amount of data to each processor without spilling over my self-imposed limit of 450 GB.
Naturally this involves my breaking the problem into blocks. If I have 30000 vectors e, then I will have 30000^2 sets of these 50 cross-coefficients. (It’s actually about half this since the problem is symmetric). I decide to break my problem into blocks of size m=300. This means I’d have 100 rows and columns of these blocks. The code I’m running looks like this (I’m simplifying a bit to just include the relevant bits):
for ii=1:100 % for each of the block-rows
[a,b] = f_ReadModes(ii); % this function reads modes 1 through 300 if ii=1,
% modes 301 through 600 if ii=2 and so on
% “a” and “b” are matrices of size Nx300
% “a” contains the real vectors stored in columns,
% “b” contains the imaginary vectors stored in columns
for jj=ii:100 % for each of the block-columns in the upper triangle
[c,d] = f_ReadModes(jj); % same as above except this reads in a different
% set of 300 modes
block = zeros(300,300,50); % allocates space for the results. The first
% dimension corresponds to the "ii modes".
% The 2nd dimension corresponds to the “jj”
% modes. The 3rd dimension is for each k-bin.
parfor rr=1:300
A = zeros(1,300,50); % temporary storage to keep parfor happy
ModeMult = bsxfun(@times,a(:,rr),c)+bsxfun(@times,b(:,rr),d);
% My strategy is to cross multiply one mode by all the others before
% moving on. So when rr=6, I’m multiplying the 6th mode in a (and b)
% by all the modes in c and d. This would help fill in the 6th row
% of block, i.e. block(6,:,:).
for kk=1:50 % Now I average the results in each of the k-bins
ind_dk = f_kbins(kk); % this returns the rows of a,b,c,d and
% ModeMult that lie within the kk^th bin
A(1,:,kk) = mean(ModeMult(ind_dk,:)); % average results in each bin
end
block(rr,:,:) = A; % place the results in more permanent storage.
end
f_WriteBlock(block); % writes the resulting cross-coefficient block to disk
end
end
There are three bottlenecks in this problem:
1) Read-in time
2) Computing the products ac and bd then summing them (this is ModeMult)
3) Averaging the results of step 2 in each k-bin
Bigger blocks are preferable since they necessitate fewer read-ins. However, the computations in steps 2 and 3 don’t automatically parallelize, so they have to be sent to individual processors using parfor. Because the computational costs are high, utilizing all the processors seems necessary.
The way my code is written, each processor needs enough memory to hold 4*N*m elements. When m=300, this means each processor is using about 10 GB of RAM. It would be nice if the memory requirement for each processor could be lowered somehow. It would also be great if the computations in steps 2 and 3 could be rewritten to run more efficiently. Any thoughts?
Suppose that I have these three variables in MATLAB: NewGrayLevels, OldGrayLevels and OldHistogram.
I want to extract the distinct values in NewGrayLevels and, for each distinct value, sum the rows of OldHistogram that sit in the same rows as that value.
For example you see in NewGrayLevels that the six first rows are equal to zero. It means that 0 in the NewGrayLevels has taken its value from (0 1 2 3 4 5) of OldGrayLevels. So the corresponding rows in OldHistogram should be summed.
So 0+2+12+38+113+163=328 would be the frequency of the gray level 0 in the equalized histogram and so on.
Those who are familiar with image processing know that it's part of the histogram equalization algorithm.
Note that I don't want to use built-in function "histeq" available in image processing toolbox and I want to implement it myself.
I know how to write the algorithm with for loops. I'm seeking if there is a faster way without using for loops.
The code using for loops:
for k=0:255
Condition = NewGrayLevels==k;
ConditionMultiplied = Condition.*OldHistogram;
NewHistogram(k+1,1) = sum(ConditionMultiplied);
end
I'm afraid this code will get slow for big, high-resolution images. The variables that I have uploaded are for a small image downloaded from the internet, but my code may be used for satellite images.
I know you say you don't want to use histeq, but it might be worth your time to look at the MATLAB source file to see how the developers wrote it and copy the parts of their code that you would like to implement. Just do edit('histeq') or edit('histeq.m'), I forget which.
Usually the MATLAB code is vectorized where possible and runs pretty quick. This could save you from having to reinvent the entire wheel, just the parts you want to change.
I can't think of a way to implement this without a for loop somewhere, but one optimisation you could make would be using indexing instead of multiplication:
for k=0:255
Condition = NewGrayLevels==k; % These act as logical indices to OldHistogram
NewHistogram(k+1,1) = sum(OldHistogram(Condition)); % Removes a vector multiplication, some additions, and an index-to-double conversion
end
Edit:
On rereading your initial post, I think that the way to do this without a for loop is to use accumarray (I find this a difficult function to understand, so read the documentation and search online and on here for examples to do so):
NewHistogram = accumarray(1+NewGrayLevels,OldHistogram);
This should work so long as your maximum value in NewGrayLevels (+1 because you are starting at zero) is equal to the length of OldHistogram.
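Here is a minimal sketch of the accumarray approach on made-up data. The six-row vectors are illustrative only; the histogram counts are borrowed from the example in the question. Two details are assumptions about how the real variables are stored: the cast to double guards against NewGrayLevels being uint8 (where adding 1 to 255 would saturate at 255), and the explicit size [256 1] forces a full 256-bin result even if the highest gray levels never occur.

```matlab
% Made-up mapping: old levels 0..5 map to new levels 0, 0, 0, 2, 2, 5.
NewGrayLevels = uint8([0; 0; 0; 2; 2; 5]);
OldHistogram  = [0; 2; 12; 38; 113; 163];

% Sum the old counts that share the same new gray level.
% +1 converts gray level k to MATLAB's 1-based bin k+1.
NewHistogram = accumarray(double(NewGrayLevels) + 1, OldHistogram, [256 1]);

fprintf('level 0: %d, level 2: %d, level 5: %d\n', ...
        NewHistogram(1), NewHistogram(3), NewHistogram(6));
```

The total count is preserved, so sum(NewHistogram) equals sum(OldHistogram), which makes a quick sanity check.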
Well I understood that there's no need to write the code that @Hugh Nolan suggested. See the explanation here:
%The green lines are because after writing the code, I understood that
%there's no need to calculate the equalized histogram in
%"HistogramEqualization" function and after gaining the equalized image
%matrix you can pass it to the "ExtractHistogram" function
% (which there's no loops in it) to acquire the
%equalized histogram.
%But I didn't delete those lines of code because I had tried a lot to
%understand the algorithm and write them.
For more information and studying the code, please see my next question.
I implemented MATLAB code that reads a wav file and does some analysis on it.
The size of the wav file is about 3-4 GB.
when I run the file I get the following error:
"Out of memory. Type HELP MEMORY for your options"
I tried to increase the virtual memory, but it didn't work.
Following is the code I am using:
event=0;
[x, fs] = wavread('C:\946707752529.wav');
Ts=1/fs;% sampling period
N = length(x);% N is the number of samples
slength = N*Ts;%slength is the length of the sound file in seconds
% Reading the first 180 seconds and find the energy, then set a threshold value
calibration_samples = 180 * fs;
[x2, Fs] = wavread('C:\946707752529.wav', calibration_samples);
Tss=1/Fs;
[b,a]=butter(5,[200 800]/(Fs/2));
y2=filter(b,a,x2);
%This loop is to find the average energy of the samples for the first 180 seconds
startSample=1;
endSample=Fs;
energy=0;
for i=1:180
energy=energy+sum(y2(startSample:endSample).^2)*Tss;
startSample=endSample+1;
endSample=endSample+Fs;
end
mean_energy=energy/180;
Reference_Energy=mean_energy;% this is a reference energy level
Threshold=0.65*Reference_Energy;
% Now filtering the whole recorded file to be between [200-800] Hz
[b,a]=butter(5,[200 800]/(fs/2));
y=filter(b,a,x);
N = length(y);
N=N/fs; % how many iteration we need
startSample=1;
endSample=fs;
energy=0;
j=1;
while( j<=N)
counter=0;
energy=sum(y(startSample:endSample).^2)*Ts;
if (energy<=Threshold)
counter=counter+1;
for k=1:9
startSample=endSample+1;
endSample=endSample+fs;
energy=sum(y(startSample:endSample).^2)*Ts;
if (energy<=Threshold)
counter=counter+1;
else
break;
end %end inner if
end % end inner for
end % end outer IF
if(counter>=10)
event=event+1;
end
if(counter>0)
j=j+counter;
else
j=j+1;
end
startSample=endSample+1;
endSample=endSample+fs;
end % end outer while
System: Windows 7 64 bit
RAM: 8 GB
Matlab: 2013
I guess wavread actually stores all data of the wave file into system memory. Moreover, it may add extra information.
I see that you are calling this function two times, storing results in different matrices, so as your file is 3-4G, you need at least 6-8G of memory. However your OS, Matlab and maybe other programs also need some memory, that's why you have this out of memory error.
One solution is to divide the WAV file into multiple files and read them separately. Another solution is to call wavread only once, and use the loaded data wherever you need it, but without reallocating new memory for that.
Judging from your code this might work:
After reading in the file remove everything except the first 180 secs
Determine the threshold value
Clear memory of everything except the first piece of data
Analyse the piece of data and store the result
Clear memory of everything except the next piece of data...
This is assuming that your algorithm is correct and efficient.
It may also be the case that your algorithm has a problem. To detect this, run the code with dbstop if error and check the size of all variables when it errors out. If one of them is much bigger than expected, you may have found the mistake.
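The piece-by-piece idea from the answers above can be sketched as follows. Several things here are assumptions made so the example runs on its own: it writes a short synthetic tone to a temporary file instead of using the real recording, it uses audioread with a sample range (the older wavread also accepts a [first last] range), and the two-tap averaging filter is only a stand-in for the Butterworth coefficients from butter. The key detail is carrying the filter state zf from one chunk to the next, so the chunked result matches filtering the whole signal at once.

```matlab
% Build a small test file (5 s of a 440 Hz tone) as a stand-in for the real wav.
fs = 8000;
x  = 0.1 * sin(2*pi*440*(0:5*fs-1)'/fs);
wavFile = [tempname '.wav'];
audiowrite(wavFile, x, fs);

info     = audioinfo(wavFile);
nSeconds = floor(info.TotalSamples / info.SampleRate);

b = [0.5 0.5];  a = 1;     % stand-in filter; the original code uses butter(...)
zf = [];                   % filter state carried between chunks
energyChunked = zeros(nSeconds, 1);

for s = 1:nSeconds
    first = (s-1)*info.SampleRate + 1;
    last  = s*info.SampleRate;
    chunk = audioread(wavFile, [first last]);    % only one second in memory
    [y, zf] = filter(b, a, chunk, zf);           % continue from previous state
    energyChunked(s) = sum(y.^2) / info.SampleRate;
end

% Reference: filter the whole file at once (feasible only for this small file).
whole = filter(b, a, audioread(wavFile));
energyWhole = sum(reshape(whole(1:nSeconds*fs), fs, nSeconds).^2, 1)' / fs;
delete(wavFile);

fprintf('max difference: %g\n', max(abs(energyChunked - energyWhole)));
```

With this structure only one second of audio is held in memory at a time, and the threshold comparison from the original loop can be applied to energyChunked(s) as each chunk is processed.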