Matlab : image region analyzer. Alternative for 'bwpropfilt'? - image

I'm running basic edge detection to detect windows region based on this http://www.mathworks.com/videos/edge-detection-with-matlab-119353.html
The edge works successfully :
final_edge = edge(gray_I,'sobel');
BW_out = bwareaopen(imfill(final_edge,'holes'),20);
figure;
imshow(BW_out);
Now when come to these following codes to filter image based on properties, it seems like my MATLAB R2013a can't identify this bwpropfilt method.
% imageRegionAnalyzer(BW);
% Filter image based on image properties
BW_out = bwpropfilt(BW_out,'Area', [400, 467]);
BW_out = bwpropfilt(BW_out,'Solidity',[0.5, 1]);
It says:
Undefined function 'bwpropfilt' for input arguments of type 'char'.
Then what should be my alternative to change this bwpropfilt?

bwpropfilt simply takes a look at the corresponding attribute that is output from regionprops and gives you objects that conform to that certain range and also filtering out those that are outside of the range. You can rewrite the algorithm by explicitly calling regionprops, creating a logical array to index into the structure to retain only the values within the right range (seen in the third input of bwpropfilt) corresponding to the property you want to examine (seen in the second input of bwpropfilt). If you want to finally reconstruct the image after filtering, you'll need to use the column major linear indices found in the PixelIdxList attribute, stack them all into a single vector and write to a new output image by setting all of these values to true.
Specifically, you can use the following code to reproduce the last two lines of code you have shown:
% Run regionprops and get all properties
s = regionprops(BW_out, 'all');
%%% For the first line of code
values = [s.Area];
s = s(values > 400 & values < 467);
%%% For the second line of code
values = [s.Solidity];
s = s(values > 0.5 & values < 1);
% Stack column major indices
ind = vertcat(s.PixelIdxList);
% Create output image
final_out = false(size(BW_out));
final_out(ind) = true;
final_out contains the filtered image only retaining the values within the range specified by the desired property.
Caution
The above logic only works for attributes returned from regionprops that contain only a single scalar value per unique region. If you examine the supported properties found in bwpropfilt, you will see that this list is a subset of the full list found in regionprops. This makes sense as certain regionprops properties return a vector or a matrix depending on what you choose so using a range to filter out properties becomes ambiguous if you have multiple values that characterize a particular unique region returned by regionprops.
Minor Note
Being curious, I opened up bwpropfilt to see how it is implemented as I currently have MATLAB R2016a. The above logic, with the exception of some exception handling, is essentially how bwpropfilt has been implemented so the code that I wrote is in line with the logic of the function.

Related

"Different row counts implied by arguments" in attempt to plot BAM file data

I'm attempting to use this tutorial to manipulate and plot ATAC-sequencing data. I have all the libraries listed in that tutorial installed and loaded, except while they use biocLite(BSgenome.Hsapiens.UCSC.hg19) for the human genome, I'm using biocLite(TxDb.Mmusculus.UCSC.mm10.knownGene) for the mouse genome.
Here I have loaded in my BAM file
sorted_AL1.1BAM <-"Sorted_1_S1_L001_R1_001.fastq.gz.subread.bam"
And created an object called TSS, which is transcription start site regions from the mouse genome. I want to ultimately plot the average signal in my read data across mouse transcription start sites.
TSSs <- resize(genes(TxDb.Mmusculus.UCSC.mm10.knownGene), fix = "start", 1)
The problem occurs with the following code:
nucFree <- regionPlot(bamFile = sorted_AL1.1BAM, testRanges = TSSs, style = "point",
format = "bam", paired = TRUE, minFragmentLength = 0, maxFragmentLength = 100,
forceFragment = 50)
The error is as follows:
Reading Bam header information.....Done
Filtering regions which extend outside of genome boundaries.....Done
Filtered 24528 of 24528 regions
Splitting regions by Watson and Crick strand..Error in DataFrame(..., check.names = FALSE) :
different row counts implied by arguments
I assume my BAM file contains empty values that need to be changed to NAs. My issue is that I'm not sure how to visualize and manipulate BAM files in R in order to do this. Any help would be appreciated.
I tried the following:
data.frame(sorted_AL1.1BAM)
sorted_AL1.1BAM[sorted_AL1.1BAM == ''] <- NA
I expected this to resolve the issue of different row counts, but I get the same error message.

np.where() with np.linspace()

I am trying to find index of an element in x_norm array with np.where() but it doesn' t work well.
Is there a way to find index of element?
x_norm = np.linspace(-10,10,1000)
np.where(x_norm == -0.19019019)
Np.where works with np.arange() and can find the index either first or last element of array creating by linspace.
The numbers generated by np.linspace contains more decimal places than the one you are pasting to np.where (-0.19019019019019012).
So it might be better to use np.argmin to find the nearest value and avoid rounding errors:
x_norm = np.linspace(-10,10,1000)
yournumber=-0.19019019
idx=np.argmin(np.abs(x_norm-yournumber))
You can then go further and add np.where(x_norm==x_norm[idx]) to your code in case if you'll have array with duplicates.
Set the level of precision to 8 using np.round then use np.where to filter data as a mask then apply the mask to the array.
x_norm = np.round(np.asarray(np.linspace(-10,10,1000)),8)
results=x_norm[np.where(x_norm==-9.91991992)]
print(results)

How to setup my function with blockproc to process the image in parts?

I have an image:
I want to divide this image into 3 equal parts and calculate the SIFT for each part individually and then concatenate the results.
I found out that Matlab's blockproc does just that, but I do not know how to get it to work with my function. Here is what I have:
[r c] = size(image);
c_new = floor(c/3); %round it
B = blockproc(image, [r c_new], #block_fun)
So according to Matlabs documentation the function, block_fun will be applied to the original image in blocks of size r and c_new.
this is what I wrote as block_fun
function feats = block_fun(img)
[keypoints, descriptors] = vl_sift(single(img));
feats = descriptors;
end
So, my matrix B should be a concatenation of the SIFT descriptors of all three parts of the same image? right?
But the error that I get when I run the command:
B = blockproc(image, [r c_new], #block_fun)
Function BLOCKPROC encountered an error while evaluating the user
supplied function handle, FUN.
The cause of the error was:
Error using single Conversion to single from struct is not possible.
For your custom function, blockproc sends in a structure where the image data is stored in a field called data. As such, you simply need to change your function so that it accesses the data field in the input. Like so:
function feats = block_fun(block_struct) %// Change
[keypoints, descriptors] = vl_sift(single(block_struct.data)); %// Change
feats = descriptors;
end
This error is caused by the fact that the function that is called via its handle by blockproc expects a block struct.
The real problem is that blockproc will attempt to concatenate all results and you will have a different set of 128xN feature vectors for each block, which blockproc doesn't allow.
I think that using im2col and reshape would be much more simple.

Matlab Query: Image Processing, Editing the Script

I am quite new to image processing and would like to produce an array that stores 10 images. After which I would like to run a for loop through some code that identifies some properties of the images, specifically the surface area of a biological specimen, which then spits out an array containing 10 areas.
Below is what I have managed to scrap up so far, and this is the ensuing error message:
??? Index exceeds matrix dimensions.
Error in ==> Testing1 at 14
nova(i).img = imread([myDir B(i).name]);
Below is the code I've been working on so far:
my_Dir = 'AC04/';
ext_img='*.jpg';
B = dir([my_Dir ext_img]);
nfile = max(size(B));
nova = zeros(1,nfile);
for i = 1:nfile
nova(i).img = imread([myDir B(i).name]);
end
areaarray = zeros(1,nfile);
for k = 1:nfile
[nova(k), threshold] = edge(nova(k), 'sobel');
.
.
.
.%code in this area is irrelevant to the problem I think%
.
.
.
areaarray(k) = bwarea(BWfinal);
end
areaarray
There are few ways you could store an image in a kind of an array structure in Matlab. You could use array of structs. In that case you could do as you did:
nova(i).img = imread([myDir B(i).name]);
You access first image with nova(1).img, second one with nova(2).img etc.
Other way to do it is to use cell array (similar to arrays but are more flexible in the sense that members could be of the different type):
nova{i} = imread([myDir B(i).name]);
You access first image with nova{1}, second one with nova{2} etc.
[ IMPORTANT ] In both cases you should remove this line from code:
nova = zeros(1,nfile);
I suppose you've tried to pre-allocate memory for images, and since you're beginner I advise you not to be concerned with it. It is an optimization concern to be addressed if you come across some performance issues - and if you don't come across them, take advantage of Matlab's automatic memory (re)allocation.

Removing a "row" from a structure array

This is similar to a question I asked before, but is slightly different:
So I have a very large structure array in matlab. Suppose, for argument's sake, to simplify the situation, suppose I have something like:
structure(1).name, structure(2).name, structure(3).name structure(1).returns, structure(2).returns, structure(3).returns (in my real program I have 647 structures)
Suppose further that structure(i).returns is a vector (very large vector, approximately 2,000,000 entries) and that a condition comes along where I want to delete the jth entry from structure(i).returns for all i. How do you do this? or rather, how do you do this reasonably fast? I have tried some things, but they are all insanely slow (I will show them in a second) so I was wondering if the community knew of faster ways to do this.
I have parsed my data two different ways; the first way had everything saved as cell arrays, but because things hadn't been working well for me I parsed the data again and placed everything as vectors.
What I'm actually doing is trying to delete NaN data, as well as all data in the same corresponding row of my data file, and then doing the very same thing after applying the Hampel filter. The relevant part of my code in this attempt is:
for i=numStock+1:-1:1
for j=length(stock(i).return):-1:1
if(isnan(stock(i).return(j)))
for k=numStock+1:-1:1
stock(k).return(j) = [];
end
end
end
stock(i).return = sort(stock(i).return);
stock(i).returnLength = length(stock(i).return);
stock(i).medianReturn = median(stock(i).return);
stock(i).madReturn = mad(stock(i).return,1);
end;
for i=numStock:-1:1
for j = length(stock(i+1).volume):-1:1
if(isnan(stock(i+1).volume(j)))
for k=numStock:-1:1
stock(k+1).volume(j) = [];
end
end
end
stock(i+1).volume = sort(stock(i+1).volume);
stock(i+1).volumeLength = length(stock(i+1).volume);
stock(i+1).medianVolume = median(stock(i+1).volume);
stock(i+1).madVolume = mad(stock(i+1).volume,1);
end;
for i=numStock+1:-1:1
for j=stock(i).returnLength:-1:1
if (abs(stock(i).return(j) - stock(i).medianReturn) > 3*stock(i).madReturn)
for k=numStock+1:-1:1
stock(k).return(j) = [];
end
end;
end;
end;
for i=numStock:-1:1
for j=stock(i+1).volumeLength:-1:1
if (abs(stock(i+1).volume(j) - stock(i+1).medianVolume) > 3*stock(i+1).madVolume)
for k=numStock:-1:1
stock(k+1).volume(j) = [];
end
end;
end;
end;
However, this returns an error:
"Matrix index is out of range for deletion.
Error in Failure (line 110)
stock(k).return(j) = [];"
So instead I tried by parsing everything in as vectors. Then I decided to try and delete the appropriate entries in the vectors prior to building the structure array. This isn't returning an error, but it is very slow:
%% Delete bad data, Hampel Filter
% Delete bad entries
id=strcmp(returns,'');
returns(id)=[];
volume(id)=[];
date(id)=[];
ticker(id)=[];
name(id)=[];
permno(id)=[];
sp500(id) = [];
id=strcmp(returns,'C');
returns(id)=[];
volume(id)=[];
date(id)=[];
ticker(id)=[];
name(id)=[];
permno(id)=[];
sp500(id) = [];
% Convert returns from string to double
returns=cellfun(#str2double,returns);
sp500=cellfun(#str2double,sp500);
% Delete all data for which a return is not a number
nanid=isnan(returns);
returns(nanid)=[];
volume(nanid)=[];
date(nanid)=[];
ticker(nanid)=[];
name(nanid)=[];
permno(nanid)=[];
% Delete all data for which a volume is not a number
nanid=isnan(volume);
returns(nanid)=[];
volume(nanid)=[];
date(nanid)=[];
ticker(nanid)=[];
name(nanid)=[];
permno(nanid)=[];
% Apply the Hampel filter, and delete all data corresponding to
% observations deleted by the filter.
medianReturn = median(returns);
madReturn = mad(returns,1);
for i=length(returns):-1:1
if (abs(returns(i) - medianReturn) > 3*madReturn)
returns(i) = [];
volume(i)=[];
date(i)=[];
ticker(i)=[];
name(i)=[];
permno(i)=[];
end;
end
medianVolume = median(volume);
madVolume = mad(volume,1);
for i=length(volume):-1:1
if (abs(volume(i) - medianVolume) > 3*madVolume)
returns(i) = [];
volume(i)=[];
date(i)=[];
ticker(i)=[];
name(i)=[];
permno(i)=[];
end;
end
As I said, this is very slow, probably because I'm using a for loop on a very large data set; however, I'm not sure how else one would do this. Sorry for the gigantic post, but does anyone have a suggestion as to how I might go about doing what I'm asking in a reasonable way?
EDIT: I should add that getting the vector method to work is probably preferable, since my aim is to put all of the return vectors into a matrix and get all of the volume vectors into a matrix and perform PCA on them, and I'm not sure how I would do that using cell arrays (or even if princomp would work on cell arrays).
EDIT2: I have altered the code to match your suggestion (although I did decide to give up speed and keep with the for-loops to keep with the structure array, since reparsing this data will be way worse time-wise). The new code snipet is:
stock_return = zeros(numStock+1,length(stock(1).return));
for i=1:numStock+1
for j=1:length(stock(i).return)
stock_return(i,j) = stock(i).return(j);
end
end
stock_return = stock_return(~any(isnan(stock_return)), : );
This returns an Index exceeds matrix dimensions error, and I'm not sure why. Any suggestions?
I could not find a convenient way to handle structures, therefore I would restructure the code so that instead of structures it uses just arrays.
For example instead of stock(i).return(j) I would do stock_returns(i,j).
I show you on a part of your code how to get rid of for-loops.
Say we deal with this code:
for j=length(stock(i).return):-1:1
if(isnan(stock(i).return(j)))
for k=numStock+1:-1:1
stock(k).return(j) = [];
end
end
end
Now, the deletion of columns with any NaN data goes like this:
stock_return = stock_return(:, ~any(isnan(stock_return)) );
As for the absolute difference from medianVolume, you can write a similar code:
% stock_return_length is a scalar
% stock_median_return is a column vector (eg. [1;2;3])
% stock_mad_return is also a column vector.
median_return = repmat(stock_median_return, stock_return_length, 1);
is_bad = abs(stock_return - median_return) > 3.* stock_mad_return;
stock_return = stock_return(:, ~any(is_bad));
Using a scalar for stock_return_length means of course that the return lengths are the same, but you implicitly assume it in your original code anyway.
The important point in my answer is using any. Logical indexing is not sufficient in itself, since in your original code you delete all the values if any of them is bad.
Reference to any: http://www.mathworks.co.uk/help/matlab/ref/any.html.
If you want to preserve the original structure, so you stick to stock(i).return, you can speed-up your code using essentially the same scheme but you can only get rid of one less for-loop, meaning that your program will be substantially slower.

Resources