In the CS231n course of Stanford (http://cs231n.github.io/linear-classify/#loss), we train a linear classifier on images from CIFAR-10, and when plotting the weight matrix for each class, we get an image that resembles the class predicted. How is that a consequence of the training process? What is the mathematical reason behind it ?
Linear classifier: f(x, W, b) = W*x + b
input x is a flattened image
each row of W can be reconstructed to become an image
Related
I have to do an exercise about the large deconvolution problem in astrophysics. First, we use the Moffat PSF to model atmospheric effects. Here's its expression :
I have generated a starting image from this PSF, but with adding white noise noise and background, like this :
Here's an example of starting image (30x30 pixels) :
Now, we consider the location of star and PSF parameters (r0,c0,alpha,beta) are known.
I have to answer to these questions :
For the first question, I am looking for the estimation of amplitude "a" of PSF and of value of background "b". I have to formulate the expression (1) under matricial form with a known matrix H and a set of data "d" (It seems that "d" array is the vector 1D expression of image 2D : d = D(:); and D = reshape(d,Rows,Columns).
So my question is : What is this expression with "known matrix H" ? I can't find in my courses this matricial expression that could allow to estimate the amplitude "a" of star and the estimation of "background" "b" ?
Which H known matrix is considered into this problem ?
Well, I was trying out convolution on grey scale images, but then when I searched for convolution on rgb images, I couldn't find satisfactory explanation. How to apply convolution to rgb images?
A linear combination of vectors can be computed by linearly combining corresponding vector elements:
a * [x1, y1, z1] + b * [x2, y2, z2] = [a*x1+b*x2, a*y1+b*y2 , a*z1+b*z2]
Because a convolution is a linear operation (i.e. you weight each pixel within a neighborhood and add up the results), it follows that you can apply a convolution to each of the RGB channels independently (e.g. using MATLAB syntax):
img = imread(...);
img(:,:,1) = conv2(img(:,:,1),kernel);
img(:,:,2) = conv2(img(:,:,2),kernel);
img(:,:,3) = conv2(img(:,:,3),kernel);
You can look at this in two different ways: First, you may convert the color image into an intensity image with a normal vector. The most applicable one is (.299, .587, .114) which is the natural gray scale conversion. To get intensity you need to convert I = .299*R + .587*G + .114*B.
If you are designing your own convolutional network and intend to keep the color channels as inputs, just treat colored image as a 4D tensor with 3 channels. For example if you have a (h x w) image, the tensor size is (1 x h x w x 3) and you may use a filter of size (kh x kw x 3 x f) which kh and kw are your filter sizes and f is the required output features.
I have the following code in MATLAB:
I=imread(image);
h=fspecial('gaussian',si,sigma);
I=im2double(I);
I=imfilter(I,h,'conv');
figure,imagesc(I),impixelinfo,title('Original Image after Convolving with gaussian'),colormap('gray');
How can I define and apply a Gaussian filter to an image without imfilter, fspecial and conv2?
It's really unfortunate that you can't use the some of the built-in methods from the Image Processing Toolbox to help you do this task. However, we can still do what you're asking, though it will be a bit more difficult. I'm still going to use some functions from the IPT to help us do what you're asking. Also, I'm going to assume that your image is grayscale. I'll leave it to you if you want to do this for colour images.
Create Gaussian Mask
What you can do is create a grid of 2D spatial co-ordinates using meshgrid that is the same size as the Gaussian filter mask you are creating. I'm going to assume that N is odd to make my life easier. This will allow for the spatial co-ordinates to be symmetric all around the mask.
If you recall, the 2D Gaussian can be defined as:
The scaling factor in front of the exponential is primarily concerned with ensuring that the area underneath the Gaussian is 1. We will deal with this normalization in another way, where we generate the Gaussian coefficients without the scaling factor, then simply sum up all of the coefficients in the mask and divide every element by this sum to ensure a unit area.
Assuming that you want to create a N x N filter, and with a given standard deviation sigma, the code would look something like this, with h representing your Gaussian filter.
%// Generate horizontal and vertical co-ordinates, where
%// the origin is in the middle
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
%// Create Gaussian Mask
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
%// Normalize so that total area (sum of all weights) is 1
h = h / sum(h(:));
If you check this with fspecial, for odd values of N, you'll see that the masks match.
Filter the image
The basics behind filtering an image is for each pixel in your input image, you take a pixel neighbourhood that surrounds this pixel that is the same size as your Gaussian mask. You perform an element-by-element multiplication with this pixel neighbourhood with the Gaussian mask and sum up all of the elements together. The resultant sum is what the output pixel would be at the corresponding spatial location in the output image. I'm going to use the im2col that will take pixel neighbourhoods and turn them into columns. im2col will take each of these columns and create a matrix where each column represents one pixel neighbourhood.
What we can do next is take our Gaussian mask and convert this into a column vector. Next, we would take this column vector, and replicate this for as many columns as we have from the result of im2col to create... let's call this a Gaussian matrix for a lack of a better term. With this Gaussian matrix, we will do an element-by-element multiplication with this matrix and with the output of im2col. Once we do this, we can sum over all of the rows for each column. The best way to do this element-by-element multiplication is through bsxfun, and I'll show you how to use it soon.
The result of this will be your filtered image, but it will be a single vector. You would need to reshape this vector back into matrix form with col2im to get our filtered image. However, a slight problem with this approach is that it doesn't filter pixels where the spatial mask extends beyond the dimensions of the image. As such, you'll actually need to pad the border of your image with zeroes so that we can properly do our filter. We can do this with padarray.
Therefore, our code will look something like this, going with your variables you have defined above:
N = 5; %// Define size of Gaussian mask
sigma = 2; %// Define sigma here
%// Generate Gaussian mask
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
h = h / sum(h(:));
%// Convert filter into a column vector
h = h(:);
%// Filter our image
I = imread(image);
I = im2double(I);
I_pad = padarray(I, [floor(N/2) floor(N/2)]);
C = im2col(I_pad, [N N], 'sliding');
C_filter = sum(bsxfun(#times, C, h), 1);
out = col2im(C_filter, [N N], size(I_pad), 'sliding');
out contains the filtered image after applying a Gaussian filtering mask to your input image I. As an example, let's say N = 9, sigma = 4. Let's also use cameraman.tif that is an image that's part of the MATLAB system path. By using the above parameters, as well as the image, this is the input and output image we get:
I would like to fit a MR binary data of 281*398*104 matrix which is not a perfect sphere, and find out the center and radius of sphere and error also. I know LMS or SVD is a good choice to fit for sphere.
I have tried sphereFit from matlab file exchange but got an error,
>> sphereFit(data)
Warning: Matrix is singular to working precision.
> In sphereFit at 33
ans =
NaN NaN NaN
Would you let me know where is the problem, or any others solution?
If you want to use sphere fitting algorithm you should first extract the boundary points of the object you assume to be a sphere. The result should be represented by a N-by-3 array containing coordinates of the points. Then you can apply sphereFit function.
In order to obtain boundary point of a binary object, there are several methods. One method is to apply morphological erosion (you need the "imerode" function from the image processing toolbox) with small structuring element, then compute set difference between the two images, and finally use the "find" function to transform binary image into a coordinate array.
the idea is as follow:
dataIn = imerode(data, ones([3 3 3]));
bnd = data & ~data2;
inds = find(bnd);
[y, x, z] = ind2sub(size(data), inds); % be careful about x y order
points = [x y z];
sphere = sphereFitting(points);
By the way, the link you gave refers to circle fitting, I suppose you wanted to point to a sphere fitting submission?
regards,
I have looked at many articles and answers to questions on how the Viola-Jones algorithm really works. I keep finding the answers saying the "sum of pixels" in a certain region subtracted by the "sum of pixels" in the adjacent region. I'm confused on what "sum of pixels" means. What is the value based on? Is it the number of pixels in the area? The intensity of the color?
Thanks in advance.
These are the definitions based on Viola-Jones paper on 'Robust Real-time Object Detection'
Integral Image: Integral Image(ii) at location x, y = ii(x,y)
ii(x,y) = > Sum of the pixels above and to the left of x, y inclusive
Here 'Sum of Pixels' implies the sum of pixels intensity values ( e.g., for a 8 bit gray scale image, a value between 0 and 255 ) at each pixel element to the above and to the left of pixel (x, y) and including the row/column x and y, considering a gray scale image in the representation.
Significance of the integral image is that it speeds up the computation of the sum of pixel intensities within any rectangular block of pixels. e.g. four array references.
And the integral image value by itself at each point given by ii(x,y) can be computed in one pass over the original image i(x,y)
using the below equations on each point during the pass as detailed in the reference paper:
s(x,y) = s(x,y-1) + i(x,y);
ii(x,y) = ii(x-1,y) + s(x,y);
where
s(x,y) = the cumulative row sum;
s(x,-1) = 0;
ii(-1,y) = 0;
These integral image values are then used to generate features to learn and later detect objects.
The original Viola-Jones algorithm uses "Haar-like" features, which are approximations of first and second Gaussian derivative filters.
Gaussian derivative filters look like this:
Haar-like filters look like this:
The reason Viola and Jones used Haar-like filters, is that they can be evaluated very efficiently. All you have to do is subtract the sum of pixels covered by the black region of the filter from the sum of pixels covered by the white region. And since the regions are rectangular, the sum of the pixels in each region can be efficiently calculated from the corresponding integral image.