Well, I was trying out convolution on grayscale images, but when I searched for convolution on RGB images, I couldn't find a satisfactory explanation. How do I apply convolution to RGB images?
A linear combination of vectors can be computed by linearly combining corresponding vector elements:
a * [x1, y1, z1] + b * [x2, y2, z2] = [a*x1+b*x2, a*y1+b*y2 , a*z1+b*z2]
Because a convolution is a linear operation (i.e. you weight each pixel within a neighborhood and add up the results), it follows that you can apply a convolution to each of the RGB channels independently (e.g. using MATLAB syntax):
img = im2double(imread(...)); % conv2 needs a floating-point input
img(:,:,1) = conv2(img(:,:,1), kernel, 'same'); % 'same' keeps the output the size of the input
img(:,:,2) = conv2(img(:,:,2), kernel, 'same');
img(:,:,3) = conv2(img(:,:,3), kernel, 'same');
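Alternatively, if the Image Processing Toolbox is available, imfilter filters each 2-D plane of a 3-D array for you, so the per-channel assignments collapse into one call:
% imfilter applies the 2-D kernel to each colour plane independently
img = im2double(imread(...));
out = imfilter(img, kernel, 'conv', 'same');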
You can look at this in two different ways. First, you may convert the color image into an intensity image with a weighting vector. The most common one is (0.299, 0.587, 0.114), which is the standard grayscale conversion: I = 0.299*R + 0.587*G + 0.114*B.
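A minimal sketch of that conversion, assuming img is an RGB array already converted to double:
% Weighted sum of the three channels gives the intensity image
gray = 0.299*img(:,:,1) + 0.587*img(:,:,2) + 0.114*img(:,:,3);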
If you are designing your own convolutional network and intend to keep the color channels as inputs, just treat the colored image as a 4D tensor with 3 channels. For example, if you have an (h x w) image, the tensor size is (1 x h x w x 3), and you may use a filter of size (kh x kw x 3 x f), where kh and kw are your filter sizes and f is the number of output features.
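For a single output feature (f = 1), such a filter reduces to one kernel slice per channel whose responses are summed; here is a hedged sketch, assuming img is an (h x w x 3) double array and k is a hypothetical (kh x kw x 3) kernel:
% Convolve each channel with its own kernel slice, then sum the responses
resp = conv2(img(:,:,1), k(:,:,1), 'same') + ...
       conv2(img(:,:,2), k(:,:,2), 'same') + ...
       conv2(img(:,:,3), k(:,:,3), 'same');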
I have the following code in MATLAB:
I=imread(image);
h=fspecial('gaussian',si,sigma);
I=im2double(I);
I=imfilter(I,h,'conv');
figure,imagesc(I),impixelinfo,title('Original Image after Convolving with gaussian'),colormap('gray');
How can I define and apply a Gaussian filter to an image without imfilter, fspecial and conv2?
It's really unfortunate that you can't use some of the built-in methods from the Image Processing Toolbox to help you do this task. However, we can still do what you're asking, though it will be a bit more difficult. I'm still going to use some functions from the IPT to help us do what you're asking. Also, I'm going to assume that your image is grayscale. I'll leave it to you if you want to do this for colour images.
Create Gaussian Mask
What you can do is create a grid of 2D spatial co-ordinates using meshgrid that is the same size as the Gaussian filter mask you are creating. I'm going to assume that N is odd to make my life easier. This will allow for the spatial co-ordinates to be symmetric all around the mask.
If you recall, the 2D Gaussian can be defined as:
h(x, y) = (1 / (2*pi*sigma^2)) * exp(-(x^2 + y^2) / (2*sigma^2))
The scaling factor in front of the exponential is primarily concerned with ensuring that the area underneath the Gaussian is 1. We will deal with this normalization in another way, where we generate the Gaussian coefficients without the scaling factor, then simply sum up all of the coefficients in the mask and divide every element by this sum to ensure a unit area.
Assuming that you want to create a N x N filter, and with a given standard deviation sigma, the code would look something like this, with h representing your Gaussian filter.
%// Generate horizontal and vertical co-ordinates, where
%// the origin is in the middle
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
%// Create Gaussian Mask
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
%// Normalize so that total area (sum of all weights) is 1
h = h / sum(h(:));
If you check this with fspecial, for odd values of N, you'll see that the masks match.
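For instance, a quick comparison (assuming the IPT is available):
%// The difference should be on the order of machine precision
max(abs(h(:) - reshape(fspecial('gaussian', N, sigma), [], 1)))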
Filter the image
The basic idea behind filtering an image is that, for each pixel in your input image, you take a pixel neighbourhood that surrounds this pixel and is the same size as your Gaussian mask. You perform an element-by-element multiplication of this pixel neighbourhood with the Gaussian mask and sum up all of the elements together. The resultant sum is what the output pixel will be at the corresponding spatial location in the output image. I'm going to use im2col, which takes pixel neighbourhoods and turns them into columns, creating a matrix where each column represents one pixel neighbourhood.
What we can do next is take our Gaussian mask and convert this into a column vector. Next, we would take this column vector and replicate it for as many columns as we have from the result of im2col to create... let's call this a Gaussian matrix, for lack of a better term. With this Gaussian matrix, we will do an element-by-element multiplication with the output of im2col. Once we do this, we can sum over all of the rows for each column. The best way to do this element-by-element multiplication is through bsxfun, and I'll show you how to use it soon.
The result of this will be your filtered image, but it will be a single vector. You would need to reshape this vector back into matrix form with col2im to get our filtered image. However, a slight problem with this approach is that it doesn't filter pixels where the spatial mask extends beyond the dimensions of the image. As such, you'll actually need to pad the border of your image with zeroes so that we can properly do our filter. We can do this with padarray.
Therefore, our code will look something like this, going with your variables you have defined above:
N = 5; %// Define size of Gaussian mask
sigma = 2; %// Define sigma here
%// Generate Gaussian mask
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
h = h / sum(h(:));
%// Convert filter into a column vector
h = h(:);
%// Filter our image
I = imread(image);
I = im2double(I);
I_pad = padarray(I, [floor(N/2) floor(N/2)]);
C = im2col(I_pad, [N N], 'sliding');
C_filter = sum(bsxfun(@times, C, h), 1);
out = col2im(C_filter, [N N], size(I_pad), 'sliding');
out contains the filtered image after applying the Gaussian mask to your input image I. As an example, let's say N = 9 and sigma = 4. Let's also use cameraman.tif, an image that's part of the MATLAB system path. By using the above parameters, as well as the image, this is the input and output image we get:
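As a sanity check (assuming the IPT is available, and noting that the Gaussian is symmetric, so correlation and convolution coincide, and that imfilter zero-pads by default just like the padarray call above):
%// The difference should be on the order of machine precision
ref = imfilter(I, fspecial('gaussian', N, sigma));
max(abs(out(:) - ref(:)))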
I'm trying to compute an efficient way to transform an image in Cartesian coordinates into a polar representation. I know some functions such as ImToPolar do this, and it works perfectly, but it takes considerably more time for big images, especially when they need to be processed back and forth.
Here's my input image:
And then I generate a polar mesh using a Cartesian mesh centered at 0 and the function cart2pol(). Finally, I plot my image using mesh(theta, r, Input).
And here's what I obtain:
It's exactly the image I need, and it's the same as ImToPolar, or maybe better.
Since MATLAB knows how to compute it, does anybody know how to extract a matrix in polar representation from this output? Or maybe a fast (as in fast Fourier transform) way to compute a polar transform (and its inverse) in MATLAB?
pol2cart, meshgrid and interp2 are sufficient to create the result:
I = imread('http://i.stack.imgur.com/HYSyb.png');
[r, c, ~] = size(I);
% the RGB image can be converted to an indexed image to prevent excessive computation for each color
[idx, mp] = rgb2ind(I, 32);
% add an offset to the image coordinates so the origin is at the center
x = (1:c) - (c/2);
y = (1:r) - (r/2);
% create destination coordinates in polar form so the image values can be interpolated at those coordinates;
% the angle ranges from 0 to 2*pi and the radius is assumed to range from 0 to 400
% (linspace(0, 2*pi, 200) leads to a stretched image; try it!)
[xp, yp] = meshgrid(linspace(0, 2*pi), linspace(0, 400));
% translate the coordinates from polar to image (Cartesian) coordinates
[xx, yy] = pol2cart(xp, yp);
% interpolate the pixel values at the unknown coordinates (interp2 requires a floating-point image)
out = interp2(x, y, double(idx), xx, yy);
% save the result to a file (convert back to a zero-based uint8 index image for the colormap)
imwrite(uint8(round(out)), mp, 'result.png')
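For the inverse direction the question asks about, the same trick works in reverse; here is a hedged sketch under the same assumptions (angle range 0 to 2*pi, radius range 0 to 400, and out from above):
% Build the Cartesian destination grid and find each pixel's polar coordinates
[xc, yc] = meshgrid(x, y);
[th, rh] = cart2pol(xc, yc);
% cart2pol returns angles in [-pi, pi]; wrap them to the [0, 2*pi] range used above
th = mod(th, 2*pi);
% interpolate the polar image back onto the Cartesian grid
back = interp2(linspace(0, 2*pi), linspace(0, 400), out, th, rh);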
So I want to measure the vertical edges of an image to use them later as a depth cue for 2D to 3D conversion.
To do so, I will have to compute the horizontal gradient value for each block to measure the vertical edges, as follows:
ḡ(x,y) = (1/N) ∑_{(x',y') ∈ Ω(x,y)} g(x',y')
Where:
g(x',y') is the horizontal gradient at pixel location (x',y'),
Ω(x,y) is the neighbourhood of the pixel location (x,y),
and N is the number of pixels in Ω(x,y).
So here is what I did in MATLAB:
I = im2double(imread('landscape.jpg'));
% convert RGB to gray
gI = rgb2gray(I);
[nrow, ncol] = size(gI);
% divide the image into 4-by-4 blocks
gI = mat2tiles((gI),[4,4]);
N = 4*4; % block size
% For each block, compute the horizontal gradient
gI = reshape([gI{:}],4*4, []);
mask = fspecial('sobel');
g = imfilter(gI, mask);
g_bar = g./N;
g_bar = reshape(g_bar,nrow, ncol);
I'm new to MATLAB, so I'm not sure if my code expresses the equation in the right way.
Can you please let me know if you think it is correct? I'm not sure how to test the output!
There's no need for you to decompose your image into 4 x 4 blocks. The horizontal gradient can be computed with a Sobel filter or Prewitt filter, which is 3 x 3 and can directly be used with imfilter. imfilter performs 2D convolution / filtering with a specified mask / kernel for you, so tiling is not necessary. As such, you can just use imfilter with the mask defined through fspecial, and define N = 9. Therefore:
I = im2double(imread('landscape.jpg'));
% convert RGB to gray
gI = rgb2gray(I);
N = 9;
mask = fspecial('sobel');
g = imfilter(gI, mask);
g_bar = g./N;
From experience, increasing the size of your gradient mask won't give you much better results. You want to ensure that the mask is as small as possible to capture as many local changes as possible.
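If you do want the neighbourhood average ḡ from your equation over a larger window, one hedged option is to follow the gradient with an averaging filter instead of tiling; a sketch using a 4 x 4 window (so N = 16):
% Average the Sobel response over each pixel's 4 x 4 neighbourhood
g = imfilter(gI, fspecial('sobel'));
g_bar = imfilter(g, fspecial('average', 4));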
It seems like both V in HSV and Y in YUV represent brightness? Which one is better for brightness representation, or is there any other, better indicator for measuring image brightness?
The Y in YUV, according to Wikipedia, is the sum of the three RGB components, each multiplied by a constant:
Y = 0.299 * R + 0.587 * G + 0.114 * B
The V in HSV, according to Wikipedia, is defined as the largest component of a color:
V = M = max(R, G, B)
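A quick numeric check makes the difference concrete, e.g. for pure blue:
% Pure blue: dark in luma terms, but at full "value" in HSV
R = 0; G = 0; B = 1;
Y = 0.299*R + 0.587*G + 0.114*B   % Y = 0.114
V = max([R, G, B])                % V = 1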
The answer is thus no, they are not the same.
EDIT
The two color spaces have two different backgrounds. The HSV model rearranges the geometry of RGB in an attempt to be more intuitive and perceptually relevant than the Cartesian (cube) representation. The YUV model encodes a color image or video taking human perception into account, allowing reduced bandwidth for the chrominance components; transmission errors or compression artifacts are thereby typically masked more efficiently by human perception than with a "direct" RGB representation.
So which representation is more suitable, or better, depends on your application.
I have a gray scale image stored in a matrix Mat and am asked to extract the covariances for 1 pixel horizontal and vertical displacement.
I thought of using circshift and cov to extract the covariance.
Mat = magic(5); % this represents my gray scale image
MatHs = circshift(Mat,[0 1]); % horizontal displacement
MatVs = circshift(Mat,[1 0]); % vertical displacement
covMatH = cov(Mat,MatHs)
covMatV = cov(Mat,MatVs)
However, the results covMatH and covMatV should be of size 1 by 1, whereas mine are 2 by 2.
Did I misapply the cov function, or have I not understood the question correctly, so that this task must be solved completely differently?
Since you are passing two data sets to cov, you will receive a covariance matrix of size 2 by 2. You have definitely solved the task of finding the covariance, but each element of the 2 by 2 matrix represents a different quantity. Let's say the matrix is [A B; C D]. A and D represent the variances of the inputs, while B and C represent the cross-covariance between the inputs.
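So the 1 by 1 value you were asked for is just the off-diagonal entry, e.g.:
% The cross-covariance between Mat and its shifted copy is the scalar you want
covH = covMatH(1,2);  % equal to covMatH(2,1)
covV = covMatV(1,2);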