I've found some methods to enlarge an image but there is no solution to shrink an image. I'm currently using the nearest neighbor method. How could I do this with bilinear interpolation without using the imresize function in MATLAB?
In your comments, you mentioned you wanted to resize an image using bilinear interpolation. Bear in mind that the bilinear interpolation algorithm is size independent. You can very well use the same algorithm for enlarging an image as well as shrinking an image. The right scale factors to sample the pixel locations are dependent on the output dimensions you specify. This doesn't change the core algorithm by the way.
Before I start with any code, I'm going to refer you to Richard Alan Peters' II digital image processing slides on interpolation, specifically slide #59. It has a great illustration as well as pseudocode on how to do bilinear interpolation that is MATLAB friendly. To be self-contained, I'm going to include his slide here so we can follow along and code it:
Please be advised that this only resamples the image. If you actually want to match MATLAB's output, you need to disable anti-aliasing.
MATLAB by default will perform anti-aliasing on the images to ensure the output looks visually pleasing. If you'd like to compare apples with apples, make sure you disable anti-aliasing when comparing between this implementation and MATLAB's imresize function.
Let's write a function that will do this for us. This function takes an image (read in through imread), which can be either colour or grayscale, as well as the output dimensions of the final resized image as a two-element array. The first element of this array is the number of rows and the second element is the number of columns. We will simply go through the algorithm and calculate the output pixel colours / grayscale values using this pseudocode:
function [out] = bilinearInterpolation(im, out_dims)
%// Get some necessary variables first
in_rows = size(im,1);
in_cols = size(im,2);
out_rows = out_dims(1);
out_cols = out_dims(2);
%// Let S_R = R / R'
S_R = in_rows / out_rows;
%// Let S_C = C / C'
S_C = in_cols / out_cols;
%// Define grid of co-ordinates in our image
%// Generate (x,y) pairs for each point in our image
[cf, rf] = meshgrid(1 : out_cols, 1 : out_rows);
%// Let r_f = r'*S_R for r = 1,...,R'
%// Let c_f = c'*S_C for c = 1,...,C'
rf = rf * S_R;
cf = cf * S_C;
%// Let r = floor(rf) and c = floor(cf)
r = floor(rf);
c = floor(cf);
%// Any values out of range, cap
r(r < 1) = 1;
c(c < 1) = 1;
r(r > in_rows - 1) = in_rows - 1;
c(c > in_cols - 1) = in_cols - 1;
%// Let delta_R = rf - r and delta_C = cf - c
delta_R = rf - r;
delta_C = cf - c;
%// Final line of algorithm
%// Get column major indices for each point we wish
%// to access
in1_ind = sub2ind([in_rows, in_cols], r, c);
in2_ind = sub2ind([in_rows, in_cols], r+1,c);
in3_ind = sub2ind([in_rows, in_cols], r, c+1);
in4_ind = sub2ind([in_rows, in_cols], r+1, c+1);
%// Now interpolate
%// Go through each channel for the case of colour
%// Create output image that is the same class as input
out = zeros(out_rows, out_cols, size(im, 3));
out = cast(out, class(im));
for idx = 1 : size(im, 3)
    chan = double(im(:,:,idx)); %// Get i'th channel
    %// Interpolate the channel
    tmp = chan(in1_ind).*(1 - delta_R).*(1 - delta_C) + ...
          chan(in2_ind).*(delta_R).*(1 - delta_C) + ...
          chan(in3_ind).*(1 - delta_R).*(delta_C) + ...
          chan(in4_ind).*(delta_R).*(delta_C);
    out(:,:,idx) = cast(tmp, class(im));
end
Take the above code, copy and paste it into a file called bilinearInterpolation.m and save it. Make sure you change your working directory to where you've saved this file.
Except for sub2ind and perhaps meshgrid, everything seems to be in accordance with the algorithm. meshgrid is very easy to explain. All you're doing is specifying a 2D grid of (x,y) co-ordinates, where each location in your image has a pair of (x,y) or column and row co-ordinates. Creating a grid through meshgrid avoids any for loops as we will have generated all of the right pixel locations from the algorithm that we need before we continue.
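As a small illustration of what meshgrid hands back (the 2 x 3 grid size is just an assumption for the example):

```matlab
% Column and row co-ordinates for a grid with 2 rows and 3 columns
[cf, rf] = meshgrid(1:3, 1:2);
% cf holds the column (x) co-ordinate of every grid point:
%   1 2 3
%   1 2 3
% rf holds the row (y) co-ordinate of every grid point:
%   1 1 1
%   2 2 2
```

Both outputs have the size of the grid itself, which is why they can be scaled and floored in one shot in the function above.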
How sub2ind works is that it takes in a row and column location in a 2D matrix (well... it can really be any amount of dimensions you want), and it outputs a single linear index. If you're not aware of how MATLAB indexes into matrices, there are two ways you can access an element in a matrix. You can use the row and column to get what you want, or you can use a column-major index. Take a look at this matrix example I have below:
A =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
If we want to access the number 9, we can do A(2,4) which is what most people tend to default to. There is another way to access the number 9 using a single number, which is A(11)... now how is that the case? MATLAB lays out the memory of its matrices in column-major format. This means that if you were to take this matrix and stack all of its columns together in a single array, it would look like this:
A =
1
6
11
2
7
12
3
8
13
4
9
14
5
10
15
Now, if you want to access the number 9, you would need to access the 11th element of this array. Going back to the interpolation bit, sub2ind is crucial if you want to vectorize accessing the elements in your image to do the interpolation without any for loops. As such, if you look at the last line of the pseudocode, we want to access elements at r, c, r+1 and c+1. Note that all of these are 2D arrays, where the elements at matching locations across these arrays tell us the four pixels we need to sample from in order to produce the final output pixel. The output of sub2ind will also be 2D arrays of the same size as the output image. The key here is that sub2ind turns each pairing of r or r+1 with c or c+1 into the column-major indices into the image that we want to access, and by using these as input into the image for indexing, we get exactly the pixel locations that we want.
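Here is a quick sketch of the same idea using the example matrix from above (the second lookup with two row/column pairs is my own addition to show the vectorized form):

```matlab
A = [1 2 3 4 5; 6 7 8 9 10; 11 12 13 14 15];

% Row 2, column 4 maps to linear (column-major) index 11
idx = sub2ind(size(A), 2, 4);
val = A(idx);    % same element as A(2,4), which is 9

% Several (row, column) pairs can be converted at once
rows = [1 3];
cols = [5 1];
vals = A(sub2ind(size(A), rows, cols));   % picks A(1,5) and A(3,1)
```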
There are some important subtleties I'd like to add when implementing the algorithm:
You need to make sure that any indices to access the image when interpolating outside of the image are either set to 1 or the number of rows or columns to ensure you don't go out of bounds. Actually, if you extend to the right or below the image, you need to set this to one below the maximum as the interpolation requires that you are accessing pixels to one over to the right or below. This will make sure that you're still within bounds.
You also need to make sure that the output image is cast to the same class as the input image.
I ran through a for loop to interpolate each channel on its own. You could do this intelligently using bsxfun, but I decided to use a for loop for simplicity, and so that you are able to follow along with the algorithm.
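To make the weighting concrete, here is one output pixel worked by hand on a made-up 2 x 2 channel. The pixel values and the sample position (rf, cf) are assumptions for illustration only, and interp2 is used purely as a cross-check:

```matlab
chan = [10 20; 30 40];   % toy channel
rf = 1.5;                % fractional row to sample at
cf = 1.25;               % fractional column to sample at

r = floor(rf);
c = floor(cf);
delta_R = rf - r;        % 0.5
delta_C = cf - c;        % 0.25

% Same four-term weighted sum as in the function above
val = chan(r, c)     * (1 - delta_R) * (1 - delta_C) + ...
      chan(r+1, c)   * delta_R       * (1 - delta_C) + ...
      chan(r, c+1)   * (1 - delta_R) * delta_C + ...
      chan(r+1, c+1) * delta_R       * delta_C;

% Built-in linear interpolation at the same location, for comparison
ref = interp2(chan, cf, rf);
```

Both val and ref come out to 22.5 here, halfway between the rows and a quarter of the way between the columns.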
As an example to show this works, let's use the onion.png image that is part of MATLAB's system path. The original dimensions of this image are 135 x 198. Let's interpolate this image by making it larger, going to 270 x 396 which is twice the size of the original image:
im = imread('onion.png');
out = bilinearInterpolation(im, [270 396]);
figure;
imshow(im);
figure;
imshow(out);
The above code will interpolate the image by increasing each dimension by twice as much, then show a figure with the original image and another figure with the scaled up image. This is what I get for both:
Similarly, let's shrink the image down to half its size:
im = imread('onion.png');
out = bilinearInterpolation(im, [68 99]);
figure;
imshow(im);
figure;
imshow(out);
Note that half of 135 is 67.5 for the rows, but I rounded up to 68. This is what I get:
One thing I've noticed in practice is that upsampling with bilinear has decent performance in comparison to other schemes like bicubic... or even Lanczos. However, when you're shrinking an image, because you're removing detail, nearest neighbour is very much sufficient. I find bilinear or bicubic to be overkill. I'm not sure what your application is, but play around with the different interpolation algorithms and see what you like out of the results. Bicubic is another story, and I'll leave that to you as an exercise. Those slides I referred you to do have material on bicubic interpolation if you're interested.
Good luck!
I have the following code in MATLAB:
I=imread(image);
h=fspecial('gaussian',si,sigma);
I=im2double(I);
I=imfilter(I,h,'conv');
figure,imagesc(I),impixelinfo,title('Original Image after Convolving with gaussian'),colormap('gray');
How can I define and apply a Gaussian filter to an image without imfilter, fspecial and conv2?
It's really unfortunate that you can't use some of the built-in methods from the Image Processing Toolbox to help you do this task. However, we can still do what you're asking, though it will be a bit more difficult. I'm still going to use some functions from the IPT to help us do what you're asking. Also, I'm going to assume that your image is grayscale. I'll leave it to you if you want to do this for colour images.
Create Gaussian Mask
What you can do is create a grid of 2D spatial co-ordinates using meshgrid that is the same size as the Gaussian filter mask you are creating. I'm going to assume that N is odd to make my life easier. This will allow for the spatial co-ordinates to be symmetric all around the mask.
If you recall, the 2D Gaussian can be defined as:
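Written out, assuming the standard isotropic form with standard deviation sigma and the origin at the centre of the mask:

```latex
h(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left( -\frac{x^{2} + y^{2}}{2\sigma^{2}} \right)
```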
The scaling factor in front of the exponential is primarily concerned with ensuring that the area underneath the Gaussian is 1. We will deal with this normalization in another way, where we generate the Gaussian coefficients without the scaling factor, then simply sum up all of the coefficients in the mask and divide every element by this sum to ensure a unit area.
Assuming that you want to create a N x N filter, and with a given standard deviation sigma, the code would look something like this, with h representing your Gaussian filter.
%// Generate horizontal and vertical co-ordinates, where
%// the origin is in the middle
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
%// Create Gaussian Mask
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
%// Normalize so that total area (sum of all weights) is 1
h = h / sum(h(:));
If you check this with fspecial, for odd values of N, you'll see that the masks match.
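A minimal sketch of that check, assuming N = 5 and sigma = 2 as arbitrary example values (fspecial needs the Image Processing Toolbox, so it is only used here for comparison):

```matlab
N = 5;        % example mask size (odd)
sigma = 2;    % example standard deviation

% Hand-built mask, same steps as above
ind = -floor(N/2) : floor(N/2);
[X, Y] = meshgrid(ind, ind);
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
h = h / sum(h(:));

% Reference mask from the toolbox
h_ref = fspecial('gaussian', N, sigma);

% Largest absolute difference; should be at round-off level
max_diff = max(abs(h(:) - h_ref(:)));
```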
Filter the image
The basic idea behind filtering an image is that, for each pixel in your input image, you take a pixel neighbourhood surrounding this pixel that is the same size as your Gaussian mask. You perform an element-by-element multiplication of this pixel neighbourhood with the Gaussian mask and sum up all of the elements. The resultant sum is the output pixel at the corresponding spatial location in the output image. I'm going to use im2col, which takes each pixel neighbourhood and turns it into a column, producing a matrix where each column represents one pixel neighbourhood.
What we can do next is take our Gaussian mask and convert this into a column vector. Next, we would take this column vector, and replicate this for as many columns as we have from the result of im2col to create... let's call this a Gaussian matrix for a lack of a better term. With this Gaussian matrix, we will do an element-by-element multiplication with this matrix and with the output of im2col. Once we do this, we can sum over all of the rows for each column. The best way to do this element-by-element multiplication is through bsxfun, and I'll show you how to use it soon.
The result of this will be your filtered image, but it will be a single vector. You would need to reshape this vector back into matrix form with col2im to get our filtered image. However, a slight problem with this approach is that it doesn't filter pixels where the spatial mask extends beyond the dimensions of the image. As such, you'll actually need to pad the border of your image with zeroes so that we can properly do our filter. We can do this with padarray.
Therefore, our code will look something like this, going with your variables you have defined above:
N = 5; %// Define size of Gaussian mask
sigma = 2; %// Define sigma here
%// Generate Gaussian mask
ind = -floor(N/2) : floor(N/2);
[X Y] = meshgrid(ind, ind);
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
h = h / sum(h(:));
%// Convert filter into a column vector
h = h(:);
%// Filter our image
I = imread(image);
I = im2double(I);
I_pad = padarray(I, [floor(N/2) floor(N/2)]);
C = im2col(I_pad, [N N], 'sliding');
C_filter = sum(bsxfun(@times, C, h), 1);
out = col2im(C_filter, [N N], size(I_pad), 'sliding');
out contains the filtered image after applying a Gaussian filtering mask to your input image I. As an example, let's say N = 9, sigma = 4. Let's also use cameraman.tif that is an image that's part of the MATLAB system path. By using the above parameters, as well as the image, this is the input and output image we get:
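If you want to convince yourself that the im2col route really performs the filtering, here is a small sanity check on a synthetic image. The magic(8) test image and the comparison against conv2 are my own choices for the sketch; conv2 is used only as a reference, not as part of the method:

```matlab
N = 5;
sigma = 2;
ind = -floor(N/2) : floor(N/2);
[X, Y] = meshgrid(ind, ind);
h = exp(-(X.^2 + Y.^2) / (2*sigma*sigma));
h = h / sum(h(:));

I = magic(8) / 64;    % synthetic 8 x 8 "image" with values in (0, 1]

% im2col-based filtering with zero padding, as described above
I_pad = padarray(I, [floor(N/2) floor(N/2)]);
C = im2col(I_pad, [N N], 'sliding');
C_filter = sum(bsxfun(@times, C, h(:)), 1);
out = col2im(C_filter, [N N], size(I_pad), 'sliding');

% Reference: zero-padded convolution (the mask is symmetric, so
% convolution and correlation give the same answer)
ref = conv2(I, h, 'same');
max_err = max(abs(out(:) - ref(:)));
```

The output has the same size as the unpadded image, and max_err should sit at round-off level.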
I am learning image analysis and trying to average set of color images and get standard deviation at each pixel
I have done this, but it is not by averaging RGB channels. (for ex rchannel = I(:,:,1))
filelist = dir('dir1/*.jpg');
ims = zeros(215, 300, 3);
for i=1:length(filelist)
imname = ['dir1/' filelist(i).name];
rgbim = im2double(imread(imname));
ims = ims + rgbim;
end
avgset1 = ims/length(filelist);
figure;
imshow(avgset1);
I am not sure if this is correct. I am confused as to how averaging images is useful.
Also, I couldn't get the matrix holding standard deviation.
Any help is appreciated.
If you are concerned about finding the mean RGB image, then your code is correct. What I like is that you converted the images using im2double before accumulating the mean, so everything is done in double precision. As Parag said, finding the mean image is very useful, especially in machine learning. It is common to find the mean image of a set of images before doing image classification, as subtracting it keeps the dynamic range of each pixel within a normalized range. This helps the training of the learning algorithm converge more quickly and tends to give better parameters for classification.
If you want to find the mean RGB colour which is the average colour over all images, then no your code is not correct.
The accumulated sum over all images is stored in sumrgbims (ims in your code), so the last step is to average over the pixels of each channel. Two calls to sum over the first and second dimensions chained together will help, followed by a division by the number of pixels and the number of images. This produces a 1 x 1 x 3 array, so use squeeze afterwards to remove the singleton dimensions and get a 3 x 1 vector representing the mean RGB colour over all images.
Therefore:
num_pixels = size(sumrgbims, 1) * size(sumrgbims, 2);
mean_colour = squeeze(sum(sum(sumrgbims, 1), 2)) / (num_pixels * length(filelist));
To address your second question, I'm assuming you want to find the standard deviation of each pixel value over all images. What you will have to do is accumulate the square of each image in addition to accumulating each image inside the loop. After that, you know that the standard deviation is the square root of the variance, and the variance is equal to the average sum of squares subtracted by the mean squared. We have the mean image, now you just have to square the mean image and subtract this with the average sum of squares. Just to be sure our math is right, supposing we have a signal X with a mean mu. Given that we have N values in our signal, the variance is thus equal to:
Source: Science Buddies
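In symbols, with X_i denoting the N values of the signal (the index i is my own notation), the identity described above reads:

```latex
\sigma^{2} = \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^{2} = \frac{1}{N} \sum_{i=1}^{N} X_i^{2} - \mu^{2}
```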
The standard deviation would simply be the square root of the above result. We would thus calculate this for each pixel independently. Therefore you can modify your loop to do that for you:
filelist = dir('set1/*.jpg');
sumrgbims = zeros(215, 300, 3);
sum2rgbims = sumrgbims; % New - for standard deviation
for i=1:length(filelist)
imname = ['set1/' filelist(i).name];
rgbim = im2double(imread(imname));
sumrgbims = sumrgbims + rgbim;
sum2rgbims = sum2rgbims + rgbim.^2; % New
end
rgbavgset1 = sumrgbims/length(filelist);
% New - find standard deviation
rgbstdset1 = ((sum2rgbims / length(filelist)) - rgbavgset1.^2).^(0.5);
figure;
imshow(rgbavgset1, []);
% New - display standard deviation image
figure;
imshow(rgbstdset1, []);
Also to make sure, I've scaled the display of each imshow call so the smallest value gets mapped to 0 and the largest value gets mapped to 1. This does not change the actual contents of the images. This is just for display purposes.
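If you'd like to check the sum-of-squares approach without any image files, here is a small sketch on random data. The stack size, the seed and the clamp against tiny negative round-off are my own assumptions; std with a weight of 1 normalizes by N, which is what the formula above uses:

```matlab
rng(0);                              % fixed seed so the check is repeatable
num_ims = 6;
stack = rand(4, 5, 3, num_ims);      % stand-in for 6 colour images of size 4 x 5

sum_ims  = zeros(4, 5, 3);
sum2_ims = zeros(4, 5, 3);
for i = 1 : num_ims
    im = stack(:,:,:,i);
    sum_ims  = sum_ims  + im;        % running sum
    sum2_ims = sum2_ims + im.^2;     % running sum of squares
end

avg_im = sum_ims / num_ims;
% Variance = mean of squares minus square of mean; clamp round-off below zero
std_im = sqrt(max(sum2_ims / num_ims - avg_im.^2, 0));

% Reference: per-pixel standard deviation along the 4th (image) dimension
ref_std = std(stack, 1, 4);
max_err = max(abs(std_im(:) - ref_std(:)));
```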
I have this Python code:
cv2.addWeighted(src1, 4, cv2.GaussianBlur(src1, (0, 0), 10), src2, -4, 128)
How can I convert it to Matlab? So far I got this:
f = imread0('X.jpg');
g = imfilter(f, fspecial('gaussian',[size(f,1),size(f,2)],10));
alpha = 4;
beta = -4;
f1 = f*alpha+g*beta+128;
I want to subtract local mean color image.
Input image:
Blending output from OpenCV:
The documentation for cv2.addWeighted has the definition such that:
cv2.addWeighted(src1, alpha, src2, beta, gamma[, dst[, dtype]]) → dst
Also, the operations performed on the output image is such that:
(source: opencv.org)
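For reference, the per-pixel formula given in the OpenCV documentation for addWeighted is:

```latex
\mathrm{dst}(I) = \mathrm{saturate}\left( \mathrm{src1}(I) \cdot \alpha + \mathrm{src2}(I) \cdot \beta + \gamma \right)
```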
Therefore, what your code is doing is exactly correct... at least for cv2.addWeighted. You take alpha, multiply this by the first image, then beta, multiply this by the second image, then add gamma on top of this. The only intricacy left to deal with is saturate, which means that any values that are beyond the dynamic range of the data type you are dealing with, you cap it at that much. Because there is a potential for negatives to occur in the result, the saturate option simply means to make any values that are negative 0 and any values that are greater than the maximum expected to that max. In this case, you'll want to make any values larger than 1 equal to 1. As such, it'll be a good idea to convert your image to double through im2double because you want to allow the addition and subtraction of values beyond the dynamic range to happen first, then you saturate after. By using the default image precision of the image (which is uint8), the saturation will happen even before the saturate operation occurs, and that'll give you the wrong results. Because you're doing this double conversion, you'll want to convert the addition of 128 for your gamma to 0.5 to compensate.
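To see why the conversion to double matters, here is a tiny sketch with two made-up pixel values (100 and 90 are assumptions for illustration). In uint8, every intermediate result gets clipped, so the answer comes out wrong before you ever get to saturate it yourself:

```matlab
f = uint8(100);    % pixel from the original image
g = uint8(90);     % same pixel after blurring
alpha = 4;
beta = -4;

% uint8 arithmetic: 100*4 clips to 255, 90*(-4) clips to 0
naive = f*alpha + g*beta + 128;

% Double arithmetic on a [0, 1] scale, saturating only at the end
fd = double(f) / 255;
gd = double(g) / 255;
proper = fd*alpha + gd*beta + 0.5;
proper = min(max(proper, 0), 1);
```

Here naive ends up pinned at 255, while proper is 40/255 + 0.5, which is comfortably inside the valid range.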
Now, the only slight problem is your Gaussian Blur. Looking at the documentation, by doing cv2.GaussianBlur(src1, (0, 0), 10), you are telling OpenCV to infer on the mask size while the standard deviation is 10. MATLAB does not infer the size of the mask for you, so you need to do this yourself. A common practice is to simply find six-times the standard deviation, take the floor and add 1. This is for both the horizontal and vertical dimensions of the mask. You can see my post here on the justification as to why this is common practice: By which measures should I set the size of my Gaussian filter in MATLAB?
Therefore, in MATLAB, you would do this with your Gaussian blur instead. BTW, it's simply imread, not imread0:
f = im2double(imread('http://i.stack.imgur.com/kl3Md.jpg')); %// Change - Reading image directly from StackOverflow
sigma = 10; %// Change
sz = 1 + floor(6*sigma); %// Change
g = imfilter(f, fspecial('gaussian', sz, sigma)); %// Change
%// Rest of the code is the same
alpha = 4;
beta = -4;
f1 = f*alpha+g*beta+0.5; %// Change
%// Saturate
f1(f1 > 1) = 1;
f1(f1 < 0) = 0;
I get this image:
Take note that there is a slight difference in the way this appears between OpenCV and MATLAB... especially the haloing around the eye. This is because OpenCV does something different when inferring the mask size for the Gaussian blur. I'm not sure exactly what is going on there, but specifying the mask size from the standard deviation as I did is one of the most common heuristics. Play around with the standard deviation until you get something you like.
I am trying to compute the contour of a binary image. Currently I identify the first non-zero and the last non-zero pixel in the image through looping. Is there a better way? I have encountered a few functions:
imcontour(I)
bwtraceboundary(bw,P,fstep,conn,n,dir)
But the first doesn't return the x and y coordinates of the contour. The second function requires a seed point which I cannot provide. An example of the image is shown below. Thanks.
I'm surprised you didn't see bwperim. Did you not try bwperim? This finds the perimeter pixels of all closed objects that are white in a binary image. Using your image directly from StackOverflow:
im = im2bw(imread('http://i.stack.imgur.com/yAZ5L.png'));
out = bwperim(im);
imshow(out);
We get:
@rayryeng has already provided the correct answer. As another approach (it might be that bwperim performs these operations internally), the boundaries of a binary image can be obtained by calculating the difference between the dilated and the eroded image.
For a given image:
im = im2bw(imread('http://i.stack.imgur.com/yAZ5L.png'));
and a given binary structural element:
selem = ones(3,3); %// square, 8-Neighbours
% selem = [0 1 0; 1 1 1; 0 1 0]; %// cross, 4-Neighbours
The contour of the object can be extracted as:
out = imerode(im, selem) ~= imdilate(im, selem);
Here, however, the boundary is thicker than with bwperim, as pixels on both the inside and the outside of the object's edge are marked.
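A small sketch of that difference in thickness on a synthetic image (the 7 x 7 image with a 3 x 3 square is an assumption for illustration; all three functions need the Image Processing Toolbox):

```matlab
% Synthetic binary image: a 3 x 3 white square on a black background
im = false(7);
im(3:5, 3:5) = true;

selem = ones(3,3);   % square, 8-neighbours

% Dilation/erosion boundary: marks pixels on both sides of the edge
thick = imerode(im, selem) ~= imdilate(im, selem);

% bwperim boundary: only object pixels that touch the background
thin = bwperim(im);

n_thick = nnz(thick);   % 24 here: the 5 x 5 dilated square minus the eroded centre
n_thin  = nnz(thin);    % 8 here: the ring of the 3 x 3 square
```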
I had the same problem, stumbled across this question and just wanted to add that imcontour(Img); does return a matrix. The first row contains the x-values, the second row contains the y-values.
C = imcontour(Img);
x = C(1,:);
y = C(2,:);
But I would discard the first column, since it holds the contour level and the number of points rather than a co-ordinate pair.
I've been performing a 2D mode filter on an RGB image by running medfilt2 independently on the R,G and B channels. However, splitting the RGB channels like this gives artifacts in the colouring. Is there a way to perform the 2D median filter while keeping RGB values 'together'?
Or, I could explain this more abstractly: Imagine I had a 2D matrix, where each value contained a pair of index coordinates (i.e. a cell matrix of 2X1 vectors). How would I go about performing a median filter on this?
Here's how I can do an independent mode filter (giving the artifacts):
r = colfilt(r0,[5 5],'sliding',@mode);
g = colfilt(g0,[5 5],'sliding',@mode);
b = colfilt(b0,[5 5],'sliding',@mode);
However colfilt won't work on a cell matrix.
Another approach could be to somehow combine my RGB channels into a single number and thus create a standard 2D matrix. Not sure how to implement this, though...
Any ideas?
Thanks for your help.
Cheers,
Hugh
EDIT:
OK, so problem solved. Here's how I did it.
I adapted my question so that I'm no longer dealing with (RGB) vectors, but (UV) vectors. Still essentially the same problem, except that my vectors are 2D not 3D.
So firstly I load the individual U and V channels, arrange them each into a 1D list, then combine them, so I essentially have a list of vectors. Then I reduce it to just those which are unique. Then, I assign each pixel in my matrix the value of the index of that unique vector. After this I can do the mode filter. Then I basically do the reverse, in that I go through the filtered image pixelwise, and read the value at each pixel (i.e. an index in my list), and find the unique vector associated with that index and insert it at that pixel.
% Create index list
img_u = img_iuv(:,:,2);
img_v = img_iuv(:,:,3);
coordlist = unique(cat(2,img_u(:),img_v(:)),'rows');
% Create a 2D matrix of indices (one index per pixel)
nrows = size(img_iuv,1);
ncols = size(img_iuv,2);
img_idx = zeros(nrows,ncols);
for y = 1:ncols
    for x = 1:nrows
        coords = squeeze(img_iuv(x,y,2:3))';
        [~,idx] = ismember(coords,coordlist,'rows');
        img_idx(x,y) = idx;
    end
end
% Apply the mode filter (n is the window size)
img_idx = colfilt(img_idx,[n,n],'sliding',@mode);
% Re-construct the original image using the filtered data
for y = 1:ncols
    for x = 1:nrows
        idx = img_idx(x,y);
        % idx may be 0 near the zero-padded border, in which case
        % the previous coords are reused
        try
            coords = coordlist(idx,:);
        end
        img_iuv(x,y,2:3) = coords(:);
    end
end
Not pretty but it gets the job done. I suppose this approach would also work for RGB images, or other similar situations.
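For what it's worth, the index image can probably be built without the double loop by asking unique for its third output. This is only a sketch on made-up 2 x 3 channels (img_u and img_v below are toy data, not the real image), and it skips the filtering step entirely:

```matlab
% Toy U and V channels
img_u = [1 1 2; 3 1 2];
img_v = [5 5 6; 7 5 6];

% Third output of unique gives, for each pixel, the row of coordlist it maps to
[coordlist, ~, idx] = unique([img_u(:), img_v(:)], 'rows');
img_idx = reshape(idx, size(img_u));

% ... a mode filter would be applied to img_idx here ...

% Map the indices back to (U, V) pairs
rebuilt_u = reshape(coordlist(img_idx, 1), size(img_u));
rebuilt_v = reshape(coordlist(img_idx, 2), size(img_v));
```

Without the filter in between, rebuilt_u and rebuilt_v simply reproduce the inputs, which is a handy way to check the round trip.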
Cheers,
Hugh
I don't see how you can define the median of a vector variable. You probably need to reduce the R,G,B components to a single value and then compute the median on that value. Why not use the intensity level as that single value? You could do it easily with rgb2gray.