I have 2 images that I need to compare:
Image 1: size [512 x 512], pixel dimension 0.41 mm
Image 2: size [210 x 210], pixel dimension 1 mm
I tried to use: imresize
imresize(Image_1, [210 210]) % to change size/pixel
However, it reduces the resolution and the image is not clear at all.
Any suggestion will be welcome!
If you meant to test whether the two images are identical, then instead of resizing the images you can use filters with different bandwidths. Alternatively, a higher-level feature such as a SIFT feature can usually take care of sizing issues because it picks the most interesting scale internally.
VLFeat is a good toolbox if you use MATLAB.
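For the SIFT route with VLFeat, something along these lines could work (a minimal sketch, assuming VLFeat is installed and on the MATLAB path, and that Image_1 and Image_2 are grayscale; convert with rgb2gray first if they are not):
% Minimal sketch: extract and match SIFT features with VLFeat.
I1 = single(Image_1);              % vl_sift expects single-precision grayscale
I2 = single(Image_2);
[f1, d1] = vl_sift(I1);            % SIFT frames and 128-D descriptors
[f2, d2] = vl_sift(I2);
matches  = vl_ubcmatch(d1, d2);    % tentative descriptor matches between the images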
You always have that problem when comparing two images of different resolutions. I would do some pre-processing to make the images comparable, probably something more than just making them the same size. That pre-processing really depends on your images.
Anyway, it would probably be better to resize the smaller one up to the larger size using one of the methods described here: http://www.mathworks.com/help/images/ref/imresize.html and then compare them. For example, I would enlarge the smaller image using the 'lanczos3' method.
Image_2_big = imresize(Image_2, [512 512], 'lanczos3');
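With both images at 512 x 512, a straightforward pixel-wise comparison becomes possible (a minimal sketch; mean squared error is just one of many possible similarity measures, and which one is appropriate depends on what "compare" means for your data):
% Minimal sketch: compare the two images pixel-wise after upsampling.
diff_img = double(Image_1) - double(Image_2_big);   % signed difference image
mse      = mean(diff_img(:) .^ 2)                   % mean squared error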
I am new to machine learning. I am trying to create an input matrix (X) from a set of images (the Stanford dog dataset of 120 breeds) to train a convolutional neural network. I aim to resize the images and turn each image into one row by making each pixel a separate column.
If I directly resize the images to a fixed size, they lose their original proportions due to squishing or stretching, which is not good (first solution).
I can resize by fixing either the width or the height and then cropping (all resulting images would then have the same size, e.g. 100x100), but critical parts of the image can be cropped away (second solution).
I am thinking of another way of doing it, but I am not sure about it. Assume I want 10000 columns per image. Instead of resizing images to 100x100, I would resize each image so that its total pixel count is around 10000. So images of size 50x200, 100x100 and 250x40 would all be converted into 10000 columns. For other sizes, like 52x198, only the first 10000 of the 10296 pixels would be considered (third solution).
The third solution seems to preserve the original shape of the image. However, it may lose that benefit anyway once each image is flattened into a row, since the images do not all have the same dimensions. I wonder what your comments are on this issue. It would also be great if you could direct me to sources where I can learn about this topic.
Solution 1 (simply resizing the input image) is a common approach. Unless you have a very different aspect ratio from the expected input shape (or your target classes have tight geometric constraints), you can usually still get good performance.
As you mentioned, Solution 2 (cropping your image) has the drawback of potentially excluding a critical part of your image. You can get around that by running the classification on multiple subwindows of the original image (i.e., classifying multiple 100 x 100 sub-images by stepping over the input image horizontally and/or vertically at an appropriate stride; see the sketch after this answer). Then you need to decide how to combine the multiple classification results.
Solution 3 will not work because the convolutional network needs to know the image dimensions (otherwise, it wouldn't know which pixels are horizontally and vertically adjacent). So you need to pass an image with explicit dimensions (e.g., 100 x 100) unless the network expects an array that was flattened from assumed dimensions. But if you simply pass an array of 10000 pixel values and the network doesn't know (or can't assume) whether the image was 100 x 100, 50 x 200, or 250 x 40, then the network can't apply the convolutional filters properly.
Solution 1 is clearly the easiest to implement, but you need to weigh the likely effect of changing the image aspect ratios against the level of effort required for running and combining multiple classifications per image.
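For the Solution 2 route, the subwindow idea could look roughly like this (a minimal sketch; the 100 x 100 window, the 50-pixel stride, and classify_window are all placeholders for your own choices and classifier):
% Minimal sketch: classify overlapping 100 x 100 subwindows of one image.
% classify_window is a hypothetical stand-in for your trained network.
win    = 100;                          % subwindow size
stride = 50;                           % step between subwindows
[h, w, ~] = size(img);                 % img: one input image, at least 100 x 100
scores = [];                           % grows in the loop; preallocate in real code
for r = 1 : stride : h - win + 1
    for c = 1 : stride : w - win + 1
        patch = img(r : r + win - 1, c : c + win - 1, :);
        scores(end + 1, :) = classify_window(patch);   % one row of class scores
    end
end
combined = mean(scores, 1);            % e.g. average the per-window scores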
I have roughly 160 images for an experiment. Some of the images, however, have clearly different levels of brightness and contrast compared to others. For instance, I have something like the two pictures below:
I would like to equalize the pictures in terms of brightness and contrast (probably find some level in the middle rather than match one image to another, though that would be okay if it makes things easier). Would anyone have any suggestions on how to go about this? I'm not really familiar with image analysis in MATLAB, so please bear with my follow-up questions should they arise. There is already a question on here about equalizing luminance, brightness and contrast for a set of images, but the code doesn't make much sense to me (due to my lack of experience working with images in MATLAB).
Currently I use GIMP to manipulate the images, but that is time-consuming with 160 images, and going by subjective eye judgment isn't very reliable. Thank you!
You can use histeq to perform histogram specification, where the algorithm tries its best to make the target image match the distribution of intensities (i.e., the histogram) of a source image. This is also called histogram matching, and you can read about it in my previous answer.
In effect, the distributions of intensities of the two images should hopefully end up the same. To do this with histeq, you specify an additional parameter giving the target histogram; the input image is then remapped to match that target histogram. Something like this would work, assuming you have the images stored in im1 and im2:
out = histeq(im1, imhist(im2));
However, imhistmatch is the better function to use. You call it almost the same way as histeq, except that you don't have to compute the histogram manually; you just pass in the actual image to match against:
out = imhistmatch(im1, im2);
Here's a running example using your two images. Note that I'll opt to use imhistmatch. I read the two images directly from Stack Overflow, perform histogram matching so that the first image matches the intensity distribution of the second image, and show the result all in one window.
% Read the two images directly from Stack Overflow
im1 = imread('http://i.stack.imgur.com/oaopV.png');
im2 = imread('http://i.stack.imgur.com/4fQPq.png');
% Match the intensity distribution of im1 to that of im2
out = imhistmatch(im1, im2);
% Show both inputs and the matched result side by side
figure;
subplot(1,3,1);
imshow(im1);
subplot(1,3,2);
imshow(im2);
subplot(1,3,3);
imshow(out);
This is what I get:
Note that the first image now more or less matches in distribution with the second image.
We can also flip it around and make the first image the source, so that the second image tries to match the first. Just swap the two parameters of imhistmatch:
out = imhistmatch(im2, im1);
Repeating the above code to display the figure, I get this:
That looks a little more interesting. We can definitely see the shape of the second image's eyes, and some of the facial features are more pronounced.
As such, what you can do in the end is choose a good representative image that has the best brightness and contrast, then loop over each of the other images and call imhistmatch each time with this source image as the reference, so that every other image tries to match its intensity distribution to it. I can't really write code for this because I don't know how you are storing these images in MATLAB. If you share some of that code, I'd love to write more.
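That said, if the images happen to sit in a folder on disk, the batch step could look something like this (a minimal sketch; the folder names, the file pattern, and the reference filename are all assumptions, and the output folder is assumed to exist):
% Minimal sketch: match every image in a folder to one reference image.
refImg = imread('reference.png');              % the representative image (assumed filename)
files  = dir(fullfile('images', '*.png'));     % the remaining images (assumed folder and pattern)
for k = 1 : numel(files)
    img     = imread(fullfile('images', files(k).name));
    matched = imhistmatch(img, refImg);        % match intensities to the reference
    imwrite(matched, fullfile('matched', files(k).name));  % 'matched' output folder assumed to exist
end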
I have an image of size 640*640*3 and another image of size 125*314*3. I want to obtain the size ratio of the second image to the first image, but I can't find a way to do it.
I have tried plain division as well as rdivide, but neither works.
If I multiply out the 3D dimensions of each image first and then compare the products, will the approach be correct?
For example, I would do something like 640*640*3 = 1,228,800 then 125*314*3 = 117,750 and finally, take 117,750 / 1,228,800 = 0.09. Is 0.09 the right answer?
I'm assuming you are referring to the ratio of the areas of the two images. If so, just use the width and the height. It looks like you are using RGB images, so you don't need the number of channels; in any case, the channel count cancels out when you form the ratio.
Therefore, yes, your approach is correct:
(125*314) / (640*640) = 0.0958
This means that the smaller (second) image occupies roughly 9.6% of the area of the larger (first) image.
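In MATLAB you can also compute this directly from the image arrays instead of typing the dimensions by hand (a minimal sketch, assuming the two images are stored in img1 and img2):
% Minimal sketch: area ratio of the second image to the first, ignoring channels.
[h1, w1, ~] = size(img1);              % 640 x 640 x 3
[h2, w2, ~] = size(img2);              % 125 x 314 x 3
ratio = (h2 * w2) / (h1 * w1)          % = 0.0958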
That depends on what you mean by size ratio.
It looks like you have RGB images. If you mean the area, then it is (125*314)/(640*640); if you mean a single dimension, it is 125/640 for the heights or 314/640 for the widths. There are more options too, so be more specific in your question.
I have a set of 274 color images (each one is 200x150 pixels). Each image is visually distinct. I would like to build an app which accepts an up/down-scaled version of one of the base set of images and determines the closest match.
I'm a senior software engineer but am totally new to image recognition. I'd really appreciate any recommendations as to where to start.
If you're comparing extremely similar images, it is in theory sufficient to calculate the Euclidean distance between the two images. The images must be the same size for that, so it is often necessary to rescale one of them (generally the larger image is scaled down). Note that aliasing issues can occur here, so pay some attention to your downsampling algorithm. There's also an issue if your images don't have the same aspect ratio.
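As a baseline, that direct comparison might look like this in MATLAB (a minimal sketch; query and ref are assumed variable names for the query image and one reference image):
% Minimal sketch: naive pixel-wise Euclidean distance after rescaling.
q = imresize(query, [size(ref, 1), size(ref, 2)]);   % rescale the query to the reference size
d = norm(double(q(:)) - double(ref(:)));             % Euclidean distance over all pixels and channels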
However, this is almost never done in practice since it's extremely slow. For N images of size WxH and 3 color channels, it requires N x W x H x 3 comparisons, which quickly gets unworkable (consider that many users can have over 1000 images of size >1000x1000).
Generally we attempt to reduce each image to a smaller array that captures the image information much more compactly, called a visual descriptor; for example, taking a 1024x1024x3 image and reducing it to a vector of length 128. The descriptors only need to be calculated once for the reference images and can then be stored in an appropriate data structure. We can then compare the descriptor of the query image against the descriptors of the reference images.
The cost of calculating the distances for our dataset of N images with a descriptor of length L is then N x L instead of the original N x W x H x 3.
So the issue is to find efficient descriptors that are (a) cheap to compute and (b) capture the image accurately. This is still an active area of research, but I can suggest some:
Histograms are probably the simplest way to do this, although they do very poorly with any illumination change and capture only color information, no spatial information. Make sure you normalise your histogram before doing any comparison (a small sketch follows this list).
Perceptual hashing works well with very similar images or slightly cropped images. See here
GIST descriptors are powerful but more complex; see here
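For the histogram option, a minimal sketch might be (the 32-bin count is an arbitrary choice, and img stands for any one of your images):
% Minimal sketch: a normalised per-channel color histogram as a descriptor.
nbins = 32;
desc  = [];                               % grows in the loop; preallocate in real code
for ch = 1 : size(img, 3)
    h    = imhist(img(:, :, ch), nbins);  % histogram of one color channel
    desc = [desc; h / sum(h)];            % normalise and append
end
% desc is a 3*nbins vector; compare two descriptors with norm(descA - descB).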
I have some JPEG images that I need to scale down to about 80% of the original size. The original image dimensions are about 700px × 1000px. The images contain some computer-generated text and possibly some graphics (similar to what you would find in corporate Word documents).
How can I scale the images so that the text stays as legible as possible? Currently we are scaling the images down using bicubic interpolation, but that makes the text blurry and foggy.
Two options:
Use a different resampling algorithm. Lanczos gives you a much less blurry result (see the sketch after these options).
You might use an advanced JPEG library that resamples the 8x8 blocks to 6x6 pixels.
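If your pipeline happens to run through MATLAB, the first option is essentially a one-liner (a minimal sketch; the filenames and quality setting are placeholders):
% Minimal sketch: scale a JPEG to 80% with Lanczos resampling instead of bicubic.
img   = imread('input.jpg');                  % placeholder filename
small = imresize(img, 0.8, 'lanczos3');       % Lanczos-3 kernel keeps text edges sharper than bicubic
imwrite(small, 'output.jpg', 'Quality', 95);  % placeholder filename and quality setting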
If you are not set on exactly 80%, you can try getting and building djpeg from http://www.ijg.org/ as it can decompress your JPEG to 6/8ths (75%) or 7/8ths (87.5%) of its size, and the text quality will still be pretty good:
(Example outputs: the original, the 7/8 version, and the 3/4 version; SO scales the images when showing them inline.)
There may be a scaling algorithm out there that works similarly, but this is an easy off-the-shelf solution.
There is always some loss involved in scaling down, but again it depends on your trade-offs.
Blurring and artifact generation are normal for JPEG images, so it is recommended that you generate the images at the correct size in the first place.
Lanczos is a fine solution, but it has its trade-offs.
If it's just the text you are concerned about, you could try a dilation filter over the resampled image. This would correct some of the blurriness but may also affect the graphics; if you can live with that, it's fine. Alternatively, if you can identify the areas of text, you can apply the dilation just over those areas.
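A rough sketch of that last idea in MATLAB (imdilate and imerode require the Image Processing Toolbox; note that for dark text on a light background it is erosion that thickens the strokes, while dilation does so for light text on a dark background):
% Minimal sketch: resample, then morphologically thicken the text strokes.
small = imresize(img, 0.8, 'lanczos3');   % img: the original page image (assumed variable name)
se    = strel('square', 2);               % small structuring element
thick = imerode(small, se);               % thickens dark text on a light background
% Use imdilate(small, se) instead if the text is light on a dark background.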