I have a big database of pictures (say, 1 million 512x512px images) and I want to do the following query in a fast way:
Given a cropped image, find an image from the database that contains it.
(The closest question that I could find on Stack Overflow is this one, which I address later in this post.)
The following image illustrates what I'm trying to do.
I have the following restrictions:
(I) – The query must be fast. 10⁶ images is a lot, so I don't think I can compare the query image against every image in the database one by one.
(II) – I need to work with cropped images, so solutions like simple image hashing won't work (of course, this does not apply to crop-resistant hashes).
(III) – I don't know the proportion between the area of the queried image and the image that contains it. In the example above, the refrigerator is just a small portion of the original image, but the cat takes up most of the image that contains it. Although I estimate that the proportion is always between 10% and 100%, I don't know the exact amount beforehand (assume the query images are always 512x512 px, for example).
I've gathered some information in my research:
Simple image hash matching isn't possible because of (II) (I'm working with cropped parts)
Reddit's RepostSleuthBot (available on GitHub) is an excellent starting point for me: it can identify whether an image has already been posted in an efficient way. Instead of simply matching hashes, it seems to use the ANNOY algorithm to find similar images (so it can match images with slight modifications in text or brightness, for example). The only problem with this approach is that it isn't well adapted to cropped images. So this addresses (I), but not (II) and (III).
In my Stack Overflow searches, the closest thing I found to this problem is that, if I knew the proportion between the cropped image and the original, I could match them using phase correlation, as this answer explains.
This addresses (II), which is awesome, but then I'd have problems with (I) because I'd have to try to match against each image in the database, and it's also unworkable because of (III). (A rough sketch of that per-pair check follows this list of findings.)
A promising approach would be cropping-resistant image hashing - the paper "Efficient Cropping-Resistant Robust Image Hashing" (DOI: 10.1109/ARES.2014.85) describes one, but it doesn't seem to be that performant, especially considering that I'm aiming at small crops (10%-100% of the original image) and a huge number of images.
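To make the phase-correlation idea above concrete, here is a minimal sketch of the per-pair check (using OpenCV's cv2.phaseCorrelate and brute-forcing a handful of candidate scales, since the true proportion is unknown). This is exactly the step that is too slow to run against all 10⁶ images, which is why I'm looking for something better:

import cv2
import numpy as np

def crop_match_score(crop, full, scales=np.linspace(0.35, 1.0, 8)):
    """Best phase-correlation response of `crop` against `full`, trying several
    candidate scales (10%-100% of the area is roughly 0.32-1.0 linear scale)."""
    full_f = np.float32(cv2.cvtColor(full, cv2.COLOR_BGR2GRAY))
    best = 0.0
    for s in scales:
        # Resize the crop to a candidate proportion of the full image.
        w, h = int(full.shape[1] * s), int(full.shape[0] * s)
        if w < 8 or h < 8:
            continue
        resized = cv2.resize(crop, (w, h))
        resized = np.float32(cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY))
        # Zero-pad to the full image size so both arrays have the same shape.
        padded = np.zeros_like(full_f)
        padded[:h, :w] = resized
        # phaseCorrelate returns ((dx, dy), response); a high response means
        # the padded crop aligns well with the full image at some offset.
        _, response = cv2.phaseCorrelate(padded, full_f)
        best = max(best, response)
    return best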
I got stuck after this point. Is there any other algorithm or method I should be aware of? Any pointers would be very much appreciated.
Problem statement:
Given an input image, find and extract the similar image from a cluttered scene. Then, from the extracted image, find the differences between the extracted image and the input image.
My Approach:
Up until now I have used SIFT features for feature matching and an affine transform to extract the image from the cluttered scene.
But I am not able to find a method that is good enough and feasible for finding the differences between the input image and the extracted image.
I don't think there is a specific technique for your problem. If the traditional methods do not suit your needs, maybe you can use the keypoints (SIFT) again to estimate the difference.
You have already done most of the work by matching the images using SIFT.
Next, you can use the corresponding SIFT matches to estimate the affine warp. Apply the required warp to the second image and crop it so that the two images are superimposable.
Now you can calculate the absolute difference of the two images and use the SAD or SSD as an indication of the difference.
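A rough OpenCV sketch of that pipeline (SIFT matching, affine estimation, warping, then SAD/SSD on the absolute difference); the BFMatcher settings and the 0.75 ratio-test threshold are just placeholder choices:

import cv2
import numpy as np

def difference_after_alignment(img_scene, img_template):
    """Align img_scene onto img_template via SIFT + affine, then measure the difference."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img_template, None)
    k2, d2 = sift.detectAndCompute(img_scene, None)

    # Lowe's ratio test to keep only reasonably unambiguous matches.
    matcher = cv2.BFMatcher()
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2) if m.distance < 0.75 * n.distance]

    src = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)

    # Estimate the affine warp from scene coordinates to template coordinates.
    A, _ = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)

    h, w = img_template.shape[:2]
    warped = cv2.warpAffine(img_scene, A, (w, h))  # now superimposable on the template

    diff = cv2.absdiff(img_template, warped)
    sad = float(diff.sum())                             # sum of absolute differences
    ssd = float((diff.astype(np.float64) ** 2).sum())   # sum of squared differences
    return diff, sad, ssd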
I'm currently working on my thesis on neural networks. I'm using CIFAR-10 as a reference dataset. Now I would like to show some example results in my paper. The problem is that the images in the dataset are 32x32 pixels, so it's really hard to recognize anything in them when printed on paper.
Is there any way to get hold of the original images with higher resolution?
UPDATE: I'm not asking for an image processing algorithm, but for the original images presented in CIFAR-10. I need some higher-resolution samples to put in my paper.
I now have the same problem and I just found your question.
It seems that CIFAR was built by labeling the Tiny Images dataset, and the authors are kind enough to share the indexing from CIFAR to Tiny Images. Tiny Images contains a metadata file with the URL of each original image and a toolbox for fetching any image you wish (e.g. those included in the CIFAR index).
So one could write a MATLAB script that does this and share the results...
They're just small:
The CIFAR-10 and CIFAR-100 are labeled subsets of the 80 million tiny images dataset.
You could use Google reverse image search if you're curious.
I am working on an image matching project. First we use SURF to find a matching pair of pictures. There will be at least one object that appears in both images. Now I need to find out, for this same object, what its sizes are in the two pictures. Relative size is enough.
Both SIFT and SURF are only local feature point descriptors. I will get a bunch of descriptors and associated feature point locations, but how do I use this information to determine an object's size? I am thinking of using contours: if I can associate a contour in the two images correctly, then I can find out the object sizes easily by calculating the contour point locations. But how do I associate contours?
I assume there must be some way to apply SIFT or SURF to get object information, since people can do object tracking using SIFT... but after searching for a long time I still couldn't find any useful information...
Any help would be appreciated! Thanks in advance!
The SIFT/SURF detector gives each feature a canonical scale. By simply comparing the ratios of the scales of the matching features in each image, you should be able to determine their relative size.
You should already be comparing the scales of the potential matches in order to discard spurious matches that are inconsistent with the estimated transformation.
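As a small OpenCV sketch of that idea: each cv2.KeyPoint carries a size field (its canonical scale), so the median ratio of sizes over the good matches gives a relative-size estimate. The 0.75 ratio-test threshold here is just an assumption:

import cv2
import numpy as np

def relative_object_size(img1, img2):
    """Estimate how much larger the shared object appears in img2 than in img1."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)

    matcher = cv2.BFMatcher()
    good = [m for m, n in matcher.knnMatch(d1, d2, k=2) if m.distance < 0.75 * n.distance]

    # KeyPoint.size is the feature's canonical scale; the ratio of the scales
    # of two matched features approximates the local magnification.
    ratios = [k2[m.trainIdx].size / k1[m.queryIdx].size for m in good]
    return float(np.median(ratios))  # the median is robust to spurious matches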
Sometimes two image files may be different on a file level, but a human would consider them perceptively identical. Given that, now suppose you have a huge database of images, and you wish to know if a human would think some image X is present in the database or not. If all images had a perceptive hash / fingerprint, then one could hash image X and it would be a simple matter to see if it is in the database or not.
I know there is research around this issue, and some algorithms exist, but is there any tool, like a UNIX command line tool or a library I could use to compute such a hash without implementing some algorithm from scratch?
edit: relevant code from findimagedupes, using ImageMagick
try $image->Sample("160x160!");
try $image->Modulate(saturation=>-100);
try $image->Blur(radius=>3,sigma=>99);
try $image->Normalize();
try $image->Equalize();
try $image->Sample("16x16");
try $image->Threshold();
try $image->Set(magick=>'mono');
($blob) = $image->ImageToBlob();
edit: Warning! ImageMagick $image object seems to contain information about the creation time of an image file that was read in. This means that the blob you get will be different even for the same image, if it was retrieved at a different time. To make sure the fingerprint stays the same, use $image->getImageSignature() as the last step.
findimagedupes is pretty good. You can run "findimagedupes -v fingerprint images" to let it print "perceptive hash", for example.
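If you would rather not call out to the Perl script, here is a rough Python/Pillow approximation of the same pipeline; it will not reproduce findimagedupes' exact bits, and the 128 threshold is an assumption:

from PIL import Image, ImageFilter, ImageOps

def fingerprint(path):
    """Roughly mimic the findimagedupes pipeline: 160x160 -> grayscale ->
    blur -> contrast normalization -> 16x16 -> 1-bit -> 32-byte blob."""
    img = Image.open(path).resize((160, 160)).convert("L")
    img = img.filter(ImageFilter.GaussianBlur(3))
    img = ImageOps.autocontrast(img)      # roughly Normalize()
    img = ImageOps.equalize(img)          # roughly Equalize()
    img = img.resize((16, 16))
    img = img.point(lambda p: 255 if p >= 128 else 0).convert("1")  # roughly Threshold()
    return img.tobytes()                  # 16*16 bits = 32 bytes

# Two images are "perceptively identical" if their fingerprints are close in
# Hamming distance (findimagedupes compares fingerprints by counting differing bits).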
Cross-correlation or phase correlation will tell you if the images are the same, even with noise, degradation, and horizontal or vertical offsets. Using the FFT-based methods will make it much faster than the algorithm described in the question.
The usual algorithm doesn't work for images that are not the same scale or rotation, though. You could pre-rotate or pre-scale them, but that's really processor intensive. Apparently you can also do the correlation in a log-polar space and it will be invariant to rotation, translation, and scale, but I don't know the details well enough to explain that.
MATLAB example: Registering an Image Using Normalized Cross-Correlation
Wikipedia calls this "phase correlation" and also describes making it scale- and rotation-invariant:
The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.
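Here is a hedged OpenCV sketch of that log-polar extension (often called the Fourier-Mellin approach): phase-correlating the log-polar transforms of the Fourier magnitude spectra turns rotation and scale into plain translations. The output size and maxRadius below are arbitrary choices, and the sign conventions may need adjusting:

import cv2
import numpy as np

def rotation_and_scale(img1, img2, size=256):
    """Estimate rotation (degrees) and scale between two grayscale images."""
    def logpolar_spectrum(img):
        img = cv2.resize(np.float32(img), (size, size))
        # The magnitude of the FFT is invariant to translation.
        mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
        center = (size / 2, size / 2)
        max_radius = size / 2
        lp = cv2.warpPolar(np.float32(mag), (size, size), center, max_radius,
                           cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG)
        return lp, max_radius

    lp1, max_radius = logpolar_spectrum(img1)
    lp2, _ = logpolar_spectrum(img2)

    # A shift along the angle axis is a rotation; along the log-radius axis, a scale.
    (shift_logr, shift_angle), _ = cv2.phaseCorrelate(lp1, lp2)
    rotation_deg = shift_angle * 360.0 / size
    scale = np.exp(shift_logr * np.log(max_radius) / size)
    return rotation_deg, scale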
A colour histogram is good for finding the same image after it has been resized, resampled, etc.
If you want to match different people's photos of the same landmark, it's trickier - look at Haar classifiers. OpenCV is a great free library for image processing.
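For the histogram route, a minimal OpenCV sketch; the bin count and the correlation metric are arbitrary choices:

import cv2

def histogram_similarity(img1, img2, bins=32):
    """Compare two BGR images by their HSV colour histograms (1.0 = identical)."""
    hists = []
    for img in (img1, img2):
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        # 2D histogram over hue and saturation, fairly robust to resizing/resampling.
        h = cv2.calcHist([hsv], [0, 1], None, [bins, bins], [0, 180, 0, 256])
        cv2.normalize(h, h)
        hists.append(h)
    return cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)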
I don't know the algorithm behind it, but Microsoft Live Image Search just added this capability. Picasa also has the ability to identify faces in images, and groups faces that look similar. Most of the time, it's the same person.
Some machine learning technology like a support vector machine, neural network, naive Bayes classifier or Bayesian network would be best at this type of problem. I've written one each of the first three to classify handwritten digits, which is essentially image pattern recognition.
Resize the image to a 1x1 pixel... if they are exactly equal, there is a small probability they are the same picture.
Now resize to a 2x2 pixel image; if all 4 pixels are exactly equal, there is a larger probability they are the same...
Then 3x3; if all 9 pixels are equal... good chance, etc.
Then 4x4; if all 16 pixels are equal... better chance.
Etc...
Doing it this way, you can make efficiency improvements: if the 1x1 pixel comparison is off by a lot, why bother checking the 2x2 grid? And so on.
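A small sketch of that coarse-to-fine idea with Pillow; the per-channel tolerance is an arbitrary assumption (strict equality would also work):

from PIL import Image

def probably_same(path_a, path_b, max_side=8, tolerance=8):
    """Compare two images at 1x1, 2x2, ... thumbnails, bailing out early
    as soon as one level differs by more than `tolerance` per channel."""
    a = Image.open(path_a).convert("RGB")
    b = Image.open(path_b).convert("RGB")
    for n in range(1, max_side + 1):
        pa = list(a.resize((n, n)).getdata())
        pb = list(b.resize((n, n)).getdata())
        for (r1, g1, b1), (r2, g2, b2) in zip(pa, pb):
            if abs(r1 - r2) > tolerance or abs(g1 - g2) > tolerance or abs(b1 - b2) > tolerance:
                return False  # clearly different; no point checking finer grids
    return True  # matched at every level up to max_side x max_side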
If you have lots of images, a colour histogram could be used to get a rough measure of closeness before doing a full comparison of each image against every other one (i.e. O(n^2)).
There is DPEG, "The" Duplicate Media Manager, but its code is not open. It's a very old tool - I remember using it in 2003.
You could use diff to see if they are REALLY different... I guess it would remove lots of useless comparisons. Then, for the algorithm, I would use a probabilistic approach: what are the chances that they look the same? I'd base that on the amount of RGB in each pixel. You could also use some other metrics, such as luminosity and the like.