Please suggest template matching algorithms that are independent of size and rotation.
(Any source code as examples, if possible, please.)
EDIT 1:
Actually, I understand how the algorithm works: we can resize and rotate the template. That is computationally expensive, but we can use image pyramids. The real problem for me now is when the picture is taken at an angle to the object, so that only a perspective transform can correct the image. I mean that even if we rotate or scale the image, we will not get a good match if the object in the image has been perspectively transformed. Of course it is possible to generate many templates at different perspectives, but I think that is a very bad idea.
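For reference, here is a minimal brute-force sketch of the resize-and-rotate approach I mean (assuming OpenCV in Python and grayscale inputs; the scale and angle grids are arbitrary):

```python
import cv2

def match_scale_rotation(image, template,
                         scales=(0.5, 0.75, 1.0, 1.25, 1.5),
                         angles=range(0, 360, 15)):
    """Brute-force template matching over a grid of scales and rotations."""
    best = (-1.0, None, None, None)  # (score, location, scale, angle)
    for s in scales:
        resized = cv2.resize(template, None, fx=s, fy=s)
        h, w = resized.shape[:2]
        for a in angles:
            M = cv2.getRotationMatrix2D((w // 2, h // 2), a, 1.0)
            # Note: rotating inside the same canvas crops the template's
            # corners and fills with black, which hurts the score a bit
            rotated = cv2.warpAffine(resized, M, (w, h))
            if h > image.shape[0] or w > image.shape[1]:
                continue
            result = cv2.matchTemplate(image, rotated, cv2.TM_CCOEFF_NORMED)
            _, score, _, loc = cv2.minMaxLoc(result)
            if score > best[0]:
                best = (score, loc, s, a)
    return best
```

Even with an image pyramid on top, this nested scale/angle loop is what makes the approach expensive.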
EDIT 2:
One more problem arises when using template matching based on shape matching:
What if the image doesn't have many sharp edges? For example, a plate or a dish?
EDIT 3:
I've also heard about camera calibration for object detection. What algorithm is used for that purpose? I don't understand how it can be used for template matching.
I don't think there is an efficient template matching algorithm that is affine-invariant (rotation+scale+translation).
You can make template matching somewhat robust to scale and rotation by using a distance transform (see chamfering-style methods). You should probably also look at SIFT and MSER to get a sense of how the research area has evolved over the past decade. But these are not template matching algorithms.
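To give a flavour of the chamfering idea, here is a rough sketch (my own, not from any particular paper; it assumes OpenCV, arbitrary Canny thresholds, and a binary edge mask of the template):

```python
import cv2
import numpy as np

def chamfer_match(image_gray, template_edges):
    """Chamfer-style matching: the score at each offset is the mean distance
    from the template's edge pixels to the nearest image edge pixel."""
    edges = cv2.Canny(image_gray, 50, 150)
    # Distance from every pixel to the nearest edge pixel
    dist = cv2.distanceTransform(cv2.bitwise_not(edges), cv2.DIST_L2, 3)
    # Correlating the normalized edge mask with the distance map sums the
    # distances under the template's edge pixels at every position
    kernel = template_edges.astype(np.float32)
    kernel /= max(kernel.sum(), 1.0)
    score_map = cv2.filter2D(dist, -1, kernel)
    _, _, min_loc, _ = cv2.minMaxLoc(score_map)
    return min_loc, score_map  # best position (template center); lower = better
```

This only handles translation; for scale and rotation you still loop over transformed versions of the template mask, which is why it is only "somewhat" robust.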
Check out this recent 2013 paper on efficient affine template matching: "Fast-Match". http://www.eng.tau.ac.il/~simonk/FastMatch/
Matlab code is available on that website. The basic idea is to exhaustively search the affine space, but to do it as sparsely as possible based on how smooth the image is. It has a formal approximation guarantee, although it won't always find the absolute best answer.
Related
I have an idea for an app that takes a printed page with a square in each of the four corners and allows you to measure objects on the paper, given that at least two squares are visible. I want users to be able to take a picture from less-than-perfect angles and still have the objects measured accurately.
I'm unable to figure out exactly how to find information on this subject due to my lack of knowledge in the area. I've been able to find examples of OpenCV code that do some interesting transforms and the like, but I've yet to figure out how to phrase what I'm asking in simpler terms.
Does anyone know of papers or mathematical concepts I can lookup to get further into this project?
I'm not quite sure how or who to ask other than people on this forum, sorry for the somewhat vague question.
What you describe is very reminiscent of augmented reality marker tracking. Maybe you can start by searching these words on a search engine of your choice.
A single marker, if done correctly, can be identified without being confused with other markers AND used to determine how the surface is placed in 3D space in front of the camera.
But that's all very difficult and advanced stuff; I'd strongly advise you NOT to try and implement something like this yourself, as it would take years of research... The only practical way is to use a ready-made open source library that outputs the data you need for your app.
One may not even exist. In that case you'll have to buy one. Given how niche your problem is, that would be perfectly plausible.
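As an illustration of the ready-made route, OpenCV's contrib modules ship an ArUco marker detector that does exactly this kind of thing. A minimal sketch (the aruco API changed in OpenCV 4.7; this is the older style, and the filename is a placeholder):

```python
import cv2

# Pre-4.7 aruco API from opencv-contrib-python; newer versions use
# cv2.aruco.ArucoDetector instead
dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

img = cv2.imread("page.jpg")  # placeholder filename
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
corners, ids, rejected = cv2.aruco.detectMarkers(gray, dictionary)

# `corners` holds each marker's four corner points in image coordinates,
# and `ids` tells the markers apart -- enough to recover the page's pose
print(ids, corners)
```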
Here I cover only the programming aspect; if you want, you can learn about the mathematical aspect from these examples. Most of the functions you need are available in OpenCV. Here are some examples in Python:
To detect the printed paper, you can use the cv2.findContours function. The outermost contour is probably the paper, but you need to test on actual images. https://docs.opencv.org/3.1.0/d4/d73/tutorial_py_contours_begin.html
In case the paper is skewed (not at a perfect angle), you can find the angle with cv2.minAreaRect, which returns the angle of the contour you found above. https://docs.opencv.org/3.1.0/dd/d49/tutorial_py_contour_features.html (part 7b).
If you want to rotate the paper, use cv2.warpAffine. https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_geometric_transformations/py_geometric_transformations.html
To detect the objects on the paper, there are several methods. The easiest is to use the contours above. If the objects have distinctive colors, you can detect them using a color filter. https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html
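Putting those pieces together, a rough sketch might look like this (untested; the Otsu threshold and the assumption that the paper is the largest contour are my own simplifications):

```python
import cv2

img = cv2.imread("photo.jpg")  # placeholder filename
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# 1. Find contours; assume the largest one is the paper
#    (OpenCV 4 return signature; 3.x returns an extra value first)
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
paper = max(contours, key=cv2.contourArea)

# 2. minAreaRect gives the rotated bounding box and its angle
rect = cv2.minAreaRect(paper)
print("paper angle:", rect[2])

# 3. Undo the rotation with an affine warp around the rect's center
M = cv2.getRotationMatrix2D(rect[0], rect[2], 1.0)
rotated = cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
```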
I am trying to separate the different kinds of grains in an image. Sometimes the image also contains an impurity substance, which needs to be treated as an extra type.
Here are some example images:
corn and beans
long rice and wheat
I tried to find a general method that works across the different pictures, but the results are not good enough.
I used flood fill and some gradient-based methods to get the regions, and tried to use clustering to classify the contents, but feature selection is a hard problem. I tried a Gabor filter, but it does not give me a clear boundary, and neither do classification methods such as k-means.
Any ideas about segmentation, getting the contours, or classification would be appreciated. Thanks!
I tried to post some more pictures of my current results, but I am sorry that there is a two-picture restriction for beginners here.
Dealing with image processing problems is almost craft work. I would suggest you use a robust library (such as OpenCV, of course) and use the cvFindContours function to identify the contours. Also, look into mathematical morphology: basic operators such as erosion and dilation may help you, since erosion shrinks areas of foreground pixels and enlarges holes within them, while dilation does the opposite. Working with color segmentation is also helpful, but you might have some trouble since grain color is not uniform. Lastly, feature extraction is another way out: the scale-invariant feature transform (SIFT) can be used to identify every single grain in the image, since it is invariant to scale and rotation and robust to illumination changes. Hope it helps.
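As a starting point, here is a rough sketch of the morphology-plus-contours route (the thresholds, kernel size, and feature choices are my own guesses and will need tuning per image):

```python
import cv2
import numpy as np

img = cv2.imread("grains.jpg")  # placeholder filename
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Otsu threshold to separate grains from the background
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Morphological opening: erode to break thin bridges between touching
# grains, then dilate to restore the grain bodies
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel, iterations=2)

# One contour per grain (OpenCV 4 return signature)
contours, _ = cv2.findContours(opened, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Simple per-grain features for clustering: area, elongation, mean color
features = []
for c in contours:
    area = cv2.contourArea(c)
    if area < 50:  # drop noise blobs
        continue
    (_, (w, h), _) = cv2.minAreaRect(c)
    mask = np.zeros(gray.shape, np.uint8)
    cv2.drawContours(mask, [c], -1, 255, -1)
    mean_bgr = cv2.mean(img, mask=mask)[:3]
    features.append((area, max(w, h) / max(min(w, h), 1e-6), *mean_bgr))
```

The resulting feature vectors can then go into k-means or another clustering method you already tried; per-contour shape and color features are often easier to separate than raw Gabor responses.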
I have a target image to be searched for a curve along its edges, and a template image that contains the curve. What I need is to find the best match of the template curve within the target image and, based on the score, to determine whether there is a match or not. That also includes rotation and resizing of the curve. The target image can be the output of a Canny edge detector if that makes things easier.
I am considering using OpenCV (via Python or Processing/Java, or C if those have limited access to the required functions) to make things practical and efficient, but I could not find out whether any functions (or combinations of them) in OpenCV are usable for this job. I have been reading through the OpenCV documentation and thought at first that Contours could do the job; however, all the examples show closed shapes, whereas in my case I need to match an open curve to part of an edge.
So is there a way to do this either by using OpenCV or with any known code or algorithm that you would suggest?
Here are some images to illustrate the problem:
My first thought was the Generalized Hough Transform. However, I don't know of any good implementation of it.
I would try SIFT or SURF first on the Canny edge image. They are usually used to find 2D areas, not 1D contours, but if you take the minimum bounding box around your contour and use that as the search pattern, it should work.
OpenCV has an implementation for that:
Features2D + Homography to find a known object
A problem may be getting a good edge image; those black shapes in the background could be distracting.
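Condensed, that tutorial's pipeline looks roughly like this (a sketch with placeholder filenames; SIFT is in the main cv2 module only from OpenCV 4.4 on, earlier versions need the contrib package):

```python
import cv2
import numpy as np

template = cv2.imread("template_edges.png", cv2.IMREAD_GRAYSCALE)  # placeholder
target = cv2.imread("target_edges.png", cv2.IMREAD_GRAYSCALE)      # placeholder

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(template, None)
kp2, des2 = sift.detectAndCompute(target, None)

# Lowe's ratio test, as in the Features2D + Homography tutorial
matcher = cv2.BFMatcher()
good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]

if len(good) >= 4:
    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    # H maps template coordinates into the target; the RANSAC inlier
    # count doubles as a match-or-not score
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    print("inliers:", int(mask.sum()))
```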
Also see this Stackoverflow answer:
Image Processing: Algorithm Improvement for 'Coca-Cola Can' Recognition
Imagine we have a simple 2D drawing filled with lots of non-overlapping circles and only a few stars.
If we are to find all the stars among all these circles, I can think of very few methods. Brute force is one of them. Another is to reduce the image size (to the point where you can still distinguish the objects) and then apply brute force and map back to the original image. The drawback of brute force, of course, is that it is very time-consuming. I am looking for faster methods, ideally the fastest one.
What is the fastest image processing method to search for the specified item on a simple 2D image?
One typical way of looking for an object in an image is through cross correlation. Basically, you look for the position where the cross-correlation between a mask (the object you're attempting to find) and the image is the highest. That position is the likely location of the object you're trying to find.
For the sake of simplicity, I will refer to the object you're attempting to find as a star, but in general it can be any shape.
Some problems with the above approach:
The size of the mask has to match the size of the star. If you don't know the size of the star, then you will have to try different size masks. Image pyramids are more effective than just iteratively trying different size masks, but still require extra effort.
Similarly, the orientations of the mask and the star have to match. If they don't, the cross-correlation won't work.
For these reasons, the more you know about your problem, the simpler it becomes. This is the reason why people have asked you for more information in the comments. A general purpose solution doesn't really exist, to the best of my knowledge. Maybe someone more knowledgeable can correct me on this.
As you've mentioned, reducing the size of the image will help you reduce the computational time of your approach. In my opinion, it's hardly the core element of a solution -- it's just an optional optimization step.
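For the translation-only case, a minimal normalized cross-correlation sketch (assuming OpenCV and placeholder filenames; per the caveats above, the mask must already be at roughly the right size and orientation):

```python
import cv2

image = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # placeholder filenames
mask = cv2.imread("star.png", cv2.IMREAD_GRAYSCALE)

# Normalized cross-correlation; the peak is the likely object location
result = cv2.matchTemplate(image, mask, cv2.TM_CCORR_NORMED)
_, score, _, top_left = cv2.minMaxLoc(result)
print("best match at", top_left, "with score", score)
```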
If the shapes are easy to segment from the background, you might be able to compute distinguishing shape/color descriptors. Depending on your problem you could choose descriptors that are invariant to scale, translation or rotation (e.g. compactness, if it is unique to each shape). I do not know if this will be faster, though.
If you already know the exact shape and have an idea about the size, you might want to have a look at the Generalized Hough Transform, which is basically a formalized description of your "brute force algorithm".
Since you state that the shapes are non-overlapping, I assume an efficient algorithm would be able to
cut out all the shapes by scanning the image in some way (I can imagine a relatively efficient and simple algorithm for convex shapes)
once you are left with the cut-out shapes, use the cross-correlation misha mentioned
You should describe the problem a bit better:
can the shapes be rotated or scaled (or otherwise transformed)?
is the background a uniform colour?
are the shapes a uniform colour?
are the shapes filled?
Depending on the answers to the above questions, you might have more or less simple solutions.
Also, this article might be interesting.
If the shapes are very regular, turning them into vectors could fit your needs nicely, but it might be overkill; it really depends on what you want to do later.
Step 1: Thresholding - reduce the image to 1 bit (black or white) if the general image set permits it. [For the type of example you cite, my guess is that thresholding would work nicely, leaving enough detail to find objects.]
Step 2: Optionally do some smoothing/noise removal.
Step 3: Use some clustering approach to gather the foreground objects.
Step 4: Use an appropriate heuristic to identify the objects.
The parameters in steps 1 and 2 will depend a lot on the type of images, as well as on experimentation/observation. Step 3 is usually straightforward if you have worked out 1 and 2 correctly. Step 4 will depend very much on the problem (for example, in your case, identifying stars, which depends on the actual shape of the stars expected in the images).
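Here is a rough sketch of steps 1-4 (assuming OpenCV; the circularity heuristic for telling stars from circles, and all thresholds, are my own choices):

```python
import cv2
import numpy as np

img = cv2.imread("drawing.png", cv2.IMREAD_GRAYSCALE)  # placeholder filename

# Step 1: threshold to 1 bit (inverted so shapes are white on black)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Step 2: light noise removal
binary = cv2.medianBlur(binary, 3)

# Step 3: cluster foreground pixels into objects, one contour each
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Step 4: heuristic -- circles have circularity near 1, stars much lower
stars = []
for c in contours:
    area = cv2.contourArea(c)
    if area < 20:  # skip specks
        continue
    perimeter = cv2.arcLength(c, True)
    circularity = 4 * np.pi * area / (perimeter * perimeter)
    if circularity < 0.75:  # arbitrary cut-off separating stars from circles
        stars.append(c)
```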
Do you guys know of any algorithms that can be used to compute the difference between images?
Take this webpage for example: http://tineye.com/ You give it a link or upload an image, and it finds similar images. I doubt that it compares the image in question against all of them (or maybe it does).
By compute I mean something like what the Levenshtein distance or the Hamming distance is for strings.
By no means do I need the correct answer for a project or anything; I just found the website and got very curious. I know Digg pays for a similar service for their website.
The very simplest measures are going to be RMS-error based approaches, for example:
Root Mean Square Deviation
Peak Signal to Noise Ratio
These probably gel with your notion of distance measures, but their results are only really meaningful if you've got two images that are already very close, e.g. if you're looking at how well a particular compression scheme preserved the original image. Also, the same score from either comparison can mean a lot of different things, depending on what kind of artifacts there are (take a look at the paper I cite below for some example photos of how RMS/PSNR can be misleading).
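Both are one-liners if the images are the same size (a plain NumPy sketch, assuming 8-bit images):

```python
import numpy as np

def rmse(a, b):
    """Root mean square error between two equally sized images."""
    return np.sqrt(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means more similar."""
    e = rmse(a, b)
    return float("inf") if e == 0 else 20 * np.log10(peak / e)
```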
Beyond these, there's a whole field of research devoted to image similarity. I'm no expert, but here are a few pointers:
A lot of work has gone into approaches using dimensionality reduction (PCA, SVD, eigenvalue analysis, etc.) to pick out the principal components of the image and compare them across different images.
Other approaches (particularly in medical imaging) use segmentation techniques to pick out important parts of images, then compare the images based on what's found.
Still others have tried to devise similarity measures that get around some of the flaws of RMS error and PSNR. There was a pretty cool paper on the spatial-domain structural similarity (SSIM) measure, which tries to mimic people's perceptions of image error instead of direct, mathematical notions of error. The same group did an improved translation/rotation-invariant version using wavelet analysis in their paper on WSSIM.
It looks like TinEye uses feature vectors with values for lots of attributes to do their comparison. If you hunt around on their site, you eventually get to the Ideé Labs page, and their FAQ has some (but not too many) specifics on the algorithm:
Q: How does visual search work?
A: Idée’s visual search technology uses sophisticated algorithms to analyze hundreds of image attributes such as colour, shape, texture, luminosity, complexity, objects, and regions. These attributes form a compact digital signature that describes the appearance of each image, and these signatures are calculated and indexed by our software. When performing a visual search, these signatures are quickly compared by our search engine to return visually similar results.
This is by no means exhaustive (it's just a handful of techniques I've encountered in the course of my own research), but if you google for technical papers or look through proceedings of recent conferences on image processing, you're bound to find more methods for this stuff. It's not a solved problem, but hopefully these pointers will give you an idea of what's involved.
One technique is to use color histograms. You can then use machine learning algorithms to find similar images based on the representation you use, for example the commonly used k-means algorithm. I have seen other solutions that analyze the vertical and horizontal lines in the image after edge detection. Texture analysis is also used.
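A minimal color-histogram comparison sketch (assuming OpenCV; the bin counts and the choice of correlation as the metric are arbitrary):

```python
import cv2

def hist_similarity(path_a, path_b, bins=(8, 8, 8)):
    """Compare two images by their 3D BGR color histograms."""
    hists = []
    for path in (path_a, path_b):
        img = cv2.imread(path)
        h = cv2.calcHist([img], [0, 1, 2], None, list(bins), [0, 256] * 3)
        cv2.normalize(h, h)
        hists.append(h)
    # 1.0 means identical histograms; lower means less similar
    return cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)
```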
A recent paper clustered images from Picasa Web. You can also try the clustering algorithm that I am working on.
Consider using lossy wavelet compression and comparing the highest-relevance elements of the images.
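Something along those lines, perhaps (a sketch using PyWavelets; keeping the top 1% of coefficients and using cosine similarity are arbitrary choices, and the images must be the same size):

```python
import numpy as np
import pywt

def wavelet_signature(gray, keep_frac=0.01, wavelet="haar", level=3):
    """Sparse wavelet signature: all but the largest coefficients zeroed."""
    coeffs = pywt.wavedec2(gray.astype(np.float64), wavelet, level=level)
    arr, _ = pywt.coeffs_to_array(coeffs)
    flat = arr.ravel()
    k = max(1, int(keep_frac * flat.size))
    cutoff = np.partition(np.abs(flat), -k)[-k]  # k-th largest magnitude
    sparse = np.where(np.abs(flat) >= cutoff, flat, 0.0)
    return sparse / (np.linalg.norm(sparse) + 1e-12)

def wavelet_similarity(a, b):
    """Cosine similarity between the two sparse signatures."""
    return float(np.dot(wavelet_signature(a), wavelet_signature(b)))
```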
What TinEye does is a sort of hashing over the image or parts of it (see their FAQ). It's probably not a real hash function, since they want similar "hashes" for similar (or nearly identical) images. But all they need to do is compare that hash, and probably substrings of it, to know whether the images are similar/identical or whether one is contained in another.
Here's an image similarity page, but it's for polygons. You could convert your image into a finite number of polygons based on color and shape, and run these algorithms on each of them.
Here is some code I wrote four years ago in Java (yikes) that does image comparisons using histograms. Don't look at any part of it other than buildHistograms().
https://jpicsort.dev.java.net/source/browse/jpicsort/ImageComparator.java?rev=1.7&view=markup
Maybe it's helpful, at least if you are using Java.
Correlation techniques will make a match jump out. If they're JPEGs, you could compare the dominant coefficients for each 8x8 block and get a decent match. This isn't exactly correlation, but it's based on a cosine transform, so it's a first cousin.
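A sketch of that idea (decoding to pixels and redoing the 8x8 DCT rather than reading the coefficients straight out of the JPEG; keeping a 4x4 corner of low frequencies per block is my own choice):

```python
import cv2
import numpy as np

def dct_signature(gray, keep=4):
    """Dominant low-frequency DCT coefficients of each 8x8 block."""
    h, w = gray.shape[0] - gray.shape[0] % 8, gray.shape[1] - gray.shape[1] % 8
    f = np.float32(gray[:h, :w])
    sig = []
    for y in range(0, h, 8):
        for x in range(0, w, 8):
            block = cv2.dct(f[y:y + 8, x:x + 8])
            sig.append(block[:keep, :keep].ravel())
    return np.concatenate(sig)

def dct_similarity(a, b):
    """Normalized correlation of the signatures (same-sized images)."""
    sa, sb = dct_signature(a), dct_signature(b)
    return float(np.dot(sa, sb) / (np.linalg.norm(sa) * np.linalg.norm(sb) + 1e-12))
```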