I'm working on a small program for optical mark recognition.
The processing of the scanned form consists of two steps:
1) Find the form in the scanned image, deskew it, and crop the borders.
2) With this "normalized" form, I can simply search the marks by using coordinates from the original document and so on.
For the first step, I'm currently using the homography functions from OpenCV and a perspective transform to map the points. I also tried the SurfDetector.
However, both approaches are quite slow and do not really meet the speed requirements for scanning forms from a document scanner.
Can anyone point me to an alternative algorithm/solution for this specific problem?
Thanks in advance!
Try the ORB or FAST detectors: they should be faster than SURF (documentation here).
If those don't meet your speed requirement, you should probably use a different approach. Do you need scale and rotation invariance? If not, you could try cross-correlation.
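For illustration, here is a minimal C++ sketch of ORB matching, assuming OpenCV 3.x; the file names are placeholders for your template form and the incoming scan:

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat form = cv::imread("form.png", cv::IMREAD_GRAYSCALE); // reference form
    cv::Mat scan = cv::imread("scan.png", cv::IMREAD_GRAYSCALE); // scanned page

    // Cap the number of keypoints to keep detection fast.
    cv::Ptr<cv::ORB> orb = cv::ORB::create(500);
    std::vector<cv::KeyPoint> kpForm, kpScan;
    cv::Mat descForm, descScan;
    orb->detectAndCompute(form, cv::noArray(), kpForm, descForm);
    orb->detectAndCompute(scan, cv::noArray(), kpScan, descScan);

    // ORB produces binary descriptors, so match with Hamming distance.
    cv::BFMatcher matcher(cv::NORM_HAMMING, true /*crossCheck*/);
    std::vector<cv::DMatch> matches;
    matcher.match(descForm, descScan, matches);
    // ...feed the matched point pairs into cv::findHomography as before.
    return 0;
}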
The Viola-Jones cascade classifier is pretty quick. It is used in OpenCV for face detection, but you can train it for a different purpose. Depending on the appearance of what you call your "form", you can use simpler algorithms such as cross-correlation, as Muffo said.
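If you drop scale and rotation invariance, cross-correlation boils down to template matching. A minimal C++ sketch, assuming OpenCV 3.x; "anchor.png" is a hypothetical crop of a fixed mark on the form:

#include <opencv2/opencv.hpp>

int main() {
    cv::Mat scan   = cv::imread("scan.png",   cv::IMREAD_GRAYSCALE);
    cv::Mat anchor = cv::imread("anchor.png", cv::IMREAD_GRAYSCALE);

    // Slide the anchor patch over the scan and record the correlation score.
    cv::Mat response;
    cv::matchTemplate(scan, anchor, response, cv::TM_CCOEFF_NORMED);

    double maxVal;
    cv::Point maxLoc;
    cv::minMaxLoc(response, nullptr, &maxVal, nullptr, &maxLoc);
    // maxLoc is the top-left corner of the best match; maxVal its score in [-1, 1].
    return 0;
}

Locating two or three such anchor marks is enough to compute the deskewing transform directly.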
I am trying to separate the different kinds of grains in an image. Sometimes the image also contains impurities, which need to be treated as an extra class.
Here are some example images:
corn and beans
long rice and wheat
I tried to find a general method that works across the different pictures, but the results are not good enough.
I used flood fill and some gradient-based methods to get the regions, and tried to classify their contents with clustering, but feature selection is a hard problem: I tried Gabor filters, but they do not give me a clear boundary, and classification methods such as k-means fail as well.
Any ideas about segmentation, contour extraction, or classification would be appreciated. Thanks!
I tried to post some more pictures of my current results, but unfortunately there is a two-picture limit for beginners here.
Dealing with image processing problems is almost craft work. I would suggest you use a robust library (such as OpenCV, of course) and its cvFindContours function to identify the contours. Also, look into mathematical morphology: basic operators may help you, since erosion shrinks areas of foreground pixels and enlarges the holes within them, while dilation does the opposite. Working with color segmentation is also helpful, but you might run into trouble since grain color is not uniform. Lastly, feature extraction is another way out: the scale-invariant feature transform (SIFT) can be used to identify every single grain in the image, since it is invariant to linear transformations and robust to illumination issues. Hope it helps.
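As a rough illustration of the morphology-plus-contours route, here is a minimal C++ sketch using the current OpenCV API (findContours instead of the legacy cvFindContours); "grains.png" and the kernel size are placeholders to tune:

#include <opencv2/opencv.hpp>
#include <vector>

int main() {
    cv::Mat img = cv::imread("grains.png", cv::IMREAD_GRAYSCALE);

    // Otsu thresholding separates grains from the background.
    cv::Mat bw;
    cv::threshold(img, bw, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

    // Opening (erosion followed by dilation) removes small specks and
    // helps separate grains that barely touch.
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::morphologyEx(bw, bw, cv::MORPH_OPEN, kernel);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bw, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    // Each contour is a candidate grain; area and shape features computed
    // from the contours can then feed a classifier.
    return 0;
}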
First of all, I have to say I'm new to the field of computer vision, and I'm currently facing a problem that I tried to solve with OpenCV (Java wrapper) without success.
Basically, I have a picture of a part of a model taken by a camera (different angles, resolutions, rotations...) and I need to find the position of that part in the model.
Example Picture:
Model Picture:
So one question is: Where should I start/which algorithm should I use?
My first try was keypoint matching with SURF as detector and descriptor and a brute-force (BF) matcher.
It worked for about 2 pictures out of 10. I used the default parameters and tried other detectors, without any improvement. (Maybe it's a question of the right parameters. But how do I find the right parameters combined with the right algorithm?...)
Two examples:
My second try was to use color to differentiate certain elements in the model and to compare the structure with the model itself (in addition to the picture of the model, I also have an XML representation of it...).
Right now I extract the color red from the image and adjust the H, S, V values manually to get the best detection; this works for about 4 pictures but fails for the others.
Two examples:
I also tried edge detection (Canny on a grayscale image with histogram equalization) to detect geometric structures. For some results I could imagine that it would work, but using the same Canny parameters for other pictures "fails". Two examples:
As I said, I'm not familiar with computer vision and have just tried out some algorithms. My problem is that I don't know which combination of algorithms and techniques is best, and, on top of that, which parameters I should use. Testing them all manually seems impossible.
Thanks in advance
gemorra
Your initial idea of using SURF features was actually very good; just try to understand how the parameters of this algorithm work and you should be able to register your images. A good starting point is to vary only the Hessian threshold, and to be fearless while doing so: your features are quite well defined, so try thresholds around 2000 and above (increasing in steps of 500-1000 until you get good results is totally fine).
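For illustration, a minimal C++ sketch of varying the Hessian threshold, assuming OpenCV 3.x built with the opencv_contrib xfeatures2d module (where SURF lives); the starting value of 2000 follows the advice above:

#include <opencv2/opencv.hpp>
#include <opencv2/xfeatures2d.hpp>
#include <vector>

int main() {
    cv::Mat img = cv::imread("part.png", cv::IMREAD_GRAYSCALE); // placeholder file

    // Raise the threshold in steps of 500-1000 until only strong,
    // well-defined features survive.
    double hessianThreshold = 2000.0;
    cv::Ptr<cv::xfeatures2d::SURF> surf =
        cv::xfeatures2d::SURF::create(hessianThreshold);

    std::vector<cv::KeyPoint> keypoints;
    cv::Mat descriptors;
    surf->detectAndCompute(img, cv::noArray(), keypoints, descriptors);
    // Fewer, stronger keypoints usually match more reliably across views.
    return 0;
}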
Alternatively, you can try to detect your ellipses, calculate an affine warp that normalizes them, and run a cross-correlation to register them. This alternative does imply much more work, but is quite fascinating. Some ideas on that normalization using the covariance matrix and its Cholesky decomposition here.
Let's say I have this image:
It has a black scratch that I want to remove from my image. I know it is noise. I have tried a neighbourhood filter and also a Gaussian filter, but with no success.
If you know the location of the scratch, this problem is known as inpainting, and there are very sophisticated algorithms for it. So one approach would be to detect the scratch as well as you can, then run a standard inpainting algorithm on it. I've played with your image in Mathematica a little:
First I applied a median filter to the image. As you found out yourself, this removes the scratch, but also removes a lot of detail. The difference between median and original image is a good indicator for your scratch, though:
When I binarize this image with a manually selected threshold, I get a quick&dirty scratch detector:
If you have more knowledge about what your scratches look like, you can improve this detector a lot. E.g., are the scratches always dark? Do they always have high contrast? Are they always smooth curves, i.e. is their curvature always low? Each of these properties can be measured somehow, so you'd combine these measurements into a single image and binarize that.
One small improvement is to remove small components:
This is still not perfect, but the result is good enough to use it as an inpainting mask:
This will remove some detail, too, but the differences are harder to spot.
Full Mathematica code:
(* sourceImage is the scratched input image; the difference between it and
   a median-filtered version highlights the scratch *)
difference = ImageDifference[sourceImage, MedianFilter[sourceImage, 2]];
(* binarize with a manually selected threshold, then drop tiny components *)
mask = DeleteSmallComponents[Binarize[difference, 0.15], 15];
(* fill the masked pixels from their surroundings *)
Inpaint[sourceImage, mask]
EDIT:
If you don't have access to a standard inpainting algorithm (like Navier-Stokes or Telea), a poor man's algorithm would be to use the median-filtered image in those regions where the mask is 1 (probably something like mask*medianFilteredImage + (1-mask)*sourceImage in Matlab). Depending on the image data, the difference might not be worth the extra effort of a "real" inpainting algorithm:
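For OpenCV users, here is a minimal C++ counterpart to the pipeline above, assuming OpenCV 3.x; the filter size and threshold are hand-tuned placeholders, and cv::inpaint offers both the Navier-Stokes and Telea algorithms:

#include <opencv2/opencv.hpp>
#include <opencv2/photo.hpp>

int main() {
    cv::Mat src = cv::imread("scratched.png", cv::IMREAD_GRAYSCALE); // placeholder

    // Median-filter, take the absolute difference, binarize:
    // the same quick & dirty scratch detector as above.
    cv::Mat median, diff, mask;
    cv::medianBlur(src, median, 5);
    cv::absdiff(src, median, diff);
    cv::threshold(diff, mask, 40, 255, cv::THRESH_BINARY);

    cv::Mat restored;
    cv::inpaint(src, mask, restored, 3 /*radius*/, cv::INPAINT_TELEA);

    // Poor man's fallback: copy median values only where the mask is set.
    // median.copyTo(src, mask);
    return 0;
}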
There is a filter for Avisynth and a plugin for VirtualDub (my two favourite video editing tools). It will hardly get better than these two (you can learn from them if you really need to implement it yourself).
My result using median filter with ImageJ
I'm using the OpenCV libraries for image processing in C++, and this is my question: do you think it is possible to do facial recognition (saying the name of a person based on a database of photos) by comparing a frame from the video camera with the images in a database, using histogram comparison? (Note that I compare only the facial region of an image, using an example included in the OpenCV libraries.)
I'm asking because I've just tried to write a program like the above, but I have a lot of problems (I often detect the wrong person).
You might want to start with compiling the Face Detection using OpenCV example. As others have pointed out, general facial recognition isn't exactly an easy problem to solve. EigenFaces is one common technique for face recognition that is fairly easy to understand and implement.
As others have stated, it's a hard problem, but this gives you a place to start.
Some methods I have experience with are:
metric learning for comparing faces
naming video characters: they use SIFT descriptors computed at specific fiducial points on each face. Their code worked quite well for me in the past.
A dataset and benchmark dedicated to this task is Labeled Faces in the Wild. There you can find references to working methods for comparing faces after detection.
UPDATE:
I have a description of an experiment on face clustering: unsupervised face identification.
The experiment is described in Section 4.4 of my thesis.
The basic flow is as follows:
Metric learning: how to determine if two faces are of the same person or not.
This part is supervised, in the sense that it requires as input face images labeled with the identity of the person who appears in each photo.
a. Detect fiducial points (eyes, corner of mouth, nose).
You may use this code, or more recent versions such as this one.
b. Extract SIFT descriptors at the detected fiducial points.
c. Construct a "face descriptor": each face is described using a single vector.
This vector is a concatenation of the square roots of all the SIFT descriptors.
d. Use the method described here to learn a Mahalanobis distance between faces of different people.
Unsupervised face identification: once a metric has been learned, you may use new photos of new people (these people need not be part of the training set; you may use photos of people never seen before!).
a. Repeat stages a-c to construct the same "face descriptor" vector for each input face.
b. Compare the descriptor vectors using the learned Mahalanobis distance (a minimal sketch follows below).
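For illustration, a minimal C++ sketch of step b, assuming OpenCV; faceA, faceB and icovar are hypothetical names for the concatenated descriptors from stage 1c and the inverse covariance matrix produced by the metric learning:

#include <opencv2/opencv.hpp>

// Distance between two face descriptors under the learned metric.
double faceDistance(const cv::Mat& faceA, const cv::Mat& faceB,
                    const cv::Mat& icovar) {
    // cv::Mahalanobis expects floating-point vectors and the *inverse*
    // of the covariance matrix estimated on the training faces.
    return cv::Mahalanobis(faceA, faceB, icovar);
}

Thresholding this distance gives the same/not-same decision; clustering on it gives the unsupervised identification.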
I suggest using an existing algorithm such as the one available in the Luxand FaceSDK: http://www.luxand.com/facesdk/ rather than trying to develop your own.
There are three built-in techniques for face recognition in OpenCV now: PCA (Eigenfaces), LDA (Fisherfaces), and LBPH.
Nice example code:
https://github.com/Itseez/opencv/blob/master/samples/cpp/facerec_demo.cpp
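For illustration, a minimal C++ sketch of the PCA (Eigenfaces) option, assuming OpenCV 2.4 with the contrib module; images, labels and probe are placeholders for your own data:

#include <opencv2/opencv.hpp>
#include <opencv2/contrib/contrib.hpp>
#include <vector>

int main() {
    std::vector<cv::Mat> images; // equally sized grayscale face crops
    std::vector<int> labels;     // one integer identity per image
    // ...fill images/labels from your photo database...

    cv::Ptr<cv::FaceRecognizer> model = cv::createEigenFaceRecognizer();
    model->train(images, labels);

    cv::Mat probe; // a new face crop, same size as the training images
    int predictedLabel = model->predict(probe);
    return 0;
}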
Sometimes two image files may be different on a file level, but a human would consider them perceptively identical. Given that, now suppose you have a huge database of images, and you wish to know if a human would think some image X is present in the database or not. If all images had a perceptive hash / fingerprint, then one could hash image X and it would be a simple matter to see if it is in the database or not.
I know there is research around this issue, and some algorithms exist, but is there any tool, like a UNIX command line tool or a library I could use to compute such a hash without implementing some algorithm from scratch?
Edit: relevant code from findimagedupes, using ImageMagick:
try $image->Sample("160x160!");
try $image->Modulate(saturation=>-100);
try $image->Blur(radius=>3,sigma=>99);
try $image->Normalize();
try $image->Equalize();
try $image->Sample("16x16");
try $image->Threshold();
try $image->Set(magick=>'mono');
($blob) = $image->ImageToBlob();
Edit: Warning! The ImageMagick $image object seems to contain information about the creation time of the image file that was read in. This means the blob you get can differ even for the same image if it was read at a different time. To make sure the fingerprint stays the same, use $image->getImageSignature() as the last step.
findimagedupes is pretty good. You can run "findimagedupes -v fingerprint images" to make it print the "perceptive hash", for example.
Cross-correlation or phase correlation will tell you if the images are the same, even with noise, degradation, and horizontal or vertical offsets. Using the FFT-based methods will make it much faster than the algorithm described in the question.
The usual algorithm doesn't work for images that are not the same scale or rotation, though. You could pre-rotate or pre-scale them, but that's really processor intensive. Apparently you can also do the correlation in a log-polar space and it will be invariant to rotation, translation, and scale, but I don't know the details well enough to explain that.
MATLAB example: Registering an Image Using Normalized Cross-Correlation
Wikipedia calls this "phase correlation" and also describes making it scale- and rotation-invariant:
The method can be extended to determine rotation and scaling differences between two images by first converting the images to log-polar coordinates. Due to properties of the Fourier transform, the rotation and scaling parameters can be determined in a manner invariant to translation.
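For illustration, a minimal C++ sketch of plain (translation-only) phase correlation, assuming OpenCV 3.x and two same-sized images; the file names are placeholders:

#include <opencv2/opencv.hpp>

int main() {
    cv::Mat a = cv::imread("a.png", cv::IMREAD_GRAYSCALE);
    cv::Mat b = cv::imread("b.png", cv::IMREAD_GRAYSCALE);

    // phaseCorrelate wants single-channel floating-point images.
    cv::Mat fa, fb;
    a.convertTo(fa, CV_32F);
    b.convertTo(fb, CV_32F);

    cv::Point2d shift = cv::phaseCorrelate(fa, fb);
    // A near-zero shift with a strong correlation peak means
    // "same image, possibly translated".
    return 0;
}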
A colour histogram is good for finding the same image after it has been resized, resampled, etc.
If you want to match different people's photos of the same landmark, it's trickier: look at Haar classifiers. OpenCV is a great free library for image processing.
I don't know the algorithm behind it, but Microsoft Live Image Search just added this capability. Picasa also has the ability to identify faces in images, and groups faces that look similar. Most of the time, it's the same person.
Some machine learning technology like a support vector machine, neural network, naive Bayes classifier or Bayesian network would be best at this type of problem. I've written one each of the first three to classify handwritten digits, which is essentially image pattern recognition.
Resize the image to 1x1 pixel... if they are exactly equal, there is a small probability that they are the same picture.
Now resize to a 2x2 pixel image; if all 4 pixels are exactly equal, there is a larger probability that they are the same...
Then 3x3; if all 9 pixels are exactly equal... good chance, etc.
Then 4x4; if all 16 pixels are exactly equal... better chance.
Etc...
Doing it this way, you can make efficiency improvements... if the 1x1 pixel grids are off by a lot, why bother checking the 2x2 grids? Etc.
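For illustration, a minimal C++ sketch of this coarse-to-fine check, assuming OpenCV 3.x and two images of the same type and channel count; the tolerance is an arbitrary placeholder:

#include <opencv2/opencv.hpp>

bool probablySame(const cv::Mat& a, const cv::Mat& b, double tolerance = 10.0) {
    // Compare 1x1, 2x2, 4x4, 8x8 and 16x16 thumbnails, cheapest first.
    for (int n = 1; n <= 16; n *= 2) {
        cv::Mat ta, tb;
        cv::resize(a, ta, cv::Size(n, n), 0, 0, cv::INTER_AREA);
        cv::resize(b, tb, cv::Size(n, n), 0, 0, cv::INTER_AREA);
        if (cv::norm(ta, tb, cv::NORM_INF) > tolerance)
            return false; // already differs at a coarse scale: stop early
    }
    return true; // matched at every scale up to 16x16
}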
If you have lots of images, a color histogram could be used to get a rough measure of closeness before doing a full comparison of every image against every other one (which is O(n^2)).
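For illustration, a minimal C++ sketch of such a histogram pre-filter, assuming OpenCV 3.x and 8-bit BGR inputs; the bin counts are placeholders:

#include <opencv2/opencv.hpp>

// Returns a similarity score: 1.0 = identical histograms, near zero or
// negative = very different. Only close pairs need the full comparison.
double histogramSimilarity(const cv::Mat& a, const cv::Mat& b) {
    // 3-D BGR histogram with 8 bins per channel.
    int channels[] = {0, 1, 2};
    int histSize[] = {8, 8, 8};
    float range[] = {0, 256};
    const float* ranges[] = {range, range, range};

    cv::Mat ha, hb;
    cv::calcHist(&a, 1, channels, cv::Mat(), ha, 3, histSize, ranges);
    cv::calcHist(&b, 1, channels, cv::Mat(), hb, 3, histSize, ranges);
    cv::normalize(ha, ha);
    cv::normalize(hb, hb);

    return cv::compareHist(ha, hb, cv::HISTCMP_CORREL);
}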
There is DPEG, "The" Duplicate Media Manager, but its code is not open. It's a very old tool - I remember using it in 2003.
You could use diff to see if they are REALLY different... I guess it would eliminate lots of useless comparisons. Then, for the algorithm, I would use a probabilistic approach: what are the chances that they look the same? I'd base that on the RGB values of each pixel. You could also use other metrics such as luminosity and the like.