Match Sketch(Drawing) face photo to digital color photo - image

I'm going to match the sketch face (drawing photo) in to the color photo. so for the research i want to find out what are the challenges that matching sketch drawing in to color faces. for now i have find out that
resolution pixel difference
texture difference
distance difference
and color (not much effect)
I want to know (in technical terms) what are other challenges and what are available OPEN CV and JAVA CV method and algorithms to overcome that challenges?
Here is some example of the sketches and the photos that are known to match them:

This problem is called multi-modal face recognition. There has been a lot of interest in comparing a high quality mugshot (modality 1) to low quality surveillance images (modality 2), another is frontal images to profiles, or pictures to sketches like the OP is interested in. Partial Least Squares (PLS) and Tied Factor Analysis (TFA) have been used for this purpose.
A key difficulty is computing two linear projections from the image in modality 1 (and modality 2) to a space where two points being close means that the individual is the same. This is the key technical step. Here are some papers on this approach:
Abhishek Sharma, David W Jacobs : Bypassing Synthesis: PLS for
Face Recognition with Pose, Low-Resolution and Sketch. CVPR
2011.
S.J.D. Prince, J.H. Elder, J. Warrell, F.M. Felisberti, Tied Factor
Analysis for Face Recognition across Large Pose Differences, IEEE
Patt. Anal. Mach. Intell, 30(6), 970-984, 2008. Elder is a specialist in this area and has a variety of papers on the topic.
B. Klare, Z. Li and A. K. Jain, Matching forensic sketches to
mugshot photos, IEEE Pattern Analysis and Machine Intelligence, 29
Sept. 2010.
As you can understand this is an active research area/problem. In terms using OpenCV to overcome the difficulties, let me give you an analogy: you need to build build a house (match sketches to photos) and you're asking how will having a Stanley hammer (OpenCV) will help. Sure, it will probably help. But you'll also need a lot of other resources: wood, time/money, pipes, cable, etc.

I think that James Elder's old work on the completeness of the edge map (using reconstruction by solving the Laplace equation) is quite relevant here. See the results at the end of this paper: http://elderlab.yorku.ca/~elder/publications/journals/ElderIJCV99.pdf

You could give Eigenfaces a try, though i never tested them with sketches i think they could a least be a good starting point for your research.
See Wiki: http://en.wikipedia.org/wiki/Eigenface and the Tutorial for OpenCV: http://docs.opencv.org/modules/contrib/doc/facerec/facerec_tutorial.html (including not only Eigenfaces!)

OpenCV can be used for feature extraction and machine learning required for this task. I guess you can start with the papers in the answers above, start with some basic features and prototype a classifier with OpenCV.
I guess you might also want to detect and match feature points on the faces. If you use this approach, you will have to do the feature point detectors on your own (training the Viola-Jones detector in OpenCV with your own data is an option).

Related

image registration(non-rigid \ nonlinear)

I'm looking for some algorithm (preferably if source code available)
for image registration.
Image deformation can't been described by homography matrix(because I think that distortion not symmetrical and not
homogeneous),more specifically deformations are like barrel/distortion and trapezoid distortion, maybe some rotation of image.
I want to obtain pairs of pixel of two images and so i can obtain representation of "deformation field".
I google a lot and find out that there are some algorithm base on some phisics ideas, but it seems that they can converge
to local maxima, but not global.
I can affort program to be semi-automatic, it means some simple user interation.
maybe some algorithms like SIFT will be appropriate?
but I think it can't provide "deformation field" with regular sufficient density.
if it important there is no scale changes.
example of complicated field
http://www.math.ucla.edu/~yanovsky/Research/ImageRegistration/2DMRI/2DMRI_lambda400_grid_only1.png
What you are looking for is "optical flow". Searching for these terms will yield you numerous results.
In OpenCV, there is a function called calcOpticalFlowFarneback() (in the video module) that does what you want.
The C API does still have an implementation of the classic paper by Horn & Schunck (1981) called "Determining optical flow".
You can also have a look at this work I've done, along with some code (but be careful, there are still some mysterious bugs in the opencl memory code. I will release a corrected version later this year.): http://lts2www.epfl.ch/people/dangelo/opticalflow
Besides OpenCV's optical flow (and mine ;-), you can have a look at ITK on itk.org for complete image registration chains (mostly aimed at medical imaging).
There's also a lot of optical flow code (matlab, C/C++...) that can be found thanks to google, for example cs.brown.edu/~dqsun/research/software.html, gpu4vision, etc
-- EDIT : about optical flow --
Optical flow is divided in two families of algorithms : the dense ones, and the others.
Dense algorithms give one motion vector per pixel, non-dense ones one vector per tracked feature.
Examples of the dense family include Horn-Schunck and Farneback (to stay with opencv), and more generally any algorithm that will minimize some cost function over the whole images (the various TV-L1 flows, etc).
An example for the non-dense family is the KLT, which is called Lucas-Kanade in opencv.
In the dense family, since the motion for each pixel is almost free, it can deal with scale changes. Keep in mind however that these algorithms can fail in the case of large motions / scales changes because they usually rely on linearizations (Taylor expansions of the motion and image changes). Furthermore, in the variational approach, each pixel contributes to the overall result. Hence, parts that are invisible in one image are likely to deviate the algorithm from the actual solution.
Anyway, techniques such as coarse-to-fine implementations are employed to bypass these limits, and these problems have usually only a small impact. Brutal illumination changes, or large occluded / unoccluded areas can also be explicitly dealt with by some algorithms, see for example this paper that computes a sparse image of "innovation" alongside the optical flow field.
i found some software medical specific, but it's complicate and it's not work with simple image formats, but seems that it do that I need.
http://www.csd.uoc.gr/~komod/FastPD/index.html
Drop - Deformable Registration using Discrete Optimization

Face identification with opencv

I'm using the libraries OpenCV for image processing in C + + and this is my question: can you think possible to do a facial recognition (saying the name of a person based on a database of photos) by comparing the frame of videocamera with images in a database using the technique of image histograms comparison? (Note that i compare only the facial region of an image using an example included in the opecv libraries).
I'm asking this because i've just tried to do a program like above but i have a lot of problem (often i detect the wrong person)
You might want to start with compiling the Face Detection using OpenCV example. As others have pointed out, general facial recognition isn't exactly an easy problem to solve. EigenFaces is one common technique for face recognition that is fairly easy to understand and implement.
As others have stated, it's a hard problem, but this gives you a place to start.
Some method I had experience with them are
metric learning for comparing faces
naming video characters: they use SIFT descriptors computed at specific feducial points on each face. Their code worked quite well for me in the past.
A dataset and benchmark that is dedicated for this task is labeled faces in the wild. You can find there references to working methods for comparing faces after detection.
UPDATE:
I have a description of an experiment on face clustering: unsupervised face identification.
The experiment is described in Section 4.4 of my thesis.
The basic flow is as follows
Metric learning: how to determine if two faces are of the same person or not.
This part is supervised, in the sense that it requires as input face images labeled with the identity of the person who appears in each photo.
a. Detect fiducial points (eyes, corner of mouth, nose).
You may use this code, or more recent versions such as this one.
b. Extract SIFT descriptors at the detected fiducial points.
c. Construct a "face descriptor": each face is described using a single vector.
This vector is a concatenation of the sqrt of all the SIFT descriptors.
d. Use the method described here to learn a mahalanobis distance between faces of different persons.
Unsupervised face identification: Once a metric was learned, you may use new photos of new people (these people need not be part of the training set, you may use photos of unseen-before people!).
a. Repeat stages a-c to construct the same "face descriptor" vector for each input face.
b. Compare the descriptor vectors using the learned mahalanobis distance.
I suggest using an existing algorithm such as the one available in the Luxand FaceSDK: http://www.luxand.com/facesdk/ rather than trying to develop your own.
there are 3 builtin techniques for face-recognition in opencv now, pca(eigenfaces), lda(fisherfaces) and lbph.
nice example code:
https://github.com/Itseez/opencv/blob/master/samples/cpp/facerec_demo.cpp

Explaining the AdaBoost Algorithms to non-technical people

I've been trying to understand the AdaBoost algorithm without much success. I'm struggling with understanding the Viola Jones paper on Face Detection as an example.
Can you explain AdaBoost in laymen's terms and present good examples of when it's used?
Adaboost is an algorithm that combines classifiers with poor performance, aka weak learners, into a bigger classifier with much higher performance.
How does it work? In a very simplified manner:
Train a weak learner.
Add it to the set of weak learners trained so far (with an optimal weight)
Increase the importance of samples that are still miss-classified.
Go to 1.
There is a broad and detailed theory behind the scenes, but the intuition is just that: let each "dumb" classifier focus on the mistakes the previous ones were not able to fix.
AdaBoost is one of the most used algorithms in the machine learning community. In particular, it is useful when you know how to create simple classifiers (possibly many different ones, using different features), and you want to combine them in an optimal way.
In Viola and Jones, each different type of weak-learner is associated to one of the 4 or 5 different Haar features you can have.
AdaBoost uses a number of training sample images (such as faces) to pick a number of good 'features'/'classifiers'. For face recognition a classifiers is typically just a rectangle of pixels that has a certain average color value and a relative size. AdaBoost will look at a number of classifiers and find out which one is the best predictor of a face based on the sample images. After it has chosen the best classifier it will continue to find another and another until some threshold is reached and those classifiers combined together will provide the end result.
This part you may not want to share with non-technical people :) but it is interesting anyway. There are several mathematical tricks which make AdaBoost fast for face recognition such as the ability to add up all the color values of an image and store them in a 2 dimensional array so that the value in any position will be the sum of all the pixels up and to the left of that position. This array can be used to very quickly calculate the average color value of any rectangle within the image by subtracting the value found in the top left corner from the value found in the bottom right corner and dividing by the number of pixels in the rectangle. Using this trick you can quickly scan over an entire image looking for rectangles of different relative sizes that match or are close to a particular color.
Hope this helps.
This is understandable. Most of the papers you can find on Internet retell Viola-Jones and Freund-Shapire papers which are foundation of AdaBoost applied for face recognition in OpenCV. And they mostly consist of difficult formulas and algorithms from several mathematical areas combined.
Here is what can help you (short enough) -
1 - It is used in object and, mostly, in face detection-recognition.The most popular and quite good C++ library is OpenCV from Intel originally. I take the part of Face detection in OpenCV, as an example.
2 - First, a cascade of boosted classifiers working with sample rectangles ("features") is trained on sample of images with faces (called positive) and without faces (negative).
From some Googled paper:
"· Boosting refers to a general and provably effective method of producing a very accurate classifier by combining rough and moderately inaccurate rules of thumb.
· It is based on the observation that finding many rough rules of thumb can be a lot easier than finding a single, highly accurate classifier.
· To begin, we define an algorithm for finding the rules of thumb, which we call a weak learner.
· The boosting algorithm repeatedly calls this weak learner, each time feeding it a different distribution over the training data (in AdaBoost).
· Each call generates a weak classifier and we must combine all of these into a single classifier that, hopefully, is much more accurate than any one of the rules."
During this process the images are scanned to determine the distinctive areas corresponding to certain part of every face. The complex calculation-hypothesis based algorithms are applied (which are not so difficult to understand once you get the main idea).
This can take a week and the output is an XML file which contains the learned information on how to quickly detect the human face, say, in frontal position on any picture (it can be any object in other case).
3 - After that you supply this file to OpenCV face detection program which runs quite fast with up to 99% positive rate (depending on conditions).
As was mentioned here, the scanning speed can be increased greatly with technique known as "integral image".
And finally, these are helpful sources - Object Detection in OpenCV and
Generic Object Detection using AdaBoost from University of California, 2008.

Computing the difference between images

Do you guys know of any algorithms that can be used to compute difference between images?
Take this webpage for example http://tineye.com/ You give it a link or upload an image and it finds similiar images. I doubt that it compares the image in question against all of them (or maybe it does).
By compute I mean like what the Levenshtein_distance or the Hamming distance is for strings.
By no means do I need to the correct answer for a project or anything, I just found the website and got very curious. I know digg pays for a similiar service for their website.
The very simplest measures are going to be RMS-error based approaches, for example:
Root Mean Square Deviation
Peak Signal to Noise Ratio
These probably gel with your notions of distance measures, but their results are really only meaningful if you've got two images that are very close already, like if you're looking at how well a particular compression scheme preserved the original image. Also, the same result from either comparison can mean a lot of different things, depending on what kind of artifacts there are (take a look at the paper I cite below for some example photos of RMS/PSNR can be misleading).
Beyond these, there's a whole field of research devoted to image similarity. I'm no expert, but here are a few pointers:
A lot of work has gone into approaches using dimensionality reduction (PCA, SVD, eigenvalue analysis, etc) to pick out the principal components of the image and compare them across different images.
Other approaches (particularly medical imaging) use segmentation techniques to pick out important parts of images, then they compare the images based on what's found
Still others have tried to devise similarity measures that get around some of the flaws of RMS error and PSNR. There was a pretty cool paper on the spatial domain structural similarity (SSIM) measure, which tries to mimic peoples' perceptions of image error instead of direct, mathematical notions of error. The same guys did an improved translation/rotation-invariant version using wavelet analysis in this paper on WSSIM.
It looks like TinEye uses feature vectors with values for lots of attributes to do their comparison. If you hunt around on their site, you eventually get to the Ideé Labs page, and their FAQ has some (but not too many) specifics on the algorithm:
Q: How does visual search work?
A: Idée’s visual search technology uses sophisticated algorithms to analyze hundreds of image attributes such as colour, shape, texture, luminosity, complexity, objects, and regions.These attributes form a compact digital signature that describes the appearance of each image, and these signatures are calculated by and indexed by our software. When performing a visual search, these signatures are quickly compared by our search engine to return visually similar results.
This is by no means exhaustive (it's just a handful of techniques I've encountered in the course of my own research), but if you google for technical papers or look through proceedings of recent conferences on image processing, you're bound to find more methods for this stuff. It's not a solved problem, but hopefully these pointers will give you an idea of what's involved.
One technique is to use color histograms. You can use machine learning algorithms to find similar images based on the repesentation you use. For example, the commonly used k-means algorithm. I have seen other solutions trying to analyze the vertical and horizontal lines in the image after using edge detection. Texture analysis is also used.
A recent paper clustered images from picasa web. You can also try the clustering algorithm that I am working on.
Consider using lossy wavelet compression and comparing the highest relevance elements of the images.
What TinEye does is a sort of hashing over the image or parts of it (see their FAQ). It's probably not a real hash function since they want similar "hashes" for similar (or nearly identical) images. But all they need to do is comparing that hash and probably substrings of it, to know whether the images are similar/identical or whether one is contained in another.
Heres an image similarity page, but its for polygons. You could convert your image into a finite number of polygons based on color and shape, and run these algorithm on each of them.
here is some code i wrote, 4 years ago in java yikes that does image comparisons using histograms. dont look at any part of it other than buildHistograms()
https://jpicsort.dev.java.net/source/browse/jpicsort/ImageComparator.java?rev=1.7&view=markup
maybe its helpful, atleast if you are using java
Correlation techniques will make a match jump out. If they're JPEGs you could compare the dominant coefficients for each 8x8 block and get a decent match. This isn't exactly correlation but it's based on a cosine transfore, so it's a first cousin.

robust algorithm for surface reconstruction from 3D point cloud?

I am trying to figure out what algorithms there are to do surface reconstruction from 3D range data. At a first glance, it seems that the Ball pivoting algorithm (BPA) and Poisson surface reconstruction are the more established methods?
What are the established, more robust algorithm in the field other than BPA and Poisson surface reconstruction algorithm?
Recommended research publications?
Is there available source code?
I have been facing this dilemma for some months now, and made exhaustive research.
Algorithms
Mainly there are 2 categories of algorithms: computation geometry, and implicit surfaces.
Computation Geometry
They fit the mesh on the existing points.
Probably the most famous algorithm of this group is powercrust, because it is theoretically well-established - it guarantees watertight mesh.
Ball Pivoting is patented by IBM. Also, it is not suitable for pointclouds with varying point density.
Implicit functions
One fits implicit functions on the pointcloud, then uses a marching-cube-like algorithm to extract the zero-set of the function into a mesh.
Methods in this category differ mainly by the different implicit functions used.
Poisson, Hoppe's, and MPU are the most famous algorithms in this category. If you are new to the topic, i recommend to read Hoppe's thesis, it is very explanatory.
The algorithms of this category usually can be implemented so that they are able to process huge inputs very efficiently, and one can scale their quality<->speed trade-off. They are not disturbed by noise, varying point-density, holes. A disadvantage of them is that they require consistently oriented surface normals at the input points.
Implementations
You will find small number of free implementations. However it depends on whether You are going to integrate it into free software (in this case GPL license is acceptable for You) or into a commercial software (in this case You need a more liberal license). The latter is very rare.
One is in VTK. I suspect it to be difficult to integrate (no documentation is available for free), it has a strange, over-complicated architecture, and is not designed for high-performance applications. Also has some limitations for the allowed input pointclouds.
Take a look at this Poisson implementation, and after that share your experience about it with me please.
Also:
here are a few high-performance algorithms, with surface reconstruction among them.
CGAL is a famous 3d library, but it is free only for free projects.
Meshlab is a famous application with GPL.
Also (Added August 2013):
The library PCL has a module dedicated to surface reconstruction and is in active development (and is part of Google's Summer of Code). The surface module contains a number of different algorithms for reconstruction. PCL also has the ability to estimate surface normals, incase you do not have them provided with your point data, this functionality can be found in the features module. PCL is released under the terms of the BSD license and is open source software, it is free for commercial and research use.
If you want make some direct experiments with various surface reconstruction algorithms you should try MeshLab, the mesh-processing system, it is open source and it contains implementations of many of the previously cited surface reconstruction algorithms, like:
Poisson Surface Recon
a couple of MLS based approach,
a ball pivoting implementation
a variant of the Curless volume based approach
Delaunay based techniques (Alpha shapes and Voronoi filtering)
tools for computing normals from scattered point sets
and many other tools for comparing/measuring/cleaning/simplifying the resulting meshes.
Sources are protected by GPL, so you could not use them in a commercial closed source project, but it is very important to get the right feeling about the properties of the various surface reconstruction algorithms (how sensitive to noise they are, the speed, the robustness to outliers, how they preserve fine details etc etc) before starting to implement one of them.
You might start looking at some recent work in the field - currently something like Fast low-memory streaming MLS reconstruction of point-sampled surfaces by Gianmauro Cuccuru, Enrico Gobbetti, Fabio Marton, Renato Pajarola, and Ruggero Pintus. Its citations can get you going through the literature pretty quickly.
While not a mesh representation, an ex-colleague recommended me this link
to source code for a Thin Plate Spline method:
Link
Anyone tried it?
Not sure if it's exactly right for your case, since it seems weird that you omitted it, but marching cubes is commonly mentioned in cases like these.
As I had this problem too, I did develop and implement my own point cloud crust algorithm. The sources, as well as the documentation, can be found on github.com: https://github.com/meixxi/PointCloudCrust. The algorithm is implemented in Java.
Maybe, this can help you. You can find also a short python script on the page which illustrates how to use the library. Have fun!
Here on GitHub, is a open source Mesh Processing Library in C++ by Dr. Hugues Hoppe, in which the surface reconstruction program Recon is a good option for your problem...
There is 3D Delaunay tool by Geometric Tools. This tool is used DirecX and OpenGL. Unfortunately, you may need buy a book to see actual example code of the library. You still read the code and figure out.
Matlab also introduced a surface reconstruction tool using Delaunay, delaunayTriangulation class.
You might be interested in Alpha Shapes.

Resources