Extracting trees from image without picking up background vegetation? - algorithm

I do not have a background in image recognition/feature extraction, but I am in need of a way to extract trees from an image without the background vegetation.
Seen above is a small example of the kind of imagery I'm working with. I have access to multi-spectral imagery as well (though I haven't seen it yet) including NDVI, NIR, Red-edge.
From researching the problem at hand, I am aware that feature extraction is an active area of research and it seems that often supervised and unsupervised machine learning is employed in combination with statistical voodoo such as "PCA". Being able to differentiate between trees and background vegetation has been noted as an area of difficulty in some papers I skimmed over in my research.
There are notable features about the imagery I am working with. First of all, the palm trees have a very distinctive shape. Not only this, but there are obvious differences in the texture of the trees vs the texture of the background vegetation.
I am not an academic, and as such I only have access to publicly available papers for my research. I am looking for relevant algorithms that could help me extract the features of interest to me (trees) that either have an implementation (ideally in C or bindings to C, though I'm aware that it is not a commonly used language in this field) or with publicly available papers/tutorials/sites/etc. detailing the algorithm so that I could implement it myself.
Thanks in advance for any help!

Look into OpenCV, It has a lot of options for supervised/semi supervised Learning methods. As you have mentioned there is a visible texture difference between the tress and background vegetation, a good place for you would be to start would be color based segmentation and evolving it to use textures as well. OpenCV ML tutorial is a good starting point. Moreover you can also combine the NDVI data to create a stronger feature set.

Related

Simple morphing animation between two images

I'm looking to implement a simple morphing animation between two images.
Here's a simple demo of what I'm trying to create: http://i.imgur.com/7377yHr.gif
I'm pretty comfortable with Objective-C and JavaScript but since the concepts and algorithms are abstract, I'm more than willing to see examples in any language or framework.
I would like to know how hard it would be to tackle this -- it doesn't have to be exact but as long as it gives the impression of a morph I'll be satisfied.
Where would I start?
It seems like in your example is being used a combination of mesh wrap morphing and cross dissolve morphing. Mesh morphing can be tricky and as far as I know it requires a manual input (defining the mesh), so depending on what you want to do it might not be suitable for you.
If you are looking for a cheap technique (in terms of effort), probably just doing cross dissolve would work for you, since is very easy to implement. You just need to combine both images by increasing the alpha of the target image and decreasing the alpha on the origin image.
These articles give an overview of the techniques:
[PDF] http://css1a0.engr.ccny.cuny.edu/~wolberg/pub/vc98.pdf
[PDF] http://www.sorging.ro/en/member/serveFile/format/pdf/slug/image-morphing-techniques
[PDF] http://cs.haifa.ac.il/hagit/courses/ip/Lectures/Ip05_GeomOper.pdf
The last link comes from a comment in a similar question: Morphing, 3 algorithms, image processing

Match Sketch(Drawing) face photo to digital color photo

I'm going to match the sketch face (drawing photo) in to the color photo. so for the research i want to find out what are the challenges that matching sketch drawing in to color faces. for now i have find out that
resolution pixel difference
texture difference
distance difference
and color (not much effect)
I want to know (in technical terms) what are other challenges and what are available OPEN CV and JAVA CV method and algorithms to overcome that challenges?
Here is some example of the sketches and the photos that are known to match them:
This problem is called multi-modal face recognition. There has been a lot of interest in comparing a high quality mugshot (modality 1) to low quality surveillance images (modality 2), another is frontal images to profiles, or pictures to sketches like the OP is interested in. Partial Least Squares (PLS) and Tied Factor Analysis (TFA) have been used for this purpose.
A key difficulty is computing two linear projections from the image in modality 1 (and modality 2) to a space where two points being close means that the individual is the same. This is the key technical step. Here are some papers on this approach:
Abhishek Sharma, David W Jacobs : Bypassing Synthesis: PLS for
Face Recognition with Pose, Low-Resolution and Sketch. CVPR
2011.
S.J.D. Prince, J.H. Elder, J. Warrell, F.M. Felisberti, Tied Factor
Analysis for Face Recognition across Large Pose Differences, IEEE
Patt. Anal. Mach. Intell, 30(6), 970-984, 2008. Elder is a specialist in this area and has a variety of papers on the topic.
B. Klare, Z. Li and A. K. Jain, Matching forensic sketches to
mugshot photos, IEEE Pattern Analysis and Machine Intelligence, 29
Sept. 2010.
As you can understand this is an active research area/problem. In terms using OpenCV to overcome the difficulties, let me give you an analogy: you need to build build a house (match sketches to photos) and you're asking how will having a Stanley hammer (OpenCV) will help. Sure, it will probably help. But you'll also need a lot of other resources: wood, time/money, pipes, cable, etc.
I think that James Elder's old work on the completeness of the edge map (using reconstruction by solving the Laplace equation) is quite relevant here. See the results at the end of this paper: http://elderlab.yorku.ca/~elder/publications/journals/ElderIJCV99.pdf
You could give Eigenfaces a try, though i never tested them with sketches i think they could a least be a good starting point for your research.
See Wiki: http://en.wikipedia.org/wiki/Eigenface and the Tutorial for OpenCV: http://docs.opencv.org/modules/contrib/doc/facerec/facerec_tutorial.html (including not only Eigenfaces!)
OpenCV can be used for feature extraction and machine learning required for this task. I guess you can start with the papers in the answers above, start with some basic features and prototype a classifier with OpenCV.
I guess you might also want to detect and match feature points on the faces. If you use this approach, you will have to do the feature point detectors on your own (training the Viola-Jones detector in OpenCV with your own data is an option).

How does Content-Aware fill work?

In the upcoming version of Photoshop there is a feature called Content-Aware fill.
This feature will fill a selection of an image based on the surrounding image - to the point it can generate bushes and clouds while being seamless with the surrounding image.
See http://www.youtube.com/watch?v=NH0aEp1oDOI for a preview of the Photoshop feature I'm talking about.
My question is:
How does this feature work algorithmically?
I am a co-author of the PatchMatch paper previously mentioned here, and I led the development of the original Content-Aware Fill feature in Photoshop, along with Ivan Cavero Belaunde and Eli Shechtman in the Creative Technologies Lab, and Jeff Chien on the Photoshop team.
Photoshop's Content-Aware Fill uses a highly optimized, multithreaded variation of the algorithm described in the PatchMatch paper, and an older method called "SpaceTime Video Completion." Both papers are cited on the following technology page for this feature:
http://www.adobe.com/technology/projects/content-aware-fill.html
You can find out more about us on the Adobe Research web pages.
I'm guessing that for the smaller holes they are grabbing similarly textured patches surrounding the area to fill it in. This is described in a paper entitled "PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing" by Connelly Barnes and others in SIGGRAPH 2009. For larger holes they can exploit a large database of pictures with similar global statistics or texture, as describe in "Scene Completion Using Millions of Photographs". If they somehow could fused the two together I think it should work like in the video.
There is very similar algorithm for GIMP for a quite long time. It is called resynthesizer and probably you should be able to find a source for it (maybe at the project site)
EDIT
There is also source available at the ubuntu repository
And here you can see processing the same images with GIMP: http://www.youtube.com/watch?v=0AoobQQBeVc&feature=related
Well, they are not going to tell for the obvious reasons. The general name for the technique is "inpainting", you can look this up.
Specifically, if you look at what Criminisi did while in Microsoft http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.67.9407 and what Todor Georgiev does now at Adobe http://www.tgeorgiev.net/Inpainting.html, you'll be able to make a very good guess. A 90% guess, I'd say, which should be good enough.
I work on a similar problem. From what i read they use "PatchMatch" or "non-parametric patch sampling" in general.
PatchMatch: A Randomized Correspondence Algorithm
for Structural Image Editing
As a guess (and that's all that it would be) I'd expect that it does some frequency analysis (some like a Fourier transform) of the image. By looking only at the image at the edge of the selection and ignoring the middle, it could then extrapolate back into the middle. If the designers choose the correct color plains and what not, they should be able to generate a texture that seamlessly blends into the image at the edges.
edit: looking at the last example in the video; if you look at the top of the original image on either edge you see that the selection line runs right down a "gap" in the clouds and that right in the middle there is a "bump". These are the kind of artifacts I'd expect to see if my guess is correct. (OTOH, I'd also expect to see them is it was using some kind of sudo-mirroring across the selection boundary.)
The general approach is either content-aware fill or seam-carving. Ariel Shamir's group is responsible for the seminal work here, which was presented in SIGGRAPH 2007. See:
http://www.faculty.idc.ac.il/arik/site/subject-seam-carve.asp
Edit: Please see answer from the co-author of Content-Aware fill. I will be deleting this soon.

robust algorithm for surface reconstruction from 3D point cloud?

I am trying to figure out what algorithms there are to do surface reconstruction from 3D range data. At a first glance, it seems that the Ball pivoting algorithm (BPA) and Poisson surface reconstruction are the more established methods?
What are the established, more robust algorithm in the field other than BPA and Poisson surface reconstruction algorithm?
Recommended research publications?
Is there available source code?
I have been facing this dilemma for some months now, and made exhaustive research.
Algorithms
Mainly there are 2 categories of algorithms: computation geometry, and implicit surfaces.
Computation Geometry
They fit the mesh on the existing points.
Probably the most famous algorithm of this group is powercrust, because it is theoretically well-established - it guarantees watertight mesh.
Ball Pivoting is patented by IBM. Also, it is not suitable for pointclouds with varying point density.
Implicit functions
One fits implicit functions on the pointcloud, then uses a marching-cube-like algorithm to extract the zero-set of the function into a mesh.
Methods in this category differ mainly by the different implicit functions used.
Poisson, Hoppe's, and MPU are the most famous algorithms in this category. If you are new to the topic, i recommend to read Hoppe's thesis, it is very explanatory.
The algorithms of this category usually can be implemented so that they are able to process huge inputs very efficiently, and one can scale their quality<->speed trade-off. They are not disturbed by noise, varying point-density, holes. A disadvantage of them is that they require consistently oriented surface normals at the input points.
Implementations
You will find small number of free implementations. However it depends on whether You are going to integrate it into free software (in this case GPL license is acceptable for You) or into a commercial software (in this case You need a more liberal license). The latter is very rare.
One is in VTK. I suspect it to be difficult to integrate (no documentation is available for free), it has a strange, over-complicated architecture, and is not designed for high-performance applications. Also has some limitations for the allowed input pointclouds.
Take a look at this Poisson implementation, and after that share your experience about it with me please.
Also:
here are a few high-performance algorithms, with surface reconstruction among them.
CGAL is a famous 3d library, but it is free only for free projects.
Meshlab is a famous application with GPL.
Also (Added August 2013):
The library PCL has a module dedicated to surface reconstruction and is in active development (and is part of Google's Summer of Code). The surface module contains a number of different algorithms for reconstruction. PCL also has the ability to estimate surface normals, incase you do not have them provided with your point data, this functionality can be found in the features module. PCL is released under the terms of the BSD license and is open source software, it is free for commercial and research use.
If you want make some direct experiments with various surface reconstruction algorithms you should try MeshLab, the mesh-processing system, it is open source and it contains implementations of many of the previously cited surface reconstruction algorithms, like:
Poisson Surface Recon
a couple of MLS based approach,
a ball pivoting implementation
a variant of the Curless volume based approach
Delaunay based techniques (Alpha shapes and Voronoi filtering)
tools for computing normals from scattered point sets
and many other tools for comparing/measuring/cleaning/simplifying the resulting meshes.
Sources are protected by GPL, so you could not use them in a commercial closed source project, but it is very important to get the right feeling about the properties of the various surface reconstruction algorithms (how sensitive to noise they are, the speed, the robustness to outliers, how they preserve fine details etc etc) before starting to implement one of them.
You might start looking at some recent work in the field - currently something like Fast low-memory streaming MLS reconstruction of point-sampled surfaces by Gianmauro Cuccuru, Enrico Gobbetti, Fabio Marton, Renato Pajarola, and Ruggero Pintus. Its citations can get you going through the literature pretty quickly.
While not a mesh representation, an ex-colleague recommended me this link
to source code for a Thin Plate Spline method:
Link
Anyone tried it?
Not sure if it's exactly right for your case, since it seems weird that you omitted it, but marching cubes is commonly mentioned in cases like these.
As I had this problem too, I did develop and implement my own point cloud crust algorithm. The sources, as well as the documentation, can be found on github.com: https://github.com/meixxi/PointCloudCrust. The algorithm is implemented in Java.
Maybe, this can help you. You can find also a short python script on the page which illustrates how to use the library. Have fun!
Here on GitHub, is a open source Mesh Processing Library in C++ by Dr. Hugues Hoppe, in which the surface reconstruction program Recon is a good option for your problem...
There is 3D Delaunay tool by Geometric Tools. This tool is used DirecX and OpenGL. Unfortunately, you may need buy a book to see actual example code of the library. You still read the code and figure out.
Matlab also introduced a surface reconstruction tool using Delaunay, delaunayTriangulation class.
You might be interested in Alpha Shapes.

Dilemma about image cropping algorithm - is it possible?

I am building a web application using .NET 3.5 (ASP.NET, SQL Server, C#, WCF, WF, etc) and I have run into a major design dilemma. This is a uni project btw, but it is 100% up to me what I develop.
I need to design a system whereby I can take an image and automatically crop a certain object within it, without user input. So for example, cut out the car in a picture of a road. I've given this a lot of thought, and I can't see any feasible method. I guess this thread is to discuss the issues and feasibility of achieving this goal. Eventually, I would get the dimensions of a car (or whatever it may be), and then pass this into a 3d modelling app (custom) as parameters, to render a 3d model. This last step is a lot more feasible. It's the cropping issue which is an issue. I have thought of all sorts of ideas, like getting the colour of the car and then the outline around that colour. So if the car (example) is yellow, when there is a yellow pixel in the image, trace around it. But this would fail if there are two yellow cars in a photo.
Ideally, I would like the system to be completely automated. But I guess I can't have everything my way. Also, my skills are in what I mentioned above (.NET 3.5, SQL Server, AJAX, web design) as opposed to C++ but I would be open to any solution just to see the feasibility.
I also found this patent: US Patent 7034848 - System and method for automatically cropping graphical images
Thanks
This is one of the problems that needed to be solved to finish the DARPA Grand Challenge. Google video has a great presentation by the project lead from the winning team, where he talks about how they went about their solution, and how some of the other teams approached it. The relevant portion starts around 19:30 of the video, but it's a great talk, and the whole thing is worth a watch. Hopefully it gives you a good starting point for solving your problem.
What you are talking about is an open research problem, or even several research problems. One way to tackle this, is by image segmentation. If you can safely assume that there is one object of interest in the image, you can try a figure-ground segmentation algorithm. There are many such algorithms, and none of them are perfect. They usually output a segmentation mask: a binary image where the figure is white and the background is black. You would then find the bounding box of the figure, and use it to crop. The thing to remember is that none of the existing segmentation algorithm will give you what you want 100% of the time.
Alternatively, if you know ahead of time what specific type of object you need to crop (car, person, motorcycle), then you can try an object detection algorithm. Once again, there are many, and none of them are perfect either. On the other hand, some of them may work better than segmentation if your object of interest is on very cluttered background.
To summarize, if you wish to pursue this, you would have to read a fair number of computer vision papers, and try a fair number of different algorithms. You will also increase your chances of success if you constrain your problem domain as much as possible: for example restrict yourself to a small number of object categories, assume there is only one object of interest in an image, or restrict yourself to a certain type of scenes (nature, sea, etc.). Also keep in mind, that even the accuracy of state-of-the-art approaches to solving this type of problems has a lot of room for improvement.
And by the way, the choice of language or platform for this project is by far the least difficult part.
A method often used for face detection in images is through the use of a Haar classifier cascade. A classifier cascade can be trained to detect any objects, not just faces, but the ability of the classifier is highly dependent on the quality of the training data.
This paper by Viola and Jones explains how it works and how it can be optimised.
Although it is C++ you might want to take a look at the image processing libraries provided by the OpenCV project which include code to both train and use Haar cascades. You will need a set of car and non-car images to train a system!
Some of the best attempts I've see of this is using a large database of images to help understand the image you have. These days you have flickr, which is not only a giant corpus of images, but it's also tagged with meta-information about what the image is.
Some projects that do this are documented here:
http://blogs.zdnet.com/emergingtech/?p=629
Start with analyzing the images yourself. That way you can formulate the criteria on which to match the car. And you get to define what you cannot match.
If all cars have the same background, for example, it need not be that complex. But your example states a car on a street. There may be parked cars. Should they be recognized?
If you have access to MatLab, you could test your pattern recognition filters with specialized software like PRTools.
Wwhen I was studying (a long time ago:) I used Khoros Cantata and found that an edge filter can simplify the image greatly.
But again, first define the conditions on the input. If you don't do that you will not succeed because pattern recognition is really hard (think about how long it took to crack captcha's)
I did say photo, so this could be a black car with a black background. I did think of specifying the colour of the object, and then when that colour is found, trace around it (high level explanation). But, with a black object in a black background (no constrast in other words), it would be a very difficult task.
Better still, I've come across several sites with 3d models of cars. I could always use this, stick it into a 3d model, and render it.
A 3D model would be easier to work with, a real world photo much harder. It does suck :(
If I'm reading this right... This is where AI shines.
I think the "simplest" solution would be to use a neural-network based image recognition algorithm. Unless you know that the car will look the exact same in each picture, then that's pretty much the only way.
If it IS the exact same, then you can just search for the pixel pattern, and get the bounding rectangle, and just set the image border to the inner boundary of the rectangle.
I think that you will never get good results without a real user telling the program what to do. Think of it this way: how should your program decide when there is more than 1 interesting object present (for example: 2 cars)? what if the object you want is actually the mountain in the background? what if nothing of interest is inside the picture, thus nothing to select as the object to crop out? etc, etc...
With that said, if you can make assumptions like: only 1 object will be present, then you can have a go with using image recognition algorithms.
Now that I think of it. I recently got a lecture about artificial intelligence in robots and in robotic research techniques. Their research went on about language interaction, evolution, and language recognition. But in order to do that they also needed some simple image recognition algorithms to process the perceived environment. One of the tricks they used was to make a 3D plot of the image where x and y where the normal x and y axis and the z axis was the brightness of that particular point, then they used the same technique for red-green values, and blue-yellow. And lo and behold they had something (relatively) easy they could use to pick out the objects from the perceived environment.
(I'm terribly sorry, but I can't find a link to the nice charts they had that showed how it all worked).
Anyway, the point is that they were not interested (that much) in image recognition so they created something that worked good enough and used something less advanced and thus less time consuming, so it is possible to create something simple for this complex task.
Also any good image editing program has some kind of magic wand that will select, with the right amount of tweaking, the object of interest you point it on, maybe it's worth your time to look into that as well.
So, it basically will mean that you:
have to make some assumptions, otherwise it will fail terribly
will probably best be served with techniques from AI, and more specifically image recognition
can take a look at paint.NET and their algorithm for their magic wand
try to use the fact that a good photo will have the object of interest somewhere in the middle of the image
.. but i'm not saying that this is the solution for your problem, maybe something simpler can be used.
Oh, and I will continue to look for those links, they hold some really valuable information about this topic, but I can't promise anything.

Resources