Obtaining a HOG feature vector for implementation in SVM in Python - image

I am new to scikit-learn. I have viewed the online tutorials, but they all seem to use existing datasets (e.g., digits, iris). I need information on how to process my own images so that they can be used by scikit-learn.
Details of my study: I have a webcam set up outside my office. It captures all of the traffic on my street that passes through the field of view. I have cropped several hundred images of sedans, trucks and SUVs. The goal is to predict which of these categories a vehicle belongs to. I have applied Histogram of Oriented Gradients (HOG) to these images, and I have attached the results so you can see the differences between the categories. This blog will not allow me to post any images, but you can see them here: https://stats.stackexchange.com/questions/149421/obtaining-a-hog-feature-vector-for-implementation-in-svm-in-python. I posted the same question on that site but got no response. This post is the closest answer I have found: Resize HOG feature for Scikit-Learn classifier
I wish to train an SVM classifier on these images. I understand that scikit-image has functions that prepare HOG features for use in scikit-learn. Can someone help me understand this process? I am also grateful for any thoughts, based on your experience, on the probability of success of this classification study. I also understand that I need to train the model using negative images (ones with no vehicles). How is this done?
I know I am asking a lot but I am surprised no one that I am aware of has done a tutorial on these early steps. It seems like a fairly elementary study.
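As a rough illustration of the process asked about above: each image is reduced to a fixed-length vector, the vectors are stacked into a matrix X, and a label list y marks vehicle crops as 1 and negative (no vehicle) crops as 0. The sketch below is pure Python with a deliberately simplified HOG-like descriptor (per-cell orientation histograms, no block normalization); the helper names and the tiny synthetic images are assumptions for illustration only. In practice `skimage.feature.hog` would produce the feature vector and a scikit-learn classifier such as `sklearn.svm.SVC` would be fit on X and y, provided every crop is resized to the same shape first so that all vectors have the same length.

```python
import math

def hog_like_descriptor(image, cell=4, bins=9):
    """Simplified HOG: per-cell histograms of unsigned gradient orientation.

    `image` is a list of rows of grayscale values. Unlike a full HOG
    implementation, no block normalization is performed.
    """
    h, w = len(image), len(image[0])
    features = []
    for cy in range(0, h - cell + 1, cell):
        for cx in range(0, w - cell + 1, cell):
            hist = [0.0] * bins
            for y in range(cy, cy + cell):
                for x in range(cx, cx + cell):
                    # Central differences, clamped at the image border.
                    gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
                    gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
                    magnitude = math.hypot(gx, gy)
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    hist[int(angle / (180.0 / bins)) % bins] += magnitude
            features.extend(hist)
    # L2-normalize the whole vector so overall brightness scale matters less.
    norm = math.sqrt(sum(v * v for v in features)) or 1.0
    return [v / norm for v in features]

def make_edge_image(size=8):
    # Hypothetical stand-in for a vehicle crop: dark left half, bright right half.
    return [[0 if x < size // 2 else 255 for x in range(size)] for _ in range(size)]

def make_flat_image(size=8):
    # Hypothetical stand-in for a negative crop (no vehicle): uniform gray.
    return [[128] * size for _ in range(size)]

positives = [make_edge_image()]
negatives = [make_flat_image()]

# One row per image, one column per feature; labels line up with the rows.
X = [hog_like_descriptor(img) for img in positives + negatives]
y = [1] * len(positives) + [0] * len(negatives)
```

Negative images go through exactly the same feature extraction as the vehicle crops; the only difference is the label they receive in y.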

Related

How to programmatically distinguish the professional photo from the amateur photo?

What are the different options and solutions (software) that can help distinguish a professional (good) photo from an amateur (bad) one?
The criteria could be contrast, sharpness, noise, presence of compression artifacts, etc. The question is: what tools allow a machine, not a person, to determine this? Do you think all of these criteria can be represented as mathematical models?
Or, in other words: "feed" a tool 1000 high-quality photos and 1000 substandard ones, and have the machine itself identify the factors that distinguish the good images from the bad.
This is quite a vague definition of the problem. The only thing you have is 1000 high-quality photos and 1000 substandard photos. Your application, however, is quite specific, and I doubt (but I'm not sure) that you will find ready-made software for it.
Without looking at your images and running some tests, it is also difficult to say whether contrast/gamma would be enough to classify them properly.
What you can do, if you know a bit of coding in MATLAB/Python/C, is use some existing libraries to try to solve your problem. I can't help you with the details, as this is quite tedious work in itself, but I can give you some pointers.
To define your problem you will need:
Input: 1000 pro images, 1000 std images
You can represent this as 2000 images and a binary label vector of length 2000 (1 for pro, 0 for std)
Features:
Images themselves might not give you enough information. What you can do is extract features from the images. This step is called feature extraction and is an open research field in computer vision. There are several feature extractors out there; you can try a couple of the most widely used ones, such as HOG or SIFT (have a look here for examples).
These feature extractors will give you a 1xM numerical vector for each image. With N images, you have an NxM matrix in which each row is one image's descriptor.
Classification:
Once you have managed to extract features from the images, so that you have X = NxM data and y = binary label vector, you can use any machine learning algorithm, such as Deep Neural Networks, Random Forests, Support Vector Machines, or any other, to train a model on your data and classify new images later.
By putting everything together, you might be able to get decent results.
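To make the X and y idea concrete, here is a minimal pure-Python sketch. Note the assumptions: the two features (intensity standard deviation as "contrast", mean horizontal difference as "sharpness") are crude stand-ins for HOG or SIFT, the classifier is a simple nearest-centroid rule rather than any of the algorithms named above, and the tiny 4x4 images are made up for illustration.

```python
import math

def contrast(image):
    """Standard deviation of pixel intensities (a crude contrast measure)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))

def sharpness(image):
    """Mean absolute horizontal difference (a crude sharpness measure)."""
    diffs = [abs(row[x + 1] - row[x]) for row in image for x in range(len(row) - 1)]
    return sum(diffs) / len(diffs)

def extract_features(image):
    # 1 x M feature vector for one image (here M = 2).
    return [contrast(image), sharpness(image)]

def fit_centroids(X, y):
    """Average the feature vectors of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))

# Hypothetical 4x4 grayscale stand-ins: crisp high-contrast "pro" vs flat "std".
pro_images = [[[0, 255, 0, 255]] * 4, [[10, 240, 10, 240]] * 4]
std_images = [[[100, 110, 120, 130]] * 4, [[120, 125, 130, 135]] * 4]

X = [extract_features(img) for img in pro_images + std_images]  # N x M matrix
y = [1] * len(pro_images) + [0] * len(std_images)               # 1 = pro, 0 = std
centroids = fit_centroids(X, y)

new_photo = [[5, 250, 5, 250]] * 4
label = predict(centroids, extract_features(new_photo))
```

With real photos the features would need scaling and the model would need far more data, but the shape of the pipeline (extract, stack, fit, predict) stays the same.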
Is this professional vs amateur photographer classification, or equipment classification? I mean, like distinguishing between a DSLR photo and a cell phone photo. Or is this distinguishing between an amateur with a DSLR and a professional with the same equipment? Or are we talking about Photoshop editing at the end?
In the case of equipment, I think features to look at would be noise, contrast, color gamut. In the case of skill of the photographer, you will probably have to look at features based on edge representations, natural scene metrics etc.
But, you will need to create a data matrix and then run a machine learning classification algorithm on it and hopefully find some features.
Are you making or looking for art for humans or for robots?
The definition of a great photo is inherently subjective. What is junk to one person may be genius to another. Trying to take that and turn it into an equation is asking for AI to take over humankind.
I don't mean to be harsh in my assessment. My degree is in Fine Art, Painting. I put a lot of time and effort into thinking about images. It's a wonder to me how a child might make an image that seemed to be thoughtless but was a breakthrough in my perception. Conversely, you can work for hours, days, months or even years and then feel like the result just doesn't measure up.
Photography is an ingenious invention, therefore all photography is a source for amazement.
I do agree with you, however. Some photos are truly impressive. I think that if we approach this as coders, then what we are seeking is 'likes' or 'page views' or some other method of getting counts from many people. I know that is not the answer you were looking for, but I don't think you can find a better one. I wish you well on your quest.
If you want to judge a photo on technicalities then the edge of the physics should be your target. Currently that would be mirrorless cameras and 3d imaging.

How to compare two images and check whether both contain the same object in OpenCV Python or JavaCV

I am working on a feature matching project and I am using OpenCV Python as the tool for developing the application.
According to the project requirements, my database has images of some objects, like a glass, a ball, etc., with their descriptions. A user can send an image to the back end of the application, and the back end is responsible for matching the sent image against the images that exist in the database and sending the image description back to the user.
I have done some research on the above scenario. Unfortunately, I still could not find an algorithm for matching two images and deciding whether they match or not.
If anybody has that kind of algorithm, please send it to me. (I have to use OpenCV Python or JavaCV.)
Thank you
This is a very common problem in Computer Vision nowadays. A simple solution is really simple. But there are many, many variants for more sophisticated solutions.
Simple Solution
Feature Detector and Descriptor based.
The idea here is that you get a bunch of keypoints and their descriptors (search for SIFT/SURF/ORB). You can then find matches easily with tools provided in OpenCV. You would match the keypoints in your query image against all keypoints in the training dataset. Because of typical outliers, you will want to add a robust matching technique, like RANSAC. All of this is part of OpenCV.
Bag-of-Words model
If you just want the image that is most similar to your query image, you can use Nearest-Neighbour search. Be aware that OpenCV comes with the much faster Approximate Nearest-Neighbour (ANN) algorithm. Or you can use the BruteForceMatcher.
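As a sketch of what brute-force descriptor matching does under the hood, the pure-Python example below matches each query descriptor to its nearest database descriptor and keeps only unambiguous matches. Two things here are assumptions and not part of the answer above: the ratio test used to reject ambiguous matches (a common outlier filter, not the RANSAC step mentioned above, which additionally checks geometric consistency) and the made-up 2-D descriptors (real SIFT descriptors have 128 dimensions).

```python
def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def match_descriptors(query, train, ratio=0.75):
    """Brute-force nearest-neighbour matching with a ratio test.

    Returns (query_index, train_index) pairs whose best match is clearly
    better than the second-best one.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted(range(len(train)), key=lambda ti: squared_distance(q, train[ti]))
        if len(ranked) < 2:
            continue
        best, second = ranked[0], ranked[1]
        d1 = squared_distance(q, train[best])
        d2 = squared_distance(q, train[second])
        # Distances are squared, so the ratio threshold is squared as well.
        if d1 < (ratio ** 2) * d2:
            matches.append((qi, best))
    return matches

def best_database_image(query, database):
    """Pick the database entry with the most surviving matches."""
    scores = {name: len(match_descriptors(query, descs)) for name, descs in database.items()}
    return max(scores, key=scores.get), scores

# Hypothetical descriptors for two database objects and one query image.
database = {
    "glass": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
    "ball": [(50.0, 50.0), (60.0, 50.0), (50.0, 60.0)],
}
query = [(0.5, 0.2), (9.8, 0.1), (0.1, 10.3)]

best, scores = best_database_image(query, database)
```

The description stored with the winning database entry is what would be sent back to the user; a minimum match count would be needed in practice to reject queries that match nothing well.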
Advanced Solution
If you have many images (many==1 Million), you can look at Locality-Sensitive-Hashing (see Dean et al, 100,000 Object Categories).
If you do use Bag-of-Visual-Words, then you should probably build an Inverted Index.
Have a look at Fisher Vectors for improved accuracy as compared to BOW.
Suggestion
Start by using Bag-Of-Visual-Words. There are tutorials on how to train the dictionary for this model.
Training:
Extract Local features (just pick SIFT, you can easily change this as OpenCV is very modular) from a subset of your training images. First detect features and then extract them. There are many tutorials on the web about this.
Train Dictionary. Helpful documentation with a reference to a sample implementation in Python (opencv_source_code/samples/python2/find_obj.py)!
Compute Histogram for each training image. (Also in the BOW documentation from previous step)
Put your image descriptors from the step above into a FLANN-based matcher.
Querying:
Compute features on your query image.
Use the dictionary from training to build a BOW histogram for your query image.
Use that feature to find the nearest neighbor(s).
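The training and querying steps above can be sketched in pure Python as follows. This is only an illustration under stated assumptions: the dictionary is trained with a plain k-means seeded from the first k descriptors, the nearest-neighbour search is a linear scan instead of a FLANN index, and the 2-D "descriptors" are made up (real SIFT descriptors have 128 dimensions).

```python
def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))

def train_dictionary(descriptors, k, iterations=10):
    """Plain k-means; the first k descriptors seed the centers (an assumption)."""
    centers = [list(d) for d in descriptors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for d in descriptors:
            groups[nearest(d, centers)].append(d)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster went empty
                centers[i] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def bow_histogram(descriptors, centers):
    """Normalized count of descriptors assigned to each visual word."""
    hist = [0.0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]

# Training: hypothetical local descriptors for each training image.
training = {
    "glass": [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1)],
    "ball": [(5.1, 4.9), (4.9, 5.0), (5.2, 5.2), (0.1, 0.0)],
}
all_descriptors = [d for descs in training.values() for d in descs]
dictionary = train_dictionary(all_descriptors, k=2)
index = {name: bow_histogram(descs, dictionary) for name, descs in training.items()}

def query(descriptors):
    """Querying: build the BOW histogram and return the nearest training image."""
    hist = bow_histogram(descriptors, dictionary)
    return min(index, key=lambda name: sq_dist(index[name], hist))
```

For example, `query([(0.1, 0.1), (0.0, 0.2), (5.0, 5.0)])` looks up the training image whose word histogram is closest to that of the query descriptors.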
I think you are talking about Content-Based Image Retrieval.
There are many research papers available on the Internet. Pick one and implement whichever approach best fits your needs. Choose the criteria according to your application, such as texture-based, color-based or shape-based image retrieval (this is best for speed when you are working with image retrieval on the Internet).
Since you need a Python implementation, I would suggest going through Chapters 7 and 8 of the book Computer Vision Book. They contain working examples, with code, of what you are looking for.
One question you may find useful: Are there any API's that'll let me search by image?

Find people in an image

I need to implement an algorithm that takes a picture (JPEG) as input and creates a new picture as output containing only the bodies (the background is removed completely). The input is a vacation picture with people in it, and I need to recognize the human bodies and remove the background. Can someone suggest which algorithm to use and which book to buy to learn such algorithms?
Check this link; it will answer your question about removing the background and performing further processing.
Neural networks are particularly useful for this kind of task, but the theory is a universe, and if you're doing it from scratch that's a lot of work.
This is a segmentation problem. In the general case, segmenting images is a hard research problem (I just spent five years doing a doctorate on segmenting greyscale medical images, for example) and the way you go about it is strongly tied to the type of images with which you have to deal. The best advice I can give is to go and read the appropriate literature on segmenting colour images (e.g. use Google Scholar). In terms of books, this one's a good general-purpose introduction to image processing:
http://www.amazon.co.uk/Digital-Image-Processing-Rafael-Gonzalez/dp/0130946508/ref=sr_1_7?ie=UTF8&qid=1326236038&sr=8-7
Searching for "segmenting people in colour images" on Google seems to turn up some good links, incidentally.
I have a question for you: you want to implement this using an algorithm? If so, then it might require a lot of things to be done (provided you are new to the field of image processing).
Otherwise you may try using masking techniques in image editing software like Adobe Photoshop (that would hardly take 15 mins, depending upon how well you know it)
A good book to start with image processing techniques is: "Digital Image Processing" by Gonzalez and Woods; it starts from the basics, and explains stuff in depth.
Still, it may take a lot of time to develop an algorithm to do this job, so I recommend you use a library for it. OpenCV (Open Source Computer Vision) is an excellent choice. The library itself comes with demos, which include programs for face detection etc. The built-in functions provide a variety of features (edge detection, feature identification and extraction; you may have to use these). Here's the link:
http://opencv.willowgarage.com/wiki/
The link provides a lot of reference material that you can make use of! :)
Start with facial recognition software and algorithms; they have been the most refined over the years and as long as all of your bodies have heads, you can use exif data to figure image capture orientation (of course you can't completely rely on that), sample the facial skin to get skin tone ranges, and find the attached body. Anything that is not head and body should be deleted. This process assumes that a person has roughly the same skin tone on their face as their body and the camera flash isn't washing this out. You could grab the flash duration and some other attributes from exif and adjust your ranges accordingly.
A lot of software out there can recognize faces (look at iPhoto for example), so you'll have to use the face as a reference point, along with skin tone, to find your body edges. Your result isn't going to be perfect, but as long as your approach is sound, you'll end up with something useful.
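The skin-tone sampling step could look roughly like the pure-Python sketch below. It assumes a face detector has already returned a bounding box (the `face_box` value here is made up), samples the colour range inside that box, and keeps only pixels that fall within that range. It covers only the skin-range idea: clothing, lighting changes and finding the "attached body" would all need more work.

```python
def skin_range(image, face_box, margin=20):
    """Per-channel (min, max) of the pixels inside the face box, widened by a margin."""
    top, left, bottom, right = face_box
    samples = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    ranges = []
    for channel in range(3):
        values = [pixel[channel] for pixel in samples]
        ranges.append((min(values) - margin, max(values) + margin))
    return ranges

def skin_mask(image, ranges):
    """True where every channel of a pixel falls inside the sampled skin range."""
    return [
        [all(lo <= pixel[c] <= hi for c, (lo, hi) in enumerate(ranges)) for pixel in row]
        for row in image
    ]

def remove_background(image, mask, fill=(0, 0, 0)):
    """Replace every pixel outside the mask with the fill colour."""
    return [
        [pixel if keep else fill for pixel, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]

# Tiny hypothetical RGB image: two skin shades on a blue background.
SKIN, SKIN2, BG = (200, 150, 120), (210, 160, 130), (30, 90, 200)
image = [
    [BG, SKIN, BG],
    [SKIN2, SKIN, SKIN2],
    [BG, SKIN, BG],
]
face_box = (0, 1, 1, 2)  # assumed detector output: top, left, bottom, right

mask = skin_mask(image, skin_range(image, face_box))
result = remove_background(image, mask)
```

The `margin` parameter plays the role of the range adjustment mentioned above: widening it tolerates more lighting variation at the cost of letting in more background.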
And release your software as open source when you're done so I can use it... :)
You can download a free PDF of the book Computer Vision by Richard Szeliski from the author's website. Not only do you have a free book on algorithms, but it's a book that addresses this specific problem.
http://szeliski.org/Book/
Used copies of the hardcover are available for about $62 if you check addall.com. If you spend some time doing image processing, you'll appreciate having a paper copy of at least one good general reference book.
It's tough but not impossible. I can't give you any code, but Peter Norvig gave a great talk on the value of data, in which he shows how he was able to take a picture of a lake, remove all the houses blocking the view, and have the lake extended with boats, etc.
The computer basically learned what lakes look like and that boats go on lakes, then removed the houses and put lake and boats in their place. He explains his process (but gives no code or anything).
Here it is:
Peter Norvig - The Unreasonable Effectiveness of Data
http://www.youtube.com/watch?v=yvDCzhbjYWs

OpenCV: Fingerprint Image and Compare Against Database

I have a database of images. When I take a new picture, I want to compare it against the images in this database and receive a similarity score (using OpenCV). This way I want to detect whether the database already contains an image that is very similar to the fresh picture.
Is it possible to create a fingerprint/hash of my database images and match new ones against it?
I'm looking for an algorithm, code snippet or technical demo, not a commercial solution.
Best,
Stefan
As Paul R has commented, this "fingerprint/hash" is usually a set of feature vectors or a set of feature descriptors. But most feature vectors used in computer vision are too computationally expensive for searching against a database. So this task needs a special kind of feature descriptor, because descriptors such as SURF and SIFT will take too much time to search even with various optimizations.
The only thing that OpenCV has for your task (object categorization) is an implementation of Bag of Visual Words (BOW).
It can compute a special kind of image feature and train a vocabulary of visual words. Next you can use this vocabulary to find similar images in your database and compute a similarity score.
Here is OpenCV documentation for bag of words. Also OpenCV has a sample named bagofwords_classification.cpp. It is really big but might be helpful.
Content-based image retrieval systems are still a field of active research: http://citeseerx.ist.psu.edu/search?q=content-based+image+retrieval
First you have to be clear about what constitutes "similar" in your context:
Similar color distribution: Use something like color descriptors for subdivisions of the image, and you should get some fairly satisfying results.
Similar objects: Since the computer does not know what an object is, you will not get very far unless you have some extensive domain knowledge about the object (or a few object classes). A good overview of the current state of research can be seen here (results) and soon here.
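For the "similar color distribution" case, a minimal pure-Python sketch might look like this: the image is split into a grid, a coarse RGB histogram is computed per cell, and two images are compared by histogram intersection. The grid size, bin count and the intersection measure are assumptions chosen for illustration, not something the answer prescribes.

```python
def region_histogram(pixels, bins_per_channel=2):
    """Normalized coarse RGB histogram of a list of (r, g, b) pixels."""
    n = bins_per_channel
    hist = [0.0] * n ** 3
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..n-1.
        index = (r * n // 256) * n * n + (g * n // 256) * n + (b * n // 256)
        hist[index] += 1.0
    return [h / len(pixels) for h in hist]

def color_descriptor(image, grid=2):
    """Concatenate one histogram per grid cell so the layout is partly preserved."""
    h, w = len(image), len(image[0])
    descriptor = []
    for gy in range(grid):
        for gx in range(grid):
            cell = [image[y][x]
                    for y in range(gy * h // grid, (gy + 1) * h // grid)
                    for x in range(gx * w // grid, (gx + 1) * w // grid)]
            descriptor.extend(region_histogram(cell))
    return descriptor

def similarity(desc_a, desc_b, grid=2):
    """Histogram intersection averaged over cells; 1.0 means identical distributions."""
    return sum(min(a, b) for a, b in zip(desc_a, desc_b)) / (grid * grid)

# Tiny hypothetical images: red over blue versus all blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image_a = [[RED, RED], [BLUE, BLUE]]
image_b = [[BLUE, BLUE], [BLUE, BLUE]]

score_same = similarity(color_descriptor(image_a), color_descriptor(image_a))
score_diff = similarity(color_descriptor(image_a), color_descriptor(image_b))
```

Because the descriptor has a fixed length, it can be stored per database image and compared against a new picture without reopening the stored images.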
There is no "serve all needs"-algorithm for the problem you described. The more you can share about the specifics of your problem, the better answers you might get. Posting some representative images (if possible) and describing the desired outcome is also very helpful.
This would be a good question for computer-vision.stackexchange.com, if it already existed.
You can use the pHash algorithm and store the pHash value of each image in the database, then use this code:
double const mismatch = algo->compare(image1Hash, image2Hash);
Here the 'mismatch' value can easily tell you how similar the two images are.
pHash functions:
AverageHash
PHash
MarrHildrethHash
RadialVarianceHash
BlockMeanHash
ColorMomentHash
These functions are good enough to evaluate image similarity in most respects.
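To show the idea behind the simplest of these, here is a pure-Python sketch of an average hash with a Hamming-distance comparison. It is an illustration of the technique only, not OpenCV's implementation: it assumes a grayscale image stored as a list of rows whose height and width are multiples of the hash size, and the test images are synthetic.

```python
def average_hash(image, hash_size=8):
    """Average hash: shrink to hash_size x hash_size by block means, threshold at the mean.

    Assumes a grayscale image whose height and width are multiples of hash_size.
    """
    h, w = len(image), len(image[0])
    bh, bw = h // hash_size, w // hash_size
    small = []
    for by in range(hash_size):
        for bx in range(hash_size):
            block = [image[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            small.append(sum(block) / len(block))
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        # One bit per block: 1 if the block is brighter than the overall mean.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(hash_a, hash_b):
    """Number of differing bits; 0 means the hashes are identical."""
    return bin(hash_a ^ hash_b).count("1")

# Synthetic 16x16 test images: a horizontal gradient, a brighter copy, an inverted copy.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[value + 10 for value in row] for row in original]
inverted = [[255 - value for value in row] for row in original]

distance_brighter = hamming_distance(average_hash(original), average_hash(brighter))
distance_inverted = hamming_distance(average_hash(original), average_hash(inverted))
```

The 64-bit integer is what would be stored in the database; a small Hamming distance to a stored hash flags a likely near-duplicate.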

What are techniques and practices on measuring data quality?

If I have a large set of data that describes physical 'things', how could I go about measuring how well that data fits the 'things' that it is supposed to represent?
An example would be if I have a crate holding 12 widgets, and I know each widget weighs 1 lb, there should be some data quality 'check' making sure the crate weighs 13 lbs, maybe.
Another example would be that if I have a lamp and an image representing that lamp, it should look like a lamp. Perhaps the image dimensions should have the same ratio of the lamp dimensions.
With the exception of images, my data is 99% text (which includes height, width, color...).
I've studied AI in school, but have done very little outside of that.
Are standard AI techniques the way to go? If so, how do I map a problem to an algorithm?
Are some languages easier at this than others? Do they have better libraries?
thanks.
Your question is somewhat open-ended, but it sounds like what you want is what is known as a "classifier" in the field of machine learning.
In general, a classifier takes a piece of input and "classifies" it, i.e., determines a category for the object. Many classifiers provide a probability with this determination, and some may even return multiple categories with probabilities on each.
Some examples of classifiers are Bayes nets, neural nets, decision lists, and decision trees. Bayes nets are often used for spam classification. Emails are classified as either "spam" or "not spam" with a probability.
For your question you'd want to classify your objects as "high quality" or "not high quality".
The first thing you'll need is a bunch of training data. That is, a set of objects where you already know the correct classification. One way to obtain this could be to get a bunch of objects and classify them by hand. If there are too many objects for one person to classify you could feed them to Mechanical Turk.
Once you have your training data you'd then build your classifier. You'll need to figure out what attributes are important to your classification. You'll probably need to do some experimentation to see what works well. You then have your classifier learn from your training data.
One approach that's often used for testing is to split your training data into two sets. Train your classifier using one of the subsets, and then see how well it classifies the other (usually smaller) subset.
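A minimal pure-Python sketch of that split-and-test idea follows. Everything specific in it is an assumption for illustration: the records are just crate weights, the labels are hand-assigned, and the "classifier" is a trivial learned tolerance around 13 lbs.

```python
import random

def train_test_split(samples, labels, test_fraction=0.25, seed=0):
    """Shuffle indices once, then hold out the last part as the test set."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_fraction))
    train_idx, test_idx = indices[:cut], indices[cut:]
    return ([samples[i] for i in train_idx], [samples[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])

def learn_tolerance(samples, labels, target=13.0):
    """'Train' by taking the largest deviation seen among the good records."""
    return max(abs(s - target) for s, lab in zip(samples, labels) if lab == "good")

def accuracy(predict, samples, labels):
    """Fraction of held-out samples the classifier labels correctly."""
    correct = sum(1 for s, lab in zip(samples, labels) if predict(s) == lab)
    return correct / len(labels)

# Hypothetical hand-labelled crate weights in lbs.
weights = [13.0, 13.1, 12.8, 13.3, 11.0, 15.0, 9.5, 16.2]
labels = ["good"] * 4 + ["bad"] * 4

X_train, X_test, y_train, y_test = train_test_split(weights, labels)
tolerance = learn_tolerance(X_train, y_train)

def classify(weight):
    return "good" if abs(weight - 13.0) <= tolerance else "bad"

score = accuracy(classify, X_test, y_test)
```

The score on the held-out subset is the number to watch; accuracy on the training subset alone says little about how the classifier handles records it has not seen.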
AI is one path, natural intelligence is another.
Your challenge is a perfect match to Amazon's Mechanical Turk. Divvy your data space up into extremely small verifiable atoms and assign them as HITs on Mechanical Turk. Have some overlap to give yourself a sense of HIT answer consistency.
There was a shop with a boatload of component CAD drawings that needed to be grouped by similarity. They broke it up and set it loose on Mechanical Turk to very satisfying results. I could google for hours and not find that link again.
See here for a related forum post.
This is a tough question to answer. For example, what defines a lamp? I could do a Google Images search and find pictures of some crazy-looking lamps. Or even look up the definition of a lamp (http://dictionary.reference.com/dic?q=lamp). There are no physical requirements for what a lamp must look like. That's the crux of the AI problem.
As for data, you could set up unit testing on the project to ensure that 12 widget() weigh less than 13 lbs in the widgetBox(). Regardless, you need to have the data at hand to be able to test things like that.
I hope I was able to answer your question somewhat. It's a bit vague, and my answers are broad, but hopefully it'll at least send you in a good direction.
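A rule-based check of the kind described could be sketched as below. The record fields (`widget_count`, `widget_weight_lb`, `empty_crate_lb`, `gross_weight_lb`) and the tolerances are hypothetical names chosen for this example; the 1 lb empty crate is an assumption taken from the 13 lb figure in the question.

```python
def check_crate(record, tolerance=0.01):
    """Return a list of problems found in one crate record (empty means it passed)."""
    problems = []
    expected = (record["widget_count"] * record["widget_weight_lb"]
                + record["empty_crate_lb"])
    if abs(record["gross_weight_lb"] - expected) > tolerance:
        problems.append(
            f"gross weight {record['gross_weight_lb']} lb, expected {expected} lb"
        )
    return problems

def aspect_ratio_matches(item_w, item_h, image_w, image_h, tolerance=0.05):
    """True if the image's width/height ratio is within tolerance of the item's."""
    item_ratio = item_w / item_h
    return abs(item_ratio - image_w / image_h) <= tolerance * item_ratio

# Hypothetical records: 12 widgets at 1 lb each in a 1 lb crate.
good_crate = {"widget_count": 12, "widget_weight_lb": 1.0,
              "empty_crate_lb": 1.0, "gross_weight_lb": 13.0}
bad_crate = dict(good_crate, gross_weight_lb=14.5)

good_problems = check_crate(good_crate)
bad_problems = check_crate(bad_crate)
lamp_ok = aspect_ratio_matches(10, 20, 500, 1000)   # lamp 10x20 in, image 500x1000 px
lamp_bad = aspect_ratio_matches(10, 20, 1000, 500)  # image is rotated or wrong
```

Checks like these can run over every record in the data set, with the returned problem lists collected into a report for manual review.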

Resources