Histogram-based feature extraction in an image

I have an image which is subdivided into twelve ROIs. Each ROI is further divided into 10*10 pixel blocks. I now want to compute features such as local contrast, minimum brightness, sharpness, hue and saturation in each of these blocks, and plot 5 histograms (one per feature) for each ROI. Finally I want to combine all histograms into one extended descriptor vector and use it for classification. Can someone please help me with a step-wise approach?
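A minimal MATLAB sketch of such a pipeline for one ROI, assuming the ROI height and width are multiples of 10, that "local contrast" is taken as the max-min intensity range of a block, and that "sharpness" is approximated by the mean gradient magnitude (all variable names are illustrative):

%roi is one RGB region of interest
hsvRoi  = rgb2hsv(roi);                       %hue/saturation in [0,1]
grayRoi = rgb2gray(im2double(roi));
[gx,gy] = gradient(grayRoi);
gradMag = hypot(gx,gy);                       %gradient magnitude as a sharpness proxy
blk = 10;                                     %block size in pixels
[rows,cols] = size(grayRoi);
feat = zeros((rows/blk)*(cols/blk),5);        %contrast, min brightness, sharpness, hue, saturation
k = 1;
for r = 1:blk:rows
    for c = 1:blk:cols
        bGray = grayRoi(r:r+blk-1,c:c+blk-1);
        bGrad = gradMag(r:r+blk-1,c:c+blk-1);
        bHue  = hsvRoi(r:r+blk-1,c:c+blk-1,1);
        bSat  = hsvRoi(r:r+blk-1,c:c+blk-1,2);
        feat(k,:) = [max(bGray(:))-min(bGray(:)), min(bGray(:)), ...
                     mean(bGrad(:)), mean(bHue(:)), mean(bSat(:))];
        k = k+1;
    end
end
nBins = 16;                                   %bins per feature histogram
descriptor = [];
for f = 1:5
    h = histcounts(feat(:,f),nBins,'Normalization','probability');
    descriptor = [descriptor h];              %concatenate the 5 histograms of this ROI
end

Repeating this for each of the twelve ROIs and concatenating the per-ROI descriptors yields the final extended descriptor vector (12 ROIs * 5 features * nBins values) that can be fed to a classifier.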

Related

A Summary of How SURF Works

I am trying to figure out how SURF feature detection works. I think I have made some progress. I would like to know how off I am from what's really going on.
A template image you have already got stored and a real-world image are compared on the basis of "key points", or some important features in the two images.
The smallest Euclidean distance between corresponding keypoints constitutes a good match.
What constitutes an important feature or keypoint? A corner (an intersection of edges) or a blob (a sharp change in intensity).
SURF uses blobs.
It uses a Hessian matrix for blob detection, i.e. for feature extraction. The Hessian matrix is a matrix of second derivatives: it is used to find the minima and maxima of the intensity in a given region of the image.
SIFT/SURF and similar pipelines have 3 stages:
1. Find features/keypoints that are likely to be found again in different images of the same object (SURF uses box filters, as far as I remember). Those features should be scale and rotation invariant if possible; corners, blobs etc. are good, and they are most often searched for at multiple scales.
2. Find the right "orientation" of that point so that, if the image is rotated according to that orientation, both images are aligned with regard to that single keypoint.
3. Compute a "descriptor" that encodes what the neighborhood of the keypoint looks like (after orientation) at the right scale.
Your Euclidean distance computation is then done only on the descriptors, not on the keypoint locations!
It is important to know that step 1 isn't fixed for SURF. SURF in fact is steps 2-3, but the authors give a suggestion for how step 1 can be done so that there are synergies with steps 2-3: both step 1 and step 3 use integral images to speed things up, so the integral image has to be computed only once.
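As a hedged illustration of the detect/describe/match pipeline (not the SURF authors' code), MATLAB's Computer Vision Toolbox exposes these stages directly; the file names below are placeholders and the images are assumed to be RGB:

template = rgb2gray(imread('template.png'));     %placeholder file names
scene    = rgb2gray(imread('scene.png'));
ptsT = detectSURFFeatures(template);             %stage 1: scale-space blob (keypoint) detection
ptsS = detectSURFFeatures(scene);
[featT,validT] = extractFeatures(template,ptsT); %stages 2-3: orientation assignment + descriptor
[featS,validS] = extractFeatures(scene,ptsS);
pairs = matchFeatures(featT,featS);              %matching is done on descriptors, not locations
showMatchedFeatures(template,scene,validT(pairs(:,1)),validS(pairs(:,2)));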

Color quantization of an image using K-means clustering (using RGB features)

Is it possible to do clustering on RGB + spatial features of images with MATLAB?
NOTE: I want to use kmeans for clustering.
In fact, basically I want to do one thing: I want to get this image
from this one.
I think you are looking for color quantization.
[imgQ,map] = rgb2ind(img,4,'nodither'); %change this 4 to the number of desired colors in the quantized image
imshow(imgQ,map);
Result:
Using kmeans:
%img is the original image
imgVec=[reshape(img(:,:,1),[],1) reshape(img(:,:,2),[],1) reshape(img(:,:,3),[],1)];
[imgVecQ,imgVecC]=kmeans(double(imgVec),4); %4 colors
imgVecQK=pdist2(double(imgVec),imgVecC); %choosing the closest centroid to each pixel,
[~,indMin]=min(imgVecQK,[],2); %avoiding double for loop
imgVecNewQ=imgVecC(indMin,:); %quantizing
imgNewQ=img;
imgNewQ(:,:,1)=reshape(imgVecNewQ(:,1),size(img(:,:,1))); %arranging back into image
imgNewQ(:,:,2)=reshape(imgVecNewQ(:,2),size(img(:,:,1)));
imgNewQ(:,:,3)=reshape(imgVecNewQ(:,3),size(img(:,:,1)));
imshow(img)
figure, imshow(imgNewQ);
Result of kmeans:
If you want to add a spatial-distance constraint to kmeans, the code is slightly different: you also concatenate the pixel coordinates to the corresponding pixel values. But remember, when assigning the nearest centroid to each pixel, assign only the color, i.e. the first 3 dimensions, not the last 2 (assigning the coordinates back would obviously not make sense). The code is very similar to the previous one; please note the changes and understand them.
[col,row]=meshgrid(1:size(img,2),1:size(img,1));
imgVec=[reshape(img(:,:,1),[],1) reshape(img(:,:,2),[],1) reshape(img(:,:,3),[],1) row(:) col(:)];
[imgVecQ,imgVecC]=kmeans(double(imgVec),4); %4 colors
imgVecQK=pdist2(double(imgVec(:,1:3)),imgVecC(:,1:3)); %distances computed on color only, not on coordinates
[~,indMin]=min(imgVecQK,[],2);
imgVecNewQ=imgVecC(indMin,1:3); %quantizing
imgNewQ=img;
imgNewQ(:,:,1)=reshape(imgVecNewQ(:,1),size(img(:,:,1))); %arranging back into image
imgNewQ(:,:,2)=reshape(imgVecNewQ(:,2),size(img(:,:,1)));
imgNewQ(:,:,3)=reshape(imgVecNewQ(:,3),size(img(:,:,1)));
imshow(img)
figure, imshow(imgNewQ);
Result of kmeans with distance constraint:

Percentage difference between two images

I have two images of the same height/width that look similar, but they are not exactly the same pixel by pixel: one of the images is shifted to the right by a few pixels.
I am currently using the ImageMagick compare command. It reports differences because it compares pixel by pixel. I also tried its fuzz attribute.
Please suggest any other tool to compare this type of images.
I don't know what you're really trying to achieve, but if you want a metric that expresses the similarity between the two images without taking the displacement into account, then maybe you should work in the frequency domain.
For instance, the magnitude of the DFT of your images should be nearly identical, since a pure translation only affects the phase; so the difference between the two magnitude spectra should be practically null.
In fact, according to the Fourier shift theorem, you can even get an estimate of the displacement offset by combining the two DFTs (their normalized cross-power spectrum) and computing the inverse DFT, i.e. phase correlation: the location of the peak gives the shift.
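A minimal MATLAB sketch of that phase-correlation idea, assuming A and B are grayscale images of equal size (variable names are illustrative):

FA = fft2(double(A));
FB = fft2(double(B));
R  = FA.*conj(FB);
R  = R./max(abs(R),eps);                       %normalized cross-power spectrum
r  = real(ifft2(R));                           %correlation surface
[~,idx] = max(r(:));                           %peak location encodes the translation
[dy,dx] = ind2sub(size(r),idx);
shift = [dy dx]-1;                             %estimated [row col] offset (wraps around the image size)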

SVM for image feature classification?

I implemented the Spatial Pyramid Matching algorithm designed by
Lazebnik in MATLAB, and the last step is to do the SVM
classification. At this point I don't understand what input I should
provide to the svmtrain and svmclassify functions in order to end up with
the matched pairs of feature point coordinates of the train and test images.
I have:
coordinates of SIFT feature points on the train image
coordinates of SIFT feature points on the test image
intersection kernel matrix for train image
intersection kernel matrix for test image.
Which of these I should use?
An SVM classifier expects as input a set of objects (images) represented by tuples, where each tuple is a set of numeric attributes. Some image features (e.g. a gray level histogram) provide an image representation in the form of a vector of numerical values, which is suitable for training an SVM. However, feature extraction algorithms like SIFT output, for each image, a set of vectors. So the question is:
How can we convert this set of feature vectors to a unique vector that represents the image?
To solve this problem, you will have to use a technique that is called bag of visual words.
The problem is that the number of points differs between images, while the SVM expects feature vectors of the same size for train and test.
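A hedged MATLAB sketch of such a bag-of-visual-words encoding, assuming descriptorsPerImage is a cell array with one [Ni x 128] SIFT descriptor matrix per image (names and the vocabulary size are illustrative):

K = 200;                                          %vocabulary size
allDesc = double(cat(1,descriptorsPerImage{:}));  %stack descriptors of all training images
[~,vocab] = kmeans(allDesc,K,'MaxIter',200);      %visual vocabulary (K x 128 centroids)
numImages = numel(descriptorsPerImage);
bowHist = zeros(numImages,K);
for i = 1:numImages
    d = double(descriptorsPerImage{i});
    [~,word] = min(pdist2(d,vocab),[],2);         %nearest visual word for each descriptor
    bowHist(i,:) = histcounts(word,1:K+1);        %word-frequency histogram
    bowHist(i,:) = bowHist(i,:)/max(sum(bowHist(i,:)),1); %normalize to make images comparable
end
%bowHist now has one fixed-length row per image and can be fed to the SVM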
coordinates of SIFT feature points on the train image
coordinates of SIFT feature points on the test image
The coordinates won't help for SVM.
I would use:
the number of found SIFT feature points
a segmentation of the image into small rects, using the presence of a SIFT feature point in a particular rect as a boolean feature value. The feature is then the rect/SIFT-feature-type combination; for N rects and M SIFT feature point types you obtain N*M features (see the sketch below).
The second approach requires spatial normalization of the images: same size, same rotation.
P.S.: I'm no expert in ML. I've only done some experiments on cell recognition in microscope images.
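A rough sketch of the second (grid/presence) idea for a single SIFT feature point type, assuming pts is an [N x 2] matrix of keypoint [x y] coordinates in a spatially normalized image of size imgH x imgW (all names are illustrative):

G = 4;                                            %G x G grid of rects
xBin = min(max(ceil(pts(:,1)/(imgW/G)),1),G);     %column index of the rect containing each point
yBin = min(max(ceil(pts(:,2)/(imgH/G)),1),G);     %row index
present = false(G,G);
present(sub2ind([G G],yBin,xBin)) = true;         %mark rects containing at least one keypoint
featVec = present(:)';                            %G*G boolean features for the SVM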

How to find RGB/HSV color parameters for color tracking?

I would like to track a color in a set of images.
For this reason I use the algorithm of constant thresholding mentioned in
Introduction to Autonomous Mobile Robots. This method simply marks all pixels that lie between a minimum and a maximum threshold of red, green and blue (or hue, saturation and value in my case).
My problem is that, although HSV is less sensitive to changing light conditions, I would still like to set the thresholds programmatically to minimize the number of false positives and false negatives. In other words, the algorithm should ensure that only a given set of pixels is marked in the end, for example a rectangle in a calibration image.
I know that the problem is a search in a 6-dimensional parameter space and I could come up with possible solutions, but I am looking for other programmers' opinions and experience on this subject.
If that matters, I am trying to implement it in C++ with OpenCV.
As far as I understand the question, you are looking for a procedure to calibrate the 6 thresholds (min and max for each of the HSV channels) from a calibration image that contains your tracking marker. To achieve this I would:
1. manually delineate the region, in the calibration image, where the marker appears
2. calculate that region's histograms, one for each of the HSV channels
3. set the min and max thresholds to the histogram percentiles 0.05 and 0.95 respectively
Using the 0.05 and 0.95 percentiles rather than the histogram's minimum and maximum values makes the thresholds more robust to noise.
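A minimal sketch of steps 2-3, written in MATLAB rather than the asker's C++/OpenCV, and assuming roiMask is a logical mask covering the hand-delineated marker region of the calibration image img:

hsvImg = rgb2hsv(img);                        %H, S and V each scaled to [0,1]
minThr = zeros(1,3);
maxThr = zeros(1,3);
for ch = 1:3
    chan = hsvImg(:,:,ch);
    vals = chan(roiMask);                     %marker pixels only
    minThr(ch) = prctile(vals,5);             %0.05 percentile -> lower threshold
    maxThr(ch) = prctile(vals,95);            %0.95 percentile -> upper threshold
end
%a pixel is marked if all three of its HSV values lie inside [minThr, maxThr]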
EDIT:
A modification of the second step:
If you want to minimize the error, you could build a normalized histogram of the marker and a normalized histogram of the environment (these can come from 2 separate images) and subtract the latter from the former. The resulting marker histogram will have background pixel values attenuated, which will shift the values of the above-mentioned percentiles.
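A rough sketch of that attenuation step for one HSV channel, assuming markerVals and envVals are vectors of channel values from the marker region and from an environment/background image (names are illustrative):

edges = linspace(0,1,65);                     %64 bins over [0,1]
hMarker = histcounts(markerVals,edges,'Normalization','probability');
hEnv    = histcounts(envVals,edges,'Normalization','probability');
hDiff   = max(hMarker-hEnv,0);                %attenuate bins dominated by the background
%read the 0.05 and 0.95 thresholds from the cumulative sum of hDiff instead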
