Plotting ROC curve for binary image

I think ROC analysis has some limitations for evaluating the performance of a binary classifier. I plot the ROC curve for an image classifier as follows: first I apply my detection method to the image; its result is a grayscale image. I then apply different thresholds to obtain different binary images. For a range of thresholds (for example, th = 0.01:0.01:1), I obtain a binary image corresponding to each threshold. For each binary image I calculate the true positive rate (TPR) and false positive rate (FPR), and each (FPR, TPR) pair determines a point on the ROC curve. The whole curve consists of the points calculated for all of the binary images.
And my problem: if at the first step I have a binary image rather than a grayscale one, how can I apply different thresholds to it for plotting an ROC curve? Is there any performance evaluation measure, other than ROC, that is suitable for this case?
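
A minimal sketch of the thresholding procedure described above, in Python/NumPy; the names score_img (the grayscale detector output, assumed scaled to [0, 1]) and gt_mask (a binary ground-truth mask) are hypothetical placeholders:

    import numpy as np

    def roc_points(score_img, gt_mask, thresholds=np.arange(0.01, 1.01, 0.01)):
        """Compute one (FPR, TPR) point per threshold applied to a
        grayscale score image, against a binary ground-truth mask."""
        gt = gt_mask.astype(bool)
        P = gt.sum()           # number of positive (object) pixels
        N = (~gt).sum()        # number of negative (background) pixels
        fpr, tpr = [], []
        for th in thresholds:
            pred = score_img >= th                 # binarize at this threshold
            tpr.append((pred & gt).sum() / P)
            fpr.append((pred & ~gt).sum() / N)
        return np.array(fpr), np.array(tpr)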

Related

Can't understand the SIFT keypoint extraction (not description) algorithm

I have read two references about the SIFT algorithm (here and here) and I still do not fully understand how only some keypoints end up being detected, given that the algorithm works on differences of Gaussians computed at several resolutions (they call them octaves). Here are the steps of the technique according to what I've understood from the paper.
1. Given the input image, blur it with Gaussian filters using different sigmas, resulting in a set of Gaussian-filtered images. In the paper they use 5 Gaussian filters per octave (they state that two adjacent Gaussian-filtered images are produced with sigma and k * sigma as the Gaussian filter parameters), and they consider 4 octaves in the algorithm. So there are a total of 20 Gaussian-filtered images (5 per octave), but the 5 Gaussian-filtered images of each octave are processed individually.
2. For each octave, calculate 4 difference-of-Gaussian (DoG) images from the 5 Gaussian-filtered images by simply subtracting adjacent ones. So now we have a total of 16 difference-of-Gaussian images, but the 4 DoG images of each octave are again considered individually.
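
A minimal Python/OpenCV sketch of steps 1 and 2 (simplified: each octave is re-blurred from a nearest-neighbor downsampled base image, and the values sigma = 1.6 and k = sqrt(2) are illustrative assumptions, not necessarily the paper's exact scheme):

    import numpy as np
    import cv2

    def build_dog_pyramid(img, n_octaves=4, n_scales=5, sigma=1.6, k=np.sqrt(2)):
        """n_scales Gaussian-blurred images per octave; DoGs are the
        differences of adjacent blurred images (n_scales - 1 per octave)."""
        img = img.astype(np.float32)
        dogs = []
        for _ in range(n_octaves):
            gaussians = [cv2.GaussianBlur(img, (0, 0), sigma * k ** i)
                         for i in range(n_scales)]
            dogs.append([gaussians[i + 1] - gaussians[i]
                         for i in range(n_scales - 1)])
            # halve the resolution for the next octave
            img = cv2.resize(img, (img.shape[1] // 2, img.shape[0] // 2),
                             interpolation=cv2.INTER_NEAREST)
        return dogs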
3. Find local extremum pixels (maximum or minimum values) by comparing each pixel of a DoG image with its 26 neighbors. Of these, 8 pixels are at the same scale as the pixel (in a 3x3 window), 9 are in a 3x3 window at the scale above (the adjacent DoG image of the same octave), and 9 are in a 3x3 window at the scale below.
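
A sketch of the 26-neighbor test for a single pixel, assuming three adjacent DoG images of the same octave as NumPy arrays:

    import numpy as np

    def is_extremum(dog_below, dog_cur, dog_above, y, x):
        """True if dog_cur[y, x] is strictly greater or strictly smaller
        than all 26 neighbors in the 3x3x3 cube spanning the scale
        below, the same scale and the scale above."""
        cube = np.stack([d[y - 1:y + 2, x - 1:x + 2]
                         for d in (dog_below, dog_cur, dog_above)]).ravel()
        v = cube[13]                        # center of the 3x3x3 cube
        neighbors = np.delete(cube, 13)     # the 26 surrounding values
        return (v > neighbors).all() or (v < neighbors).all()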
4. Once these local extrema are found in the different octaves, they must be refined by eliminating low-contrast points and weak edge points. Bad candidates are filtered using a threshold on a Taylor-expansion value and an eigenvalue-ratio threshold computed from a Hessian matrix.
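
The edge-rejection part can be sketched as follows; this follows the principal-curvature-ratio test from Lowe's paper, with the Hessian estimated by finite differences (r = 10 is the paper's suggested value):

    def passes_edge_test(dog, y, x, r=10.0):
        """Keep a candidate only if the curvature-ratio test
        tr(H)^2 / det(H) < (r + 1)^2 / r passes, where H is the 2x2
        Hessian of the DoG image at (y, x)."""
        dxx = dog[y, x + 1] + dog[y, x - 1] - 2.0 * dog[y, x]
        dyy = dog[y + 1, x] + dog[y - 1, x] - 2.0 * dog[y, x]
        dxy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
               - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4.0
        tr = dxx + dyy
        det = dxx * dyy - dxy * dxy
        return det > 0 and (tr * tr) / det < (r + 1.0) ** 2 / r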
5. (This is the part I don't understand perfectly.) For each interest point that survived (in each octave, I believe), they consider a neighborhood around it and calculate the gradient magnitude and orientation of each pixel in that region. They build a gradient orientation histogram covering 360 degrees and select the highest peak, and also any peaks that are higher than 80% of the highest peak. The orientation of the keypoint is then defined by fitting a parabola to the 3 histogram values closest to each peak in order to interpolate the peak position (I really don't understand this part perfectly).
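
The idea behind the parabola step is that the true peak of the continuous orientation distribution rarely falls exactly on a bin center, so the three bin heights around the discrete peak are used to estimate where it really lies. A sketch, assuming hist is the orientation histogram and i the index of a peak bin:

    def interpolate_peak(hist, i):
        """Fit a parabola through bins i-1, i, i+1 (wrapping around the
        histogram) and return the fractional peak position; the refined
        orientation is then that fractional bin times (360 / len(hist))."""
        n = len(hist)
        l, c, r = hist[(i - 1) % n], hist[i], hist[(i + 1) % n]
        offset = 0.5 * (l - r) / (l - 2.0 * c + r)   # vertex of the parabola
        return (i + offset) % n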
What I don't understand
1. The tutorial and even the original paper are not clear on how a single keypoint is reported, given that we are dealing with multiple octaves (image resolutions). For example, suppose I have detected 1000 keypoints in the first octave, 500 in the second, 250 in the third and 125 in the fourth. The SIFT algorithm returns the following data for each keypoint: (1) the (x, y) coordinates, (2) the scale (what is that?), (3) the orientation and (4) the feature vector (which I easily understood how it is built). There are also Python functions in OpenCV that draw these keypoints on the original image (thus, the first octave), but how is that possible if the keypoints are detected in different octaves and thus in DoG images with different resolutions?
2. I don't understand part 5 of the algorithm very well. It is used for defining the orientation of the keypoint, right? Can somebody explain it to me in other words so that maybe I can understand?
3. For finding local extrema per octave (step 3), they don't explain how to do that in the first and last DoG images. Since we are considering 4 DoG images per octave, it is only possible to do it in the second and third DoG images.
4. There is another thing the author wrote that completely confused my understanding of the approach:
Figure 1: For each octave of scale space, the initial image is repeatedly convolved with Gaussians to produce the set of scale space images shown on the left. Adjacent Gaussian images are subtracted to produce the difference-of-Gaussian images on the right. After each octave, the Gaussian image is down-sampled by a factor of 2, and the process repeated.
What? Does he downsample only one Gaussian image? How can the process be repeated by doing that? I mean, the difference of Gaussians is originally produced by filtering the INPUT IMAGE with different sigmas. So I believe it is the INPUT IMAGE, and not a Gaussian image, that must be resampled. Or did the author forget to write that THE GAUSSIAN IMAGES from a given octave are downsampled, and the process is then repeated for the next octave?

FFT of perturbations on a circle in MATLAB

I have an image showing a circle with a mix of different modes of perturbation imposed on it. It is inconvenient to "unwrap" this circle into Cartesian coordinates to perform an FFT, on account of some other additional features. Is it possible to use the FFT on the circle directly: can I pass the FFT a binary matrix with 1s inside the shape and 0s outside, and obtain the power spectrum of that surface?
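
Mechanically, nothing stops you from doing exactly that; a NumPy sketch with a synthetic disc standing in for the segmented shape (note that this yields the 2-D spectrum of the whole shape, which is not the same thing as decomposing the boundary perturbation into azimuthal modes):

    import numpy as np

    # synthetic binary disc as a stand-in for the segmented shape
    n = 256
    y, x = np.mgrid[:n, :n] - n // 2
    mask = ((x ** 2 + y ** 2) <= 60 ** 2).astype(float)

    F = np.fft.fftshift(np.fft.fft2(mask))
    power = np.abs(F) ** 2     # 2-D power spectrum of the binary shape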

Is there any implementation of supersampled nearest-neighbor upscaling?

Nearest-neighbor is a commonly used "filtering" technique for scaling pixel art while keeping individual pixels visible. However, it doesn't work well for scaling by non-integral factors. I had an idea for a modification that works well for non-integral factors that enlarge the image significantly.
Nearest-neighbor: for each output pixel, sample the original image at a single location.
Linear: for each output pixel, construct a gradient between the neighboring input pixels and sample that gradient.
Instead, I want to calculate which portion of the original image maps to each output pixel's rectangle, then calculate the average color within that region by blending the input pixels according to their coverage of the mapped rectangle.
This algorithm would produce the same result as supersampling with an infinite number of samples. It is not the same as linear filtering, as it does not produce gradients, only blended pixels at the input-pixel boundaries of the output image.
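
A straightforward (unoptimized) Python/NumPy sketch of this area-averaging idea, with hypothetical names; each output pixel is the coverage-weighted average of the input pixels its footprint overlaps:

    import numpy as np

    def area_average_upscale(img, out_h, out_w):
        """For each output pixel, map its rectangle into input coordinates
        and average the overlapped input pixels, weighted by overlap area."""
        in_h, in_w = img.shape[:2]
        sy, sx = in_h / out_h, in_w / out_w    # footprint of one output pixel
        out = np.zeros((out_h, out_w) + img.shape[2:], dtype=np.float64)
        for oy in range(out_h):
            y0, y1 = oy * sy, (oy + 1) * sy
            for ox in range(out_w):
                x0, x1 = ox * sx, (ox + 1) * sx
                acc, wsum = 0.0, 0.0
                for iy in range(int(np.floor(y0)), int(np.ceil(y1))):
                    wy = min(y1, iy + 1) - max(y0, iy)      # vertical overlap
                    for ix in range(int(np.floor(x0)), int(np.ceil(x1))):
                        wx = min(x1, ix + 1) - max(x0, ix)  # horizontal overlap
                        acc = acc + img[iy, ix] * (wx * wy)
                        wsum += wx * wy
                out[oy, ox] = acc / wsum
        return out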
A better description of the algorithm is at this link: What is the best image downscaling algorithm (quality-wise)?. Note that that question is about downscaling, where potentially many more than four input pixels can map to one output pixel; with upscaling, at most four input pixels are processed per output pixel.
So, is there any image editor or utility that supports this kind of weighted-average scaling?

Image feature vector length calculation

Here is the problem: I have a 256x256 image to which a wavelet transform is applied. As a result I get a 16x64 table of coefficients (3.066568, 3.386725, and so on). Do you have any idea how the feature vector length can be calculated?
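
"Length" can be read two ways here; a small NumPy sketch showing both, under the assumption that the 16x64 table is simply flattened into one vector:

    import numpy as np

    coeffs = np.random.rand(16, 64)  # stand-in for the 16x64 coefficient table
    vec = coeffs.ravel()             # flatten the table into a feature vector
    print(vec.size)                  # dimensionality: 16 * 64 = 1024 elements
    print(np.linalg.norm(vec))       # Euclidean length (L2 norm) of the vector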

Partial convolution in MATLAB

I have a large matrix (an image) and a small template. I would like to convolve the small matrix with the larger one. For example, the blue region is the section that I want to be used for the convolution. In other words, I could run the convolution over the whole image, but since that increases CPU time, I would like to focus only on the desired blue part.
Is there any command in MATLAB that can be used for this kind of convolution? Or how can I force the convolution function to use only that specific irregular section?
I doubt you can do an irregular shape (fast convolution is done with a 2-D FFT, which requires a rectangular region). You could optimize it by finding the shape's bounding box and thus discarding the empty border.
@Nicole I would go for the FFT-based route, ifft2(fft2(im) .* fft2(smallIm, size(im, 1), size(im, 2))), which (up to boundary effects, since multiplying FFTs gives circular convolution) is the equivalent of conv2(im, smallIm).
As far as recognizing the irregular shape goes, you can use an edge detector such as Canny and find the extreme (left, right, top, bottom) points; since Canny returns a binary (1/0) image, you can build a bounding box from those values. However, this will take some time to compute, and I'm not sure how much faster the whole thing will be.
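
A sketch of the bounding-box idea in Python/SciPy (the same approach carries over directly to MATLAB's conv2): convolve only a crop around the region, padded by one template radius so the values inside the box are exact:

    import numpy as np
    from scipy.signal import convolve2d

    def convolve_region(img, template, mask):
        """Run a 'same'-mode convolution only inside the bounding box of a
        binary region mask; pixels outside the box are left untouched."""
        ys, xs = np.nonzero(mask)
        by0, by1 = ys.min(), ys.max() + 1        # bounding box of the region
        bx0, bx1 = xs.min(), xs.max() + 1
        ry, rx = template.shape[0] // 2, template.shape[1] // 2
        # crop with a margin of one template radius so that the 'same'
        # convolution is exact everywhere inside the bounding box
        cy0, cy1 = max(by0 - ry, 0), min(by1 + ry, img.shape[0])
        cx0, cx1 = max(bx0 - rx, 0), min(bx1 + rx, img.shape[1])
        patch = convolve2d(img[cy0:cy1, cx0:cx1], template, mode='same')
        out = img.astype(float).copy()
        out[by0:by1, bx0:bx1] = patch[by0 - cy0:by1 - cy0, bx0 - cx0:bx1 - cx0]
        return out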
