Is there any implementation of supersampled nearest-neighbor upscaling? - image

Nearest-neighbor is a commonly-used "filtering" technique for scaling pixel art while showing individual pixels. However, it doesn't work well for scaling with non-integral factors. I had an idea for a modification that works well for non-integral factors significantly larger than the original size.
Nearest-neighbor: For each output pixel, sample the original image at one location.
Linear: For each output pixel, construct a gradient between the two input pixels, and sample the gradient.
Instead, I want to calculate which portion of the original image would map to the output pixel rectangle, then calculate the average color within that region by blending the input pixels according to their coverage of the mapped rectangle.
This algorithm would produce the same results as supersampling with an infinite number of samples. It is not the same as linear filtering, as it does not produce gradients, only blended pixels on the input-pixel boundaries of the output image.
A better description of the algorithm is at this link: What is the best image downscaling algorithm (quality-wise)? . Note that the URL mentions downscaling, which could potentially have more than four pixels per output pixel. Upscaling has a maximum of four input pixels per output pixel processed, though.
Now is there any image editor or utility that supports weighted-average scaling?

Related

Can't understand the SIFT keypoint extraction (not description) algorithm

I have read two references about the SIFT algorithm here and here and I am not truly understanding how just some key points are detected considering that the algorithm works on the difference of Gaussians calculated on several resolutions (they call them octaves). Here are the steps from the technique according to what I've understood from the paper.
Given the input image, blur them with Gaussian filters using different sigmas, resulting in Gaussian filtered images. In the paper, they used 5 Gaussian filters per octave (they tell that two adjacent Gaussian filtered images were filtered using sigma and k * sigma gaussian filter parameters), and they consider 4 octaves in the algorithm. So, there are a total of 20 gaussian filtered images (5 per octave), but we will act on 5 Gaussian filtered images from each octave individually.
For each octave, we calculate 4 Difference Of Gaussian images (DOG) from the 5 Gaussian filtered images by just subtracting adjacent Gaussian filtered images. So, now we have a total of 16 difference of Gaussian images, but we will consider 4 Difference of Gaussian images from each octave individually.
Find local extrema (maximum or minimum values) pixels by comparing a pixel in each DOG with 26 neighbor pixels. Among these, 8 pixels are in the same scale as the pixel (in a 3x3 window), 9 are in a 3x3 window in the scale above (Difference of Gaussian image from the same octave) and 9 others in a 3x3 window in the scale below.
Once finding these local extrema in different octaves, we must refine them, eliminating low-contrast points or weak edge points. They filter bad candidates using threshold in a Taylor expansion function and eigenvalue ratio threshold calculated in a Hessian matrix.
(this part I don't understand perfectly): For each interest point that survived (in each octave, I believe), they consider a neighborhood around it and calculate gradient magnitudes and orientation of each pixel from that region. They build a gradient orientation histogram covering 360 degrees and select the highest peak and also peaks that are higher than 80% of the highest peak. They define that the orientation of the keypoint is defined in a parabola (fitting function?) to the 3 histogram values closest to each peak to interpolate the peak position (I really dont understand this part perfectly).
What I am not understanding
1- The tutorial and even the original paper are not clear on how to detect a single key point as we are dealing with multiple octaves (images resolutions). For example, suppose I have detected 1000 key points in the first octave, 500 in the second, 250 in the third and 125 in the fourth octave. The SIFT algorithm will return me the following data about the key points: 1-(x,y) coordinates 2- scale (what is that?) 3-orientation and 4- the feature vector (which I easily understood how it is built). There are also Python functions from Opencv that can draw these keypoints using the original image (thus, the first octave), but how if the keypoints are detected in different octaves and thus the algorithm considers DOG images with different resolutions?
2- I don't understand the part 5 of the algorithm very well. Is it useful for defining the orientation of the keypoint, right? can somebody explain that to me with other words and maybe I can understand?
3- To find Local extrema per octave (step 3), they don't explain how to do that in the first and last DOG images. As we are considering 4 DOG images, It is possible to do that only in the second and third DOG images.
4- There is another thing that the author wrote that completely messed the understanding of the approach from me:
Figure 1: For each octave of scale space, the initial image is
repeatedly convolved with Gaussians to produce the set of scale space
images shown on the left. Adjacent Gaussian images are subtracted to
produce the difference-of-Gaussian images on the right. After each
octave, the Gaussian image is down-sampled by a factor of 2, and the
process repeated.
What? Does he downsample only one Gaussian image? how can the process be repeated by doing that? I mean, the difference of Gaussians is originally done by filtering the INPUT IMAGE differently. So, I believe the INPUT IMAGE, and not the Gaussian image must be resampled. Or the author forgot to write that THE GAUSSIAN IMAGES from a given octave are downsampled and the process is repeated for another octave?

2D raster image line of sight algorithm

I'm working on a simple mapping application for fun, and one of the things I need to do is to find (and color) all of the points that are visible from the current location. In this case, points are pixels. My map is a raster image where transparent pixels are open space, and any other pixels are opaque. (There are no semi-transparent pixels; alpha is either 0 or 100%.) In this sense, it's sort of like a regular flood fill, with the constraint that each filled pixel has to have a clear line-of-sight to the origin point. The following image shows a couple of such areas colored in (the tiny crosshairs are the origin points, and white = transparent):
(http://tinyurl.com/nf3nqa4)
In addition, what I am ultimately interested in are the points that "border" other colors, i.e., I want the list of points that make up the edge of the visible region.
My current and very inefficient solution is the modified flood-fill I described above. This approach returns correct results, but due to the need to iterate every pixel on a line to the origin for every pixel in the flood fill, it's very slow. My images are downsized and quantized, but I still need about 1MP for acceptable accuracy, and typical LoS areas are at least 100,000 pixels each.
I may well be using the wrong search terms, but I haven't been able to find any discussion of algorithms that would solve this (rasterized) LoS case.
I suspect that this could be done more efficiently if your "walls" were represented as equations rather than simply pixels in a raster image. For example, polygons/triangles, circles, ellipses.
It would then be like raytracing (search for this term) in 2D. In other words, you could consider the ray/line from each pixel in the image to the point of interest and color the pixel only if it does not intersect with any object.
This method does require you to test the intersection for each pixel in the image with each object; however, if you look up raytracing you will find a number of efficient methods for testing these intersections. They will mostly be for the 3D case but it should be straightforward to convert them to 2D.
There are 3D raytracers that are very fast on MUCH larger images so this should be very doable.
You can try a delaunay triangulation on each color. I mean you can try to find the shape of each color with DT.

Matlab - How to measure the dispersion of black in a binary image?

I am comparing RGB images of small colored granules spilled randomly on a white backdrop. My current method involves importing the image into Matlab, converting to a binary image, setting a threshold and forcing all pixels above it to white. Next, I am calculating the percentage of the pixels that are black. In comparing the images to one another, the measurement of % black pixels is great; however, it does not take into account how well the granules are dispersed. Although the % black from two different images may be identical, the images may be far from being alike. For example, assume I have two images to compare. Both show a % black pixels of 15%. In one picture, the black pixels are randomly distributed throughout the image. In the other, a clump of black pixels are in one corner and are very sparse in the rest of the image.
What can I use in Matlab to numerically quantify how "spread out" the black pixels are for the purpose of comparing the two images?
I haven't been able to wrap my brain around this one yet, and need some help. Your thoughts/answers are most appreciated.
Found an answer to a very similar problem -> https://stats.stackexchange.com/a/13274
Basically, you would use the average distance from a central point to every black pixel as a measure of dispersion.
My idea is based upon the mean free path ()used in ideal gad theory / thermodynamics)
First, you must separate your foreground objects, using something like bwconncomp.
The mean free path is calculated by the mean distance between the centers of your regions. So for n regions, you take all n/2*(n-1) pairs, calculate all the distances and average them. If the mean distance is big, your particles are well spread. If it is small, your objects are close together.
You may want to multiply the resulting mean by n and divide it by the edge length to get a dimensionless number. (Independent of your image size and independent of the number of particles)

Partial convolution in MATLAB

I have large matrix (image) and a small template. I would like to convolve the small matrix with the larger matrix. For example, the blue region is the section that I want to be used for convolution. In other words, I can use the convolution for all of the image, but since the CPU time is increased, therefore, I would like to just focus on the desired blue part.
Is there any command in MATLAB that can be used for this convolution? Or, how I can force the convolution function to just use that specific irregular section for convolution.
I doubt you can do an irregular shape (fast convolution is done with 2D FFT, which would require a square region). You could optimize it by finding the shape's bounding box and thus discarding the empty border.
#Nicole i would go for the fft2(im).*fft(smallIm) which is the equivalent for conv2(im,smallIm).
as far as recognizing the irregular shape you can use edge detection like canny and find the values of the most (left,right,top,bottom) dots, since canny returns a binary (1,0) image and prepare a bounding box, using the values. however this will take some time to create. and i'm not sure about how much faster will this be.

raster images zooming algorithms

Suppose we have 20x20 raster image. How does zooming work?
For instance how to map (Xo,Yo) to (Xn, Yn), where o - original, n - new. Obviously there are 2 cases when new resolution is less or greater than original one. Feels like you would try similarity transformation - but then how do you apply it to pixel per pixel, so that resulting image has no holes (or when resulting image has lower dimensions, how would you fit there).
There are lots of algorithms to zoom and resize images. This is generally referred to as resampling. Wikipedia has lots of information about this along with algorithm examples:
https://en.wikipedia.org/wiki/Resampling_(bitmap)

Resources