Does there exist a way to directly figure out the "smoothness" of a digital image?

There are several ways to evaluate an image: brightness, saturation, hue, intensity, contrast, etc. And we always hear about the operations of smoothing or sharpening an image. So there should be a way to evaluate the overall smoothness of an image, and an exact way to compute this value in one formula, probably based on wavelets. Even better, could anyone provide the MATLAB function, or combination of functions, that directly calculates this value?
Thanks in advance!

Smoothness is a vague term. What is considered smooth for one application might not be considered smooth for another.
In the common case, smoothness is a function of the color gradients. Take a 2D gradient on each of the 3 color channels, then take its magnitude, sqrt(dx^2 + dy^2), and average, sum or apply some other function over the 3 channels. That gives you local smoothness, which you can then sum/average/least-squares over the image, as in the sketch below.
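For illustration, here is a minimal numpy sketch of that gradient measure (the per-channel and per-image averaging used here are just one reasonable choice, not a fixed recipe):
import numpy as np

def gradient_smoothness(img):
    # img: float array of shape (H, W, 3), values scaled to [0, 1]
    dy, dx = np.gradient(img, axis=(0, 1))   # 2D gradient on each color channel
    magnitude = np.sqrt(dx ** 2 + dy ** 2)   # sqrt(dx^2 + dy^2)
    local = magnitude.mean(axis=2)           # average over the 3 channels
    return local.mean()                      # average over the whole image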
In the more common case, however, linear changes in color are also smooth (think of two-color gradients, or how light might be reflected from an object). For that, a second derivative is more suitable, and a Laplacian does exactly that.
I've had much luck using the Laplacian operator for calculating smoothness in Python with the scipy/numpy libraries. Similar utilities exist for MATLAB and other tools.
Note that the resulting value isn't something absolute from the math books; you should only use it relative to itself, with constants you deem fit.
Specific how to:
First get scipy. If you are on Linux it's available on PyPI; for Windows you'll have to use a precompiled version here. Open the image using scipy.ndimage.imread and then run scipy.ndimage.filters.laplace on the image you read. You don't actually have to mix the channels first; you can simply call numpy.average on the result and it should be close enough.
import numpy as np
import scipy.ndimage as ndi
print(np.average(np.absolute(ndi.filters.laplace(ndi.imread(path).astype(float) / 255.0))))
This would give the average smoothness (for some meaning of smoothness) of the image. I use np.absolute since values can be positive or negative and we don't want them to cancel out when averaging. I convert to float and divide by 255 to have values between 0.0 and 1.0 instead of 0 to 255, since that's easier to work with.
If you want to see what the Laplacian found, you can use matplotlib:
import matplotlib.pyplot as plt
v = np.absolute(ndi.filters.laplace(ndi.imread(path).astype(float) / 255.0))
v2 = np.average(v, axis=2)  # mixing the channels down
plt.imshow(v2)
plt.figure()
plt.imshow(v2 > 0.05)
plt.show()

Related

algorithm for finding closest images based on jitter / translation

I've got a series of images, and in some of them the people have moved only slightly, or the camera was shifted slightly, but mostly everything is still the same.
I'm wondering algorithmically how I could detect this and find and score images based on their closeness.
A simple Euclidean distance might not work - imagine the case where zebra stripes were shifted just enough to have the "old" white positions filled with black and vice versa. A pathological example, I know, but you get the idea.
As an optional tag-along, perhaps there's a nice OpenCV or scipy function (preference for Python) for this, or part of the pipeline for doing it.
Thanks!
You can calculate the difference between your images.
The higher the intensity values in the difference image, the more the two images differ.
So, if you subtract two identical images, you get an all-"black" difference image.
You can simply use the overloaded operator-() of the Mat class.
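Since the question prefers Python, here is a minimal sketch of that idea with OpenCV's absolute difference (the file names are placeholders):
import cv2
import numpy as np

a = cv2.imread("frame_a.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file names
b = cv2.imread("frame_b.png", cv2.IMREAD_GRAYSCALE)

diff = cv2.absdiff(a, b)        # per-pixel absolute difference
score = float(np.mean(diff))    # lower score means the images are closer
print(score)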

Which way is my yarn oriented?

I have an image processing problem. I have pictures of yarn:
The individual strands are partly (but not completely) aligned. I would like to find the predominant direction in which they are aligned. In the center of the example image, this direction is around 30-34 degrees from horizontal. The result could be the average/median direction for the whole image, or just the average in each local neighborhood (producing a vector map of local directions).
What I've tried: I rotated the image in small steps (1 degree) and calculated statistics along the vertical vs horizontal direction of the rotated image (for example: standard deviation of summed rows or summed columns). I reasoned that the difference in statistics would be greatest when the strands are oriented exactly vertically or exactly horizontally, so that angle of rotation would give the correct direction in the original image. However, for the several kinds of statistical properties I tried, this did not work.
I further thought that perhaps this wasn't working because there were too many different directions at the same time in the whole image, so I tried it in a small neighborhood. In that case there is always a very clear preferred direction (different for each neighborhood), but it is not the direction the fibers really go... I can post my sample code but it is basically useless.
I keep thinking there has to be some kind of simple linear algebra/statistical property of the whole image, or some value derived from the 2D FFT that would give the correct direction in one step... but how?
What probably won't work: detecting individual fibers. They are not necessarily the same color, and the image can shade from light to dark, so edge detectors don't work well, and the image may not even be in focus sometimes. Because of that, it is not always possible even for a human to see individual fibers (see top-right in the example); they kind of have to be detected as a preferred direction in a statistical sense.
You might try doing this in the frequency domain. The output of a Fourier Transform is orientation dependent so, if you have some kind of oriented pattern, you can apply a 2D FFT and you will see a clustering around a specific orientation.
For example, making a greyscale out of your image and performing FFT (with ImageJ) gives this:
You can see a distinct cluster that is oriented orthogonally with respect to the orientation of your yarn. With some pre-processing on your source image, to remove noise and maybe enhance the oriented features, you can probably achieve a much stronger signal in the FFT. Once you have a cluster, you can use something like PCA to determine the vector for the major axis.
For info, this is a technique that is often used to enhance oriented features, such as fingerprints, by applying a selective filter in the FFT and then taking the inverse to obtain a clearer image.
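A rough Python sketch of that idea (the percentile threshold and the covariance/eigenvector step are choices you would tune, and the sign convention of the angle depends on how you define your axes, so sanity-check it on a known image):
import numpy as np
from skimage import io

img = io.imread("yarn.png", as_gray=True)   # hypothetical file name
spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img)))

# Keep only the strongest frequencies; the percentile is an arbitrary cut-off
ys, xs = np.nonzero(spectrum > np.percentile(spectrum, 99.9))
ys = ys - img.shape[0] / 2.0
xs = xs - img.shape[1] / 2.0

# Major axis of the cluster via the covariance eigenvectors (a small PCA)
cov = np.cov(np.vstack([xs, ys]))
eigvals, eigvecs = np.linalg.eigh(cov)
major = eigvecs[:, np.argmax(eigvals)]
cluster_angle = np.degrees(np.arctan2(major[1], major[0]))

# The spectrum cluster is orthogonal to the yarn, so rotate by 90 degrees
print(cluster_angle + 90.0)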
An alternative approach is to try a series of pre-built Gabor filters (see here) with a selection of orientations and frequencies, and use the resulting features as a metric for identifying the most likely orientation. There is a scikit article that gives some examples here.
UPDATE
Just playing with ImageJ to give an idea of some possible approaches to this - I started with the FFT shown above, then - in the following image, I performed these operations (clockwise from top left) - Threshold => Close => Holefill => Erode x 3:
Finally, rather than using PCA, I calculated the spatial moments of the lower left blob using this ImageJ Plugin which handily calculates the orientation of the longest axis based on the 2nd order moment. The result gives an orientation of approximately -38 degrees (with respect to the X axis):
Depending on your frame of reference you can calculate the approximate average orientation of your yarn from this rather than from PCA.
I tried to use Gabor filters to enhance the orientations of your yarns. The parameters I used are:
phi = x*pi/16; % x = 1, 3, 5, 7
theta = 3;
sigma = 0.65*theta;
filterSize = 3;
The imaginary parts of the convolved images are shown below:
As you mentioned, most orientations lie between 30 and 34 degrees, so the filter with phi = 5*pi/16 (bottom left) yields the best contrast among the four.
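For comparison, a similar sweep can be sketched in Python with scikit-image's gabor filter (the frequency and angle step are guesses rather than the MATLAB parameters above, and whether the peak response marks the fiber direction or its normal depends on the filter convention, so verify on a known image):
import numpy as np
from skimage import io, filters

img = io.imread("yarn.png", as_gray=True)   # hypothetical file name

angles = np.deg2rad(np.arange(0, 180, 10))  # sweep of candidate orientations
responses = []
for theta in angles:
    real, imag = filters.gabor(img, frequency=0.2, theta=theta)
    responses.append(np.abs(imag).mean())   # strength of the imaginary part

best = np.degrees(angles[int(np.argmax(responses))])
print(best)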
I would consider using a Hough Transform for this type of problem; there is a nice write-up here.
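If you try that route in Python, a minimal scikit-image sketch might look like this (the Canny sigma is a guess and may struggle on the blurrier yarn images):
import numpy as np
from skimage import io, feature, transform

img = io.imread("yarn.png", as_gray=True)   # hypothetical file name
edges = feature.canny(img, sigma=2)         # edge map; sigma needs tuning

h, angles, dists = transform.hough_line(edges)
accum, best_angles, best_dists = transform.hough_line_peaks(h, angles, dists)

# hough_line reports the angle of each line's normal, in radians
print(np.degrees(best_angles[:5]))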

Image Warp Filter - Algorithm and Rasterization

I'd like to implement a filter that allows resampling of an image by moving a number of control points that mark edges and tangent directions. The goal is to be able to freely transform an image, as in Photoshop when you use "Free Transform" and choose the Warp mode "Custom". The image is fitted into some kind of spline patch (if that is a valid name) that can be manipulated.
I understand how simple splines (paths) work but how do you connect them to form a patch?
And how can you sample such a patch to render the morphed image? For each pixel in the target I'd need to know what pixel in the source image corresponds. I don't even know where to start searching...
Any helpful info (keywords, links, papers, reference implementations) are greatly appreciated!
This document will get you a good insight into warping: http://www.gson.org/thesis/warping-thesis.pdf
However, this will include filtering out high frequencies, which will make the implementation a lot more complicated but will give a better result.
An easy way to accomplish what you want would be to loop through every pixel in your final image, plug the coordinates into your splines and retrieve the corresponding position in your original image. This position might have coordinates like 0.4/1.2, so you would bilinearly interpolate between the pixels at 0/1, 1/1, 0/2 and 1/2.
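A bare-bones sketch of that inverse-mapping loop (the inverse_map callable is a stand-in for whatever your spline patch evaluates to):
import numpy as np

def bilinear_sample(src, x, y):
    # Sample src at a fractional (x, y) coordinate with bilinear interpolation
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, src.shape[1] - 1), min(y0 + 1, src.shape[0] - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * src[y0, x0] + fx * src[y0, x1]
    bottom = (1 - fx) * src[y1, x0] + fx * src[y1, x1]
    return (1 - fy) * top + fy * bottom

def warp_image(src, inverse_map):
    # inverse_map(xd, yd) -> (xs, ys): destination coords mapped back to source coords
    dst = np.zeros_like(src)
    for yd in range(dst.shape[0]):
        for xd in range(dst.shape[1]):
            xs, ys = inverse_map(xd, yd)   # evaluate your spline patch here
            if 0 <= xs < src.shape[1] and 0 <= ys < src.shape[0]:
                dst[yd, xd] = bilinear_sample(src, xs, ys)
    return dst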
As for splines: there are many resources and solutions online for the 1D case. As for 2D it gets a bit trickier to find helpful resources.
A simple example for the 1D case: http://www-users.cselabs.umn.edu/classes/Spring-2009/csci2031/quad_spline.pdf
Here's a great guide for the 2D case: http://en.wikipedia.org/wiki/Bicubic_interpolation
Based upon this you could derive your own spline scheme for the 2D case. Define a bivariate (in x and y) polynomial and set your constraints to solve for the coefficients of the polynomial.
Just keep in mind that the borders of the spline patches have to be consistent (both in value and derivative) to avoid ugly jumps.
Good luck!

How do I choose an image interpolation method? (Emgu/OpenCV)

The image resizing function provided by Emgu (a .net wrapper for OpenCV) can use any one of four interpolation methods:
CV_INTER_NN (default)
CV_INTER_LINEAR
CV_INTER_CUBIC
CV_INTER_AREA
I roughly understand linear interpolation, but can only guess what cubic or area do. I suspect NN stands for nearest neighbour, but I could be wrong.
The reason I'm resizing an image is to reduce the number of pixels (they will be iterated over at some point) whilst keeping them representative. I mention this because it seems to me that interpolation is central to this purpose - getting the right type ought therefore to be quite important.
My question then, is what are the pros and cons of each interpolation method? How do they differ and which one should I use?
Nearest neighbor will be as fast as possible, but you will lose substantial information when resizing.
Linear interpolation is slower, but will not result in information loss unless you're shrinking the image (which you are).
Cubic interpolation (probably actually "Bicubic") uses one of many possible formulas that incorporate multiple neighbor pixels. This is much better for shrinking images, but you are still limited as to how much shrinking you can do without information loss. Depending on the algorithm, you can probably reduce your images by 50% or 75%. The primary con of this approach is that it is much slower.
Not sure what "area" is - it may actually be "Bicubic". In all likelihood, this setting will give your best result (in terms of information loss / appearance), but at the cost of the longest processing time.
Update: this link gives more details (including a fifth type not included in your list):
http://docs.opencv.org/modules/imgproc/doc/geometric_transformations.html?highlight=resize#resize
The algorithms are: (descriptions are from the OpenCV documentation)
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood
If you want more speed, use the nearest-neighbor method.
If you want to preserve the quality of the image after downsampling, you can consider using INTER_AREA based interpolation, but again it depends on the image content.
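For illustration, shrinking with INTER_AREA in Python/OpenCV looks like this (the target size is arbitrary):
import cv2

img = cv2.imread("input.png")   # hypothetical file name
small = cv2.resize(img, (img.shape[1] // 4, img.shape[0] // 4), interpolation=cv2.INTER_AREA)
cv2.imwrite("input_small.png", small)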
You can find a detailed analysis of the speed comparison here.
Below is the speed comparison on a 400*400 px image, taken from the above link.
The interpolation method to use depends on what you are trying to achieve:
CV_INTER_LINEAR or CV_INTER_CUBIC apply a lowpass filter (averaging) in order to achieve a trade-off between visual quality and edge removal (lowpass filters tend to remove edges in order to reduce aliasing in images). Between these two, I'd recommend CV_INTER_CUBIC.
The CV_INTER_NN method actually is nearest neighbour; it's the most basic method and you'll get sharper edges (no lowpass filter will be applied). However, this method is simply like "zooming" the image, with no visual enhancement.
They all lose information, which you use depends on the speed you need, how much information you can afford to lose and the nature of your image.
Sorry there is no correct answer - that's why there is a choice

How can I choose an image with higher contrast in PHP?

For a thumbnail-engine I would like to develop an algorithm that takes x random thumbnails (crop, no resize) from an image, analyzes them for contrast and chooses the one with the highest contrast. I'm working with PHP and Imagick but I would be glad for some general tips about how to compute contrast of imagery.
It seems that many things are easier than computing contrast, for example counting colors, computing luminosity, etc.
What are your experiences with the analysis of picture material?
I'd do it this way (pseudocode):
L[256] = {0, 0, 0, ...}
loop over each pixel:
    luminance = avg(R, G, B)
    increment L[luminance] by 1
for i = 0 to 255:
    if L[i] < C: L[i] = 0   // C = threshold of your choice
find index of first and last non-zero value of L[]
contrast = last - first
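A rough Python/numpy translation of that pseudocode, using Pillow to load the image (the file name and the threshold C are placeholders you would tune):
import numpy as np
from PIL import Image

img = np.asarray(Image.open("thumb.png").convert("RGB"), dtype=float)
luminance = img.mean(axis=2).astype(int)             # avg(R, G, B) per pixel
hist = np.bincount(luminance.ravel(), minlength=256)

C = 20                                               # drop rarely-used intensity levels
used = np.nonzero(hist >= C)[0]
contrast = int(used[-1] - used[0]) if used.size else 0
print(contrast)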
In looking for the image "with the highest contrast," you will need to be very careful in how you define contrast for the image. In the simplest way, contrast is the difference between the lowest intensity and the highest intensity in the image. That is not going to be very useful in your case.
I suggest you use a histogram approach to describe the contrast of a given image and then compare the properties of the histograms to determine the image with the highest contrast as you define it. You could use a variety of well known containers to represent the histogram in code, or construct a class to meet your specific needs. (I am not implying that you need to create a histogram in the form of a chart – just a statistical representation of the intensity values.) You could use the variance of each histogram directly as a measure of contrast, or use the standard deviation if that is easier to work with.
The key really lies in how you define the contrast of the image. In general, I would define a high contrast image as one with values present for all, or nearly all, the possible values. And I would further add that in this definition of a high contrast image, the intensity values of the image will tend to be distributed across the range of possible values in a uniform way.
Using this approach, a low contrast image would tend to have relatively few discrete intensity values and they would tend to be closely grouped together rather than uniformly distributed. (As a general rule, they will also tend to be grouped toward the center of the range.)
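For example, the standard-deviation measure mentioned above takes only a couple of lines (shown in Python for illustration; the same idea carries over to PHP/Imagick):
import numpy as np
from PIL import Image

img = np.asarray(Image.open("thumb.png").convert("L"), dtype=float)   # hypothetical file name
print(img.std())   # larger spread of intensity values = higher contrast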
