I am now working on interpolating the gray values of an image. I want to use the MATLAB function griddedInterpolant. Among its interpolation methods it offers 'cubic' and 'spline', both of which are smooth (the documentation lists 'cubic' as C1 continuous and 'spline' as C2). I have found the algorithm behind 'cubic' (actually bicubic, since my data are 2D) as described on Wikipedia:
https://en.wikipedia.org/wiki/Bicubic_interpolation.
I want to ask if someone knows the algorithm behind 'spline' (actually bicubic spline for 2D), because I don't quite understand the difference between the two.
For my calculation I also need the gradient of the gray values. Since griddedInterpolant does not expose the coefficients of the interpolation, there is no way to compute the gradients directly from them.
Can someone help me with that?
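(A common workaround, assuming the spline is smooth enough for your purpose: evaluate the interpolant at nearby points and form central differences. This is a minimal sketch, not a feature of griddedInterpolant itself; X, Y, V stand for your ndgrid coordinates and gray values, and the step h is a placeholder to tune.)

F  = griddedInterpolant(X, Y, V, 'spline');        % X, Y from ndgrid, V the gray values
h  = 1e-4;                                         % finite-difference step; tune to your grid spacing
gx = @(x, y) (F(x + h, y) - F(x - h, y)) / (2*h);  % approximate d/dx of the interpolant
gy = @(x, y) (F(x, y + h) - F(x, y - h)) / (2*h);  % approximate d/dy of the interpolant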
Folks,
I have read a number of articles on the Discrete Wavelet Transform (DWT) and looked at some sample code as well. However, I am not clear on what exactly DWT achieves.
Here is what I understand. For a two-dimensional image in YUV format, I can pass the Y plane (brightness) to the DWT function as a parameter. The function returns a matrix of the original width and height containing coefficient values.
What are these coefficient values telling me? Is it how fast or slow the brightness of a pixel has changed compared to its neighbors?
Further, the returned matrix is rearranged into four quarters. As the coefficients have been rearranged, I no longer know which coefficient belongs to which pixel. This is confusing: if I cannot associate a coefficient with its corresponding pixel location, how can I really use the coefficients?
A little bit of background. I am looking at hiding some information in an image as an invisible watermark. From what I understand, DWT can help me identify the best region to hide the information. However, I have not been able to put the whole picture together.
OK, I figured out how DWT works. I was under the assumption that the coefficients generated have a direct positional relationship with the original image. However, the transform converts the input luma into a completely different set of values (the four quarters are the subbands of the decomposition). It is possible to run the inverse transform on the new values to obtain the original values again.
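(For what it's worth, the round trip is easy to see with MATLAB's Wavelet Toolbox; this is just an illustrative sketch using the Haar wavelet, with Y standing for the luma plane and even image dimensions assumed.)

[cA, cH, cV, cD] = dwt2(double(Y), 'haar');  % approximation + horizontal/vertical/diagonal detail subbands
Yrec = idwt2(cA, cH, cV, cD, 'haar');        % inverse transform recovers the input
max(abs(Yrec(:) - double(Y(:))))             % ~0, up to floating-point error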
Regards,
Peter
I have an image. I want to resize it to double the original size, filling in the new pixels by interpolation. I need to specify which type of interpolation I want to use.
I see the imresize function, which has an option for 'method'. The problem is that there are only 3 options: nearest, bilinear, bicubic. Bilinear and bicubic are averaging (mean-based) methods, but is there any way to set the neighborhood size or weighting?
The main problem is that I need to do it with a 'median' interpolation method instead of a mean. How can I tell it to use this method?
The way that IMRESIZE implements interpolation is by calculating, for each pixel in the output image (inverse mapping), the indices of the pixels in the input image that are going to be involved in the interpolation, along with the contributing weights.
The neighborhood and the weights are determined by the type of interpolation kernel used, which, as @Albert points out, can be passed to the IMRESIZE function (the 'Method' property can accept {f,w}, a cell array with the kernel function and the kernel width).
These two components are then used to compute a linear combination of the involved input pixels to fill in each value of the output pixels. This process is performed along each dimension separately, one at a time (vertically, then horizontally).
Now the problem for you is that you can never obtain a median value by using a linear combination, because the median is a non-linear ordering filter. So your only option is to write your own implementation, as sketched below...
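(For illustration, a brute-force sketch of such a "median interpolation" for 2x upscaling. The function name and the 3x3 neighborhood are my own choices, not anything IMRESIZE does, and a real implementation would vectorize the loops.)

function out = median_resize2x(in)
% Double the image size, setting each output pixel to the median of a
% small input neighborhood instead of a weighted (linear) combination.
[h, w] = size(in);
out = zeros(2*h, 2*w, 'like', in);
for r = 1:2*h
    for c = 1:2*w
        y = (r - 0.5)/2 + 0.5;                         % inverse-map to input coordinates
        x = (c - 0.5)/2 + 0.5;
        rows = max(1, floor(y) - 1) : min(h, floor(y) + 1);
        cols = max(1, floor(x) - 1) : min(w, floor(x) + 1);
        win = in(rows, cols);                          % neighborhood, clipped at the borders
        out(r, c) = median(win(:));
    end
end
end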
Amro is right that the median filter cannot be computed as a weighted combination. But MATLAB has a dedicated function for the median filter: medfilt2.
imresize has a third way of passing the interpolation method: a two-element cell array specifying the interpolation kernel. You can read more about it in MATLAB's documentation.
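(A sketch of that cell-array form: a custom triangle kernel of width 2, which behaves like 'bilinear'. Swap in any kernel function of your own; the test image is just an example.)

I = imread('cameraman.tif');        % any grayscale test image
f = @(x) max(1 - abs(x), 0);        % kernel function, with support [-1, 1]
J = imresize(I, 2, {f, 2});         % 'Method' given as {kernel, width}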
I'd like to implement a filter that allows resampling of an image by moving a number of control points that mark edges and tangent directions. The goal is to be able to freely transform an image, as in Photoshop when you use "Free Transform" and choose the warp mode "Custom". The image is fitted into some kind of spline patch (if that is a valid name) that can be manipulated.
I understand how simple splines (paths) work, but how do you connect them to form a patch?
And how can you sample such a patch to render the morphed image? For each pixel in the target I'd need to know which pixel in the source image corresponds to it. I don't even know where to start searching...
Any helpful info (keywords, links, papers, reference implementations) is greatly appreciated!
This document will give you good insight into warping: http://www.gson.org/thesis/warping-thesis.pdf
Note, however, that this includes filtering out high frequencies, which will make the implementation a lot more complicated but will give a better result.
An easy way to accomplish what you want would be to loop through every pixel in your final image, plug its coordinates into your splines, and retrieve the corresponding pixel in your original image. This pixel might land at coordinates 0.4/1.2, so you could bilinearly interpolate between 0/1, 1/1, 0/2, and 1/2.
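(A minimal sketch of that bilinear lookup, assuming a double-valued image and ignoring border handling. Here x is the fractional column coordinate and y the fractional row coordinate, 1-based as in MATLAB.)

function v = sample_bilinear(img, x, y)
x0 = floor(x);  y0 = floor(y);        % top-left neighbor of the sample point
dx = x - x0;    dy = y - y0;          % fractional offsets within the cell
v = (1-dx)*(1-dy)*img(y0,   x0  ) + dx*(1-dy)*img(y0,   x0+1) ...
  + (1-dx)*dy   *img(y0+1, x0  ) + dx*dy   *img(y0+1, x0+1);
end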
As for splines: there are many resources and solutions online for the 1D case; for 2D it gets a bit trickier to find helpful resources.
A simple example for the 1D case: http://www-users.cselabs.umn.edu/classes/Spring-2009/csci2031/quad_spline.pdf
Here's a great guide for the 2D case: http://en.wikipedia.org/wiki/Bicubic_interpolation
Based upon this you could derive your own spline scheme for the 2D case: define a bivariate polynomial (in x and y) and set up your constraints to solve for the coefficients of the polynomial.
Just keep in mind that the borders of adjacent spline patches have to be consistent (both in value and in derivative) to avoid ugly jumps.
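(For reference, the bicubic patch from the Wikipedia article linked above takes the form

p(x, y) = \sum_{i=0}^{3} \sum_{j=0}^{3} a_{ij} x^i y^j

with 16 coefficients a_{ij} per patch, pinned down by the function values, the two first derivatives, and the cross derivative at the four corners; matching these across patch borders is exactly the consistency requirement above.)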
Good luck!
The image resizing function provided by Emgu (a .net wrapper for OpenCV) can use any one of four interpolation methods:
CV_INTER_NN (default)
CV_INTER_LINEAR
CV_INTER_CUBIC
CV_INTER_AREA
I roughly understand linear interpolation, but can only guess what cubic or area do. I suspect NN stands for nearest neighbour, but I could be wrong.
The reason I'm resizing an image is to reduce the number of pixels (they will be iterated over at some point) whilst keeping them representative. I mention this because it seems to me that interpolation is central to this purpose, so getting the right type ought to be quite important.
My question then, is what are the pros and cons of each interpolation method? How do they differ and which one should I use?
Nearest neighbor will be as fast as possible, but you will lose substantial information when resizing.
Linear interpolation is less fast, but will not result in information loss unless you're shrinking the image (which you are).
Cubic interpolation (probably actually "bicubic") uses one of many possible formulas that incorporate multiple neighboring pixels. This is much better for shrinking images, but you are still limited as to how much shrinking you can do without information loss. Depending on the algorithm, you can probably reduce your images by 50% or 75%. The primary con of this approach is that it is much slower.
Not sure what "area" is - it may actually be "Bicubic". In all likelihood, this setting will give your best result (in terms of information loss / appearance), but at the cost of the longest processing time.
Update: this link gives more details (including a fifth type not included in your list):
http://docs.opencv.org/modules/imgproc/doc/geometric_transformations.html?highlight=resize#resize
The algorithms are: (descriptions are from the OpenCV documentation)
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over 4x4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over 8x8 pixel neighborhood
If you want more speed, use the nearest-neighbor method.
If you want to preserve the quality of the image after downsampling, you can consider using INTER_AREA interpolation, but again it depends on the image content.
You can find a detailed analysis of the speed comparison here; the linked page includes a comparison of these methods on a 400x400 px image.
The interpolation method to use depends on what you are trying to achieve:
CV_INTER_LINEAR and CV_INTER_CUBIC apply a low-pass filter (an averaging) in order to achieve a trade-off between visual quality and edge removal (low-pass filters tend to blur edges in order to reduce aliasing in images). Between these two, I'd recommend CV_INTER_CUBIC.
The CV_INTER_NN method actually is nearest neighbour; it's the most basic method and you'll get sharper edges (no low-pass filter is applied). However, this method is simply like "zooming" into the image, with no visual enhancement.
They all lose information; which one you use depends on the speed you need, how much information you can afford to lose, and the nature of your image.
Sorry, there is no single correct answer - that's why there is a choice.
I want to implement the two above-mentioned image resampling algorithms (bicubic and Lanczos) in C++. I know that there are dozens of existing implementations out there, but I still want to make my own. I want to do it partly because I want to understand how they work, and partly because I want to give them some capabilities not found in mainstream implementations (such as configurable multi-CPU support and progress reporting).
I tried reading Wikipedia, but the material is a bit too dry for me. Perhaps there are some nicer explanations of these algorithms? I couldn't find anything on SO or Google either.
Added: It seems like nobody can give me a good link on these topics. Can anyone at least try to explain them here?
The basic operating principle of both algorithms is pretty simple: they're both convolution filters. A convolution filter produces each output value by moving the convolution function's point of origin so that it is centered on the output location, multiplying all the input values by the value of the convolution function at their positions, and adding the results together.
One property of convolution is that the integral of the output is the product of the integrals of the two input functions. If you consider the input and output images, the integral corresponds to average brightness, and if you want the brightness to remain the same, the integral of the convolution function needs to be one.
One way to understand them is to think of the convolution function as something that shows how much each input pixel influences the output pixel, depending on their distance.
Convolution functions are usually defined to be zero when the distance is larger than some value, so that you don't have to consider every input value for every output value.
For Lanczos interpolation the convolution function is based on the normalized sinc function, sinc(x) = sin(pi*x)/(pi*x), windowed by a second, stretched sinc so that only the first few lobes are kept. Usually 3:
lanczos3(x) = {
    1                      if x == 0,
    0                      if abs(x) >= 3,
    sinc(x) * sinc(x/3)    otherwise
}
This function is called the filter kernel.
To resample with Lanczos, imagine you overlay the output and input over each other, with points signifying where the pixel locations are. For each output pixel location you take a box of ±3 output pixels around that point. For every input pixel that lies in that box, calculate the value of the Lanczos function at that location, with the distance from the output location (in output pixel coordinates) as the parameter. Then normalize the calculated values by scaling them so that they add up to 1. After that, multiply each input pixel value by the corresponding weight and add the results together to get the value of the output pixel.
Because the Lanczos function has the separability property and, if you are resizing, the grid is regular, you can optimize this by doing the convolution horizontally and vertically separately, and by precalculating the vertical filters for each row and the horizontal filters for each column.
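(To make that concrete, here is a 1-D sketch, in MATLAB for brevity since the C++ translation is mechanical. The function names are my own. It handles a row vector; for strong downscaling you would additionally widen the kernel by the scale factor, which is omitted here.)

function out = lanczos_resample_1d(in, outlen)
% in: row vector of samples; outlen: number of output samples.
a = 3;                                 % lobes kept on each side
n = numel(in);
scale = n / outlen;
out = zeros(1, outlen);
for k = 1:outlen
    center = (k - 0.5)*scale + 0.5;    % inverse-map the output sample to input coordinates
    idx = max(1, floor(center - a)) : min(n, ceil(center + a));
    w = lanczos3(idx - center);        % kernel value at each neighbor's distance
    w = w / sum(w);                    % normalize the weights so they sum to 1
    out(k) = sum(w .* in(idx));
end
end

function y = lanczos3(x)
% Lanczos-3 kernel: normalized sinc windowed by a stretched sinc.
y = ones(size(x));
nz = x ~= 0;
y(nz) = (sin(pi*x(nz)) ./ (pi*x(nz))) .* (sin(pi*x(nz)/3) ./ (pi*x(nz)/3));
y(abs(x) >= 3) = 0;
end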
Bicubic convolution is basically the same, with a different filter kernel function.
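(For comparison, a common bicubic kernel is Keys' cubic convolution kernel; the parameter a = -0.5 below matches the Wikipedia bicubic article. This is one of the "many possible formulas", not the only one.)

function y = cubic_kernel(x)
% Keys' cubic convolution kernel with support [-2, 2].
a = -0.5;
x = abs(x);
y = zeros(size(x));
m = x <= 1;
y(m) = (a + 2)*x(m).^3 - (a + 3)*x(m).^2 + 1;
m = x > 1 & x < 2;
y(m) = a*x(m).^3 - 5*a*x(m).^2 + 8*a*x(m) - 4*a;
end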
To get more detail, there's a pretty good and thorough explanation in the book Digital Image Processing, section 16.3.
Also, image_operations.cc and convolver.cc in Skia have a pretty well-commented implementation of Lanczos interpolation.
While what Ants Aasma says roughly describes the difference, I don't think it is particularly informative as to why you might do such a thing.
As far as links go, you are asking a very basic question in image processing, and any decent introductory textbook on the subject will describe this. If I remember correctly, Gonzalez and Woods is decent on it, but I'm away from my books and can't check.
Now on to the particulars. It should help to think about what you are doing fundamentally: you have a square lattice of measurements that you want to interpolate new values for. In the simple case of upsampling, let's imagine you want a new measurement in between every pair that you already have (e.g., doubling the resolution).
Now you won't get the "correct" value, because in general you don't have that information. So you have to estimate it. How? A very simple way would be to linearly interpolate. Everyone knows how to do this with two points: you just draw a line between them and read the new value off the line (in this case, at the halfway point).
Now an image is two-dimensional, so you really want to do this in both the left-right and up-down directions. Use the result for your estimate and voilà, you have "bilinear" interpolation.
The main problem with this is that it isn't very accurate, although it's better (and slower) than the "nearest neighbor" approach, which is also very local and fast.
To address the accuracy problem, you want something better than a linear fit of two points: you want to fit something to more data points (pixels), something that can be nonlinear. A good trade-off between accuracy and computational cost is the cubic spline. It gives you a smooth fit, and again you approximate your new "measurement" by the value the fit takes in the middle. Do this in both directions and you've got "bicubic" interpolation.
So that's more accurate, but still heavy. One way to address the speed issue is to use a convolution, which has the nice property that in the Fourier domain it's just a multiplication, so it can be implemented quite quickly. But you don't need to worry about the implementation to understand it: the convolution result at any point is the integral of the product of one function (your image) with another function of typically much smaller support (the part that is non-zero), called the kernel, after that kernel has been centered over that particular point. In the discrete world, these are just sums of products.
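(In symbols, for the discrete 1-D case:

(f * k)[n] = \sum_{m} f[m] \, k[n - m]

i.e., slide the flipped kernel over the signal and sum the pointwise products.)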
It turns out that you can design a convolution kernel that has properties quite like those of the cubic spline, and use it to get a fast "bicubic"-style interpolation.
Lanczos resampling is a similar thing, with slightly different kernel properties, which primarily means the two will have different characteristic artifacts. You can look up the details of these kernel functions easily enough (I'm sure Wikipedia has them, as does any intro text). The implementations used in graphics programs tend to be highly optimized and sometimes have specialized assumptions that make them more efficient but less general.
I would like to suggest the following article for a basic understanding of different image interpolation methods: "image interpolation via convolution". If you want to try more interpolation methods, the imageresampler is a nice open-source project to begin with.
In my opinion image interpolation can be understood from two perspectives: function fitting and convolution. For example, the spline interpolation explained in "image interpolation via convolution" is well explained from the function-fitting perspective in "Cubic interpolation".
Additionally, image interpolation is always tied to a specific application, for example image zooming, image rotation, and so on. In fact, for a specific application, image interpolation can be implemented in a smart way. For example, image rotation can be implemented via a three-shear method, and during each shearing operation a different one-dimensional interpolation algorithm can be applied.
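(For reference, the three-shear rotation mentioned above, often credited to Paeth, rests on the factorization

\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} =
\begin{pmatrix} 1 & -\tan(\theta/2) \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ \sin\theta & 1 \end{pmatrix}
\begin{pmatrix} 1 & -\tan(\theta/2) \\ 0 & 1 \end{pmatrix}

so each pass shears the image along a single axis, and a 1-D interpolator suffices per pass.)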