Image Processing: Fastest direct-neighbor pixel comparison - algorithm

I have a written a code, that takes the difference of intensities of neighbor pixels and gets the maximum difference. However I would like some thoughts on how to implement my "algorithm" faster. Till now I resorted to switchand ifstatements.
my code is simple yet messy. There is the thoughts behind it:
go to my point of interest
identify the pixels in its direct neighborhood
calculate the difference of intensities
compare the calculated intensities and deduce the maximum
take the maximum to etc...
That lead me to multiple switch and if statements. Do you have any thoughts on that ?

You can see OpenCv library, not need write this code this library have this and many other function. Read this:
http://www.seas.upenn.edu/~bensapp/opencvdocs/ref/opencvref_cv.htm

Related

standard deviation of a 2D array but with a twist

I'm working on light propagation and I compute PSFs for multifocal systems. I need a way to estimate the quality of each z-slice that I compute. For instance :
This could consider a focal point since all light is condense in a small region
Where this is definitely not one:
I'm looking for a way to express the quality of each slice. I thought about looking for the standard deviation of the best gaussian surface that fit the slice. For that I looked into numpy.std but it's clearly that std that I'm looking for.
I also looked into something called scipy.stats.kurtosis it's interesting but it's not perfect from my test and in the future I will need to apply this computation for the PSF and for the FTM (Fourier transform of the PSF).
I have a precise size for each pixel and I want the standard deviation like the width of the surface that fit my dataset at half height.
I looked into gaussian regression but it's way too long for each slice. I'm sure it exist a somehow simple process to compute this std (if it's actually named like that).
This is written in python but I intentionnaly don't put the tag because I'm sure people from other language could help with this question as well.

Recognize recurring images into a larger one

Edit: this is not a duplicate of Determine if an image exists within a larger image, and if so, find it, using Python since I do not know the pattern beforehand
Suppose I have a big image (usually a picture taken with a camera so it might be a bit noisy, but let's assume it's not for now) made up of multiple smaller images all equal among themselves, something like
I need to find the contour of each one of those. The first step is recognizing that there's a recurring image (or unknown pattern) in the 2D image. How can I achieve this first step?
I did read around that I might use a FFT of the original image and search for duplicate frequencies, would that be a feasible approach?
To build a bit on the problem: I do not know the image beforehand, nor its size or how many will there be on the big image. The images can be shot from camera so they might be noisy. The images won't overlap.
You can try to use described keypoints (Sift/SURF/ORB/etc.) to find features in the image and try to detect the same features in the image.
You can see such a result in How to find euclidean distance between keypoints of a single image in opencv where 3x the same image is present and features are detected and linked between those subimages automatically.
In your image the result looks like
so you can see that the different occurances of the same pattern is indeed automatically detected and linked.
Next steps would be to group features to objects, so that the "whole" pattern can be extracted. Once you have a candidate for a pattern, you can extract a homography for each occurance of the pattern (with one reference candidate pattern) to verify that it is a pattern. One open problem is how to find such candidates. Maybe it is worth trying to find "parallel features", so keypoint matches that have parallel lines and/or same length lines (see image). Or maybe there is some graph theory approach.
All in all, this whole approach will have some advantages and disadvantes:
Advantages:
real world applicability - Sift and other keypoints are working quite well even with noise and some perspective effects, so chances are increased to find such patterns.
Disadvantages
slow
parametric (define what it means that two features are successfully
matched)
not suitable for all kind of patterns - your pattern must have some extractable keypoints
Those are some thoughts and probably not complete ;)
Unfortunately no full code yet for your concrete task, but I hope the idea is clear.
For such a clean image, it suffices to segment the patterns by blob analysis and to compare the segments or ROI that contain them. The size is a first matching criterion. The SAD, SSD or correlation similarity scores can do finer comparison.
In practice you will face more difficulties such as
not possible to segment the patterns
geometric variations in size/orientation
partial occlusion
...
Handling these is out of the scope of this answer; it makes things much harder than in the "toy" case.
The goal is to find several equal or very similar patterns which are not known before in a picture. As it is this problem is still a bit ill posed.
Are the patterns exactly equal or only similar (added noise maybe)?
Do you want to have the largest possible patterns or are smaller subpatterns okay too or are all possible patterns needed? Reason is that of course each pattern could consist of equal patterns too.
Is the background always that simple (completely white) or can it be much more difficult? What do we know about it?
Are the patterns always equally oriented, equally scaled, non-overlapping?
For the simple case of non-overlapping patterns with simple background, the answer of Yves Daoust using segmentation is well performing but fails if patterns are very close or overlapping.
For other cases the idea of the keypoints by Micka will help but might not perform well if there is noise or might be slow.
I have one alternative: look at correlations of subblocks of the image.
In pseudocode:
Divide the image in overlapping areas of size MxN for a suitable M,N (pixel width and height chosen to be approximately the size of the desired pattern)
Correlate each subblock with the whole image. Look for local maxima in the correlation. The position of these maxima denotes the position of similar regions.
Choose a global threshold on all correlations (smartly somehow) and find sets of equal patterns.
Determine the fine structure of these patterns by shanging the shape from rectangular (bounding box) to a more sophisticaed shape (maybe by looking at the shape of the peaks in the correlation)
In case the approximate size of the desired patterns is not known before, try with large values of M, N and go down to smaller ones.
To speed up the whole process start on a coarse scale (downscaled version of the image) and then process finer scales only where needed. Needs balancing of zooming in and performing correlations.
Sorry, I cannot make this a full Matlab project right now, but I hope this helps you.

Finding the angle of stripeline/ Angle of rotation

So I’m trying to find the rotational angle for stripe lines in images like the attached photo.
The only assumption is that the lines are parallel, and their orientation is about 90 degrees approximately more or less [say 5 degrees tolerance].
I have to make sure the stripe lines in the result image will be %100 vertical. The quality of the images varies as well as their histogram/greyscale values. So methods based on non-adaptive thresholding already failed for my cases [I’m not interested in thresholding based methods if I cannot make it adaptive]. Also, there are some random black clusters on top of the stripe lines sometimes.
What I did so far:
1) Of course HoughLines is the first option, but I couldn’t make it work for all my images, I had some partial success though following this great article:
http://felix.abecassis.me/2011/09/opencv-detect-skew-angle/.
The main reason of failure to my understanding was that, I needed to fine tune the parameters for different images. Parameters such as Canny/BW/Morphological edge detection (If needed) | parameters for minLinelength/maxLineGap/etc. For sure there’s a way to hack into this and make it work, but, to me this is a fragile solution!
2) What I’m working on right now, is to divide the image to a top slice and a bottom slice, then find the peaks and valleys of each slice. Then basically find the angle using the width of the image and translation of peaks. I’m currently working on finding which peak of the top slice belongs to which of the bottom slice, since there will be some false positive peaks in my computation due to existence of black/white clusters on top of the strip lines.
Example: Location of peaks for slices:
Top Slice = { 1, 33,67,90,110}
BottomSlice = { 3, 14, 35,63,90,104}
I am actually getting similar vectors when extracting peaks. So as can be seen, the length of vector might vary, any idea how can I get a group like:
{{1,3},{33,35},{67,63},{90,90},{110,104}}
I’m open to any idea about improving any of these algorithms or a completely new approach. If needed, I can upload more images.
If you can get a list of points for a single line, a linear regression will give you a formula for the straight line that best fits the points. A simple trig operation will convert the line formula to an angle.
You can probably use some line thinning operation to turn the stripes into a list of points.
You can run an accumulator of spatial derivatives along different angles. If you want a half-degree precision and a sample of 5 lines, you have a maximum 10*5*1500 = 7.5m iterations. You can safely reduce the sampling rate along the line tenfold, which will give you a sample size of 150 points per sample, reducing the number of iterations to less than a million. Somewhere around that point the operation of straightening the image ought to become the bottleneck.

What type of smoothing to use?

Not sure if this may or may not be valid here on SO, but I was hoping someone can advise of the correct algorithm to use.
I have the following RAW data.
In the image you can see "steps". Essentially I wish to get these steps, but then get a moving average of all the data between. In the following image, you can see the moving average:
However you will notice that at the "steps", the moving average decreases the gradient where I wish to keep the high vertical gradient.
Is there any smoothing technique that will take into account a large vertical "offset", but smooth the other data?
Yup, I had to do something similar with images from a spacecraft.
Simple technique #1: use a median filter with a modest width - say about 5 samples, or 7. This provides an output value that is the median of the corresponding input value and several of its immediate neighbors on either side. It will get rid of those spikes, and do a good job preserving the step edges.
The median filter is provided in all number-crunching toolkits that I know of such as Matlab, Python/Numpy, IDL etc., and libraries for compiled languages such as C++, Java (though specific names don't come to mind right now...)
Technique #2, perhaps not quite as good: Use a Savitzky-Golay smoothing filter. This works by effectively making least-square polynomial fits to the data, at each output sample, using the corresponding input sample and a neighborhood of points (much like the median filter). The SG smoother is known for being fairly good at preserving peaks and sharp transistions.
The SG filter is usually provided by most signal processing and number crunching packages, but might not be as common as the median filter.
Technique #3, the most work and requiring the most experience and judgement: Go ahead and use a smoother - moving box average, Gaussian, whatever - but then create an output that blends between the original with the smoothed data. The blend, controlled by a new data series you create, varies from all-original (blending in 0% of the smoothed) to all-smoothed (100%).
To control the blending, start with an edge detector to detect the jumps. You may want to first median-filter the data to get rid of the spikes. Then broaden (dilation in image processing jargon) or smooth and renormalize the the edge detector's output, and flip it around so it gives 0.0 at and near the jumps, and 1.0 everywhere else. Perhaps you want a smooth transition joining them. It is an art to get this right, which depends on how the data will be used - for me, it's usually images to be viewed by Humans. An automated embedded control system might work best if tweaked differently.
The main advantage of this technique is you can plug in whatever kind of smoothing filter you like. It won't have any effect where the blend control value is zero. The main disadvantage is that the jumps, the small neighborhood defined by the manipulated edge detector output, will contain noise.
I recommend first detecting the steps and then smoothing each step individually.
You know how to do the smoothing, and edge/step detection is pretty easy also (see here, for example). A typical edge detection scheme is to smooth your data and then multiply/convolute/cross-corelate it with some filter (for example the array [-1,1] that will show you where the steps are). In a mathematical context this can be viewed as studying the derivative of your plot to find inflection points (for some of the filters).
An alternative "hackish" solution would be to do a moving average but exclude outliers from the smoothing. You can decide what an outlier is by using some threshold t. In other words, for each point p with value v, take x points surrounding it and find the subset of those points which are between v - t and v + t, and take the average of these points as the new value of p.

Where can I find a good read about bicubic interpolation and Lanczos resampling?

I want to implement the two above mentioned image resampling algorithms (bicubic and Lanczos) in C++. I know that there are dozens of existing implementations out there, but I still want to make my own. I want to make it partly because I want to understand how they work, and partly because I want to give them some capabilities not found in mainstream implementations (like configurable multi-CPU support and progress reporting).
I tried reading Wikipedia, but the stuff is a bit too dry for me. Perhaps there are some nicer explanations of these algorithms? I couldn't find anything either on SO or Google.
Added: Seems like nobody can give me a good link about these topics. Can anyone at least try to explain them here?
The basic operation principle of both algorithms is pretty simple. They're both convolution filters. A convolution filter that for each output value moves the convolution functions point of origin to be centered on the output and then multiplies all the values in the input with the value of the convolution function at that location and adds them together.
One property of convolution is that the integral of the output is the product of the integrals of the two input functions. If you consider the input and output images, then the integral means average brightness and if you want the brightness to remain the same the integral of the convolution function needs to add up to one.
One way how to understand them is to think of the convolution function as something that shows how much input pixels influence the output pixel depending on their distance.
Convolution functions are usually defined so that they are zero when the distance is larger than some value so that you don't have to consider every input value for every output value.
For lanczos interpolation the convolution function is based on the sinc(x) = sin(x*pi)/x function, but only the first few lobes are taken. Usually 3:
lanczos(x) = {
0 if abs(x) > 3,
1 if x == 0,
else sin(x*pi)/x
}
This function is called the filter kernel.
To resample with lanczos imagine you overlay the output and input over eachother, with points signifying where the pixel locations are. For each output pixel location you take a box +- 3 output pixels from that point. For every input pixel that lies in that box, calculate the value of the lanczos function at that location with the distance from the output location in output pixel coordinates as the parameter. You then need to normalize the calculated values by scaling them so that they add up to 1. After that multiply each input pixel value with the corresponding scaling value and add the results together to get the value of the output pixel.
Because lanzos function has the separability property and, if you are resizing, the grid is regular, you can optimize this by doing the convolution horizontally and vertically separately and precalculate the vertical filters for each row and horizontal filters for each column.
Bicubic convolution is basically the same, with a different filter kernel function.
To get more detail, there's a pretty good and thorough explanation in the book Digital Image Processing, section 16.3.
Also, image_operations.cc and convolver.cc in skia have a pretty well commented implementation of lanczos interpolation.
While what Ants Aasma says roughly describes the difference, I don't think it is particularly informative as to why you might do such a thing.
As far as links go, you are asking a very basic question in image processing, and any decent introductory textbook on the subject will describe this. If I remember correctly, Gonzales and Woods is decent on it, but I'm away from my books and can't check.
Now on to the particulars, it should help to think about what you are doing fundamentally. You have a square lattice of measurements that you want to interpolate new values for. In the simple case of upsampling, lets imagine you want a new measurement in between every one that you already have (e.g. double the resolution).
Now you won't get the "correct" value, because in general you don't have that information. So you have to estimate it. How to do this? A very simple way would be to linearly interpolate. Everyone knows how to do this with two points, you just draw a line between them, and read the new value off the line (in this case, at the half way point).
Now an image is two dimensional, so you really want to do this in both the left-right and up-down directions. Use the result for your estimate and voila you have "bilinear" interpolation.
The main problem with this is that it isn't very accurate, although it's better (and slower) than the "nearest neighbor" approach which is also very local and fast.
To address the first problem, you want something better than a linear fit of two points, you want to fit something to more data points (pixels), and something that can be nonlinear. A good trade off on accuracy and computational cost is something called a cubic spline. So this will give you a smooth fit line, and again you approximate your new "measurement" by the value it takes in the middle. Do this in both directions and you've got "bicubic" interpolation.
So that's more accurate, but still heavy. One way to address the speed issue is to use a convolution, which has the nice property that in the Fourier domain, it's just a multiplication, so we can implement it quite quickly. But you don't need to worry about the implementation to understand that the convolution result at any point is one function (your image) being integrated in product another, typically much smaller support (the part that is non-zero) function called the kernel), after that kernel has been centered over that particular point. In the discrete world, these are just sums of the products.
It turns out that you can design a convolution kernel that has properties quite like the cubic spline, and use that to get a fast "bicubic"
Lancsoz resampling is a similar thing, with slightly different properties in the kernel, which primarily means they will have different characteristic artifacts. You can look up the details of these kernel functions easily enough (I'm sure wikipedia has them, or any intro text). The implementations used in graphics programs tend to be highly optimized and sometimes have specialized assumptions which make them more efficient but less general.
I would like suggest the following article for a basic understanding of different image interpolation methods image interpolation via convolution. If you want to try more interpolation methods, the imageresampler is a nice open source project to begin with.
In my opinion image interpolation can be understood from two aspects, one is from function fitting perspective, and one is from convolution perspective. For example, the spline interpolation explained in image interpolation via convolution is well explained from function fitting perspective in Cubic interpolation.
Additionally, image interpolation is always related to a specific application, for example image zooming, image rotation and so on. In fact for a specific application, image interpolation can be implemented i.n a smart way. For example, image rotation can be implemented via a three-shearing method, and during each shearing operation different one-dimension interpolation algorithms can be implemented.

Resources