Pseudo code for Blockwise Non Local Means noise reduction algorithm

Pseudo code for Blockwise Non Local Means noise reduction algorithm - algorithm

I have implemented a nice algorithm ("Non Local Means") for reducing noise in image.
It is based on it's Matlab implementation.
The problem with NLMeans is that the original algorithm is slow even on compiled languages like c/c++ and i am trying to run it using scripting language.
One of best solutions is to use improved Blockwise NLMeans algorithm which is ~60-80 times faster. The problem is that the paper which describes it is written in a complex mathematical language and it's really hard for me to understand an idea and program it
(yes, i didn't learn math at college).
That is why i am desperately looking for a pseudo code of this algorithm.
(modification of original Matlab implementation would be just perfect)

I admit, I was intrigued until I saw the result – 260+ seconds on a dual core, and that doesn't assume the overhead of a scripting language, and that's for the Optimized Block Non Local Means filter.
Let me break down the math for you – My idea of pseudo-code is writing in Ruby.
Non Local Means Filtering
Assume an image that's 200 x 100 pixels (20000 pixels total), which is a pretty tiny image. We're going to have to go through 20,000 pixels and evaluate each one on the weighted average of the other 19,999 pixels: [Sorry about the spacing, but the equation is interpreted as a link without it]
NL [v] (i) = ∑ w(i,j)v(j) [j ∈ I]
where 0 ≤ w(i,j) ≤ 1 and ∑j w(i,j) = 1
Understandably, this last part can be a little confusing, but this is really nothing more than a convolution filter the size of the whole image being applied to each pixel.
Blockwise Non Local Means Filtering
The blockwise implementation takes overlapping sets of voxels (volumetric pixels - the implementation you pointed us to is for 3D space). Presumably, taking a similar approach, you could apply this to 2D space, taking sets of overlapping pixels. Let's see if we can describe this...
NL [v] (ijk) = 1/|Ai|∑ w(ijk, i)v(i)
Where A is a vector of the pixels to be estimated, and similar circumstances as above are applied.
[NB: I may be slightly off; It's been a few years since I did heavy image processing]
Algorithm
In all likelihood, we're talking about reducing complexity of the algorithm at a minimal cost to reduction quality. The larger the sample vector, the higher the quality as well as the higher the complexity. By overlapping then averaging the sample vectors from the image then applying that weighted average to each pixel we're looping through the image far fewer times.
Loop through the image to collect the sample vectors and store their weighted average to an array.
Apply each weighted average (a number between 0 and 1) to each pixel times the pixels value.
Pretty simple, but the processing time is going to be horrid with larger images.
Final Thoughts
You're going to have to make some tough decisions. If you're going to use a scripting language, you're already dealing with significant interpretive overhead. It's far from optimal to use a scripting language for heavy duty image processing. If you're not processing medical images, in all likelihood, there are far better algorithms to use with lesser O's.
Hope this is helpful... I'm not terribly good at making a point clearly and concisely, so if I can clarify anything, let me know.

Related

Should one calculate QR decomposition before Least Squares to speed up the process?

I am reading the book "Introduction to linear algebra" by Gilbert Strang. The section is called "Orthonormal Bases and Gram-Schmidt". The author several times emphasised the fact that with orthonormal basis it's very easy and fast to calculate Least Squares solution, since Qᵀ*Q = I, where Q is a design matrix with orthonormal basis. So your equation becomes x̂ = Qᵀb.
And I got the impression that it's a good idea to every time calculate QR decomposition before applying Least Squares. But later I figured out time complexity for QR decomposition and it turned out to be that calculating QR decomposition and after that applying Least Squares is more expensive than regular x̂ = inv(AᵀA)Aᵀb.
Is that right that there is no point in using QR decomposition to speed up Least Squares? Or maybe I got something wrong?
So the only purpose of QR decomposition regarding Least Squares is numerical stability?

There are many ways to do least squares; typically these vary in applicability, accuracy and speed.
Perhaps the Rolls-Royce method is to use SVD. This can be used to solve under-determined (fewer obs than states) and singular systems (where A'*A is not invertible) and is very accurate. It is also the slowest.
QR can only be used to solve non-singular systems (that is we must have A'*A invertible, ie A must be of full rank), and though perhaps not as accurate as SVD is also a good deal faster.
The normal equations ie
compute P = A'*A
solve P*x = A'*b
is the fastest (perhaps by a large margin if P can be computed efficiently, for example if A is sparse) but is also the least accurate. This too can only be used to solve non singular systems.
Inaccuracy should not be taken lightly nor dismissed as some academic fanciness. If you happen to know that the problems ypu will be solving are nicely behaved, then it might well be fine to use an inaccurate method. But otherwise the inaccurate routine might well fail (ie say there is no solution when there is, or worse come up with a totally bogus answer).
I'm a but confused that you seem to be suggesting forming and solving the normal equations after performing the QR decomposition. The usual way to use QR in least squares is, if A is nObs x nStates:
decompose A as A = Q*(R )
(0 )
transform b into b~ = Q'*b
(here R is upper triangular)
solve R * x = b# for x,
(here b# is the first nStates entries of b~)

Multichannel blind deconvolution in the simplest formulation: how to solve?

Recently I began to study deconvolution algorithms and met the following acquisition model:
where f is the original (latent) image, g is the input (observed) image, h is the point spread function (degradation kernel), n is a random additive noise and * is the convolution operator.
If we know g and h, then we can recover f using Richardson-Lucy algorithm:
where , (W,H) is the size of rectangular support of h and multiplication and division are pointwise. Simple enough to code in C++, so I did just so. It turned out that approximates to f while i is less then some m and then it starts rapidly decay. So the algorithm just needed to be stopped at this m - the most satisfactory iteration.
If the point spread function g is also unknown then the problem is said to be blind, and the modification of Richardson-Lucy algorithm can be applied:
For initial guess for f we can take g, as before, and for initial guess for h we can take trivial PSF, or any simple form that would look similar to observed image degradation. This algorithm also works quit fine on the simulated data.
Now I consider the multiframe blind deconvolution problem with the following acquisition model:
Is there a way to develop Richardson-Lucy algorithm for solving the problem in this formulation? If no, is there any other iterative procedure for recovering f, that wouldn't be much more complicated than the previous ones?

According to your acquisition model, latent image (f) remains same while the observed images are different due to different psf and noise models. One way to look at it, is a motion-blur problem where a sharp and noise-free image(f) is corrupted by the motion blur kernel. As this is an ill-posed problem, in most of the literature it's solved iteratively by estimating the blur kernel and the latent image. The way you solve this depends entirely on your objective function.
For example in some papers IRLS is used to estimate the blur kernel. You can find a lot of literature on this.
If you want to use Richardson Lucy Blind deconvolution, then use it on just one frame.
One strategy can be in each iteration while recovering f, assign different weights for contribution from each g(observed images). You can incorporate different weights in the objective function or calculate them according to the estimated blur kernel.

Is there a way to develop Richardson-Lucy algorithm for solving the problem in this formulation?
I'm not a specialist in this area, but I don't think that such way to construct an algorithm exists, at least not straightforwardly. Here is my argument for this. The first problem you described (when the psf is known) is already ill-posed due to the random nature of the noise and loss of information about convolution near image edges. The second problem on your list — single-channel blind deconvolution — is the extention of the previous one. In this case in addition it's underdetermined, so the ill-posedness expands, and so it's natural that the method to solve this problem is developed from the method for solving the first problem. Now when we consider the multichannel blind deconvolution formulation, we add a bunch of additional information to our previous model and so the problem goes from underdetermined to overdetermined. This is the whole other kind of ill-posedness and hence different approaches to solution are required.
is there any other iterative procedure for recovering f, that wouldn't be much more complicated than the previous ones?
I can recommend the algorithm introduced by Šroubek and Milanfar in [1]. I'm not sure whether it's much more complicated on your opinion or not so much, but it's by far one of the most recent and robust. The formulation of the problem is precisely the same as you wrote. The algorithm takes as input K>1 number of images, the upper bound of the psf size L, and four tuning parameters: alpha, beta, gamma, delta. To specify gamma, for example, you will need to estimate the variance of the noise on your input images and take the largest variance var, then gamma = 1/var. The algorithm solves the following optimization problem using alternating minimization:
where F is the data fidelity term and Q and R are regularizers of the image and blurs, respectively.
For detailed analysis of the algorithm see [1], for a collection of different deconvolution formulation and their solutions see [2]. Hope it helps.
Referenses:
Filip Šroubek, Peyman Milanfar. —- Robust Multichannel Blind Deconvolution via Fast Alternating Minimization.
-— IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 4, APRIL 2012
Patrizio Campisi, Karen Egiazarian. —- Blind Image Deconvolution: Theory and Applications

SGM Disparity subpixel estimation - how to?

Some weeks ago I've implemented a simple block matching stereo algorithm but the results had been bad. So I've searched on the Internet to find better algorithms. There I found the semi global matching (SGM), published by Heiko Hirschmueller. It gets one of the best results in relation to its processing time.
I've implemented the algorithm and got really good results (compared to simple block matching) as you can see here:
I've reprojected the 2D points to 3D by using the calculated disparity values with the following result
At the end of SGM I have an array with aggregated costs for each pixel. The disparity is equivalent to the index with the lowest cost value.
The problem is, that searching for the minimum only returns discrete values. This results in individually layers in the point-cloud. In other words: Round surfaces are cut into many layers (see point cloud).
Heiko mentioned in his paper, that it would be easy to get sub-pixel accuracy by fitting a polynomial function into the cost array and take the lowest point as disparity.
The problem is not bound to stereo vision, so in other words the task is the following:
given: An array of values, representing a polynomial function.
wanted: The lowest point of the polynomial function.
I don't have any idea how to do this. I need a fast algorithm, because I have to run this code for every pixel in the Image
For example: 500x500 Pixel with 60-200 costs each => Algorithm has to run 15000000-50000000 times!!).
I don't need a real time solution! My current SGM implementation (L2R and R2L matching, no cuda or multi-threading yet) takes about 20 seconds to process an image with 500x500 pixels ;).
I don't ask for libraries! I try to implement my own independent computer vision library :).
Thank you for your help!
With kind regards,
Andreas

Finding the exact lowest point in a general polynomial is a hard problem, since it is equivalent to finding the root of the derivative of the polynomial. In particular, if your polynomial is of degree 6, the derivative is a quintic polynomial, which is known not to be solvable by radical. You therefore need to either: fit the function using restricted families for which computing the roots of the derivatives e.g. the integrals of prod_i(x-ri)p(q) where deg(p)<=4, OR
using an iterative method to find an APPROXIMATE minimum, (newton's method, gradient descent).

What do you think of this interest point detection algorithm?

I've been trying to come up with an interest point detection algorithm and this is what I came up with:
You go through the X and the Y axises 3n pixels at a time creating 3n x 3n squares.
For the the n x n square in the middle of the 3n x 3n square (let's call it square Z), the R, G, and B values are averaged and rounded to preset values to limit the number of colors, and that is the color that square will be treated as.
The same is done for the 8 surrounding n x n squares.
After that, the color of square Z is compared to the surrounding squares, if it matches x out of the 8 surrounding squares where x <= 3 or x => 5 then that is an interest point (a corner is detected).
And so on till all the image is covered.
The bigger n is, the faster the image will be scanned and the the less accurate the detection is, and vice versa.
This, supposedly, detects "literal corners", that is corners you can actually SEE on the image.
What do you think of this algorithm? Is it efficient? Can it be used on a live video stream (say from the camera) on a hand-held device?

I'm sorry to say that I don't think this is likely to be very good. Your algorithm looks a bit like a simplistic version of Moravec's algorithm, which is itself one of the simplest corner detection algorithms. The hardcoded limits you test against effectively make your edge test a stepped function, unlike an approach such as summed square differences. This will almost certainly give you discontinuities in your detection function (corners that don't match when they should have), for some values.
You also have the same problem as Moravec, namely that if the edge lies at an angle to the direction of neighbours being considered, then it won't be detected.
Developing algorithms is fun, and if this isn't a business-critical project, then by all means, carry on tinkering and experimenting (and don't be put off by my comments!). But the fact is, for almost any practical problem, a better algorithm for the task you want to solve almost certainly already exists. The real challenge is identifying how you can best model your problem in such a way that you can solve it using an existing, well-understood approach, designed by experts.
In particular, robust identification and analysis of edge-cases and worst-case runtimes is a tricky business; unless you are a professional algorist, you are likely to find the going difficult. But I certainly encourage you to discover this for yourself by trying. nlucaroni mentions some excellent questions to use as starting points for your analysis.

Why not try it and see if it works the way you expect? It sounds like it should. How does the performance compare with other methods? What is the complexity of the algorithm? Is it efficient compared to others? Where can it be improved? What kind of false-positives and false negatives are expected? Are they within reason based on the data I plan to use this on? What threshold should be used to compare surrounding squares? ....
this is stuff you should be doing, not us.

I would suggest you look at the SIFT algorithm. Its the defacto standard for points of interest in an image. Unfortunately, its also patented, because its so good.
If you are interested in a real time version of SIFT you can get it to run on a GPU, but its highly experimental at this point. Note if you are developing a commercial application you'd have to first purchase a license for using SIFT or get approval from David Lowe.

How to 'smooth' data and calculate line gradient?

I'm reading data from a device which measures distance. My sample rate is high so that I can measure large changes in distance (i.e. velocity) but this means that, when the velocity is low, the device delivers a number of measurements which are identical (due to the granularity of the device). This results in a 'stepped' curve.
What I need to do is to smooth the curve in order to calculate the velocity. Following that I then need to calculate the acceleration.
How to best go about this?
(Sample rate up to 1000Hz, calculation rate of 10Hz would be ok. Using C# in VS2005)

The wikipedia entry from moogs is a good starting point for smoothing the data. But it does not help you in making a decision.
It all depends on your data, and the needed processing speed.
Moving Average
Will flatten the top values. If you are interrested in the minimum and maximum value, don't use this. Also I think using the moving average will influence your measurement of the acceleration, since it will flatten your data (a bit), thereby acceleration will appear to be smaller. It all comes down to the needed accuracy.
Savitzky–Golay
Fast algorithm. As fast as the moving average. That will preserve the heights of peaks. Somewhat harder to implement. And you need the correct coefficients. I would pick this one.
Kalman filters
If you know the distribution, this can give you good results (it is used in GPS navigation systems). Maybe somewhat harder to implement. I mention this because I have used them in the past. But they are probably not a good choice for a starter in this kind of stuff.
The above will reduce noise on your signal.
Next you have to do is detect the start and end point of the "acceleration". You could do this by creating a Derivative of the original signal. The point(s) where the derivative crosses the Y-axis (zero) are probably the peaks in your signal, and might indicate the start and end of the acceleration.
You can then create a second degree derivative to get the minium and maximum acceleration itself.

You need a smoothing filter, the simplest would be a "moving average": just calculate the average of the last n points.
The question here is, how to determine n, can you tell us more about your application?
(There are other, more complicated filters. They vary on how they preserve the input data. A good list is in Wikipedia)
Edit!: For 10Hz, average the last 100 values.

Moving averages are generally terrible - but work well for white noise. Both moving averages & Savitzky-Golay both boil down to a correlation - and therefore are very fast and could be implemented in real time. If you need higher order information like first and second derivatives - SG is a good right choice. The magic of SG lies in the constant correlation coefficients needed for the filter - once you have decided the length and degree of polynomial to fit locally, the coefficients need only to be found once. You can compute them using R (sgolay) or Matlab.
You can also estimate a noisy signal's first derivative via the Savitzky-Golay best-fit polynomials - these are sometimes called Savitzky-Golay derivatives - and typically give a good estimate of the first derivative.
Kalman filtering can be very effective, but it's heavier computationally - it's hard to beat a short convolution for speed!
Paul
CenterSpace Software

In addition to the above articles, have a look at Catmull-Rom Splines.

You could use a moving average to smooth out the data.

In addition to GvSs excellent answer above you could also consider smoothing / reducing the stepping effect of your averaged results using some general curve fitting such as cubic or quadratic splines.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio