cv::undistortPoints() - iterative algorithm explanation - algorithm

I'm trying to understand the logic behind the OpenCV's cv::undisortPoints()' iterative approximation algorithm.
The implementation is available at:
https://github.com/Itseez/opencv/blob/master/modules/imgproc/src/undistort.cpp (lines 361-368).
The way I see it:
using last best guessed pixel position (x, y), try to find better guess by applying inverse of the 'distortion at current best guess', and adjust the pixel position in regard to the initial distorted position (x0, y0)
use initial distorted position (x0, y0) as a first 'best guess'
But the above doesn't really tell why this can be done...
One of the users posted (here: Understanding of openCV undistortion) that this is a kind of "non-linear solving algorithm (e.g. Newton's method, Levenberg-Marquardt algorithm, etc)". And from what I've seen there are at least a few possible solutions to this kind of undistorting problem.
Questions:
What iterative algorithm exactly is implemented in cv::undistortPoints()?
Is there any white paper showing (and [what's more important] explaining 'like I'm five') the idea behind it?
How do we know that this algorithm will converge (at least to the local minimum)?
Why do we do the correction in regard to the initial position (x0, y0)?

It uses the false position ("regula falsi") method. I have not seen a proof that the sequence converges for this particular equation, regardless of the choice of distortion parameters (or even for every choice of "physically plausible" parameters). It'd be very easy to write one for a few special cases, e.g. physical pure 2nd-order barrel distortion.
In practice it seems to work well. If you feel uncomfortable with it, you can always replace with the equation solver of your choice. For pure radial distortion of any order (i.e. with a single unknown), you can use any polynomial equation solver, e.g. good old SLATEC's rpqr79.

Related

Upgrading a binary search algorithm to something more sophisticated

I solved an analytically unsolvable problem with numerical methods. I am searching for X, based on a desired Y value. f(x)=y is possible, x=f^-1(y) is not.
Currently the algorithm does a binary search. It starts at X=50%, calculates Y, returns Y_err=Y-Y_demand. It keeps stepping by intervals of 5% in the direction of shrinking Y_err, until Y_err changes sign, then it reduces the step, and steps in the opposite direction. This works, but it's embarassingly slow & inefficient.
Below, an example chart of x=f^-1(y). I chose one with high coefficients for the nonlinear part.
Example chart of x=f^-1(y)
It varies depending on coefficients, but always has this pseudoparabolic shape. It's of course nonlinear and even 9th order polynomial approximations don't offer satisfactory precision.
For simplicity's sake let's say the inflecton point is at X=50%, and am looking only for solutions where X>50%.
How should I proceed? I'm looking to optimise as much as possible. What are some good algorithms? Thanks.
EDIT: Thank you for pointing out that this is not in fact a binary search. I've updated the code and now have much better results by comparison.
I'm not sure if Newton's method applies here, or at least I don't know how to apply it. One-way trial and error is all I can do. When I have some more time I will try to learn and implement regula falsi.

what is the numerical method to guarantee safer matrix inversion?

I am trying to develop an algorithm (in the framework of gradient descent )for an SEM(structural equation model) problem.There is a parameter matrix B(n*n) with all its diagonal elements fixed to be zero.And a term of inv(I-B) (inversion of I - B) in my objective function.There is no other constraint such as symmetry on B.
My question is that how can we make sure (I-B) is not singular in the iterations?
In this problem,because the domain of the objective function is not the whole R^n space,it seems that the strict conditions for the convergence of gradient descent will be not satisfied.Standard textbooks will assume the objective to have a domain in the whole R^n space.It seems that gradient descent will not have a guaranteed convergence.
In the update of the iterative algorithms,currently my implementation is that checking whether (I-B) is close to singular, then if it is not, the step size of the gradient descent will be shrunk.Is there any better numerical approach to dealing with this problem?
You can try putting a logarithmic barrier on det(I-B)>0 or det(I-B)<0 depending on which gives you a better result, or if you have more info about your problem . the gradient of logdet is quite nice https://math.stackexchange.com/questions/38701/how-to-calculate-the-gradient-of-log-det-matrix-inverse
You can also compute the Fenchel dual so you can potentially use a primal-dual approach.

How to simplify a spline?

I have an interesting algorithmic challenge in a project I am working on. I have a sorted list of coordinate points pointing at buildings on either side of a street that, sufficiently zoomed in, looks like this:
I would like to take this zigzag and smooth it out to linearize the underlying street.
I can think of a couple of solutions:
Calculate centroids using rolling averages of six or so points, and use those.
Spline regression.
Is there a better or best way to approach this problem? (I am using Python 3.5)
Based on your description and your comments, you are looking for a line simplification algorithms.
Ramer-Doublas algorithm (suggested in the comment) is most probably the most well-known algorithm in this family, but there are many more.
For example Visvalingam’s algorithm works by removing the point with the smallest change, which is calculated by the smallest square of the triangle. This makes it super easy to code and intuitively understandable. If it is hard to read research paper, you can read this easy article.
Other algorithms in this family are:
Opheim
Lang
Zhao
Read about them, understand what are they trying to minify and select the most suitable for you.
Dali's post correctly surmises that a line simplification algorithm is useful for this task. Before posting this question I actually examined a few such algorithms but wasn't quite comfortable with them because even though they resulted in the simplified geometry that I liked, they didn't directly address the issue I had of points being on either side of the feature and never in the middle.
Thus I used a two-step process:
I computed the centroids of the polyline by using a rolling average of the coordinates of the five surrounding points. This didn't help much with smoothing the function but it did mostly succeed in remapping them to the middle of the street.
I applied Visvalingam’s algorithm to the new polyline, with n=20 points specified (using this wonderful implementation).
The result wasn't quite perfect but it was good enough:
Thanks for the help everyone!

Multichannel blind deconvolution in the simplest formulation: how to solve?

Recently I began to study deconvolution algorithms and met the following acquisition model:
where f is the original (latent) image, g is the input (observed) image, h is the point spread function (degradation kernel), n is a random additive noise and * is the convolution operator.
If we know g and h, then we can recover f using Richardson-Lucy algorithm:
where , (W,H) is the size of rectangular support of h and multiplication and division are pointwise. Simple enough to code in C++, so I did just so. It turned out that approximates to f while i is less then some m and then it starts rapidly decay. So the algorithm just needed to be stopped at this m - the most satisfactory iteration.
If the point spread function g is also unknown then the problem is said to be blind, and the modification of Richardson-Lucy algorithm can be applied:
For initial guess for f we can take g, as before, and for initial guess for h we can take trivial PSF, or any simple form that would look similar to observed image degradation. This algorithm also works quit fine on the simulated data.
Now I consider the multiframe blind deconvolution problem with the following acquisition model:
Is there a way to develop Richardson-Lucy algorithm for solving the problem in this formulation? If no, is there any other iterative procedure for recovering f, that wouldn't be much more complicated than the previous ones?
According to your acquisition model, latent image (f) remains same while the observed images are different due to different psf and noise models. One way to look at it, is a motion-blur problem where a sharp and noise-free image(f) is corrupted by the motion blur kernel. As this is an ill-posed problem, in most of the literature it's solved iteratively by estimating the blur kernel and the latent image. The way you solve this depends entirely on your objective function.
For example in some papers IRLS is used to estimate the blur kernel. You can find a lot of literature on this.
If you want to use Richardson Lucy Blind deconvolution, then use it on just one frame.
One strategy can be in each iteration while recovering f, assign different weights for contribution from each g(observed images). You can incorporate different weights in the objective function or calculate them according to the estimated blur kernel.
Is there a way to develop Richardson-Lucy algorithm for solving the problem in this formulation?
I'm not a specialist in this area, but I don't think that such way to construct an algorithm exists, at least not straightforwardly. Here is my argument for this. The first problem you described (when the psf is known) is already ill-posed due to the random nature of the noise and loss of information about convolution near image edges. The second problem on your list — single-channel blind deconvolution — is the extention of the previous one. In this case in addition it's underdetermined, so the ill-posedness expands, and so it's natural that the method to solve this problem is developed from the method for solving the first problem. Now when we consider the multichannel blind deconvolution formulation, we add a bunch of additional information to our previous model and so the problem goes from underdetermined to overdetermined. This is the whole other kind of ill-posedness and hence different approaches to solution are required.
is there any other iterative procedure for recovering f, that wouldn't be much more complicated than the previous ones?
I can recommend the algorithm introduced by Šroubek and Milanfar in [1]. I'm not sure whether it's much more complicated on your opinion or not so much, but it's by far one of the most recent and robust. The formulation of the problem is precisely the same as you wrote. The algorithm takes as input K>1 number of images, the upper bound of the psf size L, and four tuning parameters: alpha, beta, gamma, delta. To specify gamma, for example, you will need to estimate the variance of the noise on your input images and take the largest variance var, then gamma = 1/var. The algorithm solves the following optimization problem using alternating minimization:
where F is the data fidelity term and Q and R are regularizers of the image and blurs, respectively.
For detailed analysis of the algorithm see [1], for a collection of different deconvolution formulation and their solutions see [2]. Hope it helps.
Referenses:
Filip Šroubek, Peyman Milanfar. —- Robust Multichannel Blind Deconvolution via Fast Alternating Minimization.
-— IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 21, NO. 4, APRIL 2012
Patrizio Campisi, Karen Egiazarian. —- Blind Image Deconvolution: Theory and Applications

What do you think of this interest point detection algorithm?

I've been trying to come up with an interest point detection algorithm and this is what I came up with:
You go through the X and the Y axises 3n pixels at a time creating 3n x 3n squares.
For the the n x n square in the middle of the 3n x 3n square (let's call it square Z), the R, G, and B values are averaged and rounded to preset values to limit the number of colors, and that is the color that square will be treated as.
The same is done for the 8 surrounding n x n squares.
After that, the color of square Z is compared to the surrounding squares, if it matches x out of the 8 surrounding squares where x <= 3 or x => 5 then that is an interest point (a corner is detected).
And so on till all the image is covered.
The bigger n is, the faster the image will be scanned and the the less accurate the detection is, and vice versa.
This, supposedly, detects "literal corners", that is corners you can actually SEE on the image.
What do you think of this algorithm? Is it efficient? Can it be used on a live video stream (say from the camera) on a hand-held device?
I'm sorry to say that I don't think this is likely to be very good. Your algorithm looks a bit like a simplistic version of Moravec's algorithm, which is itself one of the simplest corner detection algorithms. The hardcoded limits you test against effectively make your edge test a stepped function, unlike an approach such as summed square differences. This will almost certainly give you discontinuities in your detection function (corners that don't match when they should have), for some values.
You also have the same problem as Moravec, namely that if the edge lies at an angle to the direction of neighbours being considered, then it won't be detected.
Developing algorithms is fun, and if this isn't a business-critical project, then by all means, carry on tinkering and experimenting (and don't be put off by my comments!). But the fact is, for almost any practical problem, a better algorithm for the task you want to solve almost certainly already exists. The real challenge is identifying how you can best model your problem in such a way that you can solve it using an existing, well-understood approach, designed by experts.
In particular, robust identification and analysis of edge-cases and worst-case runtimes is a tricky business; unless you are a professional algorist, you are likely to find the going difficult. But I certainly encourage you to discover this for yourself by trying. nlucaroni mentions some excellent questions to use as starting points for your analysis.
Why not try it and see if it works the way you expect? It sounds like it should. How does the performance compare with other methods? What is the complexity of the algorithm? Is it efficient compared to others? Where can it be improved? What kind of false-positives and false negatives are expected? Are they within reason based on the data I plan to use this on? What threshold should be used to compare surrounding squares? ....
this is stuff you should be doing, not us.
I would suggest you look at the SIFT algorithm. Its the defacto standard for points of interest in an image. Unfortunately, its also patented, because its so good.
If you are interested in a real time version of SIFT you can get it to run on a GPU, but its highly experimental at this point. Note if you are developing a commercial application you'd have to first purchase a license for using SIFT or get approval from David Lowe.

Resources