Getting covariance matrix when using Levenberg-Marquardt (lsqcurvefit) in MATLAB - algorithm

I am using the lsqcurvefit function in Matlab to model some experimental data. The data takes a specific shape and so the algorithm is just adjusting the coefficients of this shape to change its amplitude etc.
The model works fine and gives a good fit (I have calculated chi-squared). Other implementations of the Levenberg-Marquardt algorithm give the covariance matrix as an output, but in Matlab it is not an available output (only first-order optimality, number of iterations, lambda and the Jacobian, along with the bounds).
Does anybody know how to calculate the covariance matrix, either from the outputs lsqcurvefit does give (I'm not 100% sure what use lambda, the Jacobian or first-order optimality could be) or independently?
Any help would be greatly appreciated, thanks!

I think I have solved this myself, but I will post the solution here in case anyone else is having the same trouble.
The covariance matrix can be calculated from the Jacobian by:
C = inv(J'*J)*MSE
where MSE is the mean squared error:
MSE = (R'*R)/(N-p)
where R = residuals, N = number of observations, and p = number of estimated coefficients.
Or MSE can be calculated via iteration.
Hopefully this will help someone else in the future.
If anyone spots an error, please let me know. Thanks!
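For anyone doing the same thing in Python rather than Matlab, here is a minimal sketch of that calculation using scipy.optimize.least_squares; the exponential model and data below are made up purely for illustration, only the covariance recipe itself comes from the answer above:

import numpy as np
from scipy.optimize import least_squares

# Hypothetical model y = a*exp(-b*x) fitted to noisy synthetic data.
def residuals(p, x, y):
    a, b = p
    return a * np.exp(-b * x) - y

x = np.linspace(0, 4, 50)
rng = np.random.default_rng(0)
y = 2.5 * np.exp(-1.3 * x) + 0.05 * rng.standard_normal(x.size)

res = least_squares(residuals, x0=[1.0, 1.0], args=(x, y))

J = res.jac                           # Jacobian at the solution, N x p
r = res.fun                           # residuals at the solution, length N
N, p = J.shape
mse = (r @ r) / (N - p)               # MSE = (R'*R)/(N-p)
cov = np.linalg.inv(J.T @ J) * mse    # C = inv(J'*J)*MSE
param_std = np.sqrt(np.diag(cov))     # 1-sigma uncertainties of the fitted coefficients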

Related

Vectorized 2D array scipy BDF solver

I'm trying to solve the same ODE simultaneously at different points (each point is an independent vector of shape m) using the scipy BDF solver. In other words, I have an n x m matrix, and I want to solve the n points (by solving, I mean advance them in time with a while loop), knowing that each of the n points is independent of the others.
Obviously you can loop over the different points, but this method takes too much time. Is there any way to make this faster and use it as a vectorized function?
I also tried reshaping my matrix into a 1D vector, but it looks like the solver computes the Jacobian matrix of the complete vector, which takes too much time and is useless since the points along n are independent of each other.
Maybe there is a way to specify that the derivatives coupling different points are zero, to speed up the Jacobian computation?
Thanks in advance for the answer
Edit:
Thanks for your answer, @Lutz Lehmann. I was able to speed up the computation a little using jac_sparsity, which avoids computing a lot of unnecessary Jacobian entries.
The other improvement I can imagine concerns the step size h_abs: each independent ODE should have its own h_abs. Using the 1D-vector method implies that all the ODEs advance with the same step size h_abs, i.e. the most restrictive one. I don't know if there is any way of doing this.
I am already using a vectorized atol built as an n x m matrix and reshaped, in the same way as the complete set of ODEs, to make sure that the right tolerances are applied to each variable. I've never used numba so far, but I will definitely have a look.
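For reference, here is a minimal sketch of the jac_sparsity idea with solve_ivp's BDF method; the toy right-hand side, sizes and initial values are invented for illustration, and only the block-diagonal sparsity pattern is the point:

import numpy as np
from scipy.sparse import block_diag
from scipy.integrate import solve_ivp

n, m = 100, 3    # n independent points, each an m-dimensional state

def rhs(t, u):
    # toy right-hand side: every point evolves independently of the others
    y = u.reshape(n, m)
    dy = np.empty_like(y)
    dy[:, 0] = y[:, 1]
    dy[:, 1] = -y[:, 0]
    dy[:, 2] = -0.1 * y[:, 2]
    return dy.ravel()

# Block-diagonal sparsity pattern: each state only depends on states of its own point,
# so BDF never estimates the (zero) cross-point Jacobian entries.
sparsity = block_diag([np.ones((m, m))] * n)

u0 = np.tile([1.0, 0.0, 1.0], n)
sol = solve_ivp(rhs, (0.0, 10.0), u0, method="BDF", jac_sparsity=sparsity)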

Generating Random Upside-down Gaussian Distribution

I am attempting to generate a random distribution that follows an upside-down Gaussian, shifted up so that it is still in the range (0, 1). I need to do this with as few special functions as possible and can only use a flat (uniform) random number generator.
I am able to generate Gaussian samples by putting the flat random numbers through the inverse Gaussian CDF. This works and gives me the Gaussian distribution I would expect. In Python, it looks like this:
from scipy import special

def InverseCDF(x, mu, sigma):
    return mu + sigma * special.erfinv(2*x - 1)
Now I am trying to generate a distribution that follows 1 - e^(-x^2), and I believe its inverse CDF is the same as for the regular Gaussian but with the argument of the inverse error function now 2*p + 1. So it would look like this:
def InverseCDF(x, mu, sigma):
    return mu + sigma * special.erfinv(2*x + 1)
The problem is that erfinv is only defined on (-1, 1), and the argument is now greater than 1. I have tried scaling and flipping this in all sorts of ways, putting negatives almost everywhere I can, and I can never seem to generate a histogram that follows an upside-down Gaussian. In most cases I actually get back a regular Gaussian distribution.
Any idea what I'm doing wrong, or any tips on how to generate this upside-down gaussian? Thanks in advance for any help.
OK, with x between 0 and 1, I get this for the cdf:
-(sqrt(%pi)*(sqrt(2)*sigma*erf((sqrt(2)*x-sqrt(2)*mu)/(2*sigma))
+sqrt(2)*erf(mu/(sqrt(2)*sigma))*sigma)
-2*x)
/(sqrt(%pi)*(sqrt(2)*erf((sqrt(2)*mu-sqrt(2))/(2*sigma))
-sqrt(2)*erf(mu/(sqrt(2)*sigma)))*sigma
+2)
Maybe some algebra will make it possible to figure out a formula for the inverse; if not, a numerical root search will work. I guess it will be simpler for specific values of mu and sigma.
I did that with Maxima (http://maxima.sourceforge.net), by constructing the pdf and integrating it. Plotting the expression above yields a plausible picture.
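Here is a sketch of the numerical-root-search route in Python. The cdf function below is my own hand simplification of the Maxima expression above (worth double-checking), for the density proportional to 1 - exp(-(x - mu)^2 / (2*sigma^2)) truncated to [0, 1]; scipy.optimize.brentq then inverts it for each flat random number:

import numpy as np
from scipy import special
from scipy.optimize import brentq

def cdf(x, mu, sigma):
    # CDF of p(x) proportional to 1 - exp(-(x - mu)**2 / (2*sigma**2)) on [0, 1]
    s = np.sqrt(np.pi / 2) * sigma
    num = x - s * (special.erf((x - mu) / (np.sqrt(2) * sigma))
                   + special.erf(mu / (np.sqrt(2) * sigma)))
    den = 1 - s * (special.erf((1 - mu) / (np.sqrt(2) * sigma))
                   + special.erf(mu / (np.sqrt(2) * sigma)))
    return num / den

def sample(n, mu=0.5, sigma=0.2, seed=None):
    rng = np.random.default_rng(seed)
    u = rng.random(n)    # flat random numbers in [0, 1)
    # invert the CDF numerically: find x in [0, 1] with cdf(x) = u
    return np.array([brentq(lambda x: cdf(x, mu, sigma) - ui, 0.0, 1.0) for ui in u])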

Two different Covariance Matrices?

I am a little bit confused!
Assume we have observed data X = [x1, ..., xn], where the xi are vectors in R^d (with zero mean).
X^T denotes the transpose of X.
Sometimes I see the covariance matrix written in the form 1/n * X*X^T (e.g. in principal component analysis), and sometimes in the form 1/n * X^T*X (e.g. the kernel covariance matrix with the kernel k(x,y) = x^T*y).
So why are there two different forms, or am I mixing something up? Thank you for your help.
Well, the results differ in their dimension: one is an n x n matrix, the other is a d x d matrix.
I don't know the application of the n x n result, but when I used the covariance matrix to describe the variation of a vector in R^d (with measurements X = [x1, ..., xn]), the result has to be a d x d matrix, whose eigenvectors and eigenvalues indicate the main axes and extents of a "variance ellipsoid" (which must be given as a d x d matrix).
PS: Only half an answer, I know
Addendum:
Kernels are used to create inner products of pairwise features, thus reducing the dimension to 1 to find patterns more easily. Have a look at
http://en.wikipedia.org/wiki/Kernel_principal_component_analysis#Introduction_of_the_Kernel_to_PCA
to get an impression of what the kernel covariance matrix is used for.
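A small numpy sketch of the two forms (toy data; X stores the n samples as columns, as in the question), showing that they share the same nonzero eigenvalues, which is what kernel PCA exploits:

import numpy as np

d, n = 3, 500
rng = np.random.default_rng(0)
X = rng.standard_normal((d, n))
X -= X.mean(axis=1, keepdims=True)    # enforce zero mean

C = (X @ X.T) / n    # d x d covariance matrix (ordinary PCA)
K = (X.T @ X) / n    # n x n Gram / kernel matrix with k(x, y) = x^T*y

print(np.sort(np.linalg.eigvalsh(C)))        # the d eigenvalues of C
print(np.sort(np.linalg.eigvalsh(K))[-d:])   # the d nonzero eigenvalues of K match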

Excel Polynomial Curve-Fitting Algorithm

What is the algorithm that Excel uses to calculate a 2nd-order polynomial regression (curve fitting)? Is there sample code or pseudo-code available?
I found a solution that returns the same formula that Excel gives:
Put together an augmented matrix of the values used in a least-squares parabola. See the sum equations at http://www.efunda.com/math/leastsquares/lstsqr2dcurve.cfm
Use Gaussian elimination to solve the matrix. Here is C# code that will do that: http://www.codeproject.com/Tips/388179/Linear-Equation-Solver-Gaussian-Elimination-Csharp
After running that, the values left in the matrix (M) will equal the coefficients given by Excel.
Maybe I can find the R^2 somehow, but I don't need it for my purposes.
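As a cross-check, here is a short Python sketch of the same normal-equations approach (the sums from the efunda page, solved directly rather than by hand-rolled Gaussian elimination); the data points are made up:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 9.2, 16.5, 25.8])

n = x.size
S = lambda k: np.sum(x**k)

# Normal equations for y = c0 + c1*x + c2*x^2
A = np.array([[n,    S(1), S(2)],
              [S(1), S(2), S(3)],
              [S(2), S(3), S(4)]])
b = np.array([np.sum(y), np.sum(x * y), np.sum(x**2 * y)])

c0, c1, c2 = np.linalg.solve(A, b)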
The polynomial trendlines in charts use least squares based on a QR decomposition method, like the LINEST worksheet function (http://support.microsoft.com/kb/828533). A second-order or quadratic trend for given (x,y) data could be calculated using =LINEST(y,x^{1,2}).
You can call worksheet formulas from C# using the Worksheet.Evaluate method.
It depends, because there are a lot of ways to do such a thing depending on the data you supply and how important it is to have the curve pass through those points.
I'm guessing that you have many more points than you do coefficients in the polynomial (e.g. more than three points for a 2nd order curve).
If that's true, then the best you can do is a least-squares fit, which calculates the coefficients that minimize the mean squared error between all the points and the resulting curve.
Since this is second order, my recommendation would be to just create the damn second-order terms and do a linear regression.
Ex. If you are doing z ~ second_order(x,y), it is equivalent to doing z ~ first_order(x, y, x^2, y^2, x*y).
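A minimal sketch of that idea in Python with made-up (x, y, z) data: build the second-order columns explicitly and solve an ordinary linear least-squares problem:

import numpy as np

rng = np.random.default_rng(1)
x, y = rng.random(50), rng.random(50)
z = 1 + 2*x - y + 0.5*x**2 + 3*x*y + 0.1 * rng.standard_normal(50)

# z ~ first_order(x, y, x^2, y^2, x*y): explicit second-order design matrix
A = np.column_stack([np.ones_like(x), x, y, x**2, y**2, x * y])
coef, *_ = np.linalg.lstsq(A, z, rcond=None)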

Accurate least-squares fit algorithm needed

I've experimented with the two ways of implementing a least-squares fit (LSF) algorithm shown here.
The first code is simply the textbook approach, as described by Wolfram's page on LSF. The second code re-arranges the equation to minimize machine errors. Both codes produce similar results for my data. I compared these results with Matlab's p=polyfit(x,y,1) function, using correlation coefficients to measure the "goodness" of fit and compare each of the 3 routines. I observed that while all 3 methods produced good results, at least for my data, Matlab's routine had the best fit (the other 2 routines had similar results to each other).
Matlab's p=polyfit(x,y,1) function uses a Vandermonde matrix, V (n x 2 matrix) and QR factorization to solve the least-squares problem. In Matlab code, it looks like:
V = [x1,1; x2,1; x3,1; ... xn,1] % this line is pseudo-code
[Q,R] = qr(V,0);
p = R\(Q'*y); % performs same as p = V\y
I'm not a mathematician, so I don't understand why it would be more accurate. Although the difference is slight, in my case I need to obtain the slope from the LSF and multiply it by a large number, so any improvement in accuracy shows up in my results.
For reasons I can't get into, I cannot use Matlab's routine in my work. So, I'm wondering if anyone has a more accurate equation-based approach recommendation I could use that is an improvement over the above two approaches, in terms of rounding errors/machine accuracy/etc.
Any comments appreciated! thanks in advance.
For a polynomial fit, you can create a Vandermonde matrix and solve the linear system, as you have already done.
Another solution is to use a method like Gauss-Newton to fit the data (since the system is linear, one iteration should do). There are differences between the methods; one possible reason is Runge's phenomenon.
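For comparison, here is a small Python sketch of both routes for a straight-line fit, with made-up data: the normal equations square the condition number of the Vandermonde matrix, while the QR route (essentially what polyfit does) works with the better-conditioned V directly:

import numpy as np

x = np.linspace(0.0, 1.0, 100)
rng = np.random.default_rng(2)
y = 3.0 * x + 0.5 + 1e-3 * rng.standard_normal(x.size)

V = np.column_stack([x, np.ones_like(x)])    # Vandermonde matrix [x, 1]

# Normal equations: solve (V'V) p = V'y
p_normal = np.linalg.solve(V.T @ V, V.T @ y)

# QR factorization: solve R p = Q'y, equivalent to p = V \ y in Matlab
Q, R = np.linalg.qr(V)
p_qr = np.linalg.solve(R, Q.T @ y)

print(p_normal, p_qr)    # slope and intercept from each route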
