optimize integral f(x)exp(-x) from x=0,infinity - algorithm

I need a robust integration algorithm for f(x)exp(-x) between x=0 and infinity, with f(x) a positive, differentiable function.
I do not know the array x a priori (it's an intermediate output of my routine). The x array is typically ~log-equispaced, but highly irregular.
Currently, I'm using the Simpson algorithm, buy my problem is that often the domain is highly undersampled by the x array, which produces unrealistic values for the integral.
On each run of my code I need to do this integration thousands of times (each with a different set of x values), so I need to find an efficient and robust way to integrate this function.
More details:
The x array can have between 2 and N points (N known). The first value is always x[0] = 0.0. The last point is always a value greater than a tunable threshold x_max (such that exp(x_max) approx 0). I only know the values of f at the points x[i] (though the function is a smooth function).
My first idea was to do a Laguerre-Gauss quadrature integration. However, this algorithm seems to be highly unreliable when one does not use the optimal quadrature points.
My current idea is to add a set of auxiliary points, interpolating f, such that the Simpson algorithm becomes more stable. If I do this, is there an optimal selection of auxiliary points?
I'd appreciate any advice,
Thanks.

Set t=1-exp(-x), then dt = exp(-x) dx and the integral value is equal to
integral[ f(-log(1-t)) , t=0..1 ]
which you can evaluate with the standard Simpson formula and hopefully get good results.
Note that piecewise linear interpolation will always result in an order 2 error for the integral, as the result amounts to a trapezoid formula even if the method was Simpson. For better errors in the Simpson method you will need higher interpolation degrees, ideally cubic splines. Cubic Bezier polynomials with estimated derivatives to compute the control points could be a fast compromise.

Related

Algorithm to approximate non-linear equation system solution

I'm looking for an algorithm to approximate the solution of the following equation system:
The equations have to be solved on an embedded system, in C++.
Background:
We measure the 2 variables X_m and Y_m, so they are known
We want to compute the real values: X_r and Y_r
X and Y are real numbers
We measure the functions f_xy and f_yx during calibration. We have maximal 18 points of each function.
It's possible to store the functions as a look-up table
I tried to approximate the functions with 2nd order polynomials and compute the solution, but it was not accurate enough, because of the fitting error.
I am looking for an algorithm to approximate the results in an embedded system in C++, but I don't even know what to search for. I found some papers on the theory link, but I think there must be an easier way to do it in my case.
Also: how can I determine during calibration, whether the functions can be solved with the algorithm?
Fitting a second-order polynomial through f_xy? That's generally not viable. The go-to solution would be Runga-Kutta interpolation. You pick two known values left and two to the right of your argument, with weights 1,2,2,1. This gets you an estimate d(f_xy)/dx which you can then use for interpolation.
The normal way is by Newton's iterations, starting from the initial approximation (Xm, Ym) [assuming that the f are mere corrections]. Due to the particular shape of the equations, you can reduce to twice a single equation in a single unknown.
Xr = Xm - Fyx(Ym - Fxy(Xr))
Yr = Ym - Fxy(Xm - Fyx(Yr))
The iterations read
Xr <-- Xr - (Xm - Fyx(Ym - Fxy(Xr))) / (1 + Fxy'(Ym - Fxy(Xr)).Fxy'(Xr))
Yr <-- Yr - (Ym - Fxy(Xm - Fyx(Yr))) / (1 + Fyx'(Xm - Fyx(Yr)).Fyx'(Yr))
So you should tabulate the derivatives of f as well, though accuracy is not so critical than for the computation of the f themselves.
If the calibration points aren't too noisy, I would recommend cubic spline interpolation, for which you can precompute all coefficients. At the same time these coefficients allow you to estimate the derivative (as the corresponding quadratic interpolant, which is continuous).
In principle (unless the points are uniformly spaced), you need to perform a dichotomic search to determine the interval in which the argument lies. But here you will evaluate the functions at nearby values, so that a linear search from the previous location should be better.
A different way to address the problem is by considering the bivariate solution surfaces Xr = G(Xm, Ym) and Yr = G(Xm, Ym) that you compute on a grid of points. If the surfaces are smooth enough, you can use a coarse grid.
So by any method (such as the one above), you precompute the solutions at each grid node, as well as the coefficients of some interpolant in the X and Y directions. I recommend a cubic spline, again.
Now to interpolate inside a grid cell, you combine the two univarite interpolants to a bivariate one by means of the Coons formula https://en.wikipedia.org/wiki/Coons_patch.

Constrained random solution of an underspecified system of linear equations

Premise
I've a system of linear equations
dot(A,x) = y
whose solutions have many degrees of freedom: indeed the Number of linearly independent Equations (E) is less than the dimension of x, A.K.A. the Number of Variables (N).
The number of degrees of freedom left constrains the solutions to be a hyperplane N-E of the overall space R^N. Given the (unimportant) characteristics of A, I am always able to write the solutions x (a vector N x 1) as
x=dot(B,t)+q
where B is a N x (N-E) matrix, t a (N-E) x 1 vector and q a N x 1 vector. This define the hyperplane of the solutions of my original problem, A x = y in parametric form.
I need to extract a random solution, with uniform probability over any possible point of the hyperplane, such that all x are positive (we will refer to it as a positive solution). Note that, for the specific problem I am dealing with, the space of positive solutions of x exists and it is bounded (that's how the notion of uniform probability is reasonable for the specific case, to clarify as suggested by #Petr comment). In the beginning, once I was able to write x=Bt+q, I thought it extremely simple. Now I am starting to doubt it.
Proposed Solution
By now I do something like this:
For each dimension i in range(N-E) I compute the maximum and minimum value of t[i]: t_min[i] and t_max[i]. Intervals big enough to not exclude any possible positive solution. Those are algebraically computed, always existing and defining a limited space.
I extract N-E uniform random values t[i], each comprised between t_min [i] and t_max[i].
I compute x = dot(B,t)+q
If all x[j] are positives, accept the solution. If some x[j] is negative, go back to point 2.
An example is visible for a two dimensional space N-E in the next figure.
Caption: A problem in N dimension reduced to a N-E=2 space. The yellow diamond is the space of positive solutions of the N-dimensional problem. I randomly sample points in the orange box between (t1(min),t2(min)) and (t1(max),t2(max)) until I find a point in the yellow box.
I think it is a good enough solution, but...
Problem
When N-E is big, the space of the hyperparallelogram bounded inside the hypercube can be small. In general it will be small^(N-E), that can be very small. How small?
While for sure an infinite number of positive solutions to the original problem exist, the space of the solutions can have measure zero in the N-E dimensional space. This can happen if all the positive solutions of the original problem have one dimension of x = 0. The borders of a diamond will make contact, transforming the diamond of solutions to a line. Of course you will never randomly pick EXACTLY a line in 2D, let alone in 5D.
A obvious idea would be to further reduce the dimensionality from N-E to a smaller number, i.e. to extract directly points from the aforementioned line instead of the square. Algebra is not easy, but I'm working on it. I'm not positive I will be able to solve it.
Note that choosing first one dimension (for example t1), computing the new limits of t2 conditional to the value of t1 extracted and then extract a possible value of t2 in this boundary, while much faster, does not give a uniform probability among all the possible solutions.
I know that the problem is very specific, but even some general ideas or thoughts would be gladly received. I am doubtful if there is some computing technique to extract directly the solution in the diamond...

Computing the inverse of a polynomial matrix

I am working with a system of the following structure:
L (k,m) = A2 k2 + A1 k + A0 - m B
I have the matrices (A2, A1, A0, and B) numerically and would like to compute coefficient matrices for L-1 such that I can evaluate L-1 for a given combination (k,m) without computing a matrix inverse each time. Could someone point me on the right direction for this type of algorithm / manipulation? I'm not even sure I know the correct search terms to search the linear algebra literature on the subject. I'm using MATLAB.
You can see from http://en.wikipedia.org/wiki/Invertible_matrix#Analytic_solution that the inverse of a matrix can be written as a matrix of sub-determinants divided by the determinant, so its entries are rational functions - one polynomial divided by another. Given that you know this, and that you can work out the order of the polynomials involved, it should in theory be possible to recover them, for example by fitting a rational function of the correct order to inverses computed at a finite number of points. You could then work out more inverses by evaluating the rational functions you found, instead of doing an explicit inverse.
However, note that the determinant for the three by three matrix example worked out below this is a sum of triples, so in your case it will be a polynomial of degree six in k, and with cross-product terms like k^4m. I suspect that this will save little or no time over computing the inverse as usual, and be numerically unstable to boot. However it does point out that any formula doing this will also be quite complex, as it amounts to working out a rational function of quite high degree.
There are some matrix identities used to avoid recalculation of matrix inverses, such as http://en.wikipedia.org/wiki/Binomial_inverse_theorem. I don't think this is directly applicable to your case, but there might be something there, especially if your A and B matrices are not of full rank.

Math - 3d positioning/multilateration

I have a problem involving 3d positioning - sort of like GPS. Given a set of known 3d coordinates (x,y,z) and their distances d from an unknown point, I want to find the unknown point. There can be any number of reference points, however there will be at least four.
So, for example, points are in the format (x,y,z,d). I might have:
(1,0,0,1)
(0,2,0,2)
(0,0,3,3)
(0,3,4,5)
And here the unknown point would be (0,0,0,0).
What would be the best way to go about this? Is there an existing library that supports 3d multilateration? (I have been unable to find one). Since it's unlikely that my data will have an exact solution (all of the 4+ spheres probably won't have a single perfect intersect point), the algorithm would need to be capable of approximating it.
So far, I was thinking of taking each subset of three points, triangulating the unknown based on those three, and then averaging all of the results. Is there a better way to do this?
You could take a non-linear optimisation approach, by defining a "cost" function that incorporates the distance error from each of your observation points.
Setting the unknown point at (x,y,z), and considering a set of N observation points (xi,yi,zi,di) the following function could be used to characterise the total distance error:
C(x,y,z) = sum( ((x-xi)^2 + (y-yi)^2 + (z-zi)^2 - di^2)^2 )
^^^
^^^ for all observation points i = 1 to N
This is the sum of the squared distance errors for all points in the set. (It's actually based on the error in the squared distance, so that there are no square roots!)
When this function is at a minimum the target point (x,y,z) would be at an optimal position. If the solution gives C(x,y,z) = 0 all observations would be exactly satisfied.
One apporach to minimise this type of equation would be Newton's method. You'd have to provide an initial starting point for the iteration - possibly a mean value of the observation points (if they en-circle (x,y,z)) or possibly an initial triangulated value from any three observations.
Edit: Newton's method is an iterative algorithm that can be used for optimisation. A simple version would work along these lines:
H(X(k)) * dX = G(X(k)); // solve a system of linear equations for the
// increment dX in the solution vector X
X(k+1) = X(k) - dX; // update the solution vector by dX
The G(X(k)) denotes the gradient vector evaluated at X(k), in this case:
G(X(k)) = [dC/dx
dC/dy
dC/dz]
The H(X(k)) denotes the Hessian matrix evaluated at X(k), in this case the symmetric 3x3 matrix:
H(X(k)) = [d^2C/dx^2 d^2C/dxdy d^2C/dxdz
d^2C/dydx d^2C/dy^2 d^2C/dydz
d^2C/dzdx d^2C/dzdy d^2C/dz^2]
You should be able to differentiate the cost function analytically, and therefore end up with analytical expressions for G,H.
Another approach - if you don't like derivatives - is to approximate G,H numerically using finite differences.
Hope this helps.
Non-linear solution procedures are not required. You can easily linearise the system. If you take pair-wise differences
$(x-x_i)^2-(x-x_j)^2+(y-y_i)^2-(y-y_j)^2+(z-z_i)^2-(z-z_j)^2=d_i^2-d_j^2$
then a bit of algebra yields the linear equations
$(x_i-x_j) x +(y_i-y_j) y +(zi-zj) z=-1/2(d_i^2-d_j^2+ds_i^2-ds_j^2)$,
where $ds_i$ is the distance from the $i^{th}$ sensor to the origin. These are the equations of the planes defined by intersecting the $i^{th}$ and the $j^{th}$ spheres.
For four sensors you obtain an over-determined linear system with $4 choose 2 = 6$ equations. If $A$ is the resulting matrix and $b$ the corresponding vector of RHS, then you can solve the normal equations
$A^T A r = A^T b$
for the position vector $r$. This will work as long as your sensors are not coplanar.
If you can spend the time, an iterative solution should approach the correct solution pretty quickly. So pick any point the correct distance from site A, then go round the set working out the distance to the point then adjusting the point so that it's in the same direction from the site but the correct distance. Continue until your required precision is met (or until the point is no longer moving far enough in each iteration that it can meet your precision, as per the possible effects of approximate input data).
For an analytic approach, I can't think of anything better than what you already propose.

math: scale coordinate system so that certain points get integer coordinates

this is more a mathematical problem. nonethelesse i am looking for the algorithm in pseudocode to solve it.
given is a one dimensional coordinate system, with a number of points. the coordinates of the points may be in floating point.
now i am looking for a factor that scales this coordinate system, so that all points are on fixed number (i.e. integer coordinate)
if i am not mistaken, there should be a solution for this problem as long as the number of points is not infinite.
if i am wrong and there is no analytical solution for this problem, i am interested in an algorithm that approximates the solution as close as possible. (i.e. the coordinates will look like 15.0001)
if you are interested for the concrete problem:
i would like to overcome the well known pixelsnapping problem in adobe flash, which cuts of half-pixels at the border of bitmaps if the whole stage is scaled. i would like to find out an ideal scaling factor for the stage which makes my bitmaps being placed on whole (screen-)pixel coordinates.
since i am placing two bitmaps on the stage, the number of points will be 4 in each direction (x,y).
thanks!
As suggested, you have to convert your floating point numbers to rational ones. Fix a tolerance epsilon, and for each coordinate, find its best rational approximation within epsilon.
An algorithm and definitions is outlined there in this section.
Once you have converted all the coordinates into rational numbers, the scaling is given by the least common multiple of the denominators.
Note that this latter number can become quite huge, so you may want to experiment with epsilon so that to control the denominators.
My own inclination, if I were in your situation, would be to use rational numbers not with floating point.
And the algorithms you are looking for is finding the lowest common denominator.
A floating point number is an integer, multiplied by a power of two (the power might be negative).
So, find the largest necessary power of two among your inputs, and that gives you a scale factor that will work. The power of two isn't just -1 times the exponent of the float, it's a few more than that (according to where the least significant 1 bit is in the significand).
It's also optimal, because if x times a power of 2 is an odd integer then x in its float representation was already in simplest rational form, there's no smaller integer that you can multiply x by to get an integer.
Obviously if you have a mixture of large and small values among your input, then the resulting integers will tend to be bigger than 64 bit. So there is an analytical solution, but perhaps not a very good one given what you want to do with the results.
Note that this approach treats floats as being precise representations, which they are not. You may get more sensible results by representing each float as a rational number with smaller denominator (within some defined tolerance), then taking the lowest common multiple of all the denominators.
The problem there though is the approximation process - if the input float is 0.334[*] then I can't in general be sure whether the person who gave it to me really mean 0.334, or whether it's 1/3 with some inaccuracy. I therefore don't know whether to use a scale factor of 3 and say the scaled result is 1, or use a scale factor of 500 and say the scaled result is 167. And that's just with 1 input, never mind a bunch of them.
With 4 inputs and allowed final tolerance of 0.0001, you could perhaps find the 10 closest rationals to each input with a certain maximum denominator, then try 10^4 different possibilities and see whether the resulting scale factor gives you any values that are too far from an integer. Brute force seems nasty, but you might a least be able to bound the search a bit as you go. Also "maximum denominator" might be expressed in terms of the primes present in the factorization, rather than just the number, since if you can find a lot of common factors among them then they'll have a smaller lcm and hence smaller deviation from integers after scaling.
[*] Not that 0.334 is an exact float value, but that sort of thing. Decimal examples are easier.
If you are talking about single precision floating point numbers, then the number can be expressed like this according to wikipedia:
From this formula you can deduce that you always get an integer if you multiply by 2127+23. (Actually, when e is 0 you have to use another formula for the special range of "subnormal" numbers so 2126+23 is sufficient. See the linked wikipedia article for details.)
To do this in code you will probably need to do some bit twiddling to extract the factors in the above formula from the bits in the floating point value. And then you will need some kind of support for unlimited size numbers to express the integer result of the scaling (e.g. BigInteger in .NET). Normal primitive types in most languages/platforms are typically limited to much smaller sizes.
It's really a problem in statistical inference combined with noise reduction. This is the method I'm going to try out soon. I'm assuming you're trying to get a regularly spaced 2-D grid but a similar method could work on a regularly spaced grid of 3 or more dimensions.
First tabulate all the differences and note that (dx,dy) and (-dx,-dy) denote the same displacement, so there's an equivalence relation. Group those differenecs that are within a pre-assigned threshold (epsilon) of one another. Epsilon should be large enough to capture measurement errors due to random noise or lack of image resolution, but small enough not to accidentally combine clusters.
Sort the clusters by their average size (dr = root(dx^2 + dy^2)).
If the original grid was, indeed, regularly spaced and generated by two independent basis vectors, then the two smallest linearly independent clusters will indicate so. The smallest cluster is the one centered on (0, 0). The next smallest cluster (dx0, dy0) has the first basis vector up to +/- sign (-dx0, -dy0) denotes the same displacement, recall.
The next smallest clusters may be linearly dependent on this (up to the threshold epsilon) by virtue of being multiples of (dx0, dy0). Find the smallest cluster which is NOT a multiple of (dx0, dy0). Call this (dx1, dy1).
Now you have enough to tag the original vectors. Group the vector, by increasing lexicographic order (x,y) > (x',y') if x > x' or x = x' and y > y'. Take the smallest (x0,y0) and assign the integer (0, 0) to it. Take all the others (x,y) and find the decomposition (x,y) = (x0,y0) + M0(x,y) (dx0, dy0) + M1(x,y) (dx1,dy1) and assign it the integers (m0(x,y),m1(x,y)) = (round(M0), round(M1)).
Now do a least-squares fit of the integers to the vectors to the equations (x,y) = (ux,uy) m0(x,y) (u0x,u0y) + m1(x,y) (u1x,u1y)
to find (ux,uy), (u0x,u0y) and (u1x,u1y). This identifies the grid.
Test this match to determine whether or not all the points are within a given threshold of this fit (maybe using the same threshold epsilon for this purpose).
The 1-D version of this same routine should also work in 1 dimension on a spectrograph to identify the fundamental frequency in a voice print. Only in this case, the assumed value for ux (which replaces (ux,uy)) is just 0 and one is only looking for a fit to the homogeneous equation x = m0(x) u0x.

Resources