I want to write a function f_1(a,b) = (x,y) that approximates the inverse of f, where f(x,y) = (a,b) is a bijective function (over a specific range)
Any suggestions on how to get an efficient numerical approximation?
The programming language used is not important.
Solving f(x,y)=(a,b) for x,y is equivalent to finding the root or minimum of f(x,y)-(a,b) ( = 0), so you can use any of the standard root-finding or optimization algorithms. If you are implementing this yourself, I recommend Coordinate descent because it is probably the simplest algorithm. You could also try Adaptive coordinate descent, although that may be a bit harder to analyze.
If you want to find the inverse over a range, you can either compute the inverse at various points and interpolate with something like a Cubic Spline or solve the above equation whenever you want to evaluate the inverse function. Even if you solve the equation for each evaluation, it may still be helpful to precompute some values so they can be used as initial values for a solver such as Coordinate descent.
Also see Newton's method and the Bisection method
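For concreteness, here is a minimal sketch of the root-finding approach using SciPy's general-purpose root solver in place of a hand-rolled coordinate descent; the forward map f and the initial guess below are made-up placeholders, not anything from the question:

import numpy as np
from scipy.optimize import root

# Hypothetical bijective forward map f(x, y) = (a, b); substitute your own
def f(xy):
    x, y = xy
    return np.array([x + 0.1 * y**2, y - 0.1 * x**2])

def f_inv(a, b, guess=(0.0, 0.0)):
    # Solve f(x, y) - (a, b) = 0 with a standard root finder
    sol = root(lambda xy: f(xy) - np.array([a, b]), x0=guess)
    return sol.x

a, b = f(np.array([0.5, -0.3]))
print(f_inv(a, b))   # approximately [0.5, -0.3]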
There is no 'automatic' solution that will work for any general function. Even in the simpler case of y = f(x) it can be hard to find a suitable starting point. As an example:
y = x^2
has a nice algebraic inverse
x = sqrt(y)
but trying to approximate the sqrt function in the range [0..1] with a polynomial (for instance) sucks badly.
If your range is small enough, and your function well behaved enough, then you might get a fit using 2D splines. If this is going to work, then you should try using independent functions for x and y, i.e. use
y = Y_1(a,b) and x = X_1(a,b)
rather than the more complicated
(x,y) = F_1(a,b)
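As a hedged sketch of that idea, one could sample the forward map on a grid and fit two independent smoothing splines X_1 and Y_1 with SciPy; the forward map f below is a made-up placeholder and the grid range is arbitrary:

import numpy as np
from scipy.interpolate import SmoothBivariateSpline

# Hypothetical forward map f(x, y) = (a, b); substitute your own
def f(x, y):
    return x + 0.1 * y**2, y - 0.1 * x**2

# Sample the forward map over the range of interest
xs, ys = np.meshgrid(np.linspace(-1, 1, 25), np.linspace(-1, 1, 25))
a, b = f(xs, ys)

# Fit independent surfaces x = X_1(a, b) and y = Y_1(a, b)
X1 = SmoothBivariateSpline(a.ravel(), b.ravel(), xs.ravel())
Y1 = SmoothBivariateSpline(a.ravel(), b.ravel(), ys.ravel())

# Evaluate the approximate inverse at a query point (a0, b0)
a0, b0 = f(0.3, -0.2)
print(X1.ev(a0, b0), Y1.ev(a0, b0))   # should be close to (0.3, -0.2)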
There are three components to this problem:
A three dimensional vector A.
A "smooth" function F.
A desired vector B (also three dimensional).
We want to find a vector A that when put through F will produce the vector B.
F(A) = B
F can be anything that transforms or distorts A in some manner. The point is that we want to iteratively call F(A) until B is produced.
The question is:
How can we do this, but with the least amount of calls to F before finding a vector that equals B (within a reasonable threshold)?
I am assuming that what you call "smooth" is tantamount to being differentiable.
Since the concept of smoothness only makes sense over the real numbers, I will also assume that you are solving a floating-point-based problem.
In this case, I would formulate the problem as a nonlinear programming problem. i.e. minimizing the squared norm of the difference between f(A) and B, given by
(F(A)_1 -B_1)² + (F(A)_2 - B_2)² + (F(A)_3 - B_3)²
It should be clear that this expression is zero if and only if f(A) = B and positive otherwise. Therefore you would want to minimize it.
As an example, you could use the solvers built into the scipy optimization suite (available for python):
from scipy.optimize import minimize
# Example function; the target B is implicitly the zero vector here
f = lambda x : [x[0] + 1, x[2], 2*x[1]]
# Optimization objective: squared norm of f(x) (i.e. of f(x) - B with B = 0)
fsq = lambda x : sum(v*v for v in f(x))
# Initial guess
x0 = [0, 0, 0]
res = minimize(fsq, x0, tol=1e-6)
# res.x is the solution, in this case approximately
# array([-1., 0., 0.])
A binary search (as pointed out above) only works if the function is 1-d, which is not the case here. You can try out different optimization methods by passing method="name" to the call to minimize; see the API. It is not always clear which method works best for your problem without knowing more about the nature of your function. As a rule of thumb, the more information you give to the solver, the better. If you can compute the derivative of F explicitly, passing it to the solver will help reduce the number of required evaluations. If F has a Hessian (i.e., if it is twice differentiable), providing the Hessian will help as well.
As an alternative, you can use the least_squares function on F directly via res = least_squares(f, x0). This could be faster since the solver can take care of the fact that you are solving a least squares problem rather than a generic optimization problem.
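To make the connection to a nonzero target explicit, here is a hedged continuation of the example above; the target vector B is purely illustrative, and the residual F(A) - B is what gets passed to least_squares:

from scipy.optimize import least_squares

f = lambda x : [x[0] + 1, x[2], 2*x[1]]   # same example function as above
x0 = [0, 0, 0]
B = [-1.0, 0.5, 2.0]                       # illustrative target vector (an assumption)
residual = lambda x: [fi - bi for fi, bi in zip(f(x), B)]   # F(A) - B
res = least_squares(residual, x0)
# res.x now satisfies f(res.x) ≈ B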
From a more general standpoint, the problem of recovering the function arguments that produce a given value is called an Inverse Problem. These problems have been studied extensively.
Provided that F(A) = B, where F and B are known and A remains unknown, you can start with a simple gradient search:
F(A) ~= F(C) + F'(C)*(A-C) ~= B
where F'(C) is the gradient (Jacobian) of F() evaluated at point C. I'm assuming you can calculate this gradient analytically for now.
So, you can choose a point C that you estimate is not very far from the solution and iterate by:
C = Co;
while (true)
{
    // Newton-like step: Ai = F'(C)^-1 * (B - F(C)) + C
    Ai = inverse(F'(C)) * (B - F(C)) + C;
    convergence = norm(Ai - C);
    C = Ai;
    if (convergence < someThreshold)
        break;
}
If the gradient of F() cannot be calculated analytically, you can estimate it. Let Ei, i = 1:3, be the orthonormal basis vectors; then
Fi'(C) = (F(C+Ei*d) - F(C-Ei*d))/(2*d);
F'(C) = [F1'(C) | F2'(C) | F3'(C)];
and d can be chosen as fixed or as a function of the convergence value.
These algorithms suffer from the problem of local maxima, null-gradient areas, etc., so for this to work the starting point (Co) must not be very far from the solution, in a region where the function F() behaves monotonically.
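Putting the pieces together, here is a minimal sketch in Python; the particular F, B, and finite-difference step d are made-up placeholders, and the Jacobian is estimated with the central differences described above:

import numpy as np

def jacobian(F, C, d=1e-6):
    # Central-difference estimate of the 3x3 matrix F'(C)
    J = np.zeros((3, 3))
    for i in range(3):
        e = np.zeros(3)
        e[i] = d
        J[:, i] = (F(C + e) - F(C - e)) / (2 * d)
    return J

def solve(F, B, C0, threshold=1e-9, max_iter=100):
    # Iterate Ai = inverse(F'(C)) * (B - F(C)) + C until the step is small
    C = np.asarray(C0, dtype=float)
    for _ in range(max_iter):
        Ai = np.linalg.solve(jacobian(F, C), B - F(C)) + C
        convergence = np.linalg.norm(Ai - C)
        C = Ai
        if convergence < threshold:
            break
    return C

# Hypothetical smooth F and target B, for illustration only
F = lambda A: np.array([A[0] + A[1]**2, np.sin(A[1]) + A[2], A[0]*A[2] + A[1]])
B = np.array([1.0, 0.5, 0.25])
print(solve(F, B, C0=np.zeros(3)))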
It seems like you can try a metaheuristic approach for this.
A genetic algorithm (GA) might be well suited to it.
You can initialize a population with a number of candidate A vectors and use the GA to evolve that population, so that each generation contains better vectors, i.e. ones for which F(Ax) is closer to B.
Your fitness function can be a simple function that compares F(Ai) to B.
You can choose how to mutate your population in each generation.
A simple example of a GA can be found here link
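A toy sketch of that idea follows; the F, B, population size, and mutation scale are all made-up choices for illustration, not anything from the question:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical smooth F and target B, for illustration only
F = lambda A: np.array([A[0] + A[1]**2, np.sin(A[1]) + A[2], A[0]*A[2] + A[1]])
B = np.array([1.0, 0.5, 0.25])

# Fitness: the closer F(A) is to B, the higher the score
fitness = lambda A: -np.sum((F(A) - B) ** 2)

population = rng.uniform(-2.0, 2.0, size=(50, 3))   # initial candidate A vectors
for generation in range(200):
    scores = np.array([fitness(ind) for ind in population])
    parents = population[np.argsort(scores)[-10:]]            # keep the 10 fittest
    children = parents[rng.integers(0, 10, size=40)]          # clone random parents
    children += rng.normal(0.0, 0.1, children.shape)          # mutate
    population = np.vstack([parents, children])

best = max(population, key=fitness)
print(best, F(best))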
I'm looking for an algorithm to approximate the solution of the following equation system:
The equations have to be solved on an embedded system, in C++.
Background:
We measure the 2 variables X_m and Y_m, so they are known
We want to compute the real values: X_r and Y_r
X and Y are real numbers
We measure the functions f_xy and f_yx during calibration. We have at most 18 points for each function.
It's possible to store the functions as a look-up table
I tried to approximate the functions with 2nd order polynomials and compute the solution, but it was not accurate enough, because of the fitting error.
I am looking for an algorithm to approximate the results in an embedded system in C++, but I don't even know what to search for. I found some papers on the theory link, but I think there must be an easier way to do it in my case.
Also: how can I determine during calibration, whether the functions can be solved with the algorithm?
Fitting a second-order polynomial through f_xy? That's generally not viable. The go-to solution would be Runge-Kutta interpolation. You pick two known values to the left and two to the right of your argument, with weights 1, 2, 2, 1. This gets you an estimate of d(f_xy)/dx, which you can then use for interpolation.
The normal way is by Newton's iterations, starting from the initial approximation (Xm, Ym) [assuming that the f are mere corrections]. Due to the particular shape of the equations, you can reduce the system to two independent equations, each in a single unknown.
Xr = Xm - Fyx(Ym - Fxy(Xr))
Yr = Ym - Fxy(Xm - Fyx(Yr))
The iterations read
Xr <-- Xr - (Xr - Xm + Fyx(Ym - Fxy(Xr))) / (1 - Fyx'(Ym - Fxy(Xr)) * Fxy'(Xr))
Yr <-- Yr - (Yr - Ym + Fxy(Xm - Fyx(Yr))) / (1 - Fxy'(Xm - Fyx(Yr)) * Fyx'(Yr))
So you should tabulate the derivatives of the f as well, though their accuracy is not as critical as for the computation of the f themselves.
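As a rough sketch of the Xr iteration (in Python for brevity; the calibration tables below are invented placeholders, and the derivatives are estimated numerically from the interpolated tables rather than tabulated separately):

import numpy as np

# Invented calibration tables standing in for the measured f_xy and f_yx
x_tab = np.linspace(0.0, 10.0, 18)
fxy_tab = 0.05 * x_tab**1.1
y_tab = np.linspace(0.0, 10.0, 18)
fyx_tab = 0.03 * np.sqrt(y_tab)

fxy = lambda x: np.interp(x, x_tab, fxy_tab)
fyx = lambda y: np.interp(y, y_tab, fyx_tab)
dfxy = lambda x, h=1e-4: (fxy(x + h) - fxy(x - h)) / (2 * h)
dfyx = lambda y, h=1e-4: (fyx(y + h) - fyx(y - h)) / (2 * h)

def solve_xr(xm, ym, tol=1e-9, max_iter=50):
    # Newton iteration for Xr = Xm - Fyx(Ym - Fxy(Xr))
    xr = xm                                  # start from the measured value
    for _ in range(max_iter):
        inner = ym - fxy(xr)
        g = xr - xm + fyx(inner)             # residual of the reduced equation
        dg = 1.0 - dfyx(inner) * dfxy(xr)    # its derivative with respect to Xr
        step = g / dg
        xr -= step
        if abs(step) < tol:
            break
    return xr

print(solve_xr(5.0, 3.0))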
If the calibration points aren't too noisy, I would recommend cubic spline interpolation, for which you can precompute all coefficients. At the same time these coefficients allow you to estimate the derivative (as the corresponding quadratic interpolant, which is continuous).
In principle (unless the points are uniformly spaced), you need to perform a dichotomic search to determine the interval in which the argument lies. But here you will evaluate the functions at nearby values, so that a linear search from the previous location should be better.
A different way to address the problem is by considering the bivariate solution surfaces Xr = G(Xm, Ym) and Yr = H(Xm, Ym), which you compute on a grid of points. If the surfaces are smooth enough, you can use a coarse grid.
So by any method (such as the one above), you precompute the solutions at each grid node, as well as the coefficients of some interpolant in the X and Y directions. I recommend a cubic spline, again.
Now, to interpolate inside a grid cell, you combine the two univariate interpolants into a bivariate one by means of the Coons formula https://en.wikipedia.org/wiki/Coons_patch.
I need a robust integration algorithm for f(x)exp(-x) between x=0 and infinity, with f(x) a positive, differentiable function.
I do not know the array x a priori (it's an intermediate output of my routine). The x array is typically ~log-equispaced, but highly irregular.
Currently, I'm using the Simpson algorithm, but my problem is that often the domain is highly undersampled by the x array, which produces unrealistic values for the integral.
On each run of my code I need to do this integration thousands of times (each with a different set of x values), so I need to find an efficient and robust way to integrate this function.
More details:
The x array can have between 2 and N points (N known). The first value is always x[0] = 0.0. The last point is always a value greater than a tunable threshold x_max (such that exp(-x_max) approx 0). I only know the values of f at the points x[i] (though the function is a smooth function).
My first idea was to do a Laguerre-Gauss quadrature integration. However, this algorithm seems to be highly unreliable when one does not use the optimal quadrature points.
My current idea is to add a set of auxiliary points, interpolating f, such that the Simpson algorithm becomes more stable. If I do this, is there an optimal selection of auxiliary points?
I'd appreciate any advice,
Thanks.
Set t=1-exp(-x), then dt = exp(-x) dx and the integral value is equal to
integral[ f(-log(1-t)) , t=0..1 ]
which you can evaluate with the standard Simpson formula and hopefully get good results.
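A hedged sketch of this substitution with NumPy/SciPy; the sample f below is just an illustration, and the exact value of its integral, e*E1(1) ≈ 0.5963, is only there for comparison:

import numpy as np
from scipy.integrate import simpson

def integrate_f_exp(x, fx):
    # Approximate integral of f(x) * exp(-x) over [0, inf) from samples (x, f(x)),
    # via the substitution t = 1 - exp(-x), which maps [0, inf) onto [0, 1)
    # and absorbs the exp(-x) weight into dt.
    t = 1.0 - np.exp(-x)
    return simpson(fx, x=t)

# Illustration: f(x) = 1/(1+x); the exact integral is e*E1(1), about 0.5963
x = np.insert(np.logspace(-3, 2, 40), 0, 0.0)    # irregular, roughly log-spaced grid
print(integrate_f_exp(x, 1.0 / (1.0 + x)))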
Note that piecewise linear interpolation will always result in an order 2 error for the integral, as the result amounts to a trapezoid formula even if the method was Simpson. For better errors in the Simpson method you will need higher interpolation degrees, ideally cubic splines. Cubic Bezier polynomials with estimated derivatives to compute the control points could be a fast compromise.
I have a series of points representing values of a function, an example is below:
The values for X and Y can be real (non-integers). The function is monotonic, non-decreasing.
I want to be able to interpolate / assess the value of the function for any X (e.g. 1.5), so that a continuous function line would look like the following:
This is a standard interpolation problem, so I have used Lagrange interpolation so far. It's quite simple and gives good enough results.
The problem with the interpolation is that it also modifies the values that are given as input, so the end results for the input data (e.g. x=1, x=2) would be different.
Is there an algorithm that can guarantee that all the input values will keep the same value after the interpolation? Linear interpolation is one solution, but it's linear, and since the distances between the X's don't have to be even, the resulting graph is ugly.
Please forgive my english / math language, I am not a native speaker.
The Lagrange interpolating polynomial in fact passes through all the n points, http://mathworld.wolfram.com/LagrangeInterpolatingPolynomial.html. That said, for the 1-d problem, cubic splines are the preferred interpolator.
If you would rather fit a model, e.g. a linear, quadratic, or cubic polynomial, or another function, to your data, then I think you could still put constraints on the coefficients to ensure the approximating function passes through some selected points. Begin by choosing the model, and then solve the constrained least-squares fitting problem.
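A short sketch of the interpolation option in SciPy: a cubic spline interpolant (as recommended above) and, since the data are monotonic non-decreasing, a PCHIP interpolant, which also passes through every input point and additionally preserves monotonicity between them. The data points are made up:

import numpy as np
from scipy.interpolate import CubicSpline, PchipInterpolator

# Made-up monotonic, non-decreasing sample points
x = np.array([0.0, 1.0, 2.0, 4.0, 7.0])
y = np.array([0.0, 0.5, 0.6, 2.0, 2.1])

cs = CubicSpline(x, y)            # passes through every (x, y) exactly
pchip = PchipInterpolator(x, y)   # also exact at the data, and monotone in between

print(cs(1.0), pchip(1.0))        # both print 0.5: the input values are preserved
print(cs(1.5), pchip(1.5))        # interpolated values between the samples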
I have a problem involving 3d positioning - sort of like GPS. Given a set of known 3d coordinates (x,y,z) and their distances d from an unknown point, I want to find the unknown point. There can be any number of reference points, however there will be at least four.
So, for example, points are in the format (x,y,z,d). I might have:
(1,0,0,1)
(0,2,0,2)
(0,0,3,3)
(0,3,4,5)
And here the unknown point would be (0,0,0,0).
What would be the best way to go about this? Is there an existing library that supports 3d multilateration? (I have been unable to find one). Since it's unlikely that my data will have an exact solution (all of the 4+ spheres probably won't have a single perfect intersect point), the algorithm would need to be capable of approximating it.
So far, I was thinking of taking each subset of three points, triangulating the unknown based on those three, and then averaging all of the results. Is there a better way to do this?
You could take a non-linear optimisation approach, by defining a "cost" function that incorporates the distance error from each of your observation points.
Setting the unknown point at (x,y,z), and considering a set of N observation points (xi,yi,zi,di) the following function could be used to characterise the total distance error:
C(x,y,z) = sum( ((x-xi)^2 + (y-yi)^2 + (z-zi)^2 - di^2)^2 )    for all observation points i = 1 to N
This is the sum of the squared distance errors for all points in the set. (It's actually based on the error in the squared distance, so that there are no square roots!)
When this function is at a minimum the target point (x,y,z) would be at an optimal position. If the solution gives C(x,y,z) = 0 all observations would be exactly satisfied.
One approach to minimise this type of equation would be Newton's method. You'd have to provide an initial starting point for the iteration - possibly a mean value of the observation points (if they encircle (x,y,z)) or possibly an initial triangulated value from any three observations.
Edit: Newton's method is an iterative algorithm that can be used for optimisation. A simple version would work along these lines:
H(X(k)) * dX = G(X(k)); // solve a system of linear equations for the
// increment dX in the solution vector X
X(k+1) = X(k) - dX; // update the solution vector by dX
The G(X(k)) denotes the gradient vector evaluated at X(k), in this case:
G(X(k)) = [dC/dx
dC/dy
dC/dz]
The H(X(k)) denotes the Hessian matrix evaluated at X(k), in this case the symmetric 3x3 matrix:
H(X(k)) = [d^2C/dx^2 d^2C/dxdy d^2C/dxdz
d^2C/dydx d^2C/dy^2 d^2C/dydz
d^2C/dzdx d^2C/dzdy d^2C/dz^2]
You should be able to differentiate the cost function analytically, and therefore end up with analytical expressions for G,H.
Another approach - if you don't like derivatives - is to approximate G,H numerically using finite differences.
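As a rough sketch, here is the same cost function minimised with SciPy's quasi-Newton solver (BFGS, with numerically approximated gradients) rather than the hand-rolled Newton iteration above, using the example points from the question and the mean of the observations as the starting value:

import numpy as np
from scipy.optimize import minimize

# Observation points (x_i, y_i, z_i, d_i) from the question
obs = np.array([[1.0, 0.0, 0.0, 1.0],
                [0.0, 2.0, 0.0, 2.0],
                [0.0, 0.0, 3.0, 3.0],
                [0.0, 3.0, 4.0, 5.0]])
pts, d = obs[:, :3], obs[:, 3]

def cost(p):
    # Sum over all points of the squared error in the squared distance
    return np.sum((np.sum((p - pts) ** 2, axis=1) - d ** 2) ** 2)

x0 = pts.mean(axis=0)            # initial guess: mean of the observation points
res = minimize(cost, x0)         # BFGS by default; gradients approximated numerically
print(res.x)                     # close to (0, 0, 0) for this data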
Hope this helps.
Non-linear solution procedures are not required. You can easily linearise the system. If you take pair-wise differences
$(x-x_i)^2-(x-x_j)^2+(y-y_i)^2-(y-y_j)^2+(z-z_i)^2-(z-z_j)^2=d_i^2-d_j^2$
then a bit of algebra yields the linear equations
$(x_i-x_j) x + (y_i-y_j) y + (z_i-z_j) z = -\frac{1}{2}(d_i^2 - d_j^2 - ds_i^2 + ds_j^2)$,
where $ds_i$ is the distance from the $i^{th}$ sensor to the origin. These are the equations of the planes defined by intersecting the $i^{th}$ and the $j^{th}$ spheres.
For four sensors you obtain an over-determined linear system with $\binom{4}{2} = 6$ equations. If $A$ is the resulting matrix and $b$ the corresponding vector of right-hand sides, then you can solve the normal equations
$A^T A r = A^T b$
for the position vector $r$. This will work as long as your sensors are not coplanar.
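A small sketch of the linearised approach with NumPy, using the example points from the question; lstsq computes the least-squares solution, which is equivalent to solving the normal equations above:

import numpy as np
from itertools import combinations

# Observation points (x_i, y_i, z_i, d_i) from the question
obs = np.array([[1.0, 0.0, 0.0, 1.0],
                [0.0, 2.0, 0.0, 2.0],
                [0.0, 0.0, 3.0, 3.0],
                [0.0, 3.0, 4.0, 5.0]])
pts, d = obs[:, :3], obs[:, 3]
ds2 = np.sum(pts ** 2, axis=1)             # squared sensor distances to the origin

rows, rhs = [], []
for i, j in combinations(range(len(pts)), 2):    # all 6 pairwise differences
    rows.append(pts[i] - pts[j])
    rhs.append(-0.5 * (d[i] ** 2 - d[j] ** 2 - ds2[i] + ds2[j]))

A, b = np.array(rows), np.array(rhs)
r = np.linalg.lstsq(A, b, rcond=None)[0]   # least-squares solution of A r = b
print(r)                                    # close to (0, 0, 0) for this data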
If you can spend the time, an iterative solution should approach the correct solution pretty quickly. Pick any point at the correct distance from site A, then go round the set, working out the distance from each site to the point and adjusting the point so that it stays in the same direction from that site but at the correct distance. Continue until your required precision is met (or until the point is no longer moving far enough in each iteration to meet your precision, as can happen with approximate input data).
For an analytic approach, I can't think of anything better than what you already propose.