I used a GAM to address spatial autocorrelation and included a smooth function only for location. I would like to know how I can calculate the marginal effect of the smooth term.
I'm working on a real-time ballistics simulation of many particles under the effect of highly non-uniform wind. The wind data is obtained from CFD in the form of a discretized 2D vector field (an unstructured mesh where each grid point has an associated vector giving the direction and magnitude of the air velocity).
The problem is that I need to be able to extract the wind vector at any position that a particle occupies, so that aerodynamic drag can be computed and injected into ballistics physics. This is trivial if the wind data can be approximated by an analytical/numerical vector field where a vector can be computed with an algebraic expression. However, the wind data I'm working with is quite complex and there doesn't seem to be any way to approximate it.
I have two ideas:
Find a way to interpolate the vector field every time each particle's position is updated. This sounds computationally expensive, so I'm not sure it can be done in real time. Also, the mesh is unstructured, and I'm not sure whether 2D interpolation can be done on this kind of mesh.
Just pick the grid point closest to the particle's position and get the vector from there (given that the mesh is fine enough for this to accurately represent the actual vector field). This will then turn into a real-time nearest-neighbor problem with rapid and numerous queries.
I'm not sure whether these are the only two solutions to this problem, or whether they can be done in real time at all. How should I go about solving this?
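For reference, both ideas are easy to prototype with SciPy; this is only a sketch with stand-in data (random points and vectors instead of my real mesh):

    import numpy as np
    from scipy.interpolate import LinearNDInterpolator
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(0)
    points = rng.uniform(0.0, 1.0, size=(1000, 2))    # stand-in for CFD mesh node positions
    vectors = rng.normal(size=(1000, 2))              # stand-in for wind vectors at those nodes

    # Idea 1: linear interpolation on the unstructured mesh (Delaunay-based, built once).
    interp = LinearNDInterpolator(points, vectors)
    def wind_interpolated(positions):          # positions: (M, 2) particle positions
        return interp(positions)               # (M, 2) vectors; NaN outside the mesh hull

    # Idea 2: nearest-neighbor lookup with a KD-tree (also built once).
    tree = cKDTree(points)
    def wind_nearest(positions):
        _, idx = tree.query(positions)         # indices of the closest mesh nodes
        return vectors[idx]

Batching all particle positions into a single query per time step is what would make either approach a candidate for real time.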
Say we are making a program to render the plot of a function (a black box) provided by the user as a sequence of line segments. We want to find the minimum number of samples of the function so that the resulting image "looks" like the function (the exact meaning of "looks" here is part of the question). A naive approach might be to sample at fixed intervals, but we can probably do better than that, e.g. by sampling the "curvy bits" more than the "linear bits". Are there systematic approaches to, or research on, this problem?
This reference, which uses a combined sampling method, may be helpful. Before that, its related-work section explains other sampling methods:
There are several strategies for plotting the function y = f(x) on an interval Ω = [a, b]. The naive approach, based on sampling f at a fixed number of equally spaced points, is described in [20]. Simple functions suffer from oversampling, while oscillating curves are under-sampled; these issues are mentioned in [14]. Another approach, based on the interval constraint plot constructing a hull of the curve, was described in [6], [13], [20]. The automated detection of a useful domain and range of the function is mentioned in [41]; the generalized interval arithmetic approach is described in [40].

A significant refinement is represented by adaptive sampling, which provides a higher sampling density in higher-curvature regions. There are several algorithms for curve interpolation preserving the speed, for example [37], [42], [43]. The adaptive feed-rate technique is described in [44]. An early implementation in the Mathematica software is presented in [39]. By reducing data, these methods are very efficient for curve plotting. The polygonal approximation of a parametric curve based on adaptive sampling is mentioned in several papers. The refinement criteria, as well as the recursive approach, are discussed in [15]. An approximation by polygonal curves is described in [7], a robust method for the geometric and spatial approximation of implicit curves can be found in [27], [10], and affine arithmetic working on triangulated models in [32]. However, map projections are never defined by implicit equations. Similar approaches can be used for graph drawing [21].

Other techniques, based on approximation by breakpoints, can be found in many papers: [33], [9], [3]; these approaches are used for the polygonal approximation of closed curves and applied in computer vision.
Hence, these are the referenced methods; they define some measure of a "good" plot and introduce an approach to optimize the plot based on that measure:
constructing a hull of the curve
automated detection of a useful domain and a range of the function
adaptive sampling: providing a higher sampling density in the higher-curvature regions
approximation by the polygonal curves
affine arithmetic working in the triangulated models
combined sampling: a polygonal approximation of the parametric curve that handles discontinuities; the method reconstructs and plots f(x) by splitting the domain into subintervals without discontinuities, a typical problem solvable by a recursive approach (a minimal recursive sketch follows this list)
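For concreteness, here is a minimal recursive adaptive-sampling sketch in Python. The flatness test (distance of the midpoint sample from the chord between the endpoints) is a deliberate simplification of the refinement criteria discussed in the references above, and the tolerance and depth parameters are illustrative:

    import numpy as np

    def adaptive_sample(f, a, b, tol=1e-3, max_depth=12):
        # Recursively subdivide [a, b]; keep subdividing while the midpoint sample
        # deviates from the straight chord by more than tol (simplified criterion).
        def recurse(x0, y0, x1, y1, depth):
            xm = 0.5 * (x0 + x1)
            ym = f(xm)
            chord = 0.5 * (y0 + y1)                 # chord value at the midpoint
            if depth >= max_depth or abs(ym - chord) <= tol:
                return [(x1, y1)]
            return (recurse(x0, y0, xm, ym, depth + 1)
                    + recurse(xm, ym, x1, y1, depth + 1))
        y0, y1 = f(a), f(b)
        return [(a, y0)] + recurse(a, y0, b, y1, 0)

    # Example: many samples near the oscillations of sin(1/x), few on the flat part.
    pts = adaptive_sample(lambda x: np.sin(1.0 / x), 0.1, 2.0, tol=1e-3)

Production plotters usually perturb the split point randomly so that symmetric functions do not fool the midpoint test.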
Let's say I want to estimate the camera pose for a given image I, and I have a set of measurements (e.g. 2D points u_i and their associated 3D coordinates P_i) for which I want to minimize the error (e.g. the sum of squared reprojection errors).
My question is: How do I compute the uncertainty on my final pose estimate ?
To make my question more concrete, consider an image I from which I extracted 2D points u_i and matched them with 3D points P_i. Denote by T_w the camera pose for this image (which I will be estimating) and by π_T the transformation mapping the 3D points to their projected 2D points. Here is a little drawing to clarify things:
My objective statement is as follows:
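\hat{T}_w = \arg\min_{T} \sum_i \left\| u_i - \pi_T(P_i) \right\|^2

i.e. the pose that minimizes the sum of squared reprojection errors.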
There exist several techniques to solve the corresponding non-linear least-squares problem; suppose I use the following (approximate pseudo-code for the Gauss-Newton algorithm):
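(Roughly, an iteration of this shape; the helpers residuals and jacobian, and the pose-update convention, are assumptions rather than my actual code:)

    import numpy as np

    def gauss_newton(T, residuals, jacobian, n_iters=10):
        # residuals(T): stacked reprojection errors u_i - pi_T(P_i), shape (2N,)
        # jacobian(T):  Jr = d residuals / d pose, shape (2N, 6)
        for _ in range(n_iters):
            r = residuals(T)
            Jr = jacobian(T)
            delta = np.linalg.solve(Jr.T @ Jr, -Jr.T @ r)   # normal equations
            T = update_pose(T, delta)   # assumed helper, e.g. apply exp(delta) to T
        return T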
I read in several places that Jr^T Jr could be considered an estimate of the covariance matrix for the pose estimate. Here is a list of more specific questions:
Can anyone explain why this is the case and/or point to a scientific document explaining it in detail?
Should I be using the value of Jr from the last iteration, or should the successive Jr^T Jr matrices somehow be combined?
Some people say that this is actually an optimistic estimate of the uncertainty, so what would be a better way to estimate the uncertainty?
Thanks a lot, any insight on this will be appreciated.
The full mathematical argument is rather involved, but in a nutshell it goes like this:
The product J^T J of the Jacobian matrix of the reprojection error at the optimum with itself is an approximation of the Hessian matrix of the least-squares error. The approximation drops the terms containing the second derivatives of the residuals (these are weighted by the residuals themselves and are small near the optimum). See here (pp. 800-801) for a proof.
The inverse of the Hessian matrix is an approximation of the covariance matrix of the estimated parameters in a neighborhood of their optimal values, under a local linear approximation of the parameters-to-errors transformation (p. 814 of the above reference).
I do not know where the "optimistic" comment comes from. The main assumption underlying the approximation is that the cost function (the reprojection error) behaves approximately quadratically in a small neighborhood of the optimum.
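Concretely, with the Jacobian Jr and the residual vector r at the final iterate, the usual plug-in estimate looks like this (a sketch; the 6-parameter pose and the variable names follow the pseudo-code in the question):

    import numpy as np

    def pose_covariance(r, Jr):
        # r:  (2N,) stacked reprojection residuals at the optimum
        # Jr: (2N, 6) Jacobian of the residuals w.r.t. the pose parameters
        n_obs, n_params = Jr.shape
        sigma2 = (r @ r) / (n_obs - n_params)        # estimated measurement-noise variance
        return sigma2 * np.linalg.inv(Jr.T @ Jr)     # approximate 6x6 pose covariance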
I was trying to make an application that compares the difference between two images in Java with OpenCV. After trying various approaches I came across the algorithm called the demons algorithm.
To me it seems to express the difference between the images as a transformation at each location, but I couldn't understand it, since the references I found were too complex for me.
Even if the demons algorithm does not do what I need, I'm interested in learning it.
Can anyone explain simply what happens in the demons algorithm, and how to write simple code that applies it to two images?
I can give you an overview of general algorithms for deformable image registration; demons is one of them.
There are three components to such an algorithm: a similarity metric, a transformation model, and an optimization algorithm.
A similarity metric is used to compute pixel-based or patch-based similarity. Common similarity measures are SSD and normalized cross-correlation for mono-modal images, while information-theoretic measures like mutual information are used for multi-modal image registration.
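For example, the two mono-modal measures are only a few lines of NumPy (a sketch for equally sized grayscale patches):

    import numpy as np

    def ssd(a, b):
        # Sum of squared differences: lower means more similar.
        return np.sum((a.astype(float) - b.astype(float)) ** 2)

    def ncc(a, b):
        # Normalized cross-correlation: close to 1.0 means strongly correlated patches.
        a0 = a.astype(float) - a.mean()
        b0 = b.astype(float) - b.mean()
        return np.sum(a0 * b0) / (np.linalg.norm(a0) * np.linalg.norm(b0) + 1e-12)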
In deformable registration, a regular grid is generally superimposed over the image, and the grid is deformed by solving an optimization problem formulated so that the similarity metric plus a smoothness penalty on the transformation is minimized. Once the grid deformations are known, the final transformation at the pixel level is computed by B-spline interpolation of the grid, so that the transformation is smooth and continuous.
There are two general approaches to solving the optimization problem: some people use discrete optimization and solve it as an MRF optimization problem, while others use gradient descent; I think demons uses gradient descent.
In MRF-based approaches, the unary cost is the cost of deforming each node in the grid, computed as the similarity between patches; the pairwise cost, which imposes smoothness of the grid, is generally a Potts or truncated quadratic potential that ensures neighboring grid nodes have almost the same displacement. Once you have the unary and pairwise costs, you feed them to an MRF optimization algorithm to get the displacements at the grid level, then use B-spline interpolation to compute the pixel-level displacement. This process is repeated in a coarse-to-fine fashion over several scales, and the algorithm is also run several times at each scale (reducing the displacement at each node every time).
Gradient-descent-based methods formulate an energy function from the similarity metric and the grid transformation computed over the image, then compute the gradient of that energy. The energy is minimized by iterative gradient descent; however, these approaches can get stuck in local minima and are quite slow.
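To make this concrete, here is a heavily simplified single-scale sketch of a Thirion-style demons update in Python (NumPy/SciPy, float grayscale images of equal size assumed); real implementations add multi-resolution pyramids, proper stopping criteria, and diffeomorphic variants:

    import numpy as np
    from scipy.ndimage import gaussian_filter, map_coordinates

    def demons(fixed, moving, n_iters=50, sigma=2.0):
        # Returns a displacement field (ux, uy) such that sampling `moving` at
        # (x + ux, y + uy) approximately matches `fixed`.
        gy, gx = np.gradient(fixed)                          # gradient of the fixed image
        ys, xs = np.mgrid[0:fixed.shape[0], 0:fixed.shape[1]].astype(float)
        uy = np.zeros_like(fixed)
        ux = np.zeros_like(fixed)
        for _ in range(n_iters):
            warped = map_coordinates(moving, [ys + uy, xs + ux], order=1)
            diff = warped - fixed                            # the "demons force" magnitude
            denom = gx ** 2 + gy ** 2 + diff ** 2 + 1e-9
            ux -= diff * gx / denom                          # optical-flow-like update
            uy -= diff * gy / denom
            ux = gaussian_filter(ux, sigma)                  # regularization: smooth the field
            uy = gaussian_filter(uy, sigma)
        return ux, uy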
Some popular tools are DROP and Elastix; ITK also provides components for this.
If you want to know more about algorithms related to deformable image registration, I recommend taking a look at FAIR (and its guide book). FAIR is a toolbox for MATLAB, so you will have examples to help you understand the theory.
http://www.cas.mcmaster.ca/~modersit/FAIR/
Then, if you specifically want to see a demons example, here is another toolbox:
http://www.mathworks.es/matlabcentral/fileexchange/21451-multimodality-non-rigid-demon-algorithm-image-registration
I'm interested in how the dual input in a sensor-fusion setup is modeled in a Kalman filter.
Say, for instance, that you have an accelerometer and a gyro and want to present the "horizon level", like in an airplane; there is a good demo of something like this here.
How do you actually harvest the two sensors' positive properties and minimize the negative ones?
Is this modeled in the Observation Model matrix (usually symbolized by capital H)?
Remark: This question was also asked without any answers at math.stackexchange.com
Usually, the sensor fusion problem is derived from Bayes' theorem. Your estimate (in this case the horizon level) is a weighted sum of your sensors' readings, characterized by the sensor model. For dual sensors, you have two common choices: model a two-sensor system and derive the Kalman gain for each sensor (using the system model as the predictor), or run two correction stages using different observation models. You should take a look at Bayesian predictors (a little more general than the Kalman filter), which are derived precisely from minimizing the variance of an estimate given two different information sources. If you take a weighted sum and minimize the variance of that sum for two sensors, you get the Kalman gain.
The properties of the sensor can be "seen" in two parts of the filter. First, you have the error matrix for your observations. This is the matrix that represents the noise in the sensor's observations (it is assumed to be zero-mean Gaussian noise, which isn't too big an assumption, given that during calibration you can achieve zero-mean noise).
The other important matrix is the observation covariance matrix. This matrix gives you insight into how good the sensor is at giving you information (information meaning something "new" and not dependent on the other sensor's readings).
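As a toy illustration of the "two correction stages" option, here is a sketch in Python with a 2-state filter (angle and angular rate); all the matrices and noise values below are illustrative assumptions, not tuned numbers:

    import numpy as np

    dt = 0.01
    F = np.array([[1.0, dt], [0.0, 1.0]])                          # state transition: angle += rate*dt
    Q = np.diag([1e-6, 1e-4])                                      # process noise
    H_gyro = np.array([[0.0, 1.0]]); R_gyro = np.array([[1e-4]])   # gyro observes the rate
    H_acc  = np.array([[1.0, 0.0]]); R_acc  = np.array([[1e-2]])   # accelerometer observes the angle

    def correct(x, P, z, H, R):
        # One standard Kalman correction stage; the gain K weights the sensor by its noise R.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (z - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        return x, P

    def step(x, P, z_gyro, z_acc):
        x, P = F @ x, F @ P @ F.T + Q                              # predict
        x, P = correct(x, P, z_gyro, H_gyro, R_gyro)               # correction stage 1: gyro
        x, P = correct(x, P, z_acc, H_acc, R_acc)                  # correction stage 2: accelerometer
        return x, P

    x = np.array([0.0, 0.0]); P = np.eye(2)                        # initial state and covariance
    x, P = step(x, P, np.array([0.01]), np.array([0.02]))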
About "harvesting the good characteristics", what you should do is do a good calibration and noise characterization (is that spelled ok?) of the sensors. The best way to get a Kalman Filter to converge is to have a good noise model for your sensors, and that is 100% experimental. Try to determine the variance for your system (dont always trust datasheets).
Hope that helps a bit.
The gyro measures the rate of angle change (e.g. in radians per second), while from the accelerometer reading you can calculate the angle itself. Here is a simple way of combining these measurements:
At every gyro reading received:
angle_radians+=gyro_reading_radians_per_sec * seconds_since_last_gyro_reading
At every accelerometer reading received:
angle_radians+=0.02 * (angle_radians_from_accelerometer - angle_radians)
The 0.02 constant is for tuning - it selects the tradeoff between noise rejection and responsiveness (you can't have both at the same time). It also depends on the accuracy of both sensors, and the time intervals at which new readings are received.
These two lines of code implement a simple 1-dimensional (scalar) Kalman filter. It assumes that
the gyro has very low noise compared to the accelerometer (true for most consumer-grade sensors); therefore we do not model gyro noise at all, but instead use the gyro in the state-transition model (usually denoted by F)
accelerometer readings are received at generally regular time intervals, and the accelerometer noise level (usually R) is constant
angle_radians has been initialised with an initial estimate (e.g. by averaging angle_radians_from_accelerometer over some time)
therefore the estimate covariance (P) and the optimal Kalman gain (K) are also constant, which means we do not need to keep the estimate covariance in a variable at all.
As you see, this approach is simplified. If the above assumptions are not met, you should learn some Kalman filter theory, and modify the code accordingly.
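Put together in Python, the two update rules above might look like this (read_gyro_rad_per_s and read_accel_angle_rad are assumed stand-ins for your sensor driver, and 0.02 is the same tuning constant as above):

    import time

    def read_gyro_rad_per_s():       # stub for the real gyro driver (assumption)
        return 0.0

    def read_accel_angle_rad():      # stub for the real accelerometer driver (assumption)
        return 0.0

    angle_radians = read_accel_angle_rad()       # initial estimate from the accelerometer
    last_t = time.monotonic()

    for _ in range(1000):                        # or an endless loop in a real system
        now = time.monotonic()
        dt = now - last_t
        last_t = now
        # Gyro step: integrate the rate into the state (the state-transition model F).
        angle_radians += read_gyro_rad_per_s() * dt
        # Accelerometer step: pull the estimate toward the measured angle (constant gain K = 0.02).
        angle_radians += 0.02 * (read_accel_angle_rad() - angle_radians)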
The horizon line is G^T (u, v, f)^T = 0, where G is the gravity vector, u and v are image-centered coordinates, and f is the focal length. Now the pros and cons of the sensors: the gyro is very fast and accurate but drifts, while the accelerometer is less accurate but (if calibrated) has zero bias and doesn't drift (given no acceleration except gravity). They measure different things: the accelerometer measures acceleration and thus orientation relative to the gravity vector, while the gyro measures rotation speed and thus the change in orientation. To convert the gyro output to an orientation one has to integrate its values (thankfully it can be sampled at a high rate, e.g. 100-200 Hz). Because of this, a Kalman filter, which is supposed to be linear, is not directly applicable to the gyro; for now we can simplify sensor fusion to a weighted sum of readings and predictions.
You can combine the two readings, the accelerometer angle and the integrated gyro angle (together with the model prediction), using weights that are inversely proportional to the data variances. You will also have to use a compass occasionally, since the accelerometer doesn't tell you much about the azimuth, but I guess that is irrelevant for calculating the horizon line. The system should be responsive and accurate; for this purpose, whenever the orientation changes fast, the weights for the gyro should be large; when the system settles down and rotation stops, the weights for the accelerometer go up, allowing more integration of the zero-bias readings and killing the drift from the gyro.
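A sketch of this weighted-sum idea in Python (the variances and the rate threshold are illustrative assumptions, to be replaced by measured values):

    def fuse(angle_gyro_integrated, angle_accel, var_gyro, var_accel):
        # Weights inversely proportional to the variances of the two angle estimates.
        w_gyro = 1.0 / var_gyro
        w_accel = 1.0 / var_accel
        return (w_gyro * angle_gyro_integrated + w_accel * angle_accel) / (w_gyro + w_accel)

    def adaptive_variances(gyro_rate, var_gyro=1e-4, var_accel=1e-2, rate_threshold=0.5):
        # During fast rotation trust the gyro more (inflate the accelerometer variance);
        # when rotation stops, let the accelerometer pull out the gyro drift.
        if abs(gyro_rate) > rate_threshold:            # rad/s, illustrative threshold
            return var_gyro, var_accel * 10.0
        return var_gyro, var_accel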