Which is more efficient, atan2 or sqrt?

There are some situations where there are multiple methods to calculate the same value.
Right now I am coming up with an algorithm to "expand" a 2D convex polygon. To do this I want to find which direction to perturb each vertex. In order to produce a result which expands the polygon with a "skin" of the same thickness all around, the amount to perturb in that direction also depends on the angle at the vertex. But right now I'm just worried about the direction.
One way is to use atan2: Let B be my vertex, A is the previous vertex, and C is the next vertex. My direction is the "angular average" of angle(B-A) and angle(B-C).
Another way involves sqrt: unit(B-A) + unit(B-C), where unit(X) is X/length(X), yields a vector pointing in my direction.
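For concreteness, here is a minimal sketch of the two options (the Vec2 type and function names are just illustrative):

    #include <cmath>

    struct Vec2 { double x, y; };

    // Method 1: average the two edge angles obtained from atan2.
    // Note: a correct "angular average" must handle wrap-around at +/-pi;
    // the naive average below is only valid when the angles don't straddle it.
    double directionViaAtan2(Vec2 A, Vec2 B, Vec2 C) {
        double a1 = std::atan2(B.y - A.y, B.x - A.x);   // angle(B-A)
        double a2 = std::atan2(B.y - C.y, B.x - C.x);   // angle(B-C)
        return 0.5 * (a1 + a2);
    }

    // Method 2: sum of unit vectors -- two sqrt calls, no trig.
    Vec2 directionViaSqrt(Vec2 A, Vec2 B, Vec2 C) {
        Vec2 u{B.x - A.x, B.y - A.y}, v{B.x - C.x, B.y - C.y};
        double lu = std::sqrt(u.x * u.x + u.y * u.y);
        double lv = std::sqrt(v.x * v.x + v.y * v.y);
        return {u.x / lu + v.x / lv, u.y / lu + v.y / lv};   // unnormalized direction
    }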
I'm leaning towards method number 2 because averaging angle values requires a bit of work. But I am basically choosing between two calls to atan2 and two calls to sqrt. Which is generally faster? What about if I was doing this in a shader program?
I'm not trying to optimize my program per se, I'd like to know how these functions are generally implemented (e.g. in the standard c libraries) so I'll be able to know, in general, what is the better choice.
From what I know, both sqrt and trig functions require an iterative method to arrive at an answer. This is the reason why we try to avoid them when possible. People have come up with "approximate" functions which use lookup-tables and interpolation and such to try to produce faster results. I will of course never bother with these unless I find strong evidence of bottlenecking in my code due to just these routines or routines heavily involving them, but the differences between sqrt, trig funcs, and inverse trig funcs may be relevant for the sake of discussion.

With typical libraries on common modern hardware, sqrt is faster than atan2. Cases where atan2 is faster may exist, but they are few and far between.
Recent x86 implementations actually have a fairly efficient sqrt instruction, and on that hardware the difference can be quite dramatic. The Intel Optimization Manual quotes the single-precision square root as 14 cycles on Sandy Bridge, and the double-precision square root as 22 cycles. With a good math library, atan2 timings are commonly in the neighborhood of 100 cycles or more.
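If you want numbers for your own machine, a crude micro-benchmark sketch like the following is enough to see the gap (results vary a lot with compiler flags and the math library):

    #include <chrono>
    #include <cmath>
    #include <cstdio>

    int main() {
        const int N = 10000000;
        double sumSqrt = 0.0, sumAtan = 0.0;

        auto t0 = std::chrono::steady_clock::now();
        for (int i = 1; i <= N; ++i) sumSqrt += std::sqrt((double)i);
        auto t1 = std::chrono::steady_clock::now();
        for (int i = 1; i <= N; ++i) sumAtan += std::atan2((double)i, (double)(N - i + 1));
        auto t2 = std::chrono::steady_clock::now();

        using ms = std::chrono::milliseconds;
        // Printing the sums keeps the compiler from optimizing the loops away.
        std::printf("sqrt : %lld ms (%f)\n",
                    (long long)std::chrono::duration_cast<ms>(t1 - t0).count(), sumSqrt);
        std::printf("atan2: %lld ms (%f)\n",
                    (long long)std::chrono::duration_cast<ms>(t2 - t1).count(), sumAtan);
    }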

It sounds like you have all the information you need to profile and find out for yourself.
If you aren't looking for an exact result, and don't mind the additional logic required to make it work, you can use specialized instructions such as RSQRTSS and RSQRTPS, which compute an approximate 1/sqrt, folding the square root and the division into one cheap operation.
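A sketch of how that might look with SSE intrinsics (the Vec2f type is just for illustration; the raw instruction gives roughly 12 bits of precision, so one Newton-Raphson step is added to refine it):

    #include <xmmintrin.h>   // _mm_set_ss, _mm_rsqrt_ss, _mm_cvtss_f32

    // Approximate 1/sqrt(x) via RSQRTSS, refined with one Newton-Raphson step.
    static inline float fastRsqrt(float x) {
        float r = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
        return r * (1.5f - 0.5f * x * r * r);
    }

    struct Vec2f { float x, y; };

    // Normalizing with rsqrt needs only multiplications: no sqrt, no division.
    static inline Vec2f unit(Vec2f v) {
        float inv = fastRsqrt(v.x * v.x + v.y * v.y);
        return {v.x * inv, v.y * inv};
    }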

Indeed, sqrt is better than atan2, and 1/sqrt is better than sqrt.
For a non-built-in solution, you may be interested in the CORDIC approximations.
But in your case, you should develop the complete formulas and optimize them globally before drawing any conclusion, because the transcendental function(s) are just a fraction of the computation.
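For reference, a floating-point sketch of the CORDIC idea applied to atan2 (a real implementation would use fixed point, shifts, and a precomputed table of atan(2^-i) values instead of calling std::atan inside the loop):

    #include <cmath>

    // Vectoring-mode CORDIC: rotate (x, y) onto the positive x-axis while
    // accumulating the rotation angle; the accumulated angle converges to atan2(y, x).
    double cordicAtan2(double y, double x, int iterations = 32) {
        const double kPi = 3.14159265358979323846;
        double angle = 0.0;
        if (x < 0.0) {                       // reduce to the right half-plane
            angle = (y >= 0.0) ? kPi : -kPi;
            x = -x;
            y = -y;
        }
        double power = 1.0;                  // 2^-i
        for (int i = 0; i < iterations; ++i) {
            double step = std::atan(power);  // in hardware this is a table lookup
            if (y > 0.0) {
                double nx = x + y * power;
                y -= x * power;
                x = nx;
                angle += step;
            } else {
                double nx = x - y * power;
                y += x * power;
                x = nx;
                angle -= step;
            }
            power *= 0.5;
        }
        return angle;
    }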

Related

Does the polygon intersection code in CGAL always use GMP's rational number library?

I am currently working on determining whether two polygons intersect with each other. I have found an example in CGAL's documentation webpage:
http://doc.cgal.org/latest/Boolean_set_operations_2/Boolean_set_operations_2_2do_intersect_8cpp-example.html
However, this code employs GMP's rational number library, and hence it is relatively slow. In my problem, I need to determine polygon intersections thousands of times. Therefore, I wonder whether there is an alternative which uses only floating-point arithmetic so that it can run much faster?
Thanks a lot.
CGAL states: "CGAL combines floating point arithmetic with exact arithmetic, in order to be efficient and reliable. CGAL has a built-in number type for that, but Gmp and Mpfr provide a faster solution, and we recommend to use them." (1)
Also, in my experience, that is what CGAL is for: exact computation.
If you use CGAL because it supplies the polygon intersection features directly, maybe an alternative library would be a possibility. Here are some from an alternative thread.
One final thought: you can also speed up your code within CGAL. In your case I would suggest computing the bounding box for every polygon and first doing an intersection test with those. That alone will eliminate a lot of polygon pairs.
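A sketch of that pre-filter in plain C++ (the Pt type and the exactDoIntersect callback stand in for whatever polygon representation and exact test you already use):

    #include <vector>
    #include <algorithm>
    #include <limits>

    struct Pt  { double x, y; };
    struct Box { double xmin, ymin, xmax, ymax; };

    // Axis-aligned bounding box of a polygon given as a list of vertices.
    Box boundingBox(const std::vector<Pt>& poly) {
        const double inf = std::numeric_limits<double>::infinity();
        Box b{ inf, inf, -inf, -inf };
        for (const Pt& p : poly) {
            b.xmin = std::min(b.xmin, p.x);  b.ymin = std::min(b.ymin, p.y);
            b.xmax = std::max(b.xmax, p.x);  b.ymax = std::max(b.ymax, p.y);
        }
        return b;
    }

    bool boxesOverlap(const Box& a, const Box& b) {
        return a.xmin <= b.xmax && b.xmin <= a.xmax &&
               a.ymin <= b.ymax && b.ymin <= a.ymax;
    }

    // Cheap rejection before the expensive exact predicate: disjoint boxes
    // imply disjoint polygons, so the exact test only runs for real candidates.
    template <typename ExactTest>
    bool polygonsMightIntersect(const std::vector<Pt>& P, const std::vector<Pt>& Q,
                                ExactTest exactDoIntersect) {
        if (!boxesOverlap(boundingBox(P), boundingBox(Q)))
            return false;
        return exactDoIntersect(P, Q);
    }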

strategy for fitting angle data

I have a set of angles. The distribution could be roughly described as:
there are usually several values very close (0.0-1.0 degrees apart) to the correct solution
there are also noisy values that are very far from the correct result, even in the opposite direction
Is there a common solution/strategy for such a problem?
For multidimensional data I would use RANSAC, but I have the impression that it is unusual to apply RANSAC to 1-dimensional data. Another problem is computing the mean of an angle. I have read some other posts about how to calculate the mean of angles by using vectors, but I just wonder if there isn't a particular fitting solution which deals with both issues already.
You can use RANSAC even in this case, since all the necessary ingredients (minimal sample, error of a data point, consensus set) are there. Your minimal sample will be 1 point, a randomly picked angle (although you could simply try all of them; that might be fast enough). Then all the angles (data points) with error (you can use just the absolute distance, modulo 360) less than some threshold (e.g. 1 degree) will be considered inliers, i.e. part of the consensus set.
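A sketch of that scheme, trying every angle as the hypothesis and finishing with the vector-based circular mean of the inliers (the threshold and degree units are just example choices):

    #include <vector>
    #include <cmath>

    // Smallest absolute difference between two angles in degrees, in [0, 180].
    double angularDistance(double a, double b) {
        double d = std::fabs(std::fmod(a - b, 360.0));
        return d > 180.0 ? 360.0 - d : d;
    }

    // "RANSAC" with a minimal sample of one angle: every sample is tried as the
    // hypothesis, the largest consensus set wins, and the final estimate is the
    // circular mean of those inliers (unit vectors handle the wrap-around).
    double fitAngle(const std::vector<double>& anglesDeg, double thresholdDeg = 1.0) {
        const double kPi = 3.14159265358979323846;
        std::vector<double> bestInliers;
        for (double hypothesis : anglesDeg) {
            std::vector<double> inliers;
            for (double a : anglesDeg)
                if (angularDistance(a, hypothesis) <= thresholdDeg)
                    inliers.push_back(a);
            if (inliers.size() > bestInliers.size())
                bestInliers = inliers;
        }
        double sx = 0.0, sy = 0.0;
        for (double a : bestInliers) {
            sx += std::cos(a * kPi / 180.0);
            sy += std::sin(a * kPi / 180.0);
        }
        return std::atan2(sy, sx) * 180.0 / kPi;   // degrees, in (-180, 180]
    }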
If you want to play with it a bit more, you can make the results more stable by adding some local optimisation, see e.g.:
Lebeda, Matas, Chum: Fixing the Locally Optimized RANSAC, BMVC 2012.
You could try other approaches, e.g. the median, or fitting a mixture of a Gaussian and a uniform distribution, but you would have to deal with the periodicity of the signal somehow, so I guess RANSAC should be your choice.

Fast find of all local maximums in C++

Problem
I have a formula for evaluating a 1D polynomial (a joint function). I want to find all local maxima of that function within a given range.
My approach
My current solution is that I evaluate my function at a certain number of points in the range and then go through these points, remembering the points where the function changed from rising to falling. Of course I can change the number of samples within the interval, but I want to find all maxima with as few samples as possible.
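In code, that sampling approach might look roughly like this (f stands for whatever evaluates the polynomial; the sample count is the parameter I am trying to keep low):

    #include <vector>
    #include <functional>

    // Sample f on [lo, hi] and report the sample points where the values switch
    // from rising to falling, i.e. candidate local maxima.
    std::vector<double> sampledMaxima(const std::function<double(double)>& f,
                                      double lo, double hi, int samples) {
        std::vector<double> xs(samples), ys(samples), maxima;
        for (int i = 0; i < samples; ++i) {
            xs[i] = lo + (hi - lo) * i / (samples - 1);
            ys[i] = f(xs[i]);
        }
        for (int i = 1; i + 1 < samples; ++i)
            if (ys[i] > ys[i - 1] && ys[i] >= ys[i + 1])
                maxima.push_back(xs[i]);     // a maximum lies somewhere near xs[i]
        return maxima;
    }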
Question
Can you suggest an effective algorithm?
Finding all the maxima of an unknown function is hard. You can never be sure that a maximum you found is really just one maximum or that you have not overlooked a maximum somewhere.
However, if something is known about the function, you can try to exploit that. The simplest case is, of course, if the function is known to be rational and of bounded degree. Up to a rational function of degree five it is possible to derive all four extrema from a closed formula; see http://en.wikipedia.org/wiki/Quartic_equation#General_formula_for_roots for details. Most likely you don't want to implement that, but for linear, quadratic, and cubic roots the closed formulas are feasible and can be used to find the maxima of up to a quartic function.
That is only the simplest information that might be known; other interesting information is whether you can give a bound on the second derivative. This would allow you to reduce the sampling density when you find a strong slope.
You may also be able to exploit information from how you intend to use the maxima you found. It can give you clues about how much precision you need. Is it sufficient to know that a point is near a maximum? Or that a point is flat? Is it really a problem if a saddle point is classified as a maximum? Or if a maximum right next to a turning point is overlooked? And how much is the allowable error margin?
If you cannot exploit information like this, you are thrown back to sampling your function in small steps and hoping you don't make too much of an error.
Edit:
You mention in the comments that your function is in fact a kernel density estimation. This gives you at least the following information:
Unless the kernel is unlimited in extent, your estimated function will be a piecewise function: any point on it will only be influenced by a precisely calculable number of measurement points.
If the kernel is based on a rational function, the resulting estimated function will be piecewise rational. And it will be of the same degree as the kernel!
If the kernel is the uniform kernel, your estimated function will be a step function.
This case needs special handling because there won't be any maxima in the mathematical sense. However, it also makes your job really easy.
If the kernel is the triangular kernel, your estimated function will be a piecewise linear function.
If the kernel is the Epanechnikov kernel, your estimated function will be a piecewise quadratic function.
In all these cases it is next to trivial to produce the piecewise functions and to find their maxima.
If the kernel is of too high a degree or is transcendental, you still know the measurements that your estimation is based on, and you know the kernel's properties. This allows you to derive a heuristic for how dense your maxima can get.
At the very least, you know the first and second derivative of the kernel.
In principle, this allows you to calculate the first and second derivative of the estimated function at any point.
In the case of a local kernel, it might be more prudent to calculate the first derivative and an upper bound to the second derivative of the estimated function at any point.
With this information, it should be possible to constrain the search to the regions where there are maxima and avoid oversampling of the slopes.
As you see, there is a lot of useful information that you can derive from the knowledge of your function, and which you can use to your advantage.
The local maxima are among the roots of the first derivative. To isolate those roots in your working interval you can use the Sturm theorem, and proceed by dichotomy. In theory (using exact arithmetic) it gives you all real roots.
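A simplified sketch of the dichotomy step (bracketing sign changes of the derivative on a coarse grid instead of building proper Sturm sequences, so it can miss closely spaced roots):

    #include <vector>
    #include <functional>

    // Given the derivative f' of the polynomial, bracket its roots by sign changes
    // on a grid and refine each bracket by bisection. A sign change from + to -
    // marks a local maximum of f.
    std::vector<double> localMaxima(const std::function<double(double)>& df,
                                    double lo, double hi,
                                    int grid = 256, double tol = 1e-10) {
        std::vector<double> maxima;
        double prevX = lo, prevD = df(lo);
        for (int i = 1; i <= grid; ++i) {
            double x = lo + (hi - lo) * i / grid;
            double d = df(x);
            if (prevD > 0.0 && d < 0.0) {            // maximum somewhere in (prevX, x)
                double a = prevX, b = x;
                while (b - a > tol) {                // plain bisection on f'
                    double m = 0.5 * (a + b);
                    (df(m) > 0.0 ? a : b) = m;
                }
                maxima.push_back(0.5 * (a + b));
            }
            prevX = x;
            prevD = d;
        }
        return maxima;
    }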
An equivalent approach is to express your polynomial in the Bezier/Bernstein basis and look for changes of signs of the coefficients (hull property). Dichotomic search can be efficiently implemented by recursive subdivision of the Bezier.
There are several classical root-finding algorithms available for polynomials, such as Laguerre's method, which usually find the complex roots as well.

Optimization of multivariate function with an initial solution close to the optimum

I was wondering if anyone knows what kind of algorithm could be used in my case. I have already run the optimizer on my multivariate function and found a solution to my problem, assuming that my function is regular enough. I slightly perturb the problem and would like to find the optimal solution, which is close to my last solution. Is there any very fast algorithm for this case, or should I just fall back to a regular one?
We probably need a bit more information about your problem; but since you know you're near the right solution, and if derivatives are easy to calculate, Newton-Raphson is a sensible choice, and if not, Conjugate-Gradient may make sense.
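A minimal two-variable sketch of that Newton iteration, warm-started from the previous optimum (the gradient and Hessian callbacks are assumed to be supplied by you; names are illustrative):

    #include <cmath>
    #include <functional>

    struct Vec2 { double x, y; };
    struct Mat2 { double a, b, c, d; };   // row-major: [[a, b], [c, d]]

    // Solve H * s = g for s (2x2 case only, via the explicit inverse).
    Vec2 solve2x2(const Mat2& H, const Vec2& g) {
        double det = H.a * H.d - H.b * H.c;
        return { ( H.d * g.x - H.b * g.y) / det,
                 (-H.c * g.x + H.a * g.y) / det };
    }

    // Newton iteration x <- x - H^{-1} g, started from the unperturbed optimum.
    Vec2 newtonFromWarmStart(const std::function<Vec2(Vec2)>& grad,
                             const std::function<Mat2(Vec2)>& hess,
                             Vec2 x, int maxIter = 20, double tol = 1e-10) {
        for (int k = 0; k < maxIter; ++k) {
            Vec2 g = grad(x);
            if (std::hypot(g.x, g.y) < tol) break;   // gradient ~ 0: stationary point
            Vec2 s = solve2x2(hess(x), g);
            x = { x.x - s.x, x.y - s.y };
        }
        return x;
    }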
If you already have an iterative optimizer (for example, based on Powell's direction set method, or CG), why don't you use your initial solution as a starting point for the next run of your optimizer?
EDIT: regarding your comment: if calculating the Jacobian or the Hessian matrix gives you performance problems, try BFGS (http://en.wikipedia.org/wiki/BFGS_method); it avoids calculating the Hessian completely. At http://www.alglib.net/optimization/lbfgs.php you will find a (free-for-non-commercial) implementation of BFGS, and a good description of the details as well.
And don't expect to get anything from finding your initial solution with a less sophisticated algorithm.
So this is all about unconstrained optimization. If you need information about constrained optimization, I suggest you google for "SQP".
There are a bunch of algorithms for finding the roots of equations. If you know approximately where the root is, there are algorithms that will get you arbitrarily close very quickly, in a number of steps logarithmic in the required precision, or better.
One is Newton's method
another is the Bisection Method
Note that these algorithms are for single variable functions, but can be expanded to multivariate functions.
Every minimization algorithm performs better (read: performs at all) if you have a good initial guess. The initial guess for the perturbed problem will in your case be the minimum point of the unperturbed problem.
Then you have to specify your requirements: you want speed. What accuracy do you want? Does space efficiency matter? Most importantly, what information do you have: only the value of the function, or do you also have the derivatives (possibly second derivatives)?
Some background on the problem would help too. Looking for a smooth function which has been discretized will be very different from looking for hundreds of unrelated parameters.
Global information (i.e. is the function convex, is there a guaranteed global minimum or many local ones, etc.) can be left aside for now. If you have trouble finding the minimum point of the perturbed problem, this is something you will have to investigate though.
Answering these questions will allow us to select a particular algorithm. There are many choices (and trade-offs) for multivariate optimization.
Also, which is quicker will very much depend on the problem (rather than on the algorithm), and should be determined by experimentation.
Though I don't know much about using computers in this capacity, I remember an article that used neuroevolutionary techniques to find "best-fit" equations relatively efficiently, given a known function complexity (linear, Nth-polynomial, exponential, logarithmic, etc.) and a set of point plots. As I recall, it was one of the earliest uses of what we now know as computational neuroevolution; because the functional complexity (and thus the number of terms) of the equation is known and fixed, a static neural net can be used and seeded with your closest values, then "mutated" and tested for fitness, with heuristics to make new nets closer to existing nets with high fitness. Using multithreading, many nets can be created, tested and evaluated in parallel.

Efficient, correct and optimized algorithm to find intersection between two lines

What is the most efficient algorithm to find intersection point between two lines?
You are given four points A, B, C, D.
Find the intersection point between AB and CD.
Optimize the algorithm as much as you can.
There are two approaches for this: one uses the dot product and the other uses the slope-intercept form of a line. Which one is better?
This might sound like a repeated question, but what I want to ask is which approach is better and more efficient, with better complexity.
This doesn't require any algorithm, just the solution of two intersecting lines. That's a basic mathematics problem, not a computing one (it's just algebraic manipulation).
That said, here's a discussion you should find helpful.
I prefer Mr. Bourke's website for these types of questions. Here's his article on line intersection:
Intersection point of two lines
Given how trivial this is, it's pretty tough to optimize.
I guess the best you can do is make sure that everything is in the CPU cache, so that you can run those math ops at full speed. You may be tempted to precompute some of the differences (P2 - P1), but it's hard to say whether the memory lookup for that will be any faster than just performing the subtraction itself. CPUs can do a subtraction or multiplication in one op, whereas memory lookups, if they miss the cache, can take a couple of orders of magnitude longer.
It's not that trivial.
As far as I remember, the Pascal example ( http://actionsnippet.com/?p=956 ) does not work with collinear points.
I was not able to find a correctly implemented algorithm, so I had to write my own.
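For reference, a sketch of a segment intersection test that does handle the collinear cases (orientation tests via cross products; once a proper crossing is reported, the actual intersection point follows from the usual parametric formula):

    #include <algorithm>

    struct Pt { double x, y; };

    // Twice the signed area of triangle (p, q, r):
    // > 0 counter-clockwise, < 0 clockwise, == 0 collinear.
    double cross(const Pt& p, const Pt& q, const Pt& r) {
        return (q.x - p.x) * (r.y - p.y) - (q.y - p.y) * (r.x - p.x);
    }

    // For r collinear with segment pq: does r lie on the segment?
    bool onSegment(const Pt& p, const Pt& q, const Pt& r) {
        return std::min(p.x, q.x) <= r.x && r.x <= std::max(p.x, q.x) &&
               std::min(p.y, q.y) <= r.y && r.y <= std::max(p.y, q.y);
    }

    // True if segments AB and CD intersect, including touching and collinear overlap.
    bool segmentsIntersect(Pt A, Pt B, Pt C, Pt D) {
        double d1 = cross(A, B, C), d2 = cross(A, B, D);
        double d3 = cross(C, D, A), d4 = cross(C, D, B);
        if (((d1 > 0) != (d2 > 0)) && ((d3 > 0) != (d4 > 0)) &&
            d1 != 0 && d2 != 0 && d3 != 0 && d4 != 0)
            return true;                                   // proper crossing
        if (d1 == 0 && onSegment(A, B, C)) return true;    // degenerate cases
        if (d2 == 0 && onSegment(A, B, D)) return true;
        if (d3 == 0 && onSegment(C, D, A)) return true;
        if (d4 == 0 && onSegment(C, D, B)) return true;
        return false;
    }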
