The logistic map is a classic example where floating point numbers fail. It's also a great example of how error propagates very badly in numerical algorithms in general, even when dealing with bignums. I was wondering if there are any known algorithms for taming this issue. Is there an efficient way to compute a logistic map that doesn't require naively computing it with huge precision?
It is a classic example because it is a chaotic system. The entire point of a chaotic system is that it shows unbelievable sensitivity to initial conditions. To get an answer within 5% of the correct value after n iterations requires starting with O(n) digits of the initial value. Not because your algorithm is bad, but because changing any of those digits changes what the answer should be.
So, no. While you can potentially speed up the calculation somewhat, you can't get away with starting with lower precision.
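To make that concrete, here is a small Python sketch (using the standard decimal module; the choices of x0 = 0.1, r = 4, the iteration count and the 1000-digit "reference" orbit are purely illustrative) showing how the attainable accuracy tracks the working precision. At r = 4 the error roughly doubles each iteration, so you lose about one decimal digit every three iterations or so.

```python
from decimal import Decimal, getcontext

R = Decimal(4)            # fully chaotic regime of the logistic map
N = 200                   # number of iterations

def logistic_orbit(x0, digits, n):
    """Iterate x -> R*x*(1-x) n times at the given decimal working precision."""
    getcontext().prec = digits
    x = +Decimal(x0)      # unary + rounds the start value to the current precision
    for _ in range(n):
        x = R * x * (1 - x)
    return x

reference = logistic_orbit('0.1', 1000, N)   # stand-in for the "true" orbit
for digits in (20, 50, 100, 200, 400):
    approx = logistic_orbit('0.1', digits, N)
    err = abs(approx - reference)
    print(f"{digits:4d} digits -> error after {N} iterations ~ {err:.3e}")
```

With 20 or 50 digits the value after 200 iterations is essentially noise; only once the starting precision grows in proportion to the iteration count does the error shrink, which is exactly the O(n)-digits behaviour described above.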
Related
Let's say I have N positive-valued 1-d functions. Does it take more function evaluations for a numerical minimizer to minimize their product in N-dimensional space than to do N individual 1-d minimizations?
If so, is there an intuitive way to understand this? Somehow I feel like both problems should be equal in complexity.
Minimizing their product is minimizing the sum of their logs. There are many algorithms for min(max)imizing N-dimensional functions. One is the old standby OPTIF9.
If you have to use hard limits, so you're minimizing in a box, that can be a lot harder, but you can usually avoid it.
The complexity is not linear in the number of variables. Typically n small problems are better than one big problem. Or in other words: making the problem twice as big (in terms of variables) will make it more than twice as expensive to solve.
In some special cases it may be somewhat beneficial to batch a few problems, mainly due to fixed overhead (some solvers do a lot of things before actually starting iterating).
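As a rough numerical illustration of both answers (the quadratic test functions, and the choice of scipy's bounded scalar minimizer versus Nelder-Mead for the joint problem, are arbitrary; the only point is to compare function-evaluation counts):

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

N = 6
centers = np.linspace(-2.0, 2.0, N)
funcs = [lambda x, c=c: 1.0 + (x - c) ** 2 for c in centers]   # positive 1-d functions

# N independent 1-d minimizations
evals_1d = 0
for f in funcs:
    res = minimize_scalar(f, bounds=(-10, 10), method="bounded")
    evals_1d += res.nfev

# one N-dimensional minimization of the product, via the sum of logs
def log_of_product(x):
    return sum(np.log(f(xi)) for f, xi in zip(funcs, x))

res_nd = minimize(log_of_product, x0=np.zeros(N), method="Nelder-Mead")

print("evaluations, N separate 1-d problems:", evals_1d)
print("evaluations, one N-d problem        :", res_nd.nfev)
```

Note that each evaluation of the joint objective already costs N one-dimensional evaluations, and on top of that the N-dimensional search typically needs many more iterations, which is the "more than twice as expensive" effect described above.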
I am currently estimating a Markov-switching model with many parameters by direct optimization of the log likelihood function (through the forward-backward algorithm). I do the numerical optimization using Matlab's genetic algorithm, since other approaches, such as the (mostly gradient- or simplex-based) algorithms in fmincon and fminsearchbnd, were not very useful, given that the likelihood function is not only of very high dimension but also shows many local maxima and is highly nonlinear.
The genetic algorithm seems to work very well. However, I am planning to further increase the dimension of the problem. I have read about an EM algorithm to estimate Markov-switching models. From what I understand, this algorithm produces a sequence of increasing log-likelihood values. It thus seems suitable for estimating models with very many parameters.
My question is whether the EM algorithm is suitable for my application involving many parameters (perhaps better suited than the genetic algorithm). Speed is not the main limitation (the genetic algorithm is already extremely slow), but I would need some certainty of ending up close to the global optimum and not running into one of the many local optima. Do you have any experience or suggestions regarding this?
The EM algorithm finds local optima, and does not guarantee that they are global optima. In fact, if you start it off with a HMM where one of the transition probabilities is zero, that probability will typically never change from zero, because those transitions will appear only with expectation zero in the expectation step, so those starting points have no hope of finding a global optimum which does not have that transition probability zero.
The standard workaround for this is to start it off from a variety of different random parameter settings, pick the best local optimum found, and hope for the best. You might be slightly reassured if a significant proportion of the runs converged to the same (or an equivalent) best local optimum, on the not very reliable theory that anything better would be found from at least the same fraction of random starts, and so would have shown up by now.
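In code, the multi-start strategy is nothing more than a loop like the following (a minimal sketch; fit_em and n_params are placeholders for your own EM routine and parameter count, and the strictly positive random starts avoid the zero-transition trap mentioned above):

```python
import numpy as np

def multistart_em(data, fit_em, n_params, n_starts=20, seed=0):
    """fit_em(data, init) is assumed to return (params, final_log_likelihood)."""
    rng = np.random.default_rng(seed)
    best_params, best_ll, lls = None, -np.inf, []
    for _ in range(n_starts):
        init = rng.uniform(0.05, 0.95, size=n_params)   # strictly positive random start
        params, ll = fit_em(data, init)
        lls.append(ll)
        if ll > best_ll:
            best_params, best_ll = params, ll
    # rough diagnostic: fraction of runs that reached (numerically) the best value found
    frac_at_best = float(np.mean(np.isclose(lls, best_ll, rtol=1e-8)))
    return best_params, best_ll, frac_at_best
```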
I haven't worked it out in detail, but the EM algorithm solves such a general set of problems that I expect that if it were guaranteed to find the global optimum, it would be capable of solving NP-complete problems with unprecedented efficiency.
I am trying to write a demo for an embedded processor, a multicore architecture that is very fast at floating point calculations. The problem is that my current hardware is the processor on an evaluation board, where the DRAM-to-chip bandwidth is somewhat limited and the board-to-PC link is very slow and inefficient.
Thus, when demonstrating big matrix multiplication, I can do, say, 128x128 matrices in a couple of milliseconds, but the I/O, which takes (many) seconds, kills the demo.
So, I am looking for some kind of calculation with higher complexity than n^3 (the more the better, but preferably easy to program and to explain/understand) to make the computation part more dominant in the time budget, where the dataset is preferably bounded to about 16 KB per thread (core).
Any suggestion?
PS: I think it is very similar to this question in its essence.
You could generate large (256-bit) numbers and factor them; that's commonly used in "stress-test" tools. If you specifically want to exercise floating point computation, you can build a basic n-body simulator with a Runge-Kutta integrator and run that.
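If you go the n-body route, a bare-bones sketch looks like this (classic RK4 with softened Newtonian gravity in arbitrary units; the body count, softening and step size are just placeholders). The work per step is O(n^2) on only O(n) data, which is the compute-versus-I/O profile you are after.

```python
import numpy as np

def accelerations(pos, soft=1e-3):
    d = pos[:, None, :] - pos[None, :, :]             # pairwise displacements
    r2 = (d ** 2).sum(-1) + soft ** 2
    np.fill_diagonal(r2, np.inf)                       # no self-interaction
    return -(d / r2[..., None] ** 1.5).sum(axis=1)     # unit masses, G = 1

def rk4_step(pos, vel, dt):
    def deriv(p, v):
        return v, accelerations(p)
    k1p, k1v = deriv(pos, vel)
    k2p, k2v = deriv(pos + 0.5 * dt * k1p, vel + 0.5 * dt * k1v)
    k3p, k3v = deriv(pos + 0.5 * dt * k2p, vel + 0.5 * dt * k2v)
    k4p, k4v = deriv(pos + dt * k3p, vel + dt * k3v)
    return (pos + dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p),
            vel + dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v))

rng = np.random.default_rng(1)
pos, vel = rng.normal(size=(256, 3)), np.zeros((256, 3))
for _ in range(100):                                   # more steps = more compute, same data
    pos, vel = rk4_step(pos, vel, dt=1e-3)
```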
What you can do is:
Declare a std::vector of int
Populate it with 0 to N-1, i.e. sorted ascending
Now keep calling std::next_permutation repeatedly until the range is back in sorted order, i.e. next_permutation returns false.
With N integers this will need O(N!) steps, and it is also deterministic.
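The same idea sketched in Python (the answer above is in terms of C++'s std::next_permutation; here itertools.permutations plays that role, and the per-permutation "work" is deliberately trivial):

```python
import itertools

def burn_cycles(n):
    total = 0
    for perm in itertools.permutations(range(n)):   # visits all n! orderings
        total += perm[0]                            # trivial, deterministic work
    return total

print(burn_cycles(10))   # 10! = 3,628,800 permutations; bump n to scale the runtime
```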
PageRank may be a good fit. Articulated as a linear algebra problem, one repeatedly squares a certain floating-point matrix of controllable size until convergence. In the graphical metaphor, one "ripples" change coming into each node onto the other edges. Both treatments can be made parallel.
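A sketch of the matrix treatment (the random graph, damping factor and tolerance are arbitrary illustrative choices; squaring a row-stochastic "Google" matrix repeatedly drives every row towards the PageRank vector):

```python
import numpy as np

def google_matrix(n, damping=0.85, seed=0):
    rng = np.random.default_rng(seed)
    A = (rng.random((n, n)) < 0.05).astype(float)    # random adjacency matrix
    A[A.sum(axis=1) == 0] = 1.0                       # dangling nodes link everywhere
    P = A / A.sum(axis=1, keepdims=True)              # row-stochastic transition matrix
    return damping * P + (1 - damping) / n            # damped "Google" matrix

G = google_matrix(512)
while True:
    G2 = G @ G                                        # O(n^3) work per squaring
    if np.abs(G2 - G).max() < 1e-10:
        break
    G = G2
print(G[0][:5], "...")                                # every row converges to the rank vector
```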
You could do a least trimmed squares fit. One use of this is to identify outliers in a data set. For example you could generate samples from some smooth function (a polynomial say) and add (large) noise to some of the samples, and then the problem is to find a subset H of the samples of a given size that minimises the sum of the squares of the residuals (for the polynomial fitted to the samples in H). Since there are a large number of such subsets, you have a lot of fits to do! There are approximate algorithms for this, for example here.
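One crude way to sketch the approximate approach in code (random elemental starts plus repeated "concentration" steps, loosely in the spirit of FAST-LTS; the polynomial degree, subset size h and restart counts are arbitrary):

```python
import numpy as np

def lts_polyfit(x, y, degree=3, h=None, n_starts=50, n_csteps=10, seed=0):
    rng = np.random.default_rng(seed)
    h = h or int(0.75 * len(x))                  # size of the trimmed subset H
    best_coef, best_loss = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(len(x), degree + 1, replace=False)   # random elemental start
        coef = np.polyfit(x[idx], y[idx], degree)
        for _ in range(n_csteps):
            r2 = (np.polyval(coef, x) - y) ** 2
            idx = np.argsort(r2)[:h]             # keep the h smallest squared residuals
            coef = np.polyfit(x[idx], y[idx], degree)
        loss = np.sort((np.polyval(coef, x) - y) ** 2)[:h].sum()
        if loss < best_loss:
            best_coef, best_loss = coef, loss
    return best_coef, best_loss

# synthetic data: a smooth cubic plus a handful of gross outliers
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 400)
y = 2 * x ** 3 - x + 0.05 * rng.normal(size=x.size)
y[rng.choice(x.size, 40, replace=False)] += 5 * rng.normal(size=40)
print(lts_polyfit(x, y)[0])                      # should recover roughly [2, 0, -1, 0]
```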
Well, one way to go would be to implement a brute-force solver for the Traveling Salesman problem in some M-space (with M > 1).
The brute-force solution is to just try every possible permutation and then calculate the total distance for each permutation, without any optimizations (including no dynamic programming tricks like memoization).
For N points, there are (N!) permutations (with a redundancy factor of at least (N-1), but remember, no optimizations). Each pair of points requires (M) subtractions, (M) multiplications and one square root operation to determine their Pythagorean distance apart. Each permutation has (N-1) pairs of points to calculate and add to the total distance.
So order of computation is O(M((N+1)!)), whereas storage space is only O(N).
Also, this should be neither too hard nor too intensive to parallelize across the cores, though it does take some overhead. (I can demonstrate, if needed.)
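A direct transcription of that scheme (random points in M-dimensional space, no memoisation, no pruning; N and M are arbitrary here, and the whole point is that it is deliberately expensive):

```python
import itertools
import numpy as np

def brute_force_tsp(points):
    n = len(points)
    best_len, best_order = np.inf, None
    for order in itertools.permutations(range(n)):      # all N! orderings
        length = 0.0
        for a, b in zip(order, order[1:]):               # N-1 legs per ordering
            length += np.sqrt(((points[a] - points[b]) ** 2).sum())
        if length < best_len:
            best_len, best_order = length, order
    return best_len, best_order

pts = np.random.default_rng(0).random((9, 3))            # N = 9 points, M = 3
print(brute_force_tsp(pts))                              # bump N to make it arbitrarily slow
```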
Another idea might be to compute a fractal map. Basically, choose a grid of whatever dimensionality you want. Then, for each grid point, do the fractal iteration to get the value. Some points might require only a few iterations; I believe some will iterate forever (chaos; of course, this can't really happen when you have a finite number of floating-point numbers, but still). The ones that don't stop you'll have to "cut off" after a certain number of iterations... just make this preposterously high, and you should be able to demonstrate a high-quality fractal map.
Another benefit of this is that grid cells are processed completely independently, so you will never need to do communication (not even at boundaries, as in stencil computations, and definitely not O(pairwise) as in direct N-body simulations). You can usefully use O(gridcells) number of processors to parallelize this, although in practice you can probably get better utilization by using gridcells/factor processors and dynamically scheduling grid points to processors on an as-ready basis. The computation is basically all floating-point math.
Mandelbrot/Julia and Lyapunov come to mind as potential candidates, but any should do.
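For example, an escape-time Mandelbrot sketch (grid size and iteration cap are arbitrary; every grid point is an independent, purely floating-point computation):

```python
import numpy as np

def mandelbrot_grid(nx, ny, max_iter):
    xs = np.linspace(-2.0, 1.0, nx)
    ys = np.linspace(-1.5, 1.5, ny)
    counts = np.zeros((ny, nx), dtype=np.int32)
    for j, y in enumerate(ys):
        for i, x in enumerate(xs):            # each (i, j) is completely independent
            c, z, k = complex(x, y), 0j, 0
            while k < max_iter and z.real * z.real + z.imag * z.imag <= 4.0:
                z = z * z + c
                k += 1
            counts[j, i] = k                  # interior points are "cut off" at max_iter
    return counts

counts = mandelbrot_grid(200, 200, 500)       # raise max_iter / grid size for a real demo
```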
Has anyone tried to apply a smoother to the evaluation metric before applying the L-method to determine the number of k-means clusters in a dataset? If so, did it improve the results? Or allow a lower number of k-means trials and hence much greater increase in speed? Which smoothing algorithm/method did you use?
The "L-Method" is detailed in:
Determining the Number of Clusters/Segments in Hierarchical Clustering/Segmentation Algorithms, Salvador & Chan
This calculates the evaluation metric for a range of different trial cluster counts. Then, to find the knee (which occurs for an optimum number of clusters), two lines are fitted using linear regression. A simple iterative process is applied to improve the knee fit - this uses the existing evaluation metric calculations and does not require any re-runs of the k-means.
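For concreteness, the two-line fit I mean is roughly the following (a simplified sketch, not my actual code, and only my reading of the Salvador & Chan weighting; the toy curve at the end is made up):

```python
import numpy as np

def line_rmse(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    return np.sqrt(np.mean((slope * x + intercept - y) ** 2))

def l_method_knee(ks, metric):
    ks, metric = np.asarray(ks, float), np.asarray(metric, float)
    n = len(ks)
    best_c, best_err = None, np.inf
    for c in range(2, n - 2):                          # at least two points per segment
        err = ((c / n) * line_rmse(ks[:c], metric[:c])
               + ((n - c) / n) * line_rmse(ks[c:], metric[c:]))   # length-weighted RMSE
        if err < best_err:
            best_c, best_err = c, err
    return ks[best_c - 1]                              # knee = last point of the left segment

# toy example: steep line, then a shallow line, with the knee at k = 8
ks = np.arange(2, 21)
metric = np.where(ks <= 8, 26.0 - 3.0 * ks, 2.0 - 0.05 * (ks - 8))
print(l_method_knee(ks, metric))                       # -> 8.0
```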
For the evaluation metric, I am using the reciprocal of a simplified version of the Dunn index. Simplified for speed (basically my diameter and inter-cluster calculations are simplified). The reciprocal is so that the index works in the correct direction (i.e. lower is generally better).
K-means is a stochastic algorithm, so typically it is run multiple times and the best fit chosen. This works pretty well, but when you are doing this for 1..N clusters the time quickly adds up. So it is in my interest to keep the number of runs in check. Overall processing time may determine whether my implementation is practical or not - I may ditch this functionality if I cannot speed it up.
I had asked a similar question in the past here on SO. My question was about coming up with a consistent way of finding the knee to the L-shape you described. The curves in question represented the trade-off between complexity and a fit measure of the model.
The best solution was to find the point with the maximum distance d, according to the figure shown in that answer.
Note: I haven't read the paper you linked to yet.
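In code, the construction I used is roughly this (a sketch; in practice you may want to rescale both axes to comparable ranges first, and the toy curve here is made up):

```python
import numpy as np

def knee_by_max_distance(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    p0 = np.array([x[0], y[0]])
    chord = np.array([x[-1], y[-1]]) - p0
    chord /= np.linalg.norm(chord)                       # unit vector along first->last point
    pts = np.column_stack([x, y]) - p0
    dist = np.abs(pts[:, 0] * chord[1] - pts[:, 1] * chord[0])   # perpendicular distances
    return int(np.argmax(dist))                          # index of the knee point

ks = np.arange(2, 21)
metric = 10.0 / ks                                       # toy L-shaped curve
print(ks[knee_by_max_distance(ks, metric)])
```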
Without resorting to asymptotic notation, is tedious step counting the only way to get the time complexity of an algorithm? And without counting the steps of each line of code, can we arrive at a big-O representation of any program?
Details: I am trying to find out the complexity of several numerical analysis algorithms to decide which will be best suited for solving a particular problem.
E.g. - from among the Regula Falsi or Newton-Raphson methods for solving equations, the intention is to evaluate the exact complexity of each method and then decide (putting in the value of 'n' or whatever arguments there are) which method is less complex.
The only way --- not the "easy" or hard way, but the only reasonable way --- to find the exact complexity of a complicated algorithm is to profile it. A modern implementation of an algorithm has a complex interaction with numerical libraries and with the CPU and its floating point unit. For instance, in-cache memory access is much faster than out-of-cache memory access, and on top of that there may be more than one level of cache. Counting steps is really much better suited to the asymptotic complexity that you say isn't enough for your purpose.
But, if you did want to count steps automatically, there are also ways to do that. You can add a counter increment command (like "bloof++;" in C) to every line of code, and then display the value at the end.
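The same instrumentation idea in Python rather than C (a bubble sort is used here purely as a stand-in for whatever code is being counted):

```python
def bubble_sort_counted(a):
    steps = 0
    a = list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            steps += 1                       # the "bloof++" counter
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a, steps

for n in (100, 200, 400):
    _, steps = bubble_sort_counted(range(n, 0, -1))
    print(n, steps)                          # counts roughly quadruple as n doubles
```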
You should also know about the more refined time complexity expression, f(n)*(1+o(1)), that is also useful for analytical calculations. For instance n^2+2*n+7 simplifies to n^2*(1+o(1)). If the constant factor is what bothers you about usual asymptotic notation O(f(n)), this refinement is a way to keep track of it and still throw out negligible terms.
The 'easy way' is to simulate it. Try your algorithms with lots of values of n and lots of different data, plot the results, then match the curve on the graph to an equation.
Your results may not be strictly correct and they're only as valid as your ability to generate good test data but for most cases this will work.
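One simple way to do the curve matching: time the code for a range of n and fit a power law to the log-log data (timeit and the sort call here are just stand-ins for whatever algorithm you are measuring):

```python
import timeit
import numpy as np

ns = [1000, 2000, 4000, 8000, 16000, 32000]
times = []
for n in ns:
    data = np.random.default_rng(0).random(n)
    t = timeit.timeit(lambda: np.sort(data, kind="mergesort"), number=20)
    times.append(t / 20)

slope, _ = np.polyfit(np.log(ns), np.log(times), 1)
print(f"empirical exponent ~ {slope:.2f}")   # ~1 for n log n-ish growth, ~2 for quadratic
```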
E.g. - from among the Regula Falsi or Newton-Raphson methods for solving equations, the intention is to evaluate the exact complexity of each method and then decide (putting in the value of 'n' or whatever arguments there are) which method is less complex.
I don't think it's possible to answer this question in general for nonlinear solvers. You could count the exact number of computations per iteration, but you're never going to know in general how many iterations it will take for each solver to converge. There are other complications, like needing the Jacobian for Newton's method, which could make computing the complexity even more difficult.
To sum up, the most efficient nonlinear solver is always dependent on the problem you're solving. If the variety of problems you're solving is very limited, doing a bunch of experiments with different solvers and measuring the number of iterations and CPU time will probably give you more useful information.
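A tiny experiment of that kind, just to show the shape of it (the test equation cos(x) = x, the bracket, the starting point and the tolerance are all arbitrary; real conclusions need your own problem set):

```python
import math

f = lambda x: math.cos(x) - x
df = lambda x: -math.sin(x) - 1

def newton(x, tol=1e-12, max_iter=100):
    for k in range(1, max_iter + 1):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

def regula_falsi(a, b, tol=1e-12, max_iter=10000):
    fa, fb = f(a), f(b)
    for k in range(1, max_iter + 1):
        c = b - fb * (b - a) / (fb - fa)      # secant point inside the bracket
        fc = f(c)
        if abs(fc) < tol:
            return c, k
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return c, max_iter

print("Newton-Raphson:", newton(1.0))         # (root, iterations)
print("regula falsi  :", regula_falsi(0.0, 1.0))
```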