Python: How to solve DAE with Jacobian efficiently?

I am trying to use the Assimulo package to solve a set of differential algebraic equations (DAEs). I am trying to use an algorithm similar to that shown here. However, there does not seem to be an option to pass in a sparse matrix. My Jacobian matrix is very large, approximately 3000 x 3000. Do you know if there is a way to solve my DAEs more computationally efficiently?

In my experience with sparse ODE systems (more precisely, with systems of semi-discretized PDEs), using an iterative linear solver greatly improves numerical efficiency. As far as I know, Assimulo doesn't let you provide a Jacobian sparsity pattern, but changing the linear solver is another way to tackle this.
You would do something like:
from assimulo.problem import Explicit_Problem
from assimulo.solvers import CVode

model = Explicit_Problem(ode_function, y0=y_init, t0=t_init)
simulator = CVode(model)
simulator.linear_solver = 'SPGMR'  # Krylov solver: no dense Jacobian factorization
I'm not sure whether this also applies to DAE systems, but I think it's worth a try.
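To see why an iterative Krylov method pays off on systems of this size, here is a rough scipy sketch (not Assimulo itself; the diagonally dominant test matrix is just a stand-in for a 3000 x 3000 sparse Jacobian):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 3000
# Sparse, diagonally dominant test matrix, same size as the Jacobian above
A = sp.diags([-1.0, 4.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# GMRES needs only matrix-vector products -- no dense factorization of A
x, info = spla.gmres(A, b)
residual = np.linalg.norm(A @ x - b)
```

Since the solver only touches A through products A @ v, memory and work stay proportional to the number of nonzeros rather than n^2, which is exactly what SPGMR buys you inside CVode.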

Related

Scaling on constraints and variables for solving NLP problems, using CONOPT4

I am currently using the CONOPT4 solver to solve a nonlinear programming problem. The nonlinearity is of the form z = x*y and z = x/y, and all variables are continuous. I specified some scaling factors and solving performance improved a lot. However, when I further refined some scaling factors to project the values into the range 0.01 to 100, the solving time became longer, which is really weird. I cannot provide my code here, and I know it's impossible to give a specific reason without it. Could you share your experience with generally tuning scaling factors when using the CONOPT solver? Thanks a lot.

Symbolic vs Numeric Math - Performance

Do symbolic math calculations (especially for solving nonlinear polynomial systems) cause huge performance (calculation speed) disadvantage compared to numeric calculations? Are there any benchmark/data about this?
Found a related question: Symbolic computation vs. numerical computation
Another one: Computational Efficiency of Forward Mode Automatic vs Numeric vs Symbolic Differentiation
I am the individual who answered the Scicomp question you reference in your question. I personally am not aware of any empirical metrics performed to compare run-time performance for symbolic versus numerical solutions to systems of polynomial equations.
However, it should be fairly intuitive that symbolic solutions will have a bit more overhead for most aspects of solving the problem due to things such as manipulation of terms in the equation symbolically, searching how to simplify/rearrange equations to make them easier to solve, searching through known closed form solutions, etc. One major issue with symbolic solvers is that you may not have a closed form solution you can find and use, so solving it numerically would have to happen either way.
The only way I can see symbolic solvers outperforming numerical solutions in terms of run-time is if the symbolic solver can quickly enough recognize your problem as one with a known analytical solution, or if it eventually arrives at the solution while the numerical solver never does (i.e., it diverges).
Given you can find a numerical solver that converges, I think the numerical case will generally be much more efficient since there's just much less overhead to make progress in refining your solution. Since you mention solving systems of polynomial equations, I suspect there are also some tailored algorithms for your type of problem that may be superior to typical nonlinear equation solving schemes.
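As a concrete illustration of the numerical route, here is a small scipy sketch for a two-equation polynomial system (the system and starting guess are my own example, not from the question); note the typical trade-off that a numerical solver is fast but needs a starting point:

```python
import numpy as np
from scipy.optimize import fsolve

# Small polynomial system: x^2 + y^2 = 4 and x*y = 1
def F(p):
    x, y = p
    return [x**2 + y**2 - 4.0, x * y - 1.0]

# Numeric solve: fast, but requires an initial guess and finds one root at a time
root = fsolve(F, x0=[2.0, 0.5])
```

A symbolic solver would instead return all roots in closed form, at the cost of the overhead discussed above.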
This is not a direct answer to the question but a suggested course correction.
While it is possible to evaluate math expressions in a purely numeric means or in a purely symbolic means, it is also possible to use a hybrid approach.
This is known as Symbolic-numeric computation.
Maple is one software package that has this ability.
Note: I have never used Maple so I can't add more.
Searching for packages
I find I get better results when searching for math packages that use symbolic-numeric computation by searching for the name of the package combined with Symbolic-numeric computation, e.g.
wolfram symbolic-numeric computation
A specific example related to neural networks
In the world of neural networks one has to be able to calculate derivatives; if a derivative can be simplified before it is evaluated, the cost of evaluation goes down. Since simplifying the derivative is a one-time action while the evaluation occurs thousands to millions of times, the simplification is done symbolically and the evaluation is then done numerically. Theano is a software package that does this specifically for use with neural networks.
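The same symbolic-then-numeric workflow can be sketched with SymPy (my stand-in here; the answer's examples were Maple and Theano): differentiate and simplify once symbolically, then compile the result to a fast numeric function that is evaluated many times.

```python
import numpy as np
import sympy as sym

x = sym.symbols('x')
expr = sym.sin(x)**2 + sym.cos(x)**2 + x**3

# One-time symbolic work: differentiate and simplify (reduces to 3*x**2)
dexpr = sym.simplify(sym.diff(expr, x))

# Repeated numeric work: compile to a NumPy-backed function and evaluate
dfun = sym.lambdify(x, dexpr, 'numpy')
vals = dfun(np.linspace(0.0, 2.0, 5))
```

Evaluating the simplified 3*x**2 is far cheaper than re-deriving or evaluating the unsimplified expression on every call, which is exactly the hybrid pay-off described above.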

A good parameter optimization algorithm for a limited number of points with variance

I'm trying to meta-optimize an algorithm which has almost a dozen constants. I guess some form of genetic algorithm should be used. However, the algorithm itself is quite heavy and probabilistic by nature (a version of ant colony optimization). Thus calculating the fitness for some set of parameters is quite slow and the results include a lot of variance. Even the order of magnitude of some of the parameters is not exactly clear, so the distribution on some components will likely need to be logarithmic.
Would someone have ideas about suitable algorithms for this problem? I.e. it would need to converge with a limited number of measurement points and also be able to handle randomness in the measured fitness. Also, the easier it is to implement with Java the better of course. :)
If you can express your model algebraically (or as differential equations), consider trying a derivative-based optimization method. These have the theoretical properties you desire and are much more computationally efficient than black-box/derivative-free methods. If you have a MATLAB license, give fmincon a try. Note: fmincon works much better if you supply derivative information. Other modeling environments include Pyomo, CasADi, and Julia/JuMP, which automatically calculate derivatives and interface with powerful optimization solvers.
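To illustrate the "supply the derivative" advice without a MATLAB license, here is a scipy sketch (scipy stands in for fmincon; the Rosenbrock objective is my own example) where an analytic gradient is passed to the solver:

```python
import numpy as np
from scipy.optimize import minimize

# Rosenbrock objective; minimum at (1, 1)
def f(p):
    x, y = p
    return (1 - x)**2 + 100 * (y - x**2)**2

# Analytic gradient: supplying it avoids finite-difference evaluations
def grad(p):
    x, y = p
    return np.array([-2 * (1 - x) - 400 * x * (y - x**2),
                     200 * (y - x**2)])

res = minimize(f, x0=[-1.2, 1.0], jac=grad, method="BFGS")
```

Modeling environments like CasADi or JuMP generate such gradients automatically via automatic differentiation, which is what makes them attractive for larger models.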

Why Mahout doesn't yet have Linear Regression

I am just starting to work with Mahout, and one thing which perplexed me a great deal is the lack of Linear Regression. Even logistic regression, which is much harder, is supported to some degree with research going on, but it's all silent on linear regression front!
From what I understand, OLS is one of the easiest problems to solve -
Y = Xb + e
has a linear regression solution of b = (X^T X)^(-1) X^T Y, where X^T is transpose of X, and if the matrix (X^T X) turns out singular (i.e. not invertible) then it's perfectly fine to show error message even though a solution using generalized inverse exists.
Computing both X^T X and X^T Y is just a matter of sums and sums of products of elements, which is probably the easiest thing to do with MapReduce, as I understand it.
(Which makes me think... is there any module that supports the native matrix operations required to compute regression coefficients? That would make a regression module unnecessary indeed...)
Am I missing something which makes regression hard to compute in Mahout?
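The sums-of-products accumulation sketched in the question can be written out in numpy (my illustration, not Mahout code), chunking the rows the way a map step would before a reduce combines the partial sums:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
true_b = np.array([1.5, -2.0, 0.5])
Y = X @ true_b + 0.01 * rng.normal(size=1000)

# Accumulate X^T X and X^T Y over row chunks, as a map/reduce job would
XtX = np.zeros((3, 3))
XtY = np.zeros(3)
for chunk in np.array_split(np.arange(1000), 10):
    Xi, Yi = X[chunk], Y[chunk]
    XtX += Xi.T @ Xi          # per-chunk sums of products
    XtY += Xi.T @ Yi

# Solve the normal equations (solving is preferable to forming the inverse)
b = np.linalg.solve(XtX, XtY)
```

Since X^T X is only p x p (here 3 x 3), the final solve is trivial no matter how many rows were streamed through, which is the appeal of this formulation for distributed data.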
I don't know if there's a "why" to things like this. It just doesn't exist.
However, I think it's the opposite of what you suppose; it's too "easy". Unless you're solving a system of ten million equations, it's probably not at a scale that Hadoop is called for. There are plenty of existing packages that can do this really well on one machine. If you want something in Java from Apache, look at Commons Math, for example.
Not to say there couldn't be a fine non-distributed version in the project, but since the emphasis is mostly big-scale and Hadoop, that's probably "why".
I think it's simply because N x N matrix inversion has O(N^3) complexity and is subject to numerical instability, which is quite common with sparse high-dimensional matrices.
Does anyone have another explanation or can someone confirm my thoughts?

Which optimization algorithm should I use to optimize the weights of a multilayer perceptron?

Actually these are 3 questions:
Which optimization algorithm should I use to optimize the weights of a multilayer perceptron, if I knew...
1) only the value of the error function? (blackbox)
2) the gradient? (first derivative)
3) the gradient and the hessian? (second derivative)
I heard CMA-ES should work very well for 1) and BFGS for 2), but I would like to know if there are any alternatives, and I don't know which algorithm to take for 3).
Ok, so this doesn't really answer the question you initially asked, but it does provide a solution to the problem you mentioned in the comments.
Problems like dealing with a continuous action space are normally not dealt with via changing the error measure, but rather by changing the architecture of the overall network. This allows you to keep using the same highly informative error information while still solving the problem you want to solve.
Some possible architectural changes that could accomplish this are discussed in the solutions to this question. In my opinion, I'd suggest using a modified Q-learning technique where the state and action spaces are both represented by self organizing maps, which is discussed in a paper mentioned in the above link.
I hope this helps.
I solved this problem finally: there are some efficient algorithms for optimizing neural networks in reinforcement learning (with fixed topology), e.g. CMA-ES (CMA-NeuroES) or CoSyNE.
The best optimization algorithm for supervised learning seems to be Levenberg-Marquardt (LMA). This is an algorithm that is specifically designed for least-squares problems. When there are many connections and weights, LMA does not work very well because the required memory is huge. In this case I am using Conjugate Gradient (CG).
The exact Hessian matrix does not accelerate optimization. Algorithms that approximate the second derivative are faster and more efficient (BFGS, CG, LMA).
edit: For large scale learning problems often Stochastic Gradient Descent (SGD) outperforms all other algorithms.
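For a feel of what plain SGD looks like on perceptron weights, here is a minimal numpy sketch (my own toy: a 2-8-1 network on the XOR pattern, one random sample per update; hyperparameters are illustrative, and convergence is typical but not guaranteed for this seed):

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR pattern for a tiny 2-8-1 perceptron
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
lr = 0.5

def forward(x):
    h = np.tanh(x @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))   # sigmoid output
    return h, out

def mse():
    return float(np.mean((forward(X)[1] - Y) ** 2))

loss_before = mse()
for step in range(20000):
    i = rng.integers(0, 4)                 # stochastic: one random sample per update
    x, y = X[i:i + 1], Y[i:i + 1]
    h, out = forward(x)
    d_out = (out - y) * out * (1 - out)    # backprop through squared error + sigmoid
    d_h = (d_out @ W2.T) * (1 - h ** 2)    # through tanh hidden layer
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.ravel()
    W1 -= lr * x.T @ d_h;   b1 -= lr * d_h.ravel()
loss_after = mse()
```

On large datasets the same per-sample update is what lets SGD make progress long before a batch method has finished a single full-gradient evaluation.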