Fast source of high-quality randomness

Evolutionary algorithms depend heavily on good randomness. Unfortunately, good randomness sources are slow (and hence so is the algorithm).
The question is: if I take one highest-quality number and use it as a seed for a poor-quality (but fast) random generator, how 'random' will the result be?

I have conducted some research into this area previously. Evolutionary algorithms are part of a family of metaheuristic algorithms to which the particle swarm algorithm also belongs. A study has been conducted into the effectiveness of random number generators on the particle swarm algorithm: "Impact of the quality of random numbers generators on the performance of particle swarm optimization". It should apply directly to your evolutionary algorithm.
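The seeding scheme the question describes can be sketched in Python (a minimal illustration of the idea, not taken from the linked study):

```python
import os
import random

def fast_seeded_rng():
    """Seed a fast PRNG (Python's Mersenne Twister) with one high-quality
    number from the OS entropy pool, then draw cheaply from it. The stream
    is only as unpredictable as the PRNG allows, but its statistical
    quality is typically adequate for evolutionary-algorithm use."""
    seed = int.from_bytes(os.urandom(16), "big")  # one slow, high-quality draw
    return random.Random(seed)

rng = fast_seeded_rng()
values = [rng.random() for _ in range(5)]  # fast draws thereafter
```

Note that this buys speed, not cryptographic strength: the fast generator's statistical defects remain, which is exactly what the linked PSO study measures.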

Related

Is there an accepted "current industry standard best" of stochastic optimization? (Simulated annealing, Particle swarm optimization, etc)

Sorting algorithms are well enough understood that Java Collections uses some flavor of MergeSort or Timsort. (Even though it is possible to hand-craft collections that "fight" the algorithm and perform poorly, those choices are "often enough ideal" for most real-world sorting situations.)
Statistical ML algorithms kinda/sorta have winners as well, e.g. "You won't go wrong first trying Logistic Regression, Random Forests, and SVM."
Q: Is there a similar "best of breed" choice between the various global optimum approximation functions? For example, it seems that particle swarm optimization (PSO) is several simulated annealing processes running in parallel and sharing information...

Impact of fitness function domain selection in multi-objective evolutionary optimization

I am using evolutionary algorithms e.g. the NSGA-II algorithm to solve unconstrained optimization problems with multiple objectives.
As my fitness functions sometimes have very different domains (e.g. f1(x) generates fitness values within [0..1] and f2(x) within [10000..10000000]) I am wondering if this has an effect on the search behaviour of the selected algorithm.
Does the selection of the fitness function domain (e.g. scaling all domains to a common domain from [lb..ub]) impact the solution quality and the speed of finding good solutions? Or is there no general answer to this question?
Unfortunately, I could not find anything on this topic. Any hints are welcome!
Your question is related to the selection strategy implemented in the algorithm. In the case of the original NSGA-II, selection is made using a mixture of Pareto rank and crowding distance. While the Pareto rank (i.e. the non-dominated front id of a point) does not change when the numerical values are scaled by some constant, the crowding distance does.
So the answer is yes: if your second objective is in [10000..10000000], its contribution to the crowding distance might eat up that of the other objective.
In algorithms such as NSGA-II, units count!
I have just come across your question and I have to disagree with the previous answer. If you read the paper carefully, you will see that the crowding distance is supposed to be calculated in the normalized objective space, exactly so that one objective does not dominate another.
My PhD advisor is Kalyanmoy Deb, who proposed NSGA-II, and I have implemented the algorithm myself (available in our evolutionary multi-objective optimization framework pymoo). So I can state with certainty that normalization is supposed to be incorporated into the algorithm.
If you are curious about a vectorized crowding distance implementation feel free to have a look at pymoo/algorithms/nsga2.py in pymoo on GitHub.
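To see what normalization does here, below is a minimal NumPy sketch of the crowding distance computed in the normalized objective space (an illustration of the idea, not pymoo's actual implementation):

```python
import numpy as np

def crowding_distance(F):
    """Crowding distance for objective matrix F (n points x m objectives),
    computed per objective on values normalized by that objective's range,
    so an objective in [10000..10000000] cannot swamp one in [0..1]."""
    n, m = F.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(F[:, j])
        fj = F[order, j]
        dist[order[0]] = dist[order[-1]] = np.inf  # boundary points are kept
        span = fj[-1] - fj[0]
        if span == 0:
            continue  # degenerate objective contributes nothing
        # interior points: gap between neighbours, normalized by the range
        dist[order[1:-1]] += (fj[2:] - fj[:-2]) / span
    return dist
```

Because each objective's gaps are divided by that objective's range, multiplying one objective by a constant leaves the distances unchanged.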

Expectation vs. direct numerical optimization of the likelihood function for estimating high-dimensional Markov-switching/HMM models

I am currently estimating a Markov-switching model with many parameters by direct optimization of the log-likelihood function (through the forward-backward algorithm). I do the numerical optimization using MATLAB's genetic algorithm, since other approaches, such as the (mostly gradient- or simplex-based) algorithms in fmincon and fminsearchbnd, were not very useful, given that the likelihood function is not only of very high dimension but also highly nonlinear with many local maxima.
The genetic algorithm seems to work very well. However, I am planning to further increase the dimension of the problem. I have read about an EM algorithm for estimating Markov-switching models. From what I understand, this algorithm produces a sequence of increasing log-likelihood values. It thus seems suitable for estimating models with very many parameters.
My question is whether the EM algorithm is suitable for my application involving many parameters (perhaps better suited than the genetic algorithm). Speed is not the main limitation (the genetic algorithm is already extremely slow), but I would need some certainty of ending up close to the global optimum and not running into one of the many local optima. Do you have any experience or suggestions regarding this?
The EM algorithm finds local optima, and does not guarantee that they are global optima. In fact, if you start it off with a HMM where one of the transition probabilities is zero, that probability will typically never change from zero, because those transitions will appear only with expectation zero in the expectation step, so those starting points have no hope of finding a global optimum which does not have that transition probability zero.
The standard workaround for this is to start it off from a variety of different random parameter settings, pick the best local optimum found, and hope for the best. You might be slightly reassured if a significant proportion of the runs converged to the same (or an equivalent) best local optimum, on the not very reliable theory that anything better would have been found from at least the same fraction of random starts, and so would have shown up by now.
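That multi-start workaround can be sketched as follows; `fit` is a hypothetical stand-in for whatever EM routine you use (its name and signature are assumptions, not a real API):

```python
def multistart_em(fit, seeds, tol=1e-6):
    """Run a local optimizer such as EM from several random initializations
    and keep the best local optimum found. `fit` maps an integer seed to a
    (log_likelihood, params) pair. Also report the fraction of runs that
    reached (near) the best value -- the weak sanity check described above,
    not a guarantee of global optimality."""
    results = [fit(s) for s in seeds]
    best_ll, best_params = max(results, key=lambda r: r[0])
    frac = sum(ll >= best_ll - tol for ll, _ in results) / len(results)
    return best_ll, best_params, frac
```

A high `frac` means many starts agree on the best optimum found, which is mildly reassuring for the reason given above.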
I haven't worked it out in detail, but the EM algorithm solves such a general class of problems that, if it were guaranteed to find the global optimum, I expect it would be capable of solving NP-complete problems with unprecedented efficiency.

Multi objective convex optimization using genetic algorithm or cvx tool

I have solved a single-objective convex optimization problem (actually related to interference reduction) using the cvx package with MATLAB. Now I want to extend the problem to a multi-objective one. What are the pros and cons of solving it with a genetic algorithm compared to the cvx package? I haven't read anything about genetic algorithms; I came across them while searching the net for multi-objective optimization.
Optimization algorithms based on derivatives (or gradients), including convex optimization algorithms, essentially try to find a local minimum. The pros and cons are as follows.
Pros:
1. They can be extremely fast, since they only follow the path given by the derivative.
2. Sometimes they achieve the global minimum (e.g., when the problem is convex).
Cons:
1. When the problem is highly nonlinear and non-convex, the solution depends on the initial point, so there is a high probability that the solution found is far from the global optimum.
2. They are not well suited to multi-objective optimization problems.
Because of the disadvantages described above, evolutionary algorithms are generally used for multi-objective optimization. Genetic algorithms belong to the family of evolutionary algorithms.
Evolutionary algorithms developed for multi-objective optimization problems are fundamentally different from gradient-based algorithms: they are population-based, i.e., they maintain multiple solutions (hundreds or thousands of them), whereas gradient-based methods maintain only one.
NSGA-II is an example: https://ieeexplore.ieee.org/document/996017, https://mae.ufl.edu/haftka/stropt/Lectures/multi_objective_GA.pdf, https://web.njit.edu/~horacio/Math451H/download/Seshadri_NSGA-II.pdf
The purpose of multi-objective optimization is to find the Pareto surface (or optimal trade-off surface). Since the surface consists of multiple points, population-based evolutionary algorithms suit it well.
(You can also solve a series of scalarized single-objective problems with gradient-based algorithms, but unless the problem is convex, this cannot recover the whole front accurately.)
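Since the Pareto surface consists of the non-dominated points, a population-based method only needs to filter its final population. A brute-force sketch, assuming all objectives are minimized:

```python
import numpy as np

def nondominated(F):
    """Boolean mask of the Pareto-optimal (non-dominated) rows of the
    objective matrix F (n points x m objectives), all objectives minimized.
    Brute-force O(n^2) illustration, fine for moderate population sizes."""
    n = F.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        # another point dominates i if it is <= in every objective
        # and strictly < in at least one
        dominates_i = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if dominates_i.any():
            mask[i] = False
    return mask
```

Applied to an evolutionary algorithm's final population, the masked rows approximate the Pareto surface.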

Statistically optimize a genetic algorithm's selection operator

I am familiar with selection methods for genetic algorithms such as stochastic universal sampling, roulette wheel, tournament, and others. However, I realize that these methods are close to the random sampling used in statistics. I would like to know if there are implementation methods closer to statistical clustering based on some features of the individuals in the population, without having to first check all individuals for that specific feature before sampling. Essentially, I would like to reduce the randomness of the other sampling methods while maintaining enough diversity in each population.
For genetic algorithms generally, look for niching/crowding strategies. They try to preserve a diverse population by, e.g., keeping unique or very diverse solutions and instead replacing solutions in very densely populated regions. This is especially useful in multi-objective optimization, where the "solution" is a population of non-dominated individuals.
If you don't do multi-objective optimization and you do not need to maintain a diverse population over the whole run, then you could also use the Offspring Selection Genetic Algorithm (OSGA). It compares children to their parents and only considers them for the next population if they surpass their parents in quality. This has been shown to a) work even with unbiased random parent selection and b) maintain diversity until very late in the search, at which point the population converges to a single solution.
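The OSGA acceptance rule described above can be sketched as follows (a simplified illustration with assumed `crossover`/`mutate` callables, not HeuristicLab's implementation):

```python
import random

def offspring_selection_step(population, fitness, crossover, mutate,
                             max_trials=1000, rng=None):
    """One generation of simplified offspring selection: a child enters the
    next population only if it is at least as fit as its better parent.
    `fitness` is maximized; `crossover` and `mutate` are problem-specific
    callables supplied by the caller."""
    rng = rng or random.Random()
    next_pop, trials = [], 0
    while len(next_pop) < len(population) and trials < max_trials:
        trials += 1
        p1, p2 = rng.sample(population, 2)   # unbiased random parent selection
        child = mutate(crossover(p1, p2, rng), rng)
        if fitness(child) >= max(fitness(p1), fitness(p2)):
            next_pop.append(child)
    # if the success pressure cannot be met, fall back to the best parent
    while len(next_pop) < len(population):
        next_pop.append(max(population, key=fitness))
    return next_pop
```

The acceptance test is what drives the behaviour described above: early on most children pass easily, and as the population improves, fewer do, so diversity is consumed only late in the run.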
You can for example use our software HeuristicLab, try different configurations of genetic algorithms and analyze their behavior. The software is GPL and runs on Windows.

Resources