OpenMDAO SimpleGADriver penalty - genetic-algorithm

In SimpleGADriver of OpenMDAO, is the penalty factor applied on the scaled constraint values or the original ones?
In my problem I have an objective and a few constraints each of different orders of magnitude, therefore I apply scaling factors when defining them, for instance: model.add_objective('obj', ref=1e6). This way, at the driver level, I have all functions of the order of 1.
I set penalty_exponent=2 and penalty_parameter=20, which are quite high, yet the driver seems to favour highly infeasible points with a low objective function value.
I would appreciate any tips.

The code applies the penalty to the scaled objectives and constraints. The relevant lines in the SimpleGADriver source are
obj_values = self.get_objective_values()
and
fun = obj + penalty * sum(np.power(constraint_violations, exponent))
By default, the get_objective_values() method returns values in driver-scaled form. The constraints work the same way.
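To see why a large penalty_parameter can still let infeasible points win, here is a minimal sketch in plain Python of the penalized fitness quoted above, evaluated on scaled values. The helper names, the ref values, and the sample numbers are illustrative assumptions, not OpenMDAO code; the scaling formula assumes the usual ref/ref0 convention, in which ref0 maps to 0 and ref maps to 1.

```python
def scale(value, ref=1.0, ref0=0.0):
    # Assumed ref/ref0 convention: ref0 maps to 0.0 and ref maps to 1.0.
    return (value - ref0) / (ref - ref0)


def penalized(obj, violations, penalty=20.0, exponent=2.0):
    # Same form as the quoted line: obj + penalty * sum(violation ** exponent).
    # Violations are passed in as non-negative scaled magnitudes.
    return obj + penalty * sum(v ** exponent for v in violations)


# Hypothetical feasible point: objective 2.0e6, no constraint violation.
feasible = penalized(scale(2.0e6, ref=1e6), [0.0])

# Hypothetical infeasible point: objective 1.5e6, constraint exceeded by 100
# physical units with ref=1e3, i.e. a scaled violation of 0.1.
infeasible = penalized(scale(1.5e6, ref=1e6), [scale(100.0, ref=1e3)])

print(feasible, infeasible)  # the infeasible point gets the lower (better) score
```

With an exponent of 2, any scaled violation below 1 is squared into an even smaller number, so the penalty can stay small next to an objective of order 1. Under these assumptions, choosing a smaller ref for the constraints (so that typical violations scale to values above 1), raising penalty_parameter further, or using penalty_exponent=1 would each strengthen the penalty.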

Related

How do I constrain the outputs of Gaussian Processes in PYMC?

So I have a very challenging MCMC run I would like to do in PyMC, which I have run several times before for much simpler analyses. However, my newest challenge requires me to combine many different Gaussian Processes in a very specific way, and I don't know enough about Gaussian processes in general or how they are implemented in PyMC to engineer the code I need.
Here is the problem I am trying to tackle:
The data I have is five time series (we'll call them A(t), B(t), C(t), D(t), and E(t)), each measurement of which has Gaussian/Normal uncertainties. Each of these can be modeled as the product of one series-specific efficiency function and one underlying function shared between all five time series, so A(t) = a(t) * f(t), B(t) = b(t) * f(t), C(t) = c(t) * f(t), etc. I need to measure the posterior for f(t), or more specifically, the posterior of the integral of f(t) dt over a domain.
So I have read over some documentation about implementing Gaussian Processes in PyMC, but I have a few additional wrinkles with my efficiency functions that need to be addressed specifically before I can start coding up my model. Mainly -
1) I have no strong prior about the shape of the efficiency functions a(t), b(t), etc... So long as they vary smoothly there is no shape that is strongly forbidden.
2) These efficiency functions are physically bound to be between 0 and 1 for all times. So while I have no prior on the shape of the curve it has to fall between these bounds. I do have some prior about its typical value but since I need to marginalize over it I can't put too many other constraints on this.
Has anyone out there tackled a similar type of problem before, and what might be the most elegant way to guarantee that my efficiency priors are implemented in this complex MCMC run? I simply don't know enough about Gaussian Processes/Covariance functions to know how to force these constraints on the data.

Optimal parameters for genetic algorithm

I am solving an optimization problem in MATLAB. The optimization is over 10 variables, with a search space of (30*21*30*21*15*21*15*21*13*13 = 6.6e12) combinations.
I have currently set the following parameters for ga optimization.
CrossoverFraction=0.4;
PopulationSize=500;
EliteCount=4;
Generations=25;
The remaining values are left at the gaoptimset defaults, as follows:
options=gaoptimset('PopInitRange',Bound,'PopulationSize',PopulationSize,...
'EliteCount',EliteCount, 'Generations',Generations,'StallGenL',25,...
'Display','iter');
I understand that the search space is large, but because I have to run this GA many times for various instruments, time is limited and I cannot increase PopulationSize*Generations. I am running the optimization as a single-threaded application, hence I am not using the migration options.
Please suggest ways to improve the optimisation capability of my problem by tweaking other parameters in the options. Alternative ways of optimization are also welcome.
To speed up the algorithm, try specifying bounds for your 10 variables. This restricts the search to a smaller region and leads to faster convergence to a suitable answer. You will have to make educated guesses for these bounds based on your specific problem.
This leaves you additional time to increase other parameters, such as the number of generations.
One way to specify bounds is when calling the ga function:
nvars = 10;                              % 10 variables
lower = [0,0,0,0,0,0,0,0,0,0];           % lower bound for each variable
upper = [10,10,10,10,10,10,10,10,10,10]; % upper bound for each variable
integers = 1:nvars;                      % IntCon: indices of integer-valued variables (assumed here: all of them)
[x, fval] = ga(@objectiveFunction, nvars, [],[],[],[],lower, upper,[], integers, options)
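To put the numbers from the question in perspective, this small Python sketch compares the size of the search space with the evaluation budget (PopulationSize * Generations) and shows how much tighter bounds shrink the space. The halved ranges are a purely hypothetical example of tightening, not a recommendation for this particular problem.

```python
from math import prod

# Number of admissible values per variable, as listed in the question.
levels = [30, 21, 30, 21, 15, 21, 15, 21, 13, 13]
search_space = prod(levels)

# Evaluation budget from the question: PopulationSize * Generations.
budget = 500 * 25
coverage = budget / search_space

# Hypothetical tightening: bounds that halve the range of every variable.
tightened = prod(n // 2 for n in levels)

print(f"full space:      {search_space:.3e}")
print(f"budget coverage: {coverage:.3e}")
print(f"tightened space: {tightened:.3e}")
```

Even after halving every range the budget covers only a tiny fraction of the space, so bounds help most when they encode real knowledge about where good solutions lie.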

Work sizes for completely independent calculations in OpenCL

I have a 2D matrix where I want to modify every value by applying a function that is only dependent on the coordinates in the matrix and values set at compile-time. Since no synchronization is necessary between each such calculation, it seems to me like the work group size could really be 1, and the number of work groups equal to the number of elements in the matrix.
My question is whether this will actually yield the desired result, or whether other forces are at play here that might make a different setting for these values better?
My recommendation: just set the global size to your 2D matrix size and the local size to NULL. This lets the OpenCL implementation select a suitable local size for you.
In your specific case, the local size does not need to have any particular shape. In fact, any valid value will do the work, but performance may differ. You can tune it manually for different hardware, but it is easier to let the implementation do this job for you, and it is also more portable.

Why do Perlin noise algorithms use lookup tables for random numbers

I have been researching noise algorithms for a library I wish to build, and have started with Perlin noise (more accurately Simplex noise, since I want to work with arbitrary dimensions, or at least up to 6). Reading "Simplex noise demystified" helped, but looking through the implementations at the end, I saw a big lookup table named perm.
In the code example, it seems to be used to generate indexes into a set of gradients, but the method seems odd. I assume that the table is just there to provide 1) determinism, and 2) a speed boost.
My question is: does the perm lookup table have any auxiliary meaning or purpose, or is it there only for the reasons above? Put another way, is there a specific reason that a pseudo-random number generator is not used, other than performance?
This is a byte array; its values range from 0 to 255.
You may randomize it if you want. You will probably want to seed the random... etc.
The perm table (and the grad table) is used for optimization. They are just lookup tables of precomputed values. You are correct on both points 1) and 2).
Other than performance and portability, there is no reason you couldn't use a PRNG.
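As a rough illustration of the points above, here is a minimal Python sketch of a seeded permutation table and the nested-lookup hash it enables. The function names are hypothetical, and n_gradients=12 is an assumption modelled on the 12-entry gradient set often used with simplex noise; this is not the code from the referenced article.

```python
import random


def make_perm(seed):
    # Shuffle 0..255 with a seeded generator so the table is reproducible,
    # then double it so perm[a + perm[b]] never needs an explicit wrap.
    rng = random.Random(seed)
    table = list(range(256))
    rng.shuffle(table)
    return table + table


def gradient_index(perm, x, y, n_gradients=12):
    # Hash integer lattice coordinates to a gradient index with two lookups.
    return perm[(x & 255) + perm[y & 255]] % n_gradients


perm = make_perm(seed=42)
# The same lattice point always maps to the same gradient index.
print(gradient_index(perm, 3, 7), gradient_index(perm, 3, 7))
```

The table turns the "random" choice into a pure function of the coordinates, which is what noise needs: evaluating the same point twice must give the same result, in any order, without carrying generator state around.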

Optimal population size, mutate rate and mate rate in genetic algorithm

I have written a game playing program for a competition, which relies on some 16 floating point "constants". Changing a constant can and will have dramatic impact on playing style and success rate.
I have also written a simple genetic algorithm to generate the optimal values for the constants. However the algorithm does not generate "optimal" constants.
The likely reasons:
The algorithm has errors (for the time being rule this out!)
The population is too small
The mutate rate is too high
The mate rate could be better
The algorithm goes like this:
First the initial population is created
Initial constants for each member are assigned (based on my bias multiplied with a random factor between 0.75 and 1.25)
For each generation members of the population are paired for a game match
The winner is cloned twice, if draw both are cloned once
The cloning mutates one gene if random() is less than mutate rate
Mutation multiplies a random constant with a random factor between 0.75 and 1.25
At fixed intervals, dependent on mate rate, the members are paired and genes are mixed
My current settings:
Population: 40 (too low)
Mutate rate 0.10 (10%)
Mate rate 0.20 (every 5 generations)
What would be better values for population size, mutate rate and mate rate?
Guesses are welcome, exact values are not expected!
Also, if you have insights from similar genetic algorithms that you would like to share, please do so.
P.S.: The game playing competition in question, if anyone is interested: http://ai-contest.com/
Your mutation size strikes me as surprisingly high. There's also a bit of bias inherent in it - the larger the current value is, the larger the mutation will be.
You might consider
Having a (much!) smaller mutation
Giving the mutation a fixed range
Distributing your mutation sizes differently - e.g. you could use a normal distribution with a mean of 1.
R.A. Fisher once compared the mutation size to focusing a microscope. If you change the focus, you might be going in the right direction, or the wrong direction. However, if you're fairly close to the optimum and turn it a lot - either you'll go in the wrong direction, or you'll overshoot the target. So a more subtle tweak is generally better!
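The suggestions above can be sketched in Python as follows. The first function mirrors the mutation described in the question; the other two are illustrative variants, and the sigma and step defaults are assumptions chosen only to show "much smaller" mutations, not tuned values.

```python
import random


def mutate_uniform_factor(genes, rate, rng, low=0.75, high=1.25):
    # Scheme from the question: scale one random gene by a uniform factor.
    child = list(genes)
    if rng.random() < rate:
        i = rng.randrange(len(child))
        child[i] *= rng.uniform(low, high)
    return child


def mutate_gaussian_factor(genes, rate, rng, sigma=0.05):
    # Variant: factor drawn from a normal distribution with mean 1,
    # so small changes are common and large ones are rare.
    child = list(genes)
    if rng.random() < rate:
        i = rng.randrange(len(child))
        child[i] *= rng.gauss(1.0, sigma)
    return child


def mutate_fixed_range(genes, rate, rng, step=0.01):
    # Variant: additive step within a fixed range, independent of the
    # gene's current magnitude (removes the bias toward large values).
    child = list(genes)
    if rng.random() < rate:
        i = rng.randrange(len(child))
        child[i] += rng.uniform(-step, step)
    return child


rng = random.Random(0)
parent = [1.0] * 16
print(mutate_gaussian_factor(parent, rate=1.0, rng=rng))
```

Passing the generator in explicitly keeps runs reproducible, which makes it easier to compare mutation schemes on the same sequence of matches.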
Use the GAUL framework. It is easy to use, so you could extract your objective function and plug it into GAUL. If you have a multi-core machine, you will want to compile with OpenMP to parallelize your evaluations (which I assume are time-consuming). This way you can afford a bigger population size. http://gaul.sourceforge.net/
GAs normally use high crossover and low mutation. Since you want creativity, I suggest high mutation and low crossover. http://games.slashdot.org/story/10/11/02/0211249/Developing-emStarCraft-2em-Build-Orders-With-Genetic-Algorithms?from=rss
Be really careful in your mutation function to stay inside your search space (within 0.75 to 1.25). Use GAUL random functions such as random_double( min, max ); they are really well designed. Build your own mutation function. Make sure the parents die!
Then you may want to combine this with a simplex search (Nelder-Mead), which is included in GAUL, because a genetic algorithm with low crossover will likely find a non-optimal solution.
