Statistically optimize a genetic algorithm's selection operator

I am familiar with selection methods for genetic algorithms such as stochastic universal sampling, roulette wheel, tournament and others. However, I realize that these methods are close to the random sampling used in statistics. I would like to know whether there are implementations closer to statistical clustering, based on some feature of the individuals in the population, without having to first check every individual for that specific feature before sampling. Essentially, I would like to reduce the randomness of the other sampling methods while maintaining enough diversity in each population.

For the genetic algorithm in general, look at niching/crowding strategies. They try to preserve a diverse population by, e.g., keeping unique or very diverse solutions and instead replacing solutions in very densely populated regions. This is especially useful in multiobjective optimization, where the "solution" is a population of non-dominated individuals.
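To make the crowding idea concrete, here is a minimal Python sketch of a crowding-style replacement step, assuming real-valued individuals and a user-supplied fitness callable (all names here are illustrative, not any particular library's API): an offspring replaces the most similar population member, and only if it is at least as fit.

```python
def euclidean(a, b):
    # genotypic distance between two real-valued individuals
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def crowding_replace(population, offspring, fitness):
    """Insert `offspring` by replacing the most genotypically similar
    member, but only if the offspring is at least as fit (maximization).
    This keeps sparse regions of the search space populated."""
    nearest = min(range(len(population)),
                  key=lambda i: euclidean(population[i], offspring))
    if fitness(offspring) >= fitness(population[nearest]):
        population[nearest] = offspring
```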
If you don't do multiobjective optimization and you do not need to maintain a diverse population over the whole run, then you could also use the Offspring Selection Genetic Algorithm (OSGA). It compares children to their parents and only considers them for the next population if they have surpassed their parents in quality. This has been shown to a) work even with unbiased random parent selection and b) maintain diversity until very late in the search, at which point the population converges to a single solution.
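A rough sketch of the offspring-selection idea (a simplification; the full OSGA also uses parameters such as a success ratio and comparison factor, which are omitted here): children enter the next population only if they beat the better of their two parents. The helpers crossover, mutate and fitness are assumed to be supplied by you.

```python
import random

def offspring_selection(parents, crossover, mutate, fitness,
                        pop_size, max_attempts=10_000):
    """Fill the next generation with children that surpass the better
    of their two parents in quality (maximization). Parent selection
    is unbiased random, as mentioned above."""
    next_pop, attempts = [], 0
    while len(next_pop) < pop_size and attempts < max_attempts:
        attempts += 1
        p1, p2 = random.sample(parents, 2)
        child = mutate(crossover(p1, p2))
        if fitness(child) > max(fitness(p1), fitness(p2)):
            next_pop.append(child)
    # fallback so the loop always terminates with a full population
    while len(next_pop) < pop_size:
        next_pop.append(random.choice(parents))
    return next_pop
```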
You can, for example, use our software HeuristicLab to try different configurations of genetic algorithms and analyze their behavior. The software is GPL and runs on Windows.

Related

Do all evolutionary algorithms encode the population in binary terms?

I am new to heuristic optimization methods and am learning about the different algorithms available in this space, like Genetic Algorithms, PSO, DE, CMA-ES, etc. The general flow of any of these algorithms seems to be: initialize a population; select, crossover and mutate to update it; evaluate; and the cycle continues. The initial population-creation step in a genetic algorithm seems to be that each member of the population is encoded by a chromosome, which is a bitstring of 0s and 1s, and all the other operations are then performed on it. The GA has simple update methods for the population, like mutation and crossover, but the update methods differ in the other algorithms.
My query here is: do all the other heuristic algorithms also initialize the population as bitstrings of 0s and 1s, or do they use plain natural numbers?
The representation of individuals in evolutionary algorithms (EAs) depends on the representation of a candidate solution. If you are solving a combinatorial problem, e.g. the knapsack problem, the final solution is a string of 0s and 1s, so it makes sense to have a binary representation for the EA. However, if you are solving a continuous black-box optimisation problem, then it makes sense to have a representation with continuous decision variables.
In the old days, GAs and other algorithms used only a binary representation, even for solving continuous problems. But nowadays, all the algorithms you mentioned have their own binary, continuous and other variants. For example, PSO is known as a continuous problem solver, but to update the individuals (particles) of a binary variant, there are mapping strategies such as the s-shaped or v-shaped transfer functions, which turn the continuous velocities into binary positions for the next iteration.
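For illustration, the s-shaped transfer function is typically a sigmoid applied to the particle's velocity, and each bit is then resampled with that probability. A minimal sketch (not a complete binary PSO):

```python
import math
import random

def s_shape(v):
    # sigmoid transfer function: maps a real-valued velocity to (0, 1)
    return 1.0 / (1.0 + math.exp(-v))

def update_binary_position(velocity):
    """Derive the next binary position from a real-valued velocity
    vector: bit i becomes 1 with probability s_shape(velocity[i])."""
    return [1 if random.random() < s_shape(v) else 0 for v in velocity]
```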
My two cents: the choice of algorithm depends on the type of problem, and I personally would not recommend a binary PSO as a first attempt at solving a problem. Maybe there are benefits hidden there, but they need investigation.
Please feel free to extend your question.

What is an optimal strategy for a user of the Hungarian algorithm?

I wonder what the optimal strategy is for a player to adopt when the allocation is computed by the Hungarian algorithm (https://en.wikipedia.org/wiki/Hungarian_algorithm).
The situation is a pairing between n persons and m objects, where each person submits an ordered list of wishes for their favorite objects.
In the total absence of information about the wishes of the n-1 other people, I guess the optimal strategy to maximize one's contentment is to order the wishes according to one's real preferences.
However, if you know the wishes of the others, or if you have an estimate of how many times each item is requested, I think there may be an optimal "barrage" strategy to maximize your chance of getting your first wish, by deliberately placing high-demand items lower in your list.
However, since the weights are linear in a Hungarian algorithm, I think this barrage strategy is ineffective (compared to weights that are quadratic in the wish rank) and potentially risky.
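If you want to experiment with this, you can solve small instances with SciPy's linear_sum_assignment (which solves the same assignment problem as the Hungarian algorithm) and compare linear against squared rank costs. The preference matrix below is a made-up example:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# rank[i][j] = position of object j in person i's wish list (0 = first wish)
rank = np.array([[0, 1, 2],
                 [0, 2, 1],
                 [1, 0, 2]])

for name, cost in [("linear", rank), ("squared", rank ** 2)]:
    rows, cols = linear_sum_assignment(cost)   # minimizes total cost
    print(name, "assignment:", list(zip(rows, cols)),
          "total cost:", cost[rows, cols].sum())
```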
After a little research, I could not find any documentation on the subject. Do you have good sources describing optimal strategies to deal with this kind of algorithm?
Thank you in advance!

Impact of fitness function domain selection in multi-objective evolutionary optimization

I am using evolutionary algorithms e.g. the NSGA-II algorithm to solve unconstrained optimization problems with multiple objectives.
As my fitness functions sometimes have very different domains (e.g. f1(x) generates fitness values within [0..1] and f2(x) within [10000..10000000]), I am wondering whether this has an effect on the search behaviour of the selected algorithm.
Does the selection of the fitness function domain (e.g. scaling all domains to a common domain from [lb..ub]) impact the solution quality and the speed of finding good solutions? Or is there no general answer to this question?
Unfortunately, I could not find anything on this topic. Any hints are welcome!
Your question relates to the selection strategy implemented in the algorithm. In the case of the original NSGA-II, selection is made using a mixture of Pareto rank and crowding distance. While the Pareto rank (i.e. the index of a point's non-dominated front) does not change when the objective values are scaled by some constant, the crowding distance does.
So the answer is yes: if your second objective is in [10000..10000000], its contribution to the crowding distance might eat up that of the other objective.
In algorithms such as NSGA-II, units count!
I have just come across your question, and I have to disagree with the previous answer. If you read the paper carefully, you will see that the crowding distance is supposed to be calculated in the normalized objective space, exactly so that one objective does not dominate another.
My PhD advisor is Kalyanmoy Deb, who proposed NSGA-II, and I have implemented the algorithm myself (available in our evolutionary multi-objective optimization framework pymoo). So I can state with certainty that normalization is supposed to be incorporated into the algorithm.
If you are curious about a vectorized crowding distance implementation, feel free to have a look at pymoo/algorithms/nsga2.py in pymoo on GitHub.
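For reference, here is a compact sketch of a crowding distance computed in the normalized objective space, assuming a NumPy array F with one row per solution and one column per objective; it follows the normalization discussed above rather than pymoo's exact code:

```python
import numpy as np

def crowding_distance(F):
    """Crowding distance of each point in one front, computed in the
    normalized objective space so that no objective dominates."""
    n, m = F.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(F[:, j])
        span = F[order[-1], j] - F[order[0], j]
        if span == 0:
            continue  # constant objective contributes nothing
        dist[order[0]] = dist[order[-1]] = np.inf   # boundary points
        # gap between each interior point's two neighbors, normalized
        gaps = (F[order[2:], j] - F[order[:-2], j]) / span
        dist[order[1:-1]] += gaps
    return dist
```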

Are all genetic algorithms maximization algorithms?

I'm not sure if my understanding of maximization and minimization is correct.
So let's say that for some function f(x,y,z) I want to find what would give the highest value; that would be maximization, right? And if I wanted to find the lowest value, that would be minimization?
So if genetic algorithms are search algorithms trying to maximize some fitness function, would they by definition be maximization algorithms?
So let's say that for some function f(x,y,z), I want to find what would give the highest value; that would be maximization, right? And if I wanted to find the lowest value, that would be minimization?
Yes, that's by definition true.
So if genetic algorithms are search algorithms trying to maximize some fitness function, would they by definition be maximization algorithms?
Pretty much yes, although I'm not sure "maximization algorithm" is a widely used term, and it holds only if a genetic algorithm is defined as such, which I don't believe it strictly is.
Genetic algorithms can also try to minimize the distance to some goal function value, or minimize the function value itself, but then again, this can simply be rephrased as maximization without loss of generality.
Perhaps more significantly, there isn't a strict need to even have a function - the candidates just need to be comparable. If they have a total order, it's again possible to rephrase it as a maximization problem. If they don't have a total order, it might be a bit more difficult to get candidates objectively better than all the others, although nothing's stopping you from running the GA on this type of data.
In conclusion - trying to maximize a function is the norm (and possibly in line with how you'll mostly see it defined), but don't be surprised if you come across a GA that doesn't do this.
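Both points above are easy to make concrete: minimization is just maximization of the negated objective, and selection can be driven by a pairwise comparison alone, without any numeric fitness. A sketch, with `better` standing in for whatever comparison your candidates support (hypothetical names):

```python
import random

def tournament_select(population, better, k=2):
    """Selection driven only by a pairwise comparison: `better(a, b)`
    returns True if candidate a should beat candidate b. The candidates
    only need to be comparable, not scored by a function."""
    contenders = random.sample(population, k)
    winner = contenders[0]
    for c in contenders[1:]:
        if better(c, winner):
            winner = c
    return winner

# Minimization rephrased as maximization without loss of generality:
# to minimize f, declare "a beats b" whenever f(a) < f(b), which is
# the same as maximizing -f.
# select = lambda pop: tournament_select(pop, lambda a, b: f(a) < f(b))
```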
Are all genetic algorithms maximization algorithms?
No, they aren't.
Genetic algorithms are popular approaches to multi-objective optimization (e.g. NSGA-II or SPEA-2 are very well-known genetic-algorithm-based approaches).
For multi-objective optimization you aren't trying to maximize a function.
This is because scalarizing a multi-objective optimization problem is seldom viable (i.e. there isn't a single solution that simultaneously optimizes each objective), and what you are looking for is a set of nondominated solutions (or a representative subset of the Pareto-optimal solutions).
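For concreteness, a solution is nondominated if no other solution is at least as good in every objective and strictly better in at least one. A naive filter over objective vectors, assuming all objectives are minimized:

```python
def dominates(a, b):
    """a dominates b if it is no worse in all objectives and strictly
    better in at least one (assuming all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def nondominated(front):
    """Naive O(n^2) filter returning the nondominated objective vectors."""
    return [a for a in front
            if not any(dominates(b, a) for b in front if b is not a)]
```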
There are also approaches to evolutionary algorithms that try to capture the open-endedness of natural evolution by searching for behavioral novelty. Even in an objective-based problem, such novelty search ignores the objective (see Abandoning Objectives: Evolution through the Search for Novelty Alone by Joel Lehman and Kenneth O. Stanley for details).
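The novelty score in that line of work is, roughly, the average distance from an individual's behavior descriptor to its k nearest neighbors in the current population and an archive. A minimal sketch (names are illustrative):

```python
def novelty(behavior, others, k=5):
    """Average Euclidean distance from `behavior` to its k nearest
    neighbors among the population's and archive's descriptors."""
    dists = sorted(
        sum((x - y) ** 2 for x, y in zip(behavior, other)) ** 0.5
        for other in others
    )
    nearest = dists[:k]
    if not nearest:
        return float("inf")  # nothing to compare against yet
    return sum(nearest) / len(nearest)
```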

Genetic Programming and Search Algorithms

Is Genetic Programming currently capable of evolving one type of search algorithm into another? For example, has any experiment ever bred/mutated BubbleSort from QuickSort (see http://en.wikipedia.org/wiki/Sorting_algorithm)?
You might want to look at the work of W. Daniel Hillis from the 80s. He spent a great deal of time creating sorting networks by genetic programming. While he was more interested in solving the problem of sorting a constant number of objects (16-object sorting networks had been a major academic problem for nearly a decade), it would be a good idea to be familiar with his work if you're really interested in genetic sorting algorithms.
For the evolution of an algorithm that sorts a list of arbitrary length, you might also want to be familiar with the concept of co-evolution. I've built a co-evolutionary system before where the point was to have one genetic algorithm evolving sorting algorithms while another GA evolved unsorted lists of numbers. The fitness of a sorter is its accuracy (plus a bonus for fewer comparisons if it is 100% accurate), and the fitness of the list generator is how many errors the sorting algorithms make on its lists.
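A sketch of that co-evolutionary scoring, with made-up helper names (`sorter` is a candidate program returning its sorted output, `lists` the evolved test cases):

```python
def sorter_fitness(sorter, lists):
    """Fraction of test lists the candidate sorter orders correctly."""
    correct = sum(sorter(lst[:]) == sorted(lst) for lst in lists)
    return correct / len(lists)

def generator_fitness(lst, sorters):
    """A list is fitter the more sorters fail to sort it correctly."""
    errors = sum(sorter(lst[:]) != sorted(lst) for sorter in sorters)
    return errors / len(sorters)
```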
To answer your specific question of whether bubble sort has ever been evolved from quicksort, I would have to say that I seriously doubt it, unless the programmer's fitness function was both very specific and ill-advised. Yes, bubble sort is very simple, so maybe a GP whose fitness function was accuracy plus program size would eventually find it. However, why would a programmer select size instead of the number of comparisons as a fitness function, when it is the latter that determines runtime?
By asking if GP can evolve one algorithm into another, I'm wondering if you're entirely clear on what GP is. Ideally, each unique chromosome defines a unique sort. A population of 200 chromosomes represents 200 different algorithms. Yes, quick and bubble may be in there somewhere, but so are 198 other, potentially unnamed, methods.
There's no reason why GP couldn't evolve either type of algorithm. I'm not sure it really makes sense to think of evolving one "into" the other, though. GP will simply evolve a program that comes ever closer to a fitness function you define.
If your fitness function only looks at sort correctness (and assuming you have the proper building blocks for your GP to use) then it could very well evolve both BubbleSort and QuickSort. If you also include efficiency as a measure of fitness, then that might influence which of these would be determined as a better solution.
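As a made-up example of such a fitness function, correctness can be measured as the fraction of adjacent pairs in order, with a penalty per comparison used; the `program` interface below (returning the output list and a comparison count) is an assumption for illustration:

```python
def sortedness(seq):
    """Fraction of adjacent pairs in the right order: 1.0 means sorted."""
    if len(seq) < 2:
        return 1.0
    ok = sum(a <= b for a, b in zip(seq, seq[1:]))
    return ok / (len(seq) - 1)

def gp_fitness(program, tests, comparison_weight=0.01):
    """Average correctness minus a small penalty per comparison used;
    `program` returns (output_list, comparisons_used), an assumed API."""
    score = 0.0
    for t in tests:
        out, comparisons = program(t[:])
        score += sortedness(out) - comparison_weight * comparisons
    return score / len(tests)
```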
You could seed the GP with e.g. QuickSort and if you had an appropriate fitness function it certainly could eventually come up with BubbleSort - but it could come up with anything else that is fitter than QuickSort as well.
Now how long it takes the GP engine to do this evolution is another question...
I'm not aware of one, and the particular direction you're suggesting in your example seems unlikely; it would take a sort of perverse fitness function, since bubble sort is by most measures worse than quicksort. It's not inconceivable that this could happen, but in general, once you've got a well-understood algorithm, it's already pretty fit -- going to another one probably requires passing through some worse choices.
Being trapped in local minima isn't an unknown problem for most search strategies.
