Grammatical Evolution (GE) - genetic-algorithm

In the Grammatical Evolution (GE) algorithm (see the website grammatical-evolution.org) there is the option to make the length of the individuals self-adaptive. I would like to know:
What is the most common strategy used when the individual's length is self-adaptive? In other words, how does the length of the individual evolve?
Does it both increase and decrease the size, or only increase?
Is there any well-documented or illustrative example?
Thanks in advance.

In GE, the individuals are necessarily variable-length in order to be able to encode programs (structures) of variable length.
Initialization
The initial population already consists of individuals of varying size. The initialization procedures vary, but the one I'm most familiar with uses the grammar to create the individuals. You basically do the same thing as when you decode an individual into a program (using the grammar), but instead of choosing the grammar expansions according to the individual, you do it the other way around: you choose the expansions randomly and record these random decisions. When the expansion is finished, your recorded decisions form an individual.
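A minimal sketch of this idea in Python (the toy grammar, the depth limit and all names are purely illustrative and not taken from any particular GE implementation):

import random

# Toy grammar: each non-terminal maps to a list of possible expansions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>":   [["+"], ["-"], ["*"]],
    "<var>":  [["x"], ["y"], ["1.0"]],
}

def random_individual(start="<expr>", max_depth=8):
    """Expand the grammar with random choices and record each choice as a codon."""
    codons = []

    def expand(symbol, depth):
        if symbol not in GRAMMAR:          # terminal symbol: nothing to decide
            return
        options = GRAMMAR[symbol]
        if depth >= max_depth:
            # past the depth limit, take the shortest option to steer towards termination
            # (sufficient for this toy grammar, not for arbitrary grammars)
            choice = min(range(len(options)), key=lambda i: len(options[i]))
        else:
            choice = random.randrange(len(options))
        codons.append(choice)              # the recorded decision becomes part of the genotype
        for sym in options[choice]:
            expand(sym, depth + 1)

    expand(start, 0)
    return codons

Because the expansion depth is random, the recorded codon lists naturally differ in length from individual to individual.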
Crossover
The crossover operator in GE already modifies the length of the individuals. It's a classical single-point crossover, but the crossover points are chosen completely at random and independently in the two parents (in contrast to the classical single-point crossover from GAs, where the parents are of the same length and the crossover point is aligned). This mechanism can both grow and shrink the individuals.
Example of crossover: suppose you have two individuals and you have randomly chosen the crossover points:
parent 1: XXXXXXXXXXXX|XXXX
parent 2: YYY|YYYYYYYYYYYYYYYYYYYYYY
After crossover, the children look like this:
child 1: XXXXXXXXXXXX|YYYYYYYYYYYYYYYYYYYYYY
child 2: YYY|XXXX
As you can see, the lengths of both individuals changed dramatically.
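In Python, this variable-length one-point crossover could be sketched like this (genotypes are plain lists of codons; the function name is just for illustration):

import random

def variable_onepoint_crossover(parent1, parent2):
    """Single-point crossover with an independently chosen cut point in each parent."""
    cut1 = random.randint(0, len(parent1))    # cut point in parent 1
    cut2 = random.randint(0, len(parent2))    # cut point in parent 2, not aligned with cut1
    child1 = parent1[:cut1] + parent2[cut2:]  # children may be longer or shorter than either parent
    child2 = parent2[:cut2] + parent1[cut1:]
    return child1, child2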
Pruning
However, there is a second mechanism that is used for length reduction only: the pruning operator. When invoked (with some probability, in the same way as e.g. mutation), it deletes the non-active part of the genotype (i.e. if the grammar expansion finished before all codons were used, the remaining codons form the non-active part).
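Assuming the decoder reports how many codons it actually consumed (used_codons below is such a hypothetical bookkeeping value), pruning can be sketched as:

import random

def prune(genotype, used_codons, prune_probability=0.1):
    """With some probability, discard the non-active tail that the mapping never used."""
    if used_codons < len(genotype) and random.random() < prune_probability:
        return genotype[:used_codons]
    return genotype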

Related

genetic algorithm crossover operation

I am trying to implement a basic genetic algorithm in MATLAB. I have some questions regarding the crossover operation. I was reading material on it and found that two parents are always selected for the crossover operation.
What happens if I happen to have an odd number of parents?
Suppose I have parent A, parent B and parent C, and I cross parent A with B and then parent B with C to produce offspring; even then I get 4 offspring. What are the criteria for rejecting one of them, since my population size should always remain the same? Should I just reject the offspring with the lowest fitness value?
Can an operation between parents, such as a bitwise OR or AND, be considered a good crossover operation? I found some sites listing them as crossover operations, but I am not sure.
How can I do crossover between multiple parents?
"Crossover" isn't so much a well-defined operator as the generic idea of taking aspects of parents and using them to produce offspring similar to each parent in some ways. As such, there's no real right answer to the question of how one should do crossover.
In practice, you should do whatever makes sense for your problem domain and encoding. With things like two parent recombination of binary encoded individuals, there are some obvious choices -- things like n-point and uniform crossover, for instance. For real-valued encodings, there are things like SBX that aren't really sensible if viewed from a strict biological perspective. Rather, they are simply engineered to have some predetermined properties. Similarly, permutation encodings offer numerous well-known operators (Order crossover, Cycle crossover, Edge-assembly crossover, etc.) that, again, are the result of analysis of what features in parents make sense to make heritable for particular problem domains.
You're free to do the same thing. If you have three parents (with some discrete encoding, like binary), you could do something like the following (sketched here in Python):
import random

def three_parent_crossover(parentA, parentB, parentC):
    L = len(parentA)
    child = [None] * L
    for i in range(L):
        # copy the i-th gene from one of the three parents, chosen at random
        child[i] = random.choice((parentA[i], parentB[i], parentC[i]))
    return child
Whether that is a good operator or not will depend on several factors (the problem domain, the interpretation of the encoding, etc.), but it's a perfectly legal way of producing offspring. You could also invent your own more complex method, e.g. taking a weighted average of each allele value over multiple parents, doing boolean operations like AND and OR, etc. You can also build a more "structured" operator in which different parents play specific roles. The basic Differential Evolution algorithm, for instance, selects three parents a, b and c and computes an update like a + F(b - c) (with a scaling factor F), roughly corresponding to an offspring.
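For example, that DE-style update might be sketched in Python roughly as follows (real-valued individuals as lists of floats; the value of F is arbitrary and only for illustration):

import random

def de_offspring(population, F=0.8):
    """Differential Evolution style recombination: a + F * (b - c) over three random parents."""
    a, b, c = random.sample(population, 3)
    return [ai + F * (bi - ci) for ai, bi, ci in zip(a, b, c)]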
Consider reading the following academic articles:
DEB, Kalyanmoy et al. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE transactions on evolutionary computation, v. 6, n. 2, p. 182-197, 2002.
DEB, Kalyanmoy; AGRAWAL, Ram Bhushan. Simulated binary crossover for continuous search space. Complex systems, v. 9, n. 2, p. 115-148, 1995.
For SBX, the crossover and child-mutation method mentioned by #deong, see the answer simulated-binary-crossover-sbx-crossover-operator-example.
A genetic algorithm does not have one fixed, definite form; many variants have been proposed. Generally, though, the following steps apply to all of them (a rough sketch follows the list):
Generate an initial population at random (or by any other method)
Cross parents to produce children
Mutate
Evaluate the children and the parents
Generate the new population based on the children only, or on children and parents together (different approaches exist)
Return to step 2
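A rough Python sketch of that loop (init, fitness, crossover and mutate are placeholders the caller supplies, higher fitness is assumed to be better, and tournament selection plus a combined parent/child survivor selection are only one of many possible choices):

import random

def genetic_algorithm(init, fitness, crossover, mutate, pop_size=100, generations=200):
    population = [init() for _ in range(pop_size)]                     # 1. random initial population
    for _ in range(generations):
        def select():                                                  # simple tournament selection
            return max(random.sample(population, 2), key=fitness)
        children = [mutate(crossover(select(), select()))              # 2.-3. crossover, then mutation
                    for _ in range(pop_size)]
        # 4.-5. evaluate everybody and keep the best, letting children and parents compete
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
    return max(population, key=fitness)                                # step 6 is the loop itself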
NSGA-II, from the Deb paper cited above, is one of the most widely used and well-known genetic algorithms; the article includes a flow diagram of the whole procedure.

What is the correct way to execute selection procedure in a genetic algorithm?

I am working on a genetic algorithm for the symmetric TSP in VB.NET. I want to know the correct way to execute the selection procedure. There seem to be at least two different possibilities:
1)
-create a "reproduction pool" of size R using a SELECTION(pop) function
-run the offspring creation cycle
-for each offspring that needs to be created in each iteration, randomly (uniformly) select two parents from that pool
2)
-run the offspring creation cycle
-use a modified SELECTION(pop) function that returns two different parents from pop
-perform crossover to produce a child
Bonus question: after selecting two parents it is possible to produce two different offspring (if the crossover operator is not commutative): CROSS(p1, p2) and CROSS(p2, p1).
Should I insert both offspring immediately, or produce them one by one? Will this make a difference?
Currently I am producing them one by one, because I think it will give more variance in the population.
In genetic algorithms you don't use a separate reproduction pool; you sample from the population (of size |N|) until you have 2*|N| parents, out of which you create |N| children. If your reproduction pool R is of size 2*|N| and you sample randomly out of that pool, it's essentially the same behavior, but you need more random numbers and it's more expensive to compute (depending on your RNG). Note that there is no need to ensure the two parents are different. A parent mated with itself will produce a child identical to the parent (if the crossover is idempotent); this is similar to applying a crossover probability. Also, checking whether two parents are different may be quite expensive if you compare them structurally. You could compare them by fitness instead, but often very different solutions have the same quality.
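A sketch of that in Python (select stands for whatever per-individual selection scheme you use, e.g. tournament or roulette wheel, and crossover returns one child; the names are illustrative):

def make_children(population, select, crossover):
    """Sample 2*|N| parents straight from the population (no separate pool) and pair them up."""
    n = len(population)
    parents = [select(population) for _ in range(2 * n)]
    return [crossover(parents[2 * i], parents[2 * i + 1]) for i in range(n)]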
Regarding your second question: I would say it doesn't matter much. I would choose to return only one child, for simplicity: a method that takes two solutions and returns one solution is easier to deal with than one that returns an array of solutions. Returning both is only interesting in cases where you can create two distinct solutions, which is the case for binary or real-valued encodings. With permutations, however, you cannot guarantee this property, and some genetic information will be lost in the crossover anyway.
It depends on the encoding.
You can take the two fittest individuals of the current population.
Or you can use roulette wheel selection (Google it) to associate each individual with a reproduction rate; this is the usual way.
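A minimal roulette wheel selection sketch in Python (it assumes non-negative fitness values where higher is better; for a minimisation problem like TSP you would first convert tour length into such a fitness):

import random

def roulette_wheel_select(population, fitness):
    """Pick one individual with probability proportional to its fitness."""
    weights = [fitness(ind) for ind in population]
    pick = random.uniform(0, sum(weights))
    running = 0.0
    for ind, w in zip(population, weights):
        running += w
        if running >= pick:
            return ind
    return population[-1]  # guard against floating-point rounding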

When to check a chromosome for validity in a genetic algorithm

I'm trying to develop a plugin for Rhino which generates architectural floor plans (based on Shape Grammars). The plugin is written in C# using the RhinoCommon API. Using different rules, represented as genes in a chromosome, I transform a starting geometry. Using GA, a fitness function determines the optimum sequence of transformation rules to generate a geometry that matches parametric criteria (area, views, minimal construction, etc.).
As the geometry represents architecture, there are some constructive rules to adhere to. My question is about the general approach of Genetic Algorithms:
When do I check the validity of the geometry created by the chromosome? At the gene insertion point or do I just give invalid geometries a bad fitness value?
When I add a gene (representing a geometric transformation operation) to the chromosome, I can check if this leads to invalid geometry. For example:
My starting shape is a rectangle. One transformation option is to divide one side of the rectangle into two parts. The gene would look something like this: [DIVIDE:TOP:0.25]. This creates a side consisting of two segments, split at the quarter mark.
If I already know that a segment has to be of a certain length, this gene could be creating invalid geometry; in this example, the short segment at the top would violate that constraint. Do I implement this geometry check (which could be considerably more complex for other rules than in this example) at the gene-insertion point, or do I wait until the fitness function to validate it? In this example, a check would mean that when I add a segment-split gene, I verify that the resulting segments are within an allowed range.
Not checking could lead to a population consisting of chromosomes that generate invalid geometry, or of individuals with a very bad fitness. Checking could guarantee a population of "valid" chromosomes, but generating the chromosomes could take much longer.
What would be a better strategy?
I think that either approach should work fine and, depending on your other parameters, will show almost identical behavior. As long as invalid genomes are never selected to be parents, letting invalid genomes through is equivalent to removing them at the gene-insertion point, provided that in the first case your population is correspondingly bigger than in the second case. Say you estimate that about 33% of your gene insertions result in invalid genomes; then, when letting invalid genomes through, you'd want your population to be about 1.5 times as large as when you reject invalid genomes as they are produced, so that the expected number of valid genomes is the same. In both cases the algorithm will allow only valid genomes to be selected as parents, leading to very similar results.
In your case though, it might just be easier to reject invalid genomes at insertion point, which will ensure that all potential parents are valid.
I would finally like to point out that if you are using a significant amount of your evaluation time rejecting invalid genomes, you might want to consider ways of changing your genetic operators so that they can only produce valid genomes. I'm not sure the best way to do this in your GA, but often in genetic programming a developmental approach is used, which enforces only valid changes to be made to an "embryonic" solution.
I think the check has to be both in the embryogeny (mapping) function and in the fitness function, because a crossover between two valid parents, (DIVIDE:TOP:0.25) and (DIVIDE:BOTTOM:0.05), could create offspring like (DIVIDE:TOP:0.05), which would create much too short segments even though the syntax is valid. Since this has to be evaluated by the fitness function anyway, a check at the gene-creation or mutation point is superfluous.
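For completeness, the "check in the fitness" option can be expressed as a penalty term. In this rough Python sketch, decode, geometric_score, the segments attribute and the length threshold are all hypothetical placeholders for the plugin's own logic:

def fitness(chromosome, decode, geometric_score, min_segment_length=1.0, penalty=1e6):
    """Map the chromosome to geometry, then penalise constraint violations instead of rejecting."""
    geometry = decode(chromosome)                 # embryogeny / mapping step
    too_short = sum(1 for seg in geometry.segments if seg.length < min_segment_length)
    return geometric_score(geometry) - penalty * too_short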

Choosing parents to crossover in genetic algorithms?

First of all, this is a part of a homework.
I am trying to implement a genetic algorithm. I am confused about selecting parents to crossover.
In my notes (obviously something is wrong) this is the example given:
Pc (probability of crossover) * population size = expected number of chromosomes to cross over (if not even, round to the closest even number)
Choose a random number in the range [0,1] for every chromosome, and if this number is smaller than Pc, choose this chromosome for a crossover pair.
But when the second step is applied, the number of chosen chromosomes is supposed to equal the result found in the first step, which is not guaranteed because of the randomness.
So this does not make sense to me. I searched for selecting parents for crossover, but all I found were crossover techniques (one-point, cut and splice, etc.) and how to cross over the chosen parents (I do not have a problem with these). I just don't know which chromosomes I should choose for crossover. Any suggestions or a simple example?
You can implement it like this:
For every new child you decide, with some probability, whether it will result from crossover. If yes, you select two parents, e.g. through roulette wheel selection or tournament selection. The two parents make a child, then you mutate it with the mutation probability and add it to the next generation. If no, you select only one "parent", clone it, mutate it with the mutation probability and add it to the next population.
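Sketched in Python (select_one stands for whichever selection scheme you pick, crossover and mutate are your own operators, and the probabilities are only example values):

import random

def next_generation(population, select_one, crossover, mutate, p_crossover=0.8, p_mutation=0.1):
    """Each child comes either from crossover of two selected parents or from cloning one parent."""
    children = []
    while len(children) < len(population):
        if random.random() < p_crossover:
            child = crossover(select_one(population), select_one(population))
        else:
            child = list(select_one(population))   # clone a single selected parent
        if random.random() < p_mutation:
            child = mutate(child)
        children.append(child)
    return children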
A few other observations I'd like to comment on. I often read the word "chromosome" where it should be "individual". You hardly ever select chromosomes; you select full individuals. A chromosome is just one part of a solution. That may be nitpicking, but a solution is not a chromosome: a solution is an individual that consists of one or more chromosomes, which consist of genes, whose expression takes the form of alleles. Often an individual has only one chromosome, but it's still not okay to mix the terms.
Also, I noted that you tagged genetic-programming, which is basically a special type of genetic algorithm. In GP the chromosome is a tree, which can represent mathematical formulas or computer programs. Your question does not seem to be about GP, though.
This is a very late answer, but hopefully it will help someone in the future. Even if two chromosomes are not paired (and do not produce children), they go to the next generation (without crossover), but after some mutation (again subject to a probability). On the other hand, if two chromosomes are paired, they produce two children (replacing the original two parents) in the next generation. That is why the number of chromosomes remains the same across generations.

How to apply the Levenshtein distance to a set of target strings?

Let TARGET be a set of strings that I expect to be spoken.
Let SOURCE be the set of strings returned by a speech recognizer (that is, the possible sentences that it has heard).
I need a way to choose a string from TARGET. I read about the Levenshtein distance and the Damerau-Levenshtein distance, which basically return the distance between a source string and a target string, that is, the number of changes needed to transform the source string into the target string.
But, how can I apply this algorithm to a set of target strings?
I thought I'd use the following method:
For each string that belongs to TARGET, I calculate the distance from each string in SOURCE. In this way we obtain an m-by-n matrix, where m is the cardinality of TARGET and n is the cardinality of SOURCE. We could say that the i-th row represents the similarity of the sentences detected by the speech recognizer with respect to the i-th target.
Calculating the average of the values on each row, you obtain the average distance between the i-th target and the output of the speech recognizer. Let's call it average_on_row(i), where i is the row index.
Finally, for each row, I calculate the standard deviation of all values in the row and sum these standard deviations. The result is a column vector in which each element (let's call it standard_deviation_sum(i)) refers to a string of TARGET.
The string associated with the smallest standard_deviation_sum could be the sentence pronounced by the user. Can the method I used be considered correct? Or are there other methods?
Obviously, values that are too high indicate that the sentence pronounced by the user probably does not belong to TARGET.
I'm not an expert, but your proposal does not make sense. First of all, in practice I'd expect the cardinality of TARGET to be very large, if not infinite. Second, I don't believe the Levenshtein distance or a similar similarity metric will be useful.
If:
you could really define the SOURCE and TARGET sets,
all strings in SOURCE were equally probable,
all strings in TARGET were equally probable,
the strings in SOURCE and TARGET consisted not of characters but of phonemes,
then I believe your best bet would be to find the pair p in SOURCE, q in TARGET such that distance(p,q) is minimal. Since you especially cannot guarantee the equal-probability part, I think you should approach the problem from scratch, do some research, and come up with a completely different design. The usual methodology for speech recognition is the use of Hidden Markov Models; I would start from there.
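If you did go with the plain minimum-distance matching under those assumptions, it reduces to something like this Python sketch (a textbook dynamic-programming Levenshtein distance plus a brute-force search over both sets; the function names are illustrative):

def levenshtein(s, t):
    """Classic dynamic-programming edit distance between two sequences."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1,                    # deletion
                            curr[j - 1] + 1,                # insertion
                            prev[j - 1] + (cs != ct)))      # substitution (0 if symbols match)
        prev = curr
    return prev[-1]

def best_target(source_set, target_set):
    """Return the TARGET string closest to any of the recognizer's SOURCE strings."""
    return min(target_set, key=lambda q: min(levenshtein(p, q) for p in source_set))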
Answer to your comment: Choose whichever is more probable. If you don't consider probabilities, it is hopeless.
[Suppose the following example is on phonemes, not characters]
Suppose the recognized word is "chees" and the target set is {"cheese", "chess"}. You must calculate P(cheese|chees) and P(chess|chees). What I'm trying to say is that not every substitution is equiprobable. If you model probabilities as distances between strings, then you must at least allow that, for example, d("c","s") < d("c","q") (it is common to confuse the letters c and s, but not common to confuse c and q). Adapting the distance-calculation algorithm is easy; coming up with good values for all pairs is difficult.
You must also somehow estimate P(cheese|context) and P(chess|context). If we are talking about board games, chess is more probable; if we are talking about dairy products, cheese is more probable. This is why you'll need large amounts of data to come up with such estimates, and also why Hidden Markov Models are good for this kind of problem.
You need to calculate these probabilities first: probability of insertion, deletion and substitution. Then use log of these probabilities as penalties for each operation.
In a "context independent" situation, if pi is probability of insertion, pd is probability of deletion and ps probability of substitution, the probability of observing the same symbol is pp=1-ps-pd.
In this case use log(pi/pp/k), log(pd/pp) and log(ps/pp/(k-1)) as penalties for insertion, deletion and substitution respectively, where k is the number of symbols in the system.
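As a sketch, such a weighted edit distance might look like this in Python. Here the operation costs are the negatives of the log terms above, so that the usual minimising recursion maximises the log probability; pi, pd, ps and k are assumed to have been estimated beforehand, and the function name is illustrative:

import math

def weighted_edit_distance(source, target, pi, pd, ps, k):
    """Edit distance whose operation costs come from insertion/deletion/substitution probabilities."""
    pp = 1.0 - ps - pd                         # probability of observing the same symbol
    ins = -math.log(pi / pp / k)
    dele = -math.log(pd / pp)
    sub = -math.log(ps / pp / (k - 1))
    prev = [j * ins for j in range(len(target) + 1)]
    for i, cs in enumerate(source, 1):
        curr = [i * dele]
        for j, ct in enumerate(target, 1):
            curr.append(min(prev[j] + dele,
                            curr[j - 1] + ins,
                            prev[j - 1] + (0.0 if cs == ct else sub)))
        prev = curr
    return prev[-1]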
Essentially, if you use this distance measure between a source and a target, you get the log probability of observing that target given the source. If you have a bunch of training data (i.e. source-target pairs), choose some initial estimates for these probabilities, align the source-target pairs, and re-estimate the probabilities (i.e. an EM strategy).
You can start with one set of probabilities and assume context independence. Later you can assume some kind of clustering among the contexts (e.g. assume there are k different sets of letters whose substitution rates differ...).
