genetic algorithm crossover operation - genetic-algorithm

I am trying to implement a basic genetic algorithm in MATLAB. I have some questions regarding the cross-over operation. I was reading materials on it and I found that always two parents are selected for cross-over operation.
What happens if I happen to have an odd number of parents?
Suppose I have parent A, parent B & parent C and I cross parent A with B and again parent B with C to produce offspring, even then I get 4 offspring. What is the criteria for rejecting one of them, as my population pool should remain the same always? Should I just reject the offspring with the lowest fitness value ?
Can an arithmetic operation between parents, like suppose OR or AND operation be deemed a good crossover operation? I found some sites listing them as crossover operations but I am not sure.
How can I do crossover between multiple parents ?

"Crossover" isn't so much a well-defined operator as the generic idea of taking aspects of parents and using them to produce offspring similar to each parent in some ways. As such, there's no real right answer to the question of how one should do crossover.
In practice, you should do whatever makes sense for your problem domain and encoding. With things like two parent recombination of binary encoded individuals, there are some obvious choices -- things like n-point and uniform crossover, for instance. For real-valued encodings, there are things like SBX that aren't really sensible if viewed from a strict biological perspective. Rather, they are simply engineered to have some predetermined properties. Similarly, permutation encodings offer numerous well-known operators (Order crossover, Cycle crossover, Edge-assembly crossover, etc.) that, again, are the result of analysis of what features in parents make sense to make heritable for particular problem domains.
You're free to do the same thing. If you have three parents (with some discrete encoding like binary), you could do something like the following:
child = new chromosome(L)
for i=1 to L
switch(rand(3))
case 0:
child[i] = parentA[i]
case 1:
child[i] = parentB[i]
case 2:
child[i] = parentC[i]
Whether that is a good operator or not will depend on several factors (problem domain, the interpretation of the encoding, etc.), but it's a perfectly legal way of producing offspring. You could also invent your own more complex method, e.g., taking a weighted average of each allele value over multiple parents, doing boolean operations like AND and OR, etc. You can also build a more "structured" operator if you like in which different parents have specific roles. The basic Differential Evolution algorithm selects three parents, a, b, and c, and computes an update like a + F(b - c) (with some function F) roughly corresponding to an offspring.

Consider reading the following academic articles:
DEB, Kalyanmoy et al. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE transactions on evolutionary computation, v. 6, n. 2, p. 182-197, 2002.
DEB, Kalyanmoy; AGRAWAL, Ram Bhushan. Simulated binary crossover for continuous search space. Complex systems, v. 9, n. 2, p. 115-148, 1995.
For SBX, method of crossing and mutate children mentioned by #deong, see answer simulated-binary-crossover-sbx-crossover-operator-example
Genetic algorithm does not have an arbitrary and definite form to be made. Many ways are proposed. But generally, what applies in all are the following steps:
Generate a random population by lot or any other method
Cross parents to raise children
Mutate
Evaluate the children and parents
Generate new population based only on children or children and parents (different approaches exist)
Return to item 2
NSGA-II, the DEB quoted above, is one of the most widely used and well-known genetic algorithms. See an image of the flow taken from the article:

Related

Grammatical Evolution (GE)

In the Grammatical Evolution (GE) algorithm (here the web: grammatical-evolution.org) there is the option to make the length of the individuals self-adaptive. I would like to know:
What is the most common strategy used when the individual's length is self-adaptive. In other words, how does the the length of the individual evolve.
Does it increase and decrease the size or just increase.
Is there any well documented or illustrative example.
Thanks in advance.
In GE, the individuals are necessary variable-length in order to be able to encode programs (structures) of variable length.
Initialization
The initial population already consists of individuals of varying size. The initialization procedures may vary, but the one I'm familiar the most uses the grammar to create the individuals. You basically do the same thing as when you want to decode the individual into the program (using the grammar) but instead of choosing the grammar expansions using the individual, you do it the other way around - you choose the expansions randomly and record these random decisions. When the expansion is finished, your recorded decisions form an individual.
Crossover
The crossover operator in GE already modifies length of the indiviuals. It's a classical single-point crossover, but the crossover points are chosen completely randomly in both parents (in contrast to the classical single-point crossover from GAs where the parents are of the same length and the crossover point is aligned). This mechanism is capable of both growing and shrinking of the individuals.
Example on crossover: suppose you have two individuals and you have randomly chosen the crossover points
parent 1: XXXXXXXXXXXX|XXXX
parent 2: YYY|YYYYYYYYYYYYYYYYYYYYYY
After crossover, the children look like this
parent 1: XXXXXXXXXXXX|YYYYYYYYYYYYYYYYYYYYYY
parent 2: YYY|XXXX
As you can see, the length of the individuals was changed dramatically.
Pruning
However, there is a second mechanism that is used for length reduction only. It is the pruning operator. This operator, when invoked (with a probability, in the same way e.g. the mutation is invoked), deletes the non-active part (i.e. if the grammar expansion finished before all codons were used, the remaining part is the non-active part) of the genotype.

Decision Tree Binary Classifier shortcut (sorting)

Normally, at each node of the decision tree, we consider all features and all splitting points for each feature. We calculate the difference between the entropy of the entire node and the weighted avg of the entropies of potential left and right branches, and the feature + splitting feature_value that gives us the greatest entropy drop is chosen as the splitting criterion for that particular node.
Can someone explain why the above process, which requires (2^m -2)/2 tries for each feature at each node, where m is the number of distinct feature_values at the node, is the same as trying ONLY m-1 splits:
sort the m distinct feature_values by the percentage of 1's of the samples within the node that takes that feature_value for that feature.
Only try the m-1 ways of splitting the sorted list.
This 'trying only m-1 splits' method is mentioned as a 'shortcut' in the article below, which (by definition of 'shortcut') means the results of the two methods which differ drastically in runtime are exactly the same.
The quote:"For regression and binary classification problems, with K = 2 response classes, there is a computational shortcut [1]. The tree can order the categories by mean response (for regression) or class probability for one of the classes (for classification). Then, the optimal split is one of the L – 1 splits for the ordered list. "
The article:
http://www.mathworks.com/help/stats/splitting-categorical-predictors-for-multiclass-classification.html?s_tid=gn_loc_drop&requestedDomain=uk.mathworks.com
Note that I'm talking only about categorical variables.
Can someone explain why the above process, which requires (2^m -2)/2 tries for each feature at each node, where m is the number of distinct feature_values at the node, is the same as trying ONLY m-1 splits:
The answer is simple: both procedures just aren't the same. As you noticed, splitting in the exact way is an NP-hard problem and thus hardly feasible for any problem in practice. Moreover, due to overfitting that would usually be not the optimal result in terms of generaluzation.
Instead, the exhaustive search is replaced by some kind of greedy procedure which goes like: sort first, then try all ordered splits. In general this leads to different results than the exact splitting.
In order to improve on the greedy result, one further often applies pruning (which can be seen as another greedy and heuristic method). And never methods like random forests or BART deal with this problem effectively by averaging over several trees -- so that the deviation of a single tree becomes less important.

Find mutually compatible options from list of list of options

For purposes of this question, let us call a list of mutually incompatible options for "OptionS". I have a list of such OptionS, where each Option, apart from disqualifying all other Options in it's own OptionS list, also disqualify some Options from the other OptionS lists. These rules are symmetrical, so if A forbids B, B forbids A.
I want to pick exactly one Option from each list, such that no Options disqualify each other. There are too many Options (and OptionS) and too few disqualifications in each step to brute force a backtracking solution.
It reminds be a bit of Sudoku, but it is not an exact analog. From certain external factors, I have a rough likelihood for the different Options, or at least an ordering.
Is there a known better solution to this problem? Is it in NP?
Currently, I plan to just take random "paths" through the solution space, weighted by likelihood. A sort of simulated annealing.
EDIT - Clarification
I have a number, let's say between 5 and 500, of vectors.
Each vector contains a number, between 10 and 10000, of elements
Each element rules out a number of elements in the other vectors
This relation is symmetric
I want to pick exactly one element from each vector in a way that no elements disqualify each other
If there is no way to choose one from each vector, I want to at least choose as many as possible. The nature of the data is such that there will always be at least one (and at most a few) solution (or almost-solution - with just a few misses).
I cannot share the real data, but an example would be that the elements are integers between 1 and 10e9 and that only elements whose pairwise sum has more than P prime factors are allowed. Some numbers are more likely than others to "fit" other numbers, since larger numbers tend to have more factors, which makes some choices more likely just like the real one.
Pick P and the sizes and number of vectors as needed to make it suitably challenging :).
My naive solution:
I order the elements by how many other elements they rule out and try those who rule out few first (because that gives you a larger chance to be able to pick one from each).
Then I order the vectors by how many elements the "best" element rules out. Vectors that rule out many other elements are first. So the most constrained vector is tried first, even though the least constrained elements of that vector are tried first.
I then search depth first
The problem with this approach is that if the first choice is wrong, then the depth first search will never have time to reach the next choice.
A better way, which I try to explain in a comment below, would be to score each partial choice (node in the search tree) of elements according to how many you have chosen and how many elements are left. Then I could look deeper in the highest scoring node at each step, so the first choice is less rigid.
A similar way, which I might try first because it is slightly easier, is to do simulated annealing and take random paths, weighted by how many possibilities they keep, down the tree.
Depending on what constraints are allowed, I think you can reduce SAT to this.
Take a SAT expression e.g. (A|B|C)(~A|C|~D)...
Replace ~A by a and make a vector out of each term giving you {A,B,C} {a,C,d}...
You now have the problem of choosing one element from each vector, subject to the constraint that you cannot choose both versions of a variable - the constraints say that A is incompatible with a, B is incompatible with b, and so on.
If you can solve this instance of your problem you can solve SAT by setting to true variables that are chosen in your problem as A, B, C,... to false variables that are chosen as a, b, c,.. and making an arbitrary choice for anything not chosen - therefore your problem is at least as hard as SAT. (Except if you don't encounter these sorts of constraints, in which case I have not proved that your problem is this hard).
Given an instance of your problem, associate a variable with each element, write the constraints as boolean expressions (typically with only 2 variables) to give something which looks like 2-SAT, except that you need an expression for each vector of the form (A|B|C|D|...) to say that you must choose at least one element from each vector - so the exact solution version of your problem, at least, might code up quite nicely as input for a SAT-solver - so it is in NP and since we have already shown it is NP-hard it is NP-complete.
My first recommendation would be to find an off-the-shelf constraint solver and try that (request a maximum-weight solution with the log-likelihoods as weights), but if you're determined to implement a solver from scratch, then I would suggest that you start with something like WalkSAT. To summarize the link in the language of your question: at all times, keep a list of option choices (one from each option list, not necessarily compatible) and a list of conflicts (i.e., a set of pairs of indexes into the list of option lists). Repeatedly choose a conflict at random and resolve it by choosing differently for one half of the conflict or the other (most of the time) so as to decrease the number of conflicts afterward as much as possible or (some of the time) randomly, perhaps according to the likelihoods. Good data structures will be essential in making this run fast.

What is the correct way to execute selection procedure in a genetic algorithm?

I am working on a genetic algorithm for symmetric TSP in VB.NET. I want to know what is the correct way go execute selection procedure. There seems to be at least two different possibilities:
1)
-create a "reproduction pool" of size R by using SELECTION(pop) function
-do offspring creation cycle
-randomly (uniformly) select two parents from that pool for each offspring
that needs to be created in each iteration
2)
-do offspring creation cycle
-use modified SELECTION(pop) function that will return two different parents from pop
-perform crossover to produce a child
Bonus question: After selecting two parents it is possible to produce two different offsprings (if the crossover operator is mot commutative): CROSS(p1, p2) and CROSS(p2, p1).
Should I insert both offsprings immediately or produce them one by one ? Will this make a difference?
Currently I am producing them one by one because I think it will give more variance in the population.
In genetic algorithms you don't use a separate reproduction pool, but sample from the population (|N| until you have 2*|N| parents out of which you create |N| children). If your reproduction pool R is of size 2*|N| and you sample randomly out of that pool it's essentially the same behavior, but you need more random numbers and it's more expensive to compute (depending on your RNG). Note, that there is no need to care about getting two different parents. A parent mated by itself will produce a child that is the same as the parent (if the crossover is idempotent). It's similar to using crossover probability. Also the check if two parents are different may be quite expensive if you compare them structurally. You could also compare them by fitness, but often you can have very different solutions of the same quality.
Regarding your second question: I would say that it doesn't matter much. I would choose to return only one child for simplicity reasons: A method that takes two solutions and returns one solution is easier to deal with than one that returns an array of solutions. I would say that returning both is only interesting in those cases where you can create two distinct solutions. This is the case of binary or real-valued encoding. With permutations however you cannot guarantee this property and some genetic information will be lost in the crossover anyway.
It depends on the codification.
You can consider the two fittest individuals of the current population.
Or you can use roulette whell selection (Google it) to associate each individual with a reproduction rate, this is the usual way.

Choosing parents to crossover in genetic algorithms?

First of all, this is a part of a homework.
I am trying to implement a genetic algorithm. I am confused about selecting parents to crossover.
In my notes (obviously something is wrong) this is what is done as example;
Pc (possibility of crossover) * population size = estimated chromosome count to crossover (if not even, round to one of closest even)
Choose a random number in range [0,1] for every chromosome and if this number is smaller then Pc, choose this chromosome for a crossover pair.
But when second step applied, chosen chromosome count is equals to result found in first step. Which is not always guaranteed because of randomness.
So this does not make any sense. I searched for selecting parents for crossover but all i found is crossover techniques (one-point, cut and slice etc.) and how to crossover between chosen parents (i do not have a problem with these). I just don't know which chromosomesi should choose for crossover. Any suggestions or simple example?
You can implement it like this:
For every new child you decide if it will result from crossover by random probability. If yes, then you select two parents, eg. through roulette wheel selection or tournament selection. The two parents make a child, then you mutate it with mutation probability and add it to the next generation. If no, then you select only one "parent" clone it, mutate it with probability and add it to the next population.
Some other observations I noted and that I like to comment. I often read the word "chromosomes" when it should be individual. You hardly ever select chromosomes, but full individuals. A chromosome is just one part of a solution. That may be nitpicking, but a solution is not a chromosome. A solution is an individual that consists of several chromosomes which consist of genes which show their expression in the form of alleles. Often an individual has only one chromosome, but it's still not okay to mix terms.
Also I noted that you tagged genetic programming which is basically only a special type of a genetic algorithm. In GP you consider trees as a chromosome which can represent mathematical formulas or computer programs. Your question does not seem to be about GP though.
This is very late answer, but hoping it will help someone in the future. Even if two chromosomes are not paired (and produced children), they goes to the next generation (without crossover) but after some mutation (subject to probability again). And on the other hand, if two chromosomes paired, then they produce two children (replacing the original two parents) for the next generation. So, that's why the no of chromosomes remain same in two generations.

Resources