I'm wondering if I can select the same two parents in two iterations of selection in a genetic algorithm (in the same population, with tournament selection).
Can I?
A lot of these decisions are made after experimentation with one's particular software and domain.
Of course two parents can generate more than two children. This may happen because:
the crossover operator creates more than two children;
tournament selection repeatedly picks the same parents (with a simple steady-state population this is a common event).
Generally it's not recommended to create too many individuals from the same parents, because the search can follow too "restricted" a trend (what "too many" means is debatable).
So you can often find some form of prevention. Apart from checking explicitly for the "same parents" occurrence, there are other techniques.
E.g.
demetic grouping: the same parents can generate numerous offspring, but the children compete among themselves.
family competition replacement schemes are a different way of limiting the number of repeated crossovers between the same parents.
...
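As an illustration of the explicit "same parents" check mentioned above, here is a minimal sketch. The function names, the tournament size `k` and the retry limit `max_tries` are assumptions made for this example, not part of any particular library.

```python
import random

def tournament_select(population, fitness, k=2, rng=random):
    """Return the fittest of k individuals drawn uniformly at random."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)

def select_distinct_parents(population, fitness, k=2, max_tries=10, rng=random):
    """Run two tournaments; retry the second a few times if it returns
    the very same individual (identity check) as the first."""
    first = tournament_select(population, fitness, k, rng)
    second = tournament_select(population, fitness, k, rng)
    tries = 0
    while second is first and tries < max_tries:
        second = tournament_select(population, fitness, k, rng)
        tries += 1
    return first, second
```

Nothing here stops the same pair from being chosen again in a later iteration; the check only avoids mating an individual with itself within one selection step.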
Related
I've been trying to build a schedule generator for my school using topological sort, but am stuck dealing with classes that have prerequisites that can be taken concurrently. I was wondering if there was any clever way to modify topological sort to deal with these concurrent classes? For example, an intro to CS course can either be taken before a Data Structures course or at the same time as a Data Structures course. I'm trying to include the case where they are taken together.
You could create a dummy node combining the two courses (assuming each course has only a small number of concurrent courses, since you will likely need every combination of them; this should work fine if you have only one or two concurrent courses).
The prerequisites of the combined node will be the combined prerequisites of both courses, and every course that has either of the two as a prerequisite will have the dummy node as a prerequisite as well.
As postprocessing, once the topological sort has ended, you can clean up the redundancies and split the dummy nodes back into the original courses.
That said, note that topological sort doesn't guarantee that the dummy node is placed before the original nodes, even when that is possible. So there is no guarantee it will actually be used unless you break ties in favor of dummy nodes when possible.
I can't mathematically guarantee its correctness, but this slight modification should work.
Use the normal topological sort with one difference. Assign all possible beginning nodes a value of 0. For each node that is queued, assign it its parent node's value + 1 (the largest such value if it has several parents). That way, all nodes at a given value would ideally be parallel and can be picked together.
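A minimal sketch of this level-assignment idea follows. The function name `topo_levels` and the input format (a dict mapping each course to a list of its prerequisites, with every course present as a key) are assumptions for the example. Taking the maximum over all prerequisites is one reading of "parent node's value + 1" when a course has several parents.

```python
from collections import deque

def topo_levels(prereqs):
    """Map each course to a level; level 0 means no prerequisites."""
    dependents = {course: [] for course in prereqs}
    remaining = {}
    for course, reqs in prereqs.items():
        unique = set(reqs)
        remaining[course] = len(unique)
        for r in unique:
            dependents[r].append(course)
    # All possible beginning nodes get value 0.
    level = {c: 0 for c, n in remaining.items() if n == 0}
    queue = deque(level)
    while queue:
        node = queue.popleft()
        for d in dependents[node]:
            # Keep the largest "parent value + 1" seen so far.
            level[d] = max(level.get(d, 0), level[node] + 1)
            remaining[d] -= 1
            if remaining[d] == 0:
                queue.append(d)
    if any(n > 0 for n in remaining.values()):
        raise ValueError("prerequisites contain a cycle")
    return level
```

Courses that end up with the same level have no dependency path between them, so they are candidates for being taken in the same term.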
Kahn's algorithm for topological sorting naturally produces a minimum-length schedule with concurrency:
1. Make a dependency graph of all your courses.
2. Select all courses with no dependencies. These can be taken concurrently.
3. Remove the selected courses from the graph.
4. If the graph is not empty, go back to (2).
Of course, students are limited in the number of courses they can take simultaneously, and the problem gets tricky when you also impose a limit on maximum concurrency. Deciding the best courses to take first, when too many courses are available, is an NP-hard problem. There are some heuristics you can try, though, like deferring the jobs with the shortest dependent depth.
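The layered procedure, together with the "defer the shortest dependent depth" heuristic, could be sketched as follows. This is only a greedy heuristic and not an optimal scheduler. The name `schedule_with_cap`, the `max_per_semester` parameter and the dict-of-prerequisites input are assumptions for the example, and the input is assumed to be acyclic.

```python
from functools import lru_cache

def schedule_with_cap(prereqs, max_per_semester):
    """Greedy layered schedule: each semester takes the available courses
    with the longest chain of dependents first, up to the cap."""
    if max_per_semester < 1:
        raise ValueError("max_per_semester must be at least 1")
    dependents = {course: [] for course in prereqs}
    for course, reqs in prereqs.items():
        for r in reqs:
            dependents[r].append(course)

    @lru_cache(maxsize=None)
    def depth(course):
        # Length of the longest chain starting at this course
        # (assumes the prerequisite graph has no cycles).
        return 1 + max((depth(d) for d in dependents[course]), default=0)

    remaining = {course: set(reqs) for course, reqs in prereqs.items()}
    semesters = []
    while remaining:
        ready = [c for c, reqs in remaining.items() if not reqs]
        if not ready:
            raise ValueError("prerequisites contain a cycle")
        # Deepest chains first; ties are broken by name.
        ready.sort(key=lambda c: (-depth(c), c))
        chosen = ready[:max_per_semester]
        semesters.append(chosen)
        for c in chosen:
            del remaining[c]
        for reqs in remaining.values():
            reqs.difference_update(chosen)
    return semesters
```

With a cap at least as large as the number of courses, this reduces to the plain four-step procedure above.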
If you think about exactly what you want as output, things might become clearer. For instance, if your desired output is a potential list of which courses to take in which semester, then each vertex involved in the topological sort could be “course X on semester Y” rather than just “course X”. Then you'd get these edges, among many others:
intro to CS on semester 1 → data structures on semester 1
intro to CS on semester 1 → data structures on semester 2
This graph would of course be larger than one whose vertices are just courses: the number of vertices is now the number of courses times the maximum number of semesters in your education. But in a realistic setting, it appears to me that it wouldn't be too much to handle.
I have been using my own GA for a while where I use random selection and elitism (top 10% or so) to get 50% of my population. I then do crossover to produce the next 50%, followed by mutation of course. It sounds strange, but it has gotten me far enough in my problem to be satisfied with it for now.
I'd like to start using more elaborate selection methods, specifically ranked selection. I'd also like to employ a crossover probability.
My questions are:
When doing ranked selection, is each individual only allowed to be selected once?
What typically happens to the parents after crossover? Do they get replaced by children or do they also go onto the next generation?
When doing ranked selection, is each individual only allowed to be selected once?
Well, if every individual were allowed to be selected only once, you would have to copy the whole population to form the new one. In ranked selection you just pick probabilistically, with probability proportional to the rank of the individual, and let chance decide whether, and how many times, an individual gets copied.
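A small sketch of linear rank selection with replacement, to make the point concrete. The function name `rank_select` and the linear weights 1..N are assumptions for the example; other rank-to-probability mappings are possible.

```python
import random

def rank_select(population, fitness, n, rng=random):
    """Draw n individuals with replacement, with probability
    proportional to rank (worst has rank 1, best has rank N)."""
    ranked = sorted(population, key=fitness)      # worst first
    weights = list(range(1, len(ranked) + 1))     # rank as weight
    return rng.choices(ranked, weights=weights, k=n)
```

Because sampling is done with replacement, a highly ranked individual will usually appear several times among the selected parents, while a low-ranked one may not appear at all.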
What typically happens to the parents after crossover? Do they get replaced by children or do they also go onto the next generation?
It depends. If you have a so-called generational scheme, you always generate a whole new population that replaces the old one completely. The members of this new population are from these four "sources":
Elites copied directly from the parent population.
Individuals selected from the parent population which were neither crossed over nor mutated (i.e. copied directly too).
Children of parents that were selected from the parent population which were crossed over but not mutated.
Mutated children of parents that were selected from the parent population and crossed over.
On the other hand, you can have a so-called steady-state scheme. In this scheme, in each iteration you select just enough individuals to be able to perform crossover, cross them over (if probability allows), mutate them (if probability allows) and then somehow put them back into the original population. That means that someone must get thrown away. This may be the parents or the children (if one is worse than the other), or an arbitrary member of the population, depending on your replacement strategy. You can use, for example, "inverse" selection, i.e. selection with the probabilities the other way around (the worst gets the highest probability while the best gets the lowest).
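One iteration of such a steady-state scheme might look like the sketch below. Several things are assumptions for the example: the name `steady_state_step`, the defaults for `p_cross` and `p_mut`, uniform parent selection as a placeholder, individuals stored as lists, and the choice of inverse rank selection as the replacement strategy.

```python
import random

def steady_state_step(population, fitness, crossover, mutate,
                      p_cross=0.9, p_mut=0.1, rng=random):
    """Create one child and put it back into the population in place,
    so the population size never changes."""
    a, b = rng.sample(population, 2)  # placeholder parent selection
    child = crossover(a, b) if rng.random() < p_cross else list(a)
    if rng.random() < p_mut:
        child = mutate(child)
    # "Inverse" rank selection: sort indices best first, so the best
    # gets weight 1 and the worst gets weight N.
    order = sorted(range(len(population)),
                   key=lambda i: fitness(population[i]), reverse=True)
    weights = list(range(1, len(order) + 1))
    victim = rng.choices(order, weights=weights, k=1)[0]
    population[victim] = child
```

Swapping the victim choice for "replace a parent if the child is better" gives one of the other replacement strategies mentioned above.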
One final remark - in the realm of GAs, almost any mechanism you come up with may work for your particular problem, or it may not. You just have to try. It's a stochastic method after all.
I am trying to implement a basic genetic algorithm in MATLAB. I have some questions regarding the cross-over operation. I was reading materials on it and I found that always two parents are selected for cross-over operation.
What happens if I happen to have an odd number of parents?
Suppose I have parent A, parent B and parent C, and I cross parent A with B and then parent B with C to produce offspring; even then I get 4 offspring. What are the criteria for rejecting one of them, given that my population size should always remain the same? Should I just reject the offspring with the lowest fitness value?
Can a logical operation between parents, such as OR or AND, be considered a good crossover operation? I found some sites listing them as crossover operations, but I am not sure.
How can I do crossover between multiple parents?
"Crossover" isn't so much a well-defined operator as the generic idea of taking aspects of parents and using them to produce offspring similar to each parent in some ways. As such, there's no real right answer to the question of how one should do crossover.
In practice, you should do whatever makes sense for your problem domain and encoding. With things like two parent recombination of binary encoded individuals, there are some obvious choices -- things like n-point and uniform crossover, for instance. For real-valued encodings, there are things like SBX that aren't really sensible if viewed from a strict biological perspective. Rather, they are simply engineered to have some predetermined properties. Similarly, permutation encodings offer numerous well-known operators (Order crossover, Cycle crossover, Edge-assembly crossover, etc.) that, again, are the result of analysis of what features in parents make sense to make heritable for particular problem domains.
You're free to do the same thing. If you have three parents (with some discrete encoding like binary), you could do something like the following:
child = new chromosome(L)
for i=1 to L
    switch(rand(3))
        case 0:
            child[i] = parentA[i]
        case 1:
            child[i] = parentB[i]
        case 2:
            child[i] = parentC[i]
Whether that is a good operator or not will depend on several factors (problem domain, the interpretation of the encoding, etc.), but it's a perfectly legal way of producing offspring. You could also invent your own more complex method, e.g., taking a weighted average of each allele value over multiple parents, doing boolean operations like AND and OR, etc. You can also build a more "structured" operator if you like, in which different parents have specific roles. The basic Differential Evolution algorithm selects three parents, a, b, and c, and computes an update like a + F(b - c) (with F a scaling factor), roughly corresponding to an offspring.
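For reference, here is a runnable sketch of both ideas: the per-gene choice among several parents and the Differential Evolution style update. The function names and the default `F=0.8` are assumptions for the example. The second function covers only the a + F(b - c) step, not a full DE implementation.

```python
import random

def multi_parent_uniform_crossover(parents, rng=random):
    """Each gene of the child is copied from a randomly chosen parent."""
    length = len(parents[0])
    return [rng.choice(parents)[i] for i in range(length)]

def de_mutant(a, b, c, F=0.8):
    """Differential Evolution style update a + F * (b - c), per gene."""
    return [ai + F * (bi - ci) for ai, bi, ci in zip(a, b, c)]
```

For example, `de_mutant([1.0, 2.0], [3.0, 5.0], [1.0, 1.0], F=0.5)` gives `[2.0, 4.0]`.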
Consider reading the following academic articles:
DEB, Kalyanmoy et al. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE transactions on evolutionary computation, v. 6, n. 2, p. 182-197, 2002.
DEB, Kalyanmoy; AGRAWAL, Ram Bhushan. Simulated binary crossover for continuous search space. Complex systems, v. 9, n. 2, p. 115-148, 1995.
For SBX, the method of crossing and mutating children mentioned by #deong, see the answer simulated-binary-crossover-sbx-crossover-operator-example
There is no single, definitive form of a genetic algorithm; many variants have been proposed. In general, though, all of them share the following steps:
1. Generate a random population, by drawing lots or by any other method
2. Cross parents to produce children
3. Mutate
4. Evaluate the children and parents
5. Generate the new population based only on the children, or on the children and the parents (different approaches exist)
6. Return to item 2
NSGA-II, described in the Deb et al. article cited above, is one of the most widely used and well-known genetic algorithms. See an image of the flow taken from the article:
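The steps above can be sketched as a short loop over bit strings. Everything concrete here is an assumption chosen for brevity: the function name `run_ga`, binary tournament selection, one-point crossover, bit-flip mutation, and keeping the best of parents plus children. This is not the NSGA-II procedure.

```python
import random

def run_ga(fitness, length=20, pop_size=30, generations=40,
           p_mut=0.02, rng=random):
    # 1. Random initial population of bit strings.
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            # 2. Cross two parents chosen by binary tournament.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]
            # 3. Mutate: flip each bit with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        # 4-5. Evaluate parents and children together, keep the best.
        population = sorted(population + children,
                            key=fitness, reverse=True)[:pop_size]
        # 6. The loop returns to step 2.
    return max(population, key=fitness)
```

Calling `run_ga(sum)` maximizes the number of ones in the string, which is a convenient toy problem for checking that the loop works.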
I am working on a genetic algorithm for symmetric TSP in VB.NET. I want to know what the correct way to execute the selection procedure is. There seem to be at least two different possibilities:
1)
-create a "reproduction pool" of size R by using SELECTION(pop) function
-do offspring creation cycle
-randomly (uniformly) select two parents from that pool for each offspring that needs to be created in each iteration
2)
-do offspring creation cycle
-use modified SELECTION(pop) function that will return two different parents from pop
-perform crossover to produce a child
Bonus question: After selecting two parents it is possible to produce two different offspring (if the crossover operator is not commutative): CROSS(p1, p2) and CROSS(p2, p1).
Should I insert both offspring immediately or produce them one by one? Will this make a difference?
Currently I am producing them one by one because I think it will give more variance in the population.
In genetic algorithms you don't use a separate reproduction pool, but sample directly from the population N until you have 2*|N| parents, out of which you create |N| children. If your reproduction pool R is of size 2*|N| and you sample randomly out of that pool, the behavior is essentially the same, but you need more random numbers and it is more expensive to compute (depending on your RNG). Note that there is no need to care about getting two different parents. A parent mated with itself will produce a child that is the same as the parent (if the crossover is idempotent). This is similar to using a crossover probability. Also, checking whether two parents are different may be quite expensive if you compare them structurally. You could also compare them by fitness, but you can often have very different solutions of the same quality.
Regarding your second question: I would say that it doesn't matter much. I would choose to return only one child for simplicity reasons: A method that takes two solutions and returns one solution is easier to deal with than one that returns an array of solutions. I would say that returning both is only interesting in those cases where you can create two distinct solutions. This is the case of binary or real-valued encoding. With permutations however you cannot guarantee this property and some genetic information will be lost in the crossover anyway.
It depends on the encoding.
You can consider the two fittest individuals of the current population.
Or you can use roulette wheel selection (Google it) to associate each individual with a reproduction rate; this is the usual way.
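A minimal sketch of roulette wheel selection, assuming non-negative fitness values (the function name `roulette_select` is made up for the example). Each individual occupies a slice of the wheel proportional to its fitness.

```python
import random

def roulette_select(population, fitness, rng=random):
    """Pick one individual with probability proportional to its fitness.
    Assumes all fitness values are non-negative."""
    scores = [fitness(ind) for ind in population]
    total = sum(scores)
    if total <= 0:
        # Degenerate wheel: fall back to a uniform choice.
        return rng.choice(population)
    pick = rng.random() * total
    running = 0.0
    for ind, score in zip(population, scores):
        running += score
        if pick < running:
            return ind
    return population[-1]  # guard against floating-point round-off
```

To build a pair of parents, call it twice; the same individual may be returned both times.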
First of all, this is part of a homework assignment.
I am trying to implement a genetic algorithm. I am confused about selecting parents for crossover.
In my notes (obviously something is wrong) this is what is done as an example:
Pc (probability of crossover) * population size = estimated number of chromosomes to cross over (if not even, round to the nearest even number)
Choose a random number in the range [0,1] for every chromosome, and if this number is smaller than Pc, choose this chromosome for a crossover pair.
But in the notes, when the second step is applied, the number of chosen chromosomes equals the result found in the first step, which is not guaranteed because of the randomness.
So this does not make any sense. I searched for how to select parents for crossover, but all I found were crossover techniques (one-point, cut and splice, etc.) and how to cross over the chosen parents (I do not have a problem with these). I just don't know which chromosomes I should choose for crossover. Any suggestions or a simple example?
You can implement it like this:
For every new child, you decide randomly, using the crossover probability, whether it will result from crossover. If yes, you select two parents, e.g. through roulette wheel selection or tournament selection. The two parents make a child, then you mutate it with the mutation probability and add it to the next generation. If no, you select only one "parent", clone it, mutate it with the mutation probability and add it to the next population.
Here are some other observations I would like to comment on. I often read the word "chromosome" when it should be "individual". You hardly ever select chromosomes, but full individuals. A chromosome is just one part of a solution. That may be nitpicking, but a solution is not a chromosome. A solution is an individual that consists of several chromosomes, which consist of genes, which show their expression in the form of alleles. Often an individual has only one chromosome, but it's still not okay to mix the terms.
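The procedure just described could be sketched like this. The name `next_generation`, the defaults for `p_cross` and `p_mut`, and the use of binary tournament selection are assumptions for the example; `crossover` and `mutate` are whatever operators suit your encoding, and individuals are assumed to be lists.

```python
import random

def next_generation(population, fitness, crossover, mutate,
                    p_cross=0.7, p_mut=0.05, rng=random):
    """Build a new population of the same size, one child at a time."""
    def select():
        # Binary tournament; roulette wheel selection would also work.
        return max(rng.sample(population, 2), key=fitness)

    new_population = []
    while len(new_population) < len(population):
        if rng.random() < p_cross:
            child = crossover(select(), select())  # two parents, one child
        else:
            child = list(select())                 # clone a single parent
        if rng.random() < p_mut:
            child = mutate(child)
        new_population.append(child)
    return new_population
```

Because the crossover decision is made per child, the number of children produced by crossover varies from generation to generation and only averages `p_cross` times the population size.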
Also I noted that you tagged genetic programming which is basically only a special type of a genetic algorithm. In GP you consider trees as a chromosome which can represent mathematical formulas or computer programs. Your question does not seem to be about GP though.
This is a very late answer, but I hope it will help someone in the future. Even if two chromosomes are not paired (and do not produce children), they go on to the next generation (without crossover), but after some mutation (again subject to probability). On the other hand, if two chromosomes are paired, they produce two children (replacing the original two parents) for the next generation. That's why the number of chromosomes remains the same from one generation to the next.