Maintaining Population Size in a Genetic Algorithm/Program - genetic-algorithm

I'm writing a genetic program, but it's been a while, so I'm a little rusty.
If I start with a population size of 100 individuals, and select 50 through tournament selection for reproduction, and after crossover each pair produces 50 next-generation individuals, I'm left with 100 1st-gen individuals (which will no longer reproduce, no longer part of the "population") and 50 current-gen individuals. So my tournament selection of 50 won't really work. Should the tournament selected individuals also go on to the next generation? Or should they reproduce 2:1 somehow?
Thanks for the refresher!

There are many ways to perform selection and crossover in a Genetic Algorithm, but generally, if you're using tournament selection, it's best to select as many individuals as there are in your population and have them produce the same number of offspring.
There are a number of ways to produce as many offspring as parents. As an example, with a straightforward one-point crossover, the first part of each parent is carried forward together with the remaining part of the other parent. That way two parents produce two offspring. For example:
Parent 1: 00000000
Parent 2: 11111111
With a crossover point after the third bit.
Offspring 1: 00011111
Offspring 2: 11100000
Afterwards you can discard your entire initial population and replace them with all the offspring.
Note: This doesn't take into account any specialised operator you may want to include which can help to carry the best individuals forward every population. But that's another story....
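A minimal sketch of the scheme above, assuming bit-string individuals of equal length. The names `onePointCrossover`, `tournament` and `nextGeneration`, the tournament size of 2 and the `rng` parameter are all made up for illustration; none of them come from the question.

```javascript
// One-point crossover on equal-length strings: two parents in, two offspring out.
function onePointCrossover(parentA, parentB, point) {
  return [
    parentA.slice(0, point) + parentB.slice(point),
    parentB.slice(0, point) + parentA.slice(point),
  ];
}

// Tournament selection: sample `size` individuals at random, return the fittest.
function tournament(population, fitness, size, rng) {
  let best = population[Math.floor(rng() * population.length)];
  for (let i = 1; i < size; i++) {
    const challenger = population[Math.floor(rng() * population.length)];
    if (fitness(challenger) > fitness(best)) best = challenger;
  }
  return best;
}

// Build a full replacement generation with the same size as the old one.
function nextGeneration(population, fitness, rng = Math.random) {
  const offspring = [];
  while (offspring.length < population.length) {
    const a = tournament(population, fitness, 2, rng);
    const b = tournament(population, fitness, 2, rng);
    const point = 1 + Math.floor(rng() * (a.length - 1)); // cut strictly inside the string
    offspring.push(...onePointCrossover(a, b, point));
  }
  return offspring.slice(0, population.length); // trim one child if the size is odd
}

console.log(onePointCrossover("00000000", "11111111", 3)); // [ '00011111', '11100000' ]
```

Because every pair of selected parents yields two children, running as many tournaments as there are individuals keeps the population size constant, and the old generation can simply be discarded.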

Related

A good randomizer for puzzle-15

I have implemented a 15-puzzle for people to compete online. My current randomizer works by starting from the solved configuration and moving tiles around for 100 moves (an arbitrary number).
Everything is fine, however once in a while the shuffle comes out too easy and it takes only a few moves to solve the puzzle, so the game is unfair: some people reach better scores much faster than others.
What would be a good way to randomize the initial configuration so it is not "too easy"?
You can generate a completely random configuration (that is solvable) and then use some solver to determine the optimal sequence of moves. If the sequence is long enough for you, good, otherwise generate a new configuration and repeat.
Update & details
There is an article on Wikipedia about the 15-puzzle and when it is (and isn't) solvable. In short, if the empty square is in the lower-right corner, then the puzzle is solvable if and only if the arrangement of the tiles can be reached from the goal by an even number of swaps of two tiles (not necessarily adjacent ones).
You can then easily generate a solvable start state by doing an even number of such swaps, which may lead to a not-so-easy-to-solve state far quicker than doing regular moves, and the state is guaranteed to remain solvable.
In fact, you don't need to use a search algorithm as I mentioned above; an admissible heuristic is enough. Such a heuristic never overestimates the number of moves needed to solve the puzzle, i.e. you are guaranteed that solving will not take fewer moves than the heuristic tells you.
A good heuristic is the sum of the Manhattan distances of each number to its goal position.
Summary
In short, a possible (very simple) algorithm for generating starting positions might look like this:
1: current_state <- goal_state
2: swap two arbitrary (randomly selected) pieces
3: swap two arbitrary (randomly selected) pieces again (to ensure solvability)
4: h <- heuristic(current_state)
5: if h > desired threshold
6:     return current_state
7: else
8:     go to 2.
To be absolutely certain about how difficult a state is, you need to find the optimal solution using some solver. Heuristics will give you only an estimate.
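The algorithm above could be sketched roughly as follows. This assumes a 4x4 board stored as a flat array with `0` as the blank, that the blank stays in the lower-right corner, and that each swap exchanges two different numbered tiles (swapping a tile with itself would break the even-swap guarantee). The names `manhattan`, `swapTwoTiles` and `generateStart`, and the threshold of 30 used in the example call, are arbitrary choices for illustration.

```javascript
const SIZE = 4;
// Goal state 1..15 followed by the blank (0) in the lower-right corner.
const GOAL = Array.from({ length: SIZE * SIZE }, (_, i) => (i + 1) % (SIZE * SIZE));

// Sum of Manhattan distances of every tile to its goal cell (blank ignored).
function manhattan(state) {
  let total = 0;
  state.forEach((tile, index) => {
    if (tile === 0) return;
    const goalIndex = tile - 1;
    total += Math.abs(Math.floor(index / SIZE) - Math.floor(goalIndex / SIZE));
    total += Math.abs((index % SIZE) - (goalIndex % SIZE));
  });
  return total;
}

// Swap two different tiles in place; the blank at the last index is never touched.
function swapTwoTiles(state, rng) {
  const last = state.length - 1;
  const i = Math.floor(rng() * last);
  let j = Math.floor(rng() * (last - 1));
  if (j >= i) j++; // shifts j past i, so j !== i
  [state[i], state[j]] = [state[j], state[i]];
}

// Steps 1-8: keep applying pairs of swaps until the heuristic exceeds the threshold.
function generateStart(threshold, rng = Math.random) {
  const state = GOAL.slice();
  do {
    swapTwoTiles(state, rng);
    swapTwoTiles(state, rng); // second swap restores even parity
  } while (manhattan(state) <= threshold);
  return state;
}

console.log(generateStart(30));
```

The threshold has to be reachable, otherwise the loop never ends. The heuristic is only a lower bound on the real solution length, so a solver is still needed to know the exact difficulty.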
I would do this:
1. Start from the solution (just like you did).
2. Make a valid move in a random direction.
   You must keep track of where the gap is, generate a random direction (N, E, S, W) and do the move. I think you have done this part too.
3. Compute the randomness of your placements.
   Compute some coefficient that depends on the order of the array, so that ordered (solved) boards have low values and random boards have high values. The equation for the coefficient is a matter of trial and error, however. Here are some ideas for what to use:
   - correlation coefficient
   - sum of the average differences between each value and its neighbors
       1 2 4
       3 6 5
       9 8 7
       coeff(6) = (|6-3| + |6-5| + |6-2| + |6-8|)/4
       coeff = coeff(1) + coeff(2) + ... + coeff(15)
   - absolute distance from the ordered array
   You can combine several approaches. You can also split the board into separate rows and columns and then combine the sub-coefficients.
4. Loop #2 until the coefficient from #3 is high enough (threshold).
   The threshold can also be used to change the difficulty.
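Here is a rough sketch of that loop using the "average difference to neighbors" idea as the coefficient. The function names, the `maxMoves` safety cap and the decision to ignore the gap when comparing neighbors are assumptions made for this example, and the threshold value passed at the end is a guess that would need tuning by trial and error.

```javascript
const DIRECTIONS = [[-1, 0], [1, 0], [0, -1], [0, 1]]; // N, S, W, E

// Average absolute difference between one cell and its orthogonal neighbors.
// A gap (value 0) among the neighbors is skipped.
function cellCoeff(board, index, n) {
  const row = Math.floor(index / n);
  const col = index % n;
  const diffs = [];
  for (const [dr, dc] of DIRECTIONS) {
    const r = row + dr;
    const c = col + dc;
    if (r < 0 || r >= n || c < 0 || c >= n) continue;
    const neighbor = board[r * n + c];
    if (neighbor !== 0) diffs.push(Math.abs(board[index] - neighbor));
  }
  return diffs.reduce((a, b) => a + b, 0) / diffs.length;
}

// Sum of the per-cell coefficients over all tiles (the gap itself is skipped).
function disorder(board, n) {
  return board.reduce((sum, tile, i) => (tile === 0 ? sum : sum + cellCoeff(board, i, n)), 0);
}

// Random walk of the gap from the solved board until the coefficient is high enough.
function shuffleBoard(threshold, n = 4, rng = Math.random, maxMoves = 100000) {
  const board = Array.from({ length: n * n }, (_, i) => (i + 1) % (n * n));
  let gap = n * n - 1;
  for (let m = 0; m < maxMoves && disorder(board, n) < threshold; m++) {
    const [dr, dc] = DIRECTIONS[Math.floor(rng() * 4)];
    const r = Math.floor(gap / n) + dr;
    const c = (gap % n) + dc;
    if (r < 0 || r >= n || c < 0 || c >= n) continue; // move would leave the board
    const target = r * n + c;
    [board[gap], board[target]] = [board[target], board[gap]];
    gap = target;
  }
  return board;
}

console.log(cellCoeff([1, 2, 4, 3, 6, 5, 9, 8, 7], 4, 3)); // 2.5, as in the example grid
console.log(shuffleBoard(60));
```

Since only legal moves are made, the result is always solvable; raising or lowering the threshold is the difficulty knob.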

Crossmode in different length genes

I have two genes with different sizes and I want to produce offspring from them. The position of the chromosome doesn't make a difference in the gene.
I want to know what is commonly done in this situation.
Gene1:
123456789
Gene2:
ABCDEFGHIJKL
I can use a single cross point in each
12345.6789
ABCD.EFGHIJKL
And with this I have 8 possible combinations
1. 12345ABCD
2. 12345EFGHIJKL
3. 6789ABCD
4. 6789EFGHIJKL
5. ABCD12345
6. ABCD6789
7. EFGHIJKL12345
8. EFGHIJKL6789
Is it okay to create all 8 offspring, or should I just make 1? If so, do I need to randomize the method, or can I just pick one and stick with it?
Genetic algorithms mimic biological processes in which chromosomes cross over at one point and exchange the parts after the crossover point (if we are talking about single-point crossover).
The parents exchange the "tail" parts of the chromosome after the crossover point, so crossover produces only 2 offspring/children. That's how crossover occurs in nature, as biologists describe it.
If you refer to the literature on Genetic Algorithms, it also states this convention: with single-point crossover, parent chromosomes are split into a head and a tail, denoted H/T like this (see citation below):
  H     T
123456.789
  H     T
ABCDEF.GHI
Therefore offspring produced with this crossover will be:
123456GHI
ABCDEF789
Following this convention is much better than creating all possible combinations and then selecting a random or the fittest offspring, as it is computationally more efficient. If you want to solve more complex problems, you simply increase the size of the population to allow more diversity.
"Single point crossover: A single random cut is made, producing two head sections and two tail sections. The two tail sections are then swapped to produce two new individuals (chromosomes)".
Affenzeller, Michael; Wagner, Stefan; Winkler, Stephan: Genetic Algorithms and Genetic Programming. ISBN: 1584886293
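As a small sketch of the head/tail convention, here is a tail swap that also works when the parents have different lengths by taking a separate cut point for each parent. The function name and the two-cut-point signature are assumptions for illustration, not something taken from the cited book.

```javascript
// Swap tails: child 1 = head of A + tail of B, child 2 = head of B + tail of A.
// cutA and cutB are the head lengths of parentA and parentB.
function singlePointCrossover(parentA, parentB, cutA, cutB = cutA) {
  return [
    parentA.slice(0, cutA) + parentB.slice(cutB),
    parentB.slice(0, cutB) + parentA.slice(cutA),
  ];
}

// Equal-length example from above.
console.log(singlePointCrossover("123456789", "ABCDEFGHI", 6));
// [ '123456GHI', 'ABCDEF789' ]

// Different-length parents from the question, cut as 12345.6789 and ABCD.EFGHIJKL.
console.log(singlePointCrossover("123456789", "ABCDEFGHIJKL", 5, 4));
// [ '12345EFGHIJKL', 'ABCD6789' ]
```

With different cut points the two children can differ in length from both parents, so the rest of the algorithm has to tolerate variable-length chromosomes.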
Alternatively, you can use multipoint crossover, which follows a similar convention: the chromosomes are split into sections and the parents exchange parts so that each offspring chromosome alternates between the parents' sections. So if you have parents with chromosomes:
A1.A2.A3.A4
B1.B2.B3.B4
|
| this will produce offspring
|
A1 B2 A3 B4
and
B1 A2 B3 A4
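The alternation can be sketched like this, assuming both parents are strings of the same length and the cut points are given in ascending order. `multipointCrossover` is just an illustrative name.

```javascript
// Split both parents at the same cut points and alternate the sections.
function multipointCrossover(parentA, parentB, cutPoints) {
  const bounds = [0, ...cutPoints, parentA.length];
  let childA = "";
  let childB = "";
  for (let s = 0; s < bounds.length - 1; s++) {
    const sectionA = parentA.slice(bounds[s], bounds[s + 1]);
    const sectionB = parentB.slice(bounds[s], bounds[s + 1]);
    const swap = s % 2 === 1; // every second section comes from the other parent
    childA += swap ? sectionB : sectionA;
    childB += swap ? sectionA : sectionB;
  }
  return [childA, childB];
}

console.log(multipointCrossover("AAAAAAAA", "BBBBBBBB", [2, 4, 6]));
// [ 'AABBAABB', 'BBAABBAA' ]
```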
This answer might help you as well:
Crossover of chromosomes with different length
It seems you use "gene" where you mean "chromosome" and vice versa.
In this case, and if chromosomes of different sizes are okay, you can create all 8 offspring. But then your population grows in each iteration, and you should control this. For example, keep the 2 best offspring or 2 random offspring and replace their parents with them.

Choosing parents to crossover in genetic algorithms?

First of all, this is a part of a homework.
I am trying to implement a genetic algorithm. I am confused about selecting parents to crossover.
In my notes (obviously something is wrong) this is what is done as an example:
1. Pc (probability of crossover) * population size = estimated number of chromosomes to cross over (if not even, round to the closest even number).
2. Choose a random number in the range [0,1] for every chromosome, and if this number is smaller than Pc, choose this chromosome for a crossover pair.
But in the notes, when the second step is applied, the number of chosen chromosomes equals the result found in the first step, which is not guaranteed because of randomness.
So this does not make any sense. I searched for how to select parents for crossover, but all I found were crossover techniques (one-point, cut and splice, etc.) and how to cross over between chosen parents (I do not have a problem with these). I just don't know which chromosomes I should choose for crossover. Any suggestions or a simple example?
You can implement it like this:
For every new child, you decide by random probability whether it will result from crossover. If yes, you select two parents, e.g. through roulette wheel selection or tournament selection. The two parents make a child, then you mutate it with the mutation probability and add it to the next generation. If no, you select only one "parent", clone it, mutate it with the mutation probability and add it to the next generation.
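A minimal sketch of that child-by-child loop, assuming bit-string individuals, tournament selection with a tournament size of 3, and one-point crossover that yields a single child. All names and parameter values here are illustrative assumptions.

```javascript
// Tournament selection: sample `size` individuals, keep the fittest.
function select(pop, fitness, size, rng) {
  let best = pop[Math.floor(rng() * pop.length)];
  for (let i = 1; i < size; i++) {
    const other = pop[Math.floor(rng() * pop.length)];
    if (fitness(other) > fitness(best)) best = other;
  }
  return best;
}

// One-point crossover that returns a single child.
function crossover(a, b, rng) {
  const cut = 1 + Math.floor(rng() * (a.length - 1));
  return a.slice(0, cut) + b.slice(cut);
}

// Flip each bit independently with probability pm.
function mutate(bits, pm, rng) {
  return [...bits].map(b => (rng() < pm ? (b === "0" ? "1" : "0") : b)).join("");
}

// pc = crossover probability, pm = per-bit mutation probability.
function nextGeneration(pop, fitness, pc, pm, rng = Math.random) {
  const next = [];
  while (next.length < pop.length) {
    let child;
    if (rng() < pc) {
      child = crossover(select(pop, fitness, 3, rng), select(pop, fitness, 3, rng), rng);
    } else {
      child = select(pop, fitness, 3, rng); // strings are immutable, so this acts as a clone
    }
    next.push(mutate(child, pm, rng));
  }
  return next;
}

const countOnes = s => [...s].filter(b => b === "1").length;
const pop = ["0000", "0011", "1100", "1111", "0101", "1010"];
console.log(nextGeneration(pop, countOnes, 0.7, 0.01).length); // 6, same size as pop
```

Because the loop runs until the new generation is full, the population size stays fixed no matter how many children happen to come from crossover.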
Some other observations I'd like to comment on. I often read the word "chromosomes" when it should be "individuals". You hardly ever select chromosomes, but full individuals. A chromosome is just one part of a solution. That may be nitpicking, but a solution is not a chromosome. A solution is an individual that consists of several chromosomes, which consist of genes, which show their expression in the form of alleles. Often an individual has only one chromosome, but it's still not okay to mix the terms.
Also I noted that you tagged genetic programming which is basically only a special type of a genetic algorithm. In GP you consider trees as a chromosome which can represent mathematical formulas or computer programs. Your question does not seem to be about GP though.
This is a very late answer, but I hope it will help someone in the future. Even if two chromosomes are not paired (and so produce no children), they go on to the next generation without crossover, but after some mutation (subject to probability again). On the other hand, if two chromosomes are paired, they produce two children that replace the original two parents in the next generation. That's why the number of chromosomes remains the same from one generation to the next.
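The pair-based variant could look roughly like this, assuming the mating pool has already been chosen by some selection step and holds equal-length bit strings. `breedPairs` and `flipBits` are hypothetical helper names.

```javascript
// Flip each bit independently with probability pm.
function flipBits(bits, pm, rng) {
  return [...bits].map(b => (rng() < pm ? (b === "0" ? "1" : "0") : b)).join("");
}

// Walk the mating pool two at a time. With probability pc the pair is crossed
// and the two children replace the parents; otherwise the parents pass through.
// Either way both results are mutated, so the output has the same size as the input.
function breedPairs(matingPool, pc, pm, rng = Math.random) {
  const next = [];
  for (let i = 0; i + 1 < matingPool.length; i += 2) {
    let a = matingPool[i];
    let b = matingPool[i + 1];
    if (rng() < pc) {
      const cut = 1 + Math.floor(rng() * (a.length - 1));
      [a, b] = [a.slice(0, cut) + b.slice(cut), b.slice(0, cut) + a.slice(cut)];
    }
    next.push(flipBits(a, pm, rng), flipBits(b, pm, rng));
  }
  if (matingPool.length % 2 === 1) {
    // An unpaired last individual is carried over with mutation only.
    next.push(flipBits(matingPool[matingPool.length - 1], pm, rng));
  }
  return next;
}

console.log(breedPairs(["0000", "1111", "0011", "1100"], 0.8, 0.01));
```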

breeding parents for multiple children in genetic algorithm

I'm building my first Genetic Algorithm in javascript, using a collection of tutorials.
I'm building a somewhat simpler structure than the one in this scheduling tutorial http://www.codeproject.com/KB/recipes/GaClassSchedule.aspx#Chromosome8, but I've run into a problem with breeding.
I have a population of 60 individuals, and I'm currently picking the top two individuals to breed and then selecting a few random other individuals to breed with the top two. Am I not going to end up with a fairly small number of parents rather quickly?
I figure I'm not going to be making much progress in the solution if I breed the top two results with each of the next 20.
Is that correct? Is there a generally accepted method for doing this?
I have a sample of genetic algorithms in Javascript here.
One problem with your approach is that you are killing diversity in the population by mating always the top 2 individuals. That will never work very well because it's too greedy, and you'll actually be defeating the purpose of having a genetic algorithm in the first place.
This is how I am implementing mating with elitism (which means I am retaining a percentage of unaltered best fit individuals and randomly mating all the rest), and I'll let the code do the talking:
// save best guys as elite population and shove into temp array for the new generation
for (var e = 0; e < ELITE; e++) {
    tempGenerationHolder.push(fitnessScores[e].chromosome);
}
// randomly select a mate (including elite) for all of the remaining ones
// using double-point crossover should suffice for this silly problem
// note: this should create INITIAL_POP_SIZE - ELITE new individuals
for (var s = 0; s < INITIAL_POP_SIZE - ELITE; s++) {
    // generate random number between 0 and INITIAL_POP_SIZE - ELITE - 1
    var randInd = Math.floor(Math.random() * (INITIAL_POP_SIZE - ELITE));
    // mate the individual at index s with the individual at the random index
    var child = mate(fitnessScores[s].chromosome, fitnessScores[randInd].chromosome);
    // push the result in the new generation holder
    tempGenerationHolder.push(child);
}
It is fairly well commented but if you need any further pointers just ask (and here's the github repo, or you can just do a view source on the url above). I used this approach (elitism) a number of times, and for basic scenarios it usually works well.
Hope this helps.
When I've implemented genetic algorithms in the past, what I've done is to pick the parents always probabilistically - that is, you don't necessarily pick the winners, but you will pick the winners with a probability depending on how much better they are than everyone else (based on the fitness function).
I cannot remember the name of the paper to back it up, but there is a mathematical proof that "ranking" selection converges faster than "proportional" selection. If you try looking around for "genetic algorithm selection strategy" you may find something about this.
EDIT:
Just to be more specific, since pedalpete asked, there are two kinds of selection algorithms: one based on rank, one based on fitness proportion. Consider a population with 6 solutions and the following fitness values:
Solution   Fitness Value
A          5
B          4
C          3
D          2
E          1
F          1
In ranking selection, you would take the top k (say, 2 or 4) and use those as the parents for your next generation. In proportional selection, to form each "child", you randomly pick the parent with a probability based on its fitness value:
Solution   Probability
A          5/16
B          4/16
C          3/16
D          2/16
E          1/16
F          1/16
In this scheme, F may end up being a parent in the next generation. With a larger population size (100 for example - may be larger or smaller depending on the search space), this will mean that the bottom solutions will end up being a parent some of the time. This is OK, because even "bad" solutions have some "good" aspects.
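Fitness-proportional picking like in the table above is often implemented as a "roulette wheel". A minimal sketch, assuming non-negative fitness values; the `rouletteSelect` name and the `{ name, fitness }` object shape are made up for this example.

```javascript
// Pick one entry with probability fitness / (sum of all fitness values).
function rouletteSelect(entries, rng = Math.random) {
  const total = entries.reduce((sum, e) => sum + e.fitness, 0);
  let r = rng() * total; // a point on the wheel in [0, total)
  for (const entry of entries) {
    r -= entry.fitness;
    if (r < 0) return entry;
  }
  return entries[entries.length - 1]; // guard against floating-point rounding
}

const solutions = [
  { name: "A", fitness: 5 },
  { name: "B", fitness: 4 },
  { name: "C", fitness: 3 },
  { name: "D", fitness: 2 },
  { name: "E", fitness: 1 },
  { name: "F", fitness: 1 },
];

// Count how often each solution is picked; A should come up roughly 5/16 of the time.
const counts = {};
for (let i = 0; i < 16000; i++) {
  const picked = rouletteSelect(solutions).name;
  counts[picked] = (counts[picked] || 0) + 1;
}
console.log(counts);
```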
Keeping the absolute fittest individuals is called elitism, and it does tend to lead to faster convergence, which, depending on the fitness landscape of the problem, may or may not be what you want. Faster convergence is good if it reduces the amount of effort taken to find an acceptable solution but it's bad if it means that you end up with a local optimum and ignore better solutions.
Picking the other parents completely at random isn't going to work very well. You need some mechanism whereby fitter candidates are more likely to be selected than weaker ones. There are several different selection strategies that you can use, each with different pros and cons. Some of the main ones are described here. Typically you will use roulette wheel selection or tournament selection.
As for combining the elite individuals with every single one of the other parents, that is a recipe for destroying variation in the population (as well as eliminating the previously preserved best candidates).
If you employ elitism, keep the elite individuals unchanged (that's the point of elitism) and then mate pairs of the other parents (which may or may not include some or all of the elite individuals, depending on whether they were also picked out as parents by the selection strategy). Each parent will only mate once unless it was picked out multiple times by the selection strategy.
Your approach is likely to suffer from premature convergence. There are lots of other selection techniques to pick from though. One of the more popular that you may wish to consider is Tournament selection.
Different selection strategies provide varying levels of 'selection pressure'. Selection pressure is how strongly the strategy insists on choosing the best programs. If the absolute best programs are chosen every time, then your algorithm effectively becomes a hill-climber; it will get trapped in a local optimum with no way of navigating to other peaks in the fitness landscape. At the other end of the scale, no selection pressure at all means the algorithm will blindly stumble around the fitness landscape at random. So the challenge is to choose an operator with sufficient (but not excessive) selection pressure for the problem you are tackling.
One of the advantages of the tournament selection operator is that by just modifying the size of the tournament, you can easily tweak the level of selection pressure. A larger tournament will give more pressure, a smaller tournament less.
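To make the effect of tournament size concrete, here is a small sketch. It assumes candidates are drawn with replacement and that higher fitness is better; `tournamentSelect` and `averageWinnerFitness` are illustrative names only.

```javascript
// Draw `tournamentSize` random candidates and return the fittest of them.
function tournamentSelect(population, fitness, tournamentSize, rng = Math.random) {
  let best = null;
  for (let i = 0; i < tournamentSize; i++) {
    const candidate = population[Math.floor(rng() * population.length)];
    if (best === null || fitness(candidate) > fitness(best)) best = candidate;
  }
  return best;
}

// Rough measure of selection pressure: mean fitness of the tournament winner.
function averageWinnerFitness(population, fitness, tournamentSize, trials = 10000) {
  let sum = 0;
  for (let t = 0; t < trials; t++) {
    sum += fitness(tournamentSelect(population, fitness, tournamentSize));
  }
  return sum / trials;
}

// Toy population where each individual's fitness is just its value, 0..99.
const individuals = Array.from({ length: 100 }, (_, i) => i);
for (const size of [1, 2, 7]) {
  console.log(size, averageWinnerFitness(individuals, x => x, size).toFixed(1));
}
```

A tournament of size 1 is plain random selection, and the average winner fitness should climb as the size grows, which is the pressure knob described above.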

What is Crossover Probability & Mutation Probability in Genetic Algorithm or Genetic Programming?

What is Crossover Probability & Mutation Probability in Genetic Algorithm or Genetic Programming ? Could someone explain them from implementation perspective!
Mutation probability (or ratio) is basically a measure of the likelihood that random elements of your chromosome will be flipped into something else. For example, if your chromosome is encoded as a binary string of length 100 and you have a 1% mutation probability, it means that 1 out of your 100 bits (on average), picked at random, will be flipped.
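From an implementation perspective this usually comes down to one random draw per bit. A minimal sketch, assuming a bit-string chromosome; `mutateBits` and its return shape are made up for this example.

```javascript
// Flip each bit independently with probability pm and report how many flipped.
function mutateBits(chromosome, pm, rng = Math.random) {
  let flips = 0;
  const mutated = [...chromosome]
    .map(bit => {
      if (rng() < pm) {
        flips++;
        return bit === "0" ? "1" : "0";
      }
      return bit;
    })
    .join("");
  return { mutated, flips };
}

// 100 bits at a 1% mutation probability: about one flip per chromosome on average.
const bits = "0".repeat(100);
const runs = 10000;
let totalFlips = 0;
for (let i = 0; i < runs; i++) totalFlips += mutateBits(bits, 0.01).flips;
console.log((totalFlips / runs).toFixed(2)); // should hover around 1
```

Any single chromosome may end up with zero, one or several flipped bits; the 1-in-100 figure only holds on average.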
Crossover basically simulates sexual genetic recombination (as in human reproduction) and there are a number of ways it is usually implemented in GAs. Sometimes crossover is applied with moderation in GAs (as it breaks symmetry, which is not always good, and you could also go blind) so we talk about crossover probability to indicate a ratio of how many couples will be picked for mating (they are usually picked by following selection criteria - but that's another story).
This is the short story - if you want the long one you'll have to make an effort and follow the link Amber posted. Or do some googling - which last time I checked was still a good option too :)
According to Goldberg (Genetic Algorithms in Search, Optimization and Machine Learning) the probability of crossover is the probability that crossover will occur at a particular mating; that is, not all matings must reproduce by crossover, but one could choose Pc=1.0.
Probability of Mutation is per JohnIdol.
It shows the quantity of features that are inherited from the parents in crossover!
Note: If the crossover probability is 100%, then all offspring are made by crossover. If it is 0%, the whole new generation is made from exact copies of chromosomes from the old population (but this does not mean that the new generation is the same!).
Here is a reasonably good explanation of these two probabilities:
http://www.optiwater.com/optiga/ga.html
JohnIdol's answer on mutation probability matches what the website says:
"Each bit in each chromosome is checked for possible mutation by generating a random number between zero and one and if this number is less than or equal to the given mutation probability e.g. 0.001 then the bit value is changed."
As for crossover probability, it may be the proportion of the next generation that is born through the crossover operation, while the rest of the population may come from the previous selection,
or you can define the rest as the best-fit survivors.

Resources