I have a genetic algorithm and I have to calculate the fitness score for each chromosome. Must the fitness score be different for every chromosome, or can some of them share the same score?
You should use the same fitness function for all chromosomes in your population.
Imagine testing chromosome 1 on fitness function A, resulting in a high score. Imagine testing chromosome 2 on fitness function B, resulting in a low score. But chromosome 2 scores very high on function A, without you even knowing it.
The goal of a genetic algorithm is to optimize a solution to an objective - this objective should be equal for ALL chromosomes.
But I think you may have meant this: yes, it is perfectly feasible for two or more chromosomes to share the same score. That usually just means their ancestry is similar.
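To make that concrete, here is a minimal sketch in Python with a made-up bit-counting objective: one fitness function is applied to every chromosome, and nothing breaks if several chromosomes end up with the same score.

```python
# One objective for the whole population: here, simply the number of 1-bits
# (a stand-in for whatever your real objective is).
def fitness(chromosome):
    return sum(chromosome)

population = [
    [1, 0, 1, 1],
    [0, 1, 1, 1],   # different genes...
    [1, 1, 1, 0],   # ...but the same score as the others is perfectly fine
]

scores = [fitness(c) for c in population]
print(scores)  # [3, 3, 3] -- duplicate scores are allowed
```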
I have implemented a genetic algorithm in which the fitness function uses the coefficient of variation (COV) of the data as the fitness value, so the closer the COV is to zero, the better. Can it still be called a fitness function? Usually the fitness value is defined such that the greater the value, the better.
An abstract and direct answer: the fitness function is a measure of the quality of your solutions, and it should be well defined so that it drives the search toward a near-optimal solution for your problem instance. You can find more on designing fitness functions here: A guide for fitness function design
You can definitely build an EA for either a maximization or a minimization problem. A general evolutionary cycle is shown in the image below (from the textbook Introduction to Evolutionary Computing). As per the EA cycle, you need to evaluate your solutions after the population is created and again after offspring are created. Survivor selection is essentially the step where maximization versus minimization matters, and your problem is a minimization. For your problem, you might want to adopt one of the following approaches (a small sketch of both follows the list):
When you compute the fitness, negate it. With this approach, make sure you still choose the highest-fitness individuals for the next generation (i.e. in survivor selection).
Keep the fitness positive. Since you want to treat it as a minimization problem, choose the lowest-fitness individuals for the next generation (i.e. in survivor selection).
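As an illustration only (not from the book), here is a minimal Python sketch of both approaches, assuming a toy objective where smaller raw values are better:

```python
import random

# Toy objective to be minimized (e.g. a coefficient of variation); smaller is better.
def objective(individual):
    return abs(sum(individual))  # placeholder for your real objective

population = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(10)]
mu = 4  # number of survivors

# Approach 1: negate the objective so that "higher fitness" still means "better",
# then keep the individuals with the highest (negated) fitness.
survivors_1 = sorted(population, key=lambda ind: -objective(ind), reverse=True)[:mu]

# Approach 2: keep the objective positive and simply select the individuals
# with the LOWEST objective values during survivor selection.
survivors_2 = sorted(population, key=objective)[:mu]

# Both approaches keep the same individuals.
assert [objective(i) for i in survivors_1] == [objective(i) for i in survivors_2]
```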
When we compare the structure of the Genetic Algorithm (GA) with the structure of Particle Swarm Optimization (PSO), is it possible to say that:
The Population in GA = the Swarm in PSO.
The chromosome (potential solution) in GA = the Particle (potential solution) in PSO.
The genes of a chromosome in GA = the coordinates of a particle in PSO.
As you probably know, both GA (Genetic Algorithm) and PSO (Particle Swarm Optimization) fall under Evolutionary Computation. Both are population-based search heuristics.
Consider the original GA (I'm not even talking about slightly modified GAs here). It was inspired by the principles of genetics and evolution, and it mimics the reproductive behavior observed in biological populations. It follows the principle of "survival of the fittest" to select and generate offspring that are adapted to the environment or constraints. PSO, on the other hand, was inspired by the behavior of schools of fish and flocks of birds: how they react to external stimuli (to reach the maxima or minima of the fitness function) using their cognitive and collective power.
PSO is often considered an improvement over GA because it tends to take less time to compute the desired results, whereas GA is still widely used by many people and companies because it is easy to implement and understand.
Now, let's discuss your questions:
The Population in GA = the Swarm in PSO ?
I think the answer is yes, but not always, because there are cases where we cannot represent the population directly in PSO terms even though we can in GA. For example, with a discrete vector representation we can easily use GA, but we will have to make a few modifications before feeding the vectors into the PSO algorithm.
The chromosome (potential solution) in GA = the Particle (potential solution) in PSO?
Perhaps, but let me remind you why we use EC techniques at all: the problems they are applied to generally have a very large space of possible solutions, and we only ever explore a fraction of it in our output. So even if both algorithms reach the required benchmark, we cannot be sure their results will match.
The genes of a chromosome in GA = the coordinates of a particle in PSO?
As we know, GA is inherently designed to work with discrete vector representations, whereas PSO performs best on unconstrained problems with continuous variables. And since the original PSO was not very resistant to local maxima or minima, it can show premature convergence under constraints, which is rarely seen in GA (thanks to mutation). So I'd say no: the genes of the chromosomes in GA are not always the coordinates of the particles in PSO.
EDIT:
Could you please give me a simple example for this: "... we'll have to make few modification before feeding the vector into the PSO algorithm."
As I already said, GA is inherently designed for discrete evaluation, whereas PSO works better with continuous variables. Consider a case where we are given a family of strings of 0s and 1s and have to satisfy a given fitness function f(x) to terminate the loop. Since 0 and 1 are discrete values, applying crossover or mutation (in GA) produces a discrete output. We then feed that output into the fitness function to check its strength; if it survives f(x), we push it into the next generation.
In PSO, however, we would consider a position vector (the string of 0s and 1s in this case) and a velocity vector, interpreted as the probability of flipping each number to its opposite (the probability of changing a 0 to a 1 and vice versa). Suppose the probability of changing a 0 into a 1 is 0.6 and of changing a 1 into a 0 is 0.3. Applying these probabilities gives a transition state whose values lie between 0 and 1, which is not valid by itself, since we only expect discrete numbers. So we have to apply a threshold, e.g. below 0.5 becomes 0 and above 0.5 becomes 1. This benchmarked (modified) output is then used as the input for the next generation.
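Here is a rough sketch (my own illustration, not taken from any specific paper) of that thresholding idea for a binary-PSO-style position update, where the continuous velocity is squashed into a flip probability and then cut at 0.5:

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# One binary-PSO style position update: each continuous velocity component is
# turned into a probability, and that probability is "benchmarked" back into {0, 1}.
def update_position(position, velocity, threshold=0.5):
    new_position = []
    for bit, v in zip(position, velocity):
        p_one = sigmoid(v)  # probability that this bit becomes 1
        # Simple 0.5 cut-off, as described above; a common alternative is to
        # compare p_one against a fresh random number instead of a fixed threshold.
        new_position.append(1 if p_one > threshold else 0)
    return new_position

position = [0, 1, 1, 0, 1]
velocity = [random.uniform(-4, 4) for _ in position]
print(update_position(position, velocity))
```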
I am interested in plotting the performance of individual subpopulations in island-based distributed genetic algorithms. I have seen a couple of research works that calculate the rank and order of subpopulations and plot the rank against the generations to understand how the subpopulations evolve.
I could not understand how the rank of each subpopulation is calculated.
Could anyone please explain?
Generally, the rank of a sub-population is assigned based on some measure of its quality. Common choices are the average fitness value of the sub-population, the best fitness value of the sub-population, etc.
The rank may then be used as a measure to order the sub-populations.
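For example, here is a minimal Python sketch using the average fitness as the quality measure (the data and names are placeholders; the best fitness would work just as well):

```python
# islands: list of sub-populations; each sub-population is a list of fitness
# values recorded at one generation.
islands = [
    [0.81, 0.75, 0.90],   # island 0
    [0.60, 0.55, 0.70],   # island 1
    [0.95, 0.88, 0.91],   # island 2
]

# Quality of each sub-population: here the average fitness.
quality = [sum(f) / len(f) for f in islands]

# Rank 1 = best sub-population (assuming higher fitness is better).
order = sorted(range(len(islands)), key=lambda i: quality[i], reverse=True)
ranks = [0] * len(islands)
for rank, island_index in enumerate(order, start=1):
    ranks[island_index] = rank

print(ranks)  # [2, 3, 1] -- repeat per generation and plot rank vs. generation
```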
I am trying to solve this problem using a genetic algorithm and I'm finding it difficult to choose the fitness function.
My problem is a little different from the original Traveling Salesman Problem, since the solutions in the population (and maybe also the winning unit) do not necessarily contain all the cities.
So I have two values for each unit: the number of cities it visits, and the total time and order in which it visits them.
I tried 2-3 fitness functions, but they don't give good solutions.
I need an idea for a good fitness function that takes into account both the number of cities visited and the total time.
Thanks!
In addition to Peladao's suggestions of using a pareto approach or some kind of weighted sum, there are two more possibilities that I'd like to mention for the sake of completeness.
First, you could prioritize your goals, so that the individuals in the population are ranked by the first goal, then the second, then the third. Only if two individuals are equal on the first goal are they compared on the second. If there is a clear dominance among your goals, this can be a feasible approach (a small sketch follows this list).
Second, you could treat two of your goals as constraints that are penalized only when they exceed a certain threshold. This can be feasible when, e.g., the number of cities should stay within a certain range, e.g. [4;7], but within that range it doesn't matter whether it's 4 or 5. This is similar to a weighted-sum approach in which the contributions of the individual goals to the combined fitness value differ by several orders of magnitude.
The pareto approach is the only one that treats all objectives with equal importance. It requires special algorithms suited for multiobjective optimization though, such as NSGA-II, SPEA2, AbYSS, PAES, MO-TS, ...
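To illustrate the prioritized (lexicographic) approach mentioned above, here is a small sketch, assuming a tour is scored first by the number of cities visited (maximize) and then by the total time (minimize); the Tour class and its fields are placeholders for your own representation:

```python
from dataclasses import dataclass

@dataclass
class Tour:
    cities_visited: int
    total_time: float

population = [Tour(5, 120.0), Tour(6, 200.0), Tour(6, 150.0), Tour(4, 80.0)]

# Lexicographic ranking: compare on the primary goal first (more cities is better),
# and use the secondary goal (less time is better) only to break ties.
ranked = sorted(population, key=lambda t: (-t.cities_visited, t.total_time))

for t in ranked:
    print(t.cities_visited, t.total_time)
# 6 150.0
# 6 200.0
# 5 120.0
# 4 80.0
```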
In any case, it would be good if you could show the 2-3 fitness functions that you tried. Maybe there were rather simple errors.
Multiple-objective fitness functions can be implemented using Pareto optimality.
You could also use a weighted sum of the different fitness values (see the sketch below).
For a good and readable introduction into multiple-objective optimisation and GA: http://www.calresco.org/lucas/pmo.htm
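For the weighted-sum idea, a minimal sketch follows, assuming higher fitness is better; the weights are arbitrary here and would need tuning for your problem:

```python
# Weighted sum of the two goals: reward visited cities, penalize total time.
# w_cities and w_time are illustrative values only and need tuning.
def fitness(cities_visited, total_time, w_cities=10.0, w_time=0.1):
    return w_cities * cities_visited - w_time * total_time

print(fitness(cities_visited=6, total_time=150.0))  # 45.0
print(fitness(cities_visited=5, total_time=120.0))  # 38.0
```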
I'm using the ruby classifier gem whose classifications method returns the scores for a given string classified against the trained model.
Is the score a percentage? If so, is the maximum difference 100 points?
It's the logarithm of a probability. With a large training set, the actual probabilities are very small numbers, so the logarithms are easier to compare. Theoretically, scores range from infinitesimally close to zero down to negative infinity. 10**score * 100.0 will give you the actual probability as a percentage, which indeed has a maximum difference of 100.
Actually, to calculate the probability from a typical naive Bayes classifier, where b is the base, it is b^score / (1 + b^score). This is the inverse logit (http://en.wikipedia.org/wiki/Logit). However, given the independence assumptions of the NBC, these scores tend to be too high or too low, and probabilities calculated this way will accumulate at the boundaries. It is better to compute the scores on a holdout set and do a logistic regression of accuracy (1 or 0) on score to get a better feel for the relationship between score and probability.
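As a rough sketch of both steps (assuming base-10 log scores; the holdout data below is made up purely for illustration, and scikit-learn's LogisticRegression stands in for the suggested calibration step):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Inverse-logit conversion of a base-10 log score into a probability.
def score_to_probability(score, base=10.0):
    odds = base ** score
    return odds / (1.0 + odds)

print(score_to_probability(-0.5))   # ~0.24

# Better: calibrate on a holdout set by regressing "was the prediction correct"
# (1 or 0) on the raw score. These numbers are invented for illustration only.
holdout_scores = np.array([[-3.0], [-2.5], [-1.0], [-0.4], [-0.1], [-0.05]])
holdout_correct = np.array([0, 0, 0, 1, 1, 1])

calibrator = LogisticRegression().fit(holdout_scores, holdout_correct)
print(calibrator.predict_proba([[-0.3]])[:, 1])  # calibrated probability of being correct
```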
From a Jason Rennie paper:
2.7 Naive Bayes Outputs Are Often Overconfident
Text databases frequently have 10,000 to 100,000 distinct vocabulary words; documents often contain 100 or more terms. Hence, there is great opportunity for duplication.
To get a sense of how much duplication there is, we trained a MAP Naive Bayes model with 80% of the 20 Newsgroups documents. We produced p(c|d;D) (posterior) values on the remaining 20% of the data and show statistics on max_c p(c|d;D) in table 2.3. The values are highly overconfident. 60% of the test documents are assigned a posterior of 1 when rounded to 9 decimal digits. Unlike logistic regression, Naive Bayes is not optimized to produce reasonable probability values. Logistic regression performs joint optimization of the linear coefficients, converging to the appropriate probability values with sufficient training data. Naive Bayes optimizes the coefficients one-by-one. It produces realistic outputs only when the independence assumption holds true. When the features include significant duplicate information (as is usually the case with text), the posteriors provided by Naive Bayes are highly overconfident.