I'm trying to write a genetic algorithm for the Travelling Salesman Problem (TSP). For selection I'm implementing Roulette Wheel Selection: http://www.edc.ncl.ac.uk/highlight/rhjanuary2007g02.php/
It basically means that the probability of being selected for mating is proportional to the value of the fitness function.
The most common fitness function for TSP is the length of the route. However, the shorter the route, the better.
How can I write a fitness function that will describe the shortness of the route?
Or how can I convert the true length of each route to a probability?
You have a cost function (the lower the better) that you want to convert to a fitness function (the higher the better).
Use the inverse. If the cost (distance) is x then your fitness could become 1/x.
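For illustration, here is a minimal sketch in Python (the names are my own, not from any particular library) that applies the 1/x conversion and then performs one roulette-wheel draw:

    import random

    def roulette_select(routes, route_length):
        # Convert costs (lower is better) into fitness values (higher is better).
        fitnesses = [1.0 / route_length(r) for r in routes]
        total = sum(fitnesses)
        # Spin the wheel: walk the cumulative fitness until we pass the pick point.
        pick = random.uniform(0, total)
        cumulative = 0.0
        for route, fit in zip(routes, fitnesses):
            cumulative += fit
            if cumulative >= pick:
                return route
        return routes[-1]  # guard against floating-point rounding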
Actually that is not a problem for the fitness function, but for the selection step. You should also use windowing in proportional selection so that you scale the fitness values. Otherwise the operator will exert too little selection pressure: just imagine the values 573 and 579. They're very close and thus will receive about the same proportion of the wheel. Typically you scale them by the current best and worst fitness.
You can take a look at the ProportionalSelector that we implemented in HeuristicLab. You can even experiment with that software and explore different selection methods, crossovers, mutation operators, etc.
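As a hedged illustration of that windowing idea (scaling against the current worst cost so that near-equal good routes still get distinguishable slices of the wheel; the function name is my own):

    def windowed_fitness(costs):
        # Scale by the current worst (largest) cost so that small differences
        # between good routes translate into large fitness differences.
        worst = max(costs)
        return [worst - c for c in costs]

    # Raw costs 573 and 579 would get nearly identical 1/x slices, but
    # windowed against a worst of 580 they become 7 and 1 (the worst gets 0;
    # in practice you would add a small offset to keep it selectable).
    print(windowed_fitness([573, 579, 580]))  # [7, 1, 0]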
I am using evolutionary algorithms, e.g. the NSGA-II algorithm, to solve unconstrained optimization problems with multiple objectives.
As my fitness functions sometimes have very different domains (e.g. f1(x) generates fitness values within [0..1] and f2(x) within [10000..10000000]) I am wondering if this has an effect on the search behaviour of the selected algorithm.
Does the selection of the fitness function domain (e.g. scaling all domains to a common domain from [lb..ub]) impact the solution quality and the speed of finding good solutions? Or is there no general answer to this question?
Unfortunately, I could not find anything on this topic. Any hints are welcome!
Your question is related to the selection strategy implemented in the algorithm. In the case of the original NSGA-II, selection is made using a mixture of Pareto rank and crowding distance. While the Pareto rank (i.e. the non-dominated front id of a point) does not change when the numerical values are scaled by some constant, the crowding distance does.
So the answer is yes: if your second objective is in [10000..10000000], its contribution to the crowding distance might swamp that of the other objective.
In algorithms such as NSGA-II, units count!
I have just come across your question and I have to disagree with the previous answer. If you read the paper carefully, you will see that the crowding distance is supposed to be calculated in the normalized objective space, exactly so that one objective does not dominate another.
My PhD advisor is Kalyanmoy Deb, who proposed NSGA-II, and I have implemented the algorithm myself (available in our evolutionary multi-objective optimization framework pymoo). So I can state with certainty that normalization is supposed to be incorporated into the algorithm.
If you are curious about a vectorized crowding distance implementation feel free to have a look at pymoo/algorithms/nsga2.py in pymoo on GitHub.
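If you just want the idea rather than the vectorized version, here is a simplified sketch of crowding distance computed with per-objective normalization (my own code, not pymoo's actual implementation):

    import numpy as np

    def crowding_distance(F):
        # F: (n_points, n_objectives) objective values of one non-dominated front.
        n, m = F.shape
        d = np.zeros(n)
        for j in range(m):
            order = np.argsort(F[:, j])
            f = F[order, j]
            span = f[-1] - f[0]
            if span == 0:
                continue  # this objective is constant on the front
            d[order[0]] = d[order[-1]] = np.inf  # boundary points are always kept
            # Normalizing the neighbour gap by the objective's range means an
            # objective in [10000..10000000] cannot drown out one in [0..1].
            d[order[1:-1]] += (f[2:] - f[:-2]) / span
        return d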
I am trying to solve the N-puzzle using the A* algorithm with 3 different heuristic functions. I want to know how to compare each of the heuristics in terms of time complexity. The heuristics I am using are: Manhattan distance, Manhattan distance + linear conflict, and N-max swap. And specifically for an 8-puzzle and a 15-puzzle.
Finding the shortest solution to the N-puzzle is, in general, NP-hard, so no matter what heuristic you use, it's unlikely you'll be able to find any difference in complexity between them, since you won't be able to prove the tightness of any bound.
If you restrict yourself to the 8-puzzle or 15-puzzle, an A* algorithm with any admissible heuristic will run in O(1) time since there are a finite (albeit large) number of board positions.
As @Harold said in his comment, the approach to comparing the time complexity of heuristic functions is typically experimental. In your case, generate a set of n random problems for the 8-puzzle and the 15-puzzle and solve them using the different heuristic functions. Things to be aware of:
The comparison will always depend on several factors, like hardware specs, programming language, your skills when implementing the algorithm, ...
Generally speaking, a more informed heuristic will expand fewer nodes than a less informed one, and will probably be faster.
And finally, in order to compare the three heuristics for each problem set, I would suggest a graphic with average running times (repeat for example 5 times each problem) where:
The problems are in the x-axis sorted by difficulty.
The running times are in the y-axis for each heuristic function (perhaps in logarithmic scale if the difference between the alternatives cannot be easily seen).
and a similar graphic with the number of explored states.
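To make that concrete, here is a minimal, self-contained sketch of such an experiment for the 8-puzzle (all names are mine; it compares Manhattan distance against the weaker misplaced-tiles heuristic, and you would plug in linear conflict and N-max swap the same way):

    import heapq, random, time

    GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 is the blank
    N = 3

    def manhattan(state):
        # Sum of Manhattan distances of each tile from its goal cell.
        total = 0
        for i, tile in enumerate(state):
            if tile:
                g = tile - 1
                total += abs(i // N - g // N) + abs(i % N - g % N)
        return total

    def misplaced(state):
        # Number of tiles out of place (a weaker admissible heuristic).
        return sum(1 for i, tile in enumerate(state) if tile and tile != i + 1)

    def neighbors(state):
        # All states reachable by sliding one tile into the blank.
        b = state.index(0)
        r, c = divmod(b, N)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < N and 0 <= nc < N:
                s = list(state)
                j = nr * N + nc
                s[b], s[j] = s[j], s[b]
                yield tuple(s)

    def astar(start, h):
        # Returns (optimal solution length, number of expanded nodes).
        heap = [(h(start), 0, start)]
        best_g = {start: 0}
        expanded = 0
        while heap:
            f, g, state = heapq.heappop(heap)
            if state == GOAL:
                return g, expanded
            if g > best_g[state]:
                continue  # stale heap entry
            expanded += 1
            for nxt in neighbors(state):
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(heap, (ng + h(nxt), ng, nxt))

    def random_instance(moves=30):
        # Scramble the goal with random moves so the instance stays solvable.
        state = GOAL
        for _ in range(moves):
            state = random.choice(list(neighbors(state)))
        return state

    random.seed(0)
    problems = [random_instance() for _ in range(20)]
    for name, h in (("manhattan", manhattan), ("misplaced", misplaced)):
        t0 = time.perf_counter()
        nodes = sum(astar(p, h)[1] for p in problems)
        print(name, nodes, "expanded nodes,", round(time.perf_counter() - t0, 2), "s")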
Consider a genetic algorithm that uses only selection and mutation (no crossover). How is this similar to a hill climbing algorithm?
I found this statement in an article, but I don't seem to understand why.
This statement is risky and it is hard to see why it holds. I believe that many would not necessarily (or fully) agree with it.
Case when it may be true
I believe that the author of this statement wants to say that it is possible to use only mutation and selection to obtain a hill-climbing algorithm.
Imagine that each mutation of your chromosome (X) can improve or deteriorate the value of your fitness function (Y) (imagine it is a height). We want to find the X for which Y is the biggest.
1. We put a population of chromosomes (X) into our pool.
2. We MUTATE the chromosomes (X) and look for an improvement in (Y).
3. After mutation we SELECT only the chromosomes producing the highest (Y), and repeat steps 2-3 (say, 20 times).
Because at every stage you are rejecting poor values, you will be able to get (nearly) the maximum value of Y.
I think this is what the author was trying to say.
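A minimal sketch of that mutate-then-select loop (the bit-string encoding and all names are my own choice of toy example):

    import random

    def mutate(x, rate=0.05):
        # Flip each bit independently with a small probability.
        return [b ^ (random.random() < rate) for b in x]

    def mutation_selection_ga(fitness, n_bits=20, pop_size=10, generations=20):
        pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
        for _ in range(generations):
            # Mutate every chromosome, then keep only the fittest survivors,
            # i.e. reject the poor values at every stage.
            offspring = [mutate(x) for x in pop]
            pop = sorted(pop + offspring, key=fitness, reverse=True)[:pop_size]
        return max(pop, key=fitness)

    # Toy fitness: count of 1-bits (ONE-MAX); the loop climbs toward all ones.
    best = mutation_selection_ga(fitness=sum)
    print(sum(best), best)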
Case when it may be false
When mutations affect chromosomes to a great extent, the algorithm will not converge easily to the maximum. This happens when too many genes in a chromosome are affected at each mutation.
When a chromosome after mutation does not resemble its original set of genes, you are only introducing noise. In effect it is a bit like using a random generator for (X) to find the maximum (Y): every time you mutate (X) you get something that has nothing to do with the original.
You may find the maximum value, but it has little to do with hill climbing.
Both hill climbing and genetic algorithms without crossover are local search techniques. In that sense, they are similar, but I would not say they are the same.
Hill climbing comes in different forms but all share some properties that the genetic algorithm does not have:
there is one well-defined neighbor function (which, given one solution, can enumerate all its neighbors)
unless cancelled, the algorithm continues as long as improvements are found (it does not stop after a fixed number of generations)
during the iteration, there is only one solution (not a pool of solutions)
In practice, choosing a good neighbor function can have a huge impact on the effectiveness of a hill climbing algorithm. Here, you can sometimes use additional domain knowledge.
In genetic algorithms, as far as I have seen, domain knowledge is not used for mutators. Mostly, they use simple techniques like flipping bits or adding random noise to numbers.
Hill climbing can work well as a deterministic algorithm without any randomness. Depending on your problem, that may be a critical property or not. If not, then random-restart hill climbing will often lead to better results.
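For contrast, a minimal deterministic hill climber on a bit string (illustrative names): one current solution, an explicit neighbor function, and termination as soon as no neighbor improves.

    def neighbors(x):
        # The well-defined neighbor function: all single bit-flips of x.
        for i in range(len(x)):
            yield x[:i] + [1 - x[i]] + x[i + 1:]

    def hill_climb(x, fitness):
        while True:
            best = max(neighbors(x), key=fitness)
            if fitness(best) <= fitness(x):
                return x  # no improving neighbor: a local optimum
            x = best

    # Deterministic run on ONE-MAX from the all-zero string.
    print(hill_climb([0] * 8, fitness=sum))  # [1, 1, 1, 1, 1, 1, 1, 1]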
In summary, if you use a genetic algorithm without crossover, you end up with a rather bad local search algorithm. I would expect a good hill climbing algorithm to outperform it, especially in a scenario where you are under strict time constraints (real-time systems).
I'm not sure if my understanding of maximization and minimization is correct.
So let's say for some function f(x,y,z), I want to find what would give the highest value; that would be maximization, right? And if I wanted to find the lowest value, that would be minimization?
So if genetic algorithms are search algorithms trying to maximize some fitness function, would they by definition be maximization algorithms?
So let's say for some function f(x,y,z), I want to find what would give the highest value; that would be maximization, right? And if I wanted to find the lowest value, that would be minimization?
Yes, that's by definition true.
So if genetic algorithms are search algorithms trying to maximize some fitness function, would they by definition be maximization algorithms?
Pretty much yes, although I'm not sure a "maximization algorithm" is a well-used term, and only if a genetic algorithm is defined as such, which I don't believe it is strictly.
Genetic algorithms can also try to minimize the distance to some goal function value, or minimize the function value itself, but then again, this can just be rephrased as maximization without loss of generality.
Perhaps more significantly, there isn't a strict need to even have a function - the candidates just need to be comparable. If they have a total order, it's again possible to rephrase it as a maximization problem. If they don't have a total order, it might be a bit more difficult to get candidates objectively better than all the others, although nothing's stopping you from running the GA on this type of data.
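As a sketch of that last point (my own toy example): tournament selection only needs a "better than" comparison between two candidates, never a numeric fitness value.

    import random

    def tournament_select(population, better, k=2):
        # Pick k candidates at random and keep the one the comparator prefers;
        # no fitness function is ever evaluated, only pairwise comparisons.
        contestants = random.sample(population, k)
        winner = contestants[0]
        for c in contestants[1:]:
            if better(c, winner):
                winner = c
        return winner

    # Example comparator: prefer shorter strings (a total order on candidates).
    pop = ["aaaa", "bb", "c", "ddd"]
    print(tournament_select(pop, better=lambda a, b: len(a) < len(b)))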
In conclusion - trying to maximize a function is the norm (and possibly in line with how you'll mostly see it defined), but don't be surprised if you come across a GA that doesn't do this.
Are all genetic algorithms maximization algorithms?
No they aren't.
Genetic algorithms are popular approaches to multi-objective optimization (e.g. NSGA-II and SPEA-2 are very well-known genetic-algorithm-based approaches).
For multi-objective optimization you aren't trying to maximize a function.
This is because scalarizing multi-objective optimization problems is seldom viable (i.e. there isn't a single solution that simultaneously optimizes each objective), and what you are looking for is a set of nondominated solutions (or a representative subset of the Pareto-optimal solutions).
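A small sketch of what "nondominated" means in code (assuming every objective is minimized; names are illustrative):

    def dominates(a, b):
        # a dominates b if a is no worse in every objective and strictly
        # better in at least one (all objectives minimized here).
        return all(x <= y for x, y in zip(a, b)) and \
               any(x < y for x, y in zip(a, b))

    def nondominated(points):
        return [p for p in points if not any(dominates(q, p) for q in points)]

    # Two objectives to minimize: no single point is best on both at once.
    print(nondominated([(1, 5), (2, 2), (5, 1), (4, 4)]))
    # -> [(1, 5), (2, 2), (5, 1)]; (4, 4) is dominated by (2, 2)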
There are also approaches to evolutionary algorithms which try to capture the open-endedness of natural evolution by searching for behavioral novelty. Even in an objective-based problem, such novelty search ignores the objective (see Abandoning Objectives: Evolution through the Search for Novelty Alone by Joel Lehman and Kenneth O. Stanley for details).
As you know, choosing a genetic representation is part of building any genetic algorithm (GA). A mapping can hence be defined between the genotype space (problem solving space) and the phenotype space (original problem context). The fitness function, let's call it f, can be this mapping, in the case where assessing individuals of the GA is identical to evaluating the objective function of the original problem:
f: Genotype Space ---------> Phenotype Space
For each genotype there is one corresponding phenotype, so f is injective. A good GA representation encodes all phenotypes into genotypes, so f is bijective. My question: is it possible to go further and assess the quality of genetic representations by just examining some analytical properties of the fitness function? Thank you.
There is not, as of yet, any set of general guidelines for assessing the quality of a fitness function.
For someone starting out on a genetic algorithm problem, the fitness function is first formulated as a heuristic which suits one's own understanding. Development of "better" measures of fitness is done progressively, with the researcher refining the fitness function as new metrics come to light.
As the Wikipedia article on fitness functions states:
Definition of the fitness function is not straightforward in many cases and often is performed iteratively if the fittest solutions produced by GA are not what is desired. In some cases, it is very hard or impossible to come up even with a guess of what fitness function definition might be.
Evaluation of the suitability of fitness function, however, is an active area of research. There has been directed research in the past towards this end, though no promising results have arisen.