Genetic Algorithm - better crossover/mutation algorithm? [closed] - algorithm

For a basic Genetic Algorithm implementation with a random crossover boundary and a random number of mutations at random bit positions, many inferior children are created, leaving the optimum solution to be discovered by chance. This wastes a lot of CPU time, and the user never knows when the optimum solution has been found, because it could always be "the next one".
Is there an algorithm to consistently get better children rather than leave this important process to chance?
Thank you.

As others have said, the quality of offspring depends on many factors and can often require experimentation, using known solutions, to get right.
However, one of the biggest factors in determining the quality of the children is the selection of the parent chromosomes. Since stronger parents are more likely to create strong children, the type of selection plays a big part.
The best type of selection (the more common types are rank-based, roulette-wheel and tournament selection), like most things related to Genetic Algorithms, depends largely on the problem and can often require experimentation to get right.
As to whether there is a better crossover/mutation algorithm for the basic Genetic Algorithm: not really. You can experiment with different kinds of crossover (one-point, two-point, n-point) and mutation (swap or replace), and the parameters of each can also be tuned. There are also plenty of things you can change or add to a Genetic Algorithm to improve efficiency (culling, duplicate removal, allowing the best chromosome into the next generation), but then your Genetic Algorithm would no longer be a basic one. Adding these features also means you may have to do a lot more experimentation to get the features used, and their parameters, right.
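To make those operators concrete, here is a minimal sketch (in Python) of one-point crossover and per-bit mutation on bit-string chromosomes; the list representation and the mutation rate are illustrative assumptions, not anything prescribed in the question:

import random

def one_point_crossover(parent_a, parent_b):
    # Cut both parents at one random boundary and swap the tails.
    point = random.randint(1, len(parent_a) - 1)
    return (parent_a[:point] + parent_b[point:],
            parent_b[:point] + parent_a[point:])

def bit_flip_mutation(chromosome, rate=0.01):
    # Flip each bit independently with a small probability.
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]

a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [1, 0, 1, 0, 1, 0, 1, 0]
child_1, child_2 = one_point_crossover(a, b)
child_1 = bit_flip_mutation(child_1)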

As Michalewicz states in his book How to Solve It: Modern Heuristics, there is no such thing as an off-the-shelf genetic algorithm. So the answer to your question is basically what @OnABauer stated.
I would only like to complete his answer with a suggestion that you look into memetic algorithms (there is an interesting introduction here). If you add a local optimization operator, chances are that the offspring will be improved (just beware of getting trapped in local optima).
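To illustrate the memetic idea, here is a minimal sketch of a local optimization operator that could be applied to each offspring: greedy single-bit hill climbing. The fitness function is assumed to be supplied by you; nothing here comes from the linked introduction:

def local_improve(chromosome, fitness, max_passes=2):
    # Greedy hill climbing: keep any single-bit flip that raises fitness.
    best = list(chromosome)
    best_fitness = fitness(best)
    for _ in range(max_passes):
        improved = False
        for i in range(len(best)):
            best[i] ^= 1                # try flipping bit i
            f = fitness(best)
            if f > best_fitness:
                best_fitness, improved = f, True
            else:
                best[i] ^= 1            # revert the flip
        if not improved:
            break                       # a local optimum has been reached
    return best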

For optimization problems like the traveling salesperson problem, you can encode the solution so that every possible crossover forms a valid solution.
For example, instead of treating the genome as a list of cities (and thereby making every genome that misses a city or revisits a city invalid), you can treat the genome as a list of transformations applied to a list of cities, starting from some (arbitrary) canonical list of cities.
Suppose we have a list of cities:
Azusa
Boca Raton
Cincinnati
Denver
If you treat each pair of bits as an encoding of one of the cities, then only a small number of bit patterns encode a valid tour, and mutating or crossing valid tours has a very small probability of producing another valid tour.
If you instead treat every four bits as a swap instruction, then any list of bits is valid. To determine the tour it encodes, you start with an "official" ordering of the cities and apply the list of swaps in order. You end up with a valid tour, even if some of the swaps are no-ops.
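A sketch of that decoding step, assuming each gene has already been decoded from its bits into a pair of positions to swap (city names taken from the list above):

def decode_tour(canonical, swap_genes):
    # Apply a list of (i, j) swap instructions to the canonical ordering.
    # Every genome yields a valid tour; swapping a position with itself
    # is simply a no-op.
    tour = list(canonical)
    for i, j in swap_genes:
        tour[i], tour[j] = tour[j], tour[i]
    return tour

cities = ["Azusa", "Boca Raton", "Cincinnati", "Denver"]
genome = [(0, 2), (1, 3), (2, 2)]    # the last gene is a no-op
print(decode_tour(cities, genome))   # ['Cincinnati', 'Denver', 'Azusa', 'Boca Raton']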
I've used this approach in a couple of optimization problems with good results.

In essence, a genetic algorithm is a type of search algorithm.
A GA is a particular kind of heuristic search:
you try to explore first the answers which you think are more likely to be the best.
In a GA, the basis for choosing to explore an answer is that it is similar to previously known good answers (the parents).
A GA also, traditionally, can terminate before exploring all possible answers, which I think is the aspect that worries you the most.
If you always want to look at every possible answer, then you are considering an exhaustive search, for example a depth-first search through all possible answers.
In conclusion, GA is a heuristic search.
You choose it, if:
exhaustive search isn't fast enough.
you don't care whether the final result is the best (globally optimal).
you understand how to guess at better answers based on already-explored answers. This depends on the problem domain; it is what determines the mutation and crossover operators.

Related

How genetic algorithm is different from random selection and evaluation for fittest?

I have been learning about genetic algorithms for two months. I know about the process of initial population creation, selection, crossover, mutation, etc., but I cannot understand how we are able to get better results in each generation and how this differs from randomly searching for the best solution. Below I use an example to explain my problem.
Let's take the travelling salesman problem. Say we have several cities X1, X2, ..., X18 and we have to find the shortest path visiting them all. When we do the crossover after selecting the fittest candidates, how do we know the crossover will produce a better chromosome? The same applies to mutation.
I feel like it's just: take one arrangement of cities, calculate the total travel distance, and store the distance and arrangement. Then choose another arrangement/combination; if it is better than the previous one, save it and its distance, otherwise discard it. By doing this alone we would also arrive at some solution.
I just want to know what makes the difference between random selection and a genetic algorithm. In a genetic algorithm, is there any criterion that stops us from selecting an arrangement/combination of cities which we have already evaluated?
I am not sure if my question is clear, but I am happy to explain more. Please let me know if it isn't.
A random algorithm starts with a completely blank sheet every time. A new random solution is generated each iteration, with no memory of what happened in previous iterations.
A genetic algorithm has a history, so it does not start with a blank sheet, except at the very beginning. Each generation the best of the solution population are selected, mutated in some way, and advanced to the next generation. The least good members of the population are dropped.
Genetic algorithms build on previous success, so they are able to advance faster than random algorithms. A classic example of a very simple genetic algorithm, is the Weasel program. It finds its target far more quickly than random chance because each generation it starts with a partial solution, and over time those initial partial solutions are closer to the required solution.
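For reference, a minimal sketch of the Weasel program in Python; the population size and mutation rate are arbitrary choices, and keeping the parent in the pool (elitism) is one common variant:

import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "

def fitness(candidate):
    # Number of characters matching the target in place.
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(parent, rate=0.05):
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in parent)

parent = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
generation = 0
while parent != TARGET:
    # Each generation starts from the best string found so far, so the
    # partial solution carries over; this is what beats blind chance.
    parent = max([parent] + [mutate(parent) for _ in range(100)], key=fitness)
    generation += 1
print(generation, parent)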
I think there are two things you are asking about: a mathematical proof that GAs work, and an empirical one that would allay your concerns.
Although I am not aware of a general proof, I am quite sure at least a good sketch of one was given by John Holland in his book Adaptation in Natural and Artificial Systems, for optimization problems using binary coding. It is known as Holland's schema theorem. But you know, these are heuristics, so technically it does not have to hold. It basically says that short schemata in the genotype which raise the average fitness appear exponentially more often over successive generations; crossover then combines them together. I think the proof was given only for binary coding and attracted some criticism as well.
Regarding your concerns: of course you have no guarantee that a crossover will produce a better result, just as two intelligent or beautiful parents might have less gifted children. The premise of the GA is that this is less likely to happen. (As I understand it) the proof for binary coding hinges on the theorem that good partial patterns start emerging, and given a genotype that is long enough, such patterns residing in different specimens have a chance to be combined into one specimen, improving its fitness in general.
I think it is fairly easy to understand in terms of the TSP: crossover helps to accumulate good sub-paths into one specimen. Of course it all depends on the choice of crossover method.
Also, a GA's path towards the solution is not purely random. It moves towards the current best solutions, with stochastic mechanisms to escape traps (you can lose the best solutions if you allow it). It works because the population of specimens, in a sense, shares knowledge: they are all similar, but provided you preserve diversity, new and better partial patterns can be introduced to the whole population and get incorporated into the best solutions. This is why diversity in the population is regarded as very important.
As a final note, remember that the GA is a very broad topic and you can modify the basic form in nearly any way you want: you can introduce elitism, tabu lists, niches, etc. There is no one-and-only approach/implementation.

Decrease and Conquer in Real world [closed]

Can anyone suggest real-world problems for these algorithms: insertion sort, breadth-first search, depth-first search, or topological sorting? Thank you.
Real-world examples of recursion
I saw samples there, but what I need are specific problems for the insertion sort, breadth-first search, depth-first search, or topological sorting algorithms.
I hope you can help me.
How more real can it get than our daily, humdrum lives?
Insertion sort is what we (or at least I) most commonly use when we need to sort things. Consider a deck of cards: one goes over them one by one, putting each new card into its place among the ones already sorted. Or a pile of paperwork that needs to be sorted by date: same algorithm.
In CS, insertion sort is less commonly used because we have much better algorithms (quicksort and merge sort come to mind). A human could use those as well, but that would be a much more tedious task indeed.
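The card procedure above maps directly to code; a standard in-place insertion sort, for reference:

def insertion_sort(cards):
    # Sort in place, like ordering a hand of cards: take the next card
    # and slide it left until it sits among the already-sorted ones.
    for i in range(1, len(cards)):
        card = cards[i]
        j = i - 1
        while j >= 0 and cards[j] > card:
            cards[j + 1] = cards[j]   # shift bigger cards one slot right
            j -= 1
        cards[j + 1] = card
    return cards

print(insertion_sort([7, 2, 9, 4, 1]))   # [1, 2, 4, 7, 9]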
Breadth-first search's use is in the name: When we want to go over a tree horizontally, and not vertically. Let's say you heard your family is connected to Ilya Repin, a Russian artist. You go to the attic, crack open the giant wooden chest that's been gathering dust, and take out the old family tree which dates back to the 19th century (doesn't everyone have that?). You know he's closer to the top of the tree than he is to the bottom, so you go breadth-first: Take the first line, followed by the second, and so on...just a little more...Efim Repin...bingo!
If Ilya Repin happened to be in the leftmost branch of the tree, then depth-first would have made more sense. However, in the average case you'd want to go breadth-first, because we know our target is closer to the root than it is to the leaves. In CS there are a buttload of use cases (Cheney's algorithm, A*, etc.; you can see several more on Wikipedia).
Depth-first search is used when we... drumroll ...want to go to the depth of the tree first, travelling vertically. There are so many uses I can't even begin, but the simplest and most common is solving a cereal-box maze: you go down one path until you reach a dead end, then you backtrack. We don't do it perfectly, since we sometimes skip a path or forget which ones we took, but we still do it.
In CS there are a whole lot of use cases, so I'll redirect you to Wikipedia again.
Topological sort is something some of us run in the backs of our heads, and it's easily seen in chefs, cooks, programmers, anyone and everyone who has to do an ordered set of tasks. My grandmother made the best cannelloni I've eaten, and her very simple recipe consisted of several simple steps (which I've managed to forget, but here is their very general outline): making the "pancake" wrapper, making the sauce, and wrapping them together. Now, we can't wrap these two up without making them first, so naturally we first make the wrapper and the sauce, and only then wrap 'em.
In CS it's used for exactly the same thing: scheduling. I think Excel uses it to calculate dependent spreadsheet formulas (or maybe it just uses a simple recursive algorithm, which may be less efficient). For some more, you can see our good friend Wikipedia and a random article I found.
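The recipe example translates directly into a topological sort. A sketch using Kahn's algorithm, with made-up task names standing in for the recipe steps:

from collections import deque

def topological_sort(tasks, depends_on):
    # Kahn's algorithm: repeatedly emit a task whose prerequisites are done.
    indegree = {t: len(depends_on.get(t, [])) for t in tasks}
    followers = {t: [] for t in tasks}
    for task, prereqs in depends_on.items():
        for p in prereqs:
            followers[p].append(task)
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for f in followers[t]:
            indegree[f] -= 1
            if indegree[f] == 0:
                ready.append(f)
    return order   # shorter than tasks if and only if there is a cycle

tasks = ["make wrapper", "make sauce", "wrap together"]
deps = {"wrap together": ["make wrapper", "make sauce"]}
print(topological_sort(tasks, deps))   # wrapper and sauce first, wrapping last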
I work with hierarchical data structures, and I am always in need of BFS to find objects nested under a specific root,
e.g. find (/replace).
Sometimes (significantly less often) I use DFS to check design constraints that cannot be evaluated without investigating the leaves.
Though not used by me, and not exactly BFS,
GPS navigation software uses A* to search for a good path,
which is kind of a "weighted BFS".
Insertion sort - in CS, essentially none; it's good for learning. Outside computers it is often used to sort, for example, cards. In real-world code, merge sort or quicksort are better.
BFS - finding connected nodes, finding shortest paths (see the sketch below). The basis for Dijkstra's algorithm and A* (a faster version of Dijkstra's).
DFS - finding connected nodes, numbering nodes in a tree.
Topological sorting - finding a correct order of tasks.
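A sketch of the BFS shortest-path use mentioned above, on an unweighted graph given as an adjacency dict (the graph itself is invented for illustration):

from collections import deque

def bfs_shortest_path(graph, start, goal):
    # graph maps each node to a list of its neighbours.
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:    # walk the parent links back to start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:     # first visit = shortest route there
                parents[nxt] = node
                queue.append(nxt)
    return None                        # goal unreachable

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_shortest_path(graph, "A", "D"))   # ['A', 'B', 'D']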

What are some good resources for learning backtracking, branch-and-bound and dynamic programming algorithms? [closed]

CLRS doesn't seem to cover backtracking/branch-and-bound. I tried several resources online; though I get the ideas behind these, I am unable to write code for, let's say, the knapsack problem. So I want something that, maybe, takes a problem and solves it with these three approaches and at least gives pseudo-code.
Or any resources that you think will be helpful.
In algorithms which use backtracking, branch and bound, etc., there is typically the concept of a solution space and a search space. The goal of the algorithm is to traverse the search space to reach a point in the solution space, often a point considered optimal by some metric, or to establish that the solution space is empty (without visiting every element in the search space).
The first step is to define a mechanism to express an element of the search space in an efficient format. The representation should allow you to express which elements form the solution space, a way to evaluate the quality of an element by the chosen metric, a way to determine the neighboring elements reachable from the current state, and so on.
Typically these algorithms traverse the search space until they find a solution, or exit if no solution can exist. The traversal happens by visiting a series of points, often in parallel, to explore the search space. This is the branch aspect: you are making decisions to visit certain parts of the search space.
During the traversal, they may determine that a particular path is not worth it, and decide not to explore the part of the search space reachable from that path. This is the bounding aspect.
Very often the traversal of the space is done using partial solutions. For example, if you have a search space represented by eight bits, you might assign fixed values to two specific bits, and then search for a desirable solution in the space represented by the remaining six bits. You might discover that a particular assignment of the two bits leads to a situation where no solution can exist (or where the quality of any solution is very poor). You can then bound the search space so that the algorithm does not explore any more elements of the sub-space defined by that fixed assignment of those two bits.
For backtracking-based systems, the pseudo-code is trivial. The challenge lies in finding an efficient representation of the search space, representing partial solutions, checking the validity of a particular solution, coming up with rules to determine which path to take up front, developing metrics to measure solution quality, figuring out when to backtrack and how far to backtrack, and so on:
stateStack.push(startState)
loop {
    curState = stateStack.top
    nextState = calculateNextState(curState)
    stateStack.push(nextState)
    if (reachedFinalGoal(nextState)) {
        break
    }
    if (needToBackTrack(stateStack)) {
        curState = nextState
        stateToBackTrackTo = calculateStateToBackTrackTo(stateStack)
        // pop and undo states until we are back at the chosen ancestor
        while (curState != stateToBackTrackTo) {
            curState = rollBackToState(stateStack.pop())
        }
    }
}
These are search techniques rather than algorithms. To start with, you should clearly understand what the search space is; e.g., in the case of the knapsack problem it consists of all possible subsets of the available objects. Sometimes there are constraints that define which solutions are valid and which are not; for example, those sets of objects that exceed the total volume of the knapsack are not valid. You should also have a clearly defined objective (here, the total worth of the selected objects).
Wikipedia actually contains a pretty accurate description of branch and bound. It's rather high-level, but any more detailed description would require assumptions about the structure of the search space. For backtracking there is even some pseudo-code, but again very general.
An alternative (and probably better) approach is to find example applications of these techniques and study those. There are at least a couple of algorithms involving DP in CLRS, and you can surely google up more if you need to.
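Since the question names the knapsack problem, here is a hedged sketch of the 0/1 knapsack solved by backtracking with a simple bound: a branch is abandoned when even taking every remaining item could not beat the best value found so far. The bound is deliberately crude; real branch-and-bound solvers use tighter ones (e.g. the fractional relaxation):

def knapsack(values, weights, capacity):
    # 0/1 knapsack via backtracking with branch-and-bound pruning.
    n = len(values)
    best = 0

    def search(i, value, room):
        nonlocal best
        if i == n:
            best = max(best, value)
            return
        # Bound: even if every remaining item fit, could this branch win?
        if value + sum(values[i:]) <= best:
            return                      # prune the whole subtree
        if weights[i] <= room:
            search(i + 1, value + values[i], room - weights[i])  # take item i
        search(i + 1, value, room)                               # skip item i

    search(0, 0, capacity)
    return best

print(knapsack([60, 100, 120], [10, 20, 30], 50))   # 220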

concrete examples of heuristics

What are concrete examples of heuristics (e.g. alpha-beta pruning, with tic-tac-toe as an example, and how it is applicable there)? I already saw an answered question about what heuristics are, but I still don't get the part where estimation comes in. Can you give me a concrete example of a heuristic and how it works?
Warnsdorff's rule is a heuristic, but the A* search algorithm isn't. A*, as its name implies, is a search algorithm, which is not problem-dependent; the heuristic it uses is. An example: you can use A* (if correctly implemented) to solve the fifteen puzzle and to find the way out of a maze, but the heuristics used will be different. For the fifteen puzzle your heuristic could be the number of tiles that are out of place: the number of moves needed to solve the puzzle will always be greater than or equal to this heuristic.
To get out of the maze you could use the Manhattan distance to a point you know is outside the maze as your heuristic. Manhattan distance is widely used in game-like problems, as it is the number of "steps" horizontally and vertically needed to reach the goal:
Manhattan distance = abs(x2-x1) + abs(y2-y1)
It's easy to see that in the best case (when there are no walls) that will be the exact distance to the goal; in every other case more steps will be needed. This is important: your heuristic must be optimistic (an admissible heuristic) so that your search algorithm is optimal. It must also be consistent. However, in some applications (such as games with very big maps) you use non-admissible heuristics because a suboptimal solution suffices.
A heuristic is just an approximation of the real cost (always lower than the real cost if admissible). The better the approximation, the fewer states the search algorithm will have to explore. But better approximations usually mean more computing time, so you have to find a compromise.
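To show how such a heuristic plugs into a search algorithm, here is a sketch of A* on a small grid maze using the Manhattan distance from above; the grid, walls, start and goal are invented for illustration:

import heapq

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(walls, start, goal, width, height):
    # A* on a grid; walls is a set of blocked (x, y) cells.
    open_heap = [(manhattan(start, goal), 0, start)]
    g = {start: 0}                     # cheapest known cost to each cell
    while open_heap:
        f, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            return cost
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in walls
                    and cost + 1 < g.get(nxt, float("inf"))):
                g[nxt] = cost + 1
                # Priority f = g + h: real cost so far plus the optimistic
                # Manhattan estimate of the cost still to come.
                heapq.heappush(open_heap,
                               (cost + 1 + manhattan(nxt, goal), cost + 1, nxt))
    return None

print(a_star({(1, 0), (1, 1)}, (0, 0), (2, 0), 3, 3))   # 6 steps around the wall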
The use of heuristics is most easily seen in informed search algorithms such as A*. For realistic problems you usually have a large search space, making it infeasible to check every single part of it. To avoid this, i.e. to try the most promising parts of the search space first, you use a heuristic. A heuristic gives you an estimate of how good the available subsequent search steps are; you then choose the most promising next step, i.e. best-first. For example, if you'd like to find the path between two cities (vertices connected by a set of roads, i.e. edges, that form a graph), you might choose the straight-line distance to the goal as a heuristic to determine which city to visit first (and check whether it's the target city).
Heuristics should have properties similar to metrics on the search space, and they usually should be optimistic, but that's another story. The problem of providing a heuristic that works out to be effective and side-effect free is yet another problem...
For an application of different heuristics being used to find the path through a given maze also have a look at this answer.
Your question interests me, as I also heard about heuristics during my studies but never saw an application of them. I googled a bit and found this: http://www.predictia.es/blog/aco-search
This code simulates an "ant colony optimization" algorithm searching through a website.
The "ants" are workers that search through the site; some search randomly, others follow the "best path" determined by the previous ones.
A concrete example: I've been writing a solver for the game JT's Block, which is roughly equivalent to SameGame. The algorithm performs a breadth-first search on all possible hits, stores the values, and proceeds to the next ply. The problem is that the number of possible hits quickly grows out of control (an estimated 10^30 positions per game), so I need to prune the list of positions at each turn and keep only the "best" ones.
Now, the definition of the "best" positions is quite fuzzy: they are the positions expected to lead to the best final scores, but nothing is certain. And here come the heuristics. I've tried a few of them:
sort positions by score obtained so far
increase score by best score obtained with a x-depth search
increase score based on a complex formula using the number of tiles, their color and their proximity
improve the last heuristic by tweaking its parameters and seeing how they perform
etc...
The last of these heuristics could have led to an ant-march optimization: there are half a dozen parameters that can be tweaked from 0 to 1, and an optimizer could find the optimal combination of them. For the moment I've just improved some of them manually.
The second of these heuristics is interesting: it could lead to the optimal score through a full-depth search, but such a goal is of course impossible because it would take too much time. In general, increasing X leads to a better heuristic but increases the computing time a lot.
So here they are: some examples of heuristics. Anything can be a heuristic as long as it helps your algorithm perform better, and that's what makes them so hard to grasp: they're not deterministic. Another point about heuristics: they're supposed to give quick and dirty approximations of the real thing, so there's a trade-off between their execution time and their accuracy.
A couple of concrete examples: for solving the knight's tour problem, one can use Warnsdorff's rule - a heuristic. For solving the fifteen puzzle, one can use the A* search algorithm with a suitable heuristic.
The original question asked for concrete examples of heuristics.
Some concrete examples were already given. Another one would be the number of misplaced tiles in the 15-puzzle, or its improvement, the Manhattan distance (summing how far each tile is from its goal position).
One of the previous answers also claimed that heuristics are always problem-dependent, whereas algorithms are problem-independent. While there are, of course, also problem-dependent algorithms (for instance, for every problem you can just give an algorithm that immediately solves that very problem, e.g. the optimal strategy for any Tower of Hanoi problem is known), there are also problem-independent heuristics!
Consequently, there are different kinds of problem-independent heuristics. Thus, in a certain way, every such heuristic can be regarded as a concrete heuristic example while not being tailored to a specific problem like the 15-puzzle. (Examples of problem-independent heuristics taken from planning are the FF heuristic or the Add heuristic.)
These problem-independent heuristics are based on a general description language and then perform a problem relaxation. That is, the relaxation is based only on the syntax (and, of course, its underlying semantics) of the problem description, without "knowing" what it represents. If you are interested in this, you should get familiar with "planning" and, more specifically, with "planning as heuristic search". I also want to mention that these heuristics, while problem-independent, do depend on the problem description language, of course. (E.g., the heuristics I mentioned above are specific to "planning problems", and even within planning there are various sub-classes of problems with differing kinds of heuristics.)

How do 20 questions AI algorithms work?

Simple online games of 20 questions powered by an eerily accurate AI.
How do they guess so well?
You can think of it as the binary search algorithm.
In each iteration, we ask a question which should eliminate roughly half of the remaining word choices. If there are N words in total, then we can expect to get an answer after about log2(N) questions.
With 20 questions, we should optimally be able to find a word among 2^20 = 1,048,576, about a million, words.
One easy way to eliminate outliers (wrong answers) would be to use something like RANSAC. Instead of taking into account all the questions that have been answered, you randomly pick a smaller subset that is enough to give you a single answer. You then repeat that a few times with different random subsets of questions, until you see that most of the time you get the same result; then you know you have the right answer.
Of course this is just one way of many ways of solving this problem.
I recommend reading about the game here: http://en.wikipedia.org/wiki/Twenty_Questions
In particular the Computers section:
The game suggests that the information (as measured by Shannon's entropy statistic) required to identify an arbitrary object is about 20 bits. The game is often used as an example when teaching people about information theory. Mathematically, if each question is structured to eliminate half the objects, 20 questions will allow the questioner to distinguish between 2^20 = 1,048,576 subjects. Accordingly, the most effective strategy for Twenty Questions is to ask questions that will split the field of remaining possibilities roughly in half each time. The process is analogous to a binary search algorithm in computer science.
A decision tree supports this kind of application directly. Decision trees are commonly used in artificial intelligence.
A decision tree is a binary tree that asks "the best" question at each branch to distinguish between the collections represented by its left and right children. The best question is determined by some learning algorithm that the creators of the 20 questions application used to build the tree. Then, as other posters have pointed out, a tree 20 levels deep gives you a million distinguishable things.
A simple way to define "the best" question at each point is to look for the property that most evenly divides the collection in half. That way, a yes/no answer to that question gets rid of about half of the remaining collection at each step, approximating binary search.
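A sketch of that "most even split" rule: given yes/no properties for the remaining candidates, pick the question whose answers come closest to halving the set (the objects and properties here are invented for illustration):

def best_question(candidates, properties):
    # Pick the property that splits the candidate set most evenly.
    def imbalance(prop):
        yes = sum(1 for obj in candidates if properties[obj][prop])
        return abs(yes - (len(candidates) - yes))
    props = next(iter(properties.values())).keys()
    return min(props, key=imbalance)

properties = {
    "dog":    {"is alive": True,  "has wheels": False},
    "cat":    {"is alive": True,  "has wheels": False},
    "car":    {"is alive": False, "has wheels": True},
    "statue": {"is alive": False, "has wheels": False},
}
print(best_question(list(properties), properties))   # "is alive" splits 2/2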
Wikipedia gives a more complete example:
http://en.wikipedia.org/wiki/Decision_tree_learning
And some general background:
http://en.wikipedia.org/wiki/Decision_tree
It bills itself as "the neural net on the internet", and therein lies the key. It likely stores the question/answer probabilities in a sparse matrix. Using those probabilities, it can apply a decision-tree algorithm to deduce which question to ask next in order to best narrow down the candidates. Once it narrows the number of possible answers to a few dozen, or once it has already reached 20 questions, it starts reading off the most likely answers.
The really intriguing aspect of 20q.net is that unlike most decision tree and neural network algorithms I'm aware of, 20q supports a sparse matrix and incremental updates.
Edit: Turns out the answer's been on the net this whole time. Robin Burgener, the inventor, described his algorithm in detail in his 2005 patent filing.
It is using a learning algorithm.
k-NN is a good example of one of these.
Wikipedia: k-Nearest Neighbor Algorithm
