How does Genetic Algorithm Crossover work when my output contains only 2 states?

I'm currently working on a project where I am using a basic Cellular Automata and a Genetic Algorithm to create dungeon-like maps. Currently, I'm having an incredibly hard time understanding how exactly crossover works when my output can only be two states: DEAD or ALIVE (1 or 0).
I understand crossover conceptually - you find two fit members of the population and they exchange genetic material, hopefully producing a fitter offspring. I also understand this is usually done by performing k-point crossover on bit strings (but can also be done with real numbers).
However, even if I encode my Dead/Alive cells into bits and cross them over... What do I end up with? The cell can ONLY be Dead or Alive. Crossover will give me some random value that is outside this range, right? And even if I were to work on floating point numbers, wouldn't I just end up with a 1 or 0 anyway? In that case, it seems like it would be better to just randomly mutate Dead cells into Alive cells, or vice versa.
I've read several papers on the topic but none seem to explain this particular issue (in language I can understand, anyway). Intuitively, I thought maybe I could perform crossover on NEIGHBOURHOODS of cells - so I find 2 fit neighbourhoods, and then they exchange members (Neighbourhood A gives 4 of its neighbours to Neighbourhood B, for example). However, I have not seen this idea anywhere, which leads me to believe it must be fundamentally wrong.
Any help would be greatly appreciated, I'm really stuck on this one.

A lover of dungeon games and Genetic programming here :)
I think you have misunderstood the concept of crossover. In your Cellular Automata, you must have genetic information (chromosomes) from which your system decides whether each cell is Dead or Alive. The state of the cell is the output of your system.
You perform the crossover genetic operator between two parent chromosomes by mixing their genetic information. As a result, you obtain a new chromosome that encodes the map in a way similar to both parents. The new state of your cells is obtained by running the new chromosome to re-map your scenery.
The crossover gets you a new chromosome to map your dungeon, it does not give you new states for the cells. To get the new states, just run your new chromosome.
The state of your cells would be the phenotype, the way your chromosome is manifested. Your chromosome is the model that decides whether each cell is dead or alive. Which model you use does not matter here. For instance, suppose you are using a Neural Network with two input nodes: one input node receives the X coordinate of the cell in the grid, and the other receives the Y coordinate. The output node is a binary value: DEAD or ALIVE. This Neural Network has a certain number of hidden layers and weights, and those weights are what is encoded in your chromosome. By performing the crossover operator you create a new way of connecting the neurons, somewhere in between both parents. But in order to know the new state of each cell, you need to pass the coordinates through the Neural Network again. Maybe checking the NEAT algorithm by Stanley clarifies the crossover process: http://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf
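To make the genotype/phenotype split concrete, here is a rough NumPy sketch of that kind of indirect encoding (the network layout, sizes, and names below are made up for illustration, not taken from any particular paper): the chromosome is a flat weight vector, crossover mixes the weights, and the DEAD/ALIVE grid only appears when you re-run the child chromosome over every cell.

```python
import numpy as np

rng = np.random.default_rng(0)

GRID = 32      # hypothetical map size
HIDDEN = 8     # hidden units in the toy network

def decode(chromosome, grid=GRID, hidden=HIDDEN):
    """Phenotype: run the chromosome (a flat weight vector) as a tiny
    (x, y) -> DEAD/ALIVE network over every cell of the grid."""
    w1 = chromosome[:2 * hidden].reshape(2, hidden)
    b1 = chromosome[2 * hidden:3 * hidden]
    w2 = chromosome[3 * hidden:4 * hidden]
    b2 = chromosome[-1]
    xs, ys = np.meshgrid(np.linspace(-1, 1, grid), np.linspace(-1, 1, grid))
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)   # one (x, y) row per cell
    h = np.tanh(coords @ w1 + b1)
    score = h @ w2 + b2
    return (score > 0).astype(int).reshape(grid, grid)    # 1 = ALIVE, 0 = DEAD (assumed)

def one_point_crossover(parent_a, parent_b):
    """Genotype: children take a prefix of one parent's weights and the suffix of the other's."""
    cut = rng.integers(1, len(parent_a))
    return (np.concatenate([parent_a[:cut], parent_b[cut:]]),
            np.concatenate([parent_b[:cut], parent_a[cut:]]))

n_genes = 2 * HIDDEN + HIDDEN + HIDDEN + 1
parent_a = rng.normal(size=n_genes)
parent_b = rng.normal(size=n_genes)
child, _ = one_point_crossover(parent_a, parent_b)
child_map = decode(child)   # the new DEAD/ALIVE states come from re-running the chromosome
```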
In case you do not have any model that encodes the state of each cell in the map, your genetic information is directly the state of each cell. In this situation, you could split your parent grids into smaller ones, 10x10 tiles for instance. Then run through the map tile by tile and choose randomly whether to take the matching tile from parent1 or parent2.
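A rough sketch of that tile-wise crossover on directly encoded grids (the 50x50 map and the 10x10 tile size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def tile_crossover(parent1, parent2, tile=10):
    """Direct-encoding crossover: walk the maps in tile x tile blocks and
    copy each block wholesale from a randomly chosen parent."""
    child = parent1.copy()
    rows, cols = parent1.shape
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            if rng.random() < 0.5:
                child[r:r + tile, c:c + tile] = parent2[r:r + tile, c:c + tile]
    return child

# Example: two random 50x50 dungeons, 1 = ALIVE, 0 = DEAD
p1 = rng.integers(0, 2, size=(50, 50))
p2 = rng.integers(0, 2, size=(50, 50))
child = tile_crossover(p1, p2)
```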
I hope this helps!
Alberto

Related

Bark at a Sphere Tree or Look Somewhere Else?

The scenario: a large number of players, playing a real-time game in 3D space, must be organized in a way where a server can efficiently update other players and any other observer of a player's moves and actions. Which objects 'talk' to one another needs to be culled based on their range from one another in the simulation; this is to preserve network sanity and programmer sanity, and also to allow server-lets to handle smaller chunks of the overall world play-space.
However, if you have 3000 players, this runs into the issue that one must run 3000! calculations to find out the ranges between everything. (Google tells me that ends up as a number with over 9000 digits; that's insane and not worth considering for a near-real-time environment.)
Daybreak Games seems to have solved this problem with their massively online first-person shooter Planetside 2; it allowed 3000 players to play in a shared space with real-time responsiveness. They've apparently done it through a "Sphere Tree" data structure.
However, I'm not positive this is the solution they use, and I'm still questioning how to apply the concept of "Sphere Trees" to reduce the range calculations for culling to a reasonable amount.
If Sphere Trees are not the right tree to bark up, what else should I be directing my attention at to tackle this problem?
(I'm a c# programmer (mainly), but I'm looking for a logical answer, not a code one)
References I've found about sphere trees:
http://isg.cs.tcd.ie/spheretree/#algorithms
https://books.google.com/books?id=1-NfBElV97IC&pg=PA385&lpg=PA385#v=onepage&q&f=false
Here are a few of my thoughts:
Let n denote the total number of players.
I think your estimate of 3000! is wrong. If you want to calculate all pairwise distances given fixed player positions, you run 3000 choose 2 operations, on the order of O(n^2 * t), where t is the number of operations you spend calculating the distance between two players. If you build the graph underlying the players with edge weights being the Euclidean distance, you can reduce this to the all-pairs shortest paths problem, which is doable via the Floyd-Warshall algorithm in O(n^3).
What you're describing sounds pretty similar to doing a range query: https://en.wikipedia.org/wiki/Range_searching. There are a lot of data structures that can help you, such as range trees and k-d trees.
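For instance, a minimal sketch using SciPy's cKDTree (assuming SciPy is available; the positions and the 100 m radius are made-up numbers):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
positions = rng.uniform(0, 10_000, size=(3000, 3))       # 3000 players in a 10 km cube

tree = cKDTree(positions)
pairs = tree.query_pairs(r=100.0)                         # all pairs closer than 100 m
near_me = tree.query_ball_point(positions[0], r=100.0)    # everyone within 100 m of player 0
```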
If objects only need to interact with objects that are, e.g., <= 100m away, then you can divide the world up into 100m x 100m tiles (or voxels), and keep track of which objects are in each non-empty tile.
Then, when one object needs to 'talk', you only need to check the objects in at most 9 tiles to see if they are close enough to hear it.
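A rough sketch of that tile/bucket idea (the 100 m cell size and the player data are arbitrary; candidates come from at most 9 tiles, followed by an exact distance check):

```python
from collections import defaultdict
from typing import Dict, List, Tuple

CELL = 100.0  # tile size = interaction radius, a hypothetical choice

def cell_of(x: float, y: float) -> Tuple[int, int]:
    """Map a world position to its tile coordinates."""
    return int(x // CELL), int(y // CELL)

def build_grid(players: Dict[str, Tuple[float, float]]):
    """Bucket every player id by the tile it currently occupies."""
    grid: Dict[Tuple[int, int], List[str]] = defaultdict(list)
    for pid, (x, y) in players.items():
        grid[cell_of(x, y)].append(pid)
    return grid

def nearby(pid, players, grid, radius=CELL):
    """Candidates from the 9 surrounding tiles, then an exact distance check."""
    x, y = players[pid]
    cx, cy = cell_of(x, y)
    out = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for other in grid.get((cx + dx, cy + dy), []):
                if other == pid:
                    continue
                ox, oy = players[other]
                if (ox - x) ** 2 + (oy - y) ** 2 <= radius ** 2:
                    out.append(other)
    return out

players = {"a": (10.0, 10.0), "b": (90.0, 20.0), "c": (500.0, 500.0)}
grid = build_grid(players)
print(nearby("a", players, grid))   # ['b'] -- 'c' is never even distance-checked
```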

How to get value of child node in minmax algorithm?

I am working on the minimax algorithm and I want to do alpha-beta pruning.
I read one example that uses this tree.
I didn't understand how to get the value of the child nodes marked in red.
Can someone please help me understand where the values 3, 5, 10, 2 come from and what the logic behind them is?
They don't come from anywhere; usually, you estimate those values.
For searching a tree with a huge number of possible states (e.g. a chess game), this technique, commonly known as a heuristic function, is a must. A heuristic function usually takes a single parameter, a state, i.e. one of those child nodes (an array of size 9 for a tic-tac-toe game, for example), and tries to predict how favourable this state is for a certain player. So, if the function is written from, say, White's point of view in chess, +10 might mean White is likely to win, while -7 might mean the game is in Black's favour. A state where White is guaranteed to win should return +infinity.
Naturally, questions like "how favorable" can't have a science-y, absolute answer. So you usually apply your intuition, domain expertise, common sense etc to write this function.
When the number of states isn't huge, as in tic-tac-toe for example, where you don't have to stop the search at a certain depth, you can simply use +1, 0, -1 to denote a win, draw, and loss respectively.
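To make that concrete, here is a small depth-limited minimax sketch. The toy tree at the bottom is only a guess at how the values 3, 5, 10, 2 from your figure might be arranged, since the actual diagram isn't shown here:

```python
def minimax(state, depth, maximizing, children, evaluate, is_terminal):
    """Depth-limited minimax: the red leaf values in such diagrams are simply
    evaluate(state) called at the depth cutoff (or at terminal states)."""
    if depth == 0 or is_terminal(state):
        return evaluate(state)          # heuristic guess: big positive = good for MAX
    if maximizing:
        return max(minimax(c, depth - 1, False, children, evaluate, is_terminal)
                   for c in children(state))
    return min(minimax(c, depth - 1, True, children, evaluate, is_terminal)
               for c in children(state))

# Toy tree whose leaves are the hand-picked numbers 3, 5, 10, 2:
tree = {"root": ["L", "R"], "L": [3, 5], "R": [10, 2]}
children = lambda s: tree[s]
is_terminal = lambda s: not isinstance(s, str)
evaluate = lambda s: s if not isinstance(s, str) else 0
print(minimax("root", 2, True, children, evaluate, is_terminal))  # max(min(3,5), min(10,2)) = 3
```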

How do I guarantee that a cellular automata generated maze is solvable/interesting?

I am writing a maze generation algorithm, and this wikipedia article caught my eye. I decided to implement it in Java, which was a cinch. The problem I am having is that while a maze-like picture is generated, the maze is often not solvable and not very interesting. By interesting I mean few or no unreachable places and ideally a single solution; instead I get a vast number of unreachable places and often many solutions.
I implemented the 1234/3 rule (although it is easily changeable, see comments for an explanation) with a roughly 50/50 distribution at the start. The mazes always reach an equilibrium where there is no change between t-steps.
My question is, is there a way to guarantee the maze's solvability from a fixed start and end point? Also, is there a way to make the maze more interesting to solve (fewer/one solution and few/no unreachable places)? If this is not possible with cellular automata, please tell me. Thank you.
I don't think it's possible to ensure a solvable, interesting maze through simple cellular automata, unless there are specific criteria that can be placed on the starting state. Cells have no knowledge of the overall shape, because each cell can't coordinate with the grid as a whole.
If you're insistent on using them, you could do some combination of modification and pathfinding after generation is finished, but other methods (like the ones shown in the Wikipedia article or this question) are simpler to implement and won't result in walls that take up a whole cell (unless you want that).
The root of the problem is that "maze quality" is a global measure, but your automaton cells are restricted to a very local knowledge of the system.
To resolve this, you have three options:
Add the global information from outside. Generate mazes using the automaton and random initial data, then measure the maze quality (e.g. using flood fill or a bunch of other maze-solving techniques) and repeat until you get a result you like.
Use a much more complex set of explicit rules and state. You can work out a set of rules / cell values that encode both the presence of walls and the lengths / quality of paths. For example, -1 would be a wall and a positive value would be the sum of all neighbours above and to the left. Then positive values encode the path distance from the top left, roughly. That's not enough, but it shows the general idea... you need to encode an algorithm about the maze "directly" in the rules of the system.
Use a less complex, but still Turing-complete, set of rules, and encode the rules for maze generation in the initial state. For example, you could use Conway's Life and construct an initial state that is a "program" that implements maze generation via gliders etc.
If it helps any, you could draw a parallel between the above and:
ghost in the machine / external user
FPGA
programming a general-purpose CPU
Run a pathfinding algorithm over it. Dijkstra would give you a sure way to find the shortest solution (and to know whether any solution exists at all). A* would give you one good solution.
The difficulty of a maze can be measured by the speed at which these algorithms solve it.
You can add some dead-ends in order to shut down some solutions.
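As a rough sketch of the generate-and-test / pathfinding idea from the answers above, a plain flood fill (breadth-first search) is enough to check solvability; run_cellular_automaton and random_seed in the commented loop are placeholders for your own generator:

```python
from collections import deque

def solvable(grid, start, goal):
    """Breadth-first flood fill from start; True if goal is reachable.
    Convention assumed here: grid[r][c] == 0 is open floor, 1 is wall."""
    rows, cols = len(grid), len(grid[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False

# Generate-and-test: rerun the automaton until the result passes the check.
# (run_cellular_automaton and random_seed are placeholders for your own code.)
# while True:
#     grid = run_cellular_automaton(random_seed())
#     if solvable(grid, (0, 0), (len(grid) - 1, len(grid[0]) - 1)):
#         break
```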

Question about Backpropagation Algorithm with Artificial Neural Networks -- Order of updating

Hey everyone, I've been trying to get an ANN I coded to work with the backpropagation algorithm. I have read several papers on them, but I'm noticing a few discrepancies.
Here seems to be the super general format of the algorithm:
Give input
Get output
Calculate error
Calculate change in weights
Repeat steps 3 and 4 until we reach the input level
But here's the problem: The weights need to be updated at some point, obviously. However, because we're back propagating, we need to use the weights of previous layers (ones closer to the output layer, I mean) when calculating the error for layers closer to the input layer. But we already calculated the weight changes for the layers closer to the output layer! So, when we use these weights to calculate the error for layers closer to the input, do we use their old values, or their "updated values"?
In other words, if we were to put the step of updating the weights in my super general algorithm, would it be:
(Updating the weights immediately)
Give input
Get output
Calculate error
Calculate change in weights
Update these weights
Repeat steps 3,4,5 until we reach the input level
OR
(Using the "old" values of the weights)
Give input
Get output
Calculate error
Calculate change in weights
Store these changes in a matrix, but don't change these weights yet
Repeat steps 3,4,5 until we reach the input level
Update the weights all at once using our stored values
In this paper I read, in both abstract examples (the ones based on figures 3.3 and 3.4), they say to use the old values, not to immediately update the values. However, in their "worked example 3.1", they use the new values (even though what they say they're using are the old values) for calculating the error of the hidden layer.
Also, in my book "Introduction to Machine Learning by Ethem Alpaydin", though there is a lot of abstract stuff I don't yet understand, he says "Note that the change in the first-layer weight delta-w_hj, makes use of the second layer weight v_h. Therefore, we should calculate the changes in both layers and update the first-layer weights, making use of the old value of the second-layer weights, then update the second-layer weights."
To be honest, it really seems like they just made a mistake and all the weights are updated simultaneously at the end, but I want to be sure. My ANN is giving me strange results, and I want to be positive that this isn't the cause.
Anyone know?
Thanks!
As far as I know, you should update weights immediately. The purpose of back-propagation is to find weights that minimize the error of the ANN, and it does so by doing a gradient descent. I think the algorithm description in the Wikipedia page is quite good. You may also double-check its implementation in the joone engine.
You are usually backpropagating deltas, not errors. These deltas are calculated from the errors, but they do not mean the same thing. Once you have the deltas for layer n (counting from input to output), you use these deltas and the weights of layer n to calculate the deltas for layer n-1 (one closer to the input). The deltas only have a meaning for the old state of the network, not for the new state, so you should always use the old weights for propagating the deltas back towards the input.
Deltas mean in a sense how much each part of the NN has contributed to the error before, not how much it will contribute to the error in the next step (because you do not know the actual error yet).
As with most machine-learning techniques, it will probably still work if you use the updated weights, but it might converge more slowly.
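A tiny NumPy sketch of that ordering on a made-up 2-3-1 network (trained on XOR just to have data): all deltas are computed first, the hidden-layer delta uses the old W2, and only then is every layer updated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-3-1 network, sigmoid activations throughout, XOR as toy data.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 3)); b1 = np.zeros(3)
W2 = rng.normal(scale=0.5, size=(3, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(10000):
    # Forward pass
    H = sigmoid(X @ W1 + b1)          # hidden activations
    Y = sigmoid(H @ W2 + b2)          # outputs

    # Backward pass: deltas are computed layer by layer, and the hidden-layer
    # delta uses the CURRENT (old) value of W2 -- no weight has been touched yet.
    delta_out = (Y - T) * Y * (1 - Y)             # output-layer delta
    delta_hid = (delta_out @ W2.T) * H * (1 - H)  # uses the old W2

    # Only after all deltas exist are both layers updated.
    W2 -= lr * H.T @ delta_out; b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * X.T @ delta_hid; b1 -= lr * delta_hid.sum(axis=0)

print(Y.round(2))   # outputs after training
```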
If you simply train it on a single input-output pair, my intuition would be to update the weights immediately, because the gradient is not constant. But I don't think your book is talking about only a single input-output pair. Usually you come up with an ANN because you have many input-output samples from a function you would like to model with the ANN. Thus your loops should repeat from step 1 instead of from step 3.
If we label your two methods as new->online and old->offline, then we have two algorithms.
The online algorithm is good when you don't know how many sample input-output relations you are going to see, and you don't mind some randomness in the way the weights update.
The offline algorithm is good if you want to fit a particular set of data optimally. To avoid overfitting the samples in your data set, you can split it into a training set and a test set. You use the training set to update the weights, and the test set to measure how good a fit you have. When the error on the test set begins to increase, you are done.
Which algorithm is best depends on the purpose of using an ANN. Since you talk about training until you "reach input level", I assume you train until the output exactly matches the target value in the data set. In this case the offline algorithm is what you need. If you were building a backgammon-playing program, the online algorithm would be better because you have an unlimited data set.
In this book, the author talks about how the whole point of the backpropagation algorithm is that it allows you to efficiently compute the gradients for all the weights in one go. In other words, using the "old values" is efficient. Using the new values is more computationally expensive, so that's why people use the "old values" to update the weights.

Writing Simulated Annealing algorithm for 0-1 knapsack in C#

I'm in the process of learning about simulated annealing algorithms and have a few questions on how I would modify an example algorithm to solve a 0-1 knapsack problem.
I found this great code on CP:
http://www.codeproject.com/KB/recipes/simulatedAnnealingTSP.aspx
I'm pretty sure I understand how it all works now (except the whole Boltzmann condition, which as far as I'm concerned is black magic, though I understand it's about escaping local optima and apparently this does exactly that). I'd like to re-design this to solve a 0-1 knapsack-"ish" problem. Basically I'm putting one of 5,000 objects in 10 sacks and need to optimize for the least unused space. The actual "score" I assign to a solution is a bit more complex, but not related to the algorithm.
This seems easy enough. This means the Anneal() function would be basically the same. I'd have to implement the GetNextArrangement() function to fit my needs. In the TSP, he just swaps two random nodes along the path (i.e., he makes a very small change each iteration).
For my problem, on the first iteration, I'd pick 10 random objects and look at the leftover space. For the next iteration, would I just pick 10 new random objects? Or am I best only swapping out a few of the objects, like half of them or only even one of them? Or maybe the number of objects I swap out should be relative to the temperature? Any of these seem doable to me, I'm just wondering if someone has some advice on the best approach (though I can mess around with improvements once I have the code working).
Thanks!
Mike
With simulated annealing, you want to make neighbour states as close in energy as possible. If the neighbours have significantly greater energy, then it will just never jump to them without a very high temperature -- high enough that it will never make progress. On the other hand, if you can come up with heuristics that exploit lower-energy states, then exploit them.
For the TSP, this means swapping adjacent cities. For your problem, I'd suggest a conditional neighbour selection algorithm as follows:
If there are objects that fit in the empty space, then it always puts the biggest one in.
If no objects fit in the empty space, then pick an object to swap out -- but prefer to swap objects of similar sizes.
That is, pairs are chosen with a probability inversely proportional to the difference in their sizes. You might want to use something like roulette selection here, with the slice size being something like (1 / (size1 - size2)^2).
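Roughly, for a single sack, something like the sketch below (the function names are mine, and the +1 in the roulette weight is only there to avoid dividing by zero when two sizes are equal):

```python
import random

def neighbour(packed, unpacked, sizes, capacity):
    """One annealing move for a single sack, following the two rules above."""
    packed, unpacked = set(packed), set(unpacked)
    used = sum(sizes[i] for i in packed)
    free = capacity - used

    fitting = [i for i in unpacked if sizes[i] <= free]
    if fitting:
        best = max(fitting, key=lambda i: sizes[i])    # biggest object that still fits
        packed.add(best)
        unpacked.remove(best)
        return packed, unpacked

    # Nothing fits: roulette-select a swap, favouring pairs of similar size.
    pairs = [(p, u) for p in packed for u in unpacked
             if used - sizes[p] + sizes[u] <= capacity]
    if not pairs:
        return packed, unpacked                        # no legal move this step
    weights = [1.0 / (1.0 + (sizes[p] - sizes[u]) ** 2) for p, u in pairs]  # +1 avoids div by zero
    p, u = random.choices(pairs, weights=weights, k=1)[0]
    packed.remove(p); unpacked.remove(u)
    packed.add(u); unpacked.add(p)
    return packed, unpacked
```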
Ah, I think I found my answer on Wikipedia. It suggests moving to a "neighbor" state, which usually implies changing as little as possible (like swapping two cities in a TSP).
From: http://en.wikipedia.org/wiki/Simulated_annealing
"The neighbours of a state are new states of the problem that are produced after altering the given state in some particular way. For example, in the traveling salesman problem, each state is typically defined as a particular permutation of the cities to be visited. The neighbours of some particular permutation are the permutations that are produced for example by interchanging a pair of adjacent cities. The action taken to alter the solution in order to find neighbouring solutions is called "move" and different "moves" give different neighbours. These moves usually result in minimal alterations of the solution, as the previous example depicts, in order to help an algorithm to optimize the solution to the maximum extent and also to retain the already optimum parts of the solution and affect only the suboptimum parts. In the previous example, the parts of the solution are the parts of the tour."
So I believe my GetNextArrangement function would want to swap out a random item with an item unused in the set, as in the sketch below.
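Something like this rough sketch, together with the acceptance rule that is the "Boltzmann condition" mentioned above (names are illustrative, not taken from the CodeProject code):

```python
import math
import random

def get_next_arrangement(packed, unpacked):
    """Minimal neighbour move: swap one random packed item with one unused item."""
    packed, unpacked = list(packed), list(unpacked)
    i = random.randrange(len(packed))
    j = random.randrange(len(unpacked))
    packed[i], unpacked[j] = unpacked[j], packed[i]
    return packed, unpacked

def accept(old_cost, new_cost, temperature):
    """The 'Boltzmann condition': always accept improvements, and accept a worse
    solution with probability exp(-(increase in cost) / temperature)."""
    if new_cost <= old_cost:
        return True
    return random.random() < math.exp((old_cost - new_cost) / temperature)
```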
