I am working on a variant of TSP where each node also has a 'Time Window' which has to be respected. Since I am using a Genetic Algorithm to solve TSPTW, I was wondering what might be some crossover techniques that might work well for TSPTW.
Thanks.
This is just idle speculation, but the standard TSP tends to benefit from operators that attempt to preserve the adjacency of nodes in the parent tours. So what's considered important is not that city X appears at a particular position, but instead that it appears next to cities Y and Z, wherever in the string they appear. There are operators like Edge Assembly Crossover (EAX) that have been designed specifically to attempt to exploit this structure.
In your case, the time windows presumably mean that, unlike TSP, tours like 01234567 are different from tours like 56701234, and thus the absolute position of a city in the tour matters as well. In cases like QAP where absolute position is important, people tend to do things like Cycle Crossover (CX).
If I was committed to a GA for this problem, I might start by doing something obvious like implementing both CX and EAX and choosing between them at random. Alternately, you might attempt to design a single hybrid operator that combined elements of both, but that is probably rather non-trivial to do.
I suspect, however, that a GA might not be the way to go here, or at the very least, a black box GA might suffer. An operator that attempted to use semantic information from the problem instance to, for example, tend to group cities near one another if they had similar time windows might be effective. And my $5 says that a good local search algorithm (tabu search, variable neighborhood search, etc.) might beat the GA in any case, although that's based on nothing more than a hunch.
Related
I understand that each particle is a solution to a specific function, and each particle and the swarm is constantly searching for the best solution. If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution? If this is the case what makes PSO better than just iterating through a list.
Your terminology is a bit off. Simple PSO is a search for a vector x that minimizes some scalar objective function E(x). It does this by creating many candidate vectors. Call them x_i. These are the "particles". They are initialized randomly in both position and rate of change, also called velocity, which is consistent with the idea of a moving particle, even though that particle may have many more than 3 dimensions.
Simple rules describe how the position and velocity change over time. The rules are chosen so that each particle x_i tends randomly to move in directions that reduce E(x_i).
The rules usually involve tracking the "single best x_i value seen so far" and are tuned so that all particles tend to head generally toward that best value with random variations. So the particles swarm like buzzing bees, heading as a group toward a common goal, but with many deviations by individual bees that, over time, cause the common goal to change.
It's unfortunate that some of the literature calls this goal or best particle value seen so far "the global minimum." In optimization, global minimum has a different meaning. A global minimum (there can be more than one when there are "ties" for best) is a value of x that - out of the entire domain of possible x values - produces the unique minimum possible value of E(x).
In no way is PSO guaranteed to find a global minimum. In fact, your question is a bit nonsensical in that one generally never knows when a global minimum has been found. How would you? In most problems you don't even know the gradient of E (which gives the direction taking E to smaller values, i.e. downhill). This is why you are using PSO in the first place. If you know the gradient, you can almost certainly use numerical techniques that will find an answer more quickly than PSO. Without a gradient, you can't even be sure you've found a local minimum, let alone a global one.
Rather, the best you can usually do is "guess" when a local minimum has been found. You do this by letting the system run while watching how often and by how much the "best particle seen so far" is being updated. When the changes become infrequent and/or small, you declare victory.
Another way of putting this is that PSO is used on problems where reducing E(x) is always good and "you'll take anything you can get" regardless of whether you have any confidence that what you got is the best possible. E.g. you're Walmart and any way of locating your stores that saves/makes more dollars is interesting.
With all this as background, let's recap your specific questions:
If the global best is found after the first iteration, and no new particles are being added to the mix, shouldn't the loop just quit and the first global best found be the most fitting solution?
There's no answer because there's no way to determine a global best has been found. The swarm of buzzing particles might find a new best in the next iteration or ten trillion iterations from now. You seldom know.
If this is the case what makes PSO better than just iterating through a list?
I don't exactly grok what you mean by this. The PSO is emulating the way swarms of biological entities like bugs and herd animals behave. In this manner it resembles genetic algorithms, simulated annealing, neural networks, and other families of solution finders that use the following logic: Nature, both physical and biological, has known-good optimization processes. Let's take advantage of them and do our best to emulate them in software. We are using nature to do better than any simple iteration we might devise ourselves.
Given a function, a particle swarm attempts to find the solution (a vector) that will minimize (or sometimes maximize, depending on the problem) the value to that function.
If you happen to know the minimum of the solution (suppose for argument sake, it is 0) AND
if you are lucky enough to generate the solution that gives you 0 on the first step, then you can exit the loop and stop the algorithm.
That said; the probability of you randomly generating that solution on initialization is infinitely small.
In most practical terms, when you would want to use a PSO to solve, it is most likely that you will not know the minimum value, so you wont be able to use that as a stopping condition.
The particle swarm optimization, the optimization process is not in the way the random initial step occurs, but rather the modification that occurs by adapting the initial solution with the velocity determined by social and cognitive component.
The social component consists of the current evaluated global best solution of the swarm
The cognitive component consists of a the best location seen by the current solution.
This adjustment will move the particle along a line between the global best and the current best - in hope there is a better solution between them.
I hope that answers the question in some way
Just to add some piece in answering, your problem seems to be linked to the common issue of "when should I stop my PSO?" A question everyone is faced when launching a swarm since (as clearly explained above) you never know if you reached the global best solution (except in very specific objective functions).
Usual tricks already present in most PSO implementation:
1- just limit a number of iterations since there is always a limit in processing time (and you could implement different ways to convert the iterations number into a time limit by self assessment of time spent to evaluate the objective).
2- stop the algorithm when the progress in optimization starts to be insignificant.
I'm having trouble writing the pathfinding routine for the AI in a simple Elite-esque game I'm writing.
The world in this game is a handful of "systems" connected by "wormholes", and a ship can jump from the system it's in to any system it's linked to once per turn. The AI is limited to only knowing things that it should know; it doesn't know what links come from a system it hasn't been to (though it can work it out from the systems it has seen, since links are two-way). Other parts of the AI decide which system the ship needs to get to based on what goods it has in its inventory and how much it remembers things being worth on systems it has passed through.
The problem is, I don't know how to approach the problem of finding a path to the target system. I can't use A*; there's no way to determine the 'distance' to another system without pathing to it. I also need this algorithm to be efficient, since it'll need to run about 100 times every time the player takes his turn.
Does anyone know of a suitable algorithm?
I ended up implementing a bidirectional, greedy version of breadth-first search, which suits the purpose well enough. To put it simply, I just had the program look through each node its starting node connected through, then each node those nodes connected to, then each node those connected to... until the destination node was found.
Normally one would build a list of appropriate paths and pick the shortest one, but I tried a different method; I had the program run two searches in parallel, one from the starting point, and one from the end point. When the 'from' search found the last node of the 'to' search, the path was considered found.
It then optimizes the path by checking if each node on the path connects to a node further up in the path, and deleting each node in between them.
Whether or not this funky algorithm is actually any better than a straight BFS remains to be seen.
When it comess to unknown environments, I usually use an evolutionary algorithm approach. It doesn't guarantee that you'll find the best solution in the small timeframe you have, but is a way to approach such a problem.
Have a look at Partially Observable Markov Decision Problems (POMDP). You should be able to express your problem with this model.
Then you can use an algorithm that solves these problems to try to find a good solution. Note that solving POMDPs is usually very expensive and you probably have to resort to approximate methods.
Easiest way to cheat this would be o go through, or at least attempt to access as many systems as possible, then implement the distance heuristic as the sum of all the systems you've been to.
Alternatively, and way cooler:
I've implemented something similar using ACO (Ant colony optimization) and worked pretty well combined with PSO(particle swarm optimization), however, the additional constraints your system is imposing means that you'll have to spend a few (at least one) sessions figuring out the environment layout, and if it's dynamic... well... though.
The good thing is that this algorithm completely bypasses the need for heuristic generation which is what you need since you are flying blind. Be advised though, that this is a bad idea if your search space (number of runs) is small. (100 may be acceptable, but 10 or 5 ... not so much).
This scales up quite nicely when dealing with large numbers of nodes (systems) and it bypasses the heuristic distance computational need for every node-to-node relationship, thereby making it more efficient.
Good luck.
I'm using C#, but in the future I might need to use it on other languages.
Many games have such puzzles. There is a group of wires (there are 2 types of wires: straight and curved.), there is a place from where a signal comes and there is a place from where the signal must leave. But the arrangement of the wires doesn't allow that to happen. You must turn some of the wires in order to create a path for the signal.
Yeah, I'm trying to find the continent America again, in order not to try finding it more than once in the future.
Somewhere in the future I will also try the same thing but this time with wires that split the signal to 2 or 3 signals.
Problem is that I can't think of an algorithm that I can imagine how to turn it into a code. I have been thinking for some time and I can't think of anything good.
So, can you help me? I will be able to understand the algorithm as "what does the program has to do", but I basically need help with understanding the algorithm as "how to write the code".
Thanks!
Take a look at some maze generation algorithms -- they do the same thing you're looking for, and as such are what you'll need to create the grid. Pick a simple 'cell-carver' from the ones I linked to; randomly rotate all of the wire-pieces, et voilĂ !
A comment on your question notes that turning an algorithm into code involves breaking it down bit by bit -- so that's what we'll do:
You'd start with a grid of potential wire locations (probably 2D, but a 3D game would be amazing).
To produce a solvable level, you could do a couple things -- produce a level, and see if it is solvable (bad), or produce a solved level, then "unsolve" it (better). As I alluded to above, producing a solved level involves algorithms very similar to maze generation -- a solved level would have many traversable "paths", just like a maze. Unsolving the level would just loop through all wire pieces, and rotate them a bit.
But what of maze generation? The resource I linked to contains some algorithms that would be perfect for your use -- a simple randomized DFS should be plenty for your needs.
You'll note that I covered the general case involving branching wires -- as it is, excluding branching means you have to do more coding, to "prune" branches intelligently -- that is, backtracking when your one true path gets stuck, say in a corner, due to the random movement probably necessary to make an interesting level.
You should also note that a solution for such a game would, if I understood correctly, require use of all given wire (I know some games like this). Otherwise, the game would probably be a lot simpler to play given the above generation tactics.
The Union-Find algorithms (that answer my problem) are explained in this PDF file: http://www.cs.princeton.edu/~rs/AlgsDS07/01UnionFind.pdf
I don't think I would have ever thought of this algorithm, let alone making it fast.
I'm looking for algorithms to find a "best" set of parameter values. The function in question has a lot of local minima and changes very quickly. To make matters even worse, testing a set of parameters is very slow - on the order of 1 minute - and I can't compute the gradient directly.
Are there any well-known algorithms for this kind of optimization?
I've had moderate success with just trying random values. I'm wondering if I can improve the performance by making the random parameter chooser have a lower chance of picking parameters close to ones that had produced bad results in the past. Is there a name for this approach so that I can search for specific advice?
More info:
Parameters are continuous
There are on the order of 5-10 parameters. Certainly not more than 10.
How many parameters are there -- eg, how many dimensions in the search space? Are they continuous or discrete - eg, real numbers, or integers, or just a few possible values?
Approaches that I've seen used for these kind of problems have a similar overall structure - take a large number of sample points, and adjust them all towards regions that have "good" answers somehow. Since you have a lot of points, their relative differences serve as a makeshift gradient.
Simulated
Annealing: The classic approach. Take a bunch of points, probabalistically move some to a neighbouring point chosen at at random depending on how much better it is.
Particle
Swarm Optimization: Take a "swarm" of particles with velocities in the search space, probabalistically randomly move a particle; if it's an improvement, let the whole swarm know.
Genetic Algorithms: This is a little different. Rather than using the neighbours information like above, you take the best results each time and "cross-breed" them hoping to get the best characteristics of each.
The wikipedia links have pseudocode for the first two; GA methods have so much variety that it's hard to list just one algorithm, but you can follow links from there. Note that there are implementations for all of the above out there that you can use or take as a starting point.
Note that all of these -- and really any approach to this large-dimensional search algorithm - are heuristics, which mean they have parameters which have to be tuned to your particular problem. Which can be tedious.
By the way, the fact that the function evaluation is so expensive can be made to work for you a bit; since all the above methods involve lots of independant function evaluations, that piece of the algorithm can be trivially parallelized with OpenMP or something similar to make use of as many cores as you have on your machine.
Your situation seems to be similar to that of the poster of Software to Tune/Calibrate Properties for Heuristic Algorithms, and I would give you the same advice I gave there: consider a Metropolis-Hastings like approach with multiple walkers and a simulated annealing of the step sizes.
The difficulty in using a Monte Carlo methods in your case is the expensive evaluation of each candidate. How expensive, compared to the time you have at hand? If you need a good answer in a few minutes this isn't going to be fast enough. If you can leave it running over night, it'll work reasonably well.
Given a complicated search space, I'd recommend a random initial distributed. You final answer may simply be the best individual result recorded during the whole run, or the mean position of the walker with the best result.
Don't be put off that I was discussing maximizing there and you want to minimize: the figure of merit can be negated or inverted.
I've tried Simulated Annealing and Particle Swarm Optimization. (As a reminder, I couldn't use gradient descent because the gradient cannot be computed).
I've also tried an algorithm that does the following:
Pick a random point and a random direction
Evaluate the function
Keep moving along the random direction for as long as the result keeps improving, speeding up on every successful iteration.
When the result stops improving, step back and instead attempt to move into an orthogonal direction by the same distance.
This "orthogonal direction" was generated by creating a random orthogonal matrix (adapted this code) with the necessary number of dimensions.
If moving in the orthogonal direction improved the result, the algorithm just continued with that direction. If none of the directions improved the result, the jump distance was halved and a new set of orthogonal directions would be attempted. Eventually the algorithm concluded it must be in a local minimum, remembered it and restarted the whole lot at a new random point.
This approach performed considerably better than Simulated Annealing and Particle Swarm: it required fewer evaluations of the (very slow) function to achieve a result of the same quality.
Of course my implementations of S.A. and P.S.O. could well be flawed - these are tricky algorithms with a lot of room for tweaking parameters. But I just thought I'd mention what ended up working best for me.
I can't really help you with finding an algorithm for your specific problem.
However in regards to the random choosing of parameters I think what you are looking for are genetic algorithms. Genetic algorithms are generally based on choosing some random input, selecting those, which are the best fit (so far) for the problem, and randomly mutating/combining them to generate a next generation for which again the best are selected.
If the function is more or less continous (that is small mutations of good inputs generally won't generate bad inputs (small being a somewhat generic)), this would work reasonably well for your problem.
There is no generalized way to answer your question. There are lots of books/papers on the subject matter, but you'll have to choose your path according to your needs, which are not clearly spoken here.
Some things to know, however - 1min/test is way too much for any algorithm to handle. I guess that in your case, you must really do one of the following:
get 100 computers to cut your parameter testing time to some reasonable time
really try to work out your parameters by hand and mind. There must be some redundancy and at least some sanity check so you can test your case in <1min
for possible result sets, try to figure out some 'operations' that modify it slightly instead of just randomizing it. For example, in TSP some basic operator is lambda, that swaps two nodes and thus creates new route. Your can be shifting some number up/down for some value.
then, find yourself some nice algorithm, your starting point can be somewhere here. The book is invaluable resource for anyone who starts with problem-solving.
I'm designing a realtime strategy wargame where the AI will be responsible for controlling a large number of units (possibly 1000+) on a large hexagonal map.
A unit has a number of action points which can be expended on movement, attacking enemy units or various special actions (e.g. building new units). For example, a tank with 5 action points could spend 3 on movement then 2 in firing on an enemy within range. Different units have different costs for different actions etc.
Some additional notes:
The output of the AI is a "command" to any given unit
Action points are allocated at the beginning of a time period, but may be spent at any point within the time period (this is to allow for realtime multiplayer games). Hence "do nothing and save action points for later" is a potentially valid tactic (e.g. a gun turret that cannot move waiting for an enemy to come within firing range)
The game is updating in realtime, but the AI can get a consistent snapshot of the game state at any time (thanks to the game state being one of Clojure's persistent data structures)
I'm not expecting "optimal" behaviour, just something that is not obviously stupid and provides reasonable fun/challenge to play against
What can you recommend in terms of specific algorithms/approaches that would allow for the right balance between efficiency and reasonably intelligent behaviour?
If you read Russell and Norvig, you'll find a wealth of algorithms for every purpose, updated to pretty much today's state of the art. That said, I was amazed at how many different problem classes can be successfully approached with Bayesian algorithms.
However, in your case I think it would be a bad idea for each unit to have its own Petri net or inference engine... there's only so much CPU and memory and time available. Hence, a different approach:
While in some ways perhaps a crackpot, Stephen Wolfram has shown that it's possible to program remarkably complex behavior on a basis of very simple rules. He bravely extrapolates from the Game of Life to quantum physics and the entire universe.
Similarly, a lot of research on small robots is focusing on emergent behavior or swarm intelligence. While classic military strategy and practice are strongly based on hierarchies, I think that an army of completely selfless, fearless fighters (as can be found marching in your computer) could be remarkably effective if operating as self-organizing clusters.
This approach would probably fit a little better with Erlang's or Scala's actor-based concurrency model than with Clojure's STM: I think self-organization and actors would go together extremely well. Still, I could envision running through a list of units at each turn, and having each unit evaluating just a small handful of very simple rules to determine its next action. I'd be very interested to hear if you've tried this approach, and how it went!
EDIT
Something else that was on the back of my mind but that slipped out again while I was writing: I think you can get remarkable results from this approach if you combine it with genetic or evolutionary programming; i.e. let your virtual toy soldiers wage war on each other as you sleep, let them encode their strategies and mix, match and mutate their code for those strategies; and let a refereeing program select the more successful warriors.
I've read about some startling successes achieved with these techniques, with units operating in ways we'd never think of. I have heard of AIs working on these principles having had to be intentionally dumbed down in order not to frustrate human opponents.
First you should aim to make your game turn based at some level for the AI (i.e. you can somehow model it turn based even if it may not be entirely turn based, in RTS you may be able to break discrete intervals of time into turns.) Second, you should determine how much information the AI should work with. That is, if the AI is allowed to cheat and know every move of its opponent (thereby making it stronger) or if it should know less or more. Third, you should define a cost function of a state. The idea being that a higher cost means a worse state for the computer to be in. Fourth you need a move generator, generating all valid states the AI can transition to from a given state (this may be homogeneous [state-independent] or heterogeneous [state-dependent].)
The thing is, the cost function will be greatly influenced by what exactly you define the state to be. The more information you encode in the state the better balanced your AI will be but the more difficult it will be for it to perform, as it will have to search exponentially more for every additional state variable you include (in an exhaustive search.)
If you provide a definition of a state and a cost function your problem transforms to a general problem in AI that can be tackled with any algorithm of your choice.
Here is a summary of what I think would work well:
Evolutionary algorithms may work well if you put enough effort into them, but they will add a layer of complexity that will create room for bugs amongst other things that can go wrong. They will also require extreme amounts of tweaking of the fitness function etc. I don't have much experience working with these but if they are anything like neural networks (which I believe they are since both are heuristics inspired by biological models) you will quickly find they are fickle and far from consistent. Most importantly, I doubt they add any benefits over the option I describe in 3.
With the cost function and state defined it would technically be possible for you to apply gradient decent (with the assumption that the state function is differentiable and the domain of the state variables are continuous) however this would probably yield inferior results, since the biggest weakness of gradient descent is getting stuck in local minima. To give an example, this method would be prone to something like attacking the enemy always as soon as possible because there is a non-zero chance of annihilating them. Clearly, this may not be desirable behaviour for a game, however, gradient decent is a greedy method and doesn't know better.
This option would be my most highest recommended one: simulated annealing. Simulated annealing would (IMHO) have all the benefits of 1. without the added complexity while being much more robust than 2. In essence SA is just a random walk amongst the states. So in addition to the cost and states you will have to define a way to randomly transition between states. SA is also not prone to be stuck in local minima, while producing very good results quite consistently. The only tweaking required with SA would be the cooling schedule--which decides how fast SA will converge. The greatest advantage of SA I find is that it is conceptually simple and produces superior results empirically to most other methods I have tried. Information on SA can be found here with a long list of generic implementations at the bottom.
3b. (Edit Added much later) SA and the techniques I listed above are general AI techniques and not really specialized to AI for games. In general, the more specialized the algorithm the more chance it has at performing better. See No Free Lunch Theorem 2. Another extension of 3 is something called parallel tempering which dramatically improves the performance of SA by helping it avoid local optima. Some of the original papers on parallel tempering are quite dated 3, but others have been updated4.
Regardless of what method you choose in the end, its going to be very important to break your problem down into states and a cost function as I said earlier. As a rule of thumb I would start with 20-50 state variables as your state search space is exponential in the number of these variables.
This question is huge in scope. You are basically asking how to write a strategy game.
There are tons of books and online articles for this stuff. I strongly recommend the Game Programming Wisdom series and AI Game Programming Wisdom series. In particular, Section 6 of the first volume of AI Game Programming Wisdom covers general architecture, Section 7 covers decision-making architectures, and Section 8 covers architectures for specific genres (8.2 does the RTS genre).
It's a huge question, and the other answers have pointed out amazing resources to look into.
I've dealt with this problem in the past and found the simple-behavior-manifests-complexly/emergent behavior approach a bit too unwieldy for human design unless approached genetically/evolutionarily.
I ended up instead using abstracted layers of AI, similar to a way armies work in real life. Units would be grouped with nearby units of the same time into squads, which are grouped with nearby squads to create a mini battalion of sorts. More layers could be use here (group battalions in a region, etc.), but ultimately at the top there is the high-level strategic AI.
Each layer can only issue commands to the layers directly below it. The layer below it will then attempt to execute the command with the resources at hand (ie, the layers below that layer).
An example of a command issued to a single unit is "Go here" and "shoot at this target". Higher level commands issued to higher levels would be "secure this location", which that level would process and issue the appropriate commands to the lower levels.
The highest level master AI is responsible for very board strategic decisions, such as "we need more ____ units", or "we should aim to move towards this location".
The army analogy works here; commanders and lieutenants and chain of command.