I'm looking for algorithms to find a "best" set of parameter values. The function in question has a lot of local minima and changes very quickly. To make matters even worse, testing a set of parameters is very slow - on the order of 1 minute - and I can't compute the gradient directly.
Are there any well-known algorithms for this kind of optimization?
I've had moderate success with just trying random values. I'm wondering if I can improve the performance by making the random parameter chooser have a lower chance of picking parameters close to ones that had produced bad results in the past. Is there a name for this approach so that I can search for specific advice?
More info:
Parameters are continuous
There are on the order of 5-10 parameters. Certainly not more than 10.
How many parameters are there -- eg, how many dimensions in the search space? Are they continuous or discrete - eg, real numbers, or integers, or just a few possible values?
Approaches that I've seen used for this kind of problem have a similar overall structure: take a large number of sample points, and adjust them all towards regions that have "good" answers somehow. Since you have a lot of points, their relative differences serve as a makeshift gradient.
Simulated Annealing: The classic approach. Take a bunch of points, and probabilistically move some to a neighbouring point chosen at random, depending on how much better it is.
Particle Swarm Optimization: Take a "swarm" of particles with velocities in the search space, and probabilistically move a particle; if it's an improvement, let the whole swarm know.
Genetic Algorithms: This is a little different. Rather than using the neighbours information like above, you take the best results each time and "cross-breed" them hoping to get the best characteristics of each.
The wikipedia links have pseudocode for the first two; GA methods have so much variety that it's hard to list just one algorithm, but you can follow links from there. Note that there are implementations for all of the above out there that you can use or take as a starting point.
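As a rough illustration of the first method, here is a minimal single-point simulated annealing sketch in Python. It is not taken from the linked pseudocode: the function names, the geometric cooling schedule, the Gaussian proposal and the `bumpy` test function are all assumptions made for the example. With a one-minute evaluation, `n_steps` would have to be far smaller than the default used here.

```python
import math
import random


def simulated_annealing(cost, start, step=0.5, t_start=1.0, t_end=1e-3,
                        n_steps=2000, rng=None):
    """Minimize cost(x) over a list of floats; returns (best_x, best_cost)."""
    rng = rng or random.Random()
    current = list(start)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for i in range(n_steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (i / max(1, n_steps - 1))
        # Propose a neighbour by perturbing every coordinate a little.
        candidate = [x + rng.gauss(0.0, step) for x in current]
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse points with
        # probability exp(-delta / t), which shrinks as t cools.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best, best_cost


def bumpy(x):
    # Hypothetical stand-in for the slow function: many local minima,
    # global minimum 0 at the origin (a Rastrigin-style test function).
    return sum(xi * xi + 3.0 * (1.0 - math.cos(2.0 * math.pi * xi)) for xi in x)


if __name__ == "__main__":
    print(simulated_annealing(bumpy, [3.0, -2.5], rng=random.Random(0)))
```

The best point seen so far is tracked separately from the current point, so an accepted uphill move never loses a good result.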
Note that all of these -- and really any approach to this kind of high-dimensional search problem -- are heuristics, which means they have parameters which have to be tuned to your particular problem. This can be tedious.
By the way, the fact that the function evaluation is so expensive can be made to work for you a bit; since all the above methods involve lots of independent function evaluations, that piece of the algorithm can be trivially parallelized with OpenMP or something similar to make use of as many cores as you have on your machine.
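The same idea can be sketched in Python with the standard `concurrent.futures` module instead of OpenMP. The helper names below are made up for the example, and plain random search is used only to keep it short. Threads are a reasonable choice when the slow evaluation runs outside the interpreter (an external simulator, say); for CPU-bound pure-Python code, `ProcessPoolExecutor` offers the same `map` interface.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def evaluate_batch(cost, candidates, max_workers=4):
    """Evaluate cost() on every candidate concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(cost, candidates))


def random_search_parallel(cost, bounds, n_rounds=5, batch_size=8, rng=None):
    """Random search that evaluates a whole batch of candidates at a time."""
    rng = rng or random.Random()
    best, best_cost = None, float("inf")
    for _ in range(n_rounds):
        # Draw a batch of independent candidates inside the bounds.
        batch = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(batch_size)]
        # The evaluations are independent, so they can run side by side.
        for x, c in zip(batch, evaluate_batch(cost, batch)):
            if c < best_cost:
                best, best_cost = x, c
    return best, best_cost


if __name__ == "__main__":
    print(random_search_parallel(lambda x: sum(v * v for v in x),
                                 [(-5.0, 5.0)] * 3, rng=random.Random(0)))
```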
Your situation seems to be similar to that of the poster of Software to Tune/Calibrate Properties for Heuristic Algorithms, and I would give you the same advice I gave there: consider a Metropolis-Hastings-like approach with multiple walkers and a simulated annealing of the step sizes.
The difficulty in using Monte Carlo methods in your case is the expensive evaluation of each candidate. How expensive, compared to the time you have at hand? If you need a good answer in a few minutes this isn't going to be fast enough. If you can leave it running overnight, it'll work reasonably well.
Given a complicated search space, I'd recommend a random initial distribution. Your final answer may simply be the best individual result recorded during the whole run, or the mean position of the walker with the best result.
Don't be put off that I was discussing maximizing there and you want to minimize: the figure of merit can be negated or inverted.
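A minimal sketch of what such a scheme might look like, written for minimization. Everything here is an assumption for illustration rather than the method from the linked answer: the function name, the fixed temperature, the geometric shrinking of the step size and the Gaussian proposals.

```python
import math
import random


def annealed_walkers(cost, bounds, n_walkers=4, n_steps=200, temperature=1.0,
                     step_start=0.5, step_end=0.01, rng=None):
    """Several Metropolis walkers whose proposal step shrinks over the run."""
    rng = rng or random.Random()
    # Random initial distribution of walkers inside the bounds.
    walkers = [[rng.uniform(lo, hi) for lo, hi in bounds]
               for _ in range(n_walkers)]
    costs = [cost(w) for w in walkers]
    k_best = min(range(n_walkers), key=costs.__getitem__)
    best, best_cost = list(walkers[k_best]), costs[k_best]
    for i in range(n_steps):
        # Anneal the step size (as a fraction of each parameter's range).
        frac = i / max(1, n_steps - 1)
        step = step_start * (step_end / step_start) ** frac
        for k in range(n_walkers):
            trial = [x + rng.gauss(0.0, step * (hi - lo))
                     for x, (lo, hi) in zip(walkers[k], bounds)]
            c = cost(trial)
            delta = c - costs[k]
            # Metropolis acceptance rule at a fixed temperature.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                walkers[k], costs[k] = trial, c
                if c < best_cost:
                    best, best_cost = list(trial), c
    return best, best_cost


if __name__ == "__main__":
    print(annealed_walkers(lambda x: sum(v * v for v in x),
                           [(-5.0, 5.0)] * 2, rng=random.Random(0)))
```

The run costs `n_walkers * (n_steps + 1)` evaluations, so with a one-minute function the defaults above would need to be cut down considerably.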
I've tried Simulated Annealing and Particle Swarm Optimization. (As a reminder, I couldn't use gradient descent because the gradient cannot be computed).
I've also tried an algorithm that does the following:
Pick a random point and a random direction
Evaluate the function
Keep moving along the random direction for as long as the result keeps improving, speeding up on every successful iteration.
When the result stops improving, step back and instead attempt to move into an orthogonal direction by the same distance.
This "orthogonal direction" was generated by creating a random orthogonal matrix (adapted this code) with the necessary number of dimensions.
If moving in the orthogonal direction improved the result, the algorithm just continued with that direction. If none of the directions improved the result, the jump distance was halved and a new set of orthogonal directions would be attempted. Eventually the algorithm concluded it must be in a local minimum, remembered it and restarted the whole lot at a new random point.
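The procedure described above can be sketched roughly as follows. This is a reconstruction from the description, not the original code: the names, the Gram-Schmidt construction of the random orthogonal directions, the choice to try each direction in both senses, and the evaluation budget `max_evals` are all assumptions.

```python
import math
import random


def random_orthonormal_basis(dim, rng):
    """Gram-Schmidt on Gaussian vectors gives a random orthonormal basis."""
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-9:  # retry in the (unlikely) degenerate case
            basis.append([vi / norm for vi in v])
    return basis


def local_descent(cost, x, step, min_step, rng, growth=1.5, max_evals=500):
    """Follow a direction while it helps; otherwise try orthogonal ones."""
    fx = cost(x)
    evals = 1
    while step > min_step and evals < max_evals:
        improved = False
        for d in random_orthonormal_basis(len(x), rng):
            for sign in (1.0, -1.0):
                while evals < max_evals:
                    trial = [xi + sign * step * di for xi, di in zip(x, d)]
                    evals += 1
                    ft = cost(trial)
                    if ft >= fx:
                        break       # step back: discard the trial point
                    x, fx, improved = trial, ft, True
                    step *= growth  # speed up after every success
                if improved:
                    break
            if improved:
                break
        if not improved:
            step *= 0.5             # no direction helped: halve the jump
    return x, fx


def direction_search(cost, bounds, n_restarts=3, step=1.0, min_step=1e-3,
                     rng=None):
    """Restart the local descent from random points and keep the best minimum."""
    rng = rng or random.Random()
    best, best_cost = None, float("inf")
    for _ in range(n_restarts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        x, fx = local_descent(cost, start, step, min_step, rng)
        if fx < best_cost:
            best, best_cost = x, fx
    return best, best_cost
```

A fresh set of directions is drawn after every success or failure, and the search gives up on the current point once the step falls below `min_step`.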
This approach performed considerably better than Simulated Annealing and Particle Swarm: it required fewer evaluations of the (very slow) function to achieve a result of the same quality.
Of course my implementations of S.A. and P.S.O. could well be flawed - these are tricky algorithms with a lot of room for tweaking parameters. But I just thought I'd mention what ended up working best for me.
I can't really help you with finding an algorithm for your specific problem.
However in regards to the random choosing of parameters I think what you are looking for are genetic algorithms. Genetic algorithms are generally based on choosing some random input, selecting those, which are the best fit (so far) for the problem, and randomly mutating/combining them to generate a next generation for which again the best are selected.
If the function is more or less continuous (that is, small mutations of good inputs generally won't generate bad inputs, with "small" being a somewhat loose term), this would work reasonably well for your problem.
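A minimal sketch of such a genetic algorithm for continuous parameters, written for minimization. The names, the uniform crossover, the Gaussian mutation and the elitist selection are assumptions chosen for brevity; real implementations vary widely. With the defaults below the run costs 20 + 30 x 15 = 470 evaluations.

```python
import random


def genetic_minimize(cost, bounds, pop_size=20, n_generations=30, keep=5,
                     mutation_scale=0.1, rng=None):
    """Tiny genetic algorithm: selection, uniform crossover, Gaussian mutation."""
    rng = rng or random.Random()
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scored = sorted(((cost(ind), ind) for ind in population),
                    key=lambda pair: pair[0])
    for _ in range(n_generations):
        # Selection: the `keep` best individuals become the parents.
        parents = [ind for _, ind in scored[:keep]]
        children = []
        while len(children) < pop_size - keep:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one parent at random.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, scaled to each range and clamped to it.
            child = [min(hi, max(lo, g + rng.gauss(0.0, mutation_scale * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        # Elitism: parents keep their cached scores, so only the children
        # cost new (expensive) evaluations.
        scored = sorted(scored[:keep] + [(cost(c), c) for c in children],
                        key=lambda pair: pair[0])
    best_cost, best = scored[0]
    return best, best_cost


if __name__ == "__main__":
    print(genetic_minimize(lambda x: sum(v * v for v in x),
                           [(-5.0, 5.0)] * 3, rng=random.Random(0)))
```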
There is no generalized way to answer your question. There are lots of books/papers on the subject matter, but you'll have to choose your path according to your needs, which are not clearly spoken here.
Some things to know, however - 1min/test is way too much for any algorithm to handle. I guess that in your case, you must really do one of the following:
get 100 computers to cut your parameter testing time to some reasonable time
really try to work out your parameters by hand and mind. There must be some redundancy and at least some sanity check so you can test your case in <1min
for possible result sets, try to figure out some 'operations' that modify it slightly instead of just randomizing it. For example, in TSP a basic operator is lambda, which swaps two nodes and thus creates a new route. Yours can be shifting some number up/down by some value.
then, find yourself some nice algorithm, your starting point can be somewhere here. The book is an invaluable resource for anyone who is starting out with problem-solving.
I'm designing a realtime strategy wargame where the AI will be responsible for controlling a large number of units (possibly 1000+) on a large hexagonal map.
A unit has a number of action points which can be expended on movement, attacking enemy units or various special actions (e.g. building new units). For example, a tank with 5 action points could spend 3 on movement then 2 in firing on an enemy within range. Different units have different costs for different actions etc.
Some additional notes:
The output of the AI is a "command" to any given unit
Action points are allocated at the beginning of a time period, but may be spent at any point within the time period (this is to allow for realtime multiplayer games). Hence "do nothing and save action points for later" is a potentially valid tactic (e.g. a gun turret that cannot move waiting for an enemy to come within firing range)
The game is updating in realtime, but the AI can get a consistent snapshot of the game state at any time (thanks to the game state being one of Clojure's persistent data structures)
I'm not expecting "optimal" behaviour, just something that is not obviously stupid and provides reasonable fun/challenge to play against
What can you recommend in terms of specific algorithms/approaches that would allow for the right balance between efficiency and reasonably intelligent behaviour?
If you read Russell and Norvig, you'll find a wealth of algorithms for every purpose, updated to pretty much today's state of the art. That said, I was amazed at how many different problem classes can be successfully approached with Bayesian algorithms.
However, in your case I think it would be a bad idea for each unit to have its own Petri net or inference engine... there's only so much CPU and memory and time available. Hence, a different approach:
While in some ways perhaps a crackpot, Stephen Wolfram has shown that it's possible to program remarkably complex behavior on a basis of very simple rules. He bravely extrapolates from the Game of Life to quantum physics and the entire universe.
Similarly, a lot of research on small robots is focusing on emergent behavior or swarm intelligence. While classic military strategy and practice are strongly based on hierarchies, I think that an army of completely selfless, fearless fighters (as can be found marching in your computer) could be remarkably effective if operating as self-organizing clusters.
This approach would probably fit a little better with Erlang's or Scala's actor-based concurrency model than with Clojure's STM: I think self-organization and actors would go together extremely well. Still, I could envision running through a list of units at each turn, and having each unit evaluating just a small handful of very simple rules to determine its next action. I'd be very interested to hear if you've tried this approach, and how it went!
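To make the "small handful of very simple rules" idea concrete, here is a hypothetical per-unit rule list in Python. The unit fields (`action_points`, `fire_cost`, `can_move`) and the action names are invented for the example and are not part of the question's game.

```python
def choose_action(unit, enemies_in_range, nearest_enemy_distance):
    """Pick one action for a unit from a short, ordered list of rules."""
    # Rule 1: with no action points left, there is nothing to decide.
    if unit["action_points"] <= 0:
        return "wait"
    # Rule 2: fire if something is in range and the unit can afford it.
    if enemies_in_range and unit["action_points"] >= unit["fire_cost"]:
        return "fire"
    # Rule 3: mobile units close in on the nearest known enemy.
    if unit["can_move"] and nearest_enemy_distance is not None:
        return "advance"
    # Rule 4: otherwise save the action points for later.
    return "wait"


if __name__ == "__main__":
    tank = {"action_points": 5, "fire_cost": 2, "can_move": True}
    turret = {"action_points": 3, "fire_cost": 2, "can_move": False}
    print(choose_action(tank, [], 7))      # advance
    print(choose_action(turret, [], 7))    # wait
    print(choose_action(turret, ["e1"], 2))  # fire
```

Running through a list of 1000+ units per tick with rules like these is cheap, and the immobile turret naturally ends up saving its action points until an enemy comes into range.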
EDIT
Something else that was on the back of my mind but that slipped out again while I was writing: I think you can get remarkable results from this approach if you combine it with genetic or evolutionary programming; i.e. let your virtual toy soldiers wage war on each other as you sleep, let them encode their strategies and mix, match and mutate their code for those strategies; and let a refereeing program select the more successful warriors.
I've read about some startling successes achieved with these techniques, with units operating in ways we'd never think of. I have heard of AIs working on these principles having had to be intentionally dumbed down in order not to frustrate human opponents.
First you should aim to make your game turn based at some level for the AI (i.e. you can somehow model it turn based even if it may not be entirely turn based, in RTS you may be able to break discrete intervals of time into turns.) Second, you should determine how much information the AI should work with. That is, if the AI is allowed to cheat and know every move of its opponent (thereby making it stronger) or if it should know less or more. Third, you should define a cost function of a state. The idea being that a higher cost means a worse state for the computer to be in. Fourth you need a move generator, generating all valid states the AI can transition to from a given state (this may be homogeneous [state-independent] or heterogeneous [state-dependent].)
The thing is, the cost function will be greatly influenced by what exactly you define the state to be. The more information you encode in the state the better balanced your AI will be but the more difficult it will be for it to perform, as it will have to search exponentially more for every additional state variable you include (in an exhaustive search.)
If you provide a definition of a state and a cost function your problem transforms to a general problem in AI that can be tackled with any algorithm of your choice.
Here is a summary of what I think would work well:
1. Evolutionary algorithms may work well if you put enough effort into them, but they will add a layer of complexity that will create room for bugs amongst other things that can go wrong. They will also require extreme amounts of tweaking of the fitness function etc. I don't have much experience working with these but if they are anything like neural networks (which I believe they are since both are heuristics inspired by biological models) you will quickly find they are fickle and far from consistent. Most importantly, I doubt they add any benefits over the option I describe in 3.
2. With the cost function and state defined it would technically be possible for you to apply gradient descent (with the assumption that the cost function is differentiable and the domains of the state variables are continuous); however, this would probably yield inferior results, since the biggest weakness of gradient descent is getting stuck in local minima. To give an example, this method would be prone to something like always attacking the enemy as soon as possible because there is a non-zero chance of annihilating them. Clearly, this may not be desirable behaviour for a game; however, gradient descent is a greedy method and doesn't know better.
3. This option would be my most highly recommended one: simulated annealing. Simulated annealing would (IMHO) have all the benefits of 1. without the added complexity while being much more robust than 2. In essence SA is just a random walk amongst the states. So in addition to the cost and states you will have to define a way to randomly transition between states. SA is also much less prone to getting stuck in local minima, while producing very good results quite consistently. The only tweaking required with SA would be the cooling schedule, which decides how fast SA will converge. The greatest advantage of SA I find is that it is conceptually simple and empirically produces superior results to most other methods I have tried. Information on SA can be found here with a long list of generic implementations at the bottom.
3b. (Edit added much later) SA and the techniques I listed above are general AI techniques and not really specialized to AI for games. In general, the more specialized the algorithm, the more chance it has of performing better. See the No Free Lunch Theorem. Another extension of 3 is something called parallel tempering, which dramatically improves the performance of SA by helping it avoid local optima. Some of the original papers on parallel tempering are quite dated, but others have been updated.
Regardless of what method you choose in the end, it's going to be very important to break your problem down into states and a cost function as I said earlier. As a rule of thumb I would start with 20-50 state variables, as your state search space is exponential in the number of these variables.
This question is huge in scope. You are basically asking how to write a strategy game.
There are tons of books and online articles for this stuff. I strongly recommend the Game Programming Wisdom series and AI Game Programming Wisdom series. In particular, Section 6 of the first volume of AI Game Programming Wisdom covers general architecture, Section 7 covers decision-making architectures, and Section 8 covers architectures for specific genres (8.2 does the RTS genre).
It's a huge question, and the other answers have pointed out amazing resources to look into.
I've dealt with this problem in the past and found the simple-behavior-manifests-complexly/emergent behavior approach a bit too unwieldy for human design unless approached genetically/evolutionarily.
I ended up instead using abstracted layers of AI, similar to the way armies work in real life. Units would be grouped with nearby units of the same type into squads, which are grouped with nearby squads to create a mini battalion of sorts. More layers could be used here (group battalions in a region, etc.), but ultimately at the top there is the high-level strategic AI.
Each layer can only issue commands to the layers directly below it. The layer below it will then attempt to execute the command with the resources at hand (ie, the layers below that layer).
An example of a command issued to a single unit is "Go here" and "shoot at this target". Higher level commands issued to higher levels would be "secure this location", which that level would process and issue the appropriate commands to the lower levels.
The highest level master AI is responsible for very broad strategic decisions, such as "we need more ____ units", or "we should aim to move towards this location".
The army analogy works here; commanders and lieutenants and chain of command.
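A bare-bones sketch of that chain of command in Python. The class names, the command strings and the way a squad translates "secure" into unit orders are all invented for illustration; the point is only that each layer talks to the layer directly below it.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Hex = Tuple[int, int]


@dataclass
class Unit:
    name: str
    orders: List[str] = field(default_factory=list)

    def execute(self, command: str, target: Hex) -> None:
        # Lowest layer: concrete orders such as "go_to" or "shoot_at".
        self.orders.append(f"{command} {target}")


@dataclass
class Squad:
    units: List[Unit]

    def execute(self, command: str, target: Hex) -> None:
        if command == "secure":
            # Translate the high-level order into unit-level orders:
            # the first unit gives covering fire, the rest move in.
            for i, unit in enumerate(self.units):
                unit.execute("shoot_at" if i == 0 else "go_to", target)
        else:
            for unit in self.units:
                unit.execute(command, target)


@dataclass
class Battalion:
    squads: List[Squad]

    def execute(self, command: str, target: Hex) -> None:
        # A battalion only ever talks to its own squads.
        for squad in self.squads:
            squad.execute(command, target)


@dataclass
class StrategicAI:
    battalions: List[Battalion]

    def decide(self, objective: Hex) -> None:
        # Top layer: one broad decision, pushed one level down.
        for battalion in self.battalions:
            battalion.execute("secure", objective)


if __name__ == "__main__":
    a, b = Unit("tank-1"), Unit("tank-2")
    StrategicAI([Battalion([Squad([a, b])])]).decide((3, 4))
    print(a.orders, b.orders)
```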
I'm not interested in tiny optimizations giving few percents of the speed.
I'm interested in the most important heuristics for alpha-beta search. And most important components for evaluation function.
I'm particularly interested in algorithms that have greatest (improvement/code_size) ratio.
(NOT (improvement/complexity)).
Thanks.
PS
Killer move heuristic is a perfect example - easy to implement and powerful.
Database of heuristics is too complicated.
Not sure if you're already aware of it, but check out the Chess Programming Wiki - it's a great resource that covers just about every aspect of modern chess AI. In particular, relating to your question, see the Search and Evaluation sections (under Principle Topics) on the main page. You might also be able to discover some interesting techniques used in some of the programs listed here. If your questions still aren't answered, I would definitely recommend you ask in the Chess Programming Forums, where there are likely to be many more specialists around to answer. (Not that you won't necessarily get good answers here, just that it's rather more likely on topic-specific expert forums).
MTD(f) or one of the MTD variants is a big improvement over standard alpha-beta, providing you don't have really fine detail in your evaluation function and assuming that you're using the killer heuristic. The history heuristic is also useful.
The top-rated chess program Rybka has apparently abandoned MTD(f) in favour of PVS with a zero-aspiration window on the non-PV nodes.
Extended futility pruning, which incorporates both normal futility pruning and deep razoring, is theoretically unsound, but remarkably effective in practice.
Iterative deepening is another useful technique. And I listed a lot of good chess programming links here.
Even though many optimizations based on heuristics (I mean ways to increase the tree depth without actually searching) are discussed in chess programming literature, I think most of them are rarely used. The reason is that they are good performance boosters in theory, but not in practice.
Sometimes these heuristics can return a bad (I mean not the best) move too.
The people I have talked to always recommend optimizing the alpha-beta search and implementing iterative deepening into the code rather than trying to add the other heuristics.
The main reason is that computers are increasing in processing power, and research [citation needed, I suppose] has shown that the programs that use their full CPU time to brute force the alpha-beta tree to the maximum depth have always outperformed the programs that split their time between a certain number of levels of alpha-beta and then some heuristics.
Even though using some heuristics to extend the tree depth can cause more harm than good, there are many performance boosters you can add to the alpha-beta search algorithm.
I am sure that you are aware that for alpha-beta to work exactly as it is intended to work, you should have a move sorting mechanism (iterative deepening). Iterative deepening can give you about a 10% performance boost.
Adding Principal variation search technique to alpha beta may give you an additional 10% boost.
Try the MTD(f) algorithm too. It can also increase the performance of your engine.
One heuristic that hasn't been mentioned is Null move pruning.
Also, Ed Schröder has a great page explaining a number of tricks he used in his Rebel engine, and how much improvement each contributed to speed/performance: Inside Rebel
Using a transposition table with a zobrist hash
It takes very little code to implement [one XOR on each move or unmove, and an if statement before recursing in the game tree], and the benefits are pretty good, especially if you are already using iterative deepening, and it's pretty tweakable (use a bigger table, smaller table, replacement strategies, etc)
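A minimal sketch of the idea in Python. The board encoding (a dict from square index to piece index), the table layout and the replacement rule are assumptions for the example; castling rights, en passant and promotions are left out, and a real search would also store whether a score is exact or only a bound.

```python
import random

N_SQUARES = 64
N_PIECES = 12  # 6 piece types x 2 colours

_rng = random.Random(2024)  # fixed seed so the keys are reproducible
ZOBRIST = [[_rng.getrandbits(64) for _ in range(N_PIECES)]
           for _ in range(N_SQUARES)]
SIDE_TO_MOVE = _rng.getrandbits(64)


def full_hash(board, white_to_move):
    """Hash a position from scratch; board maps square index -> piece index."""
    h = 0
    for square, piece in board.items():
        h ^= ZOBRIST[square][piece]
    if not white_to_move:
        h ^= SIDE_TO_MOVE
    return h


def update_hash(h, piece, from_sq, to_sq, captured=None):
    """Incremental update for a simple move. XOR is its own inverse, so
    calling this again with the same arguments undoes the move."""
    h ^= ZOBRIST[from_sq][piece]
    h ^= ZOBRIST[to_sq][piece]
    if captured is not None:
        h ^= ZOBRIST[to_sq][captured]
    h ^= SIDE_TO_MOVE
    return h


class TranspositionTable:
    def __init__(self, size=1 << 16):
        self.size = size
        self.slots = [None] * size

    def store(self, key, depth, score):
        i = key % self.size
        old = self.slots[i]
        # Overwrite a different position; for the same position keep
        # whichever result came from the deeper search.
        if old is None or old[0] != key or depth >= old[1]:
            self.slots[i] = (key, depth, score)

    def probe(self, key, depth):
        """Return a stored score searched to at least `depth`, else None."""
        entry = self.slots[key % self.size]
        if entry is not None and entry[0] == key and entry[1] >= depth:
            return entry[2]
        return None
```

In a search, `probe` would be the "if statement before recursing": on a hit the stored score is used and the subtree is skipped, otherwise the node is searched and the result passed to `store`.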
Killer moves are a good example of small code size and great improvement in move ordering.
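To show how little code is involved, here is a sketch of negamax alpha-beta with one killer slot per ply, driven by an iterative deepening loop. A tiny subtraction game (take 1 to 3 stones, whoever takes the last stone wins) stands in for chess so that the example is self-contained; the `Nim` class and the function names are assumptions for the example.

```python
class Nim:
    """Toy game: a state is the number of stones left in the pile."""

    def moves(self, stones):
        return [take for take in (1, 2, 3) if take <= stones]

    def play(self, stones, take):
        return stones - take

    def evaluate(self, stones):
        # Scored for the side to move: with no stones left, the opponent
        # took the last one, so the side to move has lost.
        return -1 if stones == 0 else 0


def negamax(state, depth, alpha, beta, ply, game, killers):
    """Depth-limited negamax with alpha-beta pruning and killer moves."""
    moves = game.moves(state)
    if depth == 0 or not moves:
        return game.evaluate(state), None
    # Try this ply's killer move first, if it is legal in this position.
    killer = killers.get(ply)
    if killer in moves:
        moves = [killer] + [m for m in moves if m != killer]
    best_score, best_move = -float("inf"), None
    for move in moves:
        score, _ = negamax(game.play(state, move), depth - 1,
                           -beta, -alpha, ply + 1, game, killers)
        score = -score
        if score > best_score:
            best_score, best_move = score, move
        alpha = max(alpha, score)
        if alpha >= beta:
            killers[ply] = move  # remember the move that caused the cutoff
            break
    return best_score, best_move


def iterative_deepening(state, max_depth, game):
    """Search depth 1, 2, ... and reuse the killer table between passes."""
    killers, result = {}, (0, None)
    for depth in range(1, max_depth + 1):
        result = negamax(state, depth, -float("inf"), float("inf"),
                         0, game, killers)
    return result


if __name__ == "__main__":
    print(iterative_deepening(5, 6, Nim()))  # (1, 1): take one stone and win
```

A real engine would stop the deepening loop on a clock rather than at a fixed `max_depth`, and would usually keep two killer moves per ply.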
Most board game AI algorithms are based on MinMax (http://en.wikipedia.org/wiki/Minmax). The goal is to minimize their options while maximizing your options. With chess, though, this is a very large and expensive runtime problem. To help reduce that you can combine minmax with a database of previously played games. Any game that has a similar board position and has an established pattern for how that layout was won for your color can be used for "analyzing" where to move next.
I am a bit confused on what you mean by improvement/code_size. Do you really mean improvement / runtime analysis (big O(n) vs. o(n))? If that is the case, talk to IBM and big blue, or Microsoft's Parallels team. At PDC I spoke with a guy (whose name escapes me now) who was demonstrating Mahjong using 8 cores per opponent and they won first place in the game algorithm design competition (whose name also escapes me).
I do not think there are any "canned" algorithms out there to always win chess and do it very fast. The way that you would have to do it is have EVERY possible previously played game indexed in a very large dictionary based database and have pre-cached the analysis of every game. It would be a VERY complex algorithm and would be a very poor improvement / complexity problem in my opinion.
I might be slightly off topic but "state of the art" chess programs use MPI such as Deep Blue for massive parallel power.
Just consider that parallel processing plays a great role in modern chess.
I was recently in a discussion with a non-coder person on the possibilities of chess computers. I'm not well versed in theory, but think I know enough.
I argued that there could not exist a deterministic Turing machine that always won or stalemated at chess. I think that, even if you search the entire space of all combinations of player1/2 moves, the single move that the computer decides upon at each step is based on a heuristic. Being based on a heuristic, it does not necessarily beat ALL of the moves that the opponent could do.
My friend thought, to the contrary, that a computer would always win or tie if it never made a "mistake" move (however do you define that?). However, being a programmer who has taken CS, I know that even your good choices - given a wise opponent - can force you to make "mistake" moves in the end. Even if you know everything, your next move is greedy in matching a heuristic.
Most chess computers try to match a possible end game to the game in progress, which is essentially a dynamic programming traceback. Again, the endgame in question is avoidable though.
Edit: Hmm... looks like I ruffled some feathers here. That's good.
Thinking about it again, it seems like there is no theoretical problem with solving a finite game like chess. I would argue that chess is a bit more complicated than checkers in that a win is not necessarily by numerical exhaustion of pieces, but by a mate. My original assertion is probably wrong, but then again I think I've pointed out something that is not yet satisfactorily proven (formally).
I guess my thought experiment was that whenever a branch in the tree is taken, then the algorithm (or memorized paths) must find a path to a mate (without getting mated) for any possible branch on the opponent moves. After the discussion, I will buy that given more memory than we can possibly dream of, all these paths could be found.
"I argued that there could not exist a deterministic Turing machine that always won or stalemated at chess."
You're not quite right. There can be such a machine. The issue is the hugeness of the state space that it would have to search. It's finite, it's just REALLY big.
That's why chess falls back on heuristics -- the state space is too huge (but finite). To even enumerate -- much less search for every perfect move along every course of every possible game -- would be a very, very big search problem.
Openings are scripted to get you to a mid-game that gives you a "strong" position, not a known outcome. Even end games -- when there are fewer pieces -- are hard to enumerate to determine a best next move. Technically they're finite. But the number of alternatives is huge. Even a 2 rooks + king endgame has something like 22 possible next moves. And if it takes 6 moves to mate, you're looking at 12,855,002,631,049,216 move sequences (22^12, counting 6 moves by each side).
Do the math on opening moves. While there's only about 20 opening moves, there are something like 30 or so second moves, so by the third move we're looking at 360,000 alternative game states.
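The figures quoted above can be reproduced with a couple of lines of Python. The readings used here are assumptions: 22 choices on each of 12 plies for the endgame figure, and 20 x 30 choices per pair of moves, squared, for the opening figure.

```python
def sequences(branching, plies):
    """Number of move sequences with a fixed branching factor per ply."""
    return branching ** plies


if __name__ == "__main__":
    # Endgame: about 22 choices per ply, 6 moves by each side = 12 plies.
    print(f"{sequences(22, 12):,}")  # 12,855,002,631,049,216
    # Opening: about 20 first moves times 30 replies, for two rounds.
    print(f"{(20 * 30) ** 2:,}")     # 360,000
```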
But chess games are (technically) finite. Huge, but finite. There's perfect information. There are defined start and end states. There are no coin-tosses or dice rolls.
I know next to nothing about what's actually been discovered about chess. But as a mathematician, here's my reasoning:
First we must remember that White gets to go first and maybe this gives him an advantage; maybe it gives Black an advantage.
Now suppose that there is no perfect strategy for Black that lets him always win/stalemate. This implies that no matter what Black does, there is a strategy White can follow to win. Wait a minute - this means there is a perfect strategy for White!
This tells us that at least one of the two players does have a perfect strategy which lets that player always win or draw.
There are only three possibilities, then:
White can always win if he plays perfectly
Black can always win if he plays perfectly
Neither player can force a win, but each can force at least a draw if he plays perfectly (so if both players play perfectly then the game is always drawn)
But which of these is actually correct, we may never know.
The answer to the question is yes: there must be a perfect algorithm for chess, at least for one of the two players.
It has been proven for the game of checkers that a program can always win or tie the game. That is, there is no choice of moves that one player can make which forces the other player into losing.
The researchers spent almost two decades going through the 500 billion billion possible checkers positions, which is still an infinitesimally small fraction of the number of chess positions, by the way. The checkers effort included top players, who helped the research team program checkers rules of thumb into software that categorized moves as successful or unsuccessful. Then the researchers let the program run, on an average of 50 computers daily. Some days, the program ran on 200 machines, while the researchers monitored progress and tweaked the program accordingly. In fact, Chinook beat humans to win the checkers world championship back in 1994.
Yes, you can solve chess, no, you won't any time soon.
This is not a question about computers but only about the game of chess.
The question is, does there exist a fail-safe strategy for never losing the game? If such a strategy exists, then a computer which knows everything can always use it and it is not a heuristic anymore.
For example, the game tic-tac-toe normally is played based on heuristics. But, there exists a fail-safe strategy. Whatever the opponent moves, you always find a way to avoid losing the game, if you do it right from the start on.
So you would need to prove whether or not such a strategy exists for chess as well. It is basically the same, just the space of possible moves is vastly bigger.
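For tic-tac-toe the fail-safe strategy can be found by brute force, because the whole game tree is small enough to search exhaustively. A minimal sketch in Python (the board encoding as a 9-character string and the function names are choices made for the example):

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board):
    """Return "X" or "O" if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Value for X under perfect play: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == "."]
    # X picks the best outcome for X, O picks the worst outcome for X.
    return max(results) if player == "X" else min(results)


if __name__ == "__main__":
    print(value("." * 9, "X"))  # 0: perfect play from the empty board is a draw
```

The same exhaustive search is what "solving" chess would mean in principle; only the size of the tree makes it impractical.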
I'm coming to this thread very late, and I see that you've already realised some of the issues. But as an ex-master and an ex-professional chess programmer, I thought I could add a few useful facts and figures. There are several ways of measuring the complexity of chess:
The total number of chess games is approximately 10^(10^50). That number is unimaginably large.
The number of chess games of 40 moves or less is around 10^40. That's still an incredibly large number.
The number of possible chess positions is around 10^46.
The complete chess search tree (Shannon number) is around 10^123, based on an average branching factor of 35 and an average game length of 80.
For comparison, the number of atoms in the observable universe is commonly estimated to be around 10^80.
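The search-tree estimate is easy to check, since a tree with branching factor b and depth d has about b^d leaves. A quick sketch in Python, working in powers of ten (the function name is made up for the example):

```python
import math


def log10_tree_size(branching, plies):
    """Base-10 exponent of branching ** plies."""
    return plies * math.log10(branching)


if __name__ == "__main__":
    exponent = log10_tree_size(35, 80)
    print(round(exponent, 1))  # 123.5, i.e. 35^80 is a bit over 10^123
    # Margin over the ~10^80 atoms in the observable universe,
    # in orders of magnitude.
    print(round(exponent - 80, 1))
```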
All endgames of 6 pieces or less have been collated and solved.
My conclusion: while chess is theoretically solvable, we will never have the money, the motivation, the computing power, or the storage to ever do it.
Some games have, in fact, been solved. Tic-Tac-Toe is a very easy one for which to build an AI that will always win or tie. Recently, Connect 4 has been solved as well (and shown to be unfair to the second player, since a perfect play will cause him to lose).
Chess, however, has not been solved, and I don't think there's any proof that it is a fair game (i.e., whether the perfect play results in a draw). Speaking strictly from a theoretical perspective though, Chess has a finite number of possible piece configurations. Therefore, the search space is finite (albeit, incredibly large). Therefore, a deterministic Turing machine that could play perfectly does exist. Whether one could ever be built, however, is a different matter.
The average $1000 desktop will be able to solve checkers in a mere 5 seconds by the year 2040 (5x10^20 calculations).
Even at this speed, it would still take 100 of these computers approximately 6.34 x 10^19 years to solve chess. Still not feasible. Not even close.
Around 2080, our average desktops will have approximately 10^45 calculations per second. A single computer will have the computational power to solve chess in about 27.7 hours. It will definitely be done by 2080 as long as computing power continues to grow as it has the past 30 years.
By 2090, enough computational power will exist on a $1000 desktop to solve chess in about 1 second...so by that date it will be completely trivial.
Given checkers was solved in 2007, and the computational power to solve it in 1 second will lag by about 33-35 years, we can probably roughly estimate chess will be solved somewhere between 2055-2057. Probably sooner since when more computational power is available (which will be the case in 45 years), more can be devoted to projects such as this. However, I would say 2050 at the earliest, and 2060 at the latest.
In 2060, it would take 100 average desktops 3.17 x 10^10 years to solve chess. Realize I am using a $1000 computer as my benchmark, whereas larger systems and supercomputers will probably be available as their price/performance ratio is also improving. Also, their order of magnitude of computational power increases at a faster pace. Consider a supercomputer now can perform 2.33 x 10^15 calculations per second, and a $1000 computer about 2 x 10^9. By comparison, 10 years ago the difference was 10^5 instead of 10^6. By 2060 the order of magnitude difference will probably be 10^12, and even this may increase faster than anticipated.
Much of this depends on whether or not we as human beings have the drive to solve chess, but the computational power will make it feasible around this time (as long as our pace continues).
On another note, the game of Tic-Tac-Toe, which is much, much simpler, has 2,653,002 possible calculations (with an open board). The computational power to solve Tic-Tac-Toe in roughly 2.5 seconds (at 1 million calculations per second) was achieved in 1990.
Moving backwards, in 1955, a computer had the power to solve Tic-Tac-Toe in about 1 month (1 calculation per second). Again, this is based on what $1000 would get you if you could package it into a computer (a $1000 desktop obviously did not exist in 1955), and this computer would have been devoted to solving Tic-Tac-Toe....which was just not the case in 1955. Computation was expensive and would not have been used for this purpose, although I don't believe there is any date where Tic-Tac-Toe was deemed "solved" by a computer, but I'm sure it lags behind the actual computational power.
Also, take into account that $1000 in 45 years will be worth about a quarter of what it is now, so much more money can go into projects such as this while computational power continues to get cheaper.
It actually is possible for both players to have winning strategies in infinite games with no well-ordering; however, chess is well-ordered. In fact, because of the 50-move rule, there is an upper limit to the number of moves a game can have, and thus there are only finitely many possible games of chess (which can be enumerated to solve it exactly... theoretically, at least) :)
Your end of the argument is supported by the way modern chess programs work now. They work that way because it's way too resource-intensive to code a chess program to operate deterministically. They won't necessarily always work that way. It's possible that chess will someday be solved, and if that happens, it will likely be solved by a computer.
I think you are dead on. Machines like Deep Blue and Deep Thought are programmed with a number of predefined games, and clever algorithms to parse the trees into the ends of those games. This is, of course, a dramatic oversimplification. There is always a chance to "beat" the computer along the course of a game. By this I mean making a move that forces the computer to make a move that is less than optimal (whatever that is). If the computer cannot find the best path before the time limit for the move, it might very well make a mistake by choosing one of the less-desirable paths.
There is another class of chess programs that uses real machine learning, or genetic programming / evolutionary algorithms. Some programs have been evolved and use neural networks and the like to make decisions. In this type of case, I would imagine that the computer might make "mistakes", but still end up with a victory.
There is a fascinating book on this type of GP called Blondie24 that you might read. It is about checkers, but it could apply to chess.
For the record, there are computers that can win or tie at checkers. I'm not sure if the same could be done for chess. The number of moves is a lot higher. Also, things change because pieces can move in any direction, not just forwards and backwards. I think, although I'm not sure, that chess is deterministic, but that there are just way too many possible moves for a computer to currently determine all the moves in a reasonable amount of time.
From game theory, which is what this question is about, the answer is yes, chess can be played perfectly. The game space is known/predictable, and yes, if you had your grandchild's quantum computers you could probably eliminate all heuristics.
You could write a perfect tic-tac-toe machine nowadays in any scripting language and it'd play perfectly in real time.
Othello is another game that current computers can easily play perfectly, but the machine's memory and CPU will need a bit of help
Chess is theoretically possible but not practically possible (in 2008)
i-Go is tricky: its space of possibilities exceeds the number of atoms in the universe, so it might take us some time to make a perfect i-Go machine.
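To make the tic-tac-toe claim concrete, here is a minimal sketch of an exhaustive solver in Python. The board encoding (a 9-character string) and the function names are my own choices for illustration, not taken from any particular program. It searches the whole game tree with memoization and reports the value of the empty board for the side to move.

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def solve(board, player):
    """Return (value, move) for the side to move: +1 win, 0 draw, -1 loss."""
    other = 'O' if player == 'X' else 'X'
    if winner(board) == other:
        return -1, None          # the previous move won the game
    if ' ' not in board:
        return 0, None           # full board, no winner: draw
    best_value, best_move = -2, None
    for i, cell in enumerate(board):
        if cell == ' ':
            child = board[:i] + player + board[i + 1:]
            value = -solve(child, other)[0]   # opponent's gain is our loss
            if value > best_value:
                best_value, best_move = value, i
    return best_value, best_move

value, move = solve(' ' * 9, 'X')
print(value)   # 0: with perfect play from both sides the game is a draw
```

The same idea works in principle for chess; the only obstacle is the size of the tree, which is the point the list above is making.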
Chess is an example of a matrix game, which by definition has an optimal outcome (think Nash equilibrium). If players 1 and 2 each make optimal moves, a certain outcome will ALWAYS be reached (whether that outcome is a win, a tie, or a loss is still unknown).
As a chess programmer from the 1970's, I definitely have an opinion on this. What I wrote up about 10 years ago is still basically true today:
"Unfinished Work and Challenges to Chess Programmers"
Back then, I thought we could solve Chess conventionally, if done properly.
Checkers was solved recently (Yay, University of Alberta, Canada!!!) but that was effectively done by brute force. To do chess conventionally, you'll have to be smarter.
Unless, of course, Quantum Computing becomes a reality. If so, chess will be solved as easily as Tic-Tac-Toe.
In the early 1970's in Scientific American, there was a short parody that caught my attention. It was an announcement that the game of chess was solved by a Russian chess computer. It had determined that there is one perfect move for white that would ensure a win with perfect play by both sides, and that move is: 1. a4!
Lots of answers here make the important game-theoretic points:
Chess is a finite, deterministic game with complete information about the game state
You can solve a finite game and identify a perfect strategy
Chess is however big enough that you will not be able to solve it completely with a brute force method
However these observations miss an important practical point: it is not necessary to solve the complete game perfectly in order to create an unbeatable machine.
It is in fact quite likely that you could create an unbeatable chess machine (i.e. will never lose and will always force a win or draw) without searching even a tiny fraction of the possible state space.
The following techniques for example all massively reduce the search space required:
Tree pruning techniques like Alpha/Beta or MTD-f already massively reduce the search space
Provably winning positions - many endings fall into this category: you don't need to search KR vs K, for example, because it's a proven win. With some work it is possible to prove many more guaranteed wins.
Almost certain wins - for "good enough" play without any foolish mistakes (say about ELO 2200+?) many chess positions are almost certain wins, for example a decent material advantage (e.g. an extra Knight) with no compensating positional advantage. If your program can force such a position and has good enough heuristics for detecting positional advantage, it can safely assume it will win or at least draw with 100% probability.
Tree search heuristics - with good enough pattern recognition, you can quickly focus on the relevant subset of "interesting" moves. This is how human grandmasters play, so it's clearly not a bad strategy... and our pattern recognition algorithms are constantly getting better.
Risk assessment - a better conception of the "riskiness" of a position will enable much more effective searching by focusing computing power on situations where the outcome is more uncertain (this is a natural extension of Quiescence Search)
With the right combination of the above techniques, I'd be comfortable asserting that it is possible to create an "unbeatable" chess playing machine. We're probably not too far off with current technology.
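As an illustration of the first technique in that list, here is a minimal alpha-beta sketch in Python, compared against plain minimax on a tiny hand-made tree. The tree, its leaf scores, and the leaf counters are hypothetical and exist only to show the pruning effect; a real engine would generate moves and evaluate positions instead.

```python
import math

def minimax(node, maximizing, counter):
    """Plain minimax over a nested-list tree; leaves are integer scores."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)

def alphabeta(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    """Same result as minimax, but skips branches that cannot matter."""
    if isinstance(node, int):
        counter[0] += 1
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break        # the minimizer above will never allow this line
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break            # the maximizer above already has something better
    return best

# Hypothetical toy tree: the root is a max node, its children are min nodes.
TREE = [[3, 5, 6], [2, 9, 8], [1, 7, 4]]

plain_count, pruned_count = [0], [0]
plain_value = minimax(TREE, True, plain_count)
pruned_value = alphabeta(TREE, True, pruned_count)
print(plain_value, plain_count[0])     # 3 9
print(pruned_value, pruned_count[0])   # 3 5
```

Both searches return the same value, but alpha-beta looks at 5 leaves instead of 9 here. How much it saves in general depends heavily on move ordering.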
Note that it's almost certainly harder to prove that this machine cannot be beaten. It would probably be something like the Riemann hypothesis - we would be pretty sure that it plays perfectly and would have empirical results showing that it never lost (including a few billion straight draws against itself), but we wouldn't actually have the ability to prove it.
Additional note regarding "perfection":
I'm careful not to describe the machine as "perfect" in the game-theoretic sense because that implies unusually strong additional conditions, such as:
Always winning in every situation where it is possible to force a win, no matter how complex the winning combination may be. There will be situations on the boundary between win/draw where this is extremely hard to calculate perfectly.
Exploiting all available information about potential imperfection in your opponent's play, for example inferring that your opponent might be too greedy and deliberately playing a slightly weaker line than usual on the grounds that it has a greater potential to tempt your opponent into making a mistake. Against imperfect opponents it can in fact be optimal to make a losing move if you estimate that your opponent probably won't spot the forced win and it gives you a higher probability of winning yourself.
Perfection (particularly given imperfect and unknown opponents) is a much harder problem than simply being unbeatable.
It's perfectly solvable.
There are 10^50-odd positions. Each position, by my reckoning, requires a minimum of about 64 bytes, in round numbers, to store (each square has 2 affiliation bits and 3 piece bits). Once they are collated, the positions that are checkmates can be identified and positions can be compared to form a relationship, showing which positions lead to other positions in a large outcome tree.
Then, the program only needs to find the lowest roots from which only one side can checkmate, if such a thing exists. In any case, Chess was fairly simply solved at the end of the first paragraph.
if you search the entire space of all combinations of player1/2 moves, the single move that the computer decides upon at each step is based on a heuristic.
There are two competing ideas there. One is that you search every possible move, and the other is that you decide based on a heuristic. A heuristic is a system for making a good guess. If you're searching through every possible move, then you're no longer guessing.
"Is there a perfect algorithm for chess?"
Yes, there is. Maybe it's for White to always win. Maybe it's for Black to always win. Maybe it's for both to always tie at least. We don't know which, and we'll never know, but it certainly exists.
See also
God's algorithm
I found this article by John MacQuarrie that references work by the "father of game theory" Ernst Friedrich Ferdinand Zermelo. It draws the following conclusion:
In chess either white can force a win, or black can force a win, or both sides can force at least a draw.
The logic seems sound to me.
There are two mistakes in your thought experiment:
If your Turing machine is not "limited" (in memory, speed, ...) you do not need to use heuristics; you can evaluate the final states (win, loss, draw) directly. To find the perfect game you would then just need to use the Minimax algorithm (see http://en.wikipedia.org/wiki/Minimax) to compute the optimal moves for each player, which would lead to one or more optimal games.
There is also no limit on the complexity of the heuristic used. If you can calculate a perfect game, there is also a way to compute a perfect heuristic from it. If needed, it's just a function that maps chess positions in the way "If I'm in this situation S my best move is M".
As others pointed out already, this will end in one of 3 possible results: white can force a win, black can force a win, or both sides can force at least a draw.
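The "situation S, best move M" table can be sketched on a game small enough to solve completely. The example below is Python and uses a hypothetical toy game of my choosing, not chess: a pile of stones, each player removes 1 to 3, and whoever takes the last stone wins. Minimax gives the exact value of every position, and the table is then read off from those values.

```python
from functools import lru_cache

MOVES = (1, 2, 3)   # hypothetical toy game: take 1-3 stones, last stone wins

@lru_cache(maxsize=None)
def value(stones):
    """+1 if the side to move can force a win, -1 otherwise."""
    if stones == 0:
        return -1                      # no stones left: the previous player won
    return max(-value(stones - m) for m in MOVES if m <= stones)

def build_table(max_stones):
    """Map each position S to a best move M, derived from the exact values."""
    table = {}
    for s in range(1, max_stones + 1):
        legal = [m for m in MOVES if m <= s]
        table[s] = max(legal, key=lambda m: -value(s - m))
    return table

table = build_table(12)
print(table[6], table[7])   # 2 3: leave the opponent a multiple of 4
print(value(8))             # -1: a multiple of 4 is lost for the side to move
```

Once the table exists, playing perfectly is a lookup and no search or heuristic is needed at play time. For chess the table would need an entry for every reachable position, which is exactly the size problem discussed elsewhere in this thread.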
The result of a perfect checkers game has already been "computed". If humanity does not destroy itself first, there will also be a calculation for chess some day, when computers have evolved enough to have enough memory and speed. Or we have some quantum computers... Or until someone (a researcher, a chess expert, a genius) finds an algorithm that significantly reduces the complexity of the game. To give an example: What is the sum of all numbers between 1 and 1000? You can either calculate 1+2+3+4+5...+999+1000, or you can simply calculate N*(N+1)/2 with N = 1000; result = 500500. Now imagine you don't know about that formula, you don't know about mathematical induction, you don't even know how to multiply or add numbers, ... So, it may be possible that there is a currently unknown algorithm that drastically reduces the complexity of this game, so that it would take just 5 minutes to calculate the best move with a current computer. Maybe it would even be possible to work it out as a human with pen & paper, or even in your head, given some more time.
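The sum example is easy to check directly. A short Python sketch (the variable names are mine) computes the total both ways:

```python
N = 1000

brute_force = sum(range(1, N + 1))   # adds all 1000 terms one by one
closed_form = N * (N + 1) // 2       # a single multiplication and division

print(brute_force, closed_form)      # 500500 500500
```

The brute-force cost grows with N while the closed form does not, which is the kind of shortcut the paragraph above speculates might one day be found for chess.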
So, the quick answer is: If humanity survives long enough, it's just a matter of time!
I'm only 99.9% convinced by the claim that the size of the state space makes it impossible to hope for a solution.
Sure, 10^50 is an impossibly large number. Let's call the size of the state space n.
What's the bound on the number of moves in the longest possible game? Since all games end in a finite number of moves there exists such a bound, call it m.
Starting from the initial state, can't you enumerate all n moves in O(m) space? Sure, it takes O(n) time, but the arguments from the size of the universe don't directly address that. O(m) space might not even be very much. For O(m) space couldn't you also track, during this traversal, whether the continuation of any state along the path you are traversing leads to EitherMayWin, EitherMayForceDraw, WhiteMayWin, WhiteMayWinOrForceDraw, BlackMayWin, or BlackMayWinOrForceDraw? (There's a lattice depending on whose turn it is, annotate each state in the history of your traversal with the lattice meet.)
Unless I'm missing something, that's an O(n) time / O(m) space algorithm for determining which of the possible categories chess falls into. Wikipedia cites an estimate for the age of the universe at approximately 10^60 Planck times. Without getting into a cosmology argument, let's guess that there's about that much time left before the heat/cold/whatever death of the universe. That leaves us needing to evaluate one move every 10^10 Planck times, or every 10^-34 seconds. That's an impossibly short time (about 16 orders of magnitude shorter than the shortest times ever observed). Let's optimistically say that with a super-duper-good implementation running on top-of-the-line present-or-foreseen-non-quantum-P-is-a-proper-subset-of-NP technology we could hope to evaluate states (take a single step forward, categorize the resulting state as an intermediate state or one of the three end states) at a rate of 100 MHz (once every 10^-8 seconds). Since this algorithm is very parallelizable, this leaves us needing 10^26 such computers, or about one for every atom in my body, together with the ability to collect their results.
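That back-of-the-envelope estimate can be reproduced in a few lines of Python. The state count, the time budget, and the 100 MHz rate are the figures from the paragraph above; the Planck time of roughly 5.39e-44 seconds is the only value I have added.

```python
PLANCK_TIME_S = 5.39e-44       # seconds, approximate
STATES = 10**50                # state-space estimate used above
TIME_LEFT_PLANCK = 10**60      # the guess above for the time remaining

planck_per_state = TIME_LEFT_PLANCK // STATES          # 10**10 Planck times
seconds_per_state = planck_per_state * PLANCK_TIME_S   # roughly 5e-34 seconds
machine_step_s = 1e-8                                  # one state per step at 100 MHz
machines_needed = machine_step_s / seconds_per_state   # roughly 2e25 machines

print(planck_per_state)
print(f"{seconds_per_state:.2e} s per state, {machines_needed:.2e} machines")
```

With the unrounded Planck time the machine count comes out a little under 10^26, so the order-of-magnitude conclusion above is unchanged.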
I suppose there's always some sliver of hope for a brute-force solution. We might get lucky and, in exploring only one of white's possible opening moves, both choose one with much-lower-than-average fanout and one in which white always wins or wins-or-draws.
We could also hope to shrink the definition of chess somewhat and persuade everyone that it's still morally the same game. Do we really need to require positions to repeat 3 times before a draw? Do we really need to make the running-away party demonstrate the ability to escape for 50 moves? Does anyone even understand what the heck is up with the en passant rule? ;) More seriously, do we really need to force a player to move (as opposed to either drawing or losing) when his or her only move to escape check or a stalemate is an en passant capture? Could we limit the choice of pieces to which a pawn may be promoted if the desired non-queen promotion does not lead to an immediate check or checkmate?
I'm also uncertain about how much allowing each computer hash-based access to a large database of late game states and their possible outcomes (which might be relatively feasible on existing hardware and with existing endgame databases) could help in pruning the search earlier. Obviously you can't memoize the entire function without O(n) storage, but you could pick a large integer and memoize that many endgames enumerating backwards from each possible (or even not easily provably impossible, I suppose) end state.
I know this is a bit of a bump, but I have to put my 5 cents worth in here. It is possible for a computer, or a person for that matter, to end every single chess game that he/she/it participates in, in either a win or a stalemate.
To achieve this, however, you must know precisely every possible move and reaction and so forth, all the way through to each and every single possible game outcome. To visualize this, or to make this information easy to analyse, think of it as a mind map that branches out constantly.
The center node would be the start of the game. Each branch out of each node would symbolize a move, each one different from its brethren moves. Presenting it in this manner would take a lot of resources, especially if you were doing this on paper. On a computer, this would take possibly hundreds of terabytes of data, as you would have very many repetitive moves, unless you made the branches come back.
To memorize such data, however, would be implausible, if not impossible. To make a computer recognize the most optimal move to take out of the (at most) 8 instantly possible moves would be possible, but not plausible, as that computer would need to be able to process all the branches past that move, all the way to a conclusion, count all conclusions that result in a win or a stalemate, then act on that number of winning conclusions against losing conclusions. That would require RAM capable of processing data in the terabytes, or more! And with today's technology, a computer like that would cost more than the bank balance of the 5 richest men and/or women in the world!
So after all that consideration, it could be done; however, no one person could do it. Such a task would require 30 of the brightest minds alive today, not only in chess, but in science and computer technology, and such a task could only be completed on a (let's put it entirely into basic perspective)... extremely ultimately hyper super-duper computer... which couldn't possibly exist for at least a century. It will be done! Just not in this lifetime.
Mathematically, chess has been solved by the Minimax algorithm, which goes back to the 1920s (found by either Borel or von Neumann). Thus, a Turing machine can indeed play perfect chess.
However, the computational complexity of chess makes it practically infeasible. Current engines use several improvements and heuristics. Top engines today have surpassed the best humans in terms of playing strength, but because of the heuristics that they are using, they might not play perfectly even when given infinite time (e.g., hash collisions could lead to incorrect results).
The closest that we currently have in terms of perfect play are endgame tablebases. The typical technique to generate them is called retrograde analysis. Currently, all positions with up to six pieces have been solved.
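Retrograde analysis works backwards from finished positions instead of forwards from the start. The Python sketch below shows the idea on a hypothetical six-position graph of my own invention; it is not a tablebase generator and contains no chess rules. A position with no moves counts as lost for the side to move (like checkmate), and anything the backward pass never resolves is a draw.

```python
from collections import deque

def retrograde(moves):
    """Label each position 'win' or 'loss' for the side to move; the rest are draws.

    `moves` maps a position to the positions reachable in one move. Every
    successor must itself be a key, and successor lists must not repeat.
    """
    preds = {p: [] for p in moves}
    for p, succs in moves.items():
        for s in succs:
            preds[s].append(p)
    remaining = {p: len(succs) for p, succs in moves.items()}
    result = {p: 'loss' for p, succs in moves.items() if not succs}
    queue = deque(result)
    while queue:
        pos = queue.popleft()
        for prev in preds[pos]:
            if prev in result:
                continue
            if result[pos] == 'loss':
                result[prev] = 'win'        # prev can move into a lost position
                queue.append(prev)
            else:
                remaining[prev] -= 1        # one more move shown to lose
                if remaining[prev] == 0:
                    result[prev] = 'loss'   # every move hands the opponent a win
                    queue.append(prev)
    return {p: result.get(p, 'draw') for p in moves}

# Hypothetical toy graph: 'd' and 'e' can shuffle back and forth forever.
GRAPH = {
    'mate': [],
    'a': ['mate'],
    'b': ['a'],
    'c': ['b', 'd'],
    'd': ['e', 'a'],
    'e': ['d'],
}
print(retrograde(GRAPH))
```

Real tablebases apply the same backward propagation to every legal position with a given set of pieces, which is why the storage grows so quickly with each extra piece.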
It just might be solvable, but something bothers me:
Even if the entire tree could be traversed, there is still no way to predict the opponent's next move. We must always base our next move on the state of the opponent, and make the "best" move available. Then, based on the next state we do it again.
So, our optimal move might be optimal iff the opponent moves in a certain way. For some moves of the opponent our last move might have been sub-optimal.
I just fail to see how there could be a "perfect" move in every step.
For that to be the case, there must for every state [in the current game] be a path in the tree which leads to victory, regardless of the opponent's next move (as in tic-tac-toe), and I have a hard time figuring that.
Yes. In math, chess is classified as a determined game, which means there is a perfect algorithm for playing it. This is proven to be true even for an infinite chess board, so one day a fast, effective AI will probably find the perfect strategy, and the game will be gone.
More on this in this video: https://www.youtube.com/watch?v=PN-I6u-AxMg
There is also quantum chess, for which there is no mathematical proof that it is a determined game: http://store.steampowered.com/app/453870/Quantum_Chess/
Here is a more detailed look at quantum chess: https://chess24.com/en/read/news/quantum-chess
Of course
There are only 10 to the power of fifty possible combinations of pieces on the board. With that in mind, to play through every combination, you would need to make under 10 to the power of fifty moves (including repetitions, multiply that number by 3). So, there are fewer than ten to the power of one hundred moves in chess. Just pick those that lead to checkmate and you're good to go.
64-bit math (= chessboard) and bitwise operators (= next possible moves) are all you need. So simple. Brute force will usually find the best way. Of course, there is no universal algorithm for all positions. In real life the calculation is also limited in time; a timeout will stop it. A good chess program means heavy code (passed pawns, doubled pawns, etc.). Small code can't be very strong. Opening and endgame databases just save processing time; they are a kind of preprocessed data. The device, I mean the OS, threading possibilities, environment, and hardware, defines the requirements. The programming language is important. Anyway, the development process is interesting.
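To show what "64-bit math and bitwise operators" can look like, here is a small Python sketch that generates knight moves on a bitboard. The square numbering (a1 = bit 0, h8 = bit 63) and the helper names are assumptions of this example; engines differ in the conventions they use.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
NOT_A  = 0xFEFEFEFEFEFEFEFE   # every square except file a
NOT_AB = 0xFCFCFCFCFCFCFCFC   # every square except files a and b
NOT_H  = 0x7F7F7F7F7F7F7F7F   # every square except file h
NOT_GH = 0x3F3F3F3F3F3F3F3F   # every square except files g and h

def knight_attacks(knights):
    """Squares attacked by all knights in the bitboard, using shifts only."""
    moves = ((knights << 17) & NOT_A) | ((knights << 10) & NOT_AB)
    moves |= ((knights >> 6) & NOT_AB) | ((knights >> 15) & NOT_A)
    moves |= ((knights << 15) & NOT_H) | ((knights << 6) & NOT_GH)
    moves |= ((knights >> 10) & NOT_GH) | ((knights >> 17) & NOT_H)
    return moves & MASK64     # Python ints are unbounded, so clip to 64 bits

def square(name):
    """Convert 'e4'-style notation to a bit index (a1 = 0, h8 = 63)."""
    return (ord(name[0]) - ord('a')) + 8 * (int(name[1]) - 1)

def squares(bitboard):
    """List the names of all set squares, lowest bit first."""
    return [chr(ord('a') + i % 8) + str(i // 8 + 1)
            for i in range(64) if (bitboard >> i) & 1]

print(squares(knight_attacks(1 << square('b1'))))   # ['d2', 'a3', 'c3']
```

The file masks stop moves from wrapping around the edge of the board. A complete move generator also has to handle occupied squares, sliding pieces, and checks, which is where the "heavy code" mentioned above comes from.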