Iterative deepening or trial and error? - performance

I am coding a board game. I have generated a game tree using alpha-beta pruning and have 2 options:
Use iterative deepening, so that alpha-beta keeps searching one ply deeper until the time runs out.
Determine by trial and error the maximum depth reachable within the time limit for each board configuration, and search straight to that depth without first inspecting shallower plies.
Which approach is better and will make the search reach a greater depth? I know, for example, that at the beginning I can generate a tree of depth X consuming all the available time... Can iterative deepening add more depth?
Let me know if I can be clearer...

If your branching factor and the time your evaluation function needs stay about the same from round to round, you might be able to use option 2. But it sounds like that would be very difficult to guarantee.
Option 1 can dramatically increase alpha-beta's performance if you don't already have perfect move ordering. At every node you save the best move found during the previous search to depth d-1, and then tell alpha-beta to search that move first. The structure that stores these moves is called a transposition table, and if it is possible to implement one in your case, option 1 will dominate option 2.
The first time I heard of a transposition table I didn't think it could make much of a difference, but it increased my maximum search depth by 50%.
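To make the idea concrete, here is a minimal sketch of option 1 in Python. The state interface (state.moves(), state.apply(), state.key(), state.is_terminal()) and evaluate() are assumed placeholders rather than a real library, and a production version would also abort the deepest pass when the clock expires instead of only checking between passes:

    import time

    def alphabeta(state, depth, alpha, beta, maximizing, table):
        if depth == 0 or state.is_terminal():
            return evaluate(state), None          # heuristic score at the cutoff
        moves = list(state.moves())
        hint = table.get(state.key())             # best move from the previous pass
        if hint in moves:
            moves.remove(hint)
            moves.insert(0, hint)                 # ...so it is searched first
        best_move = None
        if maximizing:
            value = float('-inf')
            for m in moves:
                score, _ = alphabeta(state.apply(m), depth - 1, alpha, beta, False, table)
                if score > value:
                    value, best_move = score, m
                alpha = max(alpha, value)
                if alpha >= beta:
                    break                         # beta cutoff
        else:
            value = float('inf')
            for m in moves:
                score, _ = alphabeta(state.apply(m), depth - 1, alpha, beta, True, table)
                if score < value:
                    value, best_move = score, m
                beta = min(beta, value)
                if alpha >= beta:
                    break                         # alpha cutoff
        table[state.key()] = best_move            # remember for the next, deeper pass
        return value, best_move

    def iterative_deepening(root, seconds):
        table = {}                                # state key -> best move found so far
        deadline = time.monotonic() + seconds
        best, depth = None, 1
        while time.monotonic() < deadline:        # note: checks the clock between passes only
            _, best = alphabeta(root, depth, float('-inf'), float('inf'), True, table)
            depth += 1
        return best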


Game Tree Algorithms & Progressive Deepening: How to approximate an answer without reaching the leaf nodes?

I just saw this MIT lecture on game trees and minimax algorithms, where alpha-beta pruning and progressive deepening were discussed.
https://www.youtube.com/watch?v=STjW3eH0Cik
So, if I understand correctly, progressive deepening means approximating the answer at every level and going as deep toward the leaf nodes as the time limit for your move allows. The point is to have some answer available at any moment.
Now, at 36:22 the professor discusses the case where we don't have enough time and only get to the (d-1)th level, where d is the depth of the tree. He also suggests keeping a temporary answer at every level as we go down, so that we have some approximate answer at any point in time.
My question is: how can we have any answer without reaching the leaf nodes, when it is only at the leaf nodes that we can conclude who wins the game? Take tic-tac-toe. At the (d-1)th level we don't have enough information to decide whether the series of moves leading to a node will win or lose us the game. At higher levels, say (d-3), it's even blurrier! Everything is still possible further down, isn't it? So if an algorithm decides to compute only to the (d-1)th level, aren't all those paths equal? Nothing guarantees a win and nothing guarantees a loss at the (d-1)th level, because, if I understand correctly, wins and losses can only be determined at the leaf nodes. This is especially true of the pure minimax algorithm.
So how exactly are we going to have an 'approximate answer' at the (d-1)th, or say the (d-5)th, level?
I will try to explain this clearly.
Context and importance of progressive deepening
Keep in mind that in a real-world game, the time you can spend deciding on a move is limited (because of user experience and other human-computer interaction issues, or the design of your game). You have a game tree and can use different algorithms to optimize traversing it. But there are three problems:
You have a time constraint!
You need to compute the best move in the current game tree, and that computation time depends on the depth of the tree!
You need to decide whether to go deeper into the tree for a more precise answer without violating the time constraint.
The answer to all of these problems is progressive deepening: at the current level you compute an answer and then try to move on to the next level; if you run out of time, you already have the answer from the previous level and can return it.
The answer to your question
You can treat the current level of your tree as if it were the final level of the game tree, and compute the best move for that truncated tree. The value at a cut-off node is not a true win/loss label but the output of a static evaluation function, a heuristic that scores non-terminal positions (for tic-tac-toe, for example, counting the lines each player could still complete). That heuristic score is the 'approximate answer'. Then, if you can go to the next level, go now! But compute the optimal answer for the current truncated tree first, as an insurance policy in case the time constraint stops you from finishing the calculation at the next level.
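As a concrete illustration of both points, here is a minimal Python sketch. The game-state interface (state.moves(), state.apply(), state.is_terminal(), state.utility()) and score_heuristic() are assumed placeholders, not code from the lecture:

    def minimax_limited(state, depth, maximizing):
        # Exact values exist only at true leaves; everywhere else the depth
        # budget may run out first, and we fall back to a heuristic score.
        if state.is_terminal():
            return state.utility()            # real win/loss/draw value
        if depth == 0:
            return score_heuristic(state)     # the 'approximate answer'
        children = [state.apply(m) for m in state.moves()]
        if maximizing:
            return max(minimax_limited(c, depth - 1, False) for c in children)
        return min(minimax_limited(c, depth - 1, True) for c in children)

    def best_move_progressive(state, max_depth):
        # Progressive deepening: after each completed depth we hold a full,
        # usable answer, so the clock can interrupt us at any moment.
        best = None
        for depth in range(1, max_depth + 1):
            best = max(state.moves(),
                       key=lambda m: minimax_limited(state.apply(m), depth - 1, False))
        return best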

Optimal algorithm to find exit of a maze with no information

I have to determine a way for a robot to get out of a maze. The catch is that the layout of the maze is unknown, the position of the exit is unknown, and the robot also starts at an unknown position in the maze.
I found 3 solutions, but I have a hard time knowing which one I should use, because in the end it seems the outcomes will be largely down to chance anyway.
These are the 3 solutions:
1) The basic "human" strategy(?), where you put your hand on a wall and follow it through the whole maze if necessary. I also keep a "turn counter" variable to avoid situations where the robot loops.
2) Depth-first search
3) Making the robot choose a direction randomly
The random one seems the worst, because it could take forever to find the exit (though on the other hand it could also be the fastest...). I'm not sure about the other two, though.
Also, is there a way to use some kind of heuristic? Again, the lack of information makes me think it's impossible, but maybe I'm missing something.
Last thing: when the robot finds the exit, it will have to go back to its start position using A*. This means that during the first part, while looking for the exit, it will have drawn a map of the maze that it will use for the second part. Maybe this can also help choose the best algorithm for the first part, but I don't see why one would be better.
Could someone help me please? Thanks. (Also, sorry for my English.)
Problems like this are categorised as real-time search; perhaps the best-known example is Learning Real-Time A* (LRTA*), which combines information about what you've seen before (whether you've had to backtrack, or know a cheaper way to reach a state) with the actions you can take. As in areas like reinforcement learning, some level of randomness helps balance exploration and exploitation.
Assuming your graph is undirected and time-invariant, and the initial and exit nodes lie in the same component, choosing a direction at random at each vertex is equivalent to a random walk on a graph.
Regardless of whether the graph is initially known or not, this is a very well understood field of mathematics. Such a walk is equivalent to an absorbing Markov chain, and the time to reach the exit state has a discrete phase-type distribution. It is often quite slow, but it's also worth noting that in pathological cases it's possible to design a maze where a random walk will outperform DFS.
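To make the absorbing-chain claim concrete, here is a minimal Python/NumPy sketch (illustrative only; the 4-cell corridor and its transition matrix are invented for the example). The fundamental matrix N = (I - Q)^-1 gives the expected number of steps before absorption:

    import numpy as np

    # Toy corridor maze: cells 0-1-2 are transient, cell 3 is the absorbing
    # exit. From the dead end (cell 0) the walker must step to 1; elsewhere
    # it moves to either neighbour with probability 1/2.
    Q = np.array([[0.0, 1.0, 0.0],
                  [0.5, 0.0, 0.5],
                  [0.0, 0.5, 0.0]])    # transient-to-transient transitions

    # Row sums of N are the expected steps to reach the exit from each cell.
    N = np.linalg.inv(np.eye(3) - Q)
    print(N @ np.ones(3))              # [9. 8. 5.]: slow even in a tiny maze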
#beaker is right that the first two you suggested should lead to the same result. However, you may be able to improve the search a little by keeping track of any loops you find: if the robot reaches a spot it has already visited and later needs to backtrack from a dead end, it may not need to go back as far if it has found a shortcut. Also, use the segments that were mapped on the way out and apply Dijkstra's algorithm or A* to them to find the most efficient way back. There may be a faster route along an unexplored path, but this is the safest way to get a quick result.
Obviously, implementing the loop checks to prevent unneeded backtracking will make things more complicated. The return trip using Dijkstra's algorithm, though, should not be as complex.
If you are feeling ambitious, now that the exit has been found you could use this information to give the robot a sense of direction, though in a randomly generated maze that may not help much.
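For the return trip, a minimal Dijkstra sketch over the map recorded on the way out (Python; adj, the robot's discovered adjacency dict, and the cell naming are assumptions for illustration):

    import heapq

    def shortest_way_back(adj, exit_cell, start_cell):
        # Dijkstra restricted to the explored part of the maze; 'adj' maps
        # each visited cell to the neighbours discovered during exploration.
        dist = {exit_cell: 0}
        prev = {}
        heap = [(0, exit_cell)]
        while heap:
            d, u = heapq.heappop(heap)
            if u == start_cell:
                break
            if d > dist.get(u, float('inf')):
                continue                      # stale heap entry
            for v in adj[u]:
                nd = d + 1                    # unit cost per corridor step
                if nd < dist.get(v, float('inf')):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(heap, (nd, v))
        path, node = [start_cell], start_cell
        while node != exit_cell:              # walk the predecessor chain
            node = prev[node]
            path.append(node)
        return path[::-1]                     # route from the exit back to the start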

Minimax algorithm with memoization?

I am trying to implement a Connect Four AI using the minimax algorithm in JavaScript. Currently, it is very slow. Other than alpha-beta pruning, which I will implement, I was wondering whether it is worth it to hash game states to
their heuristic evaluations
the next best move
I can immediately see why 2 would be useful, since there are many ways to reach the same game state, but I am wondering whether I also have to hash the current depth to make this work.
For example, if I reached this state at a depth of 3 (so, say, only 4 more moves to look ahead) versus at a depth of 2 with 5 moves to look ahead, I might arrive at a different answer. Doesn't this mean I should take the depth into account in the hash?
My second question is whether hashing boards to their evaluations is worth it. It takes me O(n) time to build my hash and O(n) time to evaluate a board (though with a constant factor more like 2 or 3). Are game states usually hashed to their evaluations, or is this overkill? Thanks for any help.
Whenever you hash the value of a state (computed using heuristics), you need to record the depth at which that state was evaluated. This is because there is a big difference between "the value is 0.1 at depth 1" and "the value is 0.1 at depth 20". In the first case we have barely investigated the space, so we are quite unsure what happens; in the second case we have already done a huge amount of work, so we know much better what we are talking about.
The catch is that for some games, chess for example, we cannot tell the depth just by looking at a position. But in Connect 4 we can: the depth of a position is simply the number of discs that have been played (a position with 14 discs is at depth 14). So you do not need to store the depth.
As for whether you should actually hash the state or re-evaluate it: a position in this game can clearly be reached through many different move orders, so you would expect the hash to be helpful. The important question is the trade-off between the cost of creating and looking up hash entries and how expensive your evaluation function is. If the evaluation looks like it does a lot of work, hash it and benchmark.
One last suggestion. You mentioned alpha-beta pruning, which is more helpful than hashing at your stage (and not that hard to implement). You can go further and implement move ordering for your alpha-beta. If I were you, I would do that, and only after that would I implement hashing.
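For illustration, a minimal sketch of the memoization being discussed, searching to terminal states so each position has a single well-defined value (Python rather than the asker's JavaScript; board.key(), board.moves(), board.apply(), board.utility() are an assumed game interface, not a real library):

    def minimax_memo(board, maximizing, cache):
        # The cache is keyed on the position alone: in Connect 4 the depth
        # is implied by the number of discs on the board (see above), and
        # the side to move follows from the same parity.
        key = board.key()                      # e.g. the 42-cell board string
        if key in cache:
            return cache[key]                  # transposition: seen before
        if board.is_terminal():
            value = board.utility()            # +1 win, 0 draw, -1 loss
        elif maximizing:
            value = max(minimax_memo(board.apply(m), False, cache)
                        for m in board.moves())
        else:
            value = min(minimax_memo(board.apply(m), True, cache)
                        for m in board.moves())
        cache[key] = value
        return value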

Why is chess, checkers, Go, etc. in EXP but conjectured to be in NP?

If I tell you the moves for a game of chess and declare who wins, why can't it be checked in polynomial time whether the winner really wins? From my understanding, that would make it an NP problem.
First of all: the number of positions you can set up with 32 pieces on an 8x8 board is limited. We also need to account for any pawn being promoted to any other piece and include every such position. Of course, among all of these there are some positions that cannot be reached by following the rules of chess, but that does not matter. The important thing is: we have a limit. Let's simply call this limit MaxPositions.
Now for any given position, let's build up a tree as follows:
The given position is the root.
Add every position (legal chess position or not) as a child.
For each of these children, add every position as a child again.
Continue this way until your tree reaches a depth of MaxPositions.
I'm too tired right now to work out whether the idea (proof?) needs one additional level of depth, but heck, let's just add it anyway. The important thing is: the tree constructed this way is finite.
Next step: from this tree, remove any subtree that is not reachable from the root via legal chess moves. Repeat this step for the remaining children, grandchildren, ..., until no unreachable position is left anywhere in the tree. The number of steps must be finite, as the tree is finite.
Now do a breadth-first search and turn any node into a leaf if its position has been encountered before; it must be marked as such (a draw candidate, via repetition?). Do the same for any mate position.
How do we find out whether there is a forced mate? In any subtree, if it is your turn, there must be at least one child leading to a forced mate. If it is the opponent's move, every child must lead to a forced mate. This applies recursively, of course. However, as the tree is finite, this whole algorithm is finite.
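That recursion is plain backward induction over the finite tree; here is a minimal Python sketch, assuming a placeholder position interface (is_mate_for_me(), moves(), apply()) and relying on the repetition-leaf trick above for termination:

    def forced_mate(position, my_turn):
        # Backward induction over a finite tree: on my move, one mating
        # child suffices; on the opponent's move, all children must mate.
        if position.is_mate_for_me():
            return True                # terminal: the opponent is checkmated
        moves = position.moves()
        if not moves:
            return False               # stalemate/draw leaf: no forced mate
        children = [position.apply(m) for m in moves]
        if my_turn:
            return any(forced_mate(c, False) for c in children)
        return all(forced_mate(c, True) for c in children)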
[censored], this whole algorithm is finite! There is some constant bounding the whole thing. So although the limit is incredibly high (and far beyond what up-to-date hardware can handle), it is a limit (please do not ask me to calculate it...). So: our problem is actually O(1)!!!
The same goes for checkers, Go, ...
So far this covers the forced mate. What about the best move? First, check whether we can find a forced mate. If so, fine: we have found the best move. If there are several, select the one that needs the fewest moves (there might still be more than one...).
If there is no forced mate, then we need some way to measure the 'best' move. Possibly count the number of remaining continuations that lead to mate. Other proposals for a measure? As long as we operate on this tree from the top down, everything remains finite. So again, we are O(1).
Now what did we miss? Have a look at the link in your comment again. They are talking about NxN checkers! The author is varying the size of the board!
So look back at how we constructed the tree. I think it is obvious that the tree grows exponentially with the size of the board (try to prove it yourself...).
I know very well that this answer is not a proof that the problem is EXPTIME. Actually, I admit, it is not really an answer at all. But I think what I illustrated still gives quite a good impression of the complexity of the problem. And as long as no one provides a better answer, I dare to claim that this is better than nothing at all...
Addendum, considering your comment:
Allow me to refer to Wikipedia. Actually, it should be sufficient to transform the other problem in exponential time, not polynomial time as in the link, since applying the transformation and then solving the resulting problem still remains exponential overall. But I'm not sure about the exact definition...
It is sufficient to show this for a problem that is already known to be EXPTIME-complete (transforming any other problem to that one and then to the chess problem remains exponential, provided both transformations are exponential).
Apparently, J. M. Robson found a way to do this for NxN checkers. It must be possible for generalized chess too, probably by simply modifying Robson's algorithm. I do not think it is possible for classical 8x8 chess, though...
O(1) applies to classical chess only, not to generalized chess. But it is the latter that we conjecture not to be in NP! Actually, my answer up to this addendum is missing one proof: that the size of the finite tree (for fixed N) grows no faster than exponentially with N (so the answer really is incomplete!).
And to prove that generalized chess is not in NP, we would have to prove that there is no polynomial algorithm solving the problem on a non-deterministic Turing machine. This I leave open again, so my answer remains even less complete...
If I tell you the moves for a game of chess and declare who wins, why can't it be checked in polynomial time if the winner does really win? This would make it an NP problem from my understanding.
Because in order to check whether the winner (White) really wins, you would also have to evaluate all the possible moves the loser (Black) could have made in order to win instead. That makes the checking exponential as well.

Algorithm for Connect 4 Evaluation of Data Set

I am working on a Connect 4 AI, and saw that many people were using this data set, containing all the legal positions at 8 ply and their eventual outcomes.
I am using a standard minimax with alpha-beta pruning as my search algorithm. It seems like this data set could be really useful for my AI. However, I'm trying to find the best way to implement it. I thought the best approach might be to process the list and use the board state as a hash key for the eventual result (win, loss, draw).
What is the best way to design an AI to use a data set like this? Is my idea of hashing the board state and using it in a traditional search algorithm (e.g. minimax) on the right track, or is there a better way?
Update: I ended up converting the large move database to a plain text format, where 1 represented X and -1 O. Then I used a string of the board state as the key and an integer representing the eventual outcome as the value, and put them in a std::unordered_map (see Stack Overflow With Unordered Map for a problem I ran into). The performance of the map was excellent: it built quickly, and the lookups were fast. However, I never quite got the search right. Is the right way to approach the problem to just search the database when the number of turns in the game is less than 8, and then switch over to regular alpha-beta?
Your approach seems correct.
For the first 8 moves, use the alpha-beta algorithm, with the look-up table supplying the value of each node at depth 8.
Once you have "exhausted" the table (exceeded 8 moves in the game), you should switch to the regular alpha-beta algorithm, which ends at terminal states (leaves of the game tree).
This is extremely helpful because:
Remember that the complexity of searching the tree is O(B^d), where B is the branching factor (the number of possible moves per state) and d is the depth needed to reach the end.
By using this approach you effectively decrease both B and d for the worst-case waiting times (the longest move calculations), because:
Your maximum depth shrinks significantly to d-8 (search is only needed for the later moves), effectively decreasing d!
The branching factor itself tends to shrink in this game after a few moves (many moves become impossible or lead to defeat and should not be explored); this decreases B.
For the first move, you also shrink the number of developed nodes to B^8 instead of B^d.
So, because of all this, the maximal waiting time decreases significantly with this approach.
Also note: if you find this optimization insufficient, you can always expand your look-up table (to the first 9, 10, ... moves). Of course, the space needed grows exponentially; this is a trade-off you need to examine to choose what best serves your needs (even storing the table in the file system, if main memory is not enough, could be considered).
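A minimal sketch of this two-phase search in Python (the board and book interfaces - ply(), key(), moves(), apply(), utility() - are assumed placeholders, not part of the data set or any real library): during the first 8 moves the recursion bottoms out at ply 8 with an exact table lookup, and once the game is past ply 8 the same routine runs as plain alpha-beta down to terminal states.

    BOOK_PLY = 8  # the data set scores every legal position at exactly 8 ply

    def search(board, alpha, beta, maximizing, book):
        # Book phase: at ply 8 the exact outcome is a table lookup.
        if board.ply() == BOOK_PLY:
            return book[board.key()]          # win = 1, draw = 0, loss = -1
        if board.is_terminal():
            return board.utility()
        if maximizing:
            value = float('-inf')
            for m in board.moves():
                value = max(value, search(board.apply(m), alpha, beta, False, book))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break                     # beta cutoff
        else:
            value = float('inf')
            for m in board.moves():
                value = min(value, search(board.apply(m), alpha, beta, True, book))
                beta = min(beta, value)
                if alpha >= beta:
                    break                     # alpha cutoff
        return value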
