Algorithm for the game of Chomp

I am writing a program for the game of Chomp. You can read the description of the game on Wikipedia; however, I'll describe it briefly anyway.
We play on a chocolate bar of dimensions n x m, i.e. the bar is divided into n x m squares. At each turn the current player chooses a square and eats everything below and to the right of the chosen square. So, for example, picking a square in the middle of the bar and eating the whole rectangular block below and to the right of it is a valid first move.
The objective is to force your opponent to eat the last piece of chocolate (it is poisoned).
Concerning the AI part, I used a minimax algorithm with depth truncation. However, I can't come up with a suitable position evaluation function. The result is that, with my evaluation function, it is quite easy for a human player to win against my program.
Can anyone:
suggest a good position evaluation function or
provide some useful reference or
suggest an alternative algorithm?

How big are your boards?
If your boards are fairly small then you can solve the game exactly with dynamic programming. In Python:
n, m = 6, 6
init = frozenset((x, y) for x in range(n) for y in range(m))

def moves(board):
    # A move picks a remaining square (px, py) and eats every square with
    # x >= px and y >= py; only squares above or to the left survive.
    return [frozenset((x, y) for (x, y) in board if x < px or y < py)
            for (px, py) in board]

@memoize
def wins(board):
    if not board: return True
    return any(not wins(move) for move in moves(board))
The function wins(board) computes whether the board is a winning position for the player to move. The board representation is a set of tuples (x,y) indicating whether piece (x,y) is still on the board. The function moves computes the list of boards reachable in one move.
The logic behind the wins function works like this. If the board is empty on our move the other player must have eaten the last piece, so we won. If the board is not empty then we can win if there is any move we can do such that the resulting position is a losing position (i.e. not winning i.e. not wins(move)), because then we got the other player into a losing position.
You'll also need the memoize helper function which caches the results:
def memoize(f):
    cache = dict()
    def memof(x):
        try:
            return cache[x]
        except KeyError:
            cache[x] = f(x)
            return cache[x]
    return memof
By caching we only compute who is the winner for a given position once, even if this position is reachable in multiple ways. For example the position of a single row of chocolate can be obtained if the first player eats all remaining rows in his first move, but it can also be obtained through many other series of moves. It would be wasteful to compute who wins on the single row board again and again, so we cache the result. This improves the asymptotic performance from something like O((n*m)^(n+m)) to O((n+m)!/(n!m!)), a vast improvement though still slow for large boards.
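As a quick sanity check of the pieces above (my own usage sketch, not part of the original answer):

board = frozenset((x, y) for x in range(2) for y in range(2))
print(wins(board))                  # True: the player to move wins on a 2x2 bar
print(wins(frozenset([(0, 0)])))    # False: only the poisoned piece is left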
And here is a debugging printing function for convenience:
def show(board):
    # Print the board, one row per line, with 'x' where a piece remains.
    for x in range(n):
        print('|' + ''.join('x' if (x, y) in board else ' ' for y in range(m)))
This code is still fairly slow because the code isn't optimized in any way (and this is Python...). If you write it in C or Java efficiently you can probably improve performance over 100 fold. You should easily be able to handle 10x10 boards, and you can probably handle up to 15x15 boards. You should also use a different board representation, for example a bit board. Perhaps you would even be able to speed it up 1000x if you make use of multiple processors.
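As a sketch of the compact-representation idea (my own illustration, not from the answer above): any reachable Chomp position is a staircase, so it can be stored as a tuple of non-increasing row lengths instead of a set of squares, which makes hashing and memoization much cheaper:

from functools import lru_cache

def chomp(rows, r, c):
    # Eat at square (r, c): every row from r downwards keeps at most c squares.
    new = [length if i < r else min(length, c) for i, length in enumerate(rows)]
    return tuple(length for length in new if length > 0)

@lru_cache(maxsize=None)
def wins(rows):
    # rows is a tuple of non-increasing row lengths, e.g. (6, 6, 6) for a 3x6 bar.
    if not rows: return True
    return any(not wins(chomp(rows, r, c))
               for r in range(len(rows))
               for c in range(rows[r]))

print(wins((6,) * 6))   # True: the first player wins on a 6x6 board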
Here is a derivation from minimax
We'll start with minimax:
def minimax(board, depth):
    if depth > maxdepth: return heuristic(board)
    else:
        alpha = -1
        for move in moves(board):
            alpha = max(alpha, -minimax(move, depth + 1))
        return alpha
We can remove the depth checking to do a full search:
def minimax(board):
    if game_ended(board): return heuristic(board)
    else:
        alpha = -1
        for move in moves(board):
            alpha = max(alpha, -minimax(move))
        return alpha
Because the game ended, heuristic will return either -1 or 1, depending on which player won. If we represent -1 as false and 1 as true, then max(a,b) becomes a or b, and -a becomes not a:
def minimax(board):
    if game_ended(board): return heuristic(board)
    else:
        alpha = False
        for move in moves(board):
            alpha = alpha or not minimax(move)
        return alpha
You can see this is equivalent to:
def minimax(board):
    if not board: return True
    return any([not minimax(move) for move in moves(board)])
If we had instead started with minimax with alpha-beta pruning:
def alphabeta(board, alpha, beta):
    if game_ended(board): return heuristic(board)
    else:
        for move in moves(board):
            alpha = max(alpha, -alphabeta(move, -beta, -alpha))
            if alpha >= beta: break
        return alpha

# start the search:
alphabeta(initial_board, -1, 1)
The search starts out with alpha = -1 and beta = 1. As soon as alpha becomes 1, the loop breaks. So we can assume that alpha stays -1 and beta stays 1 in the recursive calls. So the code is equivalent to this:
def alphabeta(board, alpha, beta):
    if game_ended(board): return heuristic(board)
    else:
        for move in moves(board):
            alpha = max(alpha, -alphabeta(move, -1, 1))
            if alpha == 1: break
        return alpha

# start the search:
alphabeta(initial_board, -1, 1)
So we can simply remove the parameters, as they are always passed in as the same values:
def alphabeta(board):
    if game_ended(board): return heuristic(board)
    else:
        alpha = -1
        for move in moves(board):
            alpha = max(alpha, -alphabeta(move))
            if alpha == 1: break
        return alpha

# start the search:
alphabeta(initial_board)
We can again do the switch from -1 and 1 to booleans:
def alphabeta(board):
    if game_ended(board): return heuristic(board)
    else:
        alpha = False
        for move in moves(board):
            alpha = alpha or not alphabeta(move)
            if alpha: break
        return alpha
So you can see this is equivalent to using any with a generator which stops the iteration as soon as it has found a True value instead of always computing the whole list of children:
def alphabeta(board):
    if not board: return True
    return any(not alphabeta(move) for move in moves(board))
Note that here we have any(not alphabeta(move) for move in moves(board)) instead of any([not minimax(move) for move in moves(board)]). This speeds up the search by about a factor of 10 for reasonably sized boards. Not because the first form is faster in itself, but because it allows us to skip the entire rest of the loop, including the recursive calls, as soon as we have found a value that's True.
So there you have it, the wins function was just alphabeta search in disguise. The next trick we used for wins is to memoize it. In game programming this would be called using "transposition tables". So the wins function is doing alphabeta search with transposition tables. Of course it's simpler to write down this algorithm directly instead of going through this derivation ;)

I don't think a good position evaluation function is possible here, because unlike games like chess, there's no 'progress' short of winning or losing. The Wikipedia article suggests an exhaustive solution is practical for modern computers, and I think you'll find this is the case, given suitable memoization and optimisation.
A related game you may find of interest is Nim.

Related

Write a recurrence for the probability that the champion retains the title?

Problem:
The traditional world chess championship is a match of 24 games. The current champion retains the title in case the match is a tie. Each game ends in a win, loss, or draw (tie) where wins count as 1, losses as 0, and draws as 1/2. The players take turns playing white and black. White plays first and so has an advantage. The champion plays white in the first game. The champ has probabilities ww, wd, and wl of winning, drawing, and losing playing white, and has probabilities bw, bd, and bl of winning, drawing, and losing playing black.
(a) Write a recurrence for the probability that the champion retains the title. Assume that there are g games left to play in the match and that the champion needs to get i points (which may be a multiple of 1/2).
I found this problem in Skiena's Algorithm Design Manual and tried to solve it.
My take on this problem: at the i-th game, the champion must either win, lose, or draw. We have the prior probabilities ww, wd, and wl (since the champion starts with white, I am considering only ww, wd, and wl).
So the probability that the champion will win is the current outcome weight multiplied by the prior probability:
P_win(i) = ∑(prior_p * current_outcome_weight) = ((ww*win_weight) + (wd*draw_weight) + (wl*loss_weight))/3
The outcome weights cover one of three cases:
if the champion wins: win_weight = 1, draw_weight = 0, loss_weight = 0
if the champion loses: win_weight = 0, draw_weight = 0, loss_weight = 0
if the champion draws: win_weight = 0, draw_weight = 1/2, loss_weight = 0
Is there any better way to write the recurrence for this problem?
You basically need to "guess" the outcome of the current game: the champion wins, loses, or draws. Recurse on each case, modifying i according to win/loss/draw and always reducing g.
The stop clauses are: the champion has already won (no more points needed), or has lost (no more games left).
(Note: the conditions are evaluated in order, so if g == 0 and i == 0, only the i <= 0 case applies. Which color the champion plays depends only on how many games are left: the champion has white in odd-numbered games, and with g games left the next game is number 25 - g, so the champion plays white exactly when g is even.)
Sol(g, i) = {
    i <= 0:     1
    g == 0:     0
    g % 2 == 0: ww*Sol(g-1, i-1) + wd*Sol(g-1, i-1/2) + wl*Sol(g-1, i)
    g % 2 == 1: bw*Sol(g-1, i-1) + bd*Sol(g-1, i-1/2) + bl*Sol(g-1, i)
}
Quick sanity test, if there is a final match left, and the champion must win it - then the probability of him winning is bw. In this case you need to evaluate: Sol(1, 1) = bw*Sol(0,0) + bd*Sol(0,1/2) + bl*Sol(0,1) = bw*1 + bd*0 + bl*0 = bw
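To double-check the recurrence, here is a small memoized Python version (my own sketch; the probabilities are made-up placeholders). Points are counted in half-point units so the cache keys stay integral:

from functools import lru_cache

# Hypothetical probabilities, for illustration only; each triple sums to 1.
ww, wd, wl = 0.4, 0.4, 0.2   # champion playing white: win, draw, lose
bw, bd, bl = 0.2, 0.5, 0.3   # champion playing black: win, draw, lose

@lru_cache(maxsize=None)
def sol(g, i2):
    # g = games left, i2 = points still needed, in half-point units.
    if i2 <= 0: return 1.0    # champion already has enough points
    if g == 0: return 0.0     # no games left, title lost
    w, d, l = (ww, wd, wl) if g % 2 == 0 else (bw, bd, bl)
    return w*sol(g-1, i2-2) + d*sol(g-1, i2-1) + l*sol(g-1, i2)

print(sol(1, 2))   # one game left, one point needed, playing black -> bw = 0.2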

Checking for (x, y) in two 1D arrays, one for x and one for y, in Ruby

INTRO:
I am trying to program Snake in Ruby in order to get more familiar with Ruby, which I just recently started to learn because of its reputation.
So I saved all the coordinates of the snake in two arrays, one for all the X coordinates and one for all the Y coordinates, like this: x = [2, 0, 0] and y = [2, 0, 0]. When the snake eats food, a new value is added to both arrays like this: x[f] = x[f-1] and y[f] = y[f-1], so every part of the snake inherits the position of the part before it. So far so good. But then I realized that the food, which is placed through the rand() command, sometimes ends up on the same position as a part of the snake.
PROBLEM:
That is where my problem is. I tried to fix it like this:
while randomNumberIsSnake == true
  if $x.include? randomX
    if $y.include? randomY
      randomX = 2 + rand(18)
      randomY = 2 + rand(18)
    else
      randomNumberIsSnake = false
    end
  else
    randomNumberIsSnake = false
  end
end
This checks whether the X coordinate of the food is equal to the X coordinate of a part of the snake (in the array). If that is true, then it also checks whether the Y coordinate is in the Y array. It will then get a new number until that isn't the case anymore.
It seemed to work just right till I looked at it again and found the bug: if $y.include? randomY and $x.include? randomX both return true, it doesn't necessarily mean that the food is on the same coordinates as a part of the snake. Rather, if the food was at (4,4) and the snake has a part on (4,8) and a part on (8,4), it will also return true. Which creates a situation like this:
-#-----
-#-#---
-#-----
-####--
Here the connected line of #'s is the snake and the lone # in the second row is the food: it shares an X coordinate with one part of the snake and a Y coordinate with another, so the check returns true even though the food is clearly not on the snake.
QUESTION:
Is there a way to rephrase the code in order to avoid that, or would that require having only one 2D array instead of two 1D arrays? Like I mentioned earlier, I am still a learner in Ruby, and therefore any help would be highly appreciated.
You could write your checker like this:
coordinates = loop do
  randomX = 2 + rand(18)
  randomY = 2 + rand(18)
  randomNumberIsSnake = false
  $x.size.times do |index|
    randomNumberIsSnake = true if $x[index] == randomX && $y[index] == randomY
  end
  break [randomX, randomY] unless randomNumberIsSnake
end
Then your coordinates local variable will hold these values.
But I'd recommend storing all the cells that are free at the current moment. Then take one at random and remove it from that array; if at some step the array is empty, there are no free cells at all. This solution is practical as long as the number of cells isn't huge.
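A rough sketch of that idea (shown in Python for brevity; the bounds mirror the 2 + rand(18) range from the question):

import random

# All grid cells the game uses: 2 + rand(18) yields coordinates 2..19.
all_cells = [(x, y) for x in range(2, 20) for y in range(2, 20)]

snake = [(2, 2), (3, 2), (4, 2)]   # hypothetical snake segments
occupied = set(snake)

# Sample the food uniformly from the cells the snake does not occupy.
free = [cell for cell in all_cells if cell not in occupied]
food = random.choice(free) if free else None   # None means the board is full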

How to account for position's history in transposition tables

I'm currently developing a solver for a trick-based card game called Skat in a perfect information situation. Although most of the people may not know the game, please bear with me; my problem is of a general nature.
Short introduction to Skat:
Basically, each player plays one card in turn, and every three cards form a trick. Every card has a specific value. The score that a player has achieved is the result of adding up the value of every card contained in the tricks that the respective player has won. I left out certain things that are unimportant for my problem, e.g. who plays against whom or when a trick is won.
What we should keep in mind is that there is a running score, and when investigating a certain position, who played what before (i.e. its history) is relevant to that score.
I have written an alpha beta algorithm in Java which seems to work fine, but it's way too slow. The first enhancement that seems the most promising is the use of a transposition table. I read that when searching the tree of a Skat game, you will encounter a lot of positions that have already been investigated.
And that's where my problem comes into play: If I find a position that has already been investigated before, the moves leading to this position have been different. Therewith, in general, the score (and alpha or beta) will be different, too.
This leads to my question: How can I determine the value of a position, if I know the value of the same position, but with a different history?
In other words: How can I decouple a subtree from its path to the root, so that it can be applied to a new path?
My first impulse was it's just not possible, because alpha or beta could have been influenced by other paths, which might not be applicable to the current position, but...
There already seems to be a solution
...that I don't seem to understand. In Sebastian Kupferschmid's master's thesis about a Skat solver, I found this piece of code (maybe C-ish / pseudocode?):
def ab_tt(p, alpha, beta):
    if p isa Leaf:
        return 0
    if hash.lookup(p, val, flag):
        if flag == VALID:
            return val
        elif flag == LBOUND:
            alpha = max(alpha, val)
        elif flag == UBOUND:
            beta = min(beta, val)
        if alpha >= beta:
            return val
    if p isa MAX_Node:
        res = alpha
    else:
        res = beta
    for q in succ(p):
        if p isa MAX_Node:
            succVal = t(q) + ab_tt(q, res - t(q), beta - t(q))
            res = max(res, succVal)
            if res >= beta:
                hash.add(p, res, LBOUND)
                return res
        elif p isa MIN_Node:
            succVal = t(q) + ab_tt(q, alpha - t(q), res - t(q))
            res = min(res, succVal)
            if res <= alpha:
                hash.add(p, res, UBOUND)
                return res
    hash.add(p, res, VALID)
    return res
It should be pretty self-explanatory. succ(p) is a function that returns every possible move at the current position. t(q) is what I believe to be the running score of the respective position (the points achieved so far by the declarer).
Since I don't like copying stuff without understanding it, this should just be an aid for anyone who would like to help me out. Of course, I have given this code some thought, but I can't wrap my head around one thing: By subtracting the current score from alpha/beta before calling the function again [e.g. ab_tt(q, res - t(q), beta - t(q))], there seems to be some kind of decoupling going on. But what exactly is the benefit if we store the position's value in the transposition table without doing the same subtraction right here, too? If we found a previously investigated position, how come we can just return its value (in case it's VALID) or use the bound value for alpha or beta? The way I see it, both storing and retrieving values from the transposition table won't account for the specific histories of these positions. Or will it?
Literature:
There are almost no English sources out there that deal with AI in Skat games, but I found this one: A Skat Player Based on Monte Carlo Simulation by Kupferschmid and Helmert. Unfortunately, the whole paper, and especially the elaboration on transposition tables, is rather compact.
Edit:
So that everyone can imagine better how the score develops throughout a Skat game until all cards have been played, here's an example. The course of the game is displayed in the lower table, one trick per line. The actual score after each trick is on its left side, where +X is the declarer's score (-Y is the defending team's score, which is irrelevant for alpha-beta). As I said, the winner of a trick (declarer or defending team) adds the value of each card in this trick to their score.
The card values are:
Rank:   J   A  10   K   Q   9   8   7
Value:  2  11  10   4   3   0   0   0
I solved the problem. Instead of doing weird subtractions upon each recursive call, as suggested by the reference in my question, I subtract the running score from the resulting alpha-beta value only when storing a position in the transposition table:
For exact values (the position hasn't been pruned):
transpo.put(hash, new int[] { TT_VALID, bestVal - node.getScore()});
If the node caused a beta-cutoff:
transpo.put(hash, new int[] { TT_LBOUND, bestVal - node.getScore()});
If the node caused an alpha-cutoff:
transpo.put(hash, new int[] { TT_UBOUND, bestVal - node.getScore()});
Where:
transpo is a HashMap<Long, int[]>
hash is the long value representing that position
bestVal is either the exact value or the value that caused a cutoff
TT_VALID, TT_LBOUND and TT_UBOUND are simple constants, describing the type of transposition table entry
However, this didn't work per se. After posting the same question on gamedev.net, a user named Álvaro gave me the deciding hint:
When storing exact scores (TT_VALID), I should only store positions that improved alpha.
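To make the idea concrete, here is a minimal sketch of path-independent entries (in Python with hypothetical names; the actual solver is in Java): values are stored relative to the node's running score, and the current path's running score is added back on lookup.

TT_VALID, TT_LBOUND, TT_UBOUND = 0, 1, 2
table = {}   # position hash -> (flag, value relative to the running score)

def store(key, flag, best_val, running_score):
    table[key] = (flag, best_val - running_score)   # strip this path's history

def probe(key, running_score):
    if key in table:
        flag, rel = table[key]
        return flag, rel + running_score            # re-attach the new history
    return None, None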

Distributing points over a surface within boundaries

I'm interested in a way (an algorithm) of distributing a predefined number of points over a four-sided surface like a square.
The main issue is that each point has to have a minimum and a maximum proximity to the others (random between two predefined values). Basically, the distance between any two points should not be closer than, let's say, 2, or farther than 3.
My code will be implemented in Ruby (the points are locations, the surface is a map), but any ideas or snippets are definitely welcome, as all my ideas include a fair amount of brute force.
Try this paper. It has a nice, intuitive algorithm that does what you need.
In our modelization, we adopted another model: we consider each center to be related to all its neighbours by a repulsive string. At the beginning of the simulation, the centers are randomly distributed, as well as the strengths of the strings. We choose randomly to move one center; then we calculate the resulting force caused by all neighbours of the given center, and we calculate the displacement which is proportional and oriented in the sense of the resulting force. After a certain number of iterations (which depends on the number of centers and the degree of initial randomness) the system becomes stable.
In case it is not clear from the figures, this approach generates uniformly distributed points. You may use instead a force that is zero inside your bounds (between 2 and 3, for example) and non-zero otherwise (repulsive if the points are too close, attractive if too far).
This is my Python implementation (sorry, I don't know Ruby). Just import this and call uniform() to get a list of points.
import numpy as np
from numpy.linalg import norm

# find the n nearest neighbors (brute force)
def neighbors(x, X, n=10):
    dX = X - x
    d = dX[:,0]**2 + dX[:,1]**2
    idx = np.argsort(d)
    return X[idx[1:n+1]]

# repulsion force, normalized to 1 when d == rmin
def repulsion(neib, x, d, rmin):
    if d == 0:
        return np.array([1, -1])
    return 2*(x - neib)*rmin/(d*(d + rmin))

def attraction(neib, x, d, rmax):
    return rmax*(neib - x)/(d**2)

def uniform(n=25, rmin=0.1, rmax=0.15):
    # Generate randomly distributed points
    X = np.random.random_sample((n, 2))
    # Constants
    # step is how much each point is allowed to move;
    # set it to a lower value when you have more points
    step = 1./50.
    # maxk is the maximum number of iterations;
    # if step is too low, then maxk will need to increase
    maxk = 100
    k = 0
    # Force applied to the points
    F = np.zeros(X.shape)
    # Repeat for maxk iterations or until all forces are zero
    maxf = 1.
    while maxf > 0 and k < maxk:
        maxf = 0
        for i in range(n):
            # Force calculation for the i-th point
            x = X[i]
            f = np.zeros(x.shape)
            # Interact with at most 10 neighbors
            Neib = neighbors(x, X, 10)
            # dmin is the distance to the nearest neighbor
            dmin = norm(Neib[0] - x)
            for neib in Neib:
                d = norm(neib - x)
                if d < rmin:
                    # feel repulsion from points that are too near
                    f += repulsion(neib, x, d, rmin)
                elif dmin > rmax:
                    # feel attraction if there are no neighbors closer than rmax
                    f += attraction(neib, x, d, rmax)
            # save all forces and the maximum force to normalize later
            F[i] = f
            if norm(f) != 0:
                maxf = max(maxf, norm(f))
        # update all positions using the forces
        if maxf > 0:
            X += (F/maxf)*step
        k += 1
    if k == maxk:
        print("warning: iteration limit reached")
    return X
I presume that one of your brute-force ideas includes just repeatedly generating points at random and checking to see if the constraints happen to be satisfied.
Another way is to take a configuration that satisfies the constraints and repeatedly perturb a small part of it, chosen at random - for instance move a single point - to move to a randomly chosen nearby configuration. If you do this often enough you should move to a random configuration that is almost independent of the starting point. This could be justified under http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm or http://en.wikipedia.org/wiki/Gibbs_sampling.
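A sketch of that perturbation step (my own illustration, in Python; interpreting the [2, 3] band as a constraint on each point's nearest neighbour is an assumption):

import math
import random

def valid(pts, rmin, rmax):
    # Every point's nearest neighbour must lie within [rmin, rmax].
    for i, p in enumerate(pts):
        nearest = min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
        if not rmin <= nearest <= rmax:
            return False
    return True

def perturb(points, rmin=2.0, rmax=3.0, sigma=0.2, steps=10000):
    # Jiggle one random point at a time, rejecting moves that break the constraints.
    pts = [tuple(p) for p in points]
    for _ in range(steps):
        i = random.randrange(len(pts))
        old = pts[i]
        pts[i] = (old[0] + random.gauss(0, sigma), old[1] + random.gauss(0, sigma))
        if not valid(pts, rmin, rmax):
            pts[i] = old   # reject: keep the previous configuration
    return pts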
I might try just doing it at random, then going through and dropping points that are too close to other points. You can compare the square of the distance to save some math time.
Or create cells with borders and place a point in each one. Less random; it depends on whether this is a "just for looks" thing or not. But it could be very fast.
I made a compromise and ended up using the Poisson disk sampling method.
The result was fairly close to what I needed, especially with a lower number of tries (which also drastically reduces cost).
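For reference, the naive "dart throwing" flavour of Poisson disk sampling looks roughly like this (my own sketch; it only enforces the minimum distance, and Bridson's algorithm is the usual faster choice):

import math
import random

def poisson_disk(n, rmin, width=10.0, height=10.0, max_rejects=10000):
    # Propose random points; keep each one only if it is at least rmin away
    # from every point accepted so far.
    points = []
    rejects = 0
    while len(points) < n and rejects < max_rejects:
        cand = (random.uniform(0, width), random.uniform(0, height))
        if all(math.dist(cand, p) >= rmin for p in points):
            points.append(cand)
        else:
            rejects += 1
    return points

print(len(poisson_disk(25, 1.0)))   # up to 25 points, pairwise >= 1.0 apart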

What algorithm for a tic-tac-toe game can I use to determine the "best move" for the AI?

In a tic-tac-toe implementation I guess that the challenging part is to determine the best move to be played by the machine.
What are the algorithms that can be pursued? I'm looking into implementations from simple to complex. How would I go about tackling this part of the problem?
The strategy from Wikipedia for playing a perfect game (win or tie every time) seems like straightforward pseudo-code:
Quote from Wikipedia (Tic Tac Toe#Strategy)
A player can play a perfect game of Tic-tac-toe (to win or, at least, draw) if they choose the first available move from the following list, each turn, as used in Newell and Simon's 1972 tic-tac-toe program.[6]
Win: If you have two in a row, play the third to get three in a row.
Block: If the opponent has two in a row, play the third to block them.
Fork: Create an opportunity where you can win in two ways.
Block Opponent's Fork:
Option 1: Create two in a row to force the opponent into defending, as long as it doesn't result in them creating a fork or winning. For example, if "X" has a corner, "O" has the center, and "X" has the opposite corner as well, "O" must not play a corner in order to win. (Playing a corner in this scenario creates a fork for "X" to win.)
Option 2: If there is a configuration where the opponent can fork, block that fork.
Center: Play the center.
Opposite Corner: If the opponent is in the corner, play the opposite corner.
Empty Corner: Play an empty corner.
Empty Side: Play an empty side.
Recognizing what a "fork" situation looks like could be done in a brute-force manner as suggested.
Note: A "perfect" opponent is a nice exercise but ultimately not worth 'playing' against. You could, however, alter the priorities above to give characteristic weaknesses to opponent personalities.
What you need (for tic-tac-toe or a far more difficult game like Chess) is the minimax algorithm, or its slightly more complicated variant, alpha-beta pruning. Ordinary naive minimax will do fine for a game with as small a search space as tic-tac-toe, though.
In a nutshell, what you want to do is not to search for the move that has the best possible outcome for you, but rather for the move where the worst possible outcome is as good as possible. If you assume your opponent is playing optimally, you have to assume they will take the move that is worst for you, and therefore you have to take the move that MINimises their MAXimum gain.
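As a minimal illustration of that idea (my own sketch, not tied to any answer here), a self-contained minimax for tic-tac-toe in Python, where a board is a 9-character string:

# A board is a 9-character string of 'X', 'O' and ' ', indexed 0..8 row-major.
LINES = [(0,1,2), (3,4,5), (6,7,8),   # rows
         (0,3,6), (1,4,7), (2,5,8),   # columns
         (0,4,8), (2,4,6)]            # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    # Returns (score, move) from player's perspective: +1 win, 0 draw, -1 loss.
    w = winner(board)
    if w:
        return (1 if w == player else -1), None
    moves = [i for i, s in enumerate(board) if s == ' ']
    if not moves:
        return 0, None                      # board full: draw
    other = 'O' if player == 'X' else 'X'
    best_score, best_move = -2, None
    for m in moves:
        child = board[:m] + player + board[m+1:]
        score = -minimax(child, other)[0]   # opponent's best is our worst
        if score > best_score:
            best_score, best_move = score, m
    return best_score, best_move

# O's best reply after X takes the top-left corner: the center (index 4).
print(minimax('X' + ' ' * 8, 'O'))          # -> (0, 4): a draw with best play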
The brute force method of generating every single possible board and scoring it based on the boards it later produces further down the tree doesn't require much memory, especially once you recognize that 90 degree board rotations are redundant, as are flips about the vertical, horizontal, and diagonal axis.
Once you get to that point, there's something like less than 1k of data in a tree graph to describe the outcome, and thus the best move for the computer.
-Adam
A typical algorithm for tic-tac-toe should look like this:
Board: A nine-element vector representing the board. We store 2 (indicating Blank), 3 (indicating X), or 5 (indicating O).
Turn: An integer indicating which move of the game is about to be played. The 1st move is indicated by 1, the last by 9.
The Algorithm
The main algorithm uses three functions.
Make2: returns 5 if the center square of the board is blank i.e. if board[5]=2. Otherwise, this function returns any non-corner square (2, 4, 6 or 8).
Posswin(p): Returns 0 if player p can’t win on his next move; otherwise, it returns the number of the square that constitutes a winning move. This function will enable the program both to win and to block opponents win. This function operates by checking each of the rows, columns, and diagonals. By multiplying the values of each square together for an entire row (or column or diagonal), the possibility of a win can be checked. If the product is 18 (3 x 3 x 2), then X can win. If the product is 50 (5 x 5 x 2), then O can win. If a winning row (column or diagonal) is found, the blank square in it can be determined and the number of that square is returned by this function.
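A sketch of Posswin's product trick (my own illustration in Python; squares are numbered 1 to 9 and the board uses the 2/3/5 encoding above):

# Lines of the 3x3 board, using squares numbered 1..9 (index 0 unused).
LINES = [(1,2,3), (4,5,6), (7,8,9),
         (1,4,7), (2,5,8), (3,6,9),
         (1,5,9), (3,5,7)]

def posswin(board, p):
    # board[1..9] holds 2 (blank), 3 (X) or 5 (O); board[0] is unused.
    target = 18 if p == 'X' else 50      # 3*3*2 = 18, 5*5*2 = 50
    for line in LINES:
        if board[line[0]] * board[line[1]] * board[line[2]] == target:
            for sq in line:              # return the blank square in the line
                if board[sq] == 2:
                    return sq
    return 0

# X holds squares 1 and 2, so X can win by playing square 3.
board = [0, 3, 3, 2, 2, 2, 2, 2, 2, 2]
print(posswin(board, 'X'))               # -> 3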
Go(n): makes a move in square n. This procedure sets board[n] to 3 if Turn is odd, or 5 if Turn is even. It also increments Turn by one.
The algorithm has a built-in strategy for each move. It makes the odd-numbered moves if it plays X, the even-numbered moves if it plays O.
Turn = 1 Go(1) (upper left corner).
Turn = 2 If Board[5] is blank, Go(5), else Go(1).
Turn = 3 If Board[9] is blank, Go(9), else Go(3).
Turn = 4 If Posswin(X) is not 0, then Go(Posswin(X)) [i.e. block opponent's win], else Go(Make2).
Turn = 5 If Posswin(X) is not 0, then Go(Posswin(X)) [i.e. win], else if Posswin(O) is not 0, then Go(Posswin(O)) [i.e. block win], else if Board[7] is blank, then Go(7), else Go(3) [to explore other possibilities if there are any].
Turn = 6 If Posswin(O) is not 0, then Go(Posswin(O)), else if Posswin(X) is not 0, then Go(Posswin(X)), else Go(Make2).
Turn = 7 If Posswin(X) is not 0, then Go(Posswin(X)), else if Posswin(O) is not 0, then Go(Posswin(O)), else go anywhere that is blank.
Turn = 8 If Posswin(O) is not 0, then Go(Posswin(O)), else if Posswin(X) is not 0, then Go(Posswin(X)), else go anywhere that is blank.
Turn = 9 Same as Turn = 7.
I have used it. Let me know how you guys feel.
Since you're only dealing with a 3x3 matrix of possible locations, it'd be pretty easy to just write a search through all possibilities without taxing your computing power. For each open space, compute through all the possible outcomes after marking that space (recursively, I'd say), then use the move with the most possibilities of winning.
Optimizing this would be a waste of effort, really. Though some easy ones might be:
Check first for possible wins for the other team and block the first one you find (if there are two, the game's over anyway).
Always take the center if it's open (and the previous rule has no candidates).
Take corners ahead of sides (again, if the previous rules have no candidates).
You can have the AI play itself in some sample games to learn from. Use a supervised learning algorithm to help it along.
An attempt without using a play field:
to win (your double)
if not, not to lose (opponent's double)
if not, do you already have a fork (have a double double)
if not, if the opponent has a fork:
search the blocking points for a possible double and fork (ultimate win)
if not, search for forks among the blocking points (whichever gives the opponent the most losing possibilities)
if not, take only blocking points (not to lose)
if not, search for a double and fork (ultimate win)
if not, search only for forks which give the opponent the most losing possibilities
if not, search only for a double
if not, dead end, tie, random.
if not (it means it is your first move):
if it's the first move of the game:
give the opponent the most losing possibilities (the algorithm results in only corners, which give the opponent 7 losing-point possibilities)
or, just to break the boredom, pick at random.
if it's the second move of the game:
find only the not-losing points (gives a few more options)
or find the points in this list which have the best winning chance (it can be boring, because it results in only all corners, or adjacent corners, or center)
Note: When you have a double and a fork, check whether your double gives the opponent a double. If it does, check whether that new mandatory point is included in your fork list.
Rank each of the squares with numeric scores. If a square is taken, move on to the next choice (sorted in descending order by rank). You're going to need to choose a strategy (there are two main ones for going first and three (I think) for second). Technically, you could just program all of the strategies and then choose one at random. That would make for a less predictable opponent.
This answer assumes you understand implementing the perfect algorithm for P1 and discusses how to achieve a win in conditions against ordinary human players, who will make some mistakes more commonly than others.
The game of course should end in a draw if both players play optimally. At a human level, P1 playing in a corner produces wins far more often. For whatever psychological reason, P2 is baited into thinking that playing in the center is not that important, which is unfortunate for them, since it's the only response that does not create a winning game for P1.
If P2 does correctly block in the center, P1 should play the opposite corner, because again, for whatever psychological reason, P2 will prefer the symmetry of playing a corner, which again produces a losing board for them.
For any move P1 may make for the starting move, there is a move P2 may make that will create a win for P1 if both players play optimally thereafter. In that sense P1 may play wherever. The edge moves are weakest in the sense that the largest fraction of possible responses to this move produce a draw, but there are still responses that will create a win for P1.
Empirically (more precisely, anecdotally) the best P1 starting moves seem to be first corner, second center, and last edge.
The next challenge you can add, in person or via a GUI, is not to display the board. A human can definitely remember all the state but the added challenge leads to a preference for symmetric boards, which take less effort to remember, leading to the mistake I outlined in the first branch.
I'm a lot of fun at parties, I know.
A tic-tac-toe adaptation of the minimax algorithm:
let gameBoard = [
  [null, null, null],
  [null, null, null],
  [null, null, null]
]

const SYMBOLS = {
  X: 'X',
  O: 'O'
}

const RESULT = {
  INCOMPLETE: "incomplete",
  PLAYER_X_WON: SYMBOLS.X,
  PLAYER_O_WON: SYMBOLS.O,
  TIE: "tie"
}
We'll need a function that can check for the result. The function will check for a succession of chars. Whatever the state of the board is, the result is one of four options: incomplete, player X won, player O won, or a tie.
function checkSuccession (line){
  if (line === SYMBOLS.X.repeat(3)) return SYMBOLS.X
  if (line === SYMBOLS.O.repeat(3)) return SYMBOLS.O
  return false
}

// count how many squares have been filled so far
function moveCount (board){
  return board.flat().filter(square => square !== null).length
}

function getResult(board){
  let result = RESULT.INCOMPLETE
  if (moveCount(board) < 5){
    return result
  }
  let lines = []
  // first we check rows, then columns, then diagonals
  for (var i = 0; i < 3; i++){
    lines.push(board[i].join(''))
  }
  for (var j = 0; j < 3; j++){
    const column = [board[0][j], board[1][j], board[2][j]]
    lines.push(column.join(''))
  }
  const diag1 = [board[0][0], board[1][1], board[2][2]]
  lines.push(diag1.join(''))
  const diag2 = [board[0][2], board[1][1], board[2][0]]
  lines.push(diag2.join(''))
  for (i = 0; i < lines.length; i++){
    const succession = checkSuccession(lines[i])
    if (succession){
      return succession
    }
  }
  // check for a tie
  if (moveCount(board) == 9){
    return RESULT.TIE
  }
  return result
}
Our getBestMove function will receive the state of the board and the symbol of the player for which we want to determine the best possible move. Our function will check all possible moves with the getResult function. If it is a win, it will give it a score of 1; if it's a loss, it will get a score of -1; a tie will get a score of 0. If it is undetermined, we will call the getBestMove function with the new state of the board and the opposite symbol. Since the next move is the opponent's, their victory is the current player's loss, and the score will be negated. At the end each possible move receives a score of either 1, 0 or -1, so we can sort the moves and return the move with the highest score.
const copyBoard = (board) => board.map(
  row => row.map( square => square )
)

function getAvailableMoves (board) {
  let availableMoves = []
  for (let row = 0; row < 3; row++){
    for (let column = 0; column < 3; column++){
      if (board[row][column] === null){
        availableMoves.push({row, column})
      }
    }
  }
  return availableMoves
}

function applyMove(board, move, symbol) {
  board[move.row][move.column] = symbol
  return board
}

function getBestMove (board, symbol){
  let availableMoves = getAvailableMoves(board)
  let availableMovesAndScores = []
  for (var i = 0; i < availableMoves.length; i++){
    let move = availableMoves[i]
    let newBoard = copyBoard(board)
    newBoard = applyMove(newBoard, move, symbol)
    const result = getResult(newBoard)
    let score
    if (result == RESULT.TIE) { score = 0 }
    else if (result == symbol) {
      score = 1
    }
    else {
      // game not decided yet: the opponent's best score is our loss
      let otherSymbol = (symbol == SYMBOLS.X) ? SYMBOLS.O : SYMBOLS.X
      const nextMove = getBestMove(newBoard, otherSymbol)
      score = -(nextMove.score)
    }
    if (score === 1) // performance optimization: a win can't be beaten
      return {move, score}
    availableMovesAndScores.push({move, score})
  }
  availableMovesAndScores.sort((moveA, moveB) => {
    return moveB.score - moveA.score
  })
  return availableMovesAndScores[0]
}
Algorithm in action, GitHub, explaining the process in more detail.
