Debugging Recursive MinMax in TicTacToe - ruby

I'm trying to get the minmax algorithm (computer AI) to work in my game of tic-tac-toe. I've been stuck on this for days. Essentially, I don't understand why the computer AI simply places its marker ("O") in sequential order on board positions 0-8.
For example, as the human player, if I choose 1, then the computer will choose 0:
O| X| 2
--+---+--
3| 4| 5
--+---+--
6| 7| 8
Next, if I choose 4, then the computer will choose 2:
O| X| O
--+---+--
3| X| 5
--+---+--
6| 7| 8
And so on:
O| X| O
--+---+--
O| X| O
--+---+--
X| 7| X
I've debugged the minmax algorithm as much as I can, but it's getting really hard to follow what's going on.
Here's the ComputerPlayer class with the algorithm (and without all my print statements). The minmax method is where I'm having a lot of trouble. (I'm not 100% sure on using worst_score or even the associated logic.)
class ComputerPlayer < Player
  def move(game_board)
    minmax(game_board) #minmax to create @best_move
    game_board.place_piece(@best_move, marker)
  end
  def minmax(board, player_tracker = 0)
    if board.game_over?
      return score(board)
    else
      worst_score = (1.0/0.0) #Infinity
      best_score = -(1.0/0.0) #-Infinity
      @best_move = board.get_available_positions.first
      new_marker = player_tracker.even? ? 'O' : 'X'
      player_tracker += 1
      board.get_available_positions.each do |move|
        new_board = board.place_piece(move, new_marker)
        current_score = minmax(new_board, player_tracker)
        if new_marker == marker #if the player is the computer player
          if current_score > best_score
            @best_move = move
            best_score = current_score
          end
        else
          if current_score < worst_score
            worst_score = current_score
          end
        end
      end
    end
    return best_score
  end
  def score(board)
    if board.winner == "O" #'O' == 'O', 'nil' == 'O'
      10
    elsif board.winner == "X" #'X' != 'O', 'nil' != 'O'
      -10
    elsif board.winner == nil
      0
    end
  end
end

The problem is that minmax always returns best_score, even when the player being simulated is the human, in which case worst_score is the value that was updated.
The minmax routine constantly toggles between the two players. When the player being simulated is the computer, the best score is the highest score; when the player being simulated is the human, the best score is the lowest score.
I rewrote the routine to try all remaining moves for an iteration and keep track of the corresponding score in a local hash. When finished, the best score is returned and the best move is set, depending on the currently simulated player.
def minmax(board, player_tracker = 0, iteration = 0) #minmax
  if board.game_over?
    return score(board, iteration)
  end
  new_marker = player_tracker.even? ? 'O' : 'X'
  scores = {}
  board.get_available_positions.each do |move|
    new_board = board.place_piece(move, new_marker)
    scores[move] = minmax(new_board, player_tracker + 1, iteration + 1)
  end
  if player_tracker.even?
    @best_move = scores.sort_by {|_key, value| value}.reverse.to_h.keys[0]
  else
    @best_move = scores.sort_by {|_key, value| value}.to_h.keys[0]
  end
  return scores[@best_move]
end
To increase accuracy further, I rewrote the score routine to also take into account the number of iterations needed to reach the board being scored. Being able to win in 1 iteration should be preferred over winning in 3 iterations, right?
def score(board, iteration)
  # "O", "X", "nil"
  if board.winner == "O" #'O' == 'O', 'nil' == 'O'
    10.0 / iteration
  elsif board.winner == "X" #'X' != 'O', 'nil' != 'O'
    -10.0 / iteration
  elsif board.winner == nil
    0
  else
    raise "ERROR"
  end
end
With these two routines replaced, the moves chosen by the computer seem much more logical.
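As a self-contained illustration of the idea described above, here is a minimal sketch that uses a plain nine-element array instead of the question's board class. The names `WIN_LINES`, `winner_of`, and `minimax_move` are hypothetical and do not come from the original code. The score is `10 - depth` rather than `10.0 / iteration`; that is a swapped-in variant with the same intent (earlier wins score higher) that avoids dividing by zero when an already finished board is scored at depth 0.

```ruby
# Hypothetical helpers; the board is a 9-element array of 'X', 'O', or nil.
WIN_LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6]
].freeze

# Returns 'X' or 'O' if that marker holds a full line, otherwise nil.
def winner_of(board)
  WIN_LINES.each do |a, b, c|
    return board[a] if board[a] && board[a] == board[b] && board[b] == board[c]
  end
  nil
end

# Returns [score, move] from the point of view of 'O' (the computer).
# 'O' maximizes the score and 'X' minimizes it.
def minimax_move(board, marker = 'O', depth = 0)
  case winner_of(board)
  when 'O' then return [10 - depth, nil]
  when 'X' then return [depth - 10, nil]
  end
  free_cells = board.each_index.select { |i| board[i].nil? }
  return [0, nil] if free_cells.empty?

  other = marker == 'O' ? 'X' : 'O'
  results = free_cells.map do |move|
    next_board = board.dup
    next_board[move] = marker
    score, _reply = minimax_move(next_board, other, depth + 1)
    [score, move]
  end
  # Compare by score only; the simulated player decides between max and min.
  marker == 'O' ? results.max_by(&:first) : results.min_by(&:first)
end

board = ['X', 'X', nil,
         nil, 'O', nil,
         nil, nil, nil]
score, move = minimax_move(board)
puts "O plays #{move} (score #{score})"
```

Returning the score and the move together avoids the shared `@best_move` instance variable, which every level of the recursion would otherwise overwrite.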

Related

Minimax algorithm in Ruby in Object-Oriented way

I am trying to implement minimax algorithm for a tic tac toe game in an object-oriented way. I was a bit stuck with how to deal with reverting the state for my board object once the algorithm determined the best move. When running the program, I have noticed that minimax method operated on the current board object which is not ideal.
I added a method to undo the move done by the minimax method: board.[]=(empty_square, Square::INITIAL_MARKER)
I have noticed the algorithm makes the wrong choice. Here, X is a player and O is a computer. If this is the state of the board:
     |     |
     |     |
     |     |
-----+-----+-----
     |     |
     |  X  |
     |     |
-----+-----+-----
     |     |
     |     |  O
     |     |
When the player X makes a move and picks square 2, minimax (computer, O) will choose 7 instead of 8 which would be a better choice:
     |     |
     |  X  |
     |     |
-----+-----+-----
     |     |
     |  X  |
     |     |
-----+-----+-----
     |     |
  O  |     |  O
     |     |
Due to my inexperience, I am a little bit lost on how to proceed and would appreciate any guidance!
Here is the minimax method:
def minimax
  best_move = 0
  score_current_move = nil
  best_score = -10000 if @current_marker == COMPUTER_MARKER
  best_score = 10000 if @current_marker == HUMAN_MARKER
  board.unmarked_keys.each do |empty_square|
    board.[]=(empty_square, @current_marker)
    if board.full?
      score_current_move = 0
    elsif board.someone_won?
      score_current_move = -1 if board.winning_marker == HUMAN_MARKER
      score_current_move = 1 if board.winning_marker == COMPUTER_MARKER
    else
      alternate_player
      score_current_move = minimax[0]
    end
    if ((@current_marker == COMPUTER_MARKER) && (score_current_move >= best_score))
      best_score = score_current_move
      best_move = empty_square
    elsif ((@current_marker == HUMAN_MARKER) && (score_current_move <= best_score))
      best_score = score_current_move
      best_move = empty_square
    end
    board.[]=(empty_square, Square::INITIAL_MARKER)
  end
  [best_score, best_move]
end
I see no particular advantage here to defining any classes at all. There is only one board and only two players (the machine and the human) who operate quite differently.
Main method
Next I will write the main method, which depends on several helper methods, all of which could be private.
def play_game(human_moves_first = true)
raise ArgumentError unless [true, false].include?(human_moves_first)
human_marker, machine_marker =
human_moves_first ? ['X', 'O'] : ['O', 'X']
board = Array.new(9)
if human_moves_first
display(board)
human_to_move(board, 'X')
end
loop do
display(board)
play = machine_best_play(board, machine_marker)
board[play] = machine_marker
display(board)
if win?(board)
puts "Computer wins"
break
end
if tie?(board)
puts "Tie game"
break
end
human_to_move(board, human_marker)
if tie?(board)
puts "Tie game"
break
end
end
end
As you see I have provided a choice of who starts, the machine or the human.
Initially, board is an array of 9 nils.
The method simply loops until a determination is made as to whether the machine wins or there is a tie. As we know, the machine, acting logically, cannot lose. In each pass of the loop the machine makes a mark. If that results in a win or a tie the game is over; else the human is called upon to make a mark.
Before considering the method machine_best_play, let's consider a few simple helper methods that are needed.
Simple helper methods
I will demonstrate these methods with board defined as follows:
board = ['X', 'O', 'X',
nil, 'O', nil,
nil, nil, 'X']
Note that while the human refers to the nine locations as 1 through 9, internally they are represented as indices of board, 0 through 8.
Determine unmarked cells
def unmarked_cells(board)
board.each_index.select { |i| board[i].nil? }
end
unmarked_cells(board)
#=> [3, 5, 6, 7]
Ask human to make a selection
def human_to_move(board, marker)
loop do
puts "Please mark '#{marker}' in an unmarked cell"
cell = gets.chomp
if (n = Integer(cell, exception: false)) && n.between?(1, 9)
n -= 1 # convert to index in board
if board[n].nil?
board[n] = marker
break
else
puts "That cell is occupied"
end
else
puts "That is not a number between 1 and 9"
end
end
end
human_to_move(board, 'O')
Please mark 'O' in an unmarked cell
If cell = gets.chomp #=> "6" then
board
#=> ["X", "O", "X", nil, "O", "O", nil, nil, "X"]
For the following I have set board to its original value above.
Display the board
def display(board)
  board.each_slice(3).with_index do |row, idx|
    puts "     |     |"
    puts "  #{row.map { |obj| obj || ' ' }.join('  |  ')}"
    puts "     |     |"
    puts "-----+-----+-----" unless idx == 2
  end
end
display(board)
     |     |
  X  |  O  |  X
     |     |
-----+-----+-----
     |     |
     |  O  |
     |     |
-----+-----+-----
     |     |
     |     |  X
     |     |
Determine if the last move (by the machine or human) wins
WINNING_CELL_COMBOS = [
[0,1,2], [3,4,5], [6,7,8], [0,3,6], [1,4,7], [2,5,8], [0,4,8], [2,4,6]
]
def win?(board)
  WINNING_CELL_COMBOS.any? do |combo|
    # a line wins when its three cells hold the same non-nil marker
    first, second, third = board.values_at(*combo)
    !first.nil? && first == second && second == third
  end
end
win? board
#=> false
win? ['X', nil, 'O', nil, 'X', 'O', nil, nil, 'X']
#=> true
win? ['X', nil, 'O', nil, 'X', 'O', 'X', nil, 'O']
#=> true
Determine if game ends in a tie
def tie?(board)
unmarked_cells(board).empty?
end
tie?(board)
#=> false
tie? ['X', 'X', 'O', 'O', 'X', 'X', 'X', 'O', 'O']
#=> true
Note that unmarked_cells(board).empty? can be replaced with board.all?.
Determine machine's best play using minimax algorithm
MACHINE_WINS = 0
TIE = 1
MACHINE_LOSES = 2
NEXT_MARKER = { "X"=>"O", "O"=>"X" }
def machine_best_play(board, marker)
  plays = unmarked_cells(board)
  plays.min_by do |play|
    board_after_play = board.dup.tap { |a| a[play] = marker }
    if win?(board_after_play)
      MACHINE_WINS
    elsif plays.size == 1
      TIE
    else
      human_worst_outcome(board_after_play, NEXT_MARKER[marker])
    end
  end
end
This requires two more methods.
Determine machine's best worst outcome for current state of board
def machine_worst_outcome(board, marker)
  plays = unmarked_cells(board)
  plays.map do |play|
    board_after_play = board.dup.tap { |a| a[play] = marker }
    if win?(board_after_play)
      MACHINE_WINS
    elsif plays.size == 1
      TIE
    else
      human_worst_outcome(board_after_play, NEXT_MARKER[marker])
    end
  end.min
end
Determine human's best worst outcome for current state of board, assuming the human also plays a minimax strategy
def human_worst_outcome(board, marker)
  plays = unmarked_cells(board)
  plays.map do |play|
    board_after_play = board.dup.tap { |a| a[play] = marker }
    if win?(board_after_play)
      MACHINE_LOSES
    elsif plays.size == 1
      TIE
    else
      machine_worst_outcome(board_after_play, NEXT_MARKER[marker])
    end
  end.max
end
Notice that the human maximizes the worst outcome from the machine's perspective whereas the machine minimizes its worst outcome.
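The min/max interplay can be tried out in one self-contained sketch. It assumes that `open_cells` was meant to be `unmarked_cells` and that `machine_wins?` was meant to be `win?`. It also folds the two mutually recursive methods into a single hypothetical helper, `outcome_after`, with a `machine_turn` flag; that helper is an illustration and not part of the answer.

```ruby
MACHINE_WINS = 0
TIE = 1
MACHINE_LOSES = 2
NEXT_MARKER = { 'X' => 'O', 'O' => 'X' }.freeze
WINNING_CELL_COMBOS = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6]
].freeze

def unmarked_cells(board)
  board.each_index.select { |i| board[i].nil? }
end

def win?(board)
  WINNING_CELL_COMBOS.any? do |combo|
    first, second, third = board.values_at(*combo)
    !first.nil? && first == second && second == third
  end
end

# Hypothetical helper: rank of the outcome reached when `marker` plays `play`.
# machine_turn tells whose move `play` is; lower ranks are better for the machine.
def outcome_after(board, play, marker, machine_turn)
  after = board.dup.tap { |a| a[play] = marker }
  if win?(after)
    machine_turn ? MACHINE_WINS : MACHINE_LOSES
  elsif unmarked_cells(after).empty?
    TIE
  else
    replies = unmarked_cells(after).map do |reply|
      outcome_after(after, reply, NEXT_MARKER[marker], !machine_turn)
    end
    # After a machine move the human replies and maximizes the rank;
    # after a human move the machine replies and minimizes it.
    machine_turn ? replies.max : replies.min
  end
end

def machine_best_play(board, marker)
  unmarked_cells(board).min_by { |play| outcome_after(board, play, marker, true) }
end

# The human ('X') threatens to complete the top row, so the machine must block.
p machine_best_play(['X', 'X', nil, nil, 'O', nil, nil, nil, nil], 'O')  #=> 2
```

Because `min_by` keeps the first cell with the lowest rank, the machine may prefer a forced win several moves away over an immediate one when both have rank `MACHINE_WINS`.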
Almost there
All that remains is to quash any bugs that are present. Being short of time at the moment I will leave that to you, should you wish to do so. Feel free to edit my answer to make any corrections.

How to create a Minimax algorithm comparing arrays

I'm trying to code a "minimax" algorithm for Tic Tac Toe.
Each node of the tree is of the form [nil/Int, String] where the last element is a nine character string describing the board, and the first is an Integer ranking the node, or nil by default.
If the value is nil, it tries to inherit the appropriate value from child nodes.
This is where I get an error: "comparison of array with array failed".
class Scene_TicTacToe #Script 2/2
def initialize
#Boardstate as a str from top left corner to bottom right corner.
@boardstate = "---------"
#1 = player, -1 = comp
@active_player = 1
end
def wincheck(boardstate=@boardstate)
#should return -1 for loss, 0 for draw, 1 for win
["OOO","XXX"].each do |f|
for i in 0..2
if (boardstate[i]+boardstate[i+3]+boardstate[i+6]).chr == f || boardstate[(3*i)..(3*i)+2] == f
return f == "OOO" ? 1 : -1
end
end
if (boardstate[0]+boardstate[4]+boardstate[8]).chr == f || (boardstate[2]+boardstate[4]+boardstate[6]).chr == f
return f == "OOO" ? 1 : -1
end
end
return 0
end
def computer_play
#Sets depth,and alpha/beta for pruning, so far so good
depth = 3
alpha = -100
beta = 100
#@boardstate starts as "---------"
#@active_player: 1=player, -1=computer
play(minimax(@boardstate, depth, alpha, beta, @active_player))
end
def play(array)
#Check actual boardside with parameter boardside to see what move has been
#selected and plays that move
for i in 0...array[1].length
if @boardstate[i] != array[1][i]
#color = array[1][i].chr == "X" ? @ai : @player
#@cursor.y = (i / 3) * @side
#@cursor.x = (i % 3) * @side
#@board.bitmap.fill_rect(@cursor.x,@cursor.y,@side,@side,color)
@boardstate = array[1].dup
end
end
end
def minimax(boardstate, depth, alpha, beta, active_player)
#If bottom node reached, returns [boardstate_score, boardstate]
#wincheck returns 1 if player wins, -1 if computer wins, and 0 otherwise
if depth == 0 or wincheck(boardstate) != 0 or (/-/ =~ boardstate) == nil
return [wincheck(boardstate),boardstate]
end
if active_player == 1 #if player's turn
#Gets an array of all the next possible boardstates and return the one with
#the best eval.
child = generate_child(boardstate, active_player)
child.each do |f| #f = [Int/nil, String]
if f[0] == nil
#This should turn all the nil wincheck values to the best value of children nodes
f[0] = minimax(f[1], depth-1, alpha, beta, -active_player).last[0]
end
alpha = [f[0], alpha].max
if beta <= alpha
break
end
end
return child.sort_by{|c| c[0]}
end
if active_player == -1 #if computer's turn
#Same as above but with worst eval.
child = generate_child(boardstate, active_player)
child.each do |f|
if f[0] == nil
f[0] = minimax(f[1], depth-1, alpha, beta, -active_player).first[0]
end
beta = [f[0], beta].min
if beta <= alpha
break
end
end
#Following line raises "comparison of array with array failed" error :
return child.sort_by{|c| c[0]}
end
end
def generate_child(boardstate, active_player)
#returns boardstate string with one X or O more than current boardstate
#and sets nil as a default wincheck value
c = active_player == 1 ? "O" : "X"
a = []
for i in 0...boardstate.length
if boardstate[i].chr == "-"
s = boardstate.dup
s[i]= c
a << [nil, s]
end
end
return a
end
end
Error: comparison of array with array failed
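One possible contributor, offered as an assumption rather than a confirmed diagnosis: `sort_by` raises an ArgumentError whenever two of its keys cannot be compared, and when the pruning `break` fires, the children that were never visited still carry `nil` as their score. The sketch below reproduces that situation with made-up `children` data and shows one way around it, which is to drop the unscored children before picking the best one.

```ruby
# Made-up children in the question's [score, boardstate] form.
# The last one was never scored, as happens after an early `break`.
children = [[1, "O--------"], [-1, "-O-------"], [nil, "--O------"]]

begin
  children.sort_by { |c| c[0] }
rescue ArgumentError => e
  puts "sort_by failed: #{e.message}"
end

# One option: ignore children that were pruned before being scored.
scored = children.reject { |c| c[0].nil? }
best_for_min = scored.min_by { |c| c[0] }
best_for_max = scored.max_by { |c| c[0] }
p best_for_min  # [-1, "-O-------"]
p best_for_max  # [1, "O--------"]
```

Returning a single `[score, boardstate]` pair from every branch, instead of a sorted list in some branches and a pair in others, would also make `.first[0]` and `.last[0]` unnecessary.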

Minimax algorithm Ruby Tic Tac Toe

I am writing an unbeatable tic tac toe game using the minimax algorithm. For some reason my scores hash loses its value when it comes out of the loop. If that is happening, then I must be doing something wrong that I am unable to catch. I am new to coding. Need help!!!
mark is the mark for current player
mark1 and mark2 are the two marks for player1 and player2 respectively
spots is the empty spots on the board
require_relative 'board'
class ComputerPlayer
attr_reader :board
def initialize
@board = Board.new
@optimal_moves = [0, 2, 4, 6, 8]
end
def get_position(name, spots, mark, mark1, mark2)
position = nil
other_mark = nil
mark == mark1 ? other_mark = mark2 : other_mark = mark1
if spots.length == 9
position = @optimal_moves.sample.to_i
else
position = best_move(spots, mark, other_mark)
end
print "Enter your move #{name}: #{position}\n"
position
end
def best_move(spots, mark, other_mark, depth = 0, scores = {})
return 1 if board.winner == mark
return 0 if board.draw?
return -1 if board.winner == other_mark
spots.each do |move|
board.place_mark(move, mark)
scores[move] = best_move(spots[1..-1], mark, other_mark, depth += 1, {})
board.reset_position(move)
end
# it does not keep the value of scores. scores here is {}
return scores.max_by { |key, value| value }[0] if depth == 0
return scores.max_by { |key, value| value }[1] if depth > 0
return scores.min_by { |key, value| value }[1] if depth < 0
end
end
It looks like you are passing an empty hash back into best_move every time. What you're probably wanting to do is pass scores in on each recurrence to build up an object with moves and scores.
scores[move] = best_move(spots[1..-1], mark, other_mark, depth += 1, scores)

Getting a stack overflow when trying to implement Negamax in Tic Tac Toe

I apologize in advance if this is a re-post, but I have read other posts on the topic and I still can't figure out what to do. I am trying to implement Negamax in a Ruby tic tac toe game and I am getting a stack overflow error on line 55. I have read so many posts and articles on Negamax and I still can't get it to work.
This file isn't part of a larger program yet, I just wanted to pass in a board and see if it would make a move.
class Negamax
attr_accessor :board, :mark, :depth, :winning_routes
def initialize
@board = ["X","2","3","4","5","6","7","8","9"]
@mark = "O"
@depth = 1
@winning_routes = [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[0,4,8],[2,4,6]]
end
def negamax(board, mark, depth)
if winner?(mark) || isboardfull?(board)
return game_result(mark)
else
max = -1.0/0
mark == "O" ? opponent = "X" : "O"
available_moves.each do |space|
board[space] = mark
score = -negamax(board, opponent, depth + 1)
board[space] = "#{space + 1}"
if score > max
max = score
best_move = space if depth == 1
board[best_move] = mark
end
end
return max
end
end
def available_moves()
board.each_index.select { |s| board[s] != "X" && board[s] != "O"}
end
def isboardfull?(board)
boardtos = board.join(",")
boardtos =~ /\d/ ? false : true
end
def game_result(mark)
if winner?(mark)
return 1
elsif winner?(mark == "O" ? "X" : "O")
return -1
else
return 0
end
end
def winner?(mark)
result = false
marker = mark
winning_routes.each do |group|
if board[group[0]] == marker && board[group[1]] == marker && board[group[2]] == marker
result = true
end
end
result
end
end
game = Negamax.new()
game.negamax(game.board, game.mark, game.depth)
print game.board
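One likely cause of the runaway recursion is the line `mark == "O" ? opponent = "X" : "O"`: when `mark` is "X", nothing is assigned, so `opponent` is nil, the next level writes nil into the board, and the board never fills up or produces a winner. The following is a minimal negamax sketch under that reading, not a drop-in fix for the class above. It uses free-standing methods, returns the score together with the move, and the names `LINES`, `winner?(board, mark)`, and `available_moves(board)` are hypothetical variants of the question's helpers.

```ruby
LINES = [[0, 1, 2], [3, 4, 5], [6, 7, 8],
         [0, 3, 6], [1, 4, 7], [2, 5, 8],
         [0, 4, 8], [2, 4, 6]].freeze

def winner?(board, mark)
  LINES.any? { |line| line.all? { |i| board[i] == mark } }
end

def available_moves(board)
  board.each_index.reject { |i| board[i] == "X" || board[i] == "O" }
end

# Returns [score, move] from the point of view of `mark`, the side to move:
# 1 means `mark` can force a win, -1 a forced loss, 0 a draw.
def negamax(board, mark)
  opponent = mark == "O" ? "X" : "O"   # always assigned, never nil
  return [1, nil] if winner?(board, mark)
  return [-1, nil] if winner?(board, opponent)
  moves = available_moves(board)
  return [0, nil] if moves.empty?

  best_score = -Float::INFINITY
  best_move = nil
  moves.each do |space|
    board[space] = mark
    # The opponent's best score, negated, is this move's score for `mark`.
    score = -(negamax(board, opponent).first)
    board[space] = (space + 1).to_s     # undo: restore the cell label
    if score > best_score
      best_score = score
      best_move = space
    end
  end
  [best_score, best_move]
end

board = ["X", "2", "3", "4", "5", "6", "7", "8", "9"]
score, move = negamax(board, "O")
board[move] = "O"
print board
```

Placing the chosen mark once, after the search has returned, keeps the search itself free of side effects on the board.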

How do I update one element in an array and show new version of my array in the tictactoe grid?

This is a tictactoe game. When a player chooses a grid via a number, it is supposed to update the picked number with X or O, and then show the updated grid. Unfortunately, it continues to output the same default grid of numbers. I used map! but it does not work. If I switch @player1 : x to x : @player1, it changes the entire array to X or O.
Once I figure this out, the win method will be the next task to check. Will win work to determine the winning combinations?
@a = [1,2,3,4,5,6,7,8,9]
def game
@game_board = "#{@a[0]}|#{@a[1]}|#{@a[2]}\n" "------\n" "#{@a[3]}|#{@a[4]}|#{@a[5]}\n" "------\n" "#{@a[6]}|#{@a[7]}|#{@a[8]}\n"
@game_board
end
def secondchoice
if @player1 == "X"
@player2 = "O"
elsif @player1 == "O"
@player2 = "X"
else puts "please pick a valid number"
end
end
def start
puts " Player One, Pick Your Tic or Toe , X or O"
@player1 = gets.chomp.upcase
@player2 = secondchoice
puts "Player one is #{@player1}"
puts "Now Player Two is #{@player2}"
end
def player_turns
# player 1 gets then player 2 gets until game over == true
# nine total moves using a loop with a counter will work
moves = 1
while moves < 10
if moves.odd?
puts @game_board
puts "its player One's turn! place #{@player1} on the board by picking a number: "
cell = gets.chomp
@a.map! {|x|x == cell ? @player1 : x}
moves += 1
unless check_win == true
end
else
puts @game_board
puts "its player Two's turn! place #{@player2} on the board by picking a number: "
kell = gets.chomp
@a.map! {|x|x == kell ? @player2 : x}
moves += 1
unless check_win == true
end
end
end
end
def win
# 10 turns(count) with no combination is a draw
return [[@a[0]+ @a[1] + @a[2]],[@a[3] + @a[4] + @a[5]],[@a[6] + @a[7] + @a[8]],[@a[0]+ @a[3] + @a[6]],[@a[1] + @a[4] + @a[7]],[@a[2] + @a[5]+ @a[8]],[@a[0]+ @a[4] + @a[8]],[@a[2] + @a[4] + @a[6]]]
#game_over check if player has 3 in a row
end
# Create a loop that gives player turns. Player one then player two until a draw or three in row
def check_win
win.each do |arr|
str = arr.join
if str == "xxx"
puts "X Wins!"
return true
elsif str == "ooo"
puts "O Wins!"
return true
end
end
return false
end
while check_win != true
start
game
player_turns
end
It's not clear why map! would factor in here. A simple array assignment is all that's required:
@a[cell.to_i - 1] = @player2
Specifying the cells as zero indexed would make this easier.
There's a lot of evidence of going against the grain here, which has made for a ton more code than necessary. For example, variables like @player1 and @player2 are usually a sign of bad design. Why not @players = %w[ X O ]? That alone solves a lot of problems if you apply that array throughout your code instead of having per-player variables. Each turn: @players.unshift(@players.pop)
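A short sketch of how those two suggestions could fit together is shown here. It also points at why the `map!` comparison never matched: `gets.chomp` returns a String, while the array holds Integers. The local variables `cells` and `players`, the `place` helper, and the canned inputs are all hypothetical stand-ins (for `@a`, `@players`, and `gets`), and `rotate!` is used in place of `unshift(pop)`, which has the same effect for two players.

```ruby
cells = (1..9).to_a          # stand-in for @a
players = %w[X O]            # players.first is the side to move

# Marks the chosen cell and returns true, or returns false if the input
# is not a number from 1 to 9 or the cell is already taken.
def place(cells, choice, marker)
  index = Integer(choice, exception: false)
  return false unless index && index.between?(1, 9)
  return false unless cells[index - 1].is_a?(Integer)
  cells[index - 1] = marker
  true
end

["5", "1", "5", "9"].each do |input|   # canned input instead of gets.chomp
  if place(cells, input, players.first)
    players.rotate!                    # hand the turn to the other player
  else
    puts "Cell #{input} is not available"
  end
end

cells.each_slice(3) { |row| puts row.join("|") }
```

Building the grid string from the array each time it is printed, rather than caching it in an instance variable, is what makes the updated marks show up.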
