How to deal with draws by repetition in a transposition table? - algorithm

I'm trying to solve Three Men's Morris. The details of the game don't matter much, other than that it's a game similar to tic-tac-toe, but players may be able to force a win from some positions, or force the game to repeat forever by playing the same moves over and over in other positions. So I want to write a function that tells whether a player can force a win, or force a draw by repetition.
I've tried using simple negamax, which works fine but is way too slow to traverse the game tree with unlimited depth. I want to use transposition tables since the number of possible positions is very low (<6000), but that's where my problem comes from. As soon as I add the transposition table (just a list of all fully searched positions and their values: 0, 1, or -1), the AI starts making weird moves, suddenly claiming it's a draw in positions where I have a forced win.
I think the problem comes from transposition table entries being saved as draws, since it seemed to work when I limited the depth and only saved forced wins, but I'm not sure how to fix the problem and allow for unlimited depth.
Here's the code in case there's an issue with my implementation:
int evaluate(ThreeMensMorris &board){
    //game is won or drawn
    if(board.isGameWon()) return -1;    //current player lost
    if(board.isRepetition()) return 0;  //draw by repetition

    //check if this position is already in the transposition table
    //if so, return its value
    uint32_t pos = board.getPosInt();
    for(int i = 0; i < transIdx; i++)
        if(transList[i] == pos)
            return valueList[i];

    //negamax
    //NOTE: moves are formatted as two numbers, "from" and "to",
    //where "to" is -1 to place a piece for the first time
    //so this nested for loop goes over all possible moves
    int bestValue = -100;
    for(int i = 0; i < 9; i++){
        for(int j = -1; j < 9; j++){
            if(!board.makeMove(i, j)) continue; //illegal move
            int value = -1 * evaluate(board);
            board.unmakeMove(i, j);
            if(value > bestValue) bestValue = value;
        }
    }

    //we have a new position complete with a value, push it to the end of the list
    transList[transIdx] = pos;
    valueList[transIdx] = bestValue;
    transIdx++;
    return bestValue;
}

I suggest you start by looking at transposition tables for chess: https://www.chessprogramming.org/Transposition_Table. You need to give each game state an (almost) unique number, e.g. through Zobrist hashing; maybe this is what you already do in board.getPosInt()?
A possible fault is that you don't consider whose turn it is. Even if the pieces on the board are the same, the position is not the same state if in one case it is player A's turn and in the other player B's. Are there other things to consider in this game? In chess there are things like en passant possibilities that need to be considered, and other special cases, to know whether a position is actually the same and not just the pieces themselves.
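To make the side-to-move point concrete, here is a minimal sketch of Zobrist hashing that folds the player to move into the key. Everything in it (NUM_SQUARES, the board encoding, the seed) is a made-up illustration, not the asker's code:

#include <cstdint>
#include <iostream>
#include <random>

// Hypothetical board dimensions for illustration.
constexpr int NUM_SQUARES = 9;      // Three Men's Morris board
constexpr int NUM_PIECE_TYPES = 2;  // one piece type per player

uint64_t zobristPiece[NUM_SQUARES][NUM_PIECE_TYPES];
uint64_t zobristSideToMove;

// Fill the tables once with random 64-bit numbers.
void initZobrist() {
    std::mt19937_64 rng(12345);
    for (int sq = 0; sq < NUM_SQUARES; sq++)
        for (int pc = 0; pc < NUM_PIECE_TYPES; pc++)
            zobristPiece[sq][pc] = rng();
    zobristSideToMove = rng();
}

// board[sq] is assumed to hold 0 or 1 for a piece, -1 for an empty point.
uint64_t hashPosition(const int board[NUM_SQUARES], bool whiteToMove) {
    uint64_t h = 0;
    for (int sq = 0; sq < NUM_SQUARES; sq++)
        if (board[sq] != -1)
            h ^= zobristPiece[sq][board[sq]];
    if (whiteToMove)
        h ^= zobristSideToMove;  // same pieces, different side to move -> different key
    return h;
}

int main() {
    initZobrist();
    int board[NUM_SQUARES] = {0, -1, 1, -1, -1, -1, -1, -1, -1};
    // Same pieces, different side to move should give different keys; prints 1.
    std::cout << (hashPosition(board, true) != hashPosition(board, false)) << "\n";
}

Whether you use a hash like this or a perfect index (with fewer than 6000 positions, as the question mentions, even a plain index works), the key must change when only the side to move changes.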
Transposition tables are really complex and super hard to debug unfortunately. I hope you get it to work though!

Related

Coding Question for SDE Position - Related to DP

My friend took a test in which he got this question. I tried my best to solve this problem but couldn't get very far.
Can someone provide the approach to solve this question?
Problem statement
Input/Output Test Cases
Let's define a function f(i,j) that gives the maximum value of diamonds taken from the first i boxes (1,2,...,i), given that we picked diamond j from box i.
Then the answer to your problem is max(f(n,j)) for j=1,2,...,bn, where bn is the number of diamonds in box n.
For each box we try to pick one diamond, maximizing the value by combining it with one of the diamonds of a different color from the previous box, so the formula is:
f(i,j) = Vj + max(f(i-1,k)) // diamond k must not have the same color as diamond j
To calculate max(f(i-1,k)) efficiently you can keep track of two values per box: max1, the best value of f(i,·) together with its color, and max2, the best value among diamonds whose color differs from max1's. Then, for a diamond j in the next box, you add max1 if its color differs from diamond j's, and max2 otherwise.
Here is pseudocode:
// V[i][j] = value, C[i][j] = color of diamond j in box i
// max1 = best f for the previous box (color1 = its color)
// max2 = best f for the previous box among diamonds of a color != color1
max1 = 0;  color1 = NONE;   // virtual empty box before box 1
max2 = -1; color2 = NONE;

for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= diamonds_of_box_i; j++) {
        f[i][j] = -1;                       // -1 = "no valid way to pick this diamond"
        if (max1 != -1 && C[i][j] != color1)
            f[i][j] = max1 + V[i][j];
        else if (max2 != -1)
            f[i][j] = max2 + V[i][j];       // color2 != color1, so it differs from C[i][j]
    }

    // recompute max1/max2 for the next box
    max1 = -1; color1 = NONE;
    max2 = -1; color2 = NONE;
    for (int j = 1; j <= diamonds_of_box_i; j++) {
        if (f[i][j] > max1) {
            if (C[i][j] != color1) { max2 = max1; color2 = color1; }
            max1 = f[i][j]; color1 = C[i][j];
        } else if (f[i][j] > max2 && C[i][j] != color1) {
            max2 = f[i][j]; color2 = C[i][j];
        }
    }
}

res = -1;
for (int j = 1; j <= diamonds_of_box_n; j++)
    if (f[n][j] > res)
        res = f[n][j];
print(res);   // res stays -1 if there is no valid way to pick one diamond per box
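For completeness, a compact runnable C++ sketch of the same DP, under the usual reading of the problem (one diamond per box, adjacent picks must differ in color); the input in main is made up for illustration:

#include <iostream>
#include <utility>
#include <vector>
using namespace std;

// Returns the maximum total value, or -1 if no valid selection exists.
long long maxDiamonds(const vector<vector<pair<long long,int>>> &boxes) {
    // best1/best2: best totals so far, guaranteed to come from different colors
    long long best1 = 0, best2 = -1;
    int color1 = -1, color2 = -1;               // -1 means "no color constraint yet"

    for (const auto &box : boxes) {
        long long nbest1 = -1, nbest2 = -1;
        int ncolor1 = -1, ncolor2 = -1;
        for (const auto &[value, color] : box) {
            long long total = -1;
            if (best1 >= 0 && color != color1)      total = best1 + value;
            else if (best2 >= 0 && color != color2) total = best2 + value;
            if (total < 0) continue;                // this diamond cannot be reached
            if (total > nbest1) {
                if (color != ncolor1) { nbest2 = nbest1; ncolor2 = ncolor1; }
                nbest1 = total; ncolor1 = color;
            } else if (total > nbest2 && color != ncolor1) {
                nbest2 = total; ncolor2 = color;
            }
        }
        best1 = nbest1; color1 = ncolor1;
        best2 = nbest2; color2 = ncolor2;
        if (best1 < 0) return -1;                   // no valid pick for this box
    }
    return best1;
}

int main() {
    // Three boxes of {value, color} diamonds, colors encoded as small ints.
    vector<vector<pair<long long,int>>> boxes = {
        {{5, 0}, {3, 1}},
        {{4, 0}, {2, 2}},
        {{6, 2}}
    };
    cout << maxDiamonds(boxes) << "\n";   // expected 3 + 4 + 6 = 13
}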
Your first step is to determine whether a given set of inputs allows any solution at all. One way it would be impossible is if two consecutive boxes each contain only one color, and it's the same color. The problem uses numbers, but I like to think in colors, so say a box is red and the next box is also red. That's no good, the condition fails.
But it gets worse: what if a box is red, the next box is red and orange, and the NEXT box is just orange? Well, if you have to take red in the first box, then you must take orange in the second box. But you also must take orange in the third box!
You start by constructing a matrix of these boxes, keeping track of which gems are inside them and which of those gems are still eligible to be chosen. Iterating over the matrix, you take each box with only one color and exclude that color from the boxes on either side. Then iterate again, since some boxes will now have only one color (like the red/orange box in the previous example). If you end up with a box that has no valid choices, you're done. There may be some edge case I'm missing, but hopefully you get the idea.
This concept is actually very similar to how sudoku solvers work - consider looking at those for inspiration.
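A minimal sketch of that elimination pass, assuming a simplified encoding where each box is just the set of colors still eligible in it (values are ignored, this only checks feasibility):

#include <iostream>
#include <set>
#include <vector>
using namespace std;

// Returns false if some box ends up with no eligible color.
// boxes[i] is the set of colors still eligible in box i (a hypothetical encoding).
bool feasible(vector<set<int>> boxes) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (size_t i = 0; i < boxes.size(); i++) {
            if (boxes[i].empty()) return false;     // dead end
            if (boxes[i].size() != 1) continue;     // only forced boxes propagate
            int forced = *boxes[i].begin();
            // A forced color cannot be picked by either neighbour.
            if (i > 0 && boxes[i - 1].erase(forced)) changed = true;
            if (i + 1 < boxes.size() && boxes[i + 1].erase(forced)) changed = true;
        }
    }
    for (const auto &b : boxes) if (b.empty()) return false;
    return true;
}

int main() {
    // red = 0, orange = 1: the red / red+orange / orange example from above
    vector<set<int>> boxes = {{0}, {0, 1}, {1}};
    cout << (feasible(boxes) ? "feasible" : "not feasible") << "\n";  // not feasible
}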
If every box has a valid choice, then it's time to optimize for value. I would start by identifying the largest-value diamond in each box. Obviously you want to pick it as often as possible. A naive algorithm might just iterate over the boxes in a single pass, picking the highest gem, or the second-highest if the highest has the same color as the gem taken from the previous box. A more advanced algorithm might look ahead at the next few boxes (or the entire rest of the set?) to determine the best combination. There's probably some more advanced way to handle it that I'm not seeing, but again, this should point an interested party in the right direction.

Changing dead cells to alive with rand

void initArea(Area* a, unsigned int n)
{
    unsigned int living_cells, max_living_cells, y, x;
    living_cells = 0;
    max_living_cells = n;
    srand(time(NULL));
    while (living_cells <= max_living_cells)
    {
        x = (rand() % (a->xsize));
        y = (rand() % (a->ysize));
        a->cells[y][x] = ALIVE;
        living_cells++;
    }
}
I'm trying to make some of my dead cells alive with rand(), but when I have to make, for example, 50 cells alive, this code always gives a little bit less. Why?
Your problem
Your code selects a random cell at each iteration, but you don't check whether that cell is already alive. So from time to time you create a new cell on top of an existing one and still count it.
Solution
You should only create a new cell if there is no living cell at the chosen position, like this:
if (a->cells[y][x] != ALIVE)
{
    a->cells[y][x] = ALIVE;
    living_cells++;
}
As HolyBlackCow points out, you can write to a cell more than once because rand may return the same random value more than once. Change your loop to:
while (living_cells <= max_living_cells) {
    x = (rand() % (a->xsize));
    y = (rand() % (a->ysize));
    if (a->cells[y][x] != ALIVE) {
        a->cells[y][x] = ALIVE;
        living_cells++;
    }
}
Simply doing this would solve the issue to some extent, but it is not an ideal, performance-centric solution (because it loops until it gets the desired number of alive cells):
if (a->cells[y][x] != ALIVE) {
    living_cells++;
    a->cells[y][x] = ALIVE;
}
This makes sure that you increment the counter only when a new position is made alive.
What is the better solution? You can take a single array with indices 0..24 for a 5x5 matrix and run a Fisher-Yates shuffle over it. That gives you a random permutation; you then take the first indices from the shuffled array and make those cells alive, mapping each index to a row and column via division and modulo by the row length. (Yes, it requires more space than the current approach; for a large N you can look for a solution that considers only the locations of dead cells.)
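A minimal sketch of that idea in C++, using std::shuffle (which is a Fisher-Yates shuffle) on a plain grid instead of the question's Area struct:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <random>
#include <vector>

int main() {
    const int width = 5, height = 5;
    const int wanted_alive = 12;                 // number of cells to turn alive

    // One index per cell: 0..24 for a 5x5 grid.
    std::vector<int> indices(width * height);
    std::iota(indices.begin(), indices.end(), 0);

    // Fisher-Yates shuffle (std::shuffle is exactly that).
    std::mt19937 rng(std::random_device{}());
    std::shuffle(indices.begin(), indices.end(), rng);

    // The first `wanted_alive` indices are all distinct, so no cell is picked twice.
    std::vector<std::vector<char>> cells(height, std::vector<char>(width, 0));
    for (int k = 0; k < wanted_alive; k++) {
        int idx = indices[k];
        cells[idx / width][idx % width] = 1;     // row = idx / width, column = idx % width
    }

    for (const auto &row : cells) {
        for (char c : row) std::cout << (c ? '#' : '.');
        std::cout << '\n';
    }
}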

stuck on minimax algorithm

I'm attempting my first chess engine. Anyone who is familiar with the subject will know a thing or two about the minimax algorithm. I need to generate every possible move on the board for every piece. I've looked up a number of examples but can't get mine to work, and I don't know what I'm doing wrong. For now I'm just focused on generating every possible move, to a certain depth, to get the leaf nodes.
The problem I'm facing is that my current implementation makes one move for black and then continuous moves for white without ever letting black move again, when it should be alternating back and forth. I use direct recursion and loop through the available moves. I expect the function to start from the top every time, but the recursion isn't working the way I thought it would: the loop keeps being iterated through without starting from the top of the function, and I don't know why. This means that getAvailableMoves(maximizer) isn't being called every time like it should be (I think).
If anyone could point out what I'm doing wrong it would be appreciated.
public int miniMax(int depth, boolean maximizer)
{
    if(depth == 0) { return 1234; }
    int countMoves = 0;
    Map<ChessPiece, Position> availableMoves = getAvailableMoves(maximizer);
    int bestMove = 0;
    for(Map.Entry<ChessPiece, Position> entry : availableMoves.entrySet())
    {
        ChessPiece piece = entry.getKey();
        Position pos = entry.getValue();
        piece.move(board, pos.getX(), pos.getY());
        maximizer = !maximizer;
        miniMax(depth-1, maximizer);
    }
    return 1234;
}
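For reference, the usual minimax shape is: make a move, recurse with the side to move flipped (without mutating the flag your own loop still relies on), undo the move, then fold the returned score into the best value. Below is a toy, self-contained C++ sketch on a made-up "take 1 or 2 sticks" game rather than the asker's chess classes, just to show that pattern:

// Illustrative only: minimax on a toy "take 1 or 2 sticks, taking the last
// stick wins" game, showing make move -> recurse with flipped side -> undo.
#include <algorithm>
#include <iostream>

int miniMax(int &sticks, int depth, bool maximizer) {
    if (sticks == 0)
        // The previous player took the last stick and won, so the side to
        // move now has lost: bad for the maximizer, good for the minimizer.
        return maximizer ? -1 : +1;
    if (depth == 0)
        return 0; // unresolved within the search horizon

    int best = maximizer ? -2 : +2;
    for (int take = 1; take <= std::min(2, sticks); take++) {
        sticks -= take;                                      // make the move
        int score = miniMax(sticks, depth - 1, !maximizer);  // side flips only for the child call
        sticks += take;                                      // undo before trying the next move
        best = maximizer ? std::max(best, score) : std::min(best, score);
    }
    return best;
}

int main() {
    int sticks = 7;
    std::cout << miniMax(sticks, 20, true) << "\n"; // 1: the first player can force a win
}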

Even distribution of random points in 2D

I'm trying to build a simple 'crowd' model and need to distribute random points within a 2D area. This semi-pseudo code is my best attempt, but I can see big issues even before I run it: for dense crowds, the chance of a new point being too close to an existing one gets very high very quickly, making it inefficient and prone to fail unless the values are fine-tuned. There are probably issues with signed values too, but I'm leaving that out for simplicity.
int numPoints = 100;
int x[numPoints];
int y[numPoints];
int testX, testY;
tooCloseRadius = 20;
maxPointChecks = 100;

for (int newPoint = 0; newPoint < numPoints; newPoint++){
    // Keep checking random points until one is found with no other points in
    // close proximity, or maxPointChecks is reached.
    pointCheckCount = 0;
    while (pointCheckCount < maxPointChecks){
        // Make a new random point and check against all previous points
        testX = random(1000);
        testY = random(1000);
        tooClose = false;
        for (testPoint = 0; testPoint < newPoint; testPoint++){
            if (isTooClose(x[testPoint], y[testPoint], testX, testY, tooCloseRadius)){
                tooClose = true;
                break; // (exit for loop)
            }
        }
        if (tooClose == false){
            // Yay, found a point with some space!
            x[newPoint] = testX;
            y[newPoint] = testY;
            break; // (exit while loop)
        }
        // Too close to one of the points, start over.
        pointCheckCount++;
    }
    if (tooClose){
        // maxPointChecks reached without finding a point that has some space.
        // FAILURE DEPARTMENT
    } else {
        // SUCCESS
    }
}

// Simple check whether a point lies within a circle.
(bool) isTooClose(centerX, centerY, testX, testY, testRadius){
    return (testX - centerX)^2 + (testY - centerY)^2 < testRadius^2
}
After googling the subject, I believe what I've done is called Rejection Sampling (?), and the Adaptive Rejection Sampling could be a better approach, but the math is far too complex.
Are there any elegant methods for achieving this that don't require a degree in statistics?
For the problem you are proposing the best way to generate random samples is to use Poisson Disk Sampling.
https://www.jasondavies.com/poisson-disc
Now, if you want to sample random points in a rectangle the simple way: sample two values per point, each from 0 up to the length of the largest dimension, and if the value meant for the smaller dimension is larger than that dimension, throw the pair away and try again.
Pseudo code:
while (need more points)
begin
    range = max(rect_width, rect_height);
    x = uniform_random(0, range);
    y = uniform_random(0, range);
    if (x > rect_width) or (y > rect_height)
        continue;
    else
        insert point(x,y) into point_list;
end
The reason you sample up to the larger of the two lengths is to make the uniform selection criteria equivalent when the lengths are different.
For example, assume one side is of length K and the other of length 10K, and assume the numeric type used has a resolution of 1/1000 of K. Then for the shorter side there are only 1000 possible values, whereas for the longer side there are 10000 values to choose from. A probability of 1/1000 is not the same as 1/10000. Simply put, the coordinate values for the short side would have a 10x greater probability of occurring than those of the longer side, which means the sampling would not be truly uniform.
Pseudo code for the scenario where you want to ensure that the point generated is not closer than some distance to any already generated point:
while (need more points)
begin
    range = max(rect_width, rect_height);
    x = uniform_random(0, range);
    y = uniform_random(0, range);
    if (x > rect_width) or (y > rect_height)
        continue;
    new_point = point(x,y);
    too_close = false;
    for (p : all points)
    begin
        if (distance(p, new_point) < minimum_distance)
        begin
            too_close = true;
            break;
        end
    end
    if (too_close)
        continue;
    insert point(x,y) into point_list;
end
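A self-contained C++ version of that second pseudocode loop, with made-up rectangle dimensions and an attempt cap so it cannot spin forever, might look like this:

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

struct Point { double x, y; };

int main() {
    const double rectWidth = 1000.0, rectHeight = 600.0;   // made-up dimensions
    const double minDistance = 20.0;
    const int wantedPoints = 100;
    const long maxAttempts = 1000000;        // give up instead of looping forever

    std::mt19937 rng(std::random_device{}());
    const double range = std::max(rectWidth, rectHeight);
    std::uniform_real_distribution<double> coord(0.0, range);

    std::vector<Point> points;
    long attempts = 0;
    while ((int)points.size() < wantedPoints && attempts++ < maxAttempts) {
        double x = coord(rng), y = coord(rng);
        if (x > rectWidth || y > rectHeight)
            continue;                        // outside the rectangle, try again
        bool tooClose = false;
        for (const Point &p : points) {
            double dx = p.x - x, dy = p.y - y;
            if (dx * dx + dy * dy < minDistance * minDistance) {  // squared distance, no sqrt needed
                tooClose = true;
                break;
            }
        }
        if (!tooClose)
            points.push_back({x, y});
    }
    std::cout << "placed " << points.size() << " of " << wantedPoints << " points\n";
}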
While the Poisson disk solution is usually fine and dandy, I would like to point out an alternative using quasi-random numbers. For quasi-random Sobol sequences there is a statement which says that there is a minimum positive distance between points, which amounts to 0.5*sqrt(d)/N, where d is the dimension of the problem (2 in your case) and N is the number of points sampled in the hypercube. Paper from the man himself: http://www.sciencedirect.com/science/article/pii/S0378475406002382.
Why did I think it should be Python? Sorry, my bad. For C-like languages it is best to call GSL; the function name is gsl_qrng_sobol. An example of using it at d=2 is linked here.
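Since the linked example is not reproduced here, this is a minimal sketch of the GSL quasi-random generator API at d=2; scaling each Sobol coordinate to a rectangle is my own addition (link with -lgsl -lgslcblas):

#include <cstdio>
#include <gsl/gsl_qrng.h>

int main() {
    const double rect_width = 1000.0, rect_height = 600.0;
    gsl_qrng *q = gsl_qrng_alloc(gsl_qrng_sobol, 2);   // 2-dimensional Sobol sequence

    for (int i = 0; i < 100; i++) {
        double v[2];
        gsl_qrng_get(q, v);                 // v[0], v[1] fall in [0,1)
        std::printf("%f %f\n", v[0] * rect_width, v[1] * rect_height);
    }

    gsl_qrng_free(q);
}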

Understanding solution to finding optimal strategy for game involving picking pots of gold

I am having trouble understanding the reasoning behind the solution to this question on CareerCup.
Pots of gold game: Two players A & B. There are pots of gold arranged
in a line, each containing some gold coins (the players can see how
many coins are there in each gold pot - perfect information). They get
alternating turns in which the player can pick a pot from one of the
ends of the line. The winner is the player which has a higher number
of coins at the end. The objective is to "maximize" the number of
coins collected by A, assuming B also plays optimally. A starts the
game.
The idea is to find an optimal strategy that makes A win knowing that
B is playing optimally as well. How would you do that?
At the end I was asked to code this strategy!
This was a question from a Google interview.
The proposed solution is:
function max_coin( int *coin, int start, int end ):
    if start > end:
        return 0

    // I DON'T UNDERSTAND THESE NEXT TWO LINES
    int a = coin[start] + min(max_coin(coin, start+2, end), max_coin(coin, start+1, end-1))
    int b = coin[end]   + min(max_coin(coin, start+1, end-1), max_coin(coin, start, end-2))

    return max(a, b)
There are two specific sections I don't understand:
In the first line why do we use the ranges [start + 2, end] and [start + 1, end - 1]? It's always leaving out one coin jar. Shouldn't it be [start + 1, end] because we took the starting coin jar out?
In the first line, why do we take the minimum of the two results and not the maximum?
Because I'm confused about why the two lines take the minimum and why we choose those specific ranges, I'm not really sure what a and b actually represent.
First of all, a and b represent respectively the maximum gain if start (respectively end) is played.
So let's explain this line:
int a = coin[start] + min(max_coin(coin, start+2, end), max_coin(coin, start+1, end-1))
If I play start, I immediately gain coin[start]. The other player now has to play between start+1 and end. He plays to maximize his gain; however, since the total number of coins is fixed, this amounts to minimizing mine. Note that
if he plays start+1, I'll gain max_coin(coin, start+2, end)
if he plays end, I'll gain max_coin(coin, start+1, end-1)
Since he tries to minimize my gain, I'll gain the minimum of those two.
The same reasoning applies to the other line, where I play end.
Note: this is a bad recursive implementation. First of all, max_coin(coin, start+1, end-1) is computed twice. Even if you fix that, you'll end up computing the shorter cases many times over. This is very similar to what happens if you try to compute Fibonacci numbers using recursion. It would be better to use memoization or dynamic programming; a sketch follows.
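A minimal memoized C++ sketch of the same recurrence (the table layout and the example in main are illustrative):

#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

vector<int> coin;                 // the pots
vector<vector<int>> memo;         // memo[start][end], -1 = not computed yet

int maxCoin(int start, int end) {
    if (start > end) return 0;
    int &cached = memo[start][end];
    if (cached != -1) return cached;

    // The opponent replies optimally, so we are left with the minimum of what remains.
    int pickStart = coin[start] + min(maxCoin(start + 2, end), maxCoin(start + 1, end - 1));
    int pickEnd   = coin[end]   + min(maxCoin(start + 1, end - 1), maxCoin(start, end - 2));
    return cached = max(pickStart, pickEnd);
}

int main() {
    coin = {5, 3, 7, 10};
    int n = coin.size();
    memo.assign(n, vector<int>(n, -1));
    cout << maxCoin(0, n - 1) << "\n";   // 15: A takes 10, B takes 7, A takes 5, B takes 3
}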
a and b here represent the maximum A can get by picking the starting pot or the ending pot, respectively.
We're actually trying to maximize A-B, but since B = TotalGold - A, we're trying to maximize 2A - TotalGold, and since TotalGold is constant, we're trying to maximize 2A, which is the same as maximizing A, so we can completely ignore the values of B's picks and just work with A.
The updated parameters in the recursive calls account for B picking as well: coin[start] represents A picking the start, then B picks the next one from the start, so it's start+2. In the other call, B picks from the end, so it's start+1 and end-1. Similarly for the rest.
We take the min because B will try to maximize its own profit, so it will pick the choice that minimizes A's profit.
But actually I'd say this solution is lacking a bit, in the sense that it just returns a single value, not 'an optimal strategy', which in my mind would be a sequence of moves. It also doesn't take into account the possibility that A can't win, in which case one might want to output a message saying it's not possible; but this is really something to clarify with the interviewer.
Let me answer your points in reverse order; somehow it seems to make more sense that way.
3 - a and b represent the amount of coins the first player will get when he/she chooses the first or the last pot, respectively
2 - we take the minimum because it is the second player's choice: he/she will act to minimise the amount of coins the first player gets
1 - the first line presents the scenario: if the first player has taken the first pot, what will the second player do? If he/she also takes the first pot, it leaves (start+2, end). If he/she takes the last pot, it leaves (start+1, end-1)
Assume what you gain on your turn is x and what you get in all subsequent turns is y. Both a and b represent x+y, where a assumes you take your next pot from the front (x=coin[start]) and b assumes you take your next pot from the back (x=coin[end]).
Now, how do you compute y?
After your choice, the opponent will use the same optimal strategy (hence the recursive calls) to maximise his profit, and you will be left with the smaller profit for that turn. This is why y = min(best_strategy_front(), best_strategy_end()): your value is the smaller of the two choices that are left, because the opponent will take the bigger one.
The indexing simply describes the remaining sequence, minus one pot at the front or at the back, after you have made your choice.
A penny from my end too. I have explained the steps in detail in the comments.
public class Problem08 {

    static int dp[][];

    public static int optimalGameStrategy(int arr[], int i, int j) {
        // If there is a single element, choose that.
        if (i == j) return arr[i];
        // If there are only two elements, choose the max.
        if (i + 1 == j) return Math.max(arr[i], arr[j]);
        // If the result is already computed, return it.
        if (dp[i][j] != -1) return dp[i][j];

        /**
         * If I choose i, then the range shrinks to i+1..j.
         * The next move is the opponent's, and whatever he chooses, I want the
         * result to be minimal. If he chooses j, the range shrinks to i+1..j-1,
         * but if he also chooses i, it shrinks to i+2..j. Whatever he chooses,
         * I take the minimum of his two options.
         *
         * The same reasoning applies to the case where I choose j.
         *
         * Finally I take the maximum of my two cases. :)
         */
        int iChooseI = arr[i] + Math.min(optimalGameStrategy(arr, i + 1, j - 1),
                                         optimalGameStrategy(arr, i + 2, j));
        int iChooseJ = arr[j] + Math.min(optimalGameStrategy(arr, i + 1, j - 1),
                                         optimalGameStrategy(arr, i, j - 2));
        int res = Math.max(iChooseI, iChooseJ);
        dp[i][j] = res;
        return res;
    }

    public static void main(String[] args) {
        int[] arr = new int[]{5, 3, 7, 10};
        dp = new int[arr.length][arr.length];
        for (int i = 0; i < arr.length; i++) {
            for (int j = 0; j < arr.length; j++) {
                dp[i][j] = -1;
            }
        }
        System.out.println(" Nas: " + optimalGameStrategy(arr, 0, arr.length - 1));
    }
}
