Alpha-beta pruning - how does this code implement resetting the variables alpha and beta? - algorithm

Hello,
I'm trying to understand the alpha beta pruning algorithm using chess as an example from the following code:
def minimax(position, depth):
    """Returns a tuple (score, bestmove) for the position at the given depth"""
    if depth == 0 or position.is_checkmate() or position.is_draw():
        return (position.evaluate(), None)
    else:
        if position.to_move == "white":
            bestscore = -float("inf")
            bestmove = None
            for move in position.legal_moves():
                new_position = position.make_move(move)
                # discard the reply's best move so that `move` is not overwritten
                score, _ = minimax(new_position, depth - 1)
                if score > bestscore: # white maximizes her score
                    bestscore = score
                    bestmove = move
            return (bestscore, bestmove)
        else:
            bestscore = float("inf")
            bestmove = None
            for move in position.legal_moves():
                new_position = position.make_move(move)
                # discard the reply's best move so that `move` is not overwritten
                score, _ = minimax(new_position, depth - 1)
                if score < bestscore: # black minimizes his score
                    bestscore = score
                    bestmove = move
            return (bestscore, bestmove)
Here's the link to the blog I got it from: LINK (you can view the code there if you prefer syntax highlighting).
What I don't understand is that in alpha-beta pruning the values of alpha and beta must sometimes change when you go back up the tree. I attached a picture explaining my problem: I understand steps 1), 2) and 3), but I don't get step 4). I know that step 4) should look like it does in the picture, but I don't know what happens in the code at that step to make the values change.
I followed the code carefully, but for some reason I ended up with a = 5 and b = 5 in step 4), which is ridiculous because it would mean that the branch on the right gets pruned, which is obviously wrong.

I think the reasoning in your comments is not correct. Your comments implicitly assume that the search goes to the right branch of the tree and then back to the left branch, which is of course incorrect.
Your logic is wrong because at the (5) non-leaf node on the left branch of the tree, the search has only visited the nodes underneath it (the leaf nodes (5) and (4)). It has not visited the nodes on the right branch of the tree and therefore has no idea what their values will be. Therefore your comment
"there is a max node (a square) and the choice is made between 5 (on the left) and 4 (on the right). And a MAX node above them wants a bigger value so I think alpha should be set to 5 which is a lower bound." is not correct.
It's wrong because only the root node (the max node) learns the value 4 on the right, and it can only do so AFTER step 4. In fact, that can only happen at the end of the search, after all the nodes in the right branch of the tree have been visited.
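One way to see why alpha and beta seem to "reset" on the way back up is that they are ordinary function parameters: each recursive call works on its own copies, and only the returned score travels back to the parent. The following is a minimal sketch under stated assumptions: it uses a toy tree of nested lists with integer leaves rather than the chess position API from the question, the leaf values are made up and are not those of the picture, and the names alphabeta and trace are hypothetical.

```python
import math

def alphabeta(node, alpha, beta, maximizing, trace, name="root"):
    """Alpha-beta on a toy tree: a leaf is an int, an inner node is a list."""
    if isinstance(node, int):
        return node
    best = -math.inf if maximizing else math.inf
    for i, child in enumerate(node):
        # alpha and beta are passed by value: whatever the child does to its
        # own copies never leaks back into this frame.
        score = alphabeta(child, alpha, beta, not maximizing, trace, f"{name}.{i}")
        if maximizing:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:
            break  # the remaining children cannot change the result
    trace.append((name, alpha, beta))  # the window this node ended up with
    return best

# MAX root with two MIN children; leaf values chosen for illustration only.
tree = [[5, 4], [6, 3]]
trace = []
value = alphabeta(tree, -math.inf, math.inf, True, trace)
print(value)
for entry in trace:
    print(entry)
```

In this run the left MIN node finishes with beta = 4, but that beta is local to its call. Back at the root, beta is still infinity and only alpha is raised to 4, which is then handed down to the right branch.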

Related

Job Interview Question Using Trees, What data to save?

I was solving the following job interview question and solved most of it but failed at the last requirement.
Q: Build a data structure which supports the following functions:
Init - Initialise Empty DS. O(1) Time complexity.
SetPositiveInDay(d,x) - Add to the DS that on day d exactly x new people were infected with covid-19. O(log n) time complexity.
WorseBefore(d) - Among the days inserted into the DS that are smaller than d, return the last one which has more newly infected people than day d. O(log n) time complexity.
For example:
Init()
SetPositiveInDay(1,10)
SetPositiveInDay(2,20)
SetPositiveInDay(3,15)
SetPositiveInDay(5,17)
SetPositiveInDay(23,180)
SetPositiveInDay(8,13)
SetPositiveInDay(13,18)
WorstBefore(13) // Returns day #2
SetPositiveInDay(10,19)
WorstBefore(13) // Returns day #10
Important note: you can't assume that days will be entered in order, nor that there won't be "gaps" between days. (Some days may not be saved in the DS while later ones are.)
What I did?
I used an AVL tree (I could use a 2-3 tree too).
For each node I have:
Sick - Number of newly infected people on that day.
maxLeftSick - Max number of infected people in the left subtree.
maxRightSick - Max number of infected people in the right subtree.
When inserting a new node I made sure that no data gets lost during rotations; plus, for each node on the path from the new one up to the root I did:
But I wasn't successful implementing WorseBefore(d).
Where to search?
First you need to find the node corresponding to d in the tree ordered by days; call it node. Let x = Sick(node). This can be done in O(log n).
If maxLeftSick(node) > x, the solution must be in the left subtree of node. Search for the solution there and return the answer. This can be done in O(log n) - see below.
Otherwise, traverse the tree upwards towards the root, starting from node, until you find the first node nextPredecessor satisfying this property (this takes O(log n)):
nextPredecessor is smaller than node,
and either
1. Sick(nextPredecessor) > x, or
2. maxLeftSick(nextPredecessor) > x.
If no such node exists, we give up. In case 1, just return nextPredecessor since that is the best solution.
In case 2, we know that the solution must be in the left subtree of nextPredecessor, so search there and return the answer. Again, this takes O(log n) - see below.
Note that there is no need to search in the right subtree of nextPredecessor, since every node in that subtree that is smaller than node is either in the left subtree of node itself or was already examined and rejected on the way up.
Note also that it is not necessary to traverse further up the tree than nextPredecessor, since those nodes are even smaller and we are looking for the largest node satisfying all constraints.
How to search?
OK, so how do we search for the solution in a subtree? Finding the largest day within a subtree rooted in q whose infection number is larger than x is simple using the maxLeftSick and maxRightSick information. Check these conditions in order:
If q has a right child and maxRightSick(q) > x, then search in the right subtree of q.
Otherwise, if Sick(q) > x, return Day(q).
Otherwise, if q has a left child and maxLeftSick(q) > x, then search in the left subtree of q.
Otherwise there is no solution within the subtree q.
We are effectively using maxLeftSick and maxRightSick to prune the search tree so that it includes only "worse" nodes, and within that pruned tree we take the rightmost node, i.e. the one with the largest day.
It is easy to see that this algorithm runs in O(log n), where n is the total number of nodes, since the number of steps is bounded by the height of the tree.
Pseudocode
Here is the pseudocode (assuming maxLeftSick and maxRightSick return -1 if no corresponding child node exists):
// Returns the largest day smaller than d such that its
// infection number is larger than the infection number on day d.
// Returns -1 if no such day exists.
int WorstBefore(int d) {
    node = find(d);
    // try to find the solution in the left subtree
    if (maxLeftSick(node) > Sick(node)) {
        return FindLastWorseThan(node -> left, Sick(node));
    }
    // move up towards root until we find the first node
    // that is smaller than `node` and such that
    // Sick(nextPredecessor) > Sick(node) or
    // maxLeftSick(nextPredecessor) > Sick(node).
    nextPredecessor = findNextPredecessor(node);
    if (nextPredecessor == null) return -1;
    // Case 1
    if (Sick(nextPredecessor) > Sick(node)) return Day(nextPredecessor);
    // Case 2: maxLeftSick(nextPredecessor) > Sick(node)
    return FindLastWorseThan(nextPredecessor -> left, Sick(node));
}
// Finds the latest day within the given subtree with root q where
// the infection number is larger than x. Runs in O(log(size(q))).
int FindLastWorseThan(Node q, int x) {
    // Prefer the right subtree (later days), then q itself, then the left subtree.
    if (maxRightSick(q) > x) return FindLastWorseThan(q -> right, x);
    if (Sick(q) > x) return Day(q);
    if (maxLeftSick(q) > x) return FindLastWorseThan(q -> left, x);
    return -1;
}
First of all, your chosen data structure looks fine to me. You did not mention it explicitly, but I assume that the "key" you use in the AVL tree is the day number, i.e. an in-order traversal of the tree would list the nodes in chronological order.
I would just suggest a cosmetic change: store the maximum value of sick in the node itself, so that you don't have two similar pieces of information (maxLeftSick and maxRightSick) in one node instance. Instead, move them to the child nodes, so that your node.maxLeftSick is actually stored in node.left.maxSick, and similarly node.maxRightSick is stored in node.right.maxSick. This is of course not done when that child does not exist, but then we don't need that information either. In your structure maxLeftSick would be 0 when left is not defined. In my proposed structure, you would not have that value: the 0 would follow naturally from the fact that there is no left child. In my proposal, the root node would have a maxSick value which is not present in yours, and which would be the maximum of root.sick, your root.maxLeftSick and your root.maxRightSick. This information would not really be used, but it is there to make the structure consistent throughout the tree.
So you would store just one maxSick per node, which also takes the current node's sick value into account. The processing you do during rotations will need to change accordingly, but will not become more complex.
I will assume that your AVL tree nodes have no parent pointers. So create a find method which returns the path to the node to be found. For instance, in Python syntax, it could look like this:
def find(self, day):
    node = self.root
    path = [] # an array of nodes
    while node:
        path.append(node)
        if node.day == day: # bingo
            return path
        if day < node.day:
            node = node.left
        else:
            node = node.right
Then the worstBefore method could look like this:
def worstBefore(self, day):
    path = self.find(day)
    if not path:
        return # day not found
    # get number of sick people on that day:
    sick = path[-1].sick
    # look for recent day with greater number of sick
    while path:
        node = path.pop() # walk upward, starting with found node
        if node.day > day:
            # a later day: its left subtree may hold later days too, and its
            # earlier days were already covered by the path nodes below it
            continue
        if node.day < day and node.sick > sick:
            return node.day
        if node.left and node.left.maxSick > sick:
            # we will find the result in this subtree
            node = node.left
            while True:
                if node.right and node.right.maxSick > sick:
                    node = node.right
                elif node.sick > sick: # bingo
                    return node.day
                else:
                    node = node.left
So the path returned by the find method will be used to get the parents of a node when you need to backtrack upwards in the tree along that path.
If along that path you find a left child whose maxSick is greater, then you know that the targeted node must be in that subtree. It is then a matter of walking down that subtree in a controlled way, choosing the right child as long as it still has a greater maxSick. Otherwise check the current node's sick value and return that node if its value is greater. Otherwise go left, and repeat.
While there is no such left subtree, go up along the path. If that parent is a match, then return it (make sure to verify the day number). Keep checking for left subtrees that have a larger maxSick.
This runs in O(log n) because you first walk zero or more steps upward and then zero or more steps downward (in a left subtree).
You can see your example scenario run on repl.it. There I focussed on this question, and didn't implement the rotations.
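The walk described above can be tried on the example from the question. The following is a minimal sketch, not the code behind the repl.it link: it assumes a plain unbalanced BST insert instead of AVL rotations, assumes each day is inserted only once, and the names Node, Tree, max_sick, set_positive_in_day and worst_before are hypothetical. It also skips path nodes whose day is later than the queried day, since their left subtree may contain later days.

```python
class Node:
    def __init__(self, day, sick):
        self.day, self.sick = day, sick
        self.left = self.right = None
        self.max_sick = sick  # largest sick count in the subtree rooted here

class Tree:
    def __init__(self):
        self.root = None

    def set_positive_in_day(self, day, sick):
        """Plain BST insert (no rotations); refreshes max_sick along the path."""
        if self.root is None:
            self.root = Node(day, sick)
            return
        node = self.root
        while True:
            node.max_sick = max(node.max_sick, sick)
            side = "left" if day < node.day else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(day, sick))
                return
            node = child

    def worst_before(self, day):
        """Latest day before `day` with more infections, or None."""
        path, node = [], self.root
        while node is not None and node.day != day:
            path.append(node)
            node = node.left if day < node.day else node.right
        if node is None:
            return None  # day was never inserted
        path.append(node)
        sick = node.sick
        while path:
            node = path.pop()  # walk upward, starting at the found node
            if node.day > day:
                continue  # later day; nothing new to find through this node
            if node.day < day and node.sick > sick:
                return node.day
            if node.left and node.left.max_sick > sick:
                node = node.left
                while True:  # rightmost node in this subtree with sick > target
                    if node.right and node.right.max_sick > sick:
                        node = node.right
                    elif node.sick > sick:
                        return node.day
                    else:
                        node = node.left
        return None

tree = Tree()
for d, x in [(1, 10), (2, 20), (3, 15), (5, 17), (23, 180), (8, 13), (13, 18)]:
    tree.set_positive_in_day(d, x)
first = tree.worst_before(13)
tree.set_positive_in_day(10, 19)
second = tree.worst_before(13)
print(first, second)
```

Without rebalancing the tree can degenerate into a list, so the O(log n) bound only holds once the AVL rotations (with max_sick recomputed for the rotated nodes) are added.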

Finding group sizes in matrices

So I was wondering, is there an easy way to find the sizes of groups of adjacent equal values in a matrix? For example, when looking at the matrix of values between 0 and 12 below:
The size of the group at [0,4] is 14 because there are 14 5's connected to each other. But the 1 and 4 are not connected.
I think you can use a breadth-first search (well, kind of; try to visualize the matrix as a tree).
Here's a pseudo-Python implementation that does this. Would this work for you? Did you have a complexity in mind?
Code
visited_nodes = set()

def find_adjacent_vals(target_val, cell_row, cell_column):
    # matrix, inside_matrix and the target_* names are assumed to be defined elsewhere
    cell = (cell_row, cell_column)
    if not inside_matrix(cell_row, cell_column) or cell in visited_nodes:
        return 0
    visited_nodes.add(cell)
    if matrix[cell_row][cell_column] != target_val:
        return 0
    return (1
            + find_adjacent_vals(target_val, cell_row + 1, cell_column)  # below
            + find_adjacent_vals(target_val, cell_row - 1, cell_column)  # above
            + find_adjacent_vals(target_val, cell_row, cell_column - 1)  # left
            + find_adjacent_vals(target_val, cell_row, cell_column + 1)  # right
            )

print("Adjacent values count: " + str(find_adjacent_vals(target_val, target_row, target_column)))
Explanation
Let's say you start at a cell: you branch out, visiting cells you haven't visited before, until you encounter no new cells of the same value. Thanks to the visited set, each cell is reached from only one parent, so no cell is counted twice.
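The recursion above explores depth-first and can hit Python's recursion limit on large groups. For comparison, here is a minimal sketch of the same count done as an iterative breadth-first search with a queue. The function name group_size and the small grid are made up for illustration; the grid is not the matrix from the question.

```python
from collections import deque

def group_size(matrix, row, col):
    """Count the cells 4-connected to (row, col) that hold the same value."""
    target = matrix[row][col]
    rows, cols = len(matrix), len(matrix[0])
    seen = {(row, col)}
    queue = deque([(row, col)])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and matrix[nr][nc] == target):
                seen.add((nr, nc))  # mark on enqueue so a cell is queued once
                queue.append((nr, nc))
    return len(seen)

# Hypothetical 3x4 grid for illustration.
grid = [
    [5, 5, 1, 4],
    [5, 2, 1, 4],
    [5, 5, 5, 1],
]
print(group_size(grid, 0, 0))
```

Each cell is enqueued at most once, so one call costs time proportional to the size of the group it finds.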

How to display Alpha Beta Pruning algorithm result?

Updates
Update 1
I tried this (2nd line): I added changing the node color as the first instruction in the alphabeta function. I am getting this result:
Green nodes are visited nodes. It looks like the algorithm is going through the nodes correctly, right? But how do I output the correct values in the nodes, which I also need to do? That is, the minimum of the children's values or the maximum of the children's values (excluding pruned branches).
Update 2
I tried to output alpha and beta to the tree nodes and didn't get the correct result. This is the code (lines 18 and 31 were added). This is the result of the code:
On this image I marked the strange places:
First arrow: why is the minimum of 7 and 6 equal to 5? Second arrow: why is the maximum of 4, 3 and 2 equal to 5? Strange. That's why I think that it is not working correctly.
Old question
Once upon a time I asked a similar question here. It was like: "why do I get this error?". Let's roll back and create a new one. This question will be: "How to display Alpha Beta Pruning algorithm result?"
I found pseudocode for this algorithm on the wiki. It can be found here.
My implementation is below (it is in JavaScript, but I don't think you have to know JS or Java or C++ etc. to answer this question). The question is how to output the result of this algorithm on the graph (tree structure). At the start I have this tree structure:
NOTE: I have a tree structure (a number of linked nodes) on which I run the alpha-beta pruning algorithm, and I have another tree structure for displaying results (let's call it the "graph"). The nodes of the tree I use to display the graph are connected to the nodes I use to compute the result of the algorithm.
So, the code of the alpha-beta pruning algorithm is below. Can you clarify what I have to output, and where, to display the process/results of the algorithm correctly, please?
My assumption is to output alpha and beta, but I think it is wrong. I tried it, but it doesn't work.
I want to display prunings and fill in all nodes in the tree with correct values.
This is my implementation of minimax with alpha-beta pruning:
function alphabeta(node, depth, alpha, beta, isMax, g) {
    if((depth == 0) || (node.isTerminal == true)) {
        return node.value;
    }
    if(isMax) {
        console.log('maximizing');
        for (var i in node.children) {
            var child = node.children[i];
            console.log(child);
            alpha = Math.max(alpha, alphabeta(child, depth-1, alpha, beta, false, g));
            if(beta <= alpha) {
                console.log('beta '+beta+' alpha '+alpha);
                break;
            }
        }
        return alpha;
    } else {
        console.log('minimizing');
        for (var i in node.children) {
            console.log('1 child');
            var child = node.children[i];
            console.log(child);
            beta = Math.min(beta, alphabeta(child, depth-1, alpha, beta, true, g));
            if (beta <= alpha) {
                console.log('beta '+beta+' alpha '+alpha);
                break;
            }
        }
        return beta;
    }
}
Why don't you just store the nodes that are actually visited, and colour those nodes red? Then you will see which nodes got evaluated compared to the entire tree. E.g.
After a long discussion in the comments, I think I can now shed light on this. As alpha-beta goes around the tree, it works with three values at a given node: the alpha and beta that were carried down to it from its parent node, and the best value it has found so far. If it finds a value outside the alpha-beta window, it immediately prunes, as it knows that this node is not an optimal move, irrespective of its exact value. Thus, for some nodes alpha-beta never works out the "true value" of the node.
Thus, when you asked to display the "result" of alpha-beta, I mistakenly thought that you meant the alpha-beta window, since the "true value" is not necessarily ever evaluated.
You would need to write separate code to print the "true node values". I think that the minimax algorithm will do this for you.
Also, be aware when comparing by hand that if you store the nodes in a set, the iterator is not guaranteed to return them in a predictable order, so if you are using sets rather than lists inside the nodes, you might find it hard to follow by hand. List iterators return elements in insertion order; set iterators have no guaranteed order.
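The function in the question returns alpha or beta themselves, so a node can report a bound inherited from its parent; that may be why "the minimum of 7 and 6" showed up as 5. For display purposes, one option is to track the best child value separately and store it on the node, together with a visited flag. The following is a minimal sketch in Python rather than the question's JavaScript, assuming nodes are plain dicts; the helper names leaf and inner and the leaf values are hypothetical.

```python
import math

def leaf(value):
    return {"value": value, "children": []}

def inner(*children):
    return {"value": None, "children": list(children)}

def alphabeta(node, alpha, beta, is_max):
    """Alpha-beta that annotates every node it touches, for display."""
    node["visited"] = True
    if not node["children"]:
        return node["value"]
    best = -math.inf if is_max else math.inf
    for child in node["children"]:
        score = alphabeta(child, alpha, beta, not is_max)
        if is_max:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:
            break  # children not reached never get a "visited" flag
    # After a cutoff this is only a bound, not the true minimax value.
    node["value"] = best
    return best

# MAX root, two MIN children; values chosen for illustration only.
tree = inner(inner(leaf(3), leaf(5)), inner(leaf(2), leaf(1)))
root_value = alphabeta(tree, -math.inf, math.inf, True)
left, right = tree["children"]
print(root_value, left["value"], right["value"])
print("visited" in right["children"][1])
```

Here the right MIN node is cut off after its first leaf, so it displays 2 even though its true minimax value is 1, and its second leaf stays unmarked, which is how a pruned branch can be recognised when drawing the tree.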

Shortest Path by rolling a dice

I have a difficult problem to solve (at least that's how I see it). I have a die (faces 1 to 6) with different values (other than [1-6]), and a board (n-by-m). I have a starting position and a finish position. I can move from one square to another by rolling the die. When I do this I have to add the top face to the sum/cost.
Now I have to calculate how to get from the start position to the end position with a minimum sum/cost. I have tried almost everything but I can't find the correct algorithm.
I tried Dijkstra but it's useless, because on the right path there are some intermediate nodes that I can reach with a better sum from another path (one that proves to be incorrect in the end). How should I change my algorithm?
Algorithm overview:
Dijkstra with a PriorityQueue
if (I can get to a node with a smaller sum)
    remove it from the queue,
    change its cost and its die position,
    add it back to the queue.
This is the code :
public void updateSums() {
    PriorityQueue<Pair> q = new PriorityQueue<>(1, new PairComparator());
    Help h = new Help();
    q.add(new Pair(startLine, startColumn, sums[startLine][startColumn]));
    while (!q.isEmpty()) {
        Pair current = q.poll();
        ArrayList<Pair> neigh = h.getNeighbours(current, table, lines, columns);
        table[current.line][current.column].visit(); //table ->matrix with Nodes
        for (Pair a : neigh) {
            int alt = sums[current.line][current.column] + table[current.line][current.column].die.roll(a.direction);
            if (sums[a.line][a.column] > alt) {
                q.remove(new Pair(a.line, a.column, sums[a.line][a.column]));
                sums[a.line][a.column] = alt; //sums -> matrix with costs
                table[a.line][a.column].die.setDie(table[current.line][current.column].die, a.direction);
                q.add(new Pair(a.line, a.column, sums[a.line][a.column]));
            }
        }
    }
}
You need to also consider the position of the die in your Dijkstra states.
I.e. you cannot just have sums[lines][column]; you'll have to do something such as sums[lines][column][die_config], where die_config is some way you devise to convert the die position into an integer.
For example, if you have a die that looks like this initially:
^1 <4 v2 >9 f5 b7 (^ = top face, < = left, v = down, > = right, f = front and b = back)
int initial_die[6] = {1,4,2,9,5,7};
You can convert it to an integer by simply considering the index of the face (from 0 to 5) that is pointing up and the one that is pointing left. This means your die has fewer than 36 (see bottom note) possible rotation positions, which you can encode through something such as (0-based) (up*6 + left). By this I mean each face gets an index from 0 through 5 that you decide, regardless of its cost value, so following the example above we would encode the initially top face as index 0, the left face as index 1, and so on.
So the die with config value 30 means that left = 30%6 (=0), i.e. the face that was initially pointing up (initial_die[0]) is currently pointing to the left, and up = (30 - left)/6 (=5), i.e. the face that is currently pointing up is the one that was initially pointing to the back of the die (initial_die[5]). So the die currently has the cost 1 on its left face and the cost 7 on its top face, and you can derive the rest of the die's faces from this information, since you know the initial disposition. (Basically this tells us the die rolled once to its left, followed by once towards its front, in comparison to the initial state.)
With this additional information, your Dijkstra will be able to find the correct answer you seek, by considering the cheapest cost that reaches the final node, as you could have several entries for it with different final die positions.
Note: It doesn't actually have 36 possible positions, because some are impossible; for example, two initially opposite sides can never become adjacent as Up/Left. There are in fact only 24 valid positions, but the simple encoding I used above will actually use indexes up to ~34 depending on how you encode your die.
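The idea of making the die orientation part of the Dijkstra state can be sketched as follows. This is a minimal illustration in Python under several assumptions that are not in the original posts: the die is a tuple (top, bottom, north, south, east, west) of face costs used directly as the state key instead of the up*6 + left integer, the cost of a move is the face that is on top after the roll, all face costs are non-negative, and "front" is taken to mean south. The names roll, MOVES and cheapest_roll are hypothetical.

```python
import heapq

def roll(die, direction):
    """Return the die after one roll; die = (top, bottom, north, south, east, west)."""
    t, b, n, s, e, w = die
    if direction == "N":
        return (s, n, t, b, e, w)
    if direction == "S":
        return (n, s, b, t, e, w)
    if direction == "E":
        return (w, e, n, s, t, b)
    return (e, w, n, s, b, t)  # "W"

MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

def cheapest_roll(rows, cols, start, goal, die):
    """Dijkstra over states (position, die orientation); returns the minimum cost."""
    best = {(start, die): 0}
    heap = [(0, start, die)]
    while heap:
        cost, pos, d = heapq.heappop(heap)
        if pos == goal:
            return cost  # first time the goal is popped, in any orientation
        if cost > best[(pos, d)]:
            continue  # stale queue entry
        for direction, (dr, dc) in MOVES.items():
            r, c = pos[0] + dr, pos[1] + dc
            if not (0 <= r < rows and 0 <= c < cols):
                continue
            nd = roll(d, direction)
            ncost = cost + nd[0]  # assumption: pay the face on top after the roll
            key = ((r, c), nd)
            if ncost < best.get(key, float("inf")):
                best[key] = ncost
                heapq.heappush(heap, (ncost, (r, c), nd))
    return None

# The example die: top 1, bottom 2, back (north) 7, front (south) 5, right 9, left 4.
die = (1, 2, 7, 5, 9, 4)
print(cheapest_roll(1, 3, (0, 0), (0, 2), die))
```

Pushing a fresh entry and skipping stale ones replaces the remove-and-re-add step from the question, and the same square can now hold several costs, one per orientation.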

What is the best algorithm to traverse a graph with negative nodes and looping nodes?

I have a really difficult problem to solve and I'm just wondering what algorithm can be used to find the quickest route. The undirected graph consists of positive and negative adjustments; these adjustments affect a bot or thing which navigates the maze. The problem I have is mazes which contain loops that can be + or -. An example might help:
node A gives 10 points to the object
node B takes 15 from the object
node C gives 20 points to the object
route=""
The starting node is A, and the ending node is C.
Given the graph structure as:
a(+10)-----b(-15)-----c+20
node() means the node loops to itself; - and + are the adjustments.
Nodes with no loops are written like c+20, so node c has a positive adjustment of 20 but has no loop.
If the bot or object has 10 points in its resource then the best path would be:
a > b > c the object would have 25 points when it arrives at c
route="a,b,c"
This is quite easy to implement. The next challenge is knowing how to backtrack to a good node. Let's assume that at each node you can find out any of its neighbouring nodes and their adjustment levels. Here is the next example:
If the bot started with only 5 points then the best path would be
a > a > b > c the bot would have 30 points when arriving at c
route="a,a,b,c"
This was a very simple graph, but when you have many more nodes it becomes very difficult for the bot to know whether to loop at a good node or go from one good node to another, while keeping track of a possible route.
Such a route would be a backtrack queue.
A harder example would result in lots of going back and forth:
bot has 10 points
a(+10)-----b(-5)-----c-30
a > b > a > b > a > b > a > b > a > b > c having 5 pts left.
Another way the bot could do it is:
a > a > a > b > c
This is a more efficient way, but how the heck you can program this is partly my question.
Does anyone know of a good algorithm to solve this? I've already looked into Bellman-Ford and Dijkstra, but these only give a simple path, not a looping one.
Could it be recursive in some way, or some form of heuristics?
Referring to your analogy:
I think I get what you mean; a bit of pseudocode would be clearer. So far route() is:
q.add(v)
best=v
hash visited(v,true)
while(q is not empty)
    q.remove(v)
    for each u of v in G
        if u not visited before
            visited(u,true)
            best=u=>v.dist
        else
            best=v=>u.dist
This is a straightforward dynamic programming problem.
Suppose that for a given length of path, for each node, you want to know the best cost ending at that node, and where that route came from. (The data for that length can be stored in a hash, the route in a linked list.)
Suppose we have this data for n steps. Then for step n+1 we start with a clean slate, take each answer for step n, and move it one node forward. If we land on a node we don't have data for yet, or we beat the best score found so far for it, we update the data for that node with our improved score and add the route (just this node linking back to the previous linked list).
Once we have this for the number of steps you want, find the node with the best existing route, and then you have your score and your route as a linked list.
========
Here is actual code implementing the algorithm:
class Graph:
    def __init__(self, nodes=[]):
        self.nodes = {}
        for node in nodes:
            self.insert(node)

    def insert(self, node):
        self.nodes[ node.name ] = node

    def connect(self, name1, name2):
        node1 = self.nodes[ name1 ]
        node2 = self.nodes[ name2 ]
        node1.neighbors.add(node2)
        node2.neighbors.add(node1)

    def node(self, name):
        return self.nodes[ name ]

class GraphNode:
    def __init__(self, name, score, neighbors=[]):
        self.name = name
        self.score = score
        self.neighbors = set(neighbors)

    def __repr__(self):
        return self.name

def find_path (start_node, start_score, end_node):
    # node -> [best score after the current number of steps, linked list of the route]
    prev_solution = {start_node: [start_score + start_node.score, None]}
    room_to_grow = True
    while end_node not in prev_solution:
        if not room_to_grow:
            # No point looping endlessly...
            return None
        room_to_grow = False
        solution = {}
        for node, info in prev_solution.items():
            score, prev_path = info
            for neighbor in node.neighbors:
                new_score = score + neighbor.score
                if neighbor not in prev_solution:
                    room_to_grow = True
                # only keep moves that leave the bot with a positive score
                if 0 < new_score and (neighbor not in solution or solution[neighbor][0] < new_score):
                    solution[neighbor] = [new_score, [node, prev_path]]
        prev_solution = solution
    path = prev_solution[end_node][1]
    answer = [end_node]
    while path is not None:
        answer.append(path[0])
        path = path[1]
    answer.reverse()
    return answer
And here is a sample of how to use it:
graph = Graph([GraphNode('A', 10), GraphNode('B', -5), GraphNode('C', -30)])
graph.connect('A', 'A')
graph.connect('A', 'B')
graph.connect('B', 'B')
graph.connect('B', 'C')
graph.connect('C', 'C')
print(find_path(graph.node('A'), 10, graph.node('C')))
Note that I explicitly connected each node to itself. Depending on your problem you might want to make that automatic.
(Note, there is one possible infinite loop left. Suppose that the starting node has a score of 0 and there is no way off of it. In that case we'll loop forever. It would take work to add a check for this case.)
I'm a little confused by your description; it seems like you are just looking for a shortest-path algorithm, in which case Google is your friend.
In the example you've given you have -ve adjustments, which should really be +ve costs in the usual parlance of graph traversal. I.e. you want to find a path with the lowest cost, so you want more +ve adjustments.
If your graph has loops that are beneficial to traverse (i.e. decrease cost or increase points through adjustments) then the best path is undefined, because going through the loop one more time will improve your score.
Here's some pseudocode:
steps = []
steps[0] = [None*graph.#nodes]   # best score per node after 0 steps
step = 1
while True:
    steps[step] = [None*graph.#nodes]
    for node in graph:
        for node2 in neighbours(node):   # only follow existing edges
            steps[step][node2.index] = max(steps[step-1][node.index]+node2.cost, steps[step][node2.index])
    if steps[step][lastnode] >= 0:
        break
    step += 1

Resources