Understanding the cut-off condition in the alpha-beta pruning algorithm

I'm having trouble understanding this pseudocode I found for alpha-beta pruning on Wikipedia:
function alphabeta(node, depth, α, β, Player)
    if depth = 0 or node is a terminal node
        return the heuristic value of node
    if Player = MaxPlayer
        for each child of node
            α := max(α, alphabeta(child, depth-1, α, β, not(Player)))
            if β ≤ α
                break (* Beta cut-off *)
        return α
    else
        for each child of node
            β := min(β, alphabeta(child, depth-1, α, β, not(Player)))
            if β ≤ α
                break (* Alpha cut-off *)
        return β
What is confusing me is the if Player = MaxPlayer condition. I understand the idea of recursively calling the function with not(Player) to get the minimum value, which will then recursively call the function with Player, repeating until the depth limit is reached or a goal state has been found. However, I don't understand the
if β ≤ α
break
statement. My understanding is that as soon as a value higher than the minimum value found in the previous call (β) is found, that is the value that is used. But since this is the MAX part of the function, don't we want the HIGHEST value, not just ANY value that is greater than beta?

This is the pruning (trimming) phase of the algorithm, in the MaxPlayer clause (when looking for the maximum value for the player at this node):
Beta is the parameter of the function that acts as the "trimming factor". It represents the minimum score found so far: the parent of the current node, which is a minimizing node, has already found a solution worth beta.
Now, if we continue iterating over the children, we will get something at least as good as the current alpha. Since beta <= alpha, the parent node, which is a minimizing node, will NEVER choose this alpha (or any value greater than it); it will choose a value that is beta or lower, and the current node has no chance of finding such a value, so we can trim the calculation.
Example:
        MIN
       /   \
      /     \
     /       \
    /         \
   5          MAX
             / | \
            /  |  \
           /   |   \
          6    8    4
When evaluating the MAX node, plain min-max would return 8. However, we know that the MIN node is going to compute min(5, MAX(6, 8, 4)). As soon as we read the 6 we know MAX(6, 8, 4) >= 6, so we can return 6 without continuing the computation, because the MIN computation of the upper level will be min(5, 6) = 5, which is the same as the full result min(5, 8) = 5.
This is the intuition for one level, it is of course done recursively to "flow" to all levels with the same idea.
The same idea holds for the trimming condition in the MIN vertex.
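To make the cut-off concrete, here is a minimal Python sketch of the pseudocode above (the tree encoding and the children/heuristic helpers are my own illustration, not part of the original). On the small MIN/MAX example it returns 5 while never evaluating the leaves 8 and 4:

import math

def alphabeta(node, depth, alpha, beta, maximizing, children, heuristic):
    kids = children(node)
    if depth == 0 or not kids:          # depth limit or leaf: static evaluation
        return heuristic(node)
    if maximizing:
        for child in kids:
            alpha = max(alpha, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, heuristic))
            if beta <= alpha:           # beta cut-off: the MIN parent already has beta
                break
        return alpha
    else:
        for child in kids:
            beta = min(beta, alphabeta(child, depth - 1, alpha, beta,
                                       True, children, heuristic))
            if beta <= alpha:           # alpha cut-off
                break
        return beta

# The example tree: a MIN root with children 5 and MAX(6, 8, 4).
tree = ("MIN", [("LEAF", 5), ("MAX", [("LEAF", 6), ("LEAF", 8), ("LEAF", 4)])])
children = lambda n: [] if n[0] == "LEAF" else n[1]
heuristic = lambda n: n[1]

print(alphabeta(tree, 10, -math.inf, math.inf, False, children, heuristic))  # 5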

Related

Algorithm to find the minimum among the distances between pairs of vertices of a tree

A tree is given whose every vertex is initially coloured white.
Now, the vertices are coloured black one by one, and the aim is to find, after colouring every new vertex, the minimum among the distances between all possible pairs of black vertices.
It should be noted that the distance between a pair of vertices is the number of edges on the path between them.
To solve this problem, I've used an array min_dist[] (0-based indexing) such that min_dist[u] is the distance of vertex u from the nearest black vertex. Then, I've used a depth-first search on the graph after colouring every new vertex.
Let G represent the tree and c[] be the array representing the vertices to be coloured black in order.
SOLVE(G, c)
1.  ans = +(infinity)
2.  for (i = 0 to (|G.V| - 1))
3.      min_dist[i] = +(infinity)
4.  for (i = 0 to (|G.V| - 1))
5.      min_dist[c[i]] = 0
6.      DFS(G, ans, min_dist, c[i], -1)
7.      print ans

DFS(G, ans, min_dist, v, parent)
1.  for (each vertex child in G.Adj[v])
2.      if (child == parent)
3.          continue
4.      if (min_dist[child] > min_dist[v] + 1)
5.          min_dist[child] = min_dist[v] + 1
6.          DFS(G, ans, min_dist, child, v);
7.      else if (ans > min_dist[child] + min_dist[v] + 1)
8.          ans = (min_dist[child] + min_dist[v] + 1)
Now, I think that my algorithm is correct, but the official solution to this problem is a slightly modified version of my algorithm where they've added an extra check for termination in the DFS.
DFS(G, ans, min_dist, v, parent)
1.  if (min_dist[v] >= ans)
2.      return
3.  for (each vertex child in G.Adj[v])
    ...
I need help in verifying the correctness of this modified version.
I've taken many examples and in all of those, this modified version produces the correct answers. (https://kushagrj.github.io/Codeforces-Round-847-Div-3-Problem-F/)
So you have the optimization where you don't DFS into a child unless the child's min_dist was updated. Using the same reasoning, we can derive the extra test.
Suppose that the problem had an extra parameter θ, and you were to determine whether ans < θ. By tedious induction, if DFS is called when min_dist[v] ≥ θ−1, then it only writes values ≥ θ to min_dist. The only other reads would be to evaluate < θ, so you could restrict the distance labels to 0...θ, making θ effectively ∞. Then the no-update optimization kicks in.
The other idea is that you can repeatedly lower θ to the current best value of ans, effectively using ans as θ.
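For concreteness, here is a minimal Python sketch of the official variant (the DFS with the extra min_dist[v] >= ans cut), assuming the tree is given as an adjacency list; the function and variable names are my own, not from the problem statement:

import sys
from math import inf

def solve(adj, colour_order):
    # adj: adjacency list of the tree; colour_order: vertices coloured black in order.
    # After each colouring, prints the minimum distance between two black vertices.
    sys.setrecursionlimit(1 << 20)
    min_dist = [inf] * len(adj)       # distance to the nearest black vertex
    ans = inf

    def dfs(v, parent):
        nonlocal ans
        if min_dist[v] >= ans:        # the extra cut: nothing below can improve ans
            return
        for child in adj[v]:
            if child == parent:
                continue
            if min_dist[child] > min_dist[v] + 1:
                min_dist[child] = min_dist[v] + 1
                dfs(child, v)
            else:
                ans = min(ans, min_dist[child] + min_dist[v] + 1)

    for v in colour_order:
        min_dist[v] = 0
        dfs(v, -1)
        print(ans)

# Tiny example: the path 0-1-2-3, colouring 0, then 3, then 2 (prints inf, 3, 1).
solve([[1], [0, 2], [1, 3], [2]], [0, 3, 2])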

Data Structures and Algorithms in C++ 2nd Ed - Goodrich. Page 295 question on the vector-based binary tree structure: worst case for space 2^n - 1

Let me explain as best as I can. This is about a binary tree using a vector.
According to author, the implementation is as follows:
A simple structure for representing a binary tree T is based on a way of numbering
the nodes of T. For every node v of T, let f(v) be the integer defined as follows:
• If v is the root of T, then f(v) = 1
• If v is the left child of node u, then f(v) = 2 f(u)
• If v is the right child of node u, then f(v) = 2 f(u)+ 1
The numbering function f is known as a level numbering of the nodes in a binary
tree T, because it numbers the nodes on each level of T in increasing order from
left to right, although it may skip some numbers (see figures below).
Let n be the number of nodes of T, and let fM be the maximum value of f(v)
over all the nodes of T. The vector S has size N = fM + 1, since the element of S at
index 0 is not associated with any node of T. Also, S will have, in general, a number
of empty elements that do not refer to existing nodes of T. For a tree of height h,
N = O(2^h). In the worst case, this can be as high as 2^n − 1.
Question:
The last statement, the worst case of 2^n - 1, does not seem right. Here n = number of nodes. I think he meant 2^h - 1 instead of 2^n - 1. Using figure a) as an example (where n = 15), 2^n - 1 would mean 2^15 - 1 = 32768 - 1 = 32767. That does not make sense.
Any insight is appreciated.
Thanks.
The worst case is when the tree degenerates into a chain from the root, where each internal node has two children but at least one of them is always a leaf. When this chain has n nodes, the height of the tree is about n/2. The vector must span all the levels and allocate room for complete levels, even though in this degenerate tree there is only one internal node per level. The size S of the vector will still be O(2^h), but since in this degenerate case h is n/2 = O(n), this makes it O(2^n) in the worst case.
The formula 2^n - 1 seems to suggest the author does not have a proper binary tree in mind, and then the above reasoning should be done with a degenerate tree that consists of a single chain where every node has at most one child.
Example of worst case
Here is an example tree (not a proper tree, but the principle for proper trees is similar):
  1
 /
2
 \
  5
   \
    11
So n = 4, and h = 3.
The vector however needs to store all the slots where nodes could have been, so something like this:
          _____ 1 _____
         /             \
      __2__           __ __
     /     \         /     \
    _      _5_      _       _
   / \     / \     / \     / \
             11
...so the vector has a size of 1+2+4+8 = 15. (Even 16 when we account for the unused slot 0 in the vector)
This illustrates that the size S of the vector is always O(2^h). In this worst case (worst with respect to n, not with respect to h), S is O(2^n).
Example n=6
When n=6, we could have this as a best case:
    1
   / \
  2   3
 / \   \
4   5   7
This tree can be represented by a vector of size 8, where the entries at index 0 and index 6 are filled with nulls (unused).
However, for n=6 we could have a worst case ("worst" for the impact on the vector size) when the tree is very unbalanced:
1
 \
  2
   \
    3
     \
      4
       \
        5
         \
          7
Now the tree's height is 5 instead of 2, and the vector needs to put that node 7 in the slot at index 63... S is 64. Remember that the vector spans each complete binary level, which doubles in size at each next level.
So when n is 6, S can be 8, 16, 32, or 64. It depends on the shape of the tree. In each case we have that S = O(2^h). But when we express S in terms of n, then there is variation, and the best case is that S = O(n), while the worst case is S = O(2^n).
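To make the numbers concrete, here is a small Python sketch (my own illustration, not the book's code) that computes the level numbering f(v) for each node and the resulting vector size; the chain-shaped tree reproduces the slot-63 worst case described above:

def level_numbers(tree, f=1, out=None):
    # tree is (label, left_subtree, right_subtree) or None.
    # Returns {label: level-numbering index f(v)}.
    if out is None:
        out = {}
    if tree is None:
        return out
    label, left, right = tree
    out[label] = f
    level_numbers(left, 2 * f, out)        # left child gets 2 f(u)
    level_numbers(right, 2 * f + 1, out)   # right child gets 2 f(u) + 1
    return out

def vector_size(tree):
    # Slot 0 is unused, so the vector needs max f(v) + 1 entries.
    return max(level_numbers(tree).values()) + 1

# Best case for n = 6: the (almost) balanced tree above -> size 8.
balanced = (1, (2, (4, None, None), (5, None, None)),
               (3, None, (7, None, None)))
# Worst case for n = 6: a chain of right children -> the last node lands in slot 63.
chain = (1, None, (2, None, (3, None, (4, None, (5, None, (7, None, None))))))

print(vector_size(balanced))  # 8
print(vector_size(chain))     # 64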

Algorithmic puzzle: Ball Stacking Problem

I am trying to solve this problem: https://www.urionlinejudge.com.br/judge/en/problems/view/1312
The XYZ TV channel is developing a new game show, where a contestant
has to make some choices in order to get a prize. The game consists of
a triangular stack of balls, each of them having an integer value, as
the following example shows.
The contestant must choose which balls he is going to take and his
prize is the sum of the values of those balls. However, the contestant
can take any given ball only if he also takes the balls directly on
top of it. This may require taking additional balls using the same
rule. Notice that the contestant may choose not to take any ball, in
which case the prize is zero.
The TV show director is concerned about
the maximum prize a contestant can make for a given stack. Since he is
your boss and he does not know how to answer this question, he
assigned this task to you.
Input
Each test case is described using several lines. The first line contains an integer N representing the number of rows of the stack (1 ≤ N ≤ 1000). The i-th of the next N lines contains i integers Bij (-10^5 ≤ Bij ≤ 10^5 for 1 ≤ j ≤ i ≤ N); the number Bij is the value of the j-th ball in the i-th row of the stack (the first row is the topmost one, and within each row the first ball is the leftmost one). The last test case is followed by a line containing one zero.
Output
Sample Input | Sample Output
4            | 7
3            | 0
-5 3         | 6
-8 2 -8      |
3 9 -2 7     |
2            |
-2           |
1 -10        |
3            |
1            |
-5 3         |
6 -4 1       |
0            |
I'd love a pointer or two on how to solve this problem.
It seems like it is solvable using a DP approach, but I can't quite formulate the recurrence. The fact that two adjacent balls could have overlapping children is making things a bit difficult.
This is DP, but we're going sideways instead of top-down. Let's tilt the ball stack a little to the left, so we can look at the whole stack as a sequence of columns.
 3  3 -8  7
-5  2 -2
-8  9
 3
From this viewpoint the rule of the game becomes: if we want to take a ball, we also need to take the ball above, and the ball directly to its left.
Now, solving the problem. We'll calculate a quantity S[i, j] for each ball -- this represents the best sum we can achieve if the ball at position [i, j] is taken (the j-th ball from the top of the i-th column), while considering only the first i columns.
I claim that the following recurrence holds (with some sensible initial conditions):
S[i, j] = MAX(S[i-1, j] + C[i, j], S[i, j+1])
where C[i, j] is the sum of the first j balls in the ith column.
Let's break that down a bit. We want to calculate S[i, j].
We have to take the ball at [i, j]. And let's suppose for now that this is the bottom-most ball we take from this column.
This requires all the balls in this column above it to be taken, with the sum (including [i, j] itself) being C[i, j].
It also requires the ball at [i-1, j] to be taken (unless we're at the leftmost column, of course). We know that the best sum from taking this ball is S[i-1, j], by definition.
So the best possible total sum is: S[i-1, j] + C[i, j], or just C[i, j] for the leftmost column.
But we can choose differently and take more balls from this column (if we have more balls). We need to calculate and take the maximum value out of S[i-1, j] + C[i, j], S[i-1, j+1] + C[i, j+1], and so on, all the way down to the bottom of the pile.
With a little induction it's easy to see this is equal to MAX(S[i-1, j] + C[i, j], S[i, j+1]).
The implementation should be obvious now. We process the stack column-by-column, in each column calculate the partial sum C[i, j] from the top down, then work out S[i, j] from bottom up.
Finally, just take the maximum value of S[i, j] we've encountered (or 0) as the answer.
This runs in time linear in the number of balls, i.e. O(N^2) for N rows.
To illustrate, here's (C[i, j], S[i, j]) pairs for the given example.
(  3, 3)  (  3, 7)  ( -8,-1)  (  7, 6)
( -2,-2)  (  5, 7)  (-10,-3)
(-10,-7)  ( 14, 7)
( -7,-7)
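A direct Python translation of this column-by-column DP might look like the sketch below (the function name and the input format are my own choices, not mandated by the problem). On the three sample stacks it prints 7, 0 and 6:

def max_prize(rows):
    # rows[i] is the i-th row of the stack, topmost row first.
    n = len(rows)
    # Tilt left: column c is rows[c][c], rows[c+1][c], ..., rows[n-1][c].
    columns = [[rows[r][c] for r in range(c, n)] for c in range(n)]
    best = 0                # taking no ball at all is always allowed
    prev_s = []             # S[i-1, *] of the previous column
    for i, col in enumerate(columns):
        # C[i, j]: prefix sums down the column.
        prefix, running = [], 0
        for v in col:
            running += v
            prefix.append(running)
        # S[i, j], computed bottom-up.
        s = [0] * len(col)
        for j in reversed(range(len(col))):
            take = prefix[j] + (prev_s[j] if i > 0 else 0)
            s[j] = max(take, s[j + 1]) if j + 1 < len(col) else take
            best = max(best, s[j])
        prev_s = s
    return best

print(max_prize([[3], [-5, 3], [-8, 2, -8], [3, 9, -2, 7]]))  # 7
print(max_prize([[-2], [1, -10]]))                            # 0
print(max_prize([[1], [-5, 3], [6, -4, 1]]))                  # 6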
(Updated with better understanding from Worakarn Isaratham's answer.)
We can have a naive recurrence with an O(N^2) search space (note that there are O(N^2) balls in total, so to do any better we could not even examine all entries) by iterating over the diagonals. Let's say southwest.
          \ jth NW diagonal
           x
          x o
         A o x
        x o x o
       x o x B x          / ...etc
      C o x o x o        / 3rd iteration
     x o D o E F x      / 2nd iteration
    x o x o x o x o    / 1st iteration (ith SW diagonal)
   x o x o x o x G x
      /   /   /    \
Each choice along a southwest diagonal restricts all the remaining choices and sums below a northwest diagonal (e.g., being able to choose E means we've only chosen as far as the FG diagonal on previous iterations, and choosing it would restrict all subsequent choices to below the AE diagonal).
Say we label southwest diagonals as i and our northwest bound as j, and we have a function sum_northwest that runs in O(1) (using prefix sums) and takes as parameters one northwest diagonal and a southwest bound. Then, if f(i, j) represents the optimal choice up to the i-th southwest diagonal with northwest bound j:
f(i, j) = max(
    // Skip choosing from this southwest diagonal
    f(i - 1, j),

    // Choose this northwest diagonal on this southwest diagonal
    sum_northwest(j - 1, i) + f(i - 1, j - 1),

    // Choose an earlier northwest diagonal, but then we are obliged
    // to also include this northwest diagonal
    sum_northwest(j - 1, i) + f(i, j - 1)
)
Time complexity is O(|I| * |J|), assuming we are tabling results.
JavaScript code (not optimised):
function sum_northwest(M, j, i){
  return M[j].slice(0, i + 1)
             .reduce((a, b) => a + b, 0)
}

function f(M, i, j){
  if (i < 0 || j < 1 || i >= M[M.length - j].length)
    return 0

  let this_northwest = sum_northwest(M, M.length - j, i)

  return Math.max(
    f(M, i - 1, j),
    this_northwest + f(M, i - 1, j - 1),
    this_northwest + f(M, i, j - 1)
  )
}

var M = [
  [ 3, 3, -8, 7],
  [-5, 2, -2],
  [-8, 9],
  [ 3]
]
console.log(f(M, 3, 4))

M = [
  [-2, -10],
  [ 1]
]
console.log(f(M, 1, 2))

M = [
  [ 1, 3, 1],
  [-5, -4],
  [ 6]
]
console.log(f(M, 2, 3))
        O
       / \
      /   \
    A/     \B
    / \   / \
   /   \ /   \
  C     M     D
Suppose we take a ball at M; then we also need to take all the balls in the region MAOB. We are left with two triangles, CAM and MBD, and we need to select balls from these two triangles to maximize the score. This is the same problem, but with a smaller input.
So we define the value function over the set of all sub-triangles in the stack.
Let C[i,j,h] = maximum points from a sub-triangle with the top at (i,j) and with height = h (I use right-angled triangles to illustrate here because it is easier to draw)
|\
| \
|  \
|   \
| A  \
| |\  \
| | \  \
| |  \  \
| |D  \E \
| |\  |\  \
| | \ | \  \
| |__\|__\  \
| B   M  C   \
|_____________\
A = (i,j)
B = (i+h, j)
C = (i+h, j+h)
M = (i+h, j+k)
D = (i, j+h-k)
E = (i+k, j+k)
Recursive formula:
C[i,j,h] = max(
    C[i,j,h-1],                                                    // pick no ball at line i+h
    C[i,j+h-k,k] + C[i+k,j+k,h-k] + sum(ADME)  for k from 0 to h   // if we pick a ball at (i+h, j+k)
)
I will show a solution with O(N log N) time complexity and O(N) memory complexity.
First of all, you are correct that we will use a Dynamic Programming approach. The data structure we will use will be a triangle of the same size. Each ball in the new triangle will be the value the contestant gets if they take that ball. We can build it in O(N) time, top to bottom.
Now we need to notice an interesting claim: the ball with the highest value in the new triangle will be taken (unless it is negative, in which case we will not take anything).
Proof (quite straightforward): if it were not taken, then we could take it and necessarily get a higher total value.
Another interesting claim: all the balls below the ball with the highest score in the data structure (that is, the balls that would force taking it if they were taken) will not be taken.
Proof: if one of them were taken, it would mean its value is positive without the ball with the highest value, and that would make its value higher, which is impossible.
Now to the algorithm:
Build the data structure.
Set of all taken items = {}
while there are positive elements in the structure:
    Take the highest one and put it in the set.
    Make all balls below it 0, and put them in the set.
    Make all balls above it negative infinity.
    Rebuild the data structure for the rest.
Return the set
This will always be optimal: we take the ball with the highest value, we will never miss a ball that can help us, and we will never miss other balls since we rebuild the data structure.
Memory complexity is simple: O(N) for the data structure.
Time complexity is trickier: creating the data structure is O(N). Each time through the loop we remove at least half of the elements and never recompute for them, so the number of iterations is logarithmic in N, giving O(N log N).

When to terminate iterative deepening with alpha beta pruning and transposition tables?

How do I know when I can stop increasing the depth for an iterative deepening algorithm with negamax alpha-beta pruning and transposition tables? The following pseudocode is taken from a wiki page:
function negamax(node, depth, α, β, color)
    alphaOrig := α

    // Transposition Table Lookup; node is the lookup key for ttEntry
    ttEntry := TranspositionTableLookup(node)
    if ttEntry is valid and ttEntry.depth ≥ depth
        if ttEntry.Flag = EXACT
            return ttEntry.Value
        else if ttEntry.Flag = LOWERBOUND
            α := max(α, ttEntry.Value)
        else if ttEntry.Flag = UPPERBOUND
            β := min(β, ttEntry.Value)
        endif
        if α ≥ β
            return ttEntry.Value
        endif

    if depth = 0 or node is a terminal node
        return color * the heuristic value of node

    bestValue := -∞
    childNodes := GenerateMoves(node)
    childNodes := OrderMoves(childNodes)
    foreach child in childNodes
        val := -negamax(child, depth - 1, -β, -α, -color)
        bestValue := max(bestValue, val)
        α := max(α, val)
        if α ≥ β
            break

    // Transposition Table Store; node is the lookup key for ttEntry
    ttEntry.Value := bestValue
    if bestValue ≤ alphaOrig
        ttEntry.Flag := UPPERBOUND
    else if bestValue ≥ β
        ttEntry.Flag := LOWERBOUND
    else
        ttEntry.Flag := EXACT
    endif
    ttEntry.depth := depth
    TranspositionTableStore(node, ttEntry)

    return bestValue
And this is the iterative deepening call:
while (depth < ?)
{
    depth++;
    rootNegamaxValue := negamax(rootNode, depth, -∞, +∞, 1)
}
Of course, when I know the total number of moves in a game I could use depth < numberOfMovesLeft as an upper bound. But if this information is not given, when do I know that another call of negamax doesn't give any better result than the previous run? What do I need to change in the algorithm?
The short answer is: when you run out of time (and the transposition tables are irrelevant to the question).
Here I assume that your evaluation function is reasonable (gives a good approximation of the position).
The main idea of combining iterative deepening with alpha-beta is the following: let's assume that you have 15 seconds to come up with the best move. How far can you search? I do not know, and no one else knows. You can try to search to depth = 8 only to find out that the search finished in 1 second (so you wasted the 14 seconds that were still available). With trial and error you find that depth = 10 gives you a result in 13 seconds, so you decide to use it all the time. But now something goes terribly wrong (your alpha-beta did not prune well enough, some of the positions took too long to evaluate) and your result is not ready in 15 seconds. So you either make a random move or lose the game.
To make sure this never happens, it is nice to have a good result ready. So you do the following: get the best result for depth = 1 and store it. Find the best result for depth = 2 and overwrite it. And so on. From time to time check how much time is left, and if it is really close to the time limit, return the best move found so far.
Now you do not need to worry about time: your method always returns the best result found so far. With all these recalculations of different subtrees you only waste roughly half of your resources (that is if you examine the whole tree, but with alpha-beta you most probably do not). The additional advantage is that you can now reorder the moves from best to worst on each depth iteration, which makes pruning more aggressive.
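As a sketch of that driver loop (the function names and the time handling are my own assumptions, not part of the wiki pseudocode): search one ply deeper on every iteration, always keep the last fully completed result, and stop when the clock is about to run out.

import time

def iterative_deepening(root, negamax, max_depth=64, time_limit=15.0):
    # negamax(node, depth) is assumed to return (value, best_move) for the root.
    deadline = time.monotonic() + time_limit
    best_value, best_move = None, None
    for depth in range(1, max_depth + 1):
        # Rough budget check: if less than half the budget remains, the next
        # (roughly exponentially more expensive) iteration will likely not finish.
        if time.monotonic() > deadline - time_limit / 2:
            break
        value, move = negamax(root, depth)
        best_value, best_move = value, move   # overwrite with the deeper result
    return best_value, best_move

A production engine would usually also check the clock inside the search itself and abort the unfinished iteration, but the idea is the same: the move returned is always the one from the deepest completed iteration.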

Trouble applying the Alpha Beta Pruning algorithm to this tree

I am trying to apply the alpha beta pruning algorithm to this given tree.
I am stuck when I hit node C because after expanding all the children of B, I give A >= -4. I then expand C to get I = -3, which IS greater than -4 (-3 >= -4). Do I therefore update A to -3? If so, do I then afterwards prune J and K because -3 >= -3? When I worked through the example, I pruned J, K, M and N. I am really uncertain about this =(
EDIT:
Another question: after exploring B and passing the value of B to A, do we pass this value down to C and thus to I? I saw an example where this was the case. Here it is: http://web.cecs.pdx.edu/~mm/AIFall2011/alphabeta-example.pdf
However, in this example, http://web.cecs.pdx.edu/~mm/AIFall2011/alphabeta-example.pdf, it doesn't seem to pass values down; instead it seems to only propagate values upwards. I am not sure which one is correct or if it makes a difference at all.
After expanding all the children of B, then A has α=-4, β=∞.
When you get to I, then α=-4, β=-3. α < β so J and K are not pruned. They would need to be evaluated to make sure that they're not less than -3, lowering the evaluation of C. The value of A is updated to α=-3, β=∞ after C is expanded. You can't use the updated alpha value of A when evaluating J because it wouldn't have been updated yet.
J and K would be pruned if I was -5 instead. In that case it wouldn't matter what J and K are because we already know the evaluation of C is worse than B because -5 < -4, and J and K can only make that worse.
Each node passes the alpha and beta values to its children. The children will then update their own copies of the alpha or beta value depending on whose turn it is and return the final evaluation of that node. That is then used to update the alpha or beta value of the parent.
See Alpha-Beta pruning for example:
function alphabeta(node, depth, α, β, Player)
    if depth = 0 or node is a terminal node
        return the heuristic value of node
    if Player = MaxPlayer
        for each child of node
            α := max(α, alphabeta(child, depth-1, α, β, not(Player)))
            if β ≤ α
                break // Beta cut-off
        return α
    else
        for each child of node
            β := min(β, alphabeta(child, depth-1, α, β, not(Player)))
            if β ≤ α
                break // Alpha cut-off
        return β

// Initial call
alphabeta(origin, depth, -infinity, +infinity, MaxPlayer)
Whenever I need to refresh my understanding of the algorithm I use this:
http://homepage.ufp.pt/jtorres/ensino/ia/alfabeta.html
You can enter your tree there and step through the algorithm. The values you would want are:
3 3 3 3
-2 -4 3 etc.
I find that deducing the algorithm from an example provides a deeper understanding.
