Game Playing Using MiniMax with Alpha-Beta Cutoff
ALPHA-BETA cutoff is a method for reducing the number of nodes explored in the Minimax strategy. For the nodes it explores it computes, in addition to the score, an alpha value and a beta value.
ALPHA value of a node
- It is a value never greater than the true score of this node.
- Initially it is the score of that node, if the node is a leaf, otherwise it is -infinity.
- Then at a MAX node it is set to the largest of the scores of its successors explored up to now, and at a MIN node to the alpha value of its predecessor.
BETA value of a node
- It is a value never smaller than the true score of this node.
- Initially it is the score of that node, if the node is a leaf, otherwise it is +infinity.
- Then at a MIN node it is set to the smallest of the scores of its successors explored up to now, and at a MAX node to the beta value of its predecessor.
It Is Guaranteed That:
- The score of a node will always be no less than the alpha value and no greater than the beta value of that node.
- As the algorithm evolves, the alpha and beta values of a node may change, but the alpha value will never decrease, and the beta value will never increase.
- When the exploration of a node is complete, its score is set to the alpha value of that node, if it is a MAX node, otherwise it is set to the beta value.
Function MINIMAX-AB
function MINIMAX-AB(N, A, B) is    ;; Here A is always less than B
begin
   if N is a leaf then
      return the estimated score of this leaf
   else
      Set Alpha value of N to -infinity and Beta value of N to +infinity;
      if N is a Min node then
         For each successor Ni of N loop
            Let Val be MINIMAX-AB(Ni, A, Min(B, Beta value of N));
            Set Beta value of N to Min(Beta value of N, Val);
            When A >= Beta value of N then Return Beta value of N
         endloop;
         Return Beta value of N;
      else
         For each successor Ni of N loop
            Let Val be MINIMAX-AB(Ni, Max(A, Alpha value of N), B);
            Set Alpha value of N to Max(Alpha value of N, Val);
            When Alpha value of N >= B then Return Alpha value of N
         endloop;
         Return Alpha value of N;
end MINIMAX-AB;
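The pseudocode above translates to a short Python sketch. The tree representation (nested lists of successors, with plain numbers as leaf scores) is an assumption made for illustration, not something specified in the notes:

```python
import math

def minimax_ab(node, a, b, maximizing):
    """Alpha-beta minimax: `node` is a number (leaf score) or a list of
    successors; `a` and `b` are the bounds inherited from the predecessor."""
    if isinstance(node, (int, float)):
        return node                          # leaf: its estimated score
    if maximizing:                           # MAX node
        alpha = -math.inf
        for child in node:
            val = minimax_ab(child, max(a, alpha), b, False)
            alpha = max(alpha, val)
            if alpha >= b:                   # cutoff: MIN above rules this out
                break
        return alpha
    else:                                    # MIN node
        beta = math.inf
        for child in node:
            val = minimax_ab(child, a, min(b, beta), True)
            beta = min(beta, val)
            if a >= beta:                    # cutoff: MAX above rules this out
                break
        return beta

# start the game: call with the root, alpha = -infinity, beta = +infinity
print(minimax_ab([[3, 5], [2, 9]], -math.inf, math.inf, True))   # prints 3
```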
Starting the Game
- At the start of the game, the MINIMAX-AB function is called with the following parameters:
- the root of the game tree
- -infinity (-I) as alpha value
- +infinity (+I) as beta value
An Example of MiniMax with Alpha Beta Cutoff
In the game tree that you see, the root is a Max node. The scores of the leaf nodes are presented immediately below them.
Trace of the Execution
Here is a trace of the execution of the Minimax strategy with Alpha Beta Cutoff
NODE  TYPE  A    B    ALPHA  BETA  SCORE
A     Max   -I   +I   -I     +I
B     Min   -I   +I   -I     +I
C     Max   -I   +I   -I     +I
D     Min   -I   +I   -I     +I
E     Max   -I   +I                10
D     Min   -I   +I   -I     10
F     Max   -I   10                11
D     Min   -I   +I   -I     10    10
C     Max   -I   +I   10     +I
G     Min   10   +I   -I     +I
H     Max   10   +I                9
G     Min   10   +I   -I     9     9
C     Max   -I   +I   10     +I    10
B     Min   -I   +I   -I     10
J     Max   -I   10   -I     +I
K     Min   -I   10   -I     +I
L     Max   -I   10                14
K     Min   -I   10   -I     14
M     Max   -I   10                15
K     Min   -I   10   -I     14    14
J     Max   -I   10   14     +I    14
B     Min   -I   +I   -I     10    10
A     Max   -I   +I   10     +I
Q     Min   10   +I   -I     +I
R     Max   10   +I   -I     +I
S     Min   10   +I   -I     +I
T     Max   10   +I                15
S     Min   10   +I   -I     15
V     Max   10   +I                2
S     Min   10   +I   -I     2     2
R     Max   10   +I   2      +I
Y     Min   10   +I   -I     +I
W     Max   10   +I                4
Y     Min   10   +I   -I     4     4
R     Max   10   +I   4      +I    4
Q     Min   10   +I   -I     4     4
A     Max   -I   +I   10     +I    10
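The final score of 10 at the root A can be checked programmatically. The tree shape below is inferred from the trace (successors never visited because of a cutoff are omitted), and the alpha-beta routine is a generic sketch, not code from the notes:

```python
import math

def ab(node, a, b, maximizing):
    # node is a number (leaf score) or a list of successor nodes
    if isinstance(node, (int, float)):
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        val = ab(child, a, b, not maximizing)
        if maximizing:
            best = max(best, val); a = max(a, best)
        else:
            best = min(best, val); b = min(b, best)
        if a >= b:
            break                            # alpha-beta cutoff
    return best

# Tree implied by the trace (leaf scores: E=10, F=11, H=9, L=14, M=15,
# T=15, V=2, W=4)
D = [10, 11]; G = [9]
C = [D, G]
K = [14, 15]; J = [K]
B = [C, J]
S = [15, 2]; Y = [4]
R = [S, Y]; Q = [R]
A = [B, Q]
print(ab(A, -math.inf, math.inf, True))      # prints 10, matching the trace
```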
Efficiency of the Alpha-Beta Procedure
- Alpha-Beta is guaranteed to compute the same minimax value for the root node as computed by Minimax
- In the worst case Alpha-Beta does NO pruning, examining b^d leaf nodes, where each node has b children and a d-ply search is performed
- The efficiency of the Alpha-Beta procedure depends on the order in which successors of a node are examined.
- If we were lucky, at a MIN node we would always consider the nodes in order from low to high score and at a MAX node the nodes in order from high to low score.
- For example, a "lucky" complete binary game tree whose leaf scores, from left to right, are 4, 3, 8, 7, 2, 1, 6, 5 has this property: of its 8 leaves only 5 will be examined.
- In the best case, Alpha-Beta will examine only about 2b^(d/2) leaf nodes.
- In general it can be shown that in the most favorable circumstances the alpha-beta search opens as many leaves as minimax on a game tree with double its depth.
- In the chess program Deep Blue, they found empirically that Alpha-Beta pruning meant that the average branching factor at each node was about 6 instead of about 35-40
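The savings on the "lucky" tree above can be verified by counting the leaves the search actually scores; the instrumented alpha-beta below is a sketch under the same nested-list tree assumption used earlier:

```python
import math

evaluated = []                                # leaves actually scored

def ab(node, a, b, maximizing):
    # node is a number (leaf score) or a list of successor nodes
    if isinstance(node, (int, float)):
        evaluated.append(node)
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        val = ab(child, a, b, not maximizing)
        if maximizing:
            best = max(best, val); a = max(a, best)
        else:
            best = min(best, val); b = min(b, best)
        if a >= b:
            break                             # cutoff: remaining successors skipped
    return best

# the "lucky" complete binary tree with leaf scores 4,3,8,7,2,1,6,5
tree = [[[4, 3], [8, 7]], [[2, 1], [6, 5]]]
root = ab(tree, -math.inf, math.inf, True)
print(root, evaluated)                        # 4 [4, 3, 8, 2, 1] -- only 5 leaves
```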
Games that Involve Chance
- Minimax and alpha-beta must be modified when we deal with games that involve chance.
- For example, in backgammon the moves of each player take place after a throw of the dice.
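One standard modification (usually called expectiminimax; the name and the toy dice model below are illustrations, not taken from these notes) inserts chance nodes whose value is the probability-weighted average of their successors:

```python
def expectiminimax(node):
    # node: a number (leaf score), or a (kind, children) pair where kind is
    # 'max', 'min', or 'chance'; chance children are (probability, subtree) pairs
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted average over the random outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# toy "dice" node with two equally likely outcomes, each followed by the
# opponent's best (minimizing) reply
tree = ('chance', [(0.5, ('min', [3, 7])), (0.5, ('min', [10, 4]))])
print(expectiminimax(tree))                   # 0.5*3 + 0.5*4 = 3.5
```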
Cutting off Search
- So far we have assumed a fixed depth d where the search is stopped and the static evaluation function is applied. But there are variations on this that are important to note:
- Don't stop at non-quiescent nodes.
- Iterative Deepening is frequently used with Alpha-Beta to allow searches to successively deeper plies if there is time
Non-Quiescent Nodes
- If a node represents a state in the middle of an exchange of pieces, then the node is not quiescent and therefore the evaluation function may not give a reliable estimate of board quality.
- A definition for chess: "a state is non-quiescent if any piece is attacked by one of lower value, or by more pieces than defenses, or if any check exists on a square controlled by the opponent."
- In this case, expand more nodes and only apply the evaluation function at quiescent nodes.
- The identification of non-quiescent nodes partially deals with the horizon effect.
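The idea can be sketched as follows. The state encoding (a leaf tagged 'noisy' while an exchange is still in progress) is invented for illustration; a real quiescence test is game-specific, like the chess definition quoted above:

```python
def is_quiescent(leaf):
    # hypothetical test: a plain number is a quiet position; a ('noisy',
    # successors) pair marks a position in the middle of an exchange
    return isinstance(leaf, (int, float))

def quiesce(leaf, maximizing):
    # at the depth cutoff, keep expanding non-quiescent positions and apply
    # the static score only once a quiet position is reached
    if is_quiescent(leaf):
        return leaf
    _, successors = leaf
    values = [quiesce(s, not maximizing) for s in successors]
    return max(values) if maximizing else min(values)

# a position that statically looks like +9 mid-capture, but with the opponent
# to move the exchange actually resolves to 1
noisy = ('noisy', [1, 9])
print(quiesce(noisy, maximizing=False))       # 1, not 9
```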
Horizon Effect
- A negative horizon is where the state seen by the evaluation function is evaluated as better than it really is because an undesirable effect is just beyond this node (i.e., the search horizon).
- A positive horizon is where the evaluation function wrongly underestimates the value of a state when positive actions just over the search horizon indicate otherwise.
Iterative Deepening
- Iterative Deepening is frequently used with Alpha-Beta so that searches to successively deeper plies can be attempted if there is time
- The move selected is the one computed by the deepest search completed when the time limit is reached.
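A minimal sketch of this loop, under the same nested-list tree assumption as before; the leftmost-leaf static evaluator and the depth cap are placeholders for a real program's evaluation function and search limits:

```python
import math, time

def static_eval(node):
    # placeholder static evaluator for a node at the depth cutoff:
    # descend to the leftmost leaf (stands in for a real evaluation function)
    while isinstance(node, list):
        node = node[0]
    return node

def ab(node, depth, a, b, maximizing):
    # depth-limited alpha-beta over nested lists with numeric leaves
    if isinstance(node, (int, float)):
        return node
    if depth == 0:
        return static_eval(node)
    best = -math.inf if maximizing else math.inf
    for child in node:
        val = ab(child, depth - 1, a, b, not maximizing)
        if maximizing:
            best = max(best, val); a = max(a, best)
        else:
            best = min(best, val); b = min(b, best)
        if a >= b:
            break
    return best

def iterative_deepening(root, time_limit, max_depth=4):
    # deepen one ply at a time; keep the move chosen by the deepest
    # search that COMPLETED before the time limit ran out
    deadline = time.monotonic() + time_limit
    best_move = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break
        scores = [ab(child, depth - 1, -math.inf, math.inf, False)
                  for child in root]
        best_move = max(range(len(root)), key=scores.__getitem__)
    return best_move

tree = [[[3, 9], [4, 6]], [[5, 2], [8, 1]]]
print(iterative_deepening(tree, time_limit=0.05))
```

Note that a shallow search here prefers move 1 (the leftmost-leaf estimates are 3 vs 5), while a completed deeper search reverses that to move 0 (true minimax values 6 vs 5), which is exactly why the move from the deepest completed iteration is the one kept.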