Information Technology Reference
In-Depth Information
T
function takes the current tree
as input and performs descent toward the best
node q and returns the move m to produce a new node. Then the expand function
applies m to q and produces q new . Starting from this new node q new ,aplayout
helps to collect statistical information of q new .Then backpropagation function
insert q new in the tree
as statistically informed node and backpropagates q new
results of its parents. If during the descent, this q new node is a winning leaf,
this leaf is updated and future tree expansions are expected to perform different
descents and select different nodes. According to this exclusion of endgames
inside
T
,the select function detailed in Alg. 2 always selects node q with at
most 1 move. It selects the best node q best from all possible moves m from q or
itbreaksatthefirstnode( q + m )notin
T
T
.
input : T the current tree
output : T expanded with 1 to nbPlayouts nodes
for nbPlayouts do
( q , m ) select ( T );
q new expand ( q , m );
res ₐ playout ( q new );
backpropagate ( q new , res );
Algorithm 1. Classical
MCTS
applied to perfect information games
input : T the current tree
output :( q , m )where q ∈T and q to expand by applying a move m
q ₐ root ;
while true do
q best ₐ {∅} ;
foreach move m from q do
if ( q + m ) /∈T then return ( q , m );
q best best ( q best ,( q + m ));
q ₐ q best ;
Algorithm 2. Select function of classical
MCTS
Considering stochastic games and modifications of tree's representation that
are arising with chance-nodes and move-groups usage :
- the select function may distinguish types of nodes.
- the endgame shortcut assertion is no more true. Thus endgames can be
selected inside select function. It implies modifications in main loop and in
select function.
 
Search WWH ::




Custom Search