Information Technology Reference
In-Depth Information
T
function takes the current tree
as input and performs descent toward the best
node
q
and returns the move
m
to produce a new node. Then the
expand
function
applies
m
to
q
and produces
q
new
. Starting from this new node
q
new
,aplayout
helps to collect statistical information of
q
new
.Then
backpropagation
function
insert
q
new
in the tree
as statistically informed node and backpropagates
q
new
results of its parents. If during the descent, this
q
new
node is a winning leaf,
this leaf is updated and future tree expansions are expected to perform different
descents and select different nodes. According to this exclusion of endgames
inside
T
,the
select
function detailed in Alg. 2 always selects node
q
with at
most 1 move. It selects the best node
q
best
from all possible moves
m
from
q
or
itbreaksatthefirstnode(
q
+
m
)notin
T
T
.
input
:
T
the current tree
output
:
T
expanded with 1 to
nbPlayouts
nodes
for
nbPlayouts
do
(
q
,
m
)
ₐ
select
(
T
);
q
new
ₐ
expand
(
q
,
m
);
res ₐ
playout
(
q
new
);
backpropagate
(
q
new
,
res
);
Algorithm 1.
Classical
MCTS
applied to perfect information games
input
:
T
the current tree
output
:(
q
,
m
)where
q ∈T
and
q
to expand by applying a move
m
q ₐ root
;
while
true
do
q
best
ₐ {∅}
;
foreach
move m from q
do
if
(
q
+
m
)
/∈T
then return
(
q
,
m
);
q
best
ₐ
best
(
q
best
,(
q
+
m
));
q ₐ q
best
;
Algorithm 2.
Select
function of classical
MCTS
Considering stochastic games and modifications of tree's representation that
are arising with chance-nodes and move-groups usage :
-
the
select
function may distinguish types of nodes.
-
the endgame shortcut assertion is no more true. Thus endgames can be
selected inside
select
function. It implies modifications in main loop and in
select
function.
Search WWH ::
Custom Search