Monte-Carlo Tree Reductions for Stochastic Games - Technologies and Applications of Artificial Intelligence


input: T, the current tree
output: (q, m) where q ∈ T and q is to be expanded by applying a move m

q ← root;
while true do
    if no move from q then break;
    q_best ← {∅};
    foreach possible move m from q do
        if m is a classical move then
            if (q + m) ∉ T then return (q, m);
            q_best ← best(q_best, (q + m));
        else if m is a reveal move then
            q_new ← revealRandomlyAt(q, m);
            if q_new ∉ T then return (q, m);
            q_best ← best(q_best, q_new);
    q ← q_best;
return (q, {∅});

Algorithm 6. Select function with group-nodes

boards. Thus a board with 3 known pieces, each with 4 possible moves, and with 10 unrevealed pieces will have 12 children for its known pieces and 10 children for its unrevealed pieces. As revealing positions can lead to different boards, possible moves are always recomputed with group-nodes. The select function returns the first unevaluated classical move or the first unevaluated reveal move from the current best node in the tree. The function revealRandomlyAt applies a random reveal at the position m. As the revealed pieces will differ, the sub-groups will also differ. Thus the group-nodes regrouping policy produces an approximate evaluation of groups.
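The select function of Algorithm 6 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the `Node` class, its `moves`, `hidden`, `score` and `children` fields, and the way a child is keyed by its move are all hypothetical. For brevity the moves are stored on the node, whereas the text recomputes them at each step, and `best` simply compares a stand-in `score` instead of the actual tree policy.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical node layout, chosen only for this sketch.
    moves: list = field(default_factory=list)     # ("classical", m) or ("reveal", pos)
    hidden: list = field(default_factory=list)    # pieces a reveal may uncover
    score: float = 0.0                            # stand-in for the tree policy value
    children: dict = field(default_factory=dict)  # child key -> Node already in T


def best(a, b):
    """Keep the node with the higher score; `a` may be None (empty best)."""
    return b if a is None or b.score > a.score else a


def reveal_randomly_at(q, pos, rng):
    """Key of the child reached by revealing a random hidden piece at `pos`."""
    return ("reveal", pos, rng.choice(q.hidden))


def select(root, rng=random):
    """Return (q, m): the node q to expand and the move m to apply, or (q, None)."""
    q = root
    while q.moves:                      # "if no move from q then break"
        q_best = None
        for kind, m in q.moves:
            if kind == "classical":
                key = ("classical", m)
            else:                       # reveal move: draw one random outcome
                key = reveal_randomly_at(q, m, rng)
            child = q.children.get(key)
            if child is None:           # resulting node is not in T yet
                return q, (kind, m)
            q_best = best(q_best, child)
        q = q_best                      # descend into the best known child
    return q, None
```

With the board of the example above (3 known pieces with 4 moves each and 10 unrevealed positions), the root has 22 candidate moves, and on an empty tree `select` returns the first classical move.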

In this paper, we investigate how the constitution of groups influences MCTS performance in the stochastic game CDC. To achieve this, we consider different regrouping policies and different generating policies inside groups:

- revealed group or unrevealed group: these 2 groups are simply defined on the board by the revealed and unrevealed pieces. Using these 2 groups, we tried to generate new moves randomly (abbrev. move-group-random) and to cycle over the elements of the considered move-group (abbrev. move-group-cycle).

- revealed pieces or unrevealed group: this is equivalent to group-nodes. Unrevealed pieces are considered randomly inside the unrevealed group and revealed pieces are considered individually (abbrev. group-nodes).
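The difference between the two generating policies inside a move-group can be illustrated with a short Python sketch. The function names and the representation of a group as a plain list of moves are assumptions made for this example. The text does not say whether the random policy draws with or without replacement, so drawing with replacement is assumed here.

```python
import itertools
import random


def move_group_random(group, rng=random):
    """Yield moves drawn uniformly at random from a non-empty group.

    Assumption: draws are made with replacement.
    """
    while True:
        yield rng.choice(group)


def move_group_cycle(group):
    """Yield the moves of a non-empty group in a fixed order, wrapping around."""
    return itertools.cycle(group)


# Hypothetical group of three reveal moves, identified by board position.
unrevealed_group = ["reveal@a1", "reveal@b2", "reveal@c3"]
first_five = list(itertools.islice(move_group_cycle(unrevealed_group), 5))
```

Cycling guarantees that every element of the group is generated once before any is repeated, whereas random generation may revisit some elements while leaving others untried for a while.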

4 Experiments

In the first experiment, we compare the 5 regrouping policies move-groups-random, move-groups-cycle-R, move-groups-cycle-M, group-nodes and chance-nodes to a random player and to a reference player rand-mm. The policies
