Monte-Carlo Tree Reductions for Stochastic Games - Technologies and Applications of Artificial Intelligence

Information Technology Reference

In-Depth Information

Section 2 describes related works. Section 3 presents move-groups, chance-

nodes and group-nodes principles applied to

algorithms and different

regrouping policies. Section 4 shows experimental results. At the end, section 5

concludes.

MCTS

2 Related Works

In this section we expose related works on creating nodes and regrouping them

in CDC and in stochastic games.

Most previous works related to CDC consider openings building [2], endgames

building [3-5], sub-problems resolution [6]. Due to a long expertise in alpha-beta ,

most programs use the minimax extension to games of chance called expecti-

max [7] with its common pruning extensions Star1 and Star2 [8]. It remains

that

programs are highly sensitive to their parameters [1].

Move-groups have been proposed in [10] to address the problem of

MCTS

in

games with a high branching factor. When there are too many moves, it can be

a good heuristic to regroup the statistics of moves that share some properties.

For example in the game of Amazons where the branching factor can be of the

order of a thousand of moves, a natural way to reduce the branching factor is to

separate the queen move from the stone placement. In the imperfect information

game Dou Di Zhu, Information Set [11] has been combined with move-groups:

the player first chooses the base move and then the kicker, as two separate

consecutive decision nodes in the tree. Move-groups have also been analyzed on

an abstract game [12].

Chen et al. [7] used alpha-beta algorithm with different revealing policies com-

bined with the initial-depth flipping method to reduce the branching factor.

Yen et al. [1] presented a non-deterministic

MCTS

with chance-nodes [13].

They create shorter simulation by moderating the three policies named Capture

First , Capture Stronger Piece First and Capture and Escape Stronger Piece First .

Jouandeau and Cazenave [9] presented MCTS influence of various playout

size, of basic or advanced policies and with heuristic playouts. They studied

group constitution related to position's pieces. They showed that relevant play-

out sizes, policies and heuristic playouts are not equivalent for chance-nodes and

group-nodes.

More generally,

MCTS

has been successfully applied in the past to other

multi-player perfect information stochastic games with different regrouping op-

timizations.

In

MCTS

, Monte Carlo playouts have been parallelized to evaluate a

position. As playing a position depends on dices, random dices sequences have

been also evaluate in parallel. As the number of possible moves increases when

dices are doubles, states evaluation can be divided in 2 sub-problems [14] : the

first without double dices and the second with double dices. In other words, they

distinguished states in 2 groups : light states that have small branching factor

and heavy states that have huge branching factor.

In

Backgammon

, simulations are restricted inside a directed acyclic graph to

produce existing words [15].

Scrabble

Technologies and Applications of Artificial Intelligence

Search WWH ::

Custom Search

Home