Monte-Carlo Tree Reductions for Stochastic Games - Technologies and Applications of Artificial Intelligence

Information Technology Reference

In-Depth Information

Monte-Carlo Tree Reductions

for Stochastic Games

Nicolas Jouandeau 1 and Tristan Cazenave 2

1 LIASD, Universite de Paris 8, France

n@ai.univ-paris8.fr

2 LAMSADE, Universite Paris-Dauphine, France

cazenave@lamsade.dauphine.fr

Abstract. Monte-Carlo Tree Search (MCTS) is a powerful paradigm

for perfect information games. When considering stochastic games, the

tree model that represents the game has to take chance and a huge

branching factor into account. As effectiveness of MCTS may decrease

in such a setting, tree reductions may be useful. Chance-nodes are a

way to deal with random events. Move-groups are another way to deal

eciently with a large branching factor by regrouping nodes. Group-

nodes are regrouping only reveal moves and enable a choice between

reveal moves and classical moves. We present various policies to use such

reductions for the stochastic game Chinese Dark Chess. Move-groups,

chance-nodes and group-nodes are compared.

1 Introduction

Chinese Dark Chess (CDC) is a popular stochastic two player game in Asia

that is most commonly played on a 4x8 rectangular board where players do not

know payoff of reveal moves. The 2 players (called black and red) start with the

same set of pieces. Before the first move, players do not know their colors. The

first player move defines the first player color. Then pieces can capture other

pieces according to their values and their positions. Even if reveal moves imply

a huge number of possible boards, classical moves can lead to similar positions

during the game and capturing rules are different for each piece [7]. As Monte

Carlo Tree Search (

) techniques deal with nodes statistics, blindness goes

along with branching factor.

MCTS

programs seem to be promising in

CDC

.

In

TCGA

2012, one participant was a

MCTS

program. In

TCGA

2013, five

participants, including the winner called DarkKnight ,were

MCTS

programs.

In

2014, the alpha-beta program Yahari won the competition ahead

of DarkKnight .As

TCGA

has a huge branching due to the revealing moves, we

try to reduce the revealing moves dependency by applying different ways of

regrouping nodes. In the context of stochastic games, we believe that a better

understanding of MCTS behavior is needed. We show in this paper 3 different

MCTS

CDC

implementations, called move-groups, chance-nodes and group-nodes,

that are using longer playouts, the same playout policy and no heuristic playouts.

Technologies and Applications of Artificial Intelligence

Search WWH ::

Custom Search

Home