2.1.2 Games
Games represent a large proportion of the applications of reinforcement learning.
In almost all of these applications, the domains are two-agent, adversarial, and
zero-sum situations. Games to which reinforcement learning has been applied include
Tic-Tac-Toe (Wiering 1995; Ghory 2004; Patist and Wiering 2004; Konen and Beielstein
2008, 2009; Gatti and Embrechts 2012), Chung-Toi (Gatti et al. 2011a), Connect4
(Ghory 2004), Solitaire (Yan et al. 2004), checkers/draughts (Schaeffer et al. 2001;
Patist and Wiering 2004; Wiering et al. 2007), Chess (Thrun 1995; Baxter et al. 1998a;
Mannen and Wiering 2004; Wiering et al. 2007; Veness et al. 2009), Othello (Binkley
et al. 2007; van Eck and van Wezel 2008; Yoshioka et al. 1999; Skoulakis and
Lagoudakis 2012), 9 × 9 Go (Schraudolph et al. 1994; Silver et al. 2012), and
backgammon (Tesauro 1995; Wiering et al. 2007; Papahristou and Refanidis 2011).
Note that these games span a large range of difficulty and that they
all have unique environmental characteristics. It is also important to note that
the large majority of these applications have not produced agents that can play
perfectly against opponents of every level. A truly successful application of
reinforcement learning to games is usually considered one that matches or beats
the performance of human master players or of computer programs not trained by
reinforcement learning. Such successes include chess (Veness et al. 2009),
checkers (Schaeffer et al. 2001), 9 × 9 Go (Silver et al. 2012), and
backgammon (Tesauro 1995). In most circumstances, reinforcement learning
applications to games are evaluated against computer opponents that either use a
simplistic action-selection policy or are computer programs that have been developed
for the same game (e.g., Schraudolph et al. 1994; Patist and Wiering 2004; Silver
et al. 2012). Far fewer applications evaluate the performance of a reinforcement
learning agent against human opponents, such as in Tesauro (1995) or Gatti et al. (2011b).
The most notable and most cited application of reinforcement learning is the
work of Tesauro (1995), who trained a neural network to play the board game of
backgammon well enough to challenge and beat human grandmasters in world
championship play. The reasons why this application was so powerful are not well
understood; likewise, it is not well understood why no other application has seen
similar success. Some attribute the success of this application to the speed of
play, representation smoothness, and stochasticity (Baxter et al. 1998b), while
others question whether it is a true success at all, claiming that its success is
due not to reinforcement learning but rather to the dynamics and co-evolution of
the game (Schraudolph et al. 1994; Pollack and Blair 1996).
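Tesauro's agent learned its value function through temporal-difference learning over self-play games. The sketch below illustrates the kind of TD(λ) update on a linear value function that underlies such self-play training. It is a minimal illustration only: the feature size, hyperparameters, and the random "game" driving the update are assumptions for demonstration, not details of Tesauro's actual network or board encoding.

```python
import numpy as np

ALPHA, GAMMA, LAMBDA = 0.1, 1.0, 0.7   # learning rate, discount, trace decay (assumed)
N_FEATURES = 8                          # assumed size of the position encoding

def td_lambda_episode(features, reward, w):
    """Apply TD(lambda) updates to weights w over one self-play game.

    features : list of feature vectors, one per position visited
    reward   : terminal outcome of the game (e.g., 1.0 for a win, 0.0 for a loss)
    """
    z = np.zeros_like(w)                            # eligibility trace
    for t in range(len(features)):
        x_t = features[t]
        v_t = w @ x_t                               # value estimate of current position
        if t + 1 < len(features):
            target = GAMMA * (w @ features[t + 1])  # bootstrap from the next position
        else:
            target = reward                         # terminal step: use the game outcome
        delta = target - v_t                        # TD error
        z = GAMMA * LAMBDA * z + x_t                # accumulating eligibility trace
        w = w + ALPHA * delta * z                   # shift estimates toward the outcome
    return w

# Usage: drive the update with a made-up five-position "game" won by the agent.
rng = np.random.default_rng(0)
w = np.zeros(N_FEATURES)
positions = [rng.random(N_FEATURES) for _ in range(5)]
w = td_lambda_episode(positions, reward=1.0, w=w)
print(w)
```

In self-play training of this kind, the same update is applied after every move of every game, so the value estimates of intermediate positions are gradually pulled toward the eventual game outcomes.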
2.1.3 Real-World Applications
Applications of reinforcement learning to real-world problems are cited much less
often, though there are still numerous applications in a variety of domains. In the
field of robotics, Smart and Kaelbling (2002) and Smart (2002) used Q-learning to train