2.1.2 Games
Games represent a large proportion of the applications of reinforcement learning.
In almost all of these applications, the domains are two-agent, adversarial, and
zero-sum situations. Games to which reinforcement learning has been applied include
Tic-Tac-Toe (Wiering 1995; Ghory 2004; Patist and Wiering 2004; Konen and Beielstein
2008, 2009; Gatti and Embrechts 2012), Chung-Toi (Gatti et al. 2011a), Connect4
(Ghory 2004), Solitaire (Yan et al. 2004), checkers/draughts (Schaeffer et al. 2001;
Patist and Wiering 2004; Wiering et al. 2007), Chess (Thrun 1995; Baxter et al. 1998a;
Mannen and Wiering 2004; Wiering et al. 2007; Veness et al. 2009), Othello (Binkley
et al. 2007; van Eck and van Wezel 2008; Yoshioka et al. 1999; Skoulakis and
Lagoudakis 2012), 9 × 9 Go (Schraudolph et al. 1994; Silver et al. 2012), and
backgammon (Tesauro 1995; Wiering et al. 2007; Papahristou and Refanidis 2011).
Note that these games span a large range of difficulty and that they
all have unique environmental characteristics. It is also important to note that
the large majority of these applications have not produced agents that can play
perfectly against opponents of every level. A truly successful application of
reinforcement learning to games is usually considered one that matches or beats
the performance of human master players or of computer programs not trained by
reinforcement learning. Such successes include chess (Veness et al. 2009),
checkers (Schaeffer et al. 2001), 9 × 9 Go (Silver et al. 2012), and
backgammon (Tesauro 1995). In most circumstances, reinforcement learning
applications to games are evaluated against computer opponents that either use a
simplistic action-selection policy or are computer programs that have been developed
for the same game (e.g., Schraudolph et al. 1994; Patist and Wiering 2004; Silver
et al. 2012). Far fewer applications evaluate the performance of a reinforcement
learning agent against human opponents, such as in Tesauro (1995) or Gatti et al. (2011b).
The most notable and most cited application of reinforcement learning is the
work of Tesauro (1995), who trained a neural network to play the board game of
backgammon well enough to challenge and beat human grandmasters in world
championship play. The reasons why this application was so powerful are not well
understood; likewise, it is not well understood why no other application has seen
similar success. Some attribute the success of this application to the speed of
play, representation smoothness, and stochasticity (Baxter et al. 1998b), while
others question whether it is a true success at all, claiming that its success is
due not to reinforcement learning but rather to the dynamics and co-evolution of
the game (Schraudolph et al. 1994; Pollack and Blair 1996).
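Tesauro's agent learned its value function through temporal-difference learning over self-play games. The sketch below illustrates the kind of TD(λ) update on a linear value function that underlies such self-play training. It is a minimal illustration only: the feature size, hyperparameters, and the random "game" driving the update are assumptions for demonstration, not details of Tesauro's actual network or board encoding.

```python
import numpy as np

ALPHA, GAMMA, LAMBDA = 0.1, 1.0, 0.7   # learning rate, discount, trace decay (assumed)
N_FEATURES = 8                          # assumed size of the position encoding

def td_lambda_episode(features, reward, w):
    """Apply TD(lambda) updates to weights w over one self-play game.

    features : list of feature vectors, one per position visited
    reward   : terminal outcome of the game (e.g., 1.0 for a win, 0.0 for a loss)
    """
    z = np.zeros_like(w)                            # eligibility trace
    for t in range(len(features)):
        x_t = features[t]
        v_t = w @ x_t                               # value estimate of current position
        if t + 1 < len(features):
            target = GAMMA * (w @ features[t + 1])  # bootstrap from the next position
        else:
            target = reward                         # terminal step: use the game outcome
        delta = target - v_t                        # TD error
        z = GAMMA * LAMBDA * z + x_t                # accumulating eligibility trace
        w = w + ALPHA * delta * z                   # shift estimates toward the outcome
    return w

# Usage: drive the update with a made-up five-position "game" won by the agent.
rng = np.random.default_rng(0)
w = np.zeros(N_FEATURES)
positions = [rng.random(N_FEATURES) for _ in range(5)]
w = td_lambda_episode(positions, reward=1.0, w=w)
print(w)
```

In self-play training of this kind, the same update is applied after every move of every game, so the value estimates of intermediate positions are gradually pulled toward the eventual game outcomes.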
2.1.3 Real-World Applications
Applications of reinforcement learning to real-world problems are cited much less
often, though there are still numerous applications in a variety of domains. In the
field of robotics, Smart and Kaelbling (2002) and Smart (2002) used Q-learning to train