Appendix A
Parameter Effects in the Game of Chung Toi
This chapter previously appeared as: Gatti et al. (2011). Parameter settings of
reinforcement learning for the game of Chung Toi. In IEEE International Conference
on Systems, Man, and Cybernetics (SMC 2011), Anchorage, AK, 9-12 October
(pp. 3530-3535). doi: 10.1109/ICSMC.2011.6084216. This work used a
one-factor-at-a-time (OFAT) study to assess how changing individual parameters affects
the performance of reinforcement learning in a two-player board game. This study
was our first attempt to understand how parameters affect reinforcement learning;
it led to a subsequent study using a classical experimental
design (Appendix B), and to the current work described in this dissertation.
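As a rough illustration of what a one-factor-at-a-time design looks like, the following minimal sketch enumerates parameter settings in which only one parameter at a time departs from a baseline. The parameter names, baseline values, and levels are hypothetical and chosen purely for illustration; they are not taken from the study.

```python
def ofat_settings(baseline, levels):
    """Return the baseline plus every setting that changes exactly one parameter."""
    settings = [dict(baseline)]
    for name, candidates in levels.items():
        for value in candidates:
            if value == baseline[name]:
                continue  # the baseline itself is already included once
            variant = dict(baseline)
            variant[name] = value  # change one factor, hold the others fixed
            settings.append(variant)
    return settings


# Hypothetical parameters and levels (assumptions, not the study's values).
baseline = {"learning_rate": 0.1, "exploration_rate": 0.05}
levels = {
    "learning_rate": [0.01, 0.1, 0.5],
    "exploration_rate": [0.0, 0.05, 0.2],
}

settings = ofat_settings(baseline, levels)
for setting in settings:
    print(setting)
```

Because each setting differs from the baseline in at most one parameter, a design of this kind cannot reveal interactions between parameters, which is one reason a classical experimental design is a natural follow-up.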
A.1 Introduction
Reinforcement learning is a machine learning method in which an agent learns a
behavior by repeatedly interacting with an environment with the goal of maximizing
the total rewards received (Sutton and Barto 1998). This strategy is essentially a
method of trial and error in which feedback is provided to the agent based on the
utility of its actions. This feedback is used to improve the agent's knowledge of the
environment so that, in subsequent interactions, the agent behaves closer to optimally.
This paradigm pairs well with game playing because players aim to select actions
that are likely to lead to the best outcome. Game playing is an unusual scenario,
however, in that feedback is typically not provided after each action during the game;
instead, it is provided only at the end of the game, as a win or a loss.
This sole piece of information must then be used to inform the agent of the utility of
its actions.
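One common way to spread a single end-of-game outcome back over earlier positions is a temporal-difference update. The sketch below is a minimal tabular TD(0) example under stated assumptions: the state labels, the default value of 0.5, and the learning rate `alpha` are hypothetical, and this is not necessarily the update rule used in this work.

```python
def td0_episode_update(values, states, outcome, alpha=0.1, default=0.5):
    """Apply one tabular TD(0) pass over the states visited in a finished game.

    Each state's value moves toward the value of the state that followed it;
    only the final state is moved toward the actual outcome (1 = win, 0 = loss).
    """
    for i, state in enumerate(states):
        current = values.get(state, default)
        if i + 1 < len(states):
            target = values.get(states[i + 1], default)  # bootstrap from successor
        else:
            target = outcome  # the only externally supplied feedback
        values[state] = current + alpha * (target - current)
    return values


# Hypothetical three-position game that ends in a win.
values = {}
game = ["s0", "s1", "s2"]

td0_episode_update(values, game, outcome=1.0)
after_first = dict(values)   # only the last position has moved so far

td0_episode_update(values, game, outcome=1.0)
after_second = dict(values)  # the win now starts to reach the earlier position s1

print(after_first)
print(after_second)
```

After the first game only the final position's value changes; with repeated games the information from the win gradually propagates to earlier positions, which is how a single terminal signal can come to inform the utility of every action.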
Reinforcement learning has been applied to many board games, including Tic-Tac-Toe
(Wiering 1995; Ghory 2004), Othello (Binkley et al. 2007), and, most notably,
backgammon (Tesauro 1992, 2002; Wiering et al. 2007). The current work adds to
this body of literature by applying reinforcement learning to the game of Chung Toi
(Gatti et al. 2011b), a challenging extension of Tic-Tac-Toe. This work extends our
previous work, which showed that a basic implementation of reinforcement learning,