Parameter Effects in the Game of Chung Toi - Design of Experiments for Reinforcement Learning


Appendix A

Parameter Effects in the Game of Chung Toi

This chapter previously appeared as: Gatti et al. (2011). Parameter settings of reinforcement learning for the game of Chung Toi. In IEEE International Conference on Systems, Man, and Cybernetics (SMC 2011), Anchorage, AK, 9-12 October (pp. 3530-3535). doi: 10.1109/ICSMC.2011.6084216. This work used a one-factor-at-a-time (OFAT) study to assess how changing individual parameters affects the performance of reinforcement learning in a two-player board game. This study was our first attempt to understand how parameters affect reinforcement learning; it led to a subsequent study using a classical experimental design (Appendix B) and to the current work described in this dissertation.
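As an illustration of what a one-factor-at-a-time design looks like in practice, the following minimal sketch enumerates the parameter settings for such a study: each run changes exactly one parameter away from a baseline while the others stay fixed. The function name `ofat_settings`, the parameter names (`alpha`, `gamma`), and their levels are hypothetical and are not taken from the original study.

```python
def ofat_settings(baseline, levels):
    """Return the baseline run plus one run per non-baseline level of each factor.

    baseline: dict mapping parameter name -> baseline value (hypothetical names)
    levels:   dict mapping parameter name -> list of values to try
    """
    runs = [dict(baseline)]  # the baseline configuration itself
    for name, values in levels.items():
        for value in values:
            if value == baseline[name]:
                continue  # already covered by the baseline run
            setting = dict(baseline)  # copy, so only one factor changes
            setting[name] = value
            runs.append(setting)
    return runs


# Hypothetical parameters and levels, for illustration only.
baseline = {"alpha": 0.1, "gamma": 0.9}
levels = {"alpha": [0.01, 0.1, 0.5], "gamma": [0.5, 0.9, 1.0]}
runs = ofat_settings(baseline, levels)
```

Because only one factor moves per run, a design like this cannot reveal interactions between parameters, which is one motivation for the classical experimental design mentioned above.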

A.1 Introduction

Reinforcement learning is a machine learning method in which an agent learns a behavior by repeatedly interacting with an environment with the goal of maximizing the total reward received (Sutton and Barto 1998). This strategy is essentially a method of trial and error in which feedback is provided to the agent based on the utility of its actions. This feedback is used to improve the agent's knowledge of the environment so that, in subsequent interactions, the agent behaves closer to optimally. This paradigm pairs well with game playing in the sense that players aim to select actions that are likely to lead to the best outcome. Game playing is distinctive in that feedback is typically not provided after each action during the game; rather, it is provided only at the end of the game as a win or a loss. This sole piece of information must then be used to inform the agent of the utility of each of its actions.
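To make the end-of-game feedback problem concrete, the sketch below shows one simple way a single win/loss signal could be propagated back to the states visited during a game. It is a tabular, temporal-difference-style backup written under stated assumptions: the function name `backup_terminal_reward`, the learning rate `alpha`, the default value of 0.5, and the state labels are all hypothetical, and this is not claimed to be the learning method used in the study.

```python
def backup_terminal_reward(values, visited_states, final_reward,
                           alpha=0.1, default=0.5):
    """Move each visited state's value toward the value of its successor.

    values:         dict mapping state -> estimated value (updated in place)
    visited_states: states seen by the agent, in the order they occurred
    final_reward:   outcome of the game, e.g. 1.0 for a win, 0.0 for a loss
    """
    target = final_reward  # the last state is pulled toward the outcome
    for state in reversed(visited_states):
        old = values.get(state, default)
        values[state] = old + alpha * (target - old)
        target = values[state]  # earlier states are pulled toward this value
    return values


# One hypothetical game that ended in a win.
values = {}
backup_terminal_reward(values, ["s0", "s1", "s2"], final_reward=1.0)
```

After one winning game, states closer to the end of the game receive more of the credit than earlier ones; repeated play is what lets the signal reach early positions.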

Reinforcement learning has been applied to many board games including Tic-Tac-Toe (Wiering 1995; Ghory 2004), Othello (Binkley et al. 2007), and, most notably, backgammon (Tesauro 1992, 2002; Wiering et al. 2007). The current work adds to this body of literature by applying reinforcement learning to the game of Chung Toi (Gatti et al. 2011b), a challenging extension of Tic-Tac-Toe. This work extends our previous work, which showed that a basic implementation of reinforcement learning,
