Chapter 2
Reinforcement Learning
This chapter provides an overview of the field of reinforcement learning and concepts
that are relevant to the proposed work. The field of reinforcement learning is not
widely known, and although the learning paradigm is easy to understand, some of
the more detailed concepts can be difficult to grasp. Accordingly, reinforcement
learning is presented beginning with a review of the fundamental concepts and
methods. This introduction to reinforcement learning is followed by a review of the
three major components of the reinforcement learning method: the environment, the
learning algorithm, and the representation of the learned knowledge. Some of the
terminology used herein may differ slightly from that used in other fields; this is
done to remain consistent with the reinforcement learning literature.
Note that in this work reinforcement learning is considered from the artificial
intelligence or computer science perspective on solving sequential decision making
problems. Sequential decision making problems, however, are also the focus of other
fields with different perspectives, including control theory and operations research
(Kappen 2007; Powell 2008). Each of these fields uses slightly different methods that
have been developed for, or have been successful in, solving types of problems with
unique characteristics that are specific to each field, though there may be considerable
overlap in the types of problems solved by each community. The operations research
community uses approaches such as simulation-optimization, forecasting approaches
for rolling-horizon problems, and dynamic programming methods (Powell 2008),
whereas the control theory community uses integral control and related methods
based on plant models (Kappen 2007). The field of reinforcement learning (from
the artificial intelligence perspective) is not only related to other computational and
mathematical approaches for solving similar problems, but it is also a well-accepted
and fundamental physiological model of learning in the neuroscience community
(Rescorla and Wagner 1972; Dayan and Niv 2008), with conceptual intersections
between the two fields (Maia 2009; Niv 2009).
Portions of this chapter previously appeared as: Gatti & Embrechts (2012).