Reinforcement Learning - Design of Experiments for Reinforcement Learning - page 12

Fig. 2.3 Example Gridworld domain: The goal of this domain is to move from the starting location (gray square) to the goal (star) by taking cardinal actions and avoiding the walls (black squares).

Fig. 2.4 Mountain car problem: The goal of this domain is to have the underpowered car reach the top of the mountain (star) by building momentum using forward or reverse actions or using no action.

Fig. 2.5 Cart-pole balancing task: The goal of the cart-pole balancing task is to keep the pole vertically oriented by applying left- or right-directed forces to the cart.

Additional benchmark problems include low-dimensional, continuous control problems that are based on well-defined dynamics of physical systems. Examples of these domains include the mountain car domain (Moore 1990; Singh and Sutton 1996; Riedmiller 2005; Fig. 2.4), acrobot (Sutton and Barto 1998), cart-pole balancing (Barto et al. 1983; Riedmiller 2005; Fig. 2.5), and the pendulum swing-up task (Doya 1996, 2000). The dynamics of the environment (i.e., the equations of motion) are unknown to the agent, so it must learn how to behave in the environment solely by selecting control actions and receiving rewards or penalties.
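The interaction pattern described above can be illustrated with a minimal sketch of the mountain car domain. The constants and update rule below are assumptions taken from the commonly used textbook formulation of the problem, not from this text, and the names `MountainCar`, `momentum_policy`, and `run_episode` are hypothetical. The equations of motion live only inside the simulator; the policy sees just the observed state and the reward. The hand-written momentum heuristic stands in for a learned policy and is used only to show the interaction loop.

```python
import math


class MountainCar:
    """Hypothetical minimal simulator; constants assumed from the common textbook formulation."""

    X_MIN, GOAL, V_MAX = -1.2, 0.5, 0.07

    def __init__(self):
        self.x, self.v = -0.5, 0.0  # assumed start: near the valley floor, at rest

    def step(self, action):
        """Apply action in {-1, 0, +1} (reverse, none, forward); return (state, reward, done)."""
        # Equations of motion: hidden from the agent, which only calls step().
        self.v += 0.001 * action - 0.0025 * math.cos(3 * self.x)
        self.v = max(-self.V_MAX, min(self.V_MAX, self.v))
        self.x += self.v
        if self.x < self.X_MIN:  # inelastic left wall
            self.x, self.v = self.X_MIN, 0.0
        done = self.x >= self.GOAL
        return (self.x, self.v), -1.0, done  # assumed penalty of -1 per time step


def momentum_policy(state):
    """Stand-in for a learned policy: push in the current direction of travel."""
    _, velocity = state
    return 1 if velocity >= 0 else -1


def run_episode(policy, max_steps=1000):
    """Run one episode; return (steps taken, total reward, whether the goal was reached)."""
    env = MountainCar()
    state, total, done = (env.x, env.v), 0.0, False
    steps = 0
    while not done and steps < max_steps:
        state, reward, done = env.step(policy(state))
        total += reward
        steps += 1
    return steps, total, done


if __name__ == "__main__":
    steps, total, reached = run_episode(momentum_policy)
    print(f"reached goal: {reached}, steps: {steps}, return: {total}")
```

Because the engine is too weak to climb the hill directly, always pushing forward from rest fails, whereas pushing in the direction of travel builds momentum over successive swings. A reinforcement learning agent would have to discover such behavior from the state and reward signals alone.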
