Up the Down Staircase: Hierarchical Reinforcement Learning - Realtime Data Mining

particular by using the isomorphism (4.1) between states and actions in REs and so further extend the concept of the hierarchical splitting of states to actions.

6.1 Introduction

We will approach the problem of hierarchical methods from two sides: first from the historical (analytical) viewpoint, then from the algebraic viewpoint. We deliberately omit most of the mathematical infrastructure, which in parts is exceedingly complex, and attempt to explain the underlying ideas in an understandable (and sometimes slightly simplified) fashion.

6.1.1 Analytical Approach

So far we have only ever considered the state-value and action-value functions v(s) and q(s, a) in tabular form. However, we are dealing with functions, and so in RL we often have to resort to approximation methods such as linear and polynomial functions or, for instance, neural networks in order to represent them using only a few coefficients. We therefore now start from the actual functions.

Since most useful function spaces $V$ are infinite dimensional, we will instead consider finite-dimensional subspaces $V_n \subset V$, where $n$ is their dimension. In most cases this is the central assumption for being able to efficiently find a numerical solution of the associated operator equation.

We can represent a finite-dimensional function $f_n \in V_n$ as follows:

\[
f_n(x) = \sum_{i=1}^{n} c_i \phi_i(x), \tag{6.1}
\]

where $\phi_i$ are the basis functions and $c_i$ are their coefficients. By inserting the ansatz (6.1) into the operator equation, for instance the Bellman equation (3.7), we can reduce the determination of the function $f \in V$ to the determination of the coefficients $c_i$ of our approximated function $f_n \in V_n$. (We are omitting here subtleties such as the use of test functions in variational formulations.)
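
To make the expansion (6.1) concrete, the following minimal Python sketch evaluates a function from its coefficients. The monomial basis, the coefficient values, and the names `monomial_basis` and `evaluate` are assumptions chosen for illustration only; the text does not prescribe a particular basis.

```python
from typing import Callable, List, Sequence

BasisFunction = Callable[[float], float]


def monomial_basis(n: int) -> List[BasisFunction]:
    """Hypothetical example basis: phi_i(x) = x**(i-1) for i = 1..n."""
    # The default argument k=k freezes the exponent for each lambda.
    return [lambda x, k=k: x ** k for k in range(n)]


def evaluate(coeffs: Sequence[float],
             basis: Sequence[BasisFunction],
             x: float) -> float:
    """Evaluate f_n(x) = sum_i c_i * phi_i(x)."""
    if len(coeffs) != len(basis):
        raise ValueError("need exactly one coefficient per basis function")
    return sum(c * phi(x) for c, phi in zip(coeffs, basis))


# Illustrative coefficients: f_3(x) = 1 - 2x + 0.5 x^2
phi = monomial_basis(3)
c = [1.0, -2.0, 0.5]
value_at_2 = evaluate(c, phi, 2.0)
```

Once the basis is fixed, the three numbers in `c` describe the function completely, which is why solving for the coefficients can stand in for solving for the function itself.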

Now, as a rule, there are several bases for a function space $V_n$. In addition to the basis $\Phi_n = [\phi_1, \phi_2, \ldots, \phi_n]$, let us consider another basis $\Psi_n = [\psi_1, \psi_2, \ldots, \psi_n]$ in $V_n$. Then every function $f_n \in V_n$ can also be represented over the basis $\Psi_n$ by coefficients $d_i$:

\[
f_n(x) = \sum_{i=1}^{n} c_i \phi_i(x) = \sum_{i=1}^{n} d_i \psi_i(x).
\]
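
The relation between the two coefficient sets can be sketched in code. If each $\psi_j$ is written in terms of the first basis as $\psi_j = \sum_i T_{ij} \phi_i$, then comparing coefficients gives $c = T d$, so $d$ follows by solving a linear system. The two bases below (monomials and their partial sums), the matrix name `T`, and the helper functions are assumptions made purely for illustration; the solver is plain Gaussian elimination and not a method taken from the text.

```python
from typing import Callable, List, Sequence

BasisFunction = Callable[[float], float]


def evaluate(coeffs: Sequence[float],
             basis: Sequence[BasisFunction],
             x: float) -> float:
    """Evaluate sum_i coeffs[i] * basis[i](x)."""
    return sum(c * b(x) for c, b in zip(coeffs, basis))


def solve_linear_system(a: Sequence[Sequence[float]],
                        b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: the psi_j do not form a basis")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Assumed example bases of the same 3-dimensional space of quadratics.
phi = [lambda x: 1.0, lambda x: x, lambda x: x * x]
psi = [lambda x: 1.0, lambda x: 1.0 + x, lambda x: 1.0 + x + x * x]

# Column j of T holds the Phi-coefficients of psi_j, so that c = T d.
T = [[1.0, 1.0, 1.0],
     [0.0, 1.0, 1.0],
     [0.0, 0.0, 1.0]]

c = [1.0, -2.0, 0.5]           # coefficients over Phi
d = solve_linear_system(T, c)  # coefficients of the same function over Psi
```

Evaluating `evaluate(c, phi, x)` and `evaluate(d, psi, x)` at any point `x` should give the same value up to rounding, since both coefficient vectors describe the same function.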
