particular by using the isomorphism ( 4.1 ) between states and actions in REs and
so further extend the concept of the hierarchical splitting of states to actions.
6.1 Introduction
We will approach the problem of hierarchical methods from two sides: first from
the historical, analytical viewpoint, then from the algebraic viewpoint. We
deliberately omit most of the mathematical infrastructure, which is in parts
exceedingly complex, and attempt to explain the underlying ideas in an
understandable (and sometimes slightly simplified) fashion.
6.1.1 Analytical Approach
So far we have only ever considered the state-value and action-value functions v(s)
and q(s, a) in tabular form. However, these are functions, and so in RL we often
have to resort to approximation methods such as linear and polynomial functions
or neural networks in order to represent them using only a few coefficients.
We therefore now start from the functions themselves.
Since most useful function spaces $V$ are infinite-dimensional, we will instead
consider finite-dimensional subspaces $V_n \subset V$, where $n$ is their dimension. In most
cases this is the central assumption for being able to efficiently find a numerical
solution of the associated operator equation.
We can represent a function $f_n \in V_n$ of such a finite-dimensional subspace as follows:

$$f_n(x) = \sum_{i=1}^{n} c_i \phi_i(x), \tag{6.1}$$

where $\phi_i$ are the basis functions and $c_i$ are their coefficients. By inserting the
ansatz (6.1) into the operator equation, for instance the Bellman equation
(3.7), we can reduce the determination of the function $f \in V$ to the determination of
the coefficients $c_i$ of our approximating function $f_n \in V_n$. (We omit here
subtleties such as the use of test functions in variational formulations.)
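To make the reduction to coefficients concrete, here is a minimal sketch in Python with NumPy. It is not the method of the text: it assumes a monomial basis $\phi_i(x) = x^{i-1}$ on the interval [0, 1] and replaces the operator equation with a plain least-squares fit to a known target function. The helper names `basis_matrix`, `fit_coefficients` and `evaluate` are hypothetical and chosen only for illustration.

```python
import numpy as np

def basis_matrix(x, n):
    """Column i holds the assumed basis function phi_i(x) = x**(i-1), i = 1..n."""
    return np.vander(x, N=n, increasing=True)

def fit_coefficients(f, n, num_samples=200):
    """Least-squares coefficients c_i of f_n = sum_i c_i * phi_i on [0, 1]."""
    x = np.linspace(0.0, 1.0, num_samples)
    A = basis_matrix(x, n)
    c, *_ = np.linalg.lstsq(A, f(x), rcond=None)
    return c

def evaluate(c, x):
    """Evaluate f_n(x) = sum_i c_i * phi_i(x) for the coefficient vector c."""
    x = np.asarray(x, dtype=float)
    return basis_matrix(x, len(c)) @ c

# Represent exp on [0, 1] by only n = 5 coefficients.
c = fit_coefficients(np.exp, n=5)
x_test = np.linspace(0.0, 1.0, 50)
max_error = np.max(np.abs(evaluate(c, x_test) - np.exp(x_test)))
```

Once the basis is fixed, the unknown is no longer a function but the vector `c`, which is the point of the ansatz (6.1). A different basis or a different fitting criterion would change the coefficients but not this structure.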
As a rule, there are several bases for a function space $V_n$. In addition to the
basis $\Phi_n = [\phi_1, \phi_2, \ldots, \phi_n]$, let us consider another basis
$\Psi_n = [\psi_1, \psi_2, \ldots, \psi_n]$ of $V_n$. Then every function $f_n \in V_n$
can also be represented over the basis $\Psi_n$ by coefficients $d_i$:

$$f_n(x) = \sum_{i=1}^{n} c_i \phi_i(x) = \sum_{i=1}^{n} d_i \psi_i(x).$$
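The two representations can be checked numerically. The following sketch assumes, purely for illustration, n = 3, the monomial basis $\phi = [1, x, x^2]$ and a second basis $\psi = [1, 1 + x, 1 + x + x^2]$. The matrix `T` and the function names are hypothetical. Each column of `T` expresses one $\psi_j$ in the $\phi$ basis, so the coefficient vectors are related by c = T d.

```python
import numpy as np

# Column j of T holds the phi-coefficients of psi_j: psi_j = sum_i T[i, j] * phi_i.
# Assumed example: psi_1 = 1, psi_2 = 1 + x, psi_3 = 1 + x + x**2.
T = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

def phi(x):
    """Values of the basis functions phi_1..phi_3 at x, one column per function."""
    x = np.asarray(x, dtype=float)
    return np.stack([np.ones_like(x), x, x**2], axis=-1)

def psi(x):
    """Values of psi_1..psi_3 at x, obtained from phi via the matrix T."""
    return phi(x) @ T

def change_coefficients(c):
    """Given coefficients c over phi, return coefficients d over psi (c = T d)."""
    return np.linalg.solve(T, c)

c = np.array([1.0, 2.0, 3.0])        # f_n(x) = 1 + 2x + 3x^2 over phi
d = change_coefficients(c)           # the same function over psi
x = np.linspace(-1.0, 1.0, 7)
f_phi = phi(x) @ c                   # sum_i c_i * phi_i(x)
f_psi = psi(x) @ d                   # sum_i d_i * psi_i(x)
```

Here `d` comes out as [-1, -1, 3], and `f_phi` and `f_psi` agree at every sample point: the function is the same, only its coefficients depend on the chosen basis.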