Up the Down Staircase: Hierarchical Reinforcement Learning - Realtime Data Mining


A step toward understanding convergence properties of the method in the

nonsymmetric case is the insight that the operator $I - L(R^{T}AL)^{-1}RA$ mapping

the error vector before the coarse grid projection to that after it, that is,

$$e = \left(I - L(R^{T}AL)^{-1}RA\right)e_0,$$

though not orthogonal in general, is always a projector (i.e., its square equals itself)

along the range of the interpolation operator. Moreover, it can be shown that there is

always an inner product in which the correction operator is an orthogonal projector

(Proposition 3.6.2 in [Pap10]). The iteration matrices corresponding to standard

splittings, however, are in general not contractions with respect to such an inner

product. Hence, it is in some cases easier to analyze the asymptotic convergence

rate of applying the V-cycle procedure in an iterative fashion.
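The projector property can be checked numerically on a small example. The following is a minimal sketch, not taken from the text: the matrices are made up for illustration, the name `Rt` is hypothetical, and it is assumed that the same restriction matrix (here `Rt`, playing the role of $R^{T}$) appears both inside and outside the inverse, which is what the dimensions require.

```python
import numpy as np

def coarse_grid_correction(A, L, Rt):
    """Return T = I - L (Rt A L)^{-1} Rt A.

    A  : (n, n) fine-level matrix, possibly nonsymmetric
    L  : (n, m) interpolation matrix
    Rt : (m, n) restriction matrix (hypothetical name, plays the role of R^T)
    """
    coarse = Rt @ A @ L  # (m, m) coarse-level operator
    return np.eye(A.shape[0]) - L @ np.linalg.solve(coarse, Rt @ A)

# Illustrative nonsymmetric tridiagonal matrix (assumed data).
A = np.array([[ 3.0, -0.5,  0.0,  0.0],
              [-2.0,  3.0, -0.5,  0.0],
              [ 0.0, -2.0,  3.0, -0.5],
              [ 0.0,  0.0, -2.0,  3.0]])

# Piecewise-constant interpolation over two aggregates {0, 1} and {2, 3}.
L = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])

# A weighted restriction that is deliberately not the transpose of L.
Rt = np.array([[1.0, 0.5, 0.0, 0.0],
               [0.0, 0.0, 0.5, 1.0]])

T = coarse_grid_correction(A, L, Rt)

is_projector = np.allclose(T @ T, T)             # T squared equals T
annihilates_range_of_L = np.allclose(T @ L, 0.0)  # projects along range(L)
is_symmetric = np.allclose(T, T.T)               # orthogonal in the Euclidean sense?

print(is_projector, annihilates_range_of_L, is_symmetric)
```

For this choice of matrices the first two checks hold and the third fails, which matches the statement that the operator is a projector along the range of the interpolation operator but is not orthogonal in the Euclidean inner product.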

For the sake of completeness, we should mention that it is possible to circumvent

the nonsymmetric case by applying an AMG procedure to the equivalent system

$$A^{T}Ax = A^{T}b,$$

which has a symmetric and positive definite coefficient matrix if A is non-singular.

This approach, however, brings along difficulties of its own. First, the condition number of the

symmetrized matrix $A^{T}A$ is the square of that of $A$, which renders the solution considerably

more sensitive to perturbations in the data. Furthermore, many structural

features of $A$, such as sparsity, are not inherited by the symmetrized system. Therefore,

the symmetrized approach turns out to be unsatisfactory in most situations.
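Both drawbacks can be seen on a toy matrix. The sketch below is illustrative only: the matrix (an identity with a full first row) is an assumption chosen to make the fill-in obvious, and the condition number is taken in the 2-norm.

```python
import numpy as np

n = 8

# Sparse, nonsymmetric, nonsingular matrix: identity with a full first row.
A = np.eye(n)
A[0, :] = 1.0

AtA = A.T @ A  # coefficient matrix of the symmetrized system

cond_A = np.linalg.cond(A)      # 2-norm condition number of A
cond_AtA = np.linalg.cond(AtA)  # 2-norm condition number of A^T A

nnz_A = np.count_nonzero(A)      # 2n - 1 nonzero entries
nnz_AtA = np.count_nonzero(AtA)  # every entry is nonzero

print("cond(A)     =", cond_A)
print("cond(A)^2   =", cond_A ** 2)
print("cond(A^T A) =", cond_AtA)
print("nonzeros in A:", nnz_A, " nonzeros in A^T A:", nnz_AtA)
```

The singular values of $A^{T}A$ are the squares of those of $A$, so the two printed condition numbers agree up to rounding. The single dense row of $A$ is enough to make $A^{T}A$ completely dense.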

With this we come to the last point of this introduction to hierarchical methods for

acceleration of convergence: multilevel splitting can be used in different ways, for example

directly, as multigrid or as preconditioners, additively or multiplicatively, etc. It

is beyond the scope of this study to present them all individually. We refer here in

particular to the Abstract Schwarz Theory [Os94], which gives unified access via

basis transformations to virtually all multilevel methods - including sparse grids. In

the next sections we will concentrate primarily on the algebraic construction of the

grid hierarchy, from which a wide variety of hierarchical approaches can be derived.

6.2 Multilevel Methods for Reinforcement Learning

We now come to the application of multilevel methods for RL. We consider the

Bellman equation in the form (3.15). Let us now define (leaving out the policy

notation)

$$A = I - \gamma P^{\pi}. \qquad (6.9)$$
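To make the role of $A$ concrete, here is a minimal sketch for a three-state example. Equation (3.15) is not reproduced in this section, so the sketch assumes it has the usual fixed-policy form $v = r + \gamma P v$, which gives the linear system $Av = r$. The transition matrix `P`, the reward vector `r`, and the discount factor `gamma` are made-up illustrative values.

```python
import numpy as np

gamma = 0.9  # discount factor (assumed value)

# Row-stochastic transition matrix under a fixed policy (assumed data).
P = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.5, 0.5],
              [0.2, 0.0, 0.8]])

# Expected one-step rewards per state (assumed data).
r = np.array([1.0, 0.0, 2.0])

# System matrix A = I - gamma * P.
A = np.eye(P.shape[0]) - gamma * P

# For gamma < 1 and a stochastic P, A is nonsingular, so a direct solve works.
v = np.linalg.solve(A, r)

bellman_residual = np.max(np.abs(v - (r + gamma * P @ v)))
A_is_symmetric = np.allclose(A, A.T)

print("state values:", v)
print("Bellman residual:", bellman_residual)
print("A symmetric:", A_is_symmetric)
```

Since transition matrices are rarely symmetric, $A$ is nonsymmetric in this example, which is why the nonsymmetric case discussed above is the relevant one for RL.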
