The Big Picture: Toward a Synthesis of RL and Adaptive Tensor Factorization - Realtime Data Mining


consisting of only one set, the update rule coincides with that for a classical MDP. Hence, our factorization model incorporates the case where a k-MDP is approximated by a 1-MDP model. This is of epistemic value for assessing the quality of 1-MDP models in environments that actually satisfy a GMA with k > 1. Specifically, recommendation environments may arguably be assumed to be more accurately represented by a k-MDP, so this insight may enable us to assess the quality of the classical MDP models discussed in the preceding chapters. For example, we may obtain bounds on the modeling error entailed by employing a classical MDP model from bounds on the approximation error of the factorized representation. Admittedly, we are as yet in no position to produce such error bounds here. Hence, we leave the topic for future research.

10.4 Factored Representation and Computation of the State Values

10.4.1 A Model-Based Approach

In the following, we shall be interested in approximations of the form

$$
v_{\bar{s}s} \approx \sum_{\beta \in m} u_{\bar{s}\beta}\,\theta_{s\beta} = \theta_{s\beta(\bar{s})} \tag{10.9}
$$

to the state-value function. Here, U denotes an aggregation prolongator as introduced in Equation (10.5). In order to solve the Bellman equation (10.1) approximately, we devise the least squares approach

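$$
\min_{\theta} \sum_{\bar{s} \in S,\; s \in S} \Bigl( \theta_{s\beta(\bar{s})} - \gamma \sum_{s' \in S} c_{ss'\beta(\bar{s})}\,\theta_{s'\beta(s)} - b_{\bar{s}s} \Bigr)^{2}, \tag{10.10}
$$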

which is obtained by inserting the factorized representations (10.9) and (10.4), with U taken to be the aggregation prolongator defined in (10.5), for v and P in the least squares version of (10.1),

$$
\min_{v} \sum_{\bar{s} \in S,\; s \in S} \Bigl( v_{\bar{s}s} - \gamma \sum_{s' \in S} p_{\bar{s}ss'}\,v_{ss'} - b_{\bar{s}s} \Bigr)^{2}.
$$
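The second equality in (10.9) rests on the defining property of an aggregation prolongator: each row of U contains exactly one unit entry, so the sum over the aggregates collapses to a single term. The following minimal sketch checks this on a toy example. It assumes that the first index of v refers to the preceding state (called `s_prev` here) and that `beta[s_prev]` names the aggregate containing it; the sizes, the assignment `beta`, and the parameter values are hypothetical and chosen only for illustration.

```python
# Hypothetical toy sizes: 4 states, 2 aggregates.
n_states, n_aggregates = 4, 2

# beta[s_prev] is the aggregate assumed to contain the preceding state s_prev.
beta = [0, 0, 1, 1]

# Aggregation prolongator U: one unit entry per row, zeros elsewhere.
U = [[1.0 if beta[s_prev] == k else 0.0 for k in range(n_aggregates)]
     for s_prev in range(n_states)]

# Arbitrary illustrative parameters theta[s][k].
theta = [[10.0 * s + k for k in range(n_aggregates)] for s in range(n_states)]


def v_factored(s_prev, s):
    """Sum form of (10.9): sum over k of U[s_prev][k] * theta[s][k]."""
    return sum(U[s_prev][k] * theta[s][k] for k in range(n_aggregates))


def v_lookup(s_prev, s):
    """Collapsed form of (10.9): a single table lookup."""
    return theta[s][beta[s_prev]]


# Both forms agree for every pair (s_prev, s).
agree = all(v_factored(p, s) == v_lookup(p, s)
            for p in range(n_states) for s in range(n_states))
```

Because the sum collapses to a lookup, the factored value function needs only one parameter per pair of state and aggregate, rather than one per pair of states.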

As regards practical computation in a recommendation framework, one may proceed as follows: first, the core tensor C is estimated from observations by means of the updating procedure (10.7). Subsequently, Equation (10.10) may be solved by means of numerical linear algebra.
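The second step can be sketched as follows, under stated assumptions rather than as the authors' implementation. The sketch takes the core tensor as already estimated and stored as `c[s][s2][k]`, the right-hand side as `b[s_prev][s]`, and the aggregate of a preceding state as `beta[s_prev]`; these layouts, the helper names, and the toy data are hypothetical. Each pair (`s_prev`, `s`) contributes one row of the residual in (10.10), and the resulting linear least squares problem is solved through the normal equations with plain Gaussian elimination.

```python
def assemble_system(c, b, beta, gamma):
    """Build one residual row of (10.10) per pair (s_prev, s)."""
    n, m = len(beta), max(beta) + 1
    col = lambda s, k: s * m + k            # position of theta[s][k] in the flat vector
    A, rhs = [], []
    for s_prev in range(n):
        for s in range(n):
            row = [0.0] * (n * m)
            row[col(s, beta[s_prev])] += 1.0
            for s2 in range(n):
                row[col(s2, beta[s])] -= gamma * c[s][s2][beta[s_prev]]
            A.append(row)
            rhs.append(b[s_prev][s])
    return A, rhs, m


def solve_least_squares(A, rhs):
    """Minimize ||A x - rhs||^2 via the normal equations (A^T A) x = A^T rhs."""
    cols = len(A[0])
    M = [[sum(r[i] * r[j] for r in A) for j in range(cols)]
         + [sum(r[i] * y for r, y in zip(A, rhs))] for i in range(cols)]
    for i in range(cols):
        p = max(range(i, cols), key=lambda r: abs(M[r][i]))   # partial pivoting
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, cols):
            f = M[r][i] / M[i][i]
            for j in range(i, cols + 1):
                M[r][j] -= f * M[i][j]
    x = [0.0] * cols
    for i in reversed(range(cols)):                           # back substitution
        x[i] = (M[i][cols] - sum(M[i][j] * x[j] for j in range(i + 1, cols))) / M[i][i]
    return x


# Hypothetical toy problem: 3 states, 2 aggregates, discount 0.9.
n, gamma, beta = 3, 0.9, [0, 0, 1]
raw = [[[1.0 + (s + 2 * s2 + k) % 3 for k in range(2)] for s2 in range(n)]
       for s in range(n)]
# Normalize over the successor state so each c[s][.][k] sums to one.
c = [[[raw[s][s2][k] / sum(raw[s][t][k] for t in range(n)) for k in range(2)]
      for s2 in range(n)] for s in range(n)]

# Choose reference parameters and build b so that the residual vanishes for them.
theta_true = [[1.0 + s + 0.5 * k for k in range(2)] for s in range(n)]
b = [[theta_true[s][beta[p]]
      - gamma * sum(c[s][s2][beta[p]] * theta_true[s2][beta[s]] for s2 in range(n))
      for s in range(n)] for p in range(n)]

A, rhs, m = assemble_system(c, b, beta, gamma)
x = solve_least_squares(A, rhs)
theta_hat = [[x[s * m + k] for k in range(m)] for s in range(n)]
max_error = max(abs(theta_hat[s][k] - theta_true[s][k])
                for s in range(n) for k in range(m))
```

The toy data are constructed so that the reference parameters solve the problem exactly, which makes the recovered `theta_hat` easy to check. For problems of realistic size, a library routine such as `numpy.linalg.lstsq` would be the more sensible choice than the hand-written solver, since forming the normal equations squares the condition number of the system.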
