Database Reference
In-Depth Information
F 1
P P
:
:
58
The k -means algorithm ( k ¼ 2) applied to this problem automatically finds the
first partition G 1 ¼ {1,2}, G 2 ¼ {3} which is in fact the best one.
If we, by the way, use the global transition probabilities P for all i , i.e., P ðÞ ¼ P ,
this results, as expected, in a higher approximation error of approximately 2.07.
10.3 Estimation of Factorized Transition Probabilities
In what follows, we shall present a procedure to estimate the factorized transition
probabilities in an adaptive online fashion, that is, a method doing without any
representation of an estimate of the full transition probability tensor.
To this end, recall the update rule for the transition probabilities of a classical
MDP presented in Chap. 3 . In virtue of state space augmentation, this rule and the
corresponding convergence result carry over to the k -MDP case as follows:
P s ss 0 þ t 1
1 t 1
P s ss 0
:
In tensor notation, the update rule reads as
P þ t 1 e s e s e s 0 ,
1 t 1
P
where
1,
i ¼ j ,
eðÞ j ¼
0,
i
6¼ j
:
Matricifying with respect to the first mode and pre-multiplying with U + eventu-
ally gives rise to the update rule
C 1
þ t 1 e β ðÞ
1 t 1
C
G β ðÞ e s e s 0 ,
ð 10
:
7 Þ
where
β
(s) denotes the unique index
β
m satisfying s
G β , for the core tensor. In
index notation, this corresponds to
C ss 0 β ðÞ þ t 1
C ss 0 β ðÞ 1 t 1
:
ð 10
:
8 Þ
Co nvergence of the update rule follows immediately from convergence of that
for P and partial continuity of the multilinear product.
Besides its computational use, the update rule also reveals another interesting
property of the factorization model: for the trivial partition, that is, the partition
Search WWH ::




Custom Search