$$d_\Phi := r - (\Phi\theta)(s) + \gamma\,(\Phi\theta)(s') = r - (e_s - \gamma e_{s'})^T \Phi\theta.$$
We shall show that, when applied simultaneously to the same k-MDP, the
iterates satisfy
$v = \theta$
for all iterations. To this end, we point out that the following simplification holds in
the special case $m = 1$ considered here:
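$$\Phi\theta = \sum_{\alpha,\beta} \phi_{\alpha\beta}\,\theta_{\alpha\beta} = \sum_{\alpha} \phi_{\alpha 1}\,\theta_{\alpha 1} = \sum_{\alpha} \delta_{\alpha = s}\,\theta_{\alpha 1},$$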
which implies, with an inconsequential abuse of notation in the last equality,
$$(\Phi\theta)(s) = e_s^T \Phi\theta = \theta_{s1} =: \theta_s \quad \forall s \in S.$$
Hence,
$$d_\Phi = r - (e_s - \gamma e_{s'})^T \Phi\theta = r - e_s^T \Phi\theta + \gamma\, e_{s'}^T \Phi\theta = r - \theta_s + \gamma\,\theta_{s'}.$$
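Now it only remains to show that
$$\Phi^T z = \tilde z$$
for all iterations, which we carry out by induction: both $z$ and $\tilde z$ are initialized
as vectors of all zeros. Therefore, the sought-after statement holds for the first
iteration. To conclude the induction, we argue as follows: let $\bar z$, $\bar{\tilde z}$ denote the
previous values of $z$, $\tilde z$, i.e.,
$$z = \gamma\lambda\,\bar z + e_s, \qquad \tilde z = \gamma\lambda\,\bar{\tilde z} + e_s.$$
Since, by the induction assumption,
$$\Phi^T \bar z = \bar{\tilde z},$$
and since $\Phi^T e_s = e_s$ for $m = 1$, we obtain
$$\Phi^T z = \Phi^T\left(\gamma\lambda\,\bar z + e_s\right) = \gamma\lambda\,\Phi^T \bar z + \Phi^T e_s = \gamma\lambda\,\bar{\tilde z} + e_s = \tilde z,$$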
which yields the desired result.
□
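The statement of the proof can also be checked numerically. The following is a minimal sketch, not the algorithm as specified in the text: it assumes the standard TD(λ) update rules with a constant step size `alpha`, and it draws transitions and rewards at random instead of from an actual k-MDP. The function name `compare_td_lambda` and all parameter values are hypothetical. For $m = 1$ the rows of $\Phi$ are taken to be the unit vectors $e_s$, as in the simplification above.

```python
import random


def unit(n, i):
    """Return the unit vector e_i of length n as a list of floats."""
    return [1.0 if j == i else 0.0 for j in range(n)]


def dot(a, b):
    """Plain inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def compare_td_lambda(n=5, steps=500, gamma=0.9, lam=0.7, alpha=0.1, seed=0):
    """Run tabular TD(lambda) and linear TD(lambda) with m = 1 side by side.

    Assumptions (not taken from the text): constant step size alpha,
    uniformly random transitions and rewards.
    """
    rng = random.Random(seed)
    phi = [unit(n, s) for s in range(n)]  # m = 1: row s of Phi is e_s

    v, z_tab = [0.0] * n, [0.0] * n       # tabular iterate and its trace
    theta, z_lin = [0.0] * n, [0.0] * n   # parameters and their trace

    max_gap = 0.0
    s = rng.randrange(n)
    for _ in range(steps):
        s_next = rng.randrange(n)         # hypothetical random transition
        r = rng.uniform(-1.0, 1.0)        # hypothetical random reward

        # Tabular update: d = r - v_s + gamma * v_{s'}
        d = r + gamma * v[s_next] - v[s]
        z_tab = [gamma * lam * z + e for z, e in zip(z_tab, unit(n, s))]
        v = [vi + alpha * d * zi for vi, zi in zip(v, z_tab)]

        # Linear update: d_Phi = r - (e_s - gamma * e_{s'})^T Phi theta
        d_phi = r + gamma * dot(phi[s_next], theta) - dot(phi[s], theta)
        z_lin = [gamma * lam * z + f for z, f in zip(z_lin, phi[s])]
        theta = [t + alpha * d_phi * z for t, z in zip(theta, z_lin)]

        # Track the largest componentwise difference between v and theta
        max_gap = max(max_gap, max(abs(a - b) for a, b in zip(v, theta)))
        s = s_next
    return v, theta, max_gap
```

Calling `compare_td_lambda()` returns both iterates together with the largest componentwise gap observed over all iterations; in line with the proof, this gap should vanish up to rounding.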