Information Technology Reference
In-Depth Information
Suppose ʴ<
1
and ʔ
ⓦ
is a subdistribution in a finitary pLTS given by
Lemma 4.9
T
,
ʘ
ⓦ
,
S
,
ʩ
˄
,
ₒ
. There exists a static resolution
ₒ
such that
ʴ
,
h
max
(
ʔ
ⓦ
)
ʴ
,
h
(
ʘ
ⓦ
)
V
= V
for any reward vector h.
T
,
ʘ
ⓦ
,
Proof
Let
ₒ
be a resolution with an injective resolving function
f
such
that
˄
−ₒ
˄
−ₒ
ʴ
,
max
(
f
(
ʘ
))
ʴ
,
max
(
ʔ
)
if
t
ʘ
then
V
=
max
{V
|
f
(
t
)
ʔ
}
.
A finitary pLTS is finitely branching, which ensures the existence of such resolving
function
f
.
Let
g
:
T
ʴ
,
max
(
f
(
t
)) for all
t
ₒ
[0, 1] be the function defined by
g
(
t
)
= V
∈
T
.
ˉ
−ₒ
ˉ
−ₒ
ʴ
,
h
.If
t
Below we show that
g
is a fixed point of
C
, then
f
(
t
)
. Therefore,
˄
−ₒ
ʘ
.By
ˉ
ʴ
(
g
)(
t
)
ʴ
,
max
(
f
(
t
))
C
=
h
(
ˉ
)
= V
=
g
(
t
). Now suppose
t
−ₒ
and
t
˄
−ₒ
f
(
ʘ
) such that the condition,
ˉ
the definition of
f
,wehave
f
(
t
)
−ₒ
,
f
(
t
)
˄
−ₒ
ʴ
,
max
(
f
(
ʘ
))
ʴ
,
max
(
ʔ
)
V
=
max
{V
|
f
(
t
)
ʔ
}
, holds. Therefore,
ʴ
,
h
(
g
)(
t
)
C
=
ʴ
·
g
(
ʘ
)
·
t
∈
T
ʘ
(
t
)
=
ʴ
·
g
(
t
)
·
t
∈
T
ʘ
(
t
)
=
ʴ
· V
ʴ
,
max
(
f
(
t
))
·
s
∈
S
f
(
ʘ
)(
s
)
ʴ
,
h
=
ʴ
· V
max
(
s
)
ʴ
,
max
(
f
(
ʘ
))
=
ʴ
· V
˄
−ₒ
ʴ
,
max
(
ʔ
)
=
ʴ
·
max
{V
|
f
(
t
)
ʔ
}
ʴ
,
max
(
f
(
t
))
= V
=
g
(
t
)
.
Since
C
ʴ
has a unique fixed point
V
ʴ
,
h
, we derive that
g
coincides with
V
ʴ
,
h
, i.e.
ʴ
,
h
(
t
)
ʴ
,
max
(
f
(
t
)) for all
t
∈
T
, from which we can obtain the required
V
=
g
(
t
)
= V
ʴ
,
h
(
ʘ
ⓦ
)
ʴ
,
max
(
ʔ
ⓦ
).
result
V
= V
Let ʔ
ⓦ
be a subdistribution in a finitary pLTS
Theorem 4.3
S
,
ʩ
˄
,
ₒ
. There
T
,
ʘ
ⓦ
,
h
)
h
exists a static resolution
ₒ
such that Exp
ʘ
ⓦ
(
V
=
Exp
ʔ
ⓦ
(
V
max
)
.
Proof
(0, 1), there exists a static
resolution that achieves the maximum expected reward. Since the pLTS is finitary,
there are finitely many different static resolutions. There must exist a static resolution
that achieves the maximum expected reward for infinitely many discount factors. In
other words, for every nondecreasing sequence
By Lemma
4.9
, for every discount factor
d
∈
{
ʴ
n
}
n
≥
1
converging to 1, there exists
T
,
ʘ
ⓦ
,
a subsequence
{
ʴ
n
k
}
k
≥
1
and a static resolution
−ₒ
with resolving function
Search WWH ::
Custom Search