Information Technology Reference
In-Depth Information
where
π
[
i
] denotes the
i
-th state along
π
. Thus all paths are considered that hit
G
at some position, no matter how much time has elapsed upon hitting
G
.Forsuch
(unbounded) reachability events, positional policies suce, i.e., there is no need
anymore to “know” the total time that has elapsed along the computation so far.
In fact,
p
max
(
s,
[0
,
∞
)) can be determined by considering the discrete-probabi-
listic process that is embedded in the IMC at hand. The discretised counterpart
of an IMC is an interactive probabilistic chain.
Definition 8 (Interactive probabilistic chain [23]).
An
interactive proba-
bilistic chain
(IPC) is a tuple
P
=(
S,
Act
,
−→
,
P
,s
0
)
,where
S,
Act
,
IT
and
s
0
are as in Def. 1 and
P
:
S
×
S
→
[0
,
1]
is a transition probability function
sastifying
∀
s
∈
S.
P
(
s, S
)
∈{
0
,
1
}
.
is
probabilistic
iff
s
∈S
P
(
s, s
)=1and
A state
s
in an IPC
.Asfor
IMCs, we adopt the
maximal progress assumption
. Hence, interactive internal
transitions take precedence over probabilistic transitions and their execution
takes zero discrete time steps. The embedded IPC of an IMC is obtained by
considering the discrete-probabilistic interpretation of
P
IT
(
s
)=
∅
, i.e.,
P
(
s, s
)=
R
(
s,s
)
E
(
s
)
⇒
MT
(
s
)
∅
if
=
,and0otherwise.Itthenfollows:
I
P
:
p
I
(
s,
[0
,
∞
))=
p
P
(
s,
[0
,
∞
Theorem 5.
For any IMC
with embedded IPC
))
.
The values
p
P
(
s,
[0
,
)) can be obtained by applying a slight variation of value
iteration algorithms for MDPs [7].
∞
Discretisation.
The computation of
p
max
(
s, I
)withinf
I
canbedoneusing
discretisation, and as we will see, can also be reduced —though in a different
way as explained above— to value iteration on MDPs.
=
∅
Definition 9 (Discretisation [66]).
An IMC
I
=(
S,
Act
,
−→
,
⇒
,s
0
)
and
,
P
,s
0
)
,
astepduration
δ
∈
R
>
0
induce the
discretised IPC
P
δ
=(
S,
Act
,
−→
where
P
(
s, s
)=
1
e
−E
(
s
)
·δ
·
P
(
s, s
)
=
s
−
if
s
1
e
−E
(
s
)
·δ
(1)
P
(
s, s
)+
e
−E
(
s
)
·δ
if
s
=
s
.
−
·
Let
p
max
(
s,
[
k
a
,k
b
]) for an IPC
k
b
be
the supremum of the probabilities to reach a set of goal states within step interval
[
k
a
,k
b
],
k
a
,k
b
∈
N
P
with state
s
and step-interval 0
≤
k
a
≤
. The following result allows to approximate this probability
in the underlying IMC by a step-bounded reachability analysis in its discretised
IPC. This discretisation is indeed
quantifiably correct
:
Theorem 6 (Approximation theorem [66]).
Let
I
=(
S,
Act
,
−→
,
⇒
,s
0
)
be an IMC,
G
S
a set of goal states and
δ>
0
a step duration. Further, let
I
be a time interval with
inf
I
=
a
and
sup
I
=
b
such that
a<b
and
a
=
k
a
δ
and
b
=
k
b
δ
for some
k
a
∈
N
⊆
and
k
b
∈
N
>
0
.Then:
p
P
max
s,
(
k
a
,k
b
]
−
(
λδ
)
2
2
≤
p
P
max
s,
(
k
a
,k
b
]
+
k
b
·
(
λδ
)
2
2
p
max
k
a
·
(
s, I
)
≤
+
λδ.