where π[i] denotes the i-th state along π. Thus all paths are considered that hit G at some position, no matter how much time has elapsed upon hitting G. For such (unbounded) reachability events, positional policies suffice, i.e., there is no need to "know" the total time that has elapsed along the computation so far. In fact, p_max(s, [0, ∞)) can be determined by considering the discrete-probabilistic process that is embedded in the IMC at hand. The discretised counterpart of an IMC is an interactive probabilistic chain.
Definition 8 (Interactive probabilistic chain [23]). An interactive probabilistic chain (IPC) is a tuple P = (S, Act, −→, P, s0), where S, Act, −→ and s0 are as in Def. 1, and P : S × S → [0, 1] is a transition probability function satisfying ∀s ∈ S. P(s, S) ∈ {0, 1}.
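To make the definition concrete, the following Python sketch (illustrative only, not taken from [23]; all identifiers are chosen here for exposition) represents an IPC as a record and checks the requirement P(s, S) ∈ {0, 1}.

    from dataclasses import dataclass

    @dataclass
    class IPC:
        """Illustrative container for an IPC (S, Act, -->, P, s0) as in Def. 8."""
        states: set[str]                           # S
        actions: set[str]                          # Act
        interactive: set[tuple[str, str, str]]     # --> as (s, a, s') triples
        P: dict[str, dict[str, float]]             # P[s][s'] = transition probability
        s0: str                                    # initial state

        def check(self) -> None:
            # Def. 8 requires P(s, S) to be either 0 or 1 for every state s.
            for s in self.states:
                mass = sum(self.P.get(s, {}).values())
                assert abs(mass) < 1e-9 or abs(mass - 1.0) < 1e-9, \
                    f"P({s}, S) = {mass} is neither 0 nor 1"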
A state s in an IPC P is probabilistic iff Σ_{s'∈S} P(s, s') = 1 and IT(s) = ∅. As for IMCs, we adopt the maximal progress assumption. Hence, interactive internal transitions take precedence over probabilistic transitions, and their execution takes zero discrete time steps. The embedded IPC P of an IMC I is obtained by considering the discrete-probabilistic interpretation of its Markovian transitions, i.e., P(s, s') = R(s, s')/E(s) if MT(s) ≠ ∅, and 0 otherwise. It then follows:
Theorem 5. For any IMC I with embedded IPC P: p^I(s, [0, ∞)) = p^P(s, [0, ∞)).

The values p^P(s, [0, ∞)) can be obtained by applying a slight variation of value iteration algorithms for MDPs [7].
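For illustration, the following Python sketch (not from [7]; the function name and input encoding are assumptions made here) performs such a value iteration on the embedded IPC: Markovian states use the embedded probabilities R(s, s')/E(s), while states with interactive internal transitions, which take precedence by maximal progress, are resolved by maximisation.

    def unbounded_reach_max(R, internal, goal, states, eps=1e-9):
        """Value iteration for p_max(s, [0, inf)) on the embedded IPC.

        R[s][s']    : Markovian rates of the IMC (missing entries mean rate 0)
        internal[s] : internal interactive successors of s (maximal progress:
                      if non-empty, the Markovian transitions of s are ignored)
        goal        : set of goal states G
        """
        v = {s: (1.0 if s in goal else 0.0) for s in states}
        while True:
            diff, new = 0.0, {}
            for s in states:
                if s in goal:
                    new[s] = 1.0                      # goal states stay at 1
                elif internal.get(s):
                    # nondeterministic choice among internal transitions
                    new[s] = max(v[t] for t in internal[s])
                else:
                    E = sum(R.get(s, {}).values())    # exit rate E(s)
                    if E == 0.0:
                        new[s] = 0.0                  # absorbing non-goal state
                    else:
                        # embedded IPC: P(s, s') = R(s, s') / E(s)
                        new[s] = sum(r / E * v[t] for t, r in R[s].items())
                diff = max(diff, abs(new[s] - v[s]))
            v = new
            if diff < eps:
                return v

Starting from 0 outside G, the iteration converges from below to the least fixed point, which is the unbounded reachability probability; the maximisation over internal successors is the MDP flavour mentioned above.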
Discretisation. The computation of p_max(s, I) with inf I ≠ 0 can be done using discretisation and, as we will see, can also be reduced (though in a different way than explained above) to value iteration on MDPs.
Definition 9 (Discretisation [66]). An IMC I = (S, Act, −→, ⇒, s0) and a step duration δ ∈ ℝ>0 induce the discretised IPC Pδ = (S, Act, −→, P, s0), where

    P(s, s') = (1 − e^{−E(s)·δ}) · P(s, s')                    if s' ≠ s
    P(s, s') = (1 − e^{−E(s)·δ}) · P(s, s') + e^{−E(s)·δ}      if s' = s.     (1)
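As an illustration of Eq. (1), the sketch below (function and variable names are chosen here, not taken from [66]) computes the transition probabilities of the discretised IPC from the Markovian rates and the step duration δ; interactive transitions are left untouched.

    import math

    def discretise(R, delta):
        """Discretised transition probabilities according to Eq. (1).

        R[s][s'] : Markovian rates of the IMC
        delta    : step duration
        Returns P_delta[s][s'] for Markovian states; interactive transitions
        of the IMC are kept as they are and not represented here.
        """
        P_delta = {}
        for s, succ in R.items():
            E = sum(succ.values())                 # exit rate E(s)
            if E == 0.0:
                P_delta[s] = {s: 1.0}              # no Markovian transition: stay put
                continue
            stay = math.exp(-E * delta)            # e^{-E(s)*delta}
            row = {t: (1.0 - stay) * (rate / E)    # (1 - e^{-E(s)*delta}) * P(s, t)
                   for t, rate in succ.items()}
            row[s] = row.get(s, 0.0) + stay        # extra self-loop term for s' = s
            P_delta[s] = row
        return P_delta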
Let p^P_max(s, [ka, kb]) for an IPC P with state s and step interval 0 ≤ ka ≤ kb, where ka, kb ∈ ℕ, be the supremum of the probabilities to reach a set of goal states within step interval [ka, kb]. The following result allows timed reachability probabilities in the underlying IMC to be approximated by a step-bounded reachability analysis in its discretised IPC. This discretisation is indeed quantifiably correct:
Theorem 6 (Approximation theorem [66]). Let I = (S, Act, −→, ⇒, s0) be an IMC, G ⊆ S a set of goal states and δ > 0 a step duration. Further, let I be a time interval with inf I = a and sup I = b such that a < b and a = ka·δ and b = kb·δ for some ka ∈ ℕ and kb ∈ ℕ>0. Then:

    p^{Pδ}_max(s, (ka, kb]) − ka · (λδ)²/2  ≤  p_max(s, I)  ≤  p^{Pδ}_max(s, (ka, kb]) + kb · (λδ)²/2 + λδ.
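To show how the theorem can be used, the following sketch (assumptions made here: a fully probabilistic discretised IPC with no remaining interactive transitions, and lam denoting the rate bound λ appearing in the theorem; all names are illustrative) computes the step-interval bounded probability in Pδ and the resulting enclosure of p_max(s, I). The general, nondeterministic case replaces the weighted sum by a maximisation over the enabled choices, as in step-bounded MDP value iteration.

    def step_interval_reach(P_delta, goal, states, ka, kb):
        """Probability of visiting 'goal' at some step in (ka, kb] in a fully
        probabilistic discretised IPC, by backward iteration over step indices."""
        v = {s: (1.0 if s in goal else 0.0) for s in states}   # values at step kb
        for k in range(kb - 1, -1, -1):                        # steps kb-1, ..., 0
            new = {}
            for s in states:
                if k > ka and s in goal:
                    new[s] = 1.0          # goal visited inside the step window
                else:
                    # at or before step ka, passing through goal does not count
                    new[s] = sum(p * v[t] for t, p in P_delta[s].items())
            v = new
        return v                           # v[s] = p(s, (ka, kb]) in the discretised IPC

    def enclosure(p_disc, lam, delta, ka, kb):
        """Theorem 6: interval guaranteed to contain p_max(s, I)."""
        half = (lam * delta) ** 2 / 2.0
        return p_disc - ka * half, p_disc + kb * half + lam * delta

One way to pick δ is to make kb·(λδ)²/2 + λδ no larger than a desired accuracy ε; since ka ≤ kb, this bounds the deviation on both sides.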
 