Information Technology Reference
In-Depth Information
4.4.2
Nonnegative Rewards
Reward testing is obtained from the probabilistic testing in Sect. 4.2 by associating
each success action ˉ
ʩ a reward, which is a nonnegative number in the unit
interval [0, 1], and a run of a probabilistic process in parallel with a test yielding
an expected reward accumulated by those states that can enable success actions. A
reward tuple h
[0, 1] ʩ
is used to assign reward h ( ˉ ) to success action ˉ , for each
ˉ
ʩ . Due to the presence of nondeterminism, the application of a test T to a
process P produces a set of expected rewards. Two sets of rewards can be compared
by examining their supremum/infimum elements; this gives us two methods of testing
called reward may/must testing. In general, a reward could also be a negative real
number. But in this chapter we only consider nonnegative rewards; the general case
will be discussed in Sect. 6.9.
Formally, let h : ʩ
[0, 1] be a reward vector that assigns a nonnegative
reward to each success action. In analogy to the function
C
in ( 4.1 ), we now define a
h
function
[0, 1]) with respect to reward vector h in order
to associate a reward with a deterministic process, which in turn will associate a set
of rewards with a process.
C
:( R
[0, 1])
( R
ˉ
−ₒ
h ( ˉ )
if r
h ( f )( r ):
ˉ
˄
C
=
(4.10)
0
if r
−ₒ
and r
−ₒ
˄
−ₒ
ˉ
Exp ʔ ( f ) f r
−ₒ
and r
ʔ .
h
h . Let
h ( T , P )
The function
C
is also continuous, thus has a least fixed point
V
A
denote the set of rewards
h )
{
Exp ʘ (
V
|
R , ʘ ,
is a resolution of [[ T
| Act P ]]
}
.
Definition 4.6 Let P and Q be two probabilistic processes. We define two reward
testing preorders:
nr may Q if for every ʩ test T and nonnegative reward tuple h
[0, 1] ʩ ,
(i)
P
A
A
h ( T , P )
h ( T , Q ).
nr must Q if for every ʩ test T and nonnegative reward tuple h
[0, 1] ʩ ,
(ii)
P
h ( T , P )
h ( T , Q ).
A
A
These preorders are abbreviated to P
nr may Q and P
nr must Q , when
|
ʩ
|=
1.
Example 4.2 Let us use the notation a.ˉ from the calculus of communicating sys-
tems to mean the test that can perform action a followed by ˉ before reaching a
deadlock state. By applying this test to the process Q 1 in Fig. 4.3 a, we obtain the
pLTS in Fig. 4.3 b that is already deterministic, hence has only one resolution itself.
Moreover, the outcome
h associated with it is determined by its value at the state
s 0 . This in turn is the least solution of the equation
V
Search WWH ::




Custom Search