Information Technology Reference
In-Depth Information
4.4.2
Nonnegative Rewards
Reward testing is obtained from the probabilistic testing in Sect.
4.2
by associating
each success action
ˉ
ʩ
a reward, which is a nonnegative number in the unit
interval [0, 1], and a run of a probabilistic process in parallel with a test yielding
an expected reward accumulated by those states that can enable success actions. A
reward tuple
h
∈
[0, 1]
ʩ
∈
is used to assign reward
h
(
ˉ
) to success action
ˉ
, for each
ˉ
ʩ
. Due to the presence of nondeterminism, the application of a test
T
to a
process
P
produces a set of expected rewards. Two sets of rewards can be compared
by examining their supremum/infimum elements; this gives us two methods of testing
called reward may/must testing. In general, a reward could also be a negative real
number. But in this chapter we only consider nonnegative rewards; the general case
will be discussed in Sect. 6.9.
Formally, let
h
:
ʩ
∈
[0, 1] be a reward vector that assigns a nonnegative
reward to each success action. In analogy to the function
ₒ
C
in (
4.1
), we now define a
h
function
[0, 1]) with respect to reward vector
h
in order
to associate a reward with a deterministic process, which in turn will associate a set
of rewards with a process.
C
:(
R
ₒ
[0, 1])
ₒ
(
R
ₒ
⊧
⊨
ˉ
−ₒ
h
(
ˉ
)
if
r
h
(
f
)(
r
):
ˉ
˄
C
=
(4.10)
0
if
r
−ₒ
and
r
−ₒ
⊩
˄
−ₒ
ˉ
Exp
ʔ
(
f
) f
r
−ₒ
and
r
ʔ
.
h
h
. Let
h
(
T
,
P
)
The function
C
is also continuous, thus has a least fixed point
V
A
denote the set of rewards
h
)
{
Exp
ʘ
(
V
|
R
,
ʘ
,
ₒ
is a resolution of [[
T
|
Act
P
]]
}
.
Definition 4.6
Let
P
and
Q
be two probabilistic processes. We define two reward
testing preorders:
nr
may
Q
if for every
ʩ
test
T
and nonnegative reward tuple
h
[0, 1]
ʩ
,
(i)
P
∈
A
≤
A
h
(
T
,
P
)
h
(
T
,
Q
).
nr
must
Q
if for every
ʩ
test
T
and nonnegative reward tuple
h
∈
[0, 1]
ʩ
,
(ii)
P
h
(
T
,
P
)
h
(
T
,
Q
).
A
≤
A
These preorders are abbreviated to
P
nr
may
Q
and
P
nr
must
Q
, when
|
ʩ
|=
1.
Example 4.2
Let us use the notation
a.ˉ
from the calculus of communicating sys-
tems to mean the test that can perform action
a
followed by
ˉ
before reaching a
deadlock state. By applying this test to the process
Q
1
in Fig.
4.3
a, we obtain the
pLTS in Fig.
4.3
b that is already deterministic, hence has only one resolution itself.
Moreover, the outcome
h
associated with it is determined by its value at the state
s
0
. This in turn is the least solution of the equation
V
Search WWH ::
Custom Search