sets the function value at state to value. It is only supported for StateValueTable because in general a mining model cannot be changed directly but is the result of the mining process.
public void updateValue(State state, double value) throws MiningException;
updates the function value. It adds the new value to the current value of the function. This method internally combines getValue with setValue and thus is also limited to models of StateValueTable.
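The relationship between the three methods can be illustrated with a minimal sketch. It is not the library's implementation: the class SketchStateValueTable is hypothetical, states are reduced to integer indices, and a default value of 0.0 for states that were never set is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table-backed state-value function.
// States are identified by an integer index; the library's State and
// StateValueTable classes are not reproduced here.
class SketchStateValueTable {
    private final Map<Integer, Double> values = new HashMap<>();

    // Returns the stored value, or 0.0 for a state not yet set (assumption).
    double getValue(int stateIndex) {
        return values.getOrDefault(stateIndex, 0.0);
    }

    // Overwrites the function value at the given state.
    void setValue(int stateIndex, double value) {
        values.put(stateIndex, value);
    }

    // Adds 'value' to the current function value: a get followed by a set.
    void updateValue(int stateIndex, double value) {
        setValue(stateIndex, getValue(stateIndex) + value);
    }
}

class UpdateValueDemo {
    public static void main(String[] args) {
        SketchStateValueTable table = new SketchStateValueTable();
        table.setValue(0, 1.0);
        table.updateValue(0, 0.5); // value at state 0 becomes 1.0 + 0.5
        System.out.println(table.getValue(0));
    }
}
```

Because updateValue is expressed through getValue and setValue, any restriction on setValue (here: a writable table) carries over to it.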
Example 12.19
Consider a simple state-value function with just two states. Using the default StateValueTable, it can be written as follows:
// Define state-value function (with two states):
StateValueFunction sf = new StateValueFunction(2);
double[] s1 = {1};
State st1 = new State(s1, 0); // state index 0
sf.setValue(st1, 0.9);
double[] s2 = {2};
State st2 = new State(s2, 1); // state index 1
sf.setValue(st2, 1.9);
// Retrieve function value:
System.out.println(st1 + " -> val1 = " + sf.getValue(st1));
■
The class ActionValueFunction for action-value functions q(s, a) is similar to StateValueFunction but uses the state-action pair (s, a) instead of a state s. The StateActionVector class is the internal representation of the state-action pair.
Consequently, ActionValueFunction also owns a variable function of the class MiningModel to store the function values. For discrete problems, in further analogy to StateValueFunction, it provides an extended TableMiningModel, the ActionValueTable, to store all pairs of argument and function value, i.e., {(s, a), Q(s, a)}.
ActionValueFunction contains methods to get, set, and update its function values that are similar to those of StateValueFunction, but with state-action pairs as keys (instead of states only).
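As a rough illustration of a table keyed by state-action pairs, consider the following sketch. The classes SketchStateAction and SketchActionValueTable are hypothetical and only mimic the roles of StateActionVector and ActionValueTable; integer indices for states and actions and a default value of 0.0 are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type standing in for the state-action pair (s, a).
final class SketchStateAction {
    final int state;
    final int action;

    SketchStateAction(int state, int action) {
        this.state = state;
        this.action = action;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SketchStateAction)) {
            return false;
        }
        SketchStateAction that = (SketchStateAction) other;
        return state == that.state && action == that.action;
    }

    @Override
    public int hashCode() {
        return Objects.hash(state, action);
    }
}

// Hypothetical table of pairs {(s, a), q(s, a)}.
class SketchActionValueTable {
    private final Map<SketchStateAction, Double> values = new HashMap<>();

    // Returns q(s, a), or 0.0 if the pair was never set (assumption).
    double getValue(int state, int action) {
        return values.getOrDefault(new SketchStateAction(state, action), 0.0);
    }

    // Overwrites q(s, a).
    void setValue(int state, int action, double value) {
        values.put(new SketchStateAction(state, action), value);
    }

    // Adds 'value' to the current q(s, a).
    void updateValue(int state, int action, double value) {
        setValue(state, action, getValue(state, action) + value);
    }
}
```

The only structural difference from a state-value table is the key: the same state can carry a different value for each action.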
Policies
The abstract class Policy is the base class of the stochastic policy π(s, a) (see Sect. 3.3). It owns a variable actionSet to store the corresponding action set A(s). The values of the actions, called action values, can be defined in different ways, most importantly by virtue of an ActionValueFunction.
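To make the link between a policy and action values concrete, the sketch below derives a policy from an action-value function. All names (SketchPolicy, SketchGreedyPolicy, probability, bestAction) are hypothetical and do not reflect the library's Policy class. The greedy rule is just one simple choice, picked here for illustration: it is a degenerate stochastic policy that assigns probability 1 to the action with the highest value.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical base class: holds the action set and exposes pi(s, a).
abstract class SketchPolicy {
    protected final int[] actionSet;

    SketchPolicy(int[] actionSet) {
        if (actionSet.length == 0) {
            throw new IllegalArgumentException("action set must not be empty");
        }
        this.actionSet = actionSet.clone();
    }

    // Probability of choosing 'action' in 'state'.
    abstract double probability(int state, int action);
}

// Hypothetical policy whose action values come from a function q(s, a).
class SketchGreedyPolicy extends SketchPolicy {
    private final ToDoubleBiFunction<Integer, Integer> q;

    SketchGreedyPolicy(int[] actionSet, ToDoubleBiFunction<Integer, Integer> q) {
        super(actionSet);
        this.q = q;
    }

    // Action with the highest value; ties go to the earliest action in the set.
    int bestAction(int state) {
        int best = actionSet[0];
        for (int a : actionSet) {
            if (q.applyAsDouble(state, a) > q.applyAsDouble(state, best)) {
                best = a;
            }
        }
        return best;
    }

    @Override
    double probability(int state, int action) {
        return action == bestAction(state) ? 1.0 : 0.0;
    }
}
```

A usage example under the same assumptions: new SketchGreedyPolicy(new int[]{0, 1, 2}, (s, a) -> a == 1 ? 2.0 : 0.5) selects action 1 in every state.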