Information Technology Reference
In-Depth Information
F
::=
B
(
F,F
)
| U
(
F
)
| V | C
B
::= +
|−|×|÷|min | max
U
::=
sqrt | ln | abs | opposite | inverse
V
::=
r
k
| σ
k
| t
k
| t
C
::= 1
,
2
,
3
,
5
,
7
Fig. 1.
The grammar used for generating candidate index functions and two example formula
parse trees corresponding to
r
k
+2
/t
k
and
r
k
+
2
ln
(
t
)
/t
k
-
or an atomic variable
F
=
V
,where
V
belongs to a set of variables
V
,
-
or a constant
F
=
C
,where
C
belongs to a set of constants
C
.
In the following, we consider a set of operators and constants that provides a good
compromise between high expressiveness and low cardinality of
F
.Thesetofbinary
B
operators considered in this paper
includes the four elementary mathematic opera-
tions and the
min
and
max
operators:
B
=
{
+
,
−
,
×
,
÷
,
min
,
max
}
. The set of unary
U
operators
contains the square root, the logarithm, the absolute value, the opposite and
=
√
.,
ln(
.
)
,|.|,−.,
.
. The set of variables
the inverse:
U
V
is:
V
=
{r
k
, σ
k
,t
k
,t}
.
The set of constants
has been chosen to maximize the number of different numbers
representable by small formulas. It is defined as
C
.
Figure 1 summarizes our grammar of formulas and gives two examples of index
functions. The length of a formula
length
(
f
)
is the number of symbols occurring in the
C
=
{
1
,
2
,
3
,
5
,
7
}
formula. For example, the length of
r
k
+2
/t
k
is 5 and the length of
r
k
+
2
ln
(
t
)
/t
k
is 9. Let
L
be a given maximal length.
Θ
is the subset of formulas whose length is no
more than
L
:
Θ
=
×
and
Π
Θ
is the set of index-based policies whose
index functions are defined by formulas
f
{
f
|
length
(
f
)
≤
L
}
∈
Θ
.
5.2
Optimisation Algorithm
We now discuss the optimization of Equation 7 in the case of our symbolic parame-
terization. First, notice that several different formulas can lead to the same policy. For
example, any increasing function of
r
k
defines the greedy policy, which always selects
the arm that is believed to be the best. Examples of such functions in our formula search
space include
r
k
,
r
k
×
r
k
or
√
r
k
.
Since it is useless to evaluate equivalent policies multiple times, we propose the
following two-step approach. First, the set
Θ
is partitioned into equivalence classes, two
formulas being equivalent if and only if they lead to the same policy. Then, Equation
7 is solved over the set of equivalence classes (which is typically one or two orders of
magnitude smaller than the initial set
Θ
).
2
,
r
k
×
Partitioning
Θ
. This task is far from trivial: given a formula, equivalent formulas can
be obtained through commutativity, associativity, operator-specific rules and through
any increasing transformation. Performing this step exactly involves advanced static
analysis of the formulas, which we believe to be a very difficult solution to implement.