Information Technology Reference
In-Depth Information
F ::= B ( F,F ) | U ( F ) | V | C
B ::= + |−|×|÷|min | max
U ::= sqrt | ln | abs | opposite | inverse
V ::= r k | σ k | t k | t
C ::= 1 , 2 , 3 , 5 , 7
Fig. 1. The grammar used for generating candidate index functions and two example formula
parse trees corresponding to r k +2 /t k and r k + 2 ln ( t ) /t k
- or an atomic variable F = V ,where V belongs to a set of variables
V
,
- or a constant F = C ,where C belongs to a set of constants
C
.
In the following, we consider a set of operators and constants that provides a good
compromise between high expressiveness and low cardinality of
F
.Thesetofbinary
B
operators considered in this paper
includes the four elementary mathematic opera-
tions and the min and max operators:
B
=
{
+ ,
,
×
,
÷
, min , max
}
. The set of unary
U
operators
contains the square root, the logarithm, the absolute value, the opposite and
= ., ln( . ) ,|.|,−., . . The set of variables
the inverse:
U
V
is:
V
=
{r k , σ k ,t k ,t}
.
The set of constants
has been chosen to maximize the number of different numbers
representable by small formulas. It is defined as
C
.
Figure 1 summarizes our grammar of formulas and gives two examples of index
functions. The length of a formula length ( f ) is the number of symbols occurring in the
C
=
{
1 , 2 , 3 , 5 , 7
}
formula. For example, the length of r k +2 /t k is 5 and the length of r k + 2
ln ( t ) /t k
is 9. Let L be a given maximal length. Θ is the subset of formulas whose length is no
more than L : Θ =
×
and Π Θ is the set of index-based policies whose
index functions are defined by formulas f
{
f
|
length ( f )
L
}
Θ .
5.2
Optimisation Algorithm
We now discuss the optimization of Equation 7 in the case of our symbolic parame-
terization. First, notice that several different formulas can lead to the same policy. For
example, any increasing function of r k defines the greedy policy, which always selects
the arm that is believed to be the best. Examples of such functions in our formula search
space include r k , r k ×
r k or r k .
Since it is useless to evaluate equivalent policies multiple times, we propose the
following two-step approach. First, the set Θ is partitioned into equivalence classes, two
formulas being equivalent if and only if they lead to the same policy. Then, Equation
7 is solved over the set of equivalence classes (which is typically one or two orders of
magnitude smaller than the initial set Θ ).
2 , r k ×
Partitioning Θ . This task is far from trivial: given a formula, equivalent formulas can
be obtained through commutativity, associativity, operator-specific rules and through
any increasing transformation. Performing this step exactly involves advanced static
analysis of the formulas, which we believe to be a very difficult solution to implement.
 
Search WWH ::




Custom Search