\[
E_{j,t}(a_j) =
\begin{cases}
\Pi_{j,t}(a'_j) \cdot (1-e), & a_j = a'_j \\
\Pi_{j,t}(a'_j) \cdot \dfrac{e}{M-1}, & a_j \neq a'_j
\end{cases}
\tag{6}
\]
where $e \in [0,1]$ is the experimentation parameter, which assigns different weights to the played strategy and to the non-played strategies, $a'_j$ denotes the strategy played at round $t$, and $\Pi_{j,t}(a'_j)$ is the reward obtained by playing it. Propensities are then normalized so as to determine the strategy-selection policy $\pi_{j,t+1}(a_j)$ for the next auction round:
\[
\pi_{j,t+1}(a_j) = \frac{S_{j,t}(a_j)}{\sum_{a_j} S_{j,t}(a_j)}
\tag{7}
\]
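For concreteness, the following Python sketch implements one round of the classical Roth and Erev update of Equations (6)-(7). It assumes the standard propensity update $S_{j,t}(a_j) = (1-r)\,S_{j,t-1}(a_j) + E_{j,t}(a_j)$ with recency parameter $r$ (the update rule itself is not restated in this excerpt), and all function and variable names are illustrative only.

```python
import numpy as np

def roth_erev_update(S_prev, played, reward, e=0.1, r=0.1):
    """One round of the classical Roth-Erev update (Eqs. (6)-(7)).

    S_prev : propensities S_{j,t-1}(a_j) over the M strategies
    played : index of the strategy a'_j played at round t
    reward : payoff Pi_{j,t}(a'_j) obtained at round t
    e, r   : experimentation and recency parameters
    """
    S_prev = np.asarray(S_prev, dtype=float)
    M = len(S_prev)
    # Experimentation function, Eq. (6): the played strategy keeps a
    # (1 - e) share of the reward, the non-played ones split the rest.
    E = np.full(M, reward * e / (M - 1))
    E[played] = reward * (1 - e)
    # Propensity update with recency (forgetting) parameter r
    # (assumed standard Roth-Erev form).
    S = (1 - r) * S_prev + E
    # Strategy-selection policy for round t+1, Eq. (7).
    pi = S / S.sum()
    return S, pi
```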
The modified Roth and Erev learning model (hereafter referred to as the MRE algorithm) proposed by [18] addresses the case of zero payoffs by modifying the experimentation function of Equation (6) according to:
\[
E_{j,t}(a_j) =
\begin{cases}
\Pi_{j,t}(a'_j) \cdot (1-e), & a_j = a'_j \\
S_{j,t-1}(a_j) \cdot \dfrac{e}{M-1}, & a_j \neq a'_j
\end{cases}
\tag{8}
\]
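A minimal sketch of this modified experimentation function, with illustrative names only; it is meant to replace the experimentation step of Eq. (6) in the update sketched above.

```python
import numpy as np

def mre_experimentation(S_prev, played, reward, e=0.1):
    """Modified Roth-Erev experimentation function, Eq. (8): non-played
    strategies are reinforced from their own previous propensity rather
    than from the (possibly zero) reward of the played strategy."""
    S_prev = np.asarray(S_prev, dtype=float)
    M = len(S_prev)
    E = S_prev * e / (M - 1)       # a_j != a'_j: share of own previous propensity
    E[played] = reward * (1 - e)   # a_j  = a'_j: share of the obtained reward
    return E
```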
It is worth remarking that MRE and RE are identical for a positive reward $\Pi_{j,t}(a'_j)$, whereas for a null payoff MRE introduces an implicit premium for the non-played strategies with respect to the ineffective played strategy (i.e., one with non-positive $\Pi_{j,t}(a'_j)$). MRE represents a first, but not final, extension of the Roth and Erev algorithm, as neither the MRE algorithm nor the later VRE algorithm proposed by [22] is able to cope with negative payoffs. In order to overcome this limitation of the Roth-Erev algorithm, we propose to extend the MRE algorithm by enhancing the experimentation mechanism for both the played and the non-played strategies according to:
\[
E_{j,t}(a_j) =
\begin{cases}
G\big[\Pi_{j,t}(a'_j)\big] \cdot (1-e), & a_j = a'_j \\
F\big[\Pi_{j,t}(a'_j)\big] \cdot S_{j,t-1}(a_j) \cdot \dfrac{e}{M-1}, & a_j \neq a'_j
\end{cases}
\tag{9}
\]
where
\[
G[x] =
\begin{cases}
\gamma \cdot \tanh(x), & x \geq 0 \\
0, & x < 0
\end{cases}
\tag{10}
\]
and
\[
F[x] =
\begin{cases}
\alpha \cdot \tanh(|x|) + 1, & x < 0 \\
1, & x \geq 0
\end{cases}
\tag{11}
\]
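A sketch of the resulting experimentation step, assuming the case splits of Eqs. (10)-(11) as written above; all names are illustrative only.

```python
import numpy as np

def G(x, gamma):
    """Eq. (10): shaped reinforcement for the played strategy."""
    return gamma * np.tanh(x) if x >= 0 else 0.0

def F(x, alpha):
    """Eq. (11): amplification factor for the non-played strategies,
    growing as the payoff becomes more negative."""
    return 1.0 if x >= 0 else alpha * np.tanh(abs(x)) + 1.0

def ere_experimentation(S_prev, played, reward, e, gamma, alpha):
    """Enhanced Roth-Erev experimentation function, Eq. (9)."""
    S_prev = np.asarray(S_prev, dtype=float)
    M = len(S_prev)
    E = F(reward, alpha) * S_prev * e / (M - 1)   # a_j != a'_j
    E[played] = G(reward, gamma) * (1 - e)        # a_j  = a'_j
    return E
```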
Figure 6 shows the functions $G[\,\cdot\,]$ and $F[\,\cdot\,]$. It is worth noting that the proposed enhanced version represents an extension of the MRE. In particular, in the case of a negative payoff, the experimentation function of the played strategy is computed as in the MRE proposed by [17] for the case of null payoffs, whereas the experimentation function of the non-played strategies is amplified increasingly as the payoff $\Pi_{j,t}(a'_j)$ becomes more negative. This leads to an Enhanced Roth and Erev algorithm (hereafter referred to as the ERE algorithm). In the simulations discussed hereafter, we have adopted the values of 0.12 and 0.20 for the parameters $e$ and $r$, respectively. Moreover, the value of 3.0 and
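Putting the pieces together, a minimal usage sketch of the ERE update with the reported values $e = 0.12$ and $r = 0.20$; the values of $\gamma$ and $\alpha$ below, and the random payoff, are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 10
S = np.ones(M)                # uniform initial propensities
e, r = 0.12, 0.20             # experimentation and recency parameters from the text
gamma, alpha = 3.0, 2.0       # illustrative values only (assumptions)

for t in range(100):
    pi = S / S.sum()                          # Eq. (7): strategy-selection policy
    played = rng.choice(M, p=pi)              # sample the strategy to play
    reward = rng.normal()                     # stand-in payoff; may be negative
    # ere_experimentation is the sketch of Eq. (9) given above
    E = ere_experimentation(S, played, reward, e, gamma, alpha)
    S = (1 - r) * S + E                       # propensity update with recency r
```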
 