\[
E_{j,t}(a_j) =
\begin{cases}
\Pi_{j,t}(a'_j) \cdot (1-e), & a_j = a'_j \\
\Pi_{j,t}(a'_j) \cdot \dfrac{e}{M-1}, & a_j \neq a'_j
\end{cases}
\tag{6}
\]
where $e \in [0,1]$ is the experimentation parameter, which assigns different weights to the played strategy and to the non-played strategies, $a'_j$ denotes the strategy played at round $t$, and $\Pi_{j,t}(a'_j)$ is the reward obtained by playing it. Propensities are then normalized so as to determine the strategy-selection policy $\pi_{j,t+1}(a_j)$ for the next auction round:
\[
\pi_{j,t+1}(a_j) = \frac{S_{j,t}(a_j)}{\sum_{a_j} S_{j,t}(a_j)}
\tag{7}
\]
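For concreteness, the following Python sketch implements one round of the classical Roth and Erev update of Equations (6)-(7). It assumes the standard propensity update $S_{j,t}(a_j) = (1-r)\,S_{j,t-1}(a_j) + E_{j,t}(a_j)$ with recency parameter $r$ (the update rule itself is not restated in this excerpt), and all function and variable names are illustrative only.

```python
import numpy as np

def roth_erev_update(S_prev, played, reward, e=0.1, r=0.1):
    """One round of the classical Roth-Erev update (Eqs. (6)-(7)).

    S_prev : propensities S_{j,t-1}(a_j) over the M strategies
    played : index of the strategy a'_j played at round t
    reward : payoff Pi_{j,t}(a'_j) obtained at round t
    e, r   : experimentation and recency parameters
    """
    S_prev = np.asarray(S_prev, dtype=float)
    M = len(S_prev)
    # Experimentation function, Eq. (6): the played strategy keeps a
    # (1 - e) share of the reward, the non-played ones split the rest.
    E = np.full(M, reward * e / (M - 1))
    E[played] = reward * (1 - e)
    # Propensity update with recency (forgetting) parameter r
    # (assumed standard Roth-Erev form).
    S = (1 - r) * S_prev + E
    # Strategy-selection policy for round t+1, Eq. (7).
    pi = S / S.sum()
    return S, pi
```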
The modified Roth and Erev learning model (hereafter referred to as the MRE algorithm) proposed by [18] addresses the case of zero payoffs by modifying the experimentation function of Equation (6) according to:
\[
E_{j,t}(a_j) =
\begin{cases}
\Pi_{j,t}(a'_j) \cdot (1-e), & a_j = a'_j \\
S_{j,t-1}(a_j) \cdot \dfrac{e}{M-1}, & a_j \neq a'_j
\end{cases}
\tag{8}
\]
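A minimal sketch of this modified experimentation function, with illustrative names only; it is meant to replace the experimentation step of Eq. (6) in the update sketched above.

```python
import numpy as np

def mre_experimentation(S_prev, played, reward, e=0.1):
    """Modified Roth-Erev experimentation function, Eq. (8): non-played
    strategies are reinforced from their own previous propensity rather
    than from the (possibly zero) reward of the played strategy."""
    S_prev = np.asarray(S_prev, dtype=float)
    M = len(S_prev)
    E = S_prev * e / (M - 1)       # a_j != a'_j: share of own previous propensity
    E[played] = reward * (1 - e)   # a_j  = a'_j: share of the obtained reward
    return E
```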
It is worth remarking that MRE and RE are identical for a positive reward $\Pi_{j,t}(a'_j)$, whereas for a null payoff MRE introduces an implicit premium for the non-played strategies with respect to the ineffective played strategy (i.e., one with non-positive $\Pi_{j,t}(a'_j)$). MRE represents a first, but not final, extension of the Roth and Erev algorithm, as neither the MRE algorithm nor the later VRE algorithm proposed by [22] is able to cope with negative payoffs. In order to overcome this limitation of the Roth-Erev algorithm, we propose to extend the MRE algorithm by enhancing the experimentation mechanism for both the played and the non-played strategies according to:
\[
E_{j,t}(a_j) =
\begin{cases}
G\big[\Pi_{j,t}(a'_j)\big] \cdot (1-e), & a_j = a'_j \\
F\big[\Pi_{j,t}(a'_j)\big] \cdot S_{j,t-1}(a_j) \cdot \dfrac{e}{M-1}, & a_j \neq a'_j
\end{cases}
\tag{9}
\]
where
\[
G[x] =
\begin{cases}
\gamma \cdot \tanh(x), & x \geq 0 \\
0, & x < 0
\end{cases}
\tag{10}
\]
and
\[
F[x] =
\begin{cases}
\alpha \cdot \tanh(|x|) + 1, & x < 0 \\
1, & x \geq 0
\end{cases}
\tag{11}
\]
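A sketch of the resulting experimentation step, assuming the case splits of Eqs. (10)-(11) as written above; all names are illustrative only.

```python
import numpy as np

def G(x, gamma):
    """Eq. (10): shaped reinforcement for the played strategy."""
    return gamma * np.tanh(x) if x >= 0 else 0.0

def F(x, alpha):
    """Eq. (11): amplification factor for the non-played strategies,
    growing as the payoff becomes more negative."""
    return 1.0 if x >= 0 else alpha * np.tanh(abs(x)) + 1.0

def ere_experimentation(S_prev, played, reward, e, gamma, alpha):
    """Enhanced Roth-Erev experimentation function, Eq. (9)."""
    S_prev = np.asarray(S_prev, dtype=float)
    M = len(S_prev)
    E = F(reward, alpha) * S_prev * e / (M - 1)   # a_j != a'_j
    E[played] = G(reward, gamma) * (1 - e)        # a_j  = a'_j
    return E
```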
Figure 6 shows the functions $G[\,\cdot\,]$ and $F[\,\cdot\,]$. It is worth noting that the proposed enhanced version represents an extension of the MRE. In particular, in the case of a negative payoff, the experimentation function of the played strategy is computed as in the MRE proposed by [17] for the case of null payoffs, whereas the experimentation function of the non-played strategies is amplified increasingly as the payoff $\Pi_{j,t}(a'_j)$ becomes more negative. This leads to an Enhanced Roth and Erev algorithm (hereafter referred to as the ERE algorithm). In the simulations discussed hereafter, we have adopted the values of 0.12 and 0.20 for the parameters $e$ and $r$, respectively. Moreover, the value of 3.0 and
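Putting the pieces together, a minimal usage sketch of the ERE update with the reported values $e = 0.12$ and $r = 0.20$; the values of $\gamma$ and $\alpha$ below, and the random payoff, are purely illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 10
S = np.ones(M)                # uniform initial propensities
e, r = 0.12, 0.20             # experimentation and recency parameters from the text
gamma, alpha = 3.0, 2.0       # illustrative values only (assumptions)

for t in range(100):
    pi = S / S.sum()                          # Eq. (7): strategy-selection policy
    played = rng.choice(M, p=pi)              # sample the strategy to play
    reward = rng.normal()                     # stand-in payoff; may be negative
    # ere_experimentation is the sketch of Eq. (9) given above
    E = ere_experimentation(S, played, reward, e, gamma, alpha)
    S = (1 - r) * S + E                       # propensity update with recency r
```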
 