After a turn, she updates her propensities as

$$w_{i,k}(t+1) = (1-\varphi_a)\, w_{i,k}(t) + \mathbf{1}_{\{s_i^k = s_i(t)\}}\, (1-\varphi_a)\, R,$$

where $R$ is a digital payoff, $\varphi_a$ is a positive constant called the learning parameter, and $s_i(t)$ is player $i$'s actually chosen strategy at turn $t$.
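The propensity update above can be sketched as follows; this is a minimal illustration, not the authors' implementation, and the function and argument names are my own:

```python
def update_propensities(w, chosen, payoff, phi_a):
    """Decay every propensity by (1 - phi_a), and add an extra
    (1 - phi_a) * payoff to the strategy actually chosen."""
    return [
        (1 - phi_a) * wk + ((1 - phi_a) * payoff if k == chosen else 0.0)
        for k, wk in enumerate(w)
    ]
```

Strategies are indexed by list position here; only the chosen strategy is reinforced, while all propensities decay at the same rate.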
- QFP agents
Cheung and Friedman propose a generalized belief-based learning model (usually called weighted fictitious play) in which players first form expectations of what the others will do based on their own prior beliefs, which are usually the ratios of the number of times each play was submitted to the total number of moves [6].
Let $L_i(t)$ be the total count of past plays for player $i$ ($i = 1, \ldots, N$) and $B_{-i}^{s_{-i}}(t)$ be her belief that her opponents will submit $s_{-i} = (s_1^{k_1}, \ldots, s_{i-1}^{k_{i-1}}, s_{i+1}^{k_{i+1}}, \ldots, s_N^{k_N})$, where $s_{-i}$ is a vector of the other player(s)' submissions s.t. $s_{-i} \in S_{-i} = \{1, \ldots, M\}^{N-1}$, with $s_i^k$ being player $i$'s $k$-th strategy, and $s_{-i}(t)$ is the strategy vector actually chosen at turn $t$, i.e., $s_{-i}(t) = (s_1(t), \ldots, s_{i-1}(t), s_{i+1}(t), \ldots, s_N(t))$. Then we write them as

$$L_i(t) = \sum_{s_{-i} \in S_{-i}} L_{-i}^{s_{-i}}(t) \quad \text{and} \quad B_{-i}^{s_{-i}}(t) = \frac{L_{-i}^{s_{-i}}(t)}{L_i(t)},$$

respectively, with $L_{-i}^{s_{-i}}(t) \geq 0$ and $L_i(t) > 0$. Note that the possible combinations of the other player(s)' submissions are obtained from $s_i(t)$ and the winning integer $v(t)$. For instance, when player $i$ submits 1 and the winning integer is 0 (no winner) in the $(N,M) = (3,3)$ DIY-L, she can infer that the others also chose 1. Or, when player $i$ submits 1 and the winning integer is 1 (player $i$ wins) in the $(N,M) = (3,3)$ DIY-L, she will expect that the possible submissions are $\{2,2\}$ w.p. 1/4, $\{2,3\}$ w.p. 1/2, and $\{3,3\}$ w.p. 1/4. Indeed, the number of combinations for each setup is 6 in the $(N,M) = (3,3)$ DIY-L, 10 in $(N,M) = (3,4)$ and $(4,3)$, and 20 in $(N,M) = (4,4)$, respectively.
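These combination counts can be checked by enumerating the unordered submission profiles of the $N-1$ opponents, each drawn from $\{1, \ldots, M\}$; a small sketch (the function name is illustrative):

```python
from itertools import combinations_with_replacement

def opponent_combinations(N, M):
    """Number of unordered submission profiles of the N - 1 opponents,
    each choosing an integer from {1, ..., M}."""
    return len(list(combinations_with_replacement(range(1, M + 1), N - 1)))
```

Equivalently, this is the multiset coefficient $\binom{M + N - 2}{N - 1}$, which gives 6, 10, 10, and 20 for the four setups listed above.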
After a turn, their beliefs are updated as

$$B_{-i}^{s_{-i}}(t) = \frac{(1-\varphi_f) \cdot L_{-i}^{s_{-i}}(t-1) + \varphi_f \cdot G(s_i(t), v(t), s_{-i}, s_{-i}(t))}{\sum_{s_{-i} \in S_{-i}} \left[ (1-\varphi_f) \cdot L_{-i}^{s_{-i}}(t-1) + \varphi_f \cdot G(s_i(t), v(t), s_{-i}, s_{-i}(t)) \right]},$$
where $\varphi_f$ is a learning parameter and $G(s_i(t), v(t), s_{-i}, s_{-i}(t))$ is a function which determines the probability that the others chose a given set of integers.
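The belief update can be sketched as below, assuming the counts $L$ are kept per opponent profile and the $G$-values for the current observation are supplied precomputed; all names here are illustrative, not from the source:

```python
def update_beliefs(L_prev, G_probs, phi_f):
    """Weighted fictitious play belief update: mix the decayed counts
    with the probability weights G inferred from the current turn's
    observation, then normalize over all opponent profiles."""
    raw = {
        profile: (1 - phi_f) * L_prev[profile] + phi_f * G_probs[profile]
        for profile in L_prev
    }
    total = sum(raw.values())
    return {profile: value / total for profile, value in raw.items()}
```

The normalization in the last step corresponds to the denominator of the update formula, so the returned beliefs sum to one.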
Then the expected payoff for an integer $k$ at turn $t$ is calculated as

$$E_i^k(t) = \sum_{s_{-i} \in S_{-i}} B_{-i}^{s_{-i}}(t)\, \pi_i(s_i^k, s_{-i}),$$
where $\pi_i(s_i^k, s_{-i})$ is player $i$'s payoff for choosing integer $k$ at turn $t$.
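The expected payoff is simply a belief-weighted average over opponent profiles; a minimal sketch, with illustrative names and the payoff function passed in as a callable:

```python
def expected_payoff(beliefs, payoff, k):
    """Expected payoff of integer k: the belief-weighted average of
    the payoff against each possible opponent profile."""
    return sum(b * payoff(k, profile) for profile, b in beliefs.items())
```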
Finally, strategy $k$ at turn $t$ is selected based on the exponential choice rule,

$$p_i^k(t) = \frac{\exp(\lambda_f \cdot E_i^k(t))}{\sum_{j=1}^{M} \exp(\lambda_f \cdot E_i^j(t))},$$

where $p_i^k(t)$ is the probability that player $i$ selects strategy $k$ at turn $t$, and $\lambda_f$ is the sensitivity of the probabilities to expected payoffs.
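The exponential choice rule is a softmax over expected payoffs; a minimal sketch (the max-subtraction is a standard numerical-stability trick, not part of the model):

```python
import math

def choice_probabilities(expected, lam):
    """Logit (exponential) choice rule: softmax of expected payoffs
    scaled by the sensitivity parameter lam."""
    # Subtracting the max leaves the probabilities unchanged but
    # avoids overflow in exp for large lam * E values.
    m = max(expected)
    exps = [math.exp(lam * (e - m)) for e in expected]
    total = sum(exps)
    return [e / total for e in exps]
```

With $\lambda_f = 0$ the rule reduces to uniform random choice; as $\lambda_f$ grows, probability mass concentrates on the strategy with the highest expected payoff.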