[Fig. 5.4: The imitation interpretation of the RD]
then switch their strategies according to the following rule: if in a population with state $x(t)$ agent $i$'s payoff is $u_i(x)$, and the agent samples an agent $j$ with payoff $u_j(x)$, the agent switches with probability⁸
$$q_i = \max\{0,\ b(u_j(x) - u_i(x))\} \tag{5.3}$$
(Schlag 1998, p. 150; cf. also Weibull 1995, pp. 152-161). That is, she retains her strategy if her realised payoff is greater than that of the sampled player. Otherwise, she adopts the strategy of the sampled player with a probability proportional to the difference between her own and the sampled payoff. For this reason, such models are sometimes seen as closely related to the meme concept (Börgers 1996). The resulting population dynamic, in a large but finite population, is approximated by a deterministic dynamic that is analogous to the discrete RD (Schlag 1998, p. 152).
Schlag furthermore points out that his model arrives at this result solely on the basis of individual information and induced performance, while the reinforcement learning models discussed above 'contain axioms concerning the functional form of a desirable learning curve' (Schlag 1998, p. 153).
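To make the rule concrete, the following Python sketch simulates a finite population updating according to (5.3). It is a minimal sketch, not Schlag's exact protocol: the Hawk-Dove payoff matrix, the population size, and the choice of b as division by the payoff range are illustrative assumptions, and agents here compare expected payoffs $u_i(x)$ against the current population state rather than realised payoffs from sampled matches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative Hawk-Dove payoffs (V = 2, C = 4); row 0 = Hawk, row 1 = Dove.
PAYOFF = np.array([[-1.0, 2.0],
                   [ 0.0, 1.0]])
N = 1000                                  # population size (an assumption)
SPAN = PAYOFF.max() - PAYOFF.min()        # payoff range, used to normalise

strategies = rng.integers(0, 2, size=N)   # each agent holds one pure strategy


def imitation_step(strategies):
    """One round of the proportional imitation rule (5.3)."""
    x = np.bincount(strategies, minlength=2) / N   # population state x
    u = PAYOFF[strategies] @ x                     # expected payoffs u_i(x)
    j = rng.integers(0, N, size=N)                 # each agent samples a model j
    # (5.3): switch with probability max{0, b(u_j(x) - u_i(x))},
    # with b(d) = d / SPAN as one admissible normalisation (cf. footnote 8).
    p_switch = np.maximum(0.0, (u[j] - u) / SPAN)
    switch = rng.random(N) < p_switch
    out = strategies.copy()
    out[switch] = strategies[j[switch]]
    return out


for _ in range(200):
    strategies = imitation_step(strategies)

print("share playing Hawk:", np.mean(strategies == 0))
```

With these payoffs the Hawk share settles near V/C = 1/2, the interior rest point of the corresponding RD, which is consistent with the approximation result cited above.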
The imitation interpretation of the RD model can be graphically presented as shown in Fig. 5.4.
This interpretation differs in a number of features from BRD. Although agents
here also play pure strategies, it drops the heritability of strategies. Like the interpre-
tation of Fig. 5.3, it does not interpret payoffs as fitness, but as subjectively evaluated
outcomes. But unlike the reinforcement schema, the imitation schema models agents
as evaluating not only their own but also others' outcomes. It is these subjective
evaluations that may cause the agent to adopt another agent's strategy if she finds it
more successful than her own. And it is this conditional adoption, and not differential
reproduction, that constitutes differential representation in the population.
The previous two kinds of models cast learning as an influence of past payoffs
(either of the player herself or of other players) on future behaviour. Belief learning, in contrast, models learning as experience influencing beliefs; only through this influence does it have an indirect effect on behaviour. Hopkins (2002, p. 2144) has
termed the particular kinds of belief learning modelled with EGT 'hypothetical
reinforcement'. This is because players are modelled as calculating what they
⁸ The function b ensures that the differences are normalised; that is, for any payoffs $u_i$, $u_j$ in the population, $0 \le b(u_j(x) - u_i(x)) \le 1$.
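For instance (an illustrative choice, not one given in the text): if all payoffs are known to lie in an interval $[\alpha, \omega]$, then
$$b(\Delta) = \frac{\Delta}{\omega - \alpha}$$
yields $0 \le b(u_j(x) - u_i(x)) \le 1$ whenever $u_j(x) \ge u_i(x)$; the max operator in (5.3) handles the remaining cases.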