3 Agents Playing Colored Trails
In our simulations, we consider repeated single-shot Colored Trails games, in
which the set of players is divided into distinct sets of allocators and responders.
Each allocator can offer to trade any subset of his own chips for any subset of
chips belonging to one of the responders. For example, an allocator can give all
his chips to the responder, or ask that the responder give all her chips to the
allocator. The responder chooses whether or not to accept any of these offers.
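As a rough illustration of the offer space described above, the following Python sketch enumerates every trade an allocator could propose to a single responder. It assumes that chips of the same color are interchangeable, so a chip set is represented as a multiset (a Counter) and "subset" means sub-multiset. The names sub_multisets and possible_offers are hypothetical and are not taken from the simulations described here.

```python
from collections import Counter
from itertools import product


def sub_multisets(chips):
    """Yield every sub-multiset of a chip collection as a Counter.

    Assumption: chips of the same color are interchangeable.
    """
    colors = sorted(chips)
    ranges = [range(chips[c] + 1) for c in colors]
    for counts in product(*ranges):
        yield Counter({c: n for c, n in zip(colors, counts) if n > 0})


def possible_offers(allocator_chips, responder_chips):
    """Yield (give, ask) pairs: chips the allocator gives, chips he asks for."""
    for give in sub_multisets(allocator_chips):
        for ask in sub_multisets(responder_chips):
            yield give, ask


# Hypothetical chip sets, for illustration only.
allocator = Counter({"red": 2, "blue": 1})
responder = Counter({"green": 1})
offers = list(possible_offers(allocator, responder))
```

Under this representation the list includes the two extreme offers mentioned in the text: the allocator giving away all his chips, and the allocator asking for all of the responder's chips. It also includes the empty trade, which a real implementation might want to filter out.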
3.1 Agent Types
We consider two types of theory of mind agents. Both types play a best response
given their beliefs about the behaviour of others, but they differ in the way
they form these beliefs. Agents with iterated best-response beliefs (IBR)
maximize their own expected payoff under the assumption that other agents do
the same. This behaviour is similar to iterated best-response models such as
cognitive hierarchy models [20] and level-n theory [21].
IBR agents believe that other players will only choose an action that maximizes
their expected score, and assign probability zero to the event that a co-player
will perform any other action. This approach yields the best expected outcome
when the agent's beliefs are correct. However, it ignores the possibility that
other players hold different beliefs or understand the situation differently.
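The IBR belief described above can be sketched as a probability distribution that puts all of its mass on score-maximizing actions. The Python snippet below is only a minimal sketch: the function name ibr_belief is hypothetical, and the uniform split over tied maximizers is an assumption, since the text does not say how ties are handled.

```python
def ibr_belief(expected_scores):
    """Return a belief (action -> probability) over a co-player's actions.

    expected_scores maps each action to the co-player's expected score.
    Only score-maximizing actions receive positive probability.
    Assumption: tied maximizers share the probability mass equally.
    """
    best = max(expected_scores.values())
    maximizers = [a for a, s in expected_scores.items() if s == best]
    share = 1.0 / len(maximizers)
    return {a: (share if a in maximizers else 0.0) for a in expected_scores}


# Hypothetical expected scores for three candidate actions.
belief = ibr_belief({"a": 3, "b": 5, "c": 5})
```

Here action "a" receives probability zero because it does not maximize the co-player's expected score, which is exactly the feature that makes the belief brittle when the co-player reasons differently.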
The assumption of iterated best-response models can be weakened by assuming
that players choose better actions with higher probabilities, as in
t-solutions [22], quantal response equilibria [23], or utility-proportional
beliefs [16]. In addition to the iterated best-response agents described above,
we also consider utility-proportional beliefs (UPB) agents in the setting of Colored
Trails. The UPB agent believes that other allocators may choose any offer that
would increase the allocator's score, but that the probability that he will make
a certain offer is proportional to the expected utility of that offer. As a result, a
UPB agent may perform better than an IBR agent when his beliefs are incorrect.
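For comparison, a utility-proportional belief can be sketched as follows. This is an illustration under stated assumptions rather than the exact rule used in the simulations: the "expected utility" of an offer is taken to be the allocator's expected gain in score relative to making no trade, the function name upb_belief is hypothetical, and the all-zero belief returned when no offer improves the score is a placeholder choice.

```python
def upb_belief(expected_gain):
    """Return a belief (offer -> probability) over an allocator's offers.

    expected_gain maps each offer to the allocator's expected increase in
    score (assumption: this gain plays the role of the offer's utility).
    Offers that do not increase the score get probability zero; the rest
    get probability proportional to their expected gain.
    """
    improving = {o: g for o, g in expected_gain.items() if g > 0}
    total = sum(improving.values())
    if total == 0:
        # Assumption: with no improving offer, no offer is expected.
        return {o: 0.0 for o in expected_gain}
    return {o: improving.get(o, 0.0) / total for o in expected_gain}


# Hypothetical expected gains for three candidate offers.
belief = upb_belief({"x": 1.0, "y": 3.0, "z": -2.0})
```

Unlike the iterated best-response belief, this distribution keeps some probability on the weaker offer "x", so an agent holding it is not caught entirely off guard when a co-player deviates from the score-maximizing offer.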
The following subsections illustrate the different orders of theory of mind
reasoning involved in the game of Colored Trails. To avoid confusion, we will
refer to allocators as if they were male, and responders as if they were female.
3.2 Responders
In the Colored Trails game, a responder is a player that does not offer to trade
chips herself. Instead, she receives offers from other players, and decides whether
to accept any of these offers. We assume that a responder refuses any offer that
strictly decreases her score. If a responder is offered more than one acceptable
trade, we assume that she chooses in a utility-maximizing way. That is, the
responder selects the offer that lets her get as close as possible to her goal
location, without considering the score of the allocator. If multiple offers satisfy
this condition, she selects one of these offers at random.
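The responder's decision rule can be summarized in a few lines of Python. This is a minimal sketch with several assumptions: the responder's resulting score stands in for how closely she can approach her goal location, an offer that leaves her score unchanged is treated as acceptable (the text only says she refuses offers that strictly decrease it), and the names choose_offer, current_score and offer_scores are hypothetical.

```python
import random


def choose_offer(current_score, offer_scores, rng=random):
    """Return the offer the responder accepts, or None if she refuses all.

    offer_scores maps each offer to the responder's score after accepting it.
    The allocator's score is deliberately not an input.
    """
    # Refuse every offer that strictly decreases her score.
    acceptable = {o: s for o, s in offer_scores.items() if s >= current_score}
    if not acceptable:
        return None
    # Pick the best remaining offer, breaking ties at random.
    best = max(acceptable.values())
    return rng.choice([o for o, s in acceptable.items() if s == best])


# Hypothetical scores: offer "a" would leave her worse off than 10.
chosen = choose_offer(10, {"a": 8, "b": 12, "c": 11})
```

Passing a seeded random.Random instance as rng makes the tie-breaking reproducible across simulation runs.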
 