tries to avoid risk, a Monster Killer (MK) persona who tries to kill all monsters
and escape the level, and a Treasure Collector (TC) persona who attempts to
collect all treasures and escape the level. The decision making styles are defined
by the utility weights presented in Table 1, and serve as a metaphor for the rela-
tive importance of the affordances to the archetypical player represented by the
persona. When assigned to personas, utility points from a level are normalized by
the maximally attainable utility for the same level. Personas are evolved by, for
each generation, exposing them to 9 of the 10 levels of MiniDungeons, yielding
50 agents in total. For each generation, their fitness is computed as the average of
the normalized utility scores from the seen levels. All subsequent evaluations
presented in this paper are done using 10-fold cross validation, i.e., a persona is
evaluated on the level which it was not exposed to during evolution.
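The persona fitness and leave-one-level-out evaluation described above can be sketched as follows. The function names and data layout are hypothetical; the sketch assumes a persona's utility points are divided by the maximally attainable utility of the same level and averaged over the nine seen levels:

```python
def normalized_utility(raw_utility, max_utility):
    """Normalize a persona's utility score by the level's
    maximally attainable utility (assumed definition)."""
    return raw_utility / max_utility if max_utility else 0.0

def persona_fitness(utilities, max_utilities, held_out):
    """Fitness of a persona in one cross-validation fold: the average
    normalized utility over the 9 seen levels, with the level at index
    `held_out` excluded (it is used only for evaluation)."""
    seen = [normalized_utility(u, m)
            for i, (u, m) in enumerate(zip(utilities, max_utilities))
            if i != held_out]
    return sum(seen) / len(seen)
```

Repeating this for each choice of held-out level gives the 10-fold cross-validation scheme used in all subsequent evaluations.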
Clones: Clones, like personas, are evolved by exposing them to 9 of the 10 levels
of MiniDungeons. Their fitness value is computed as the average normalized
AAR across all 9 seen levels. One clone per player per map is evolved, yielding
380 agents in total. All subsequent tests are done using 10-fold cross validation,
evaluating the clones on unseen levels.
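AAR is defined earlier in the paper; assuming it is the fraction of decision points at which the agent selects the same action as the human player it is cloned from, the clone fitness computation can be sketched as (names hypothetical):

```python
def action_agreement_ratio(agent_actions, human_actions):
    """AAR: fraction of decision points where the clone chooses
    the same action as the human playtrace (assumed definition)."""
    matches = sum(a == h for a, h in zip(agent_actions, human_actions))
    return matches / len(human_actions)

def clone_fitness(per_level_traces):
    """Average AAR over the 9 seen levels; each element of
    per_level_traces is an (agent_actions, human_actions) pair."""
    return (sum(action_agreement_ratio(a, h) for a, h in per_level_traces)
            / len(per_level_traces))
```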
Baseline Agents: In order to evaluate the limits of the perceptron-based
representation, a set of baseline agents is evolved, one agent for each human
playtrace, 380 total. These are exposed to a single level of MiniDungeons. Their
fitness scores are computed directly from AAR in an attempt to establish the
closest fit to each human player that the representation can achieve.
5 Results
This section compares the two presented evaluation metrics, and compares the
ability of personas, clones, and baseline agents to represent human decision mak-
ing styles in MiniDungeons. Table 2 shows the mean of the agreement ratios for
each kind of agent evolved, using both the AAR and TAR metrics. The ratios
indicate that all agents achieve higher agreement with human playtraces when
evaluated with the AAR metric than with the TAR metric. Additionally, they
indicate that when using AAR the clones perform only slightly better than the
personas (t = −3.23, df = 753.00, p < 0.001), while when using TAR the clones
perform substantially better than the personas (t = −39.26, df = 721.51, p < 0.001), as
tested using Welch's t test. Using AAR, the baseline agents perform significantly
better than both personas and clones (df = 2, F = 62.59, p < 0.001), but when
using TAR they perform significantly worse than the clones (df = 2, F = 59.1, p <
0.001), as tested using ANOVA. Table 3 shows which personas exhibited the best
ability to represent human playtraces, for each MiniDungeons level and in total.
For each human playtrace, the personas with the highest AAR and TAR,
respectively, are identified. Both metrics generally favor the Treasure Collector
persona as the best match for most playtraces, although there is some discrepancy
between the two measures in terms of which personas represent the human
playtraces best.
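The Welch's t-test used above accommodates the unequal variances of the two groups of agreement ratios and yields the fractional degrees of freedom reported in the results. A minimal sketch of the statistic, computed here on illustrative data rather than the paper's:

```python
import math

def welch_t(xs, ys):
    """Welch's unequal-variances t statistic and its
    Welch-Satterthwaite degrees of freedom."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)  # sample variances
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)
    se2 = vx / nx + vy / ny                          # squared standard error
    t = (mx - my) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```

The same statistic is available as `scipy.stats.ttest_ind(xs, ys, equal_var=False)`; the pure-Python version above only makes the df formula explicit.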