Data
As an input data set, we used the music tag database of the LastFM service (the
largest music organization and tagging portal). From it, we took 100 popular music
tracks from the pop/rock/rap domain and, for each track, its top 40 tags (the most
relevant according to the portal). From these we cut the top 10 (to remove the best
ones), so we ended up with 3,000 relatively good, but still noisy tags. For playback,
we used the 7digital music database, from which we took samples lasting from 30 s to 1 min.
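The selection step can be read as a simple per-track filter. Below is a minimal sketch, assuming the LastFM data is already available as a mapping from each track to its tag list ordered by relevance; the names and data layout are illustrative, not the actual LastFM API.

```python
# Illustrative sketch of the tag selection described above; `track_tags` is a
# hypothetical mapping from each of the 100 tracks to its tags, already
# ordered by relevance (most relevant first).

def select_noisy_tags(track_tags, top_n=40, cut_best=10):
    """Keep ranks 11-40 of each track's tag list: good, but still noisy."""
    selected = []
    for track, tags in track_tags.items():
        for tag in tags[:top_n][cut_best:]:
            selected.append((track, tag))
    return selected

# 100 tracks x (40 - 10) tags per track = 3,000 candidate tags.
```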
To create a gold standard, each of these tags was evaluated by three independently
working experts (people frequently and actively dealing with music), who were asked
a dichotomous question: whether the tag assigned to the music track is correct
or not. The majority vote then determined the validity of the tag.
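The gold standard thus reduces to a majority vote over three boolean judgments per tag assignment. A small sketch, under the assumption that the expert answers are stored per (track, tag) pair; the data layout is hypothetical.

```python
# Hypothetical layout: expert_votes maps each (track, tag) pair to the three
# independent expert answers (True = the tag fits the track).
def gold_standard(expert_votes):
    return {
        pair: sum(votes) >= 2   # at least 2 of the 3 experts agree the tag is correct
        for pair, votes in expert_votes.items()
    }
```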
Participants
A total of 78 players participated in the experiment. The players were
recruited through social networks and email calls. Participation was voluntary
and players received no reward. No prior knowledge about the demographics of the
participants was assumed.
Environment
The game was deployed as a web application. There were no known application
accessibility issues.
Process
The game was deployed online for 10 days. During this time, 875 games were played
(featuring 4,933 questions). Out of the 3,000 tags, 1,492 were used in the game at least
once. On average, 17.75 implicit and 5.29 explicit feedback actions were collected
per tag. After the live experiment was closed, the players remained active for several
weeks.
During the experiment, the support threshold values (ε_upper, ε_lower) were set
to 5 and 5. The parameters ans_c, ans_w and tag_w were hand-picked as 0.05, 0.10
and 0.35. After the live experiment was concluded, we ran simulations over
combinations of different values of ans_c, ans_w and tag_w (ranging from 0 to
1 with a step size of 0.1). For each setup, we computed the validity of tags according
to the game logs and evaluated the correctness and confidence of the method.
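The simulation pass is a plain grid search over the three weights. Here is a sketch, assuming an evaluate(ans_c, ans_w, tag_w) callable that replays the game logs with the given weights and returns the resulting correctness and confidence; the callable stands in for the experiment's own re-scoring code, and only the 0 to 1 range with step 0.1 comes from the text.

```python
import itertools

def parameter_sweep(evaluate):
    """evaluate(ans_c, ans_w, tag_w) -> (correctness, confidence); replays the
    game logs with the given weights (supplied by the experiment code)."""
    grid = [round(0.1 * i, 1) for i in range(11)]          # 0.0, 0.1, ..., 1.0
    return {
        (ans_c, ans_w, tag_w): evaluate(ans_c, ans_w, tag_w)
        for ans_c, ans_w, tag_w in itertools.product(grid, repeat=3)
    }
```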
Results
Measuring correctness and confidence with the original values of the method
parameters yielded the following:
The method was confident that only 487 of the 1,492 tags were correctly assigned
to their resources. In 77% of these cases, the method was right.
No tags were marked as wrongly assigned.
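Read against these counts, and assuming the usual definitions (confidence as the share of tags the method reached a verdict on, correctness as the share of those verdicts that agree with the gold standard), the reported figures can be reconstructed as follows:

```python
# Assumed definitions, consistent with the figures reported in this section:
#   confidence  = tags with a verdict / all tags used in the game
#   correctness = verdicts agreeing with the gold standard / tags with a verdict
decided = 487 + 0                  # 487 marked as correct, 0 marked as wrong
total_used = 1492
confidence = decided / total_used  # ~0.33, i.e. the "total confidence 33%" below
```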
Not surprisingly, these results (a total confidence of 33% and correctness of 77%) were not
good. Therefore, we proceeded to the simulations, where: