we found an optimal configuration of the parameters: ans_c = 0.2, ans_w = 0.3 and tag_w = 0.9.
Using these parameters, the confidence rose to 51%, while the correctness
decreased to 68%.
The number of positives (the method stated the tag is valid) was still much higher
(736) than the number of negatives (the method stated the tag is not valid) (39).
However, all of the negatives were true negatives, i.e. no correct tags were ruled
out by our method.
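As a minimal sketch of how such figures can be computed, assume that confidence is the share of tags for which the method reached a decision and that correctness is the share of those decisions that agree with a ground truth. These definitions, the function name evaluate and the toy data are assumptions made for illustration; they are not taken from the experiment.

```python
from typing import Dict, Optional


def evaluate(decisions: Dict[str, Optional[bool]],
             truth: Dict[str, bool]) -> Dict[str, float]:
    """Summarise validation decisions against a ground truth.

    decisions maps a tag to True (stated valid), False (stated not valid)
    or None (no threshold reached, i.e. undecided).
    The metric definitions here are assumptions, not the authors' formulas.
    """
    decided = {tag: d for tag, d in decisions.items() if d is not None}
    correct = sum(1 for tag, d in decided.items() if d == truth[tag])
    # A false negative is a correct tag that the method ruled out.
    false_negatives = sum(
        1 for tag, d in decided.items() if d is False and truth[tag]
    )
    return {
        "confidence": len(decided) / len(decisions) if decisions else 0.0,
        "correctness": correct / len(decided) if decided else 0.0,
        "false_negatives": false_negatives,
    }


# Toy data, invented for illustration only.
decisions = {"rock": True, "jazz": True, "sad": False, "live": None}
truth = {"rock": True, "jazz": False, "sad": False, "live": True}
summary = evaluate(decisions, truth)
print(summary)
```

In this toy run one wrong tag ("jazz") passes as valid, which lowers correctness, but no correct tag is ruled out, so the false negative count stays at zero.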
6.3 Discussion
With CityLights, we demonstrate a game-based method for music tag validation. The
design, deployment and experiments with this game have shown us that the method is
able to validate existing music tag sets by harnessing human labor in an engaging
game, which retained its players even after the experiment had ended.
The resulting numbers, though, leave room for improvement: the method was
confident in only one half of the tag cases and reached only 68% correctness.
Fortunately, for the metadata set cleanup scenario, this does not represent a
problem. Since the method has a zero false negative rate, it does not damage the
dataset by removing correct tags. It merely does the job partially: along with the
correct tags, it passes some wrong ones too, but those tags would be there anyway
if no cleanup were used.
We can also look at the whole problem not as a filtering problem, but as a ranking
problem, where the task is to re-order the existing tags according to their relevancy
(useful for information retrieval methods). For this, the real values of tag support
from the game could be used directly (after some minimum feedback is collected on
the particular tag).
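The ranking view can be sketched as follows. This is only an illustration under stated assumptions: the TagFeedback record, its field names, the support scale and the min_feedback default are all hypothetical and do not come from the game itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TagFeedback:
    tag: str
    support: float        # real-valued tag support from the game (assumed scale)
    feedback_count: int   # number of player actions seen for this tag (assumed)


def rank_tags(tags: List[TagFeedback],
              min_feedback: int = 5) -> Tuple[List[str], List[str]]:
    """Order tags by support, skipping tags with too little feedback.

    Returns (ranked, pending): ranked holds tags with at least
    min_feedback player actions, highest support first; pending holds
    the remaining tags, which are left unranked for now.
    """
    eligible = [t for t in tags if t.feedback_count >= min_feedback]
    pending = [t for t in tags if t.feedback_count < min_feedback]
    ranked = sorted(eligible, key=lambda t: t.support, reverse=True)
    return [t.tag for t in ranked], [t.tag for t in pending]


# Toy data, invented for illustration only.
tags = [
    TagFeedback("rock", 0.9, 12),
    TagFeedback("sad", -0.4, 8),
    TagFeedback("live", 0.7, 2),
    TagFeedback("jazz", 0.1, 6),
]
ranked, pending = rank_tags(tags)
print(ranked, pending)
```

Here no tag is removed: a tag with low support simply moves to the end of the list, and a tag with too little feedback is held back until more players have seen it.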
Nevertheless, the results of our method are not completely satisfying and will be
the subject of future work. The correctness and confidence of the method are moderate.
We see several reasons and possible improvements for this:
Many tags did not receive enough feedback from players to reach either of the two
thresholds, lowering the method confidence. These tags may be of two kinds. One
is the case of a tag recently introduced to the game. Another case is a tag whose
support value "oscillates" because players see its validity differently. While
in the first case the tag just needs to be featured more in the game, in the second
case the "validation" could go on for a long time (or forever), consuming much of
the available player "working capacity".
To distinguish between these two cases, a kind of support change momentum (trend)
measure could be introduced to detect "stalemate" situations. Detecting these cases
would allow the method to stop featuring the respective tags and save player
work for other tags. A trend may be defined, for example, as a linear approximation
of the last n support values. In such a case, if the trend function is not "steep"
enough (either positively or negatively), a "stalemate" situation would be declared
for the particular tag. The process may also be two-phased. In a first phase, the