5.2 Deployment and Tag Correctness in the General Domain
We deployed and evaluated the PexAce game with general domain images, running it as a web browser game. The acquired tags were evaluated against a gold standard as well as by a posteriori expert judgment.
5.2.1 Game Deployment: Experimental Dataset Acquisition
To acquire the data used in the experiments, we deployed PexAce as a web browser game. As the input image set, we used the Corel 5K dataset, which is commonly used in image metadata acquisition applications (the dataset also comes with a set of image tags). The game was deployed publicly and propagated through social networks and word of mouth, yet it was predominantly played by information technology students during a tournament we organized at a conference.
The collected game logs comprised 107 players who played 814 games, in which 22,176 annotations were assigned to 2,792 of the 5,000 available images. The tag extraction procedures produced 5,723 tags. Of all tagged images, 1,373 were tagged sufficiently (they either received five or more tags or were annotated 15 times). For evaluation, we randomly selected a subset of 400 of these images. The distribution of annotations was not uniform due to the greedy approach to annotation collection.
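The sufficiency rule above (at least five tags, or 15 annotations) can be sketched as a simple predicate. The field names and sample counts below are illustrative assumptions, not part of the original dataset:

```python
# Sketch of the sufficiency criterion described above: an image is
# considered sufficiently tagged once it has received at least 5 tags
# or has been annotated 15 times (thresholds taken from the text).

def is_sufficiently_tagged(tag_count: int, annotation_count: int,
                           min_tags: int = 5, max_annotations: int = 15) -> bool:
    """Return True if an image needs no further annotations."""
    return tag_count >= min_tags or annotation_count >= max_annotations

# Hypothetical per-image statistics (image IDs and counts are made up):
images = {
    "corel_1042": {"tags": 6, "annotations": 9},
    "corel_2771": {"tags": 2, "annotations": 15},
    "corel_0389": {"tags": 3, "annotations": 4},
}

sufficient = [name for name, s in images.items()
              if is_sufficiently_tagged(s["tags"], s["annotations"])]
print(sufficient)  # the first image qualifies by tag count, the second by annotation count
```

Applying such a predicate over the full log would yield the 1,373 sufficiently tagged images reported above.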
Figure 5.3 shows how the average number of tags extracted for an image grows ever more slowly as the number of annotations for that image increases. Because this annotation-to-tag trade-off becomes unfavorable at higher annotation counts, a constant threshold may, in the future, be added to the strategy for deciding whether to exclude an image from further in-game processing.
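One way to realize the proposed thresholding is to stop offering an image to players once its marginal tag yield (new tags gained per recent annotation) drops below a constant. The yield computation, window size, and threshold value below are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of the thresholding strategy suggested above:
# exclude an image from further in-game processing once recent
# annotations stop producing new tags.

def should_exclude(tag_totals: list[int], min_yield: float = 0.2,
                   window: int = 5) -> bool:
    """tag_totals[i] = total distinct tags extracted after annotation i+1.

    Returns True when the last `window` annotations yielded fewer than
    `min_yield` new tags per annotation (assumed threshold).
    """
    if len(tag_totals) <= window:
        return False  # not enough annotations yet to judge the trend
    new_tags = tag_totals[-1] - tag_totals[-1 - window]
    return new_tags / window < min_yield

# Early annotations yield many tags; the last five yield none:
stagnant = [3, 5, 6, 8, 8, 8, 8, 8, 8, 8]
print(should_exclude(stagnant))   # True: 0 new tags over the last 5 annotations

growing = [1, 2, 3, 4, 5, 6, 7]
print(should_exclude(growing))    # False: tags still accumulate steadily
```

Such a rule would trade a small loss in tag coverage for redirecting annotation effort toward images that are still productive.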
Our expectation that some images would lack topic diversity (often due to the presence of a dominant feature, e.g., an image of a horse in a meadow) was also confirmed. This can be observed in Fig. 5.4, which shows the distribution of tag counts among images annotated by the same number of players.
5.2.2 Experiment: Validation of Annotation Capabilities
We performed four experiments in which we measured the precision of the image tags acquired through PexAce. Three experiments measured precision against a gold standard, and the fourth used a posteriori expert evaluation:
1. First, we measured precision against the original Corel 5K tags (each image has 1-4 tags).
2. Second, we created our own gold standard using expert work (each image received 2-12 tags this way) and measured precision against it.
 