5.2 Deployment and Tag Correctness in the General Domain
We deployed and evaluated the PexAce game with general domain images, running it as a web browser game. The acquired tags were evaluated against a gold standard as well as by a posteriori expert judgment.
5.2.1 Game Deployment: Experimental Dataset Acquisition
To acquire the data used in the experiments, we deployed PexAce as a web browser game. As the input image set, we used the Corel 5K dataset, which is commonly used in image metadata acquisition applications (the dataset also comes with a set of image tags). The game was deployed publicly and propagated through social networks and word of mouth, yet it was predominantly played by information technology students during a tournament we organized at a conference.
The collected game logs comprised 107 players who played 814 games, in which 22,176 annotations were assigned to 2,792 of the 5,000 available images. The tag extraction procedures produced 5,723 tags. Of all tagged images, 1,373 were tagged sufficiently (they either received five or more tags or were annotated 15 times). For evaluation, we randomly selected a subset of 400 of these images. The distribution of annotations was not uniform due to the greedy approach to annotation collection.
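The sufficiency rule above (at least five tags, or 15 annotations) can be sketched as a simple predicate. The field names and sample counts below are illustrative assumptions, not part of the original dataset:

```python
# Sketch of the sufficiency criterion described above: an image is
# considered sufficiently tagged once it has received at least 5 tags
# or has been annotated 15 times (thresholds taken from the text).

def is_sufficiently_tagged(tag_count: int, annotation_count: int,
                           min_tags: int = 5, max_annotations: int = 15) -> bool:
    """Return True if an image needs no further annotations."""
    return tag_count >= min_tags or annotation_count >= max_annotations

# Hypothetical per-image statistics (image IDs and counts are made up):
images = {
    "corel_1042": {"tags": 6, "annotations": 9},
    "corel_2771": {"tags": 2, "annotations": 15},
    "corel_0389": {"tags": 3, "annotations": 4},
}

sufficient = [name for name, s in images.items()
              if is_sufficiently_tagged(s["tags"], s["annotations"])]
print(sufficient)  # the first image qualifies by tag count, the second by annotation count
```

Applying such a predicate over the full log would yield the 1,373 sufficiently tagged images reported above.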
Figure 5.3 shows how the average number of tags extracted for an image grows ever more slowly as the number of annotations for that image increases. Because this annotation-to-tag trade-off becomes unfavorable at higher annotation counts, a constant threshold may, in the future, be added to the strategy for deciding whether to exclude an image from further in-game processing.
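One way to realize the proposed thresholding is to stop offering an image to players once its marginal tag yield (new tags gained per recent annotation) drops below a constant. The yield computation, window size, and threshold value below are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of the thresholding strategy suggested above:
# exclude an image from further in-game processing once recent
# annotations stop producing new tags.

def should_exclude(tag_totals: list[int], min_yield: float = 0.2,
                   window: int = 5) -> bool:
    """tag_totals[i] = total distinct tags extracted after annotation i+1.

    Returns True when the last `window` annotations yielded fewer than
    `min_yield` new tags per annotation (assumed threshold).
    """
    if len(tag_totals) <= window:
        return False  # not enough annotations yet to judge the trend
    new_tags = tag_totals[-1] - tag_totals[-1 - window]
    return new_tags / window < min_yield

# Early annotations yield many tags; the last five yield none:
stagnant = [3, 5, 6, 8, 8, 8, 8, 8, 8, 8]
print(should_exclude(stagnant))   # True: 0 new tags over the last 5 annotations

growing = [1, 2, 3, 4, 5, 6, 7]
print(should_exclude(growing))    # False: tags still accumulate steadily
```

Such a rule would trade a small loss in tag coverage for redirecting annotation effort toward images that are still productive.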
Our expectation that some images would lack topic diversity (often due to the presence of a dominant feature, e.g., an image of a horse in a meadow) was also confirmed. This can be observed in Fig. 5.4, which shows the distribution of tag counts among images annotated by the same number of players.
5.2.2 Experiment: Validation of Annotation Capabilities
We performed four experiments in which we measured the precision of the image tags acquired through PexAce. Three experiments measured precision against a gold standard, and the fourth used a posteriori expert evaluation:
1. First, we measured precision against the original Corel 5K tags (each image has 1-4 tags).
2. Second, we created our own gold standard using expert work (each image received 2-12 tags this way) and measured precision against it.
 