State-of-the-Art: Semantics Acquisition and Crowdsourcing - Semantic Acquisition Games: Harnessing Manpower for Creating Semantics

Many approaches aim to identify the semantics relevant to the content of static images by identifying visual features. All of these approaches involve some degree of supervision. Duygulu and Barnard [22] segmented the image and associated the features identified within individual segments with words from a large vocabulary. The vocabulary was then used to identify the semantics of the image. Their evaluation over the Corel 5K dataset yielded 70 % correct predictions. Better results were achieved when a probabilistic model was employed by Lavrenko et al. [37].

Feng et al. [25] proposed an enhancement to the segmentation approach that exploits the co-occurrence of terms related to images (e.g., tiger and grass occurring together more frequently than tiger and building). This also improved output correctness, but bound the approach more tightly to the training set of images. Improvements were also achieved when information about global and local features was used together [10].
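
The co-occurrence idea can be illustrated with a minimal sketch. It is not the model of Feng et al. [25]; it only counts how often pairs of terms annotate the same training image and uses those counts to rank candidate terms for an image that already carries one term. The toy annotations and the helper names (`cooccurrence_counts`, `pair_count`, `rank_candidates`) are assumptions made for this illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(annotations):
    """Count how often each unordered pair of terms annotates the same image."""
    counts = Counter()
    for terms in annotations:
        # sorted(set(...)) gives each pair one canonical (alphabetical) key
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


def pair_count(counts, a, b):
    """Look up the co-occurrence count of two terms, in either order."""
    return counts[tuple(sorted((a, b)))]


def rank_candidates(counts, known_term, candidates):
    """Order candidate terms by how often they co-occur with a known term."""
    return sorted(
        candidates,
        key=lambda c: pair_count(counts, known_term, c),
        reverse=True,
    )


# Toy training annotations, invented for illustration only.
training = [
    ["tiger", "grass", "sky"],
    ["tiger", "grass"],
    ["tiger", "water"],
    ["building", "sky"],
    ["building", "street"],
]

counts = cooccurrence_counts(training)
ranked = rank_candidates(counts, "tiger", ["building", "grass", "water"])
print(ranked)
```

With these toy annotations, "grass" is ranked ahead of "building" for an image already labelled "tiger". The ranking is only as good as the training annotations, which reflects the dependence on the training set noted above.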

Various approaches use machine learning for image or image region categorization. Techniques such as SVM [17] or the Bayes point machine [16] perform well (precision over 90 % on the Corel 5K dataset), but they are limited to a small number of categories and by a lack of training sets, so they cannot be used effectively for the acquisition of more specific metadata.

Because image resources are non-textual, their metadata are often acquired by analyzing their context (e.g., in the web environment), which may contain text or already annotated resources [52, 69, 70]. The acquisition of the semantics of multimedia content (visual or aural) may also involve OCR or speech recognition approaches [13].

Like images, raw audio resources are extensive and syntactically complex, and the automated acquisition of their semantics is complicated. With images, we are usually satisfied with metadata describing the physical features in them. For music, the palette of metadata types is wider, comprising not only track names, authors and publishers but also lyrics, melody, style, tonality, rhythm, motives or even the mood the track evokes on listening. For music information retrieval, the latter group is just as important as the former. Such metadata are used for “querying by example”, which has proliferated alongside standard textual querying [42]. Music metadata are also much more abstract, and any approach to their acquisition needs to perform sophisticated interpretations of the raw music track.

Many music metadata acquisition approaches begin by transforming the raw music stream into a more symbolic representation, such as a musical score or a rhythm transcription. The approach of Lu and Hanjalic [41] identifies audio elements (natural semantic sound clusters, e.g., a sequence of chords). The authors point out the similarity of these elements to words in texts (e.g., a sequence of tones can be understood as a sequence of characters). Thus, the music track can be mined for keywords, i.e., the most prominent audio elements. These audio “keywords” cannot be used like normal textual keywords (for textual query formulation). Nevertheless, they provide a basis for effective music track comparison.
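
As a rough illustration of the keyword analogy, the sketch below treats a track as a sequence of already-identified symbolic audio elements (here, chord labels), takes the most frequent short subsequences as its "keywords", and compares two tracks by the overlap of their keyword sets. This is not the method of Lu and Hanjalic [41], which derives audio elements from the signal itself; the chord sequences, the n-gram choice, the Jaccard measure and the function names are all assumptions made for this example.

```python
from collections import Counter


def audio_keywords(elements, n=2, top_k=3):
    """Return the top_k most frequent n-grams of symbolic audio elements."""
    ngrams = [tuple(elements[i:i + n]) for i in range(len(elements) - n + 1)]
    return [gram for gram, _ in Counter(ngrams).most_common(top_k)]


def keyword_similarity(track_a, track_b, n=2, top_k=3):
    """Jaccard overlap of the keyword sets of two tracks (0.0 to 1.0)."""
    a = set(audio_keywords(track_a, n, top_k))
    b = set(audio_keywords(track_b, n, top_k))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical chord sequences standing in for extracted audio elements.
track_1 = ["C", "G", "Am", "F", "C", "G", "Am", "F", "C", "G"]
track_2 = ["Dm", "A", "Dm", "A", "Dm", "A"]

print(audio_keywords(track_1, n=2, top_k=1))
print(keyword_similarity(track_1, track_1))
print(keyword_similarity(track_1, track_2))
```

The keywords here are tuples of chord labels, not words a user could type, which matches the point that they support track comparison but not textual query formulation.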

A different pre-processing technique was devised by Magistrali et al., who transformed the raw music tracks into extensive XML and then RDF files. These were then interpreted by rules expertly prepared in an ontology and transformed to more
