among Wikipedia topics and the disambiguation of terms in documents, according
to the Wikipedia categories. Wikipedia information can also be used to build more
efficient text representations in terms of classification performance. Different
approaches have been proposed, based on (a) the bag-of-words representation, (b) the
analysis of Wikipedia taxonomies, and (c) the analysis of the Wikipedia graph
structure. Moreover, some works have been devoted to developing search engines
which retrieve documents according to (a) the semantic analysis of terms in the
documents, based on Wikipedia taxonomies, or (b) the employment of ontologies
extracted from Wikipedia infoboxes.
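As an illustration of the bag-of-words representation mentioned in (a), a document can be reduced to a mapping from terms to their occurrence counts. The following is a minimal sketch in Python; the function name `bag_of_words`, the regular-expression tokenizer, and the sample sentence are assumptions made for illustration and are not taken from the surveyed works.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Return a term -> count mapping for one document (hypothetical helper)."""
    # Assumed tokenization: lowercase the text and keep alphanumeric runs only.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tokens)


# Made-up sample document, used only to exercise the function.
doc = "Wikipedia categories help disambiguate terms; Wikipedia links relate topics."
bow = bag_of_words(doc)
```

Word order is discarded in this representation; only term counts remain, which is what approaches (b) and (c) aim to enrich with taxonomy or graph information.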
The proposed taxonomies are useful for categorizing the works discussed in
Sects. 2.5 and 2.6.
2.5 Media Annotation
An interesting challenge when dealing with knowledge collected on a large scale is
that of making it searchable and thus usable. Despite the growing level of interest in
multimedia Web search, most major Web search engines still offer limited search
functionality and exploit keywords as the only means of media retrieval [24]. For
media (e.g., video, images, documents), keyword-based retrieval requires the content
to be annotated, which can be done manually or automatically.
Manual annotation is extremely time-consuming, and hence costly. As pointed out
in [25], a further potential drawback of manual annotation is its
subjective nature as an indicator of content. The same media may produce rather
disparate reactions from different users or groups of users, who may also have
varying motivations for annotating it. As a result, the same media may be annotated
very differently. Automatic annotation, in turn, may require
content analysis algorithms to extract descriptions from the media data.
Community-built media collections are typically designed in such a way as to
enable user queries on the content, and thus provide varying levels of media
annotation. Tagging is the most popular form of annotation and has proved
successful in recent years, as shown in [26-31]. In addition, it is available at
virtually no cost, because the annotation task is spread across the entire community.
2.5.1 Photo Annotation
Photos uploaded to Flickr can be enriched with different kinds of metadata, in the
form of tags, notes, number of views, comments, number of people who mark the
photo as their favorite, and even geographical location data.
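The metadata listed above can be pictured as a simple per-photo record. The sketch below is a hypothetical Python data structure, not the Flickr API schema: the class `PhotoRecord`, its field names, the helper `tag_frequencies`, and the sample values are all assumptions made for illustration. A per-tag count over such records is one simple statistic that can be derived from them.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PhotoRecord:
    """Hypothetical container for the metadata attached to one photo."""
    photo_id: str
    tags: List[str] = field(default_factory=list)
    notes: List[str] = field(default_factory=list)
    views: int = 0
    comments: List[str] = field(default_factory=list)
    favorites: int = 0  # number of users who marked the photo as a favorite
    location: Optional[Tuple[float, float]] = None  # (latitude, longitude)


def tag_frequencies(photos):
    """Count, for each tag, the number of photos that carry it."""
    counts = Counter()
    for photo in photos:
        # Assumed normalization: lowercase tags and count each tag once per photo.
        counts.update({tag.lower() for tag in photo.tags})
    return counts


# Made-up sample records, used only to exercise the helper.
photos = [
    PhotoRecord("p1", tags=["sunset", "beach"], views=120, favorites=3),
    PhotoRecord("p2", tags=["Beach", "family"], location=(45.07, 7.69)),
    PhotoRecord("p3", tags=["sunset"]),
]
freq = tag_frequencies(photos)
```

Lowercasing the tags is a design choice of this sketch; a real collection may need further normalization (for example, merging spelling variants) before the counts are meaningful.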
The analysis of “how users tag photos” and “what kind of tags they provide” is
presented in [31]. By analyzing 52 million photos collected between February 2004
and June 2007, the authors show that the tag frequency distribution can be modeled by