ONTECTAS: Bridging the Gap between Collaborative Tagging Systems and Structured Data
Ali Moosavi, Tianyu Li, Laks V.S. Lakshmanan, and Rachel Pottinger
University of British Columbia, Vancouver, BC, Canada
{amoosavi,lty419,laks,rap}@cs.ubc.ca
Abstract. Ontologies define a set of terms and the relationships (e.g.,
is-a and has-a) between them; they are the building blocks of the emerging
semantic web. An ontology relating the tags in a collaborative tagging
system (CTS) makes the CTS easier to understand. We propose an
algorithm to automatically construct an ontology from CTS data and
conduct a detailed empirical comparison with previous related work on
four real data sets: Del.icio.us, LibraryThing, CiteULike, and IMDb.
We also verify the effectiveness of our algorithm in detecting is-a and
has-a relationships.
Keywords: ontology, taxonomy, tag, collaborative tagging systems.
1 Introduction
Ontologies organize information in content management systems and are the
core building blocks of the emerging Semantic Web. Substantial work has been
done on extracting ontologies automatically from large repositories such as
text corpora, databases, and the web. This paper focuses on collaborative
social tagging systems (CTSs) such as Del.icio.us (for tagging bookmarks),
Flickr (for tagging photos), IMDb (for tagging movies), LibraryThing (for
tagging books), and CiteULike (for tagging publications). These systems
permit users to tag and share resources (documents, photos, videos, etc.).
Our goal is to create a generic ontology of the tags from a CTS. By ontology,
we mean a set of concepts from a domain, represented by the tags, and their
(is-a and has-a) relationships.
Learning an ontology from a CTS can help make the CTS more useful. For
example, browsing an ontology of tags from a CTS can help users refine
their queries, either to find more items by using a more general term or to
find fewer items by using a more specific term. This is especially important
in a CTS because each resource is typically labeled with only a small, sparse
set of tags, so discovering content in CTSs by simple keyword search is much
harder than in document and web search. Another application of
domain-specific ontology builders is to enhance search engines with
ontologies. For example, the prototype Clever Search system [15] merges words
and their word senses in the general ontology WordNet
(http://wordnet.princeton.edu) and returns more relevant result items to the
user.
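To make the query-refinement idea concrete, the following is a minimal sketch, not the algorithm proposed in this paper. It assumes an ontology has already been built and stores it as plain is-a and has-a edges between tags. The class `TagOntology`, the function `search`, and all tags and resource names are hypothetical and serve only as illustration.

```python
from collections import defaultdict


class TagOntology:
    """Toy tag ontology: tags are concepts, edges are is-a or has-a relations."""

    def __init__(self):
        self.parents = defaultdict(set)   # tag -> more general tags (is-a)
        self.children = defaultdict(set)  # tag -> more specific tags (inverse is-a)
        self.parts = defaultdict(set)     # tag -> tags it contains (has-a)

    def add_is_a(self, specific, general):
        self.parents[specific].add(general)
        self.children[general].add(specific)

    def add_has_a(self, whole, part):
        self.parts[whole].add(part)

    def broaden(self, tag):
        """Direct generalizations of a tag (use these to find more items)."""
        return sorted(self.parents.get(tag, ()))

    def narrow(self, tag):
        """Direct specializations of a tag (use these to find fewer items)."""
        return sorted(self.children.get(tag, ()))


def search(resources, tags):
    """Return names of resources labeled with at least one of the given tags."""
    wanted = set(tags)
    return sorted(name for name, labels in resources.items() if labels & wanted)


# Hypothetical ontology and sparsely tagged resources.
ontology = TagOntology()
ontology.add_is_a("python", "programming")
ontology.add_is_a("java", "programming")
ontology.add_has_a("computer", "keyboard")

resources = {
    "bookmark-1": {"python"},
    "bookmark-2": {"programming"},
    "bookmark-3": {"java", "programming"},
}

# Plain keyword search versus a query widened with the more general term.
narrow_hits = search(resources, ["python"])
broad_hits = search(resources, ["python"] + ontology.broaden("python"))
```

Here the plain keyword query matches only the resource tagged `python`, while the widened query also reaches resources tagged with the more general `programming`. Going the other way, `ontology.narrow("programming")` suggests the more specific tags a user could switch to in order to get fewer results.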