Database Reference
INTRODUCTION
With the increasing availability and capacity of recording equipment, managing the huge amount of
multimedia data generated has become increasingly challenging. Without a proper retrieval mechanism,
such data is usually forgotten on a storage device, and most of it is never touched again.
As the information embedded in multimedia data is intrinsically complex and rich, the retrieval
approaches for such data usually rely on its content. However, Multimedia Objects (MO) are seldom
compared directly, because their binary representation is of little help in understanding their content.
Rather, a set of predefined features is extracted from each MO and thereafter used in place of the
original object to perform the retrieval. For example, in Content-Based Image Retrieval (CBIR), images
are preprocessed by specific feature extraction algorithms to obtain their color or texture histograms,
polygonal contours of the pictured objects, etc. The features are employed to define a mathematical
signature that represents the content of the image regarding specific criteria. These signatures, rather
than the images themselves, are then employed in the search process.
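As a minimal illustration of feature extraction, the sketch below reduces a gray-level image to a
normalized histogram that can serve as its signature. It is only an assumed example: the function name
gray_histogram, the flat list of pixel intensities, and the choice of four bins are hypothetical and are
not taken from any specific CBIR system.

```python
def gray_histogram(pixels, bins=4, max_value=256):
    """Return a normalized gray-level histogram to use as a feature vector.

    `pixels` is assumed to be a flat sequence of intensities in [0, max_value).
    """
    counts = [0] * bins
    width = max_value / bins  # intensity range covered by each bin
    for p in pixels:
        # Clamp to the last bin so p == max_value - 1 never overflows.
        counts[min(int(p / width), bins - 1)] += 1
    total = len(pixels)
    if total == 0:
        return [0.0] * bins
    # Normalizing makes images of different sizes comparable.
    return [c / total for c in counts]


# Hypothetical six-pixel "image".
signature = gray_histogram([0, 10, 100, 200, 255, 255])
```

The resulting vector, not the raw pixel data, is what a retrieval mechanism would store and compare.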
Although much progress has been achieved in recent years in handling multimedia content, the
development of large-scale applications has faced problems because existing Database Management
Systems (DBMS) lack support for such data. The operators usually employed to compare numbers
and short texts in traditional DBMS are not useful for comparing MO. Moreover, MO demand specific
indexing structures and other advanced resources, for example, maintaining the query context during a
user interaction with a multimedia database.
The most promising approach to overcome these issues is to add support for similarity-based data
management inside the DBMS. Similarity can be defined through a function that compares pairs of MO
and returns a value stating how similar (close) they are. As shown later in this chapter, employing
similarity as the basis of the retrieval process allows writing very elaborate queries using a reduced set
of operators and developing a consistent and efficient query execution mechanism.
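The idea of a similarity function can be sketched as follows, assuming that each MO is already
represented by a numeric feature vector. The Euclidean distance and the two query types shown here (a
range query and a k-nearest neighbor query) are common choices used purely for illustration, and the
names euclidean, range_query, and knn_query are hypothetical. A sequential scan is used for simplicity,
whereas a DBMS would normally rely on an index.

```python
import heapq
import math


def euclidean(a, b):
    """Distance between two feature vectors; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_query(dataset, query, radius, dist=euclidean):
    """Return the keys of all objects within `radius` of the query."""
    return [key for key, feat in dataset.items() if dist(feat, query) <= radius]


def knn_query(dataset, query, k, dist=euclidean):
    """Return the keys of the k objects closest to the query."""
    return heapq.nsmallest(k, dataset, key=lambda key: dist(dataset[key], query))


# Hypothetical dataset mapping object identifiers to feature vectors.
features = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 0.0]}
within_one = range_query(features, [0.0, 0.0], radius=1.0)  # ['a', 'c']
two_nearest = knn_query(features, [0.0, 0.0], k=2)          # ['a', 'c']
```

Both operators are built on the same distance function, which illustrates how a small set of operators
can share a single comparison mechanism.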
Although a number of works have been reported in the literature describing the basic algorithms to
execute similarity retrieval operations on multimedia and other complex object datasets (Roussopoulos
et al., 1995; Hjaltason and Samet, 2003; Bohm et al., 2001), there are few works on how to integrate
similarity queries into the DBMS core. Some DBMS provide proprietary modules to handle multimedia
data and perform a limited set of similarity queries (IBM Corp., 2003; Oracle Corp., 2005; Informix
Corp., 1999). However, such approaches are generalist and do not allow including domain-specific
resources, which prevents many applications from using them. Moreover, it is important to support
similarity queries in SQL as native predicates, so that queries mixing traditional and similarity-based
predicates can be represented and executed efficiently in a Relational DBMS (RDBMS) (Barioni et al.,
2008).
This chapter presents the key foundations toward supporting similarity queries as a native resource in
RDBMS, addressing the fundamental aspects related to the representation of similarity queries in SQL.
It also describes case studies showing how it is possible to perform similarity queries within existing
DBMS (Barioni et al., 2006; Kaster et al., 2009).
In the following sections, we describe related work and fundamental concepts, including the general
strategy usually adopted to represent and compare MO, the kinds of similarity queries that can be
employed to query multimedia data, and some suitable indexing methods. We also discuss issues regarding
the support of similarity queries in relational DBMS, presenting the current alternatives and an
already validated approach to seamlessly integrate similarity queries in SQL. There is also a description