cept of dimensions do not apply (as for example shapes defined by polygons with distinct number of
vertices). Formally, a metric space is a pair ⟨𝕊, δ⟩, where 𝕊 is the set of all objects complying with the
properties of the domain and δ is a distance function that complies with the following three properties:
symmetry: δ(s₁, s₂) = δ(s₂, s₁); non-negativity: 0 < δ(s₁, s₂) < ∞ if s₁ ≠ s₂ and δ(s₁, s₁) = 0; and triangular
inequality: δ(s₁, s₂) ≤ δ(s₁, s₃) + δ(s₃, s₂), ∀ s₁, s₂, s₃ ∈ 𝕊. A function that satisfies these properties is called
a metric. The Minkowski distances with p ≥ 1 are metrics; therefore, vector spaces ruled by any of these
functions are special cases of metric spaces. Another important property of metric spaces is that they
allow developing fast indexing structures (see the 'Indexing Methods for Multimedia' section). Other
examples of metrics are the Canberra distance (Kokare et al., 2003) and the Weak Attribute Interaction
Distance (WAID), which allows users to define the influence between features according to their
perception (Felipe et al., 2009).
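The three metric properties above can be spot-checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names (minkowski, canberra, violates_metric_axioms) and the sample points are hypothetical and do not come from the text, and the Canberra distance is written in its usual textbook form, with zero-denominator terms counted as 0. A check over a finite sample can expose a violation but cannot prove that a function is a metric.

```python
from itertools import permutations


def minkowski(x, y, p=2.0):
    """L_p distance between two equal-length vectors (a metric for p >= 1)."""
    if p < 1:
        raise ValueError("p must be >= 1 for the triangular inequality to hold")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def canberra(x, y):
    """Canberra distance; terms whose denominator is zero contribute 0."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def violates_metric_axioms(dist, sample, tol=1e-9):
    """Return True if any metric property fails on the given sample.

    The sample is assumed to hold at least three distinct elements.
    """
    for s in sample:
        if abs(dist(s, s)) > tol:              # delta(s, s) must be 0
            return True
    for s1, s2, s3 in permutations(sample, 3):
        d12 = dist(s1, s2)
        if abs(d12 - dist(s2, s1)) > tol:      # symmetry
            return True
        if s1 != s2 and d12 <= 0:              # non-negativity for distinct elements
            return True
        if d12 > dist(s1, s3) + dist(s3, s2) + tol:   # triangular inequality
            return True
    return False


# Hypothetical sample of two-dimensional feature vectors.
points = [(0, 0), (1, 0), (0, 1), (2, 3)]
print(violates_metric_axioms(minkowski, points))
print(violates_metric_axioms(canberra, points))
```

Passing a Minkowski-style formula with p < 1 to the same checker is a quick way to see the triangular inequality fail, which is why the text restricts the claim to p ≥ 1.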
Distance functions can be affected by weighting techniques, which produce distinct similarity space
instances and tune the evaluation. These techniques can be classified into feature weighting and partial
distance weighting. Feature weighting aims to establish the balance among the relevance of each feature
that best satisfies the user's needs. The trivial strategy for weighting
features is based on exhaustive experimental evaluation. Nonetheless, there is an increasing number of
approaches dynamically guided by information provided in the query formulation and/or in relevance
feedback cycles (Liu et al., 2007; Wan and Liu, 2006; Lee and Street, 2002). Partial distance weighting
is employed when an object is represented by many feature vectors: the similarity evaluation between
two objects first computes the (partial) distance between each pair of corresponding feature vectors,
usually employing distinct distance functions, and then uses another function to aggregate these values
into the final distance. The automatic partial distance weighting methods can be classified into
supervised (e.g., Bustos et al., 2004) and unsupervised (e.g., Bueno et al., 2009).
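The two weighting schemes can be sketched in Python as follows. This is a minimal illustration, not the method of any cited work: the weighted Minkowski form, the weighted sum used as the aggregate function, and all names (weighted_minkowski, aggregated_distance, the "color" and "shape" features) are assumptions introduced here. The text only requires that some function aggregates the partial distances.

```python
def weighted_minkowski(x, y, weights, p=2.0):
    """Feature weighting: each feature's difference is scaled by its weight."""
    return sum(w * abs(a - b) ** p
               for w, a, b in zip(weights, x, y)) ** (1.0 / p)


def euclidean(x, y):
    """L_2 distance, used here as one partial distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def manhattan(x, y):
    """L_1 distance, used here as another partial distance."""
    return sum(abs(a - b) for a, b in zip(x, y))


def aggregated_distance(obj_a, obj_b, partial_distances, weights):
    """Partial distance weighting with a weighted sum as the aggregate.

    obj_a, obj_b: dicts mapping a feature name to its feature vector.
    partial_distances: dict mapping a feature name to its distance function.
    weights: dict mapping a feature name to the weight of its partial distance.
    """
    return sum(weights[name] * dist(obj_a[name], obj_b[name])
               for name, dist in partial_distances.items())


# Hypothetical objects described by two feature vectors each.
image_a = {"color": (0, 0), "shape": (1, 1)}
image_b = {"color": (3, 4), "shape": (2, 3)}
distances = {"color": euclidean, "shape": manhattan}
weights = {"color": 0.5, "shape": 2.0}
print(aggregated_distance(image_a, image_b, distances, weights))
```

Setting a feature weight to zero removes that feature from the comparison, which is one way to see how different weights produce different similarity space instances over the same data.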
Now that we know how to represent multimedia objects and compare them by similarity, it is
time to learn how to query these data. Several types of similarity queries can be employed
to query multimedia data; they are discussed in the next section.
Similarity Queries
Let us recall a few fundamental concepts of the relational model to provide a proper definition of
similarity queries following database theory. It is worth stressing that every traditional concept of the
relational model remains valid when retrieving multimedia objects by the similarity of their contents.
Suppose R is a relation with n attributes described by a relational schema R = (S₁, …, Sₙ), composed
of a set of m tuples tᵢ, such that R = {t₁, …, tₘ}. Each attribute Sⱼ, 1 ≤ j ≤ n, indicates a role for domain
𝕊ⱼ, that is, Sⱼ ⊂ 𝕊ⱼ. Therefore, when 𝕊ⱼ is the multimedia domain from a metric space, each attribute
Sⱼ stores multimedia values. Each tuple of the relation stores one value for each attribute Sⱼ, where each
value sᵢ, 1 ≤ i ≤ m, assigned to Sⱼ is an element taken from domain 𝕊ⱼ, and the dataset Sⱼ is composed of
the set of elements sᵢ that are assigned to the attribute Sⱼ in at least one tuple of the stored relation. Notice
that more than one attribute Sⱼ, Sₖ from R can share the same domain, that is, it is possible to have 𝕊ⱼ
= 𝕊ₖ. Regarding the multimedia domain from a metric space, the elements must be compared by
similarity, using a distance function δ defined over the respective domain. Elements can be compared using
the properties of the domain, regardless of the attributes that store the elements. Therefore, every pair