Database Reference
In-Depth Information
When
Q
is an unitary set, the similarity selection correspond to the traditional range and
k
-nearest-
neighbor queries. These two types of similarity selections can be represented in the following manner:
•
Range selection (
Rq
)
: given a maximum query distance
ξ
, the query
ˆ
(
σ
R
retrieves
S R
δ
(),
ξ
{ })
s
j
q
q
every tuple
t
i
from
R
whose value
s
i
from attribute
S
j
satisfies δ(
s
i
,s
q
) ≤
ξ
. Considering
R
to be a set
of images, an example is: “Select the images that are similar to the image
P
by up to 5 units”,
represented as
ˆ
(
σ
Images
;
image R
δ
(),
5
{ })
P
q
k
-Nearest Neighbor selection (
kNN
)
: given an integer value
k
≥ 1, the query
ˆ
(
•
σ
R
S kNN
δ
(),
k
{ })
s
j
q
retrieves the
k
tuples
t
i
from
R
whose values
s
i
from attribute
S
j
have the smallest distance from the
query element
s
q
, according to the distance function
δ
. An example is: “Select the 3 images most
similar to the image
P
”, represented as
ˆ
(
σ
Images
.
image kNN
δ
(),
3
{ })
P
q
Figures 2 (a) and (b) illustrate the range and
k
-nearest neighbor selections in a 2-dimensional Eu-
clidean space.
When the query centers set
Q
has more than one object, the similarity selection correspond to ag-
gregate similarity queries. In order to perform the comparison, these queries require the definition of a
similarity aggregation function
d
g
, which evaluates the aggregate similarity of each element
s
Î
S
i
regarding its similarity measured by the metric
δ
to every element
s
q
Î . The aggregate range and the
aggregate
k
-nearest neighbor selections can be represented as follows (Razente et al., 2008b):
Q
•
Aggregate Range selection (
ARq
)
: given a maximum aggregate query distance
ξ
, a similarity
aggregation function
d
g
and a set of query centers
Q
, the query
ARq
retrieves every tuple
t
i
from
R
whose value
s
i
from attribute
S
j
satisfies
d
g
(
s
i
,Q
) ≤
ξ
. An aggregate range selection can be ex-
pressed as
ˆ
(
σ
R
. An example is: “Select the images that are similar to the set of images
S AR d
(),
ξ
Q
)
j
q
g
composed by the images
P
,
M
and
N
by up to 5 units”, represented as
ˆ
(
σ
S AR d
Images
;
(),
5
{
P, M, N
}
)
j
q
g
•
Aggregate
k
-Nearest Neighbor selection (
kANNq
)
: given an integer value
k
≥ 1, the query
kAN-
Nq
retrieves the
k
tuples
t
i
from
R
whose values
s
i
from attribute
S
j
minimize the similarity aggre-
gation function
d
g
regarding the query centers
Q
. An aggregate
k
-nearest neighbor selection can
be expressed as
ˆ
(
σ
S kANN d
R
. An example is: “Select the 3 images most similar to the set of
(),
k Q
)
j
q
g
images composed by the images
P
,
M
and
N
”, represented as
ˆ
(
σ
S kANN d
Images
.
(),
3
{
P, M, N
}
)
j
q
g
The predicate of aggregate similarity selections uses the value of the similarity aggregation function
to rank the elements in
S
j
with respect to
Q
. There are several ways to define the similarity aggregation
function
d
g
. Herein, we consider the class of functions analyzed in (Razente et al., 2008b) and generated by:
∑
δ
g
d Q s
( ,
)
=
(
s
,
s
)
(1)
g
g
i
q
i
s
∈
Q
q