Database Reference
In-Depth Information
The problem of similarity matching has been extensively studied in
the past [5, 35, 53, 22]: given a user-supplied query sequence, a sim-
ilarity search returns the most similar data series according to some
distance function. More formally, given a collection of data series
C
=
{S
1
,...,S
N
}
,where
N
is the number of data series, we are interested in
evaluating the range query function
RQ
(
Q,C,
):
RQ
(
Q,C,
)=
{S|S ∈ C ∧ distance
(
Q,S
)
≤ }
(7.1)
In the above equation,
is a user-defined distance threshold. A survey
of representation and distance measures for data series can be found
elsewhere [33].
A similar problem arises also in the case of uncertain data series, and
the problem of probabilistic similarity matching has been introduced
in the last years. Formally, given a collection of uncertain data series
C
=
, we are interested in evaluation the probabilistic range
query function
PRQ
(
Q,C,,τ
):
{
T
1
,...,T
N
}
PRQ
(
Q,C,,τ
)=
{T|T ∈ C|Pr
(
distance
(
Q,S
)
≤
)
≥ τ}
(7.2)
In the above equation,
and
τ
are the user-defined distance threshold
and the probabilistic threshold, respectively.
In the recent years three techniques have been proposed to evaluate
PRQ
queries, namely MUNICH
7
[8], PROUD [105], and DUST [83]. We
discuss each one of these three techniques below, and offer some insights
in Section 4.2.
3.4.2 Proposed Techniques. MUNICH:
In [8], uncer-
tainty is modeled by means of repeated observations at each times-
tamp, as depicted in
Figure 7.5(a)
. Assuming two uncertain data se-
ries,
X
and
Y
, MUNICH proceeds as follows. First, the two uncer-
tain sequences
X,Y
are materialized to all possible certain sequences:
TS
X
=
(where
v
ij
is the
j
-th ob-
servation in timestamp
i
), and similarly for
Y
with
TS
Y
.Thesetofall
possible distances between
X
and
Y
is then defined as follows:
{
<v
11
,...,v
n
1
>,...,< v
1
s
,...,v
ns
>
}
dists
(
X,Y
)=
{L
p
(
x,y
)
|x ∈ TS
X
,y∈ TS
Y
}
(7.3)
The uncertain
L
p
distance is formulated by means of counting the
feasible distances:
7
We will refer to this method as
MUNICH
(it was not explicitly named in the original
paper), since all the authors were a
liated with the University of Munich.