Database Reference
In-Depth Information
Finding out whether a musical score is similar to one of a set of copy-
righted scores.
Finding portions of seismic waves that are not similar to spot geological
irregularities.
Applications range from medicine, through economy, to scientific disci-
plines such as meteorology and astrophysics [Faloutsos et al. (1994), Yi and
Faloutsos (2000)].
The running times of simple algorithms for comparing time sequences
are generally polynomial in the length of both sequences, typically linear or
quadratic. To find the correct offset of a query in a large database, a naive
sequential scan will require a number of such comparisons that is linear in
the length of the database. This means that, given a query of length
m
and
a database of length
n
, the search will have a time complexity of O(nm) ,
nm 2 ) or worse. For large databases this is clearly unacceptable.
Many methods are known for performing this sort of query in the domain
of strings over finite alphabets, but with time sequences there are a few extra
issues to deal with:
or even
O
(
The range of values is not generally finite, or even discrete.
The sampling rate may not be constant.
The presence of noise in various forms makes it necessary to support very
flexible similarity measures.
This chapter describes some of the recent advances that have been made
in this field; methods that allow for indexing of time sequences using flexible
similarity measures that are invariant under a wide range of transformations
and error sources.
The chapter is structured as follows: Section 2 gives a more formal
presentation of the problem of similarity-based retrieval and the so-called
dimensionality curse; Section 3 describes the general approach of signature
based retrieval, or shrink and search, as well as three specific methods using
this approach; Section 4 shows some other approaches, while Section 5
concludes the chapter. Finally, Appendix gives an overview of some basic
distance measures. 1
1 The term “distance” is used loosely in this paper. A distance measure is simply the
inverse of a similarity measure and is not required to obey the metric axioms.
Search WWH ::




Custom Search