Database Reference
In-Depth Information
1. Introduction
Multivariate time series classification is useful in those classification tasks
where time is an important dimension. Instances of these kind of tasks
may be found in very different domains, for example analysis of biomedi-
cal signals [Kubat et al. (1998)], diagnosis of continuous dynamic systems
[Alonso Gonzalez and Rodrıguez Diez (1999)] or data mining in tempo-
ral databases [Berndt and Clifford (1996)]. Time series classification may
be addressed like an static classification problem, extracting features of
the series through some kind of preprocessing, and using some conven-
tional machine learning method. However, this approach has several draw-
backs [Kadous (1999)]: the preprocessing techniques are usually ad hoc and
domain specific, and the descriptions obtained using these features can be
hard to understand. The design of specific machine learning methods for
the induction of time series classifiers allows for the construction of more
comprehensible classifiers in a more ecient way because, firstly, they may
manage comprehensible temporal concepts dicult to capture by the pre-
processing technique — for instance the concept of permanence in a region
for certain amount of time — and secondly, there are several heuristics
applicable to temporal domains that preprocessing methods fails to exploit.
The method for learning time series classifiers that we propose in this
work is based on literals over temporal intervals (such as increases or always
in region) and boosting (a method for the generation of ensembles of classi-
fiers from base or weak classifiers) [Schapire (1999)] and was first introduced
in [Rodrıguez et al. (2000)]. The input for this learning task consist of a
set of examples and associated class labels, where each example consists of
one or more time series. Although the series are often referred to as vari-
ables, since they vary over time, form a machine learning point of view,
each point of each series is an attribute of the example. The output of
the learning task is a weighted combination of literals, reflecting the fact
that the based classifiers consist of clauses with only one literal. These base
classifiers are inspired by the good results of works using ensembles of very
simple classifiers [Schapire (1999)], sometimes named stumps .
Although the method has already been tested over several data set
[Rodrıguez et al. (2001)] providing very accurate classifiers, it imposes two
restrictions that limits its application to real problems. On the one hand,
it requires that all the time series were of the same length, which is not the
case in every task, as the Auslan example (Australian sign language, see
Section 6.4) [Kadous (1999)] illustrates. On the other hand, it requires as
Search WWH ::




Custom Search