11.1 Introduction
Time-series classification is one of the core components of various real-world
recognition systems, such as computer systems for speech and handwriting
recognition, signature verification, sign-language recognition, detection of
abnormalities in electrocardiograph signals, tools based on electroencephalograph
(EEG) signals ("brain waves"), such as spelling devices and EEG-controlled web
browsers for paralyzed patients, and systems for EEG-based person identification,
see e.g. [34, 35, 37, 45]. Due to the increasing interest in time-series
classification, various approaches have been introduced, including neural
networks [26, 38], Bayesian networks [48], hidden Markov models [29, 33, 39],
genetic algorithms, support vector machines [14], methods based on random forests
and generalized radial basis functions [5], as well as frequent pattern mining [17],
histograms of symbolic polynomials [18] and semi-supervised approaches [36].
However, one of the most surprising results states that the simple k-nearest
neighbor (kNN) classifier using dynamic time warping (DTW) as a distance measure
is competitive with (if not superior to) many other state-of-the-art models on
several classification tasks, see e.g. [8] and the references therein. Besides
experimental evidence, there are theoretical results about the optimality of
nearest neighbor classifiers, see e.g. [12]. Some recent theoretical works have
focused on time-series classification, in particular on why nearest neighbor
classifiers work well in the case of time-series data [10].
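To make this baseline concrete, the following sketch implements kNN classification with the standard unconstrained DTW dynamic program. The function names are merely illustrative, and this is a minimal sketch rather than the implementation used in any of the works cited above:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences, computed
    with the standard O(len(a) * len(b)) dynamic program (no
    warping-window constraint, for simplicity)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])           # local cost
            cost[i, j] = d + min(cost[i - 1, j],       # "insertion"
                                 cost[i, j - 1],       # "deletion"
                                 cost[i - 1, j - 1])   # "match"
    return cost[n, m]

def knn_dtw_predict(train_series, train_labels, query, k=1):
    """Classify `query` by majority vote among its k nearest
    training series under DTW."""
    dists = [dtw_distance(query, s) for s in train_series]
    nearest = np.argsort(dists)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Toy usage: two "peak"-shaped series and one flat series.
train = [np.array([0., 1., 2., 1., 0.]),
         np.array([0., 2., 4., 2., 0.]),
         np.array([3., 3., 3., 3., 3.])]
labels = ["peak", "peak", "flat"]
print(knn_dtw_predict(train, labels, np.array([0., 1., 3., 1., 0.])))  # "peak"
```

In practice, DTW is usually accelerated with warping-window constraints and lower-bounding techniques; the quadratic-time version above is kept deliberately simple.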
On the other hand, Radovanović et al. observed the presence of hubs in time-series
data, i.e., the phenomenon that a few instances tend to appear as the nearest
neighbor of a surprisingly large number of other instances [43]. Furthermore, they
introduced the notion of bad hubs. A hub is said to be bad if its class label
differs from the class labels of many of those instances that have this hub as
their nearest neighbor. In the context of k-nearest neighbor classification, bad
hubs were shown to be responsible for a large portion of the misclassifications.
Therefore, hubness-aware classifiers and instance selection methods were developed
in order to make classification faster and more accurate [9, 43, 50, 52-54].
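This notion can be quantified by counting, for each instance, how often it appears among the k nearest neighbors of the other instances, and how often it does so with a conflicting label. The sketch below is a minimal illustration; the function name and the default Euclidean distance are our assumptions, and for time series a DTW distance such as the one above would typically be passed in:

```python
import numpy as np

def hubness_scores(instances, labels, k=1, dist=None):
    """For each instance, count how often it appears among the k nearest
    neighbors of the other instances (its k-occurrence) and how often it
    does so with a conflicting class label (its bad k-occurrence)."""
    if dist is None:
        dist = lambda a, b: np.linalg.norm(np.asarray(a) - np.asarray(b))
    n = len(instances)
    occurrence = np.zeros(n, dtype=int)
    bad = np.zeros(n, dtype=int)
    for i in range(n):
        d = np.array([dist(instances[i], instances[j]) if j != i else np.inf
                      for j in range(n)])
        for j in np.argsort(d)[:k]:       # k nearest neighbors of instance i
            occurrence[j] += 1
            if labels[j] != labels[i]:    # conflicting class label
                bad[j] += 1
    return occurrence, bad
```

An instance with a high total k-occurrence is a hub; a hub for which the bad count dominates is a bad hub in the above sense.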
As the presence of hubs is a general phenomenon characterizing many datasets, we
argue that it is of relevance to feature selection approaches as well. Therefore,
in this chapter, we will survey the aforementioned results and describe the most
important hubness-aware classifiers in detail using unified terminology and
notation. As a first step towards hubness-aware feature selection, we will examine
the usage of distances from the selected instances as features in a
state-of-the-art classifier.
The methods proposed in [50, 52-54] were originally designed for vector
classification and are novel to the domain of time-series classification.
Therefore, we will provide experimental evidence supporting the claim that these
methods can be effectively applied to the problem of time-series classification.
The usage of distances from selected instances as features can be seen as
transforming the time series into a vector space. While the technique of
projecting the data into a new space is widely used in classification, see e.g.
support vector machines [7, 11] and principal component analysis [25], to the
best of our knowledge, the particular procedure we perform is novel.
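A minimal sketch of this transformation follows; the helper name, the choice of selected instances, and the use of scikit-learn's SVC in the commented usage are our illustrative assumptions, not prescriptions from the cited works. Each time series is mapped to the vector of its distances to a fixed set of selected instances, after which any vector-space classifier can be trained:

```python
import numpy as np

def distance_features(series_list, selected, dist):
    """Map each time series to the vector of its distances to the
    `selected` instances, giving a fixed-length vector representation
    suitable for standard vector classifiers."""
    return np.array([[dist(s, sel) for sel in selected] for s in series_list])

# Hypothetical usage: `train`, `train_labels`, `test` and `selected`
# are assumed to exist; dtw_distance is the function sketched earlier.
# from sklearn.svm import SVC
# X_train = distance_features(train, selected, dtw_distance)
# X_test = distance_features(test, selected, dtw_distance)
# clf = SVC().fit(X_train, train_labels)
# predictions = clf.predict(X_test)
```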