one that maximizes the margins, i.e., the distance from the nearest training points,
which has been found to increase the generalization capabilities (Burges 1998;
Bennett and Campbell 2000).
Generally, regarding classification algorithms, it seems that very good recognition
performance can be obtained using appropriate off-the-shelf classifiers such
as LDA or SVM (Lotte et al. 2007). What seems to be really important is the design
and selection of appropriate features to describe EEG signals. For this purpose,
specific EEG signal-processing tools have been proposed to design BCI. In the rest
of this chapter, we will therefore focus on EEG feature extraction tools for BCI.
Readers interested in learning more about classification algorithms are referred to
(Lotte et al. 2007), a review paper on this topic.
7.2.2 Feature Extraction
As mentioned before, feature extraction aims at representing raw EEG signals by an
ideally small number of relevant values, which describe the task-relevant information
contained in the signals. However, classifiers are able to learn from data
which class corresponds to which input features. As such, why not use the EEG
signals directly as input to the classifier? The reason is the so-called curse of
dimensionality, which states that the amount of data needed to properly describe the
different classes increases exponentially with the dimensionality of the feature
vectors (Jain et al. 2000; Friedman 1997). It has been recommended to use from 5
to 10 times as many training examples per class as the input feature vector
dimensionality¹ (Raudys and Jain 1991). What would it mean to use the EEG signals
directly as input to the classifier? Let us consider a common setup with 32 EEG
sensors sampled at 250 Hz, with one trial of EEG signal being 1 s long. This would
mean a dimensionality of 32 × 250 = 8,000, which would require at least 40,000
training examples per class. Obviously, we cannot ask the BCI user to perform each
mental task 40,000 times to calibrate the BCI before they can use it. A much more
compact representation is therefore needed, hence the necessity to perform some
form of feature extraction.
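The arithmetic above can be reproduced with a short sketch. The function names and the `factor` parameter are hypothetical, introduced only for illustration; the sketch simply assumes that every raw sample of every channel is used as one feature and applies the 5-to-10-times rule of thumb quoted above.

```python
def raw_feature_dimensionality(n_channels, sampling_rate_hz, trial_duration_s):
    """Number of values in one trial if every raw sample is used as a feature."""
    return int(round(n_channels * sampling_rate_hz * trial_duration_s))


def recommended_examples_per_class(dimensionality, factor=5):
    """Training examples per class suggested by the rule of thumb (factor 5 to 10)."""
    return factor * dimensionality


# Setup from the text: 32 sensors sampled at 250 Hz, trials lasting 1 s.
dim = raw_feature_dimensionality(32, 250, 1.0)
low = recommended_examples_per_class(dim, factor=5)
high = recommended_examples_per_class(dim, factor=10)
print(dim, low, high)  # 8000 40000 80000
```

By comparison, a hypothetical compact representation with 10 features would, under the same rule, call for only 50 to 100 training examples per class.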
With BCI, there are three main sources of information that can be used to extract
features from EEG signals:
• Spatial information: Such features would describe where (spatially) the relevant
signal comes from. In practice, this would mean selecting specific EEG
channels, or focusing more on specific channels than on others. This
amounts to focusing on the signal originating from specific areas of the brain.
• Spectral (frequential) information: Such features would describe how the
power in some relevant frequency bands varies. In practice, this means that the
features will use the power in some specific frequency bands.
¹ Note that this was estimated before SVMs were invented and that SVMs are generally less
sensitive, although not completely immune, to this curse of dimensionality.
 