number, e.g., 48 channels in (Sannelli et al. 2010). However, simply using more channels will not solve the problem. Indeed, using more channels means extracting more features, thus increasing the dimensionality of the data and suffering more from the curse-of-dimensionality. As such, just adding channels may even decrease performance if too little training data is available. In order to efficiently exploit multiple EEG channels, three main approaches are available, all of which contribute to reducing the dimensionality:

• Feature selection algorithms: These are methods that automatically select a subset of relevant features among all the features extracted.
• Channel selection algorithms: These are similar methods that automatically select a subset of relevant channels among all channels available.
• Spatial filtering algorithms: These are methods that combine several channels into a single one, generally using weighted linear combinations, from which features will be extracted (a minimal sketch follows this list).

They are described below.
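Before those descriptions, the following sketch illustrates the idea behind the third approach: a spatial filter is simply a set of channel weights, so filtering reduces many recorded channels to a few "virtual channels" from which features are then extracted. The signal shapes and the random filter matrix are illustrative assumptions only; in practice the weights would be learned by an algorithm such as CSP.

```python
# Hypothetical sketch: spatial filtering as a weighted linear combination of
# EEG channels. Shapes and the random filter matrix W are illustrative only;
# in practice W would be obtained from a spatial filtering algorithm (e.g., CSP).
import numpy as np

n_channels, n_samples, n_filters = 48, 512, 4
eeg = np.random.randn(n_channels, n_samples)   # one trial: channels x time samples
W = np.random.randn(n_filters, n_channels)     # each row = one spatial filter (channel weights)

filtered = W @ eeg                              # a few "virtual channels" x time samples
# Features (e.g., band power) are then extracted from these few filtered signals
# instead of from all 48 original channels, reducing dimensionality.
band_power = np.log(np.var(filtered, axis=1))   # a common feature: log-variance per filter
print(filtered.shape, band_power.shape)          # (4, 512) (4,)
```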
7.3.2.1 Feature Selection
Feature selection algorithms are classical algorithms widely used in machine learning (Guyon and Elisseeff 2003; Jain and Zongker 1997) and as such are also very popular in BCI design (Garrett et al. 2003). There are two main families of feature selection algorithms (a brief code sketch illustrating both families follows the list):

• Univariate algorithms: They evaluate the discriminative (or descriptive) power of each feature individually. Then, they select the N best individual features (N needs to be defined by the BCI designer). The usefulness of each feature is typically assessed using measures such as the Student t-statistic, which quantifies the feature value difference between two classes, correlation-based measures such as R², or mutual information, which measures the dependence between the feature value and the class label (Guyon and Elisseeff 2003). Univariate methods are usually very fast and computationally efficient, but they are also suboptimal. Indeed, since they only consider individual feature usefulness, they ignore possible redundancies or complementarities between features. As such, the best subset of N features is usually not the N best individual features. As an example, the N best individual features might be highly redundant and measure almost the same information, so using them together would add very little discriminant power. On the other hand, adding a feature that is individually not very good but which measures different information from that of the best individual ones is likely to improve the discriminative power much more.
• Multivariate algorithms: They evaluate subsets of features together and keep the best subset with N features. These algorithms typically use measures of global performance for the subsets of features, such as measures of classification performance.
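As a rough illustration of the two families, the hedged sketch below ranks pre-extracted features individually with mutual information (univariate) and, separately, searches for a good feature subset with a classifier-in-the-loop wrapper (multivariate), using scikit-learn's SelectKBest and SequentialFeatureSelector. The random data, its shapes, and the choice of N are illustrative assumptions, not values from the chapter.

```python
# Hypothetical sketch: univariate vs. multivariate feature selection on
# pre-extracted BCI features (e.g., band-power values per channel and band).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import (SelectKBest, mutual_info_classif,
                                       SequentialFeatureSelector)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 96))   # 200 trials x 96 features (assumed shapes)
y = rng.integers(0, 2, size=200)     # two mental-task classes
N = 10                               # number of features to keep (set by the BCI designer)

# Univariate: score each feature independently (here with mutual information)
# and keep the N best individual features.
univariate = SelectKBest(score_func=mutual_info_classif, k=N).fit(X, y)
X_uni = univariate.transform(X)

# Multivariate (wrapper): evaluate feature subsets together via a classifier's
# cross-validated performance, greedily growing the subset up to N features.
lda = LinearDiscriminantAnalysis()
multivariate = SequentialFeatureSelector(lda, n_features_to_select=N,
                                         direction="forward", cv=5).fit(X, y)
X_multi = multivariate.transform(X)

print("univariate keeps features:  ", np.flatnonzero(univariate.get_support()))
print("multivariate keeps features:", np.flatnonzero(multivariate.get_support()))
```

The wrapper search is slower, but because it scores features jointly it can exploit complementarities and avoid redundant features, which the univariate ranking ignores.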
 