3.2.3 Feature Selection (Step 2)
In data mining and classification applications, feature selection and reduction of
the dimensionality of the feature space play a crucial role in effective classifier
design, because they regularize and restrict the solution space. In practice, a large
number of features may actually degrade the performance of the classifier if the
number of training samples is small relative to the number of features [12].
This study proposes a statistical method for mining the most significant channels,
resembling the way many clinical neurophysiological studies evaluate the brain
activation patterns.
Hence, the second step (Fig. 3.2 - Step 2) of our design involves the selection of
features by statistical testing; the choice of test depends upon the feature-vector
properties and the experimental design. The distribution of the features plays the
most important role, since it determines which statistical test is the most
appropriate (Fig. 3.1).
Normality of the feature set may be tested using the D'Agostino-Pearson test [19].
Once normality is met and supposing that two classes are being discriminated, the
t-test or analysis of variance (ANOVA) is the ideal test to use in our application.
The ANOVA test is superior for complex analyses for two reasons. The first is
its ability to combine complex data into one statistical procedure (more than two
groups may be compared). The second benefit over a simple t-test is the ANOVA's
ability to determine interaction effects. One of the common assumptions underlying
ANOVA is that the groups being compared are independent of each other. In
the case of a related-samples design (the same subjects perform each task), either
matched pairs or repeated measures are more appropriate, e.g., a repeated measures
ANOVA [19] with the two tasks and the channels as the repeated-measures factors,
testing for significance at the level of 0.05. For those bands where
the significance criterion is fulfilled, follow-up post hoc tests for each channel are
performed to identify the best candidate channels to preserve as features, which
correspond to the most significant brain areas in terms of activity.
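The normality check that drives the choice of test can be sketched as follows. This is a minimal, standard-library-only illustration of the D'Agostino-Pearson K^2 omnibus statistic (squared z-scores of sample skewness and kurtosis, referred to a chi-square distribution with 2 degrees of freedom), not the implementation used in this study. The function names, the helper choose_test_family and the fallback to a nonparametric test when normality is rejected are assumptions made for illustration. In practice a library routine such as scipy.stats.normaltest, which implements the same test, would normally be preferred.

```python
import math
from statistics import fmean


def dagostino_pearson_k2(sample):
    """Return (K2, p_value) of the D'Agostino-Pearson omnibus normality test."""
    n = len(sample)
    if n < 8:
        raise ValueError("the skewness transform needs at least 8 samples")
    mean = fmean(sample)
    # Central moments of order 2, 3 and 4 (biased, i.e. divided by n).
    m2, m3, m4 = (fmean([(v - mean) ** k for v in sample]) for k in (2, 3, 4))
    if m2 == 0:
        raise ValueError("a constant sample has no defined skewness or kurtosis")
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2

    # z-score of the skewness (D'Agostino transform).
    y = skew * math.sqrt((n + 1) * (n + 3) / (6.0 * (n - 2)))
    beta2 = (3.0 * (n * n + 27 * n - 70) * (n + 1) * (n + 3)
             / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))
    w2 = -1.0 + math.sqrt(2.0 * (beta2 - 1.0))
    delta = 1.0 / math.sqrt(0.5 * math.log(w2))
    alpha = math.sqrt(2.0 / (w2 - 1.0))
    z_skew = delta * math.asinh(y / alpha)

    # z-score of the kurtosis (Anscombe-Glynn transform).
    expected = 3.0 * (n - 1) / (n + 1)
    variance = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    x = (kurt - expected) / math.sqrt(variance)
    sb1 = (6.0 * (n * n - 5 * n + 2) / ((n + 7) * (n + 9))
           * math.sqrt(6.0 * (n + 3) * (n + 5) / (n * (n - 2) * (n - 3))))
    a = 6.0 + 8.0 / sb1 * (2.0 / sb1 + math.sqrt(1.0 + 4.0 / sb1 ** 2))
    denom = 1.0 + x * math.sqrt(2.0 / (a - 4.0))
    term = math.copysign(((1.0 - 2.0 / a) / abs(denom)) ** (1.0 / 3.0), denom)
    z_kurt = ((1.0 - 2.0 / (9.0 * a)) - term) / math.sqrt(2.0 / (9.0 * a))

    k2 = z_skew ** 2 + z_kurt ** 2
    # Survival function of a chi-square distribution with 2 degrees of freedom.
    return k2, math.exp(-k2 / 2.0)


def choose_test_family(condition_a, condition_b, alpha=0.05):
    """Hypothetical helper: 'parametric' only if both samples pass normality."""
    passed = all(dagostino_pearson_k2(s)[1] > alpha
                 for s in (condition_a, condition_b))
    return "parametric" if passed else "nonparametric"
```

Here condition_a and condition_b would hold the feature values of one channel under the two tasks; a "parametric" result points toward the t-test or ANOVA family discussed above.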
3.2.4 Feature Refinement (Steps 3 and 4)
The aforementioned steps derive a subset of significant channels, based only on task
differentiation confidence intervals using Global PS measures. To further refine the
features and optimize the whole process, we propose to isolate only those time
segments of the EEG signal where notable activity differences occur from the control
to the arithmetic task. The aim is to further map the EEG signal into a feature
vector that best characterizes the EEG pattern of activity for the target task in terms
of significant temporal and spectral content. As we are interested in ongoing EEG
activity within various tasks, the temporal activity of EEG events is of interest.
Notice that we focus on significant (bursty and/or sequential) activations and not
on the evolution of brain operation during the task. Thus, we focus mainly
on the time-localized EEG activity itself, without particular interest in the temporal