Online BCI Implementation of High-Frequency Phase Modulated Visual Stimuli (Universal Access in Human-Computer Interaction)

Abstract

Brain computer interfaces (BCI) that use the steady-state visual evoked potential (SSVEP) as neural source offer two main advantages over other types of BCIs: shorter calibration times and higher information transfer rates. SSVEPs elicited by high frequency (larger than 30 Hz) repetitive visual stimulation are less prone to cause visual fatigue, safer, and more comfortable for the user. However, in the high frequency range there is a practical limitation because only a few frequencies can elicit sufficiently strong SSVEPs for BCI purposes. We bypass this limitation by using only one stimulation frequency and different phases. To detect the phase from the recorded SSVEP, we use spatial filtering combined with phase synchrony analysis. We developed an online BCI implementation which was tested on six subjects and resulted in an average accuracy of 95.5% and an average bit rate of 34 bits-per-minute. Our approach has the advantage of entailing only minimal visual annoyance for the user.

Introduction

The steady state visual evoked potential (SSVEP) refers to the response of the cerebral cortex to a repetitive visual stimulus (RVS) oscillating at a constant stimulation frequency. The SSVEP manifests as peaks at the stimulation frequency and/or harmonics in the power spectral density (PSD) of EEG signals [1]. Because of their proximity to the primary visual cortex, the occipital EEG sites exhibit a stronger SSVEP response.

Among non-invasive EEG based brain computer interfaces (BCI), SSVEP based BCIs provide higher information transfer rates (ITR) and require shorter calibration times [2]. SSVEP based BCIs operate by presenting the user with a set of repetitive visual stimuli (RVSi). In most current implementations, the RVSi are distinguished from each other by their stimulation frequency [3,4,5]. The SSVEP corresponding to the RVS receiving the user’s attention is more prominent and can be detected from the ongoing EEG. Each RVS is associated with an action or command which is executed by the BCI when the corresponding SSVEP is detected.

The majority of current SSVEP-based BCIs use stimulation frequencies in the 4 to 30 Hz frequency band [6]. Compared to higher frequencies, RVSi at these frequencies have several disadvantages: 1) they are prone to cause visual fatigue, which decreases the SSVEP strength, 2) they entail a higher risk of photic or pattern-induced epileptic seizure [7], and 3) they overlap with the frequency band of spontaneous brain activity. Higher stimulation frequencies are thus preferable for the sake of safety and comfort of the BCI user.

Only a limited number of frequencies above 30 Hz can elicit a sufficiently strong SSVEP for BCI purposes [8]. In the classical SSVEP based BCI design where each RVS has a unique stimulation frequency, this limitation implies a reduction in the number of possible BCI commands and consequently the bitrate. A possible way to tackle this limitation is to combine two or more frequencies to drive a particular RVS [9,10]. Thus, if N frequencies are available, k frequencies selected among these N can be used to drive each RVS. The total number of different RVSi is then $\binom{N}{k}$, which is larger than N if N > k + 1 and k > 1.
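As a quick numerical check of this count, the following minimal sketch evaluates the binomial coefficient with Python's `math.comb`. The function name `n_stimuli` and the values N = 4 and k = 2 are illustrative assumptions and are not taken from the experiments reported here.

```python
from math import comb

def n_stimuli(n_freqs: int, k: int) -> int:
    """Number of distinct stimuli when each is driven by k of n_freqs frequencies."""
    return comb(n_freqs, k)

# Illustrative values only: 4 usable frequencies, 2 frequencies per stimulus.
print(n_stimuli(4, 2))  # 6, which exceeds N = 4 because N > k + 1 and k > 1
```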

An alternative way, which is the one adopted in this paper, consists in using the same stimulation frequency but different phases [11,12,13]. Detecting the phase of the stimulus that receives the user’s focus of attention is possible because the SSVEP is phase-locked to the visual stimulus [1].

The SSVEP phase can be estimated using the Discrete Fourier Transform (DFT) [11,14] or the Short Time Fourier Transform (STFT) [15]. These methods require a relatively long signal segment with a duration that is a multiple of the stimulation period. In addition, only the absolute SSVEP phase is estimated. This means that the calibration stimuli (the ones used to train the estimation algorithm) and the operation stimuli should be synchronized. The need for such synchronization can be removed if the phase difference between the SSVEP and the stimulation signal, i.e. the one that drives the RVS, is considered instead of the absolute phase. Hereafter, for convenience, we refer to this phase difference as the SSVEP phase.

The SSVEP phase can be estimated using the Hilbert transform on a spatially filtered signal. In this paper, we propose an online BCI implementation based on the phase detection of SSVEPs evoked by high frequencies. This paper is organized as follows. Section 2 describes the signal processing steps leading to the phase estimation. The experimental methods are presented in Section 3. Section 4 analyzes the results. The conclusions are finally presented in Section 5.

Signal Processing Methods

Signal processing methods are utilized to obtain a two-dimensional feature vector from a multi-channel EEG signal of a given duration (EEG epoch). Pattern recognition methods are then used to estimate the subject’s intended command from the feature vectors. A diagram illustrating this process is presented in Figure 1.

The multi-channel EEG is first spatially filtered. This consists in linearly combining the signals from all EEG channels into a single signal. The SSVEP energy, the first component of the feature vector, is estimated from this signal by applying a peak filter centered at the stimulation frequency. The choice of linear weighting coefficients is based on [16]. This is explained in Section 2.1.

Fig. 1. Signal processing methods

The phase, the second component of the feature vector, is estimated by computing the average phase difference between the stimulation signal and the spatially filtered signal. We refer to this estimation process as phase synchrony analysis [8].

The feature vectors are submitted to a probabilistic neural network to determine the RVS on which the user focuses her/his attention (see Section 2.3).

Spatial Filtering

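We consider an EEG epoch X which can be written as a T x N matrix having as columns the N, T-sample long signals $x_i$, i = 1,…,N. The spatially filtered signal $x_w$ can be written as a linear combination of the $\{x_i\}$. This implies:

$$x_w = \sum_{i=1}^{N} w_i x_i = Xw,$$

where $w = [w_1, \ldots, w_N]'$ is the vector of spatial filter coefficients.

The spatial filter coefficients are estimated so that the ratio between the SSVEP and background activity is maximized [16]. The maximum contrast combination method in [16] proposes to estimate w on a per EEG epoch basis using:

$$\hat{w} = \arg\max_{w} \frac{w' X' X w}{w' (X - QX)' (X - QX) w}, \tag{1}$$

where $Q = S(S'S)^{-1}S'$ is the matrix that projects onto the column space of S, and S is a matrix which has as columns the signals in the set

$$\{\sin(2\pi h f t),\ \cos(2\pi h f t)\}, \quad h = 1, \ldots, H.$$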

These are sinusoidal signals at the frequency of stimulation f and H harmonics. Since in this paper we consider high stimulation frequencies (> 30 Hz) and the EEG spectral content is restricted to 60 Hz, only the stimulation frequency is considered, i.e. H = 1.

In (1), the per-epoch covariance matrices $X'X$ and $(X - QX)'(X - QX)$ are used to estimate the SSVEP activity and the background activity respectively. A better and more stable estimate of the covariance matrix can be obtained if the covariance matrices of several EEG epochs are averaged. Thus, we propose to estimate the optimum spatial filter as follows:

$$\hat{w} = \arg\max_{w} \frac{w' \left( \frac{1}{K} \sum_{k=1}^{K} X_k' X_k \right) w}{w' \left( \frac{1}{K} \sum_{k=1}^{K} (X_k - QX_k)' (X_k - QX_k) \right) w},$$

where $X_k$ is the k-th EEG epoch and K is the total number of epochs that are considered.
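The maximization of such a ratio of quadratic forms is a generalized eigenvalue problem, and the following NumPy sketch illustrates one way to solve it. It is not the MATLAB implementation used in this work: the function names, the choice to solve through `numpy.linalg.eig`, the unit-norm scaling of w, and the requirement that all epochs have the same length are assumptions made for illustration. Only the fundamental frequency is modeled (H = 1).

```python
import numpy as np

def projection_matrix(f, fs, n_samples):
    """Projector onto the sine/cosine pair at frequency f (H = 1)."""
    t = np.arange(n_samples) / fs
    S = np.column_stack([np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)])
    return S @ np.linalg.solve(S.T @ S, S.T)

def max_contrast_filter(epochs, f, fs):
    """Spatial filter maximizing the SSVEP-to-background ratio over several epochs.

    epochs: list of (T x N) arrays of equal length T; returns a unit-norm vector w.
    """
    epochs = [np.asarray(X, dtype=float) for X in epochs]
    Q = projection_matrix(f, fs, epochs[0].shape[0])
    # Averaged covariance of the raw epochs (SSVEP plus background).
    A = sum(X.T @ X for X in epochs) / len(epochs)
    # Averaged covariance after removing the part explained by the sinusoids.
    B = sum((X - Q @ X).T @ (X - Q @ X) for X in epochs) / len(epochs)
    # Generalized eigenproblem A w = lambda B w; keep the largest eigenvalue.
    vals, vecs = np.linalg.eig(np.linalg.solve(B, A))
    w = np.real(vecs[:, np.argmax(np.real(vals))])
    return w / np.linalg.norm(w)
```

The spatially filtered signal for an epoch `X` is then `X @ w`. The sign of w is arbitrary in this sketch, which matters for phase estimation and would have to be fixed by convention.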

The SSVEP energy (first component of the feature vector) is estimated by applying to the signal $x_w(t)$ a 1-Hz narrow band FIR filter centered around the stimulation frequency (peak-filter). This results in the narrow band signal $z_f(t)$, from which the SSVEP energy E can be estimated in a given time window.
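The peak-filtering and energy estimation step can be sketched as follows. The filter design is not specified in the text, so the windowed-sinc band-pass with a Hamming window, the number of taps (1025), and the definition of the energy as the sum of squared samples in the window are all assumptions made for illustration.

```python
import numpy as np

def peak_filter(f0, fs, bandwidth=1.0, n_taps=1025):
    """Windowed-sinc FIR band-pass of the given bandwidth centered at f0 (Hz)."""
    n = np.arange(n_taps) - (n_taps - 1) / 2
    lo, hi = (f0 - bandwidth / 2) / fs, (f0 + bandwidth / 2) / fs
    # Difference of two ideal low-pass responses, tapered by a Hamming window.
    h = 2 * hi * np.sinc(2 * hi * n) - 2 * lo * np.sinc(2 * lo * n)
    return h * np.hamming(n_taps)

def ssvep_energy(x_w, f0, fs, start, stop):
    """Sum of squared samples of the peak-filtered signal in [start, stop)."""
    # mode="same" compensates the linear-phase delay of the symmetric filter.
    z_f = np.convolve(x_w, peak_filter(f0, fs), mode="same")
    return float(np.sum(z_f[start:stop] ** 2))
```

With a 256 Hz sampling rate, a filter this narrow spans about four seconds, so the first and last two seconds of the output are affected by edge transients and are best excluded from the energy window.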

Phase Synchrony Analysis

The phase (second component of the feature vector) is estimated through a process termed phase synchrony analysis. This first computes the instantaneous phase difference $S_f(t)$ between $z_f(t)$ and the stimulation signal filtered through the peak-filter centered around f. The latter signal is denoted as $l_f(t)$.

The analytical signals associated with $z_f(t)$ and $l_f(t)$ respectively can be written as:

$$z_f(t) + j\,\tilde{z}_f(t) = A\{z_f\}(t)\, e^{j\theta\{z_f\}(t)}, \qquad l_f(t) + j\,\tilde{l}_f(t) = A\{l_f\}(t)\, e^{j\theta\{l_f\}(t)},$$

where $\tilde{z}_f(t)$ and $\tilde{l}_f(t)$ are the Hilbert transforms of $z_f(t)$ and $l_f(t)$, and $A\{\cdot\}$ and $\theta\{\cdot\}$ are the instantaneous amplitude and phase respectively. Thus, the instantaneous phase difference is:

$$S_f(t) = \theta\{z_f\}(t) - \theta\{l_f\}(t).$$

The phase is estimated as the median of $S_f(t)$ in a given time window, e.g. the epoch duration.
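The phase synchrony analysis of this section can be sketched with NumPy as shown below. The analytic signal is computed through the FFT, which is the usual discrete implementation of the Hilbert transform. Two details are assumptions that the text does not specify: the sign convention (SSVEP phase minus stimulation phase) and the re-centering of the phase differences around their circular mean before taking the median, which is added here so that values near ±π are not split by the wrap-around.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal x + j*Hilbert{x}, computed with the FFT."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def ssvep_phase(z_f, l_f):
    """Median instantaneous phase difference between z_f and l_f, in radians."""
    # Angle of the product with the conjugate gives the wrapped phase difference.
    diff = np.angle(analytic_signal(z_f) * np.conj(analytic_signal(l_f)))
    # Assumption: re-center on the circular mean before taking the median.
    center = np.angle(np.mean(np.exp(1j * diff)))
    centered = np.angle(np.exp(1j * (diff - center)))
    return float(np.angle(np.exp(1j * (center + np.median(centered)))))
```

For two 40 Hz sinusoids sampled at 256 Hz over one second, with the first leading the second by a quarter period, `ssvep_phase` returns a value close to π/2.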

Feature Vector Classification

A probabilistic neural network (PNN) is used to estimate the user’s focus of attention from the feature vector. A PNN is a radial basis network which estimates the probability density function of each class from labeled training data [17] using the Parzen window technique [18]. In our case, the training data correspond to feature vectors obtained from EEG epochs recorded while the BCI user was instructed to pay attention to a particular RVS.
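A minimal sketch of Parzen-window classification in the spirit of a PNN is given below. The Gaussian kernel, the smoothing parameter `sigma`, the function name, and the toy feature values are assumptions for illustration; the text does not state the kernel width or how the circular nature of the phase feature is handled.

```python
import numpy as np

def pnn_classify(train_x, train_y, x, sigma=0.5):
    """Return the label whose Parzen-window density estimate at x is largest.

    train_x: (M x D) feature vectors, train_y: M labels, x: one D-vector.
    """
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    sq_dist = np.sum((train_x - np.asarray(x, dtype=float)) ** 2, axis=1)
    kernel = np.exp(-sq_dist / (2 * sigma ** 2))
    labels = np.unique(train_y)
    # The mean kernel response per class approximates its density at x.
    scores = [kernel[train_y == c].mean() for c in labels]
    return labels[int(np.argmax(scores))]

# Toy (energy, phase) feature vectors for two stimuli; values are made up.
features = [[1.0, 0.0], [1.1, 0.1], [1.0, 1.6], [0.9, 1.5]]
labels = [1, 1, 2, 2]
print(pnn_classify(features, labels, [1.0, 1.55]))  # 2
```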

Experimental Methods

Our BCI implementation uses the BCI2000 software platform [19]. The signal processing algorithms are coded in MATLAB™. The BCI application consists in a computer-rendered 2D maze where a cursor can be moved along four possible directions (upper-left, upper-right, lower-left, and lower-right). The movement direction is decided depending on the RVS which receives the user’s focus of attention.

Fig. 2. (a) Software architecture of our BCI implementation. (b) The BCI application consists of a 2D maze in which the cursor moves according to the RVS which receives the user’s attention. The command sequence to successfully complete this maze configuration is "2232323344144111".

Four RVSi were arranged around a computer screen as illustrated in Figure 2b. Each RVS was embodied in a 10 x 10 cm box containing a green power LED shining through a diffusion screen. The stimulation signal consisted in a square wave (50% duty cycle) at the stimulation frequency. Four phases ($\phi$, $\phi + \pi/2$, $\phi + \pi$, $\phi + 3\pi/2$) were used to command the RVSi (see Figure 3b), where $\phi$ is the initial phase at the onset of the stimulation signal. The corresponding stimulation signals were generated using four synchronized function generators (Agilent Technologies, model 33220A).

The EEG signals were collected using a BioSemi ActiveTwo acquisition device [20]. The signals from the 32 electrodes shown in Figure 3a were recorded with reference to Cz and were re-referenced to the average of all 32 signals. The impedance between scalp and electrodes was kept below 5 kΩ using conductive gel. The device sampling frequency was set to 2048 Hz. During processing the signals were downsampled to 256 Hz. The participants were asked to sit still and try to complete the navigation task as fast as possible.

The stimulation signal was measured using a photodiode located near the RVS with phase $\phi$. The signal from the photodiode was recorded simultaneously with the EEG signals to perform the phase synchrony analysis (see Section 2.2).

Optimal Stimulation Frequency

The stimulation frequency eliciting the strongest SSVEP response (optimal stimulation frequency) is user dependent [21]. Thus, we implemented a procedure aiming at determining the optimal stimulation frequency in the range from 32 to 40 Hz.

Fig. 3. (a) Measured EEG sites. (b) Stimulation signals at four phases

For a given user, this procedure consisted in presenting RVSi at all the integer stimulation frequencies between 32 and 40 Hz. The presentation order was randomized.

For a particular stimulation frequency, the stimulation was presented in a sequence of four intervals, each composed of a 4-second long period of stimulation followed by a 4-second long break. To determine which stimulation frequency elicited the strongest SSVEP, we used the first stimulation period to estimate the optimal spatial filter as explained in Section 2.1. We applied this filter to the whole EEG signal, i.e. the whole sequence, followed by the peak-filter at the stimulation frequency (see Section 2.1). We then estimated the SSVEP energy in one-second long (non-overlapping) windows as explained in Section 2.1. This resulted in 32 values (16 for the stimulation periods and 16 for the break periods) on which a threshold based detection of the SSVEP was performed. Since this constitutes a detection problem with a single threshold, a receiver operating characteristic (ROC) curve [22] can be determined by progressively varying the threshold from the lowest SSVEP energy in the stimulation periods to the highest SSVEP energy in the break periods. The area under the ROC curve (AUC) is a good indicator of the detectability of the SSVEP at the stimulation frequency. The optimal stimulation frequency was the one which resulted in the highest AUC.
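The AUC for one stimulation frequency can be computed directly from the two sets of energy values. The sketch below does not sweep the threshold explicitly; it uses the equivalent rank-based formulation, in which the AUC equals the fraction of (stimulation, break) pairs where the stimulation window has the larger energy, with ties counted as one half. The function name and the example values are illustrative assumptions.

```python
def auc_from_energies(stim, rest):
    """Area under the ROC curve for detecting stimulation from window energies.

    stim, rest: SSVEP energies from stimulation and break windows.
    """
    wins = sum((s > r) + 0.5 * (s == r) for s in stim for r in rest)
    return wins / (len(stim) * len(rest))

# Made-up energies: perfectly separable windows give an AUC of 1.
print(auc_from_energies([3.0, 4.0, 5.0], [0.5, 1.0, 2.0]))  # 1.0
```

Repeating this for each candidate frequency and keeping the frequency with the largest value reproduces the selection rule described above.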

Parameter Calibration

The goal of the calibration is to determine the optimal BCI operation parameters for a particular user, i.e. the coefficients of the spatial filter and the classifier parameters.

During calibration the user was presented with a sequence of 16 intervals, each composed of a 4-second long stimulation period (at the subject’s optimal frequency) followed by a 4-second long break. In each of the intervals, the user was instructed to pay attention to a particular RVS out of the four presented. Each RVS received the user’s attention four times. The sequence was randomized.

The spatial filters were determined on the data recorded during the first stimulation period while the PNN parameters were determined using data from the rest of the intervals.

BCI Operation and Information Transfer Rate

During operation the user was instructed to move the cursor in the 2D maze along a fixed path. There were no bifurcations so that there was a unique way to move in the maze. Backward moves were not allowed.

When a command was detected by the system, the cursor changed its orientation towards the targeted direction and moved there. No movement happened when the detection indicated a non-allowed move.

Correct moves were accompanied by a low pitched tone while incorrect ones were signaled by a high pitched tone.

As shown in Figure 2b, the command sequence to successfully complete this configuration without any erroneous move was '2232323344144111', where 1, 2, 3 and 4 are associated with the left bottom, the left top, the right top, and the right bottom LEDs respectively. This sequence is balanced so that each direction has to be taken four times. This avoids biasing the results due to a preferred direction.

The bitrate was estimated based on the user’s proficiency in moving the cursor through the maze and along the specified path. We defined the accuracy as the ratio between the number of correct moves and the total number of moves.

To evaluate the performance of a BCI, consistent criteria are necessary. The most popular criterion is the information transfer rate (ITR), which measures the information transmitted by the system per unit time and is calculated based on the bitrate definition provided in the seminal paper [23]. This definition suggests the following formulas to obtain the bitrate R (in bits per symbol) and the ITR (in bits per minute) for C classes and classification accuracy p:

$$R = \log_2 C + p \log_2 p + (1 - p) \log_2\!\left(\frac{1 - p}{C - 1}\right), \tag{4}$$

$$\mathrm{ITR} = R \cdot \frac{60}{t}, \tag{5}$$

where t is the average time (in seconds) necessary to detect a symbol or to execute a command.
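The standard bitrate definition from [23] is easy to evaluate numerically. The sketch below is an illustration with assumed function names; it treats the terms p log2 p and (1 - p) log2((1 - p)/(C - 1)) as zero when p is 0 or 1 respectively, which is their limiting value.

```python
from math import log2

def bits_per_symbol(n_classes, p):
    """Bitrate R = log2(C) + p*log2(p) + (1-p)*log2((1-p)/(C-1))."""
    r = log2(n_classes)
    if p > 0:
        r += p * log2(p)
    if p < 1:
        r += (1 - p) * log2((1 - p) / (n_classes - 1))
    return r

def itr_bits_per_minute(n_classes, p, t):
    """Information transfer rate for an average command time of t seconds."""
    return bits_per_symbol(n_classes, p) * 60 / t

# Four classes, 100% accuracy, 4.17 s per command.
print(round(itr_bits_per_minute(4, 1.0, 4.17), 2))  # 28.78
```

At chance level (p = 1/C) the bitrate is zero, which is a useful sanity check for any implementation of this formula.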

Results

During operation, a 1.5-second long window was used to take a decision. This window was subdivided into three sub-windows with 75% overlapping. Each sub-window would lead to a classification.

Table 1. Experimental results

Participant	Freq. (Hz)	Detected command sequence	Accuracy (%)	Latency (s)	ITR (bits/minute)
S1	40	'2232323344144111'	100	4.17	28.78
S2	40	'2232323344144111'	100	2.95	40.64
S3	39	'223232233441434111'	88.9	2.29	34.53
S4	40	'2232323344144111'	100	3.05	40.63
S5	39	'2232232233441443111'	84.2	2.41	27.91
S6	40	'2232323344144111'	100	3.70	32.42
Mean ± S.D.	-	-	95.5 ± 7.1	3.10 ± 0.73	34.15 ± 5.57

The command was decided if at least two classifications resulted in the same decision. For example, if the three classifications were ’101′, the detected command would be ’1′.
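The two-out-of-three decision rule can be sketched as follows. The function name and the use of `None` for "no command issued" are assumptions for illustration; the text does not say what happens when all three classifications differ.

```python
from collections import Counter

def decide_command(classifications, min_votes=2):
    """Return the label shared by at least min_votes sub-windows, else None."""
    label, votes = Counter(classifications).most_common(1)[0]
    return label if votes >= min_votes else None

print(decide_command("101"))  # 1
```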

Six subjects (S1 to S6) participated in our experiments. Their performance is reported in Table 1. For each participant, the optimal stimulation frequency is reported in the second column.

The command sequence is reported in the third column of Table 1. Participants S1, S2, S4, and S6 were able to navigate through the maze with 100% accuracy. The erroneous commands are underlined for participants S3 and S5. The accuracy estimate was used as p in (4) to estimate the bits-per-symbol R when the number of classes C is equal to 4.

The average command latency in the fifth column results from dividing the total time it took the participant to complete the maze by the total number of commands. This term was used as t in (5) to estimate the information transfer rate in bits-per-minute. The ITR is reported in the sixth column of Table 1. The across-participant averages and corresponding standard deviations (SD) are reported in the last row of Table 1. Our results show an average ITR of 34 bits-per-minute, which constitutes a promising result considering that high-frequency repetitive visual stimulation has been applied and that the phase of the RVS was used to distinguish among four possible commands.

Conclusion

In this paper, we have presented an SSVEP-based BCI implementation which uses high frequency repetitive visual stimulation. The different stimulation targets which are attended by the user to generate commands are distinguished by their phase. Thus a single stimulation frequency and different phases are used.

This approach has clear advantages from the viewpoint of usability and visual comfort as high frequency stimulation is less prone to cause visual fatigue. In addition, selecting a single stimulation frequency considerably shortens the calibration procedure which is necessary to determine which stimulation frequencies are best suited for a given user.

The signal processing approach consisting in cascading spatial filters with the phase synchrony analysis proved successful in implementing a phase variant SSVEP-based BCI. Indeed, our evaluation on six participants shows a high SSVEP detection accuracy (95.5% ± 7.1%) and an (across subject) average information transfer rate of 34.15 ± 5.57 bits/minute.