such inadequate filter outputs. On the other hand, when the correlation exceeds δ1, the variable selection algorithm runs until C_max(j) becomes lower than the second threshold, δ2. We empirically search for values of [δ1, δ2] for which the probability of selecting at least one channel is very low for both surrogate data sets, but reasonably high for the original data.
To determine δ1, the correlation between d̂_LMS(n) and d(n) is recursively estimated as

$$\xi(n) = \mu\,\xi(n-1) + (1-\mu)\,\frac{\hat{d}_{\text{LMS}}(n)\,d(n)}{\sqrt{p(n)\,q(n)}}, \qquad (4.47)$$
where μ is a forgetting factor, as commonly used in the recursive least squares (RLS) [14]. p(n) and q(n) represent the power estimates of d̂_LMS(n) and d(n), respectively. Normalization by the square root of p(n)q(n) prevents ξ(n) from being biased by a large magnitude of d(n). The powers are estimated through similar recursions:
$$p(n) = \mu\,p(n-1) + (1-\mu)\,\hat{d}_{\text{LMS}}^{\,2}(n), \qquad q(n) = \mu\,q(n-1) + (1-\mu)\,d^{2}(n). \qquad (4.48)$$
If ξ(n) ≥ δ1, the online variable selection algorithm is activated. If ξ(n) < δ1, an empty subset is yielded and d̂(n) is set equal to d̂_LMS(n).
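As a sketch, the recursions (4.47) and (4.48) together with the δ1 activation rule can be implemented as follows; the function name, forgetting-factor value, threshold, and test signals are illustrative assumptions, not from the source.

```python
import numpy as np

def recursive_correlation(d_hat, d, mu=0.99, eps=1e-12):
    """Recursive estimate of the normalized correlation xi(n) between the
    adaptive-filter output d_hat(n) and the desired signal d(n),
    following Eqs. (4.47)-(4.48). mu is the forgetting factor."""
    xi = p = q = 0.0
    xi_hist = []
    for dh, dn in zip(d_hat, d):
        # Eq. (4.48): recursive power estimates of d_hat(n) and d(n)
        p = mu * p + (1 - mu) * dh * dh
        q = mu * q + (1 - mu) * dn * dn
        # Eq. (4.47): correlation update, normalized by sqrt(p(n) q(n))
        # so that a large magnitude of d(n) does not bias xi(n)
        xi = mu * xi + (1 - mu) * dh * dn / np.sqrt(p * q + eps)
        xi_hist.append(xi)
    return np.array(xi_hist)

# Activation rule: start online variable selection only when xi(n) >= delta_1
rng = np.random.default_rng(0)
d = rng.standard_normal(2000)
d_hat = d + 0.1 * rng.standard_normal(2000)  # a well-correlated filter output
xi = recursive_correlation(d_hat, d)
delta_1 = 0.5  # illustrative threshold
print(xi[-1] >= delta_1)  # prints True: correlated signals drive xi(n) toward 1
```

For a poorly tracking filter output, ξ(n) hovers near zero and the selection step is skipped, which is exactly the behavior the threshold δ1 is meant to produce.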
Once the online variable selection algorithm is started, δ2 is used as the stopping criterion: the algorithm runs until C_max(j) < δ2 at the jth iteration. Here, we describe how C_max(j) represents the correlation of the inputs with a desired output. In the LAR, if two successively selected inputs, x_{j−1} and x_j, have similar correlations with a desired output, d, the decrease from C_max(j − 1) to C_max(j) will be small. On the other hand, if x_{j−1} is much more correlated than x_j, C_max(j) will be much smaller than C_max(j − 1). This is illustrated in Figure 4.13. Consider the data [X, d], where X is an input matrix whose rows are input samples and d is an output vector. Suppose that X has two columns, x1 and x2, and assume that x1, x2, and d are standardized to zero mean and unit variance. Suppose x1 is more correlated with d. The LAR starts to move in the direction of x1: it finds the coefficient β1 for x1 such that |x1ᵀr1| = |x2ᵀr1|, where r1 = d − β1 x1 ≡ d − y1. The maximum correlation then changes from C_max(0) = |x1ᵀd| to C_max(1) = |x1ᵀr1| = |x2ᵀr1|. From this, we can see that the angle between x1 and r1 is equal to the angle between x2 and r1, which is the equiangular property of the LAR.
C_max(j) is related to these angles such that C_max(j) ≈ cos θ_j, where θ_j represents the angle at the jth step. The diagram on the left side of Figure 4.13 illustrates the case when x1 and x2 have similar correlations with d; in this case, a small difference between θ0 and θ1 leads to a small decrease from C_max(0) to C_max(1). On the other hand, the diagram on the right side illustrates the case when x2 is considerably less correlated with d than x1; in this case, a large difference between θ0 and θ1 causes C_max(1) to decrease significantly.
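The two-input example above can be checked numerically. The following sketch (with assumed mixing weights and sample size) computes β1 in closed form for the positive-correlation case and verifies the equiangular property and the decrease of C_max after the first LAR step.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

def standardize(v):
    return (v - v.mean()) / v.std()

# Two inputs with different positive correlations to the output d
# (the mixing weights below are illustrative)
d = rng.standard_normal(N)
x1 = standardize(0.9 * d + 0.4 * rng.standard_normal(N))  # more correlated
x2 = standardize(0.5 * d + 0.9 * rng.standard_normal(N))  # less correlated
d = standardize(d)

# First LAR step: move along x1 until the residual r1 = d - beta1 * x1
# is equally correlated with x1 and x2. With positive correlations,
# solving x1.(d - beta1*x1) = x2.(d - beta1*x1) gives the closed form below.
c1, c2 = x1 @ d, x2 @ d
beta1 = (c1 - c2) / (x1 @ x1 - x2 @ x1)
r1 = d - beta1 * x1

C0 = abs(x1 @ d)   # C_max(0)
C1 = abs(x1 @ r1)  # C_max(1), equal to |x2^T r1| by construction

cosine = lambda a, b: (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(abs(x2 @ r1))                    # matches C1: the equiangular property
print(cosine(x1, r1), cosine(x2, r1))  # equal angles with the residual
print(C1 < C0)                         # the maximum correlation decreases
```

Because the standardized columns have equal norms, equal inner products with r1 imply equal angles, which is the geometric picture behind C_max(j) ≈ cos θ_j.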