indicates that there are a few neurons that are significantly more sensitive than the rest of the ensemble. For HP and HV, the most important neurons are again mostly located in the primary motor cortex (M1). For the GF, we can also see that initially, during the first recording session, multiple brain areas contribute (30% premotor dorsal, 10% supplementary motor associative, 10% somatosensory), but many of these cells are replaced during session two by cells from M1.
4.2.2 L1-norm Penalty Pruning
Recent studies on statistical learning have revealed that, as a regularization method, the L1-norm penalty sometimes provides a better solution than the L2-norm penalty in many applications [27]. Basically, L1-norm based regularization methods can select the input variables most correlated with the outputs and provide sparser models than L2-norm based methods. The Least Absolute Shrinkage and Selection Operator (LASSO) has been a prominent algorithm among the L1-norm based regularization methods [28]. However, its implementation is computationally complex. Least Angle Regression (LAR), recently proposed by Efron et al., provides a framework that incorporates both LASSO and forward stagewise selection [29]. With LAR, the computational complexity of the learning algorithm can be significantly reduced.
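To illustrate the sparsity claim, the following sketch (our own, not from the original text) fits L1- and L2-penalized linear models with scikit-learn on a synthetic sparse system and counts how many coefficients each penalty drives to zero; the data and the regularization strength alpha=0.1 are arbitrary choices for illustration.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic sparse system: only 5 of 50 inputs actually drive the output.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true_beta = np.zeros(50)
true_beta[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]
y = X @ true_beta + 0.1 * rng.standard_normal(200)

# L1 penalty (LASSO) versus L2 penalty (ridge) at comparable regularization strength.
lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)

print("LASSO zero coefficients:", np.sum(np.isclose(lasso.coef_, 0.0)))
print("Ridge zero coefficients:", np.sum(np.isclose(ridge.coef_, 0.0)))

On data of this kind, the LASSO model typically zeroes out most of the irrelevant inputs, while the ridge model shrinks all coefficients without setting any of them exactly to zero.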
The LAR algorithm was recently developed to accelerate computation and improve the performance of forward model selection methods. Efron et al. showed that simple modifications to LAR implement both the LASSO and forward stagewise linear regression [29]. Essentially, the LAR algorithm requires the same order of computational complexity as ordinary least squares (OLS).
The selection property of LAR, which drives some model coefficients exactly to zero, is preferable for sparse system identification compared with regularization methods based on the L2-norm penalty. Also, analysis of the selection process often provides better insight into the unknown system than L2-norm based shrinkage methods.
The LAR procedure starts with all coefficients at zero. The input variable most correlated with the desired response is selected first. We proceed in the direction of the selected input with a step size chosen so that some other input variable attains as much correlation with the current residual as the first input. Next, we move in the equiangular direction between these two inputs until a third input reaches the same correlation. This procedure is repeated until either all input variables have joined the selection or the sum of the coefficients reaches a preset threshold (constraint). Note that the maximum correlation between the inputs and the residual decreases over successive selection steps, so the residual is progressively decorrelated from the inputs. Table 4.4 summarizes the details of the LAR procedure [29].
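To make the selection steps concrete, here is a compact from-scratch sketch of the LAR procedure, written for this note as a simplified reading of Efron et al. [29] rather than a transcription of Table 4.4; the function name lar, the stopping rule, and the numerical tolerance are our own choices, and the code assumes standardized inputs and a centered response.

import numpy as np

def lar(X, y, n_steps=None):
    """Minimal Least Angle Regression sketch (after Efron et al. [29]).

    Assumes the columns of X are standardized (zero mean, unit norm) and y is centered.
    Returns the selection order and the coefficient path, one row per step.
    """
    n, p = X.shape
    n_steps = n_steps if n_steps is not None else min(n - 1, p)
    mu = np.zeros(n)                          # current LAR estimate of y
    beta = np.zeros(p)
    active, path = [], [beta.copy()]

    c = X.T @ (y - mu)
    next_var = int(np.argmax(np.abs(c)))      # start with the most correlated input

    for _ in range(n_steps):
        active.append(next_var)
        c = X.T @ (y - mu)                    # correlations with the current residual
        C = np.max(np.abs(c[active]))         # active correlations are (numerically) equal

        s = np.sign(c[active])
        XA = X[:, active] * s                 # sign-adjusted active columns
        G_inv_ones = np.linalg.solve(XA.T @ XA, np.ones(len(active)))
        AA = 1.0 / np.sqrt(G_inv_ones.sum())
        w = AA * G_inv_ones
        u = XA @ w                            # equiangular (unit) direction

        if len(active) == p:
            gamma = C / AA                    # last step: walk all the way to the OLS fit
        else:
            a = X.T @ u
            gamma, next_var = np.inf, None
            for k in set(range(p)) - set(active):
                for step in ((C - c[k]) / (AA - a[k]), (C + c[k]) / (AA + a[k])):
                    if 1e-12 < step < gamma:  # smallest positive step: next input catches up
                        gamma, next_var = step, k

        mu += gamma * u
        beta[active] += gamma * w * s         # map the step back to signed coefficients
        path.append(beta.copy())

    return active, np.array(path)

Printing C at each iteration shows the maximum absolute correlation with the residual decreasing monotonically, which matches the decorrelation behavior described above.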
The illustration in Figure 4.10 (taken from Efron et al. [29]) helps to explain how the LAR algorithm proceeds. In this figure, we start by moving along the first selected input variable x1 until the next variable (x2 in this example) attains the same correlation with the residual generated by x1. μ1 is the unit