indicates that there are a few neurons that are significantly more sensitive than the rest of the ensemble. For HP and HV, the most important neurons are again mostly located in the primary motor cortex (M1). For the GF, we can also see that initially, during the first recording session, multiple brain areas contribute (30% premotor dorsal, 10% supplementary motor associative, 10% somatosensory), but many of these cells are replaced during session two by cells from M1.
4.2.2 L1-norm Penalty Pruning
Recent studies on statistical learning have revealed that, as a regularization method, the L1-norm penalty sometimes provides a better solution than the L2-norm penalty in many applications [27]. Basically, L1-norm based regularization methods can select the input variables most correlated with the outputs and provide sparser models than L2-norm based methods. The Least Absolute Shrinkage and Selection Operator (LASSO) has been a prominent algorithm among the L1-norm based regularization methods [28]. However, its implementation is computationally complex. Least Angle Regression (LAR), recently proposed by Efron et al., provides a framework that incorporates both LASSO and forward stagewise selection [29]. With LAR, the computational complexity of the learning algorithm can be significantly reduced.
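To illustrate the sparsity claim, the following sketch (our own, not from the original text) fits L1- and L2-penalized linear models with scikit-learn on a synthetic sparse system and counts how many coefficients each penalty drives to zero; the data and the regularization strength alpha=0.1 are arbitrary choices for illustration.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic sparse system: only 5 of 50 inputs actually drive the output.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true_beta = np.zeros(50)
true_beta[:5] = [3.0, -2.0, 1.5, -1.0, 0.5]
y = X @ true_beta + 0.1 * rng.standard_normal(200)

# L1 penalty (LASSO) versus L2 penalty (ridge) at comparable regularization strength.
lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)

print("LASSO zero coefficients:", np.sum(np.isclose(lasso.coef_, 0.0)))
print("Ridge zero coefficients:", np.sum(np.isclose(ridge.coef_, 0.0)))

On data of this kind, the LASSO model typically zeroes out most of the irrelevant inputs, while the ridge model shrinks all coefficients without setting any of them exactly to zero.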
The LAR algorithm was recently developed to accelerate computation and improve the performance of forward model selection methods. Efron et al. showed that simple modifications to LAR implement both the LASSO and forward stagewise linear regression [29]. Essentially, the LAR algorithm requires the same order of computational complexity as ordinary least squares (OLS).
The selection property of LAR, which drives some model coefficients exactly to zero, is preferable for sparse system identification compared with regularization methods based on the L2-norm penalty. Also, analysis of the selection process often provides better insight into the unknown system than L2-norm based shrinkage methods.
The LAR procedure starts with all coefficients at zero. The input variable most correlated with the desired response is selected first. We proceed in the direction of the selected input with a step size chosen so that some other input variable attains as much correlation with the current residual as the first input. Next, we move in the equiangular direction between these two inputs until a third input reaches the same correlation. This procedure is repeated until either all input variables have joined the selection or the sum of the coefficients reaches a preset threshold (constraint). Note that the maximum correlation between the inputs and the residual decreases over successive selection steps, so the residual is progressively decorrelated from the inputs. Table 4.4 summarizes the details of the LAR procedure [29].
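To make the selection steps concrete, here is a compact from-scratch sketch of the LAR procedure, written for this note as a simplified reading of Efron et al. [29] rather than a transcription of Table 4.4; the function name lar, the stopping rule, and the numerical tolerance are our own choices, and the code assumes standardized inputs and a centered response.

import numpy as np

def lar(X, y, n_steps=None):
    """Minimal Least Angle Regression sketch (after Efron et al. [29]).

    Assumes the columns of X are standardized (zero mean, unit norm) and y is centered.
    Returns the selection order and the coefficient path, one row per step.
    """
    n, p = X.shape
    n_steps = n_steps if n_steps is not None else min(n - 1, p)
    mu = np.zeros(n)                          # current LAR estimate of y
    beta = np.zeros(p)
    active, path = [], [beta.copy()]

    c = X.T @ (y - mu)
    next_var = int(np.argmax(np.abs(c)))      # start with the most correlated input

    for _ in range(n_steps):
        active.append(next_var)
        c = X.T @ (y - mu)                    # correlations with the current residual
        C = np.max(np.abs(c[active]))         # active correlations are (numerically) equal

        s = np.sign(c[active])
        XA = X[:, active] * s                 # sign-adjusted active columns
        G_inv_ones = np.linalg.solve(XA.T @ XA, np.ones(len(active)))
        AA = 1.0 / np.sqrt(G_inv_ones.sum())
        w = AA * G_inv_ones
        u = XA @ w                            # equiangular (unit) direction

        if len(active) == p:
            gamma = C / AA                    # last step: walk all the way to the OLS fit
        else:
            a = X.T @ u
            gamma, next_var = np.inf, None
            for k in set(range(p)) - set(active):
                for step in ((C - c[k]) / (AA - a[k]), (C + c[k]) / (AA + a[k])):
                    if 1e-12 < step < gamma:  # smallest positive step: next input catches up
                        gamma, next_var = step, k

        mu += gamma * u
        beta[active] += gamma * w * s         # map the step back to signed coefficients
        path.append(beta.copy())

    return active, np.array(path)

Printing C at each iteration shows the maximum absolute correlation with the residual decreasing monotonically, which matches the decorrelation behavior described above.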
The illustration in Figure 4.10 (taken from Efron et al. [29]) helps to explain how the LAR algorithm proceeds. In this figure, we start by moving along the first selected input variable x1 until the next variable (x2 in this example) attains the same correlation with the residual generated by x1. μ1 is the unit