Biomedical Engineering Reference
In-Depth Information
Let the penalty function be the entropy or $L_0$ penalty, namely,

$$p_{\lambda_j}(|\beta_j|) = \frac{1}{2}\lambda^2 I(|\beta_j| \neq 0),$$

where $I(\cdot)$ is an indicator function and all $\lambda_j = \lambda$. The penalized
likelihood with the entropy penalty can be rewritten as

$$\frac{1}{n}\sum_{i=1}^{n} \ell_i(\beta, \gamma) + \frac{1}{2}\lambda^2 |M|, \tag{2.6}$$

where $|M| = \sum_j I(|\beta_j| \neq 0)$, the size of the candidate model. The AIC [2]
and AIC$_C$ (the finite sample correction of AIC, see [23]) have been
extended to linear mixed effects models in [46], and the BIC [42] was extended
to the linear mixed effects model in [38], in which two modifications of the
BIC were further proposed by considering an arbitrary, possibly informative
prior and the generalized Cauchy prior of Jeffreys [24]. Both AIC and
BIC can be written as the penalized likelihood (2.6) with certain values of
$\lambda$. Specifically, the AIC corresponds to $\lambda = \sqrt{2/n}$ in (2.6). Classical BIC
corresponds to $\lambda = \sqrt{\log(n_e)/n}$, where $n_e$ is the effective number of
observations and may be taken to be either $n$ or $N$, based on the model structure
of interest [38]. Theorem 1 of Jiang and Rao [25] gives conditions on $\lambda$ under
which the resulting criterion is asymptotically consistent. Using Jiang and
Rao's results, it may be verified that if $n_i$ is uniformly bounded, the BIC
(by either formula) is asymptotically consistent, while the AIC is not.
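
The correspondence between (2.6) and the usual information criteria can be
checked numerically. The following is a minimal Python sketch. It assumes
that the per-subject term in (2.6) is the negative log-likelihood
contribution of subject i, an assumption consistent with the plus sign in
front of the penalty. Under it, multiplying (2.6) by 2n should recover
AIC = 2 NLL + 2|M| or BIC = 2 NLL + log(n_e)|M|. The function names and
the numerical values are hypothetical and serve only as illustration.

```python
import math


def lambda_aic(n):
    """Tuning parameter for which (2.6) matches AIC: sqrt(2 / n)."""
    return math.sqrt(2.0 / n)


def lambda_bic(n, n_e):
    """Tuning parameter for which (2.6) matches BIC: sqrt(log(n_e) / n)."""
    return math.sqrt(math.log(n_e) / n)


def penalized_criterion(neg_loglik, model_size, n, lam):
    """Evaluate (2.6): (1/n) * sum_i ell_i + 0.5 * lam^2 * |M|.

    neg_loglik is assumed to be the summed negative log-likelihood
    over the n subjects; model_size is |M|.
    """
    return neg_loglik / n + 0.5 * lam ** 2 * model_size


# Made-up illustrative values, not taken from any data set.
n = 50            # number of subjects
n_e = 50          # effective number of observations (here taken to be n)
neg_loglik = 120.0
model_size = 3

# Textbook forms of the two criteria.
aic = 2.0 * neg_loglik + 2.0 * model_size
bic = 2.0 * neg_loglik + math.log(n_e) * model_size

# The same quantities obtained from (2.6), rescaled by 2n.
aic_from_26 = 2.0 * n * penalized_criterion(neg_loglik, model_size, n, lambda_aic(n))
bic_from_26 = 2.0 * n * penalized_criterion(neg_loglik, model_size, n, lambda_bic(n, n_e))

print(math.isclose(aic, aic_from_26), math.isclose(bic, bic_from_26))
```

Because the rescaling factor 2n does not depend on the candidate model,
minimizing (2.6) over candidate models with these values of the tuning
parameter ranks the models in the same order as AIC or BIC under the
stated assumption.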
Many other penalties have been considered in the penalized least squares
case, i.e., for linear regression models with iid error, and they can be
extended to the longitudinal case. The form of $p_\lambda(\cdot)$ determines the general
behavior of the estimator. Define the $L_p$ penalty to be
$p_{\lambda_j}(|\beta_j|) = \lambda_j p^{-1} |\beta_j|^p$,
$p > 0$. It is well known that the $L_2$ penalty with least squares results
in a ridge regression estimator [21]. The $L_p$ penalty with $0 < p < 2$ yields
bridge regression [17], with properties intermediate between best-subset and
ridge regression. With the $L_1$ penalty specifically, the penalized
likelihood estimator is the LASSO of Tibshirani [45]. Antoniadis and Fan derived
characterizations of penalized least squares with orthonormal design
matrix, and Li, Dziak and Ma [28] extended these to non-orthogonal design
matrices and explored the insights they provide into the choice of penalty
functions. Fan and Li [15] suggested using the smoothly clipped absolute
deviation (SCAD) penalty, defined by