Suppose that we have a sample of $n$ subjects. For the $i$-th subject,
the response variable $y_i(t)$ and the covariate vector $\mathbf{x}_i(t)$ are collected at
times $t = t_{i1}, \ldots, t_{in_i}$, where $n_i$ is the total number of observations on
the $i$-th subject. The partial linear model for longitudinal data has the
following form:
$$y_i(t_{ij}) = \alpha(t_{ij}) + \boldsymbol{\beta}^T \mathbf{x}_i(t_{ij}) + \varepsilon_i(t_{ij}) \tag{6.1}$$
for $i = 1, \ldots, n$ and $j = 1, \ldots, n_i$. As before, variable selection is important
in the partial linear model, because the number of available $x$ variables in
(6.1) can be large.
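To make the data structure behind (6.1) concrete, the following is a minimal simulation sketch. The baseline function `alpha`, the sparse coefficient vector `beta_true`, the visit-count range, and the independent Gaussian errors are all illustrative assumptions, not taken from the text.

```python
import numpy as np

def alpha(t):
    # Hypothetical smooth baseline function alpha(t); any smooth function would do.
    return np.sin(2.0 * np.pi * t)

def simulate(n, beta, rng, max_obs=6, noise_sd=0.5):
    """Return a list of (t_i, X_i, y_i) triples, one per subject, following (6.1)."""
    beta = np.asarray(beta, dtype=float)
    data = []
    for _ in range(n):
        n_i = int(rng.integers(2, max_obs + 1))        # number of observations n_i
        t_i = np.sort(rng.uniform(0.0, 1.0, n_i))      # observation times t_i1, ..., t_in_i
        X_i = rng.normal(size=(n_i, beta.size))        # rows are x_i(t_ij)^T
        eps_i = rng.normal(scale=noise_sd, size=n_i)   # independent errors (a simplification)
        y_i = alpha(t_i) + X_i @ beta + eps_i          # model (6.1)
        data.append((t_i, X_i, y_i))
    return data

rng = np.random.default_rng(0)
beta_true = [2.0, 0.0, -1.5, 0.0, 0.0]  # only two covariates matter in this toy setup
subjects = simulate(30, beta_true, rng)
```

Each subject contributes a different number of rows, which is why (6.1) is indexed by both $i$ and $j$. Real longitudinal errors are typically correlated within a subject; the independent errors here only keep the sketch short.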
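Fan and Li [16] proposed a class of variable selection procedures via the
nonconvex penalized quadratic loss
$$\frac{1}{2n} \sum_{i=1}^{n} (\mathbf{y}_i - \boldsymbol{\alpha}_i - \mathbf{X}_i \boldsymbol{\beta})^T \mathbf{W}_i (\mathbf{y}_i - \boldsymbol{\alpha}_i - \mathbf{X}_i \boldsymbol{\beta}) + \sum_{j=1}^{d} p_{\lambda_j}(|\beta_j|), \tag{6.2}$$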
where $\boldsymbol{\alpha}_i = [\alpha(t_{i1}), \ldots, \alpha(t_{in_i})]^T$. We can then use the entropy, $L_q$, or
SCAD penalty for $p_{\lambda_j}(|\beta_j|)$. Since $\alpha(t)$ is an unknown nonparametric
smooth function, (6.2) cannot be minimized directly with respect to $\boldsymbol{\beta}$. Therefore, Fan
and Li [16] proposed eliminating the nuisance function $\alpha(\cdot)$ using a profiling
technique; see [16] for details. The resulting estimate from (6.2) is a
penalized profile least squares estimate. The sampling properties of the
penalized profile least squares estimate were studied by Fan and Li [16], who
demonstrated that with a proper choice of regularization parameters and
penalty functions, the proposed variable selection procedures perform as
well asymptotically as an oracle estimator.
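The idea of profiling out the smooth function and then penalizing the coefficients can be illustrated with a rough sketch. This is not the procedure of [16]. It makes several simplifying assumptions: a Nadaraya-Watson kernel smoother with a Gaussian kernel stands in for the profiling step, the weight matrices are taken to be identities (working independence), all observations are pooled, and the SCAD-penalized problem is solved by a local quadratic approximation with the conventional choice a = 3.7. The function names, bandwidth, and tuning parameter `lam` are hypothetical.

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty at theta >= 0."""
    theta = np.asarray(theta, dtype=float)
    tail = np.maximum(a * lam - theta, 0.0) / ((a - 1.0) * lam)
    return lam * np.where(theta <= lam, 1.0, tail)

def kernel_smoother_matrix(t, bandwidth):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel (rows sum to one)."""
    diff = (t[:, None] - t[None, :]) / bandwidth
    K = np.exp(-0.5 * diff ** 2)
    return K / K.sum(axis=1, keepdims=True)

def penalized_profile_ls(t, X, y, n_subjects, lam, bandwidth=0.1, n_iter=50, tol=1e-6):
    """Sketch: profile out the smooth part, then iterate a SCAD-penalized ridge-type update."""
    S = kernel_smoother_matrix(t, bandwidth)
    y_res = y - S @ y   # response with the smooth trend in t removed
    X_res = X - S @ X   # covariates with the smooth trend in t removed
    beta = np.linalg.lstsq(X_res, y_res, rcond=None)[0]  # unpenalized starting value
    for _ in range(n_iter):
        active = np.abs(beta) > 1e-4   # very small coefficients are set to zero
        beta_new = np.zeros_like(beta)
        if active.any():
            Xa = X_res[:, active]
            b = np.abs(beta[active])
            D = np.diag(scad_derivative(b, lam) / b)  # local quadratic approximation weights
            beta_new[active] = np.linalg.solve(Xa.T @ Xa + n_subjects * D, Xa.T @ y_res)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Toy data: 40 subjects with 5 pooled observations each (illustrative values only).
rng = np.random.default_rng(1)
n_subjects, n_obs = 40, 5
N = n_subjects * n_obs
t = rng.uniform(0.0, 1.0, N)
X = rng.normal(size=(N, 5))
beta_true = np.array([2.0, 0.0, -2.0, 0.0, 0.0])
y = np.sin(2.0 * np.pi * t) + X @ beta_true + rng.normal(scale=0.3, size=N)

beta_hat = penalized_profile_ls(t, X, y, n_subjects, lam=0.5)
print(np.round(beta_hat, 3))
```

In this toy setup the coefficients of the irrelevant covariates are driven to exactly zero, while the large coefficients are left essentially unpenalized because the SCAD derivative vanishes beyond a * lam. In practice the bandwidth and the regularization parameter would be chosen by a data-driven criterion rather than fixed by hand.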
More research is needed to study how to choose significant variables for
other existing models for longitudinal data. For example, there is little or
no existing work in the literature on variable selection for generalized linear
mixed effects models and generalized partial linear models for longitudinal
data.
Acknowledgements
This research was supported by an NSF grant DMS-03048869 and a National
Institute on Drug Abuse (NIDA) grant P50 DA10075.