Biomedical Engineering Reference
the traditional variable selection criteria, including $C_p$, AIC, and BIC (see
[31], [2], and [42], respectively), have been extended for longitudinal data.
This chapter gives a systematic introduction to variable selection for
longitudinal data.
In Section 2 we give an overview of variable selection for linear
mixed effects models. Selecting significant fixed effect variables is relatively
straightforward, but identification of significant random effects variables is
very challenging; existing works dealing with this issue include Chen and
Dunson [10] and Vaida and Blanchard [46]. Selection of significant random
effects is closely related to covariance selection. Thus, we review some recent
work on covariance selection in Section 3.
Generalized estimating equations (GEE) are very popular for analyzing
binary, count, and categorical longitudinal data. Penalized generalized
estimating equations have recently been proposed for variable selection under
the GEE framework (e.g., by Pan [36, 37], Fu [18], and Dziak [13]). In Section 4 we
present an overview of variable selection methods for GEE, and we explore
their performance empirically in Section 5. In Section 6 we give an
introduction to variable selection for partial linear models, which are useful for
modeling longitudinal data semiparametrically.
2. Variable Selection for Linear Mixed Effects Models
Suppose that we have a sample of $n$ subjects. For the $i$-th subject, we
collect the response variable $y_{ij}$, the $d \times 1$ covariate vector $x_{ij}$, and the
$q \times 1$ covariate vector $z_{ij}$, at various times $t_{ij}$, $j = 1, \ldots, n_i$, where $n_i$ is
the number of observations on the $i$-th subject and $N = \sum_i n_i$ is the total
number of observations. Covariates may be constant within each subject,
or may change over time.
For succinctness, we will use matrix notation. Let $y_i = (y_{i1}, \ldots, y_{in_i})^T$,
$X_i = (x_{i1}, \ldots, x_{in_i})^T$, and $Z_i = (z_{i1}, \ldots, z_{in_i})^T$. In general,
the linear mixed effects model is defined as
$$ y_i = X_i \beta + Z_i \gamma_i + \varepsilon_i, \tag{2.1} $$
where $\beta$ is the fixed effect parameter vector, $\gamma_i$ is the vector of
subject-specific random effects with $\gamma_i \sim N(0, A)$, and $\varepsilon_i$ is a
random error vector following $N(0, \sigma^2 I)$. In the context of (2.1), model
selection is a broader issue than variable selection; for example, one may
choose the best among several candidate mean structures [50]. However, for
simplicity we focus only on variable selection in this section, and in
Section 3 we will review some methods for covariance selection problems.
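
To make the structure of model (2.1) concrete, the following is a minimal simulation sketch, not a method taken from the chapter. It writes the fixed effects as beta and the random effects as gamma_i (notation assumed here), and it further assumes that the random effects and errors are independent, that Z_i contains a random intercept, and that all covariates are standard normal. The function names simulate_lmm and marginal_cov and all parameter values are hypothetical and chosen only for illustration. Under the independence assumption, the marginal covariance of y_i is Z_i A Z_i^T + sigma^2 I, which is one way to see why selecting random effects is tied to covariance selection: a random effect whose variance in A is zero contributes nothing to this covariance.

```python
import numpy as np


def simulate_lmm(n_subjects, beta, A, sigma, rng, max_obs=5):
    """Draw unbalanced data from y_i = X_i beta + Z_i gamma_i + eps_i.

    Illustrative helper (hypothetical name): covariates are standard
    normal and Z_i holds a random intercept plus q - 1 random slopes.
    """
    d, q = len(beta), A.shape[0]
    data = []
    for _ in range(n_subjects):
        n_i = int(rng.integers(2, max_obs + 1))   # between 2 and max_obs visits
        X_i = rng.normal(size=(n_i, d))           # n_i x d fixed effect design
        Z_i = np.column_stack(
            [np.ones(n_i), rng.normal(size=(n_i, q - 1))]
        )                                         # n_i x q random effect design
        gamma_i = rng.multivariate_normal(np.zeros(q), A)  # gamma_i ~ N(0, A)
        eps_i = rng.normal(scale=sigma, size=n_i)          # eps_i ~ N(0, sigma^2 I)
        y_i = X_i @ beta + Z_i @ gamma_i + eps_i
        data.append((y_i, X_i, Z_i))
    return data


def marginal_cov(Z_i, A, sigma):
    """Marginal covariance of y_i, assuming gamma_i and eps_i are independent."""
    return Z_i @ A @ Z_i.T + sigma**2 * np.eye(Z_i.shape[0])


rng = np.random.default_rng(0)
beta = np.array([1.0, 0.0, -0.5])        # second fixed effect is inactive
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])               # random slope has zero variance
sigma = 0.5

data = simulate_lmm(50, beta, A, sigma, rng)
N = sum(len(y_i) for y_i, _, _ in data)  # total number of observations
V_0 = marginal_cov(data[0][2], A, sigma) # covariance of the first subject's responses
```

In this sketch the random slope has zero variance, so V_0 reduces to the compound-symmetry form induced by the random intercept alone. Dropping that random effect from the model leaves the marginal covariance unchanged.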