CHAPTER 1
AN OVERVIEW ON VARIABLE SELECTION FOR
LONGITUDINAL DATA
John J. Dziak a and Runze Li b
Department of Statistics and the Methodology Center,
The Pennsylvania State University,
University Park, PA 16802-2111, USA
E-mails: a jdziak@stat.psu.edu
b rli@stat.psu.edu
During the past two decades, there have been many new developments
in longitudinal data analysis. Authors have devoted much effort to developing
diverse models, along with inference procedures, for longitudinal
data. More recently, researchers in longitudinal modeling have begun ad-
dressing the vital issue of variable selection. Model selection criteria such
as AIC, BIC, C_p, LASSO and SCAD can be extended to longitudinal
data, although care is required to adapt the classical ideas and formulas
to deal with within-subject correlation. This chapter presents a review of
recent developments in variable selection criteria for longitudinal data.
1. Introduction
Since the 1980s, there has been considerable literature on the topic of longitudinal
data analysis. Researchers have invested much effort in developing
diverse models and proposing statistical inference procedures for longitudi-
nal data (see, e.g., [12]). However, although variable selection is an essential
part of statistical analysis, it has only recently received adequate attention
in the context of longitudinal data analysis.
Often in longitudinal studies, many variables are measured. The num-
ber of potential predictors can be large, especially when nonlinear terms
and interactions between covariates are introduced to reduce possible mod-
eling biases. It is common in practice to include only a subset of important
variables in the model, to enhance predictability and model parsimony.
There are many existing subset selection criteria and procedures for linear
regression models; for critical reviews see [5], [43], [19], and [33]. Some of