to be type-I optimal, then a concurrent optimal subset is obtained. In the
above sense, our question is more one of statistical computing than of prediction.
In traditional approaches to subset selection, researchers try to answer
questions regarding the consistency of variable selection, as well as the
optimal accuracy rate in submodel prediction. There is a large body of
existing work, and it is impossible, and unnecessary, for us to give a comprehensive
survey here. We will just list some publications that have been informative
and inspiring to us. Papers [14, 50, 13, 45, 54], and the references therein, give some
interesting results on model estimation that integrate prediction accuracy.
Consistency of variable selection has been studied in Zheng and Loh [53].
Nowadays, due to the rapid growth of data sizes, it becomes increasingly
important to develop statistical principles that can be realized in computa-
tionally efficient ways. Our idea of finding efficient sufficient conditions for
an otherwise unsolvable (i.e., NP-hard) subset selection principle is an
embodiment of this philosophy.
5.2. Other Works in Variable Selection
Despite their generality, the formulations of (P0) and (P1) do not cover
all the existing works in statistical model selection. We review some recent
works that have attracted our attention.
Fan and Li [15] propose a family of new variable selection methods based
on a nonconcave penalized likelihood approach. The criterion is to minimize
\[
\text{Fan\&Li} = \mathrm{RSS}(x) + 2n \sum_{j=1}^{\|x\|_0} p_\lambda(|x_j|),
\]
where $p_\lambda(\cdot)$ is a penalty function that is symmetric, nonconcave on $(0, \infty)$,
and singular at the origin. With a proper choice of $\lambda$, Fan and Li show
that the estimators have good statistical properties, such as sparsity
and asymptotic normality. The oracle property that they established is very
interesting.
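To make the form of this criterion concrete, the following minimal sketch (not from the original text) evaluates the penalized residual sum of squares using the SCAD penalty, the member of this penalty family proposed by Fan and Li; the design matrix A, response y, coefficient vector x, and the tuning constants lam and a are all illustrative names chosen here. Because $p_\lambda(0) = 0$, summing the penalty over all coordinates coincides with summing over the $\|x\|_0$ nonzero ones.

```python
import numpy as np


def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lambda(theta); a = 3.7 is the default suggested by Fan and Li."""
    theta = np.abs(theta)
    linear = lam * theta                                            # |theta| <= lam
    quadratic = (2 * a * lam * theta - theta**2 - lam**2) / (2 * (a - 1))  # lam < |theta| <= a*lam
    constant = lam**2 * (a + 1) / 2                                 # |theta| > a*lam
    return np.where(theta <= lam, linear,
                    np.where(theta <= a * lam, quadratic, constant))


def fan_li_objective(A, y, x, lam):
    """RSS(x) + 2n * sum_j p_lambda(|x_j|).

    Since p_lambda(0) = 0, summing over all coordinates equals summing over
    the ||x||_0 nonzero ones.
    """
    n = len(y)
    rss = np.sum((y - A @ x) ** 2)
    return rss + 2 * n * np.sum(scad_penalty(x, lam))


# Tiny usage example with synthetic data (all quantities are illustrative).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
y = A @ x_true + 0.1 * rng.standard_normal(50)
print(fan_li_objective(A, y, x_true, lam=0.5))
```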
Shen and Ye [46] suggest an adaptive model selection procedure to esti-
mate the algorithmic parameter $\lambda$ from the data. In detail, the optimal
value of $\lambda$ is obtained by minimizing
\[
\text{Shen\&Ye} = \mathrm{RSS}(x) + \lambda\,\hat{g}_0(\lambda_0)\,\sigma^2,
\]
which is derived from the optimal estimator of the loss $l(\lambda, \hat{\lambda})$. The quantity
$\hat{g}_0(\lambda_0)$ is the estimator of $g_0(\lambda_0)$, which is independent of the unknown