14.4 Discussion
Lasso regularization (Sect. 14.1.1) also appears in many other fields, for instance in compressed sensing [4, 5, 8], which solves an optimization problem of the form (14.1) with a least-squares loss function and an $\ell_1$ regularizer.
In compressed sensing, a signal of length $p$ is recovered from few observations (small $n$), under the assumption that the original signal is sparse. Exact recovery is guaranteed with high probability under certain conditions.
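As an illustration, the following sketch recovers a sparse signal from $n \ll p$ random measurements by solving the $\ell_1$-regularized least-squares problem; the Gaussian sensing matrix, sparsity level, and regularization weight are illustrative assumptions, not values from the text.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, n, k = 200, 60, 5                       # signal length, observations (n << p), nonzeros

beta_true = np.zeros(p)
beta_true[rng.choice(p, size=k, replace=False)] = rng.normal(size=k)

X = rng.normal(size=(n, p)) / np.sqrt(n)   # random Gaussian sensing matrix
y = X @ beta_true                          # noiseless measurements

# Solve (1/2n)||y - X beta||_2^2 + alpha * ||beta||_1
beta_hat = Lasso(alpha=0.01, max_iter=10000).fit(X, y).coef_
print("true support:     ", np.flatnonzero(beta_true))
print("recovered support:", np.flatnonzero(np.abs(beta_hat) > 1e-6))
```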
For group lasso (Sect. 14.2.1) with the ordinary least-squares loss function, there are alternative methods, including group LARS (Least Angle Regression) and the group non-negative garrotte [24]. These methods have slightly different characteristics in their solution paths. Also, group LARS usually scales much better than group lasso.
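For concreteness, here is a minimal sketch of one standard way to solve the group lasso itself, proximal gradient descent with blockwise soft-thresholding; it is not the group-LARS or garrotte algorithm of [24], and the group structure, step size, and toy data are illustrative assumptions.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Block soft-thresholding: the proximal operator of t * ||v||_2."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, iters=500):
    """Minimize (1/2n)||y - X b||^2 + lam * sum_g sqrt(|g|) * ||b_g||_2,
    where `groups` is assumed to partition the feature indices."""
    n, p = X.shape
    step = n / np.linalg.norm(X, 2) ** 2              # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(iters):
        z = beta - step * (X.T @ (X @ beta - y) / n)  # gradient step on the smooth part
        for g in groups:                              # groupwise proximal step
            beta[g] = group_soft_threshold(z[g], step * lam * np.sqrt(len(g)))
    return beta

# Toy usage: only the first group carries signal, so the second is shrunk to zero.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 6))
y = X[:, :3] @ np.ones(3) + 0.1 * rng.normal(size=50)
print(group_lasso(X, y, groups=[[0, 1, 2], [3, 4, 5]], lam=0.2))
```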
In Sect. 14.2.2, we introduced a naive approach that reformulates overlapping group lasso as group lasso by replicating features that belong to multiple groups. However, this approach increases the dimension of the optimization problem and therefore may not be preferable when $p$ is large. Several optimization algorithms exist that do not require such replication [13, 25, 28].
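A minimal sketch of the replication reformulation, with hypothetical helper names; it shows how a feature shared by two groups is duplicated, so the expanded problem has more columns than the original.

```python
import numpy as np

def replicate_features(X, groups):
    """Duplicate columns shared by several groups so the groups become disjoint.
    Returns the expanded design and the disjoint groups as index lists
    into the expanded columns."""
    cols, new_groups, offset = [], [], 0
    for g in groups:
        cols.append(X[:, g])                   # copy this group's columns
        new_groups.append(list(range(offset, offset + len(g))))
        offset += len(g)
    return np.hstack(cols), new_groups

X = np.random.default_rng(2).normal(size=(10, 4))
groups = [[0, 1, 2], [2, 3]]                   # feature 2 belongs to both groups
X_rep, disjoint = replicate_features(X, groups)
print(X_rep.shape, disjoint)                   # (10, 5) [[0, 1, 2], [3, 4]]
```

A coefficient of the original problem is then recovered by summing the coefficients of its replicated copies, which is why the dimension, and hence the cost, grows with the amount of overlap.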
When the dimension of the data is much larger than the sample size ($p \gg n$), the solution of the optimization problem in (14.1) can vary with even small changes in the sample. Denoting by $\hat{\beta}^n$ the estimate obtained by solving (14.1) with a sample of size $n$, and by $\beta^*$ the true unknown parameter, we can define the notion of consistency in terms of variable selection,
$$P\left(\{j : \hat{\beta}^n_j \neq 0\} = \{j : \beta^*_j \neq 0\}\right) \to 1, \quad \text{as } n \to \infty.$$
When a method is consistent in terms of variable selection and the convergence above is fast enough, then a small $n$ may not matter much, as the estimate $\hat{\beta}^n$ will be close to $\beta^*$.
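The sketch below estimates the support-recovery probability $P(\{j : \hat{\beta}^n_j \neq 0\} = \{j : \beta^*_j \neq 0\})$ by Monte Carlo simulation for a few sample sizes; the data-generating model and the fixed regularization weight are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
p, k, trials = 50, 3, 100
beta_star = np.zeros(p)
beta_star[:k] = 2.0                            # a strong, sparse true signal
true_support = frozenset(range(k))

for n in (25, 100, 400):
    hits = 0
    for _ in range(trials):
        X = rng.normal(size=(n, p))
        y = X @ beta_star + rng.normal(size=n)
        coef = Lasso(alpha=0.2, max_iter=10000).fit(X, y).coef_
        hits += frozenset(np.flatnonzero(np.abs(coef) > 1e-6)) == true_support
    print(f"n={n:4d}  estimated recovery probability: {hits / trials:.2f}")
```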
Lasso produces consistent estimates when some strong conditions hold [17, 27]. Unfortunately, features from high-throughput genomic profiling are typically highly correlated, and these conditions often break for such data. Reference [2] has shown that, under a fixed $p$ and specific choices of the regularization parameter $\lambda$, the intersection of features selected by bootstrapped lasso estimates is consistent under less restrictive conditions. Reference [18] has proposed the randomized lasso method, which potentially has better consistency. This issue has also been studied in bioinformatics in terms of stable feature selection [1, 7]. As high-throughput profiling technologies develop, the dimension of the data keeps growing, so consistency remains a challenging topic for research.
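A minimal sketch of the bootstrap-intersection idea behind [2]: run the lasso on bootstrap resamples and keep only the features selected in every resample. The number of resamples and the regularization weight are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

def bootstrap_intersection(X, y, alpha=0.1, n_boot=32, seed=0):
    """Return the features selected by the lasso on every bootstrap resample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    selected = None
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # sample n rows with replacement
        coef = Lasso(alpha=alpha, max_iter=10000).fit(X[idx], y[idx]).coef_
        support = set(np.flatnonzero(np.abs(coef) > 1e-6))
        selected = support if selected is None else selected & support
    return sorted(selected)
```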
14.5 Conclusion
The rapid growth of dimensionality in modern high-throughput measurement technologies requires us to consider extra information on features, in order to avoid adverse effects of high dimensionality such as overfitting. Information on groupings