- Measurements from sensors deployed in different parts of a machine. A task here is to find the parts that affect sensitivity during operation, as well as the individual sensors within them that matter.
- Gene expression measured at the exon level. Exons correspond to the coding regions of genes, and they are translated into proteins that function in cells. Identifying clinically important genes, as well as detecting differential usage of their exons, is an important task in biomedical studies.
- Genes that belong to different cellular components, different biological processes, or different molecular functions according to the Gene Ontology.¹
In the examples above, groups of features are used to represent associations of
features that come from our prior knowledge.
Another type of group comes from our design of features, for instance, when we perform feature selection on multinomial covariates. Suppose that a feature z ∈ {A, B, C} is represented with dummy variables x₁ and x₂, so that (x₁, x₂) = (0, 1), (1, 0), and (1, 1) represent A, B, and C, respectively. When z is relevant, it makes sense to select both x₁ and x₂; otherwise, neither variable should be selected. Therefore dummy variables that correspond to the same multinomial variable have to be considered as a group.
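As an illustration, here is a minimal sketch of this dummy coding in Python. The encoding follows the text above, while the function name, sample data, and group labels are illustrative assumptions of ours.

```python
import numpy as np

# Dummy coding for a three-level feature z in {A, B, C}, following the text:
# (x1, x2) = (0, 1), (1, 0), (1, 1) represent A, B, and C, respectively.
LEVELS = {"A": (0, 1), "B": (1, 0), "C": (1, 1)}

def encode(z_values):
    """Map categorical values of z to the dummy columns (x1, x2)."""
    return np.array([LEVELS[v] for v in z_values])

X = encode(["A", "C", "B", "A"])  # shape (4, 2): columns are x1 and x2
groups = np.array([0, 0])         # both columns belong to the same group (z)
print(X)
```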
For both scenarios, the same methods can be applied for grouped feature selection. We will focus on the first type, where groups represent our prior knowledge of the features. In this chapter we discuss feature selection methods that can extract features at both the individual and the group level. We focus on a popular shrinkage method called the lasso, and on its extensions for handling grouped features. These methods are often referred to as embedded feature selection methods in machine learning, or as penalized (regularized) regression methods in statistics. A characteristic they share is that feature selection is integrated with learning the predictor, so there is no need to perform the two steps separately.
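To make this embedded character concrete, the following is a minimal sketch of lasso-based feature selection using scikit-learn's Lasso; the synthetic data and the penalty strength alpha are assumptions chosen for illustration, not details from the text.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))           # 100 samples, 20 features
beta_true = np.zeros(20)
beta_true[:3] = [2.0, -1.5, 1.0]             # only the first 3 features matter
y = X @ beta_true + 0.1 * rng.standard_normal(100)

model = Lasso(alpha=0.1).fit(X, y)           # fitting and selection in one step
selected = np.flatnonzero(model.coef_ != 0)  # nonzero coefficients = selected
print("selected features:", selected)
```

Note that selection falls out of the fit itself: the penalty drives some coefficients exactly to zero, and those features are simply dropped, with no separate selection step.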
14.1.1 Regularized Regression
The methods we will discuss in this chapter can be described as optimization problems with a canonical convex minimization formulation,

    min_{β₀ ∈ ℝ, β ∈ ℝᵖ}  f(β₀, β) + Ω(β).        (14.1)

Here the first part f(β₀, β) of the objective function represents the amount of loss or error incurred by making incorrect predictions. The second part Ω(β) is called a regularizer or a penalty term, which is used to induce certain structure (for example, sparsity) on the coefficient vector β.
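For concreteness, a minimal sketch of the objective in (14.1), instantiated with squared-error loss for f and the lasso penalty Ω(β) = λ‖β‖₁, might look as follows; the penalty weight lam is an assumed example value.

```python
import numpy as np

def objective(beta0, beta, X, y, lam=0.1):
    """f(beta0, beta) + Omega(beta): squared-error loss plus an L1 penalty."""
    residuals = y - (beta0 + X @ beta)
    loss = 0.5 * np.sum(residuals ** 2)    # f(beta0, beta): prediction error
    penalty = lam * np.sum(np.abs(beta))   # Omega(beta); intercept unpenalized
    return loss + penalty
```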
¹ http://www.geneontology.org
 