The difference among these learning methods lies in the optimisation of different cost functions. According to the "sum-of-squared errors + penalty" criterion, these optimisation problems can be formulated in the following form:
$$\hat{\beta} = \arg\min_{\beta} \left\{ \|y - X\beta\|^2 + f(\lambda, \beta) \right\}, \qquad (3)$$
where $f(\lambda, \beta)$ is usually an $L_1$-norm penalty, an $L_2$-norm penalty, or the Elastic Net penalty (which combines the $L_1$-norm and $L_2$-norm penalties).
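For concreteness, under the standard definitions these penalties take the forms below (writing $\lambda_1$, $\lambda_2$ for the two Elastic Net weights and $M$ for the number of regressors; this notation is ours):

$$f_{L_1}(\lambda, \beta) = \lambda \sum_{j=1}^{M} |\beta_j|, \qquad f_{L_2}(\lambda, \beta) = \lambda \sum_{j=1}^{M} \beta_j^2,$$

$$f_{EN}(\lambda_1, \lambda_2, \beta) = \lambda_1 \sum_{j=1}^{M} |\beta_j| + \lambda_2 \sum_{j=1}^{M} \beta_j^2.$$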
The two-stage stepwise selection method [9] considers only the optimisation of the sum-of-squared errors, and therefore cannot select highly correlated genes as a group. To extend the gene selection ability of the recently proposed two-stage stepwise selection, the $L_2$-norm penalty is added to the cost function as follows:
$$J(\beta, \lambda_2) = \|y - X\beta\|^2 + \lambda_2 \|\beta\|^2, \qquad (4)$$

where $\lambda_2$ is the regularisation parameter and $\|\beta\|^2 = \sum_{j=1}^{M} \beta_j^2$.
The estimator $\hat{\beta}$ is the minimiser of (4):

$$\hat{\beta} = \arg\min_{\beta} \left\{ J(\beta, \lambda_2) \right\}. \qquad (5)$$
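As a minimal numerical sketch (not from the paper; the synthetic $X$ and $y$ are placeholders for real expression data), the minimiser (5) can be obtained directly with SciPy's general-purpose optimiser:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, M = 50, 10                            # n samples, M genes (toy sizes)
X = rng.standard_normal((n, M))
X -= X.mean(axis=0)
X /= np.linalg.norm(X, axis=0)           # standardise: diag(X^T X) = 1, as in (6)
y = X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(n)
y -= y.mean()                            # centre the response

lambda_2 = 1.0

def J(beta):
    """Cost function (4): sum-of-squared errors + L2-norm penalty."""
    r = y - X @ beta
    return r @ r + lambda_2 * beta @ beta

# The estimator (5) is the unconstrained minimiser of J.
beta_hat = minimize(J, np.zeros(M)).x
```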
3 Two-Stage Gene Selection Method
3.1 The Grouping Effect of the Ridge Penalty
Qualitatively speaking, a regression method exhibits the grouping effect if the regression coefficients of a group of highly correlated variables tend to be equal (up to a change of sign if negatively correlated) [8]. In fact, (4) is the ridge optimisation problem. The $L_2$ penalty in ridge regularisation provides the grouping effect, as shown in the following.
After the regression matrix $X$ is standardised, it is obvious that

$$X^T X = \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1M} \\ * & 1 & \cdots & \rho_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ * & * & \cdots & 1 \end{pmatrix}, \qquad (6)$$
where $\rho_{ij}$ is the sample correlation between the $i$th and $j$th regressors, and '$*$' denotes the entries implied by symmetry. The ridge estimator is expressed by
$$\hat{\beta} = (X^T X + \lambda_2 I)^{-1} X^T y. \qquad (7)$$
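Continuing the sketch after (5) (reusing its X, y, M and lambda_2), the closed form (7) gives the same estimate without iterative optimisation:

```python
# Closed form (7): beta_hat = (X^T X + lambda_2 I)^{-1} X^T y.
# np.linalg.solve is preferred to forming the inverse explicitly.
beta_closed = np.linalg.solve(X.T @ X + lambda_2 * np.eye(M), X.T @ y)

# Agrees with the iterative minimiser of J up to optimiser tolerance.
assert np.allclose(beta_closed, beta_hat, atol=1e-4)
```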
Theorem 1: Suppose that the response $y$ is centred, the regression matrix $X$ is standardised, and $\hat{\beta}$ is the solution of (4). If $\hat{\beta}_i \hat{\beta}_j \neq 0$, then

$$|\hat{\beta}_i - \hat{\beta}_j| \le \frac{\|y\|}{\lambda_2} \sqrt{2(1 - \rho_{ij})}. \qquad (8)$$
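As a quick sanity check of the bound (our construction, not from the paper): give the design two nearly identical columns and confirm that the gap between their ridge coefficients stays below the right-hand side of (8):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.standard_normal(n)
x2 = x1 + 0.05 * rng.standard_normal(n)   # near-copy: high correlation
X = np.column_stack([x1, x2])
X -= X.mean(axis=0)
X /= np.linalg.norm(X, axis=0)            # standardise: diag(X^T X) = 1
y = X[:, 0] + X[:, 1] + 0.1 * rng.standard_normal(n)
y -= y.mean()                             # centre the response

lambda_2 = 10.0
beta_hat = np.linalg.solve(X.T @ X + lambda_2 * np.eye(2), X.T @ y)

rho = X[:, 0] @ X[:, 1]                   # sample correlation rho_12
gap = abs(beta_hat[0] - beta_hat[1])
bound = np.linalg.norm(y) / lambda_2 * np.sqrt(2 * (1 - rho))
assert gap <= bound                       # Theorem 1, inequality (8)
```

The closer $\rho_{ij}$ is to 1 and the larger $\lambda_2$ is, the tighter the bound forces the two coefficients together, which is the grouping effect in quantitative form.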