difference of the learning methods lies in the optimisation of different cost functions. According to the "sum-of-squared errors + penalty" criterion, these optimisation problems can be formulated in the following form:

$$\hat{\beta} = \arg\min_{\beta} \left\{ \|y - X\beta\|^2 + f(\lambda, \beta) \right\}, \qquad (3)$$

where $f(\lambda, \beta)$ is usually the $L_1$-norm penalty function, the $L_2$-norm penalty function, or the Elastic Net penalty function (a combination of both the $L_1$-norm and the $L_2$-norm penalties).
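The three penalty choices just listed can be written down directly. A minimal Python/NumPy sketch; the function names and the split of the Elastic Net into two separate regularisation parameters are illustrative assumptions, not from the source:

```python
import numpy as np

def l1_penalty(lam, beta):
    # L1-norm (lasso-type) penalty: lam * sum_j |beta_j|
    return lam * np.sum(np.abs(beta))

def l2_penalty(lam, beta):
    # L2-norm (ridge-type) penalty: lam * sum_j beta_j^2
    return lam * np.sum(beta ** 2)

def elastic_net_penalty(lam1, lam2, beta):
    # Elastic Net: both the L1-norm and the L2-norm penalties
    return l1_penalty(lam1, beta) + l2_penalty(lam2, beta)
```

Any of these can play the role of $f(\lambda, \beta)$ in (3); the choice determines whether the estimator favours sparsity, shrinkage, or both.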
The two-stage stepwise selection method [9] only considers the optimisation of the sum-of-squared errors, and therefore cannot select genes with high correlation. To extend the gene selection ability of the recently proposed two-stage stepwise selection, the $L_2$-norm penalty is added to the cost function as follows:

$$J(\beta, \lambda_2) = \|y - X\beta\|^2 + \lambda_2 \|\beta\|^2, \qquad (4)$$

where $\lambda_2$ is the regularisation parameter and $\|\beta\|^2 = \sum_{j=1}^{M} \beta_j^2$. The estimator $\hat{\beta}$ is the minimizer of (4):

$$\hat{\beta} = \arg\min_{\beta} \left\{ J(\beta, \lambda_2) \right\}. \qquad (5)$$
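The cost function (4) translates directly into code. A minimal sketch in Python/NumPy, where `X`, `y`, `beta`, and `lam2` are generic placeholders for the regression matrix, centred response, coefficient vector, and regularisation parameter:

```python
import numpy as np

def ridge_cost(beta, X, y, lam2):
    """J(beta, lambda_2) = ||y - X beta||^2 + lambda_2 * sum_j beta_j^2  (Eq. 4)."""
    resid = y - X @ beta
    return resid @ resid + lam2 * np.sum(beta ** 2)
```

The estimator (5) is the vector `beta` that minimises this quantity for a fixed `lam2`.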
3 Two-Stage Gene Selection Method
3.1 The Grouping Effect of the Ridge Penalty
Qualitatively speaking, a regression method exhibits the grouping effect if the regression coefficients of a group of highly correlated variables tend to be equal (up to a change of sign if negatively correlated) [8]. In fact, (4) is the ridge optimisation problem. The $L_2$ penalty in ridge regularisation can provide the grouping effect, as shown in the following.
After the regression matrix $X$ is standardised, then obviously

$$X^T X = \begin{bmatrix} 1 & \rho_{12} & \cdots & \rho_{1M} \\ * & 1 & \cdots & \rho_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ * & * & \cdots & 1 \end{bmatrix}, \qquad (6)$$
where $\rho_{i,j}$ is the sample correlation between the $i$-th regressor and the $j$-th regressor, and '$*$' represents the symmetrical structure. The ridge estimator is expressed by
$$\hat{\beta} = (X^T X + \lambda_2 I)^{-1} X^T y. \qquad (7)$$
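Equation (7) gives the ridge estimator in closed form. A minimal numerical sketch in Python/NumPy with synthetic data; the sample sizes, coefficients, and noise level are illustrative assumptions, not values from the source:

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 50, 5

# Synthetic standardised regression matrix and centred response.
X = rng.standard_normal((n, M))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = X @ np.array([1.0, 0.5, 0.0, -0.5, 2.0]) + 0.1 * rng.standard_normal(n)
y = y - y.mean()

# Closed-form ridge estimator of Eq. (7): (X^T X + lambda_2 I)^{-1} X^T y.
lam2 = 1.0
beta_hat = np.linalg.solve(X.T @ X + lam2 * np.eye(M), X.T @ y)
```

Since (4) is convex in $\beta$, `beta_hat` is exactly the point where the gradient of $J(\beta, \lambda_2)$ vanishes, which gives a simple sanity check on the implementation.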
Theorem 1: Suppose that the response $y$ is centred, the regression matrix $X$ is standardised, and $\hat{\beta}$ is the solution of (4). If $\hat{\beta}_i \hat{\beta}_j \neq 0$, then

$$|\hat{\beta}_i - \hat{\beta}_j| \leq \frac{\|y\|}{\lambda_2} \sqrt{2(1 - \rho_{ij})}. \qquad (8)$$
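The bound (8) can be checked numerically. A small sketch, assuming the Euclidean norm for $\|y\|$ and two nearly identical synthetic regressors (the noise levels and sample size are illustrative assumptions): when $\rho_{ij}$ is close to 1, the right-hand side of (8) is small, forcing the two ridge coefficients close together.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Two regressors that are small perturbations of the same latent variable z.
z = rng.standard_normal(n)
X = np.column_stack([z + 0.05 * rng.standard_normal(n) for _ in range(2)])

# Standardise so that diag(X^T X) = 1 and off-diagonals are sample correlations,
# matching the structure of Eq. (6).
X = (X - X.mean(axis=0)) / (X.std(axis=0) * np.sqrt(n))
y = z + 0.1 * rng.standard_normal(n)
y = y - y.mean()

lam2 = 1.0
beta = np.linalg.solve(X.T @ X + lam2 * np.eye(2), X.T @ y)  # Eq. (7)

rho = (X.T @ X)[0, 1]                                   # sample correlation rho_12
bound = np.linalg.norm(y) / lam2 * np.sqrt(2.0 * (1.0 - rho))  # RHS of Eq. (8)
```

With `rho` near 1, `abs(beta[0] - beta[1])` stays below `bound`, illustrating the grouping effect: highly correlated regressors receive nearly equal ridge coefficients.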