SIR extracts the effective dimension reduction (e.d.r.) subspace by using the
known class information as the responses in the regression formulation. A notable semi-supervised
dimension reduction method was proposed in [13], which exploits
pairwise constraints as the semi-supervised information and formulates them as
an objective function for optimization. The use of pairwise constraints for dimension
reduction can also be found in [10].
In this paper, the data notation is defined as follows: $A = [x_1; \cdots; x_n] \in \mathbb{R}^{n \times p}$ is
the data matrix of input attributes and $y = [y_1; \ldots; y_n] \in \mathbb{R}^{n}$ is the corresponding
response, i.e., the labels. In semi-supervised problems, a large portion of $y$ is unknown
but fixed.
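As a small illustration of this setup (ours, not from the paper; the value -1 is simply an arbitrary marker for an unobserved label), the data matrix and a partially observed response can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
A = rng.normal(size=(n, p))            # data matrix A in R^{n x p}
y = rng.integers(0, 3, size=n)         # responses/labels y in R^n

# Semi-supervised setting: a large portion of y is unknown but fixed;
# here unobserved labels are masked with -1 (an arbitrary placeholder).
observed = rng.random(n) < 0.2         # e.g., only ~20% of labels observed
y_semi = np.where(observed, y, -1)
```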
2 Supervised Kernel SIR for Dimension Reduction
Sliced inverse regression (SIR) [8] shows that the e.d.r. subspace can be estimated
from the leading directions of the central inverse regression function, i.e., the most
informative directions in the input pattern space with the largest variation. SIR
finds the dimension reduction directions by solving the following generalized
eigenvalue problem:
$\Sigma_{E(A|Y_J)} \beta = \lambda \Sigma_A \beta$,   (1)
where $\Sigma_A$ is the covariance matrix of $A$, $Y_J$ denotes the membership in $J$ slices,
and $\Sigma_{E(A|Y_J)}$ denotes the between-slice covariance matrix based on slice means,
given by
$\Sigma_{E(A|Y_J)} = \frac{1}{n} \sum_{j=1}^{J} n_j (\bar{x}_j - \bar{x})(\bar{x}_j - \bar{x})^\top$.   (2)
Here $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the grand mean, $\bar{x}_j = \frac{1}{n_j}\sum_{i \in S_j} x_i$ is the mean value of
the $j$th slice, $S_j$ is the index set for the $j$th slice, and $n_j$ is the size of the $j$th slice. Note that
the slices are formed from $A$ according to the responses $Y$.
In supervised problems, $\bar{x}_j$ is simply the class mean of the input attributes for
the $j$th class, i.e., the slices are replaced by the classes. An equivalent way
to model SIR is by the following optimization problem:
$\max_{\beta \in \mathbb{R}^p} \; \beta^\top \Sigma_{E(A|Y_J)} \beta \quad \text{subject to} \quad \beta^\top \Sigma_A \beta = 1$.   (3)
The solution, denoted by $\beta_1$, gives the first e.d.r. direction, such that the slice means
projected along $\beta_1$ are maximally spread out, where $\beta_1$ is normalized with respect to
the sample covariance matrix $\Sigma_A$. Repeatedly solving this optimization problem
with the orthogonality constraints $\beta_k^\top \Sigma_A \beta_l = \delta_{k,l}$, where $\delta_{k,l}$ is the Kronecker
delta, yields the sequence of solutions $\beta_1, \ldots, \beta_d$, which forms the e.d.r. basis. Some insightful
discussion of the SIR methodology and its applications can be found in [12, 8].
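As a concrete illustration, the following minimal sketch (ours, not from the paper; it assumes the class labels are used directly as slices and that $\Sigma_A$ is positive definite) forms the between-slice covariance of Eq. (2) and solves the generalized eigenvalue problem of Eq. (1) with NumPy/SciPy:

```python
import numpy as np
from scipy.linalg import eigh

def sir_directions(A, y, n_dirs=2):
    """Estimate e.d.r. directions via SIR; y is assumed to hold the slice/class labels."""
    n, p = A.shape
    grand_mean = A.mean(axis=0)

    # Between-slice covariance Sigma_{E(A|Y_J)} of Eq. (2).
    Sigma_between = np.zeros((p, p))
    for label in np.unique(y):
        A_j = A[y == label]                      # members of the j-th slice
        d = A_j.mean(axis=0) - grand_mean        # \bar{x}_j - \bar{x}
        Sigma_between += A_j.shape[0] * np.outer(d, d)
    Sigma_between /= n

    # Sample covariance Sigma_A of the inputs (assumed positive definite here).
    Sigma_A = np.cov(A, rowvar=False)

    # Generalized eigenvalue problem of Eq. (1): eigh returns eigenvalues in
    # ascending order and normalizes eigenvectors so that beta' Sigma_A beta = 1.
    _, eigvecs = eigh(Sigma_between, Sigma_A)
    return eigvecs[:, ::-1][:, :n_dirs]          # leading directions beta_1, ..., beta_d
```

Projecting the data as `A @ sir_directions(A, y, d)` then gives a $d$-dimensional reduced representation in which the slice means are maximally spread out, as described after Eq. (3).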
Since the classical SIR is designed to find a linear transformation from the input
space to a low-dimensional subspace that retains as much information as pos-
sible about the output variable $y$, it may perform poorly in non-linear tasks.
To solve the linearity problem, kernel sliced inverse regression (KSIR) [11]
 