Semi-supervised Dimension Reduction
with Kernel Sliced Inverse Regression
Chiao-Ching Huang 1 and Kuan-Ying Su 2
1 Department of Computer Science and Information Engineering,
National Taiwan University,
2 National Taiwan University of Science and Technology, Taipei, Taiwan
Abstract. This study draws on research in semi-supervised dimension
reduction. Many real-world problems can be formulated as semi-supervised
problems, since labeled data is much harder to obtain than unlabeled
data. Dimension reduction improves computational performance and is
commonly applied to problems with high-dimensional data. This paper
proposes a semi-supervised dimension reduction method based on kernel
sliced inverse regression (KSIR). The prior information is used to
estimate the statistical parameters in the KSIR formulation. The
semi-supervised KSIR performs comparably to other established methods
but is much more efficient.
1 Introduction
Dimension reduction is one of the most important research topics in machine
learning and data mining, as it improves computational performance and
enables data visualization. In machine learning, tasks can be classified
according to the availability of labeled data, the so-called supervised
information. Semi-supervised learning (SSL) investigates problems in which
only part of the input data is labeled. Whereas supervised learning, also
called classification, learns from fully labeled data and then predicts
labels for unseen data, SSL focuses on problems consisting of mostly
unlabeled data and only a few labeled examples. Many real-world problems
naturally fall into the SSL setting, since raw data is readily available
but labels are fairly expensive to obtain.
Existing dimension reduction methods can likewise be categorized as super-
vised, semi-supervised, or unsupervised according to the availability of
labeled data. Fisher Linear Discriminant (FLD) [3] is a supervised dimension
reduction method that extracts the optimal discriminant vectors using the
supervised information. A well-known example of an unsupervised dimension
reduction method is Principal Component Analysis (PCA) [5], which finds the
principal components, orthogonal projections that preserve the covariance
structure of the data, without using any label information.
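The PCA procedure described above can be sketched briefly: center the data, form the sample covariance matrix, and project onto the eigenvectors with the largest eigenvalues. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def pca(X, k):
    """Project n x d data matrix X onto its top-k principal components.

    A minimal sketch of PCA: the principal components are the
    eigenvectors of the sample covariance matrix with the largest
    eigenvalues, giving orthogonal directions that preserve the
    covariance structure of the data.
    """
    Xc = X - X.mean(axis=0)                 # center the data
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :k]             # top-k eigenvectors as columns
    return Xc @ W                           # reduced k-dimensional data

# Example: reduce 5-dimensional points to 2 dimensions
X = np.random.RandomState(0).randn(100, 5)
Z = pca(X, 2)
print(Z.shape)  # (100, 2)
```

Note that no label information enters the computation, which is precisely why PCA is an unsupervised method.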
Whereas PCA [5] is an unsupervised dimension reduction method that
tries to retain the covariance structure of the original data without
class information, Sliced Inverse Regression (SIR) [8] exploits such prior
knowledge for dimension reduction. As a supervised dimension reduction method,