Semi-supervised Dimension Reduction
with Kernel Sliced Inverse Regression
Chiao-Ching Huang 1 and Kuan-Ying Su 2
1 Department of Computer Science and Information Engineering,
National Taiwan University,
2 National Taiwan University of Science and Technology, Taipei, Taiwan
Abstract. This study draws on research in semi-supervised dimension
reduction. Many real-world problems can be formulated as semi-supervised
problems, since labeled data is much harder to obtain than unlabeled
data. Dimension reduction improves computational performance and is
commonly applied to problems with high-dimensional data. This paper
proposes a semi-supervised dimension reduction method based on kernel
sliced inverse regression (KSIR). The prior information is used to
estimate the statistical parameters in the KSIR formulation. The
semi-supervised KSIR performs comparably to other established methods
but is much more efficient.
1 Introduction
Dimension reduction is one of the most important research topics in machine
learning and data mining, as it improves computational performance and
enables data visualization. In machine learning, tasks can be classified
according to the availability of labeled data, the so-called supervised
information. Semi-supervised learning (SSL) investigates problems in which
only part of the input data is labeled. Whereas supervised learning, also
called classification, learns from fully labeled data and then predicts
labels for unseen data, SSL focuses on problems consisting of mostly
unlabeled data and only a few labeled examples. Many real-world problems
naturally fall into the SSL setting, since raw data is readily available
but labels are fairly expensive to obtain.
Existing dimension reduction methods can likewise be categorized as super-
vised, semi-supervised, or unsupervised according to the availability of
labeled data. Fisher Linear Discriminant (FLD) [3] is a supervised dimension
reduction method that extracts the optimal discriminant vectors using the
supervised information. A well-known example of an unsupervised dimension
reduction method is Principal Component Analysis (PCA) [5], which finds the
principal components, orthogonal projections that preserve the covariance
structure of the data, without using any label information.
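The PCA procedure described above can be sketched briefly: center the data, form the sample covariance matrix, and project onto the eigenvectors with the largest eigenvalues. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def pca(X, k):
    """Project n x d data matrix X onto its top-k principal components.

    A minimal sketch of PCA: the principal components are the
    eigenvectors of the sample covariance matrix with the largest
    eigenvalues, giving orthogonal directions that preserve the
    covariance structure of the data.
    """
    Xc = X - X.mean(axis=0)                 # center the data
    cov = Xc.T @ Xc / (len(X) - 1)          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    W = eigvecs[:, ::-1][:, :k]             # top-k eigenvectors as columns
    return Xc @ W                           # reduced k-dimensional data

# Example: reduce 5-dimensional points to 2 dimensions
X = np.random.RandomState(0).randn(100, 5)
Z = pca(X, 2)
print(Z.shape)  # (100, 2)
```

Note that no label information enters the computation, which is precisely why PCA is an unsupervised method.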
Whereas PCA [5] is an unsupervised dimension reduction method that
tries to retain the covariance structure of the original data without
class information, Sliced Inverse Regression (SIR) [8] exploits such prior
knowledge for dimension reduction. As a supervised dimension reduction method,