In Section 2 we will review related work in dimensionality reduction, unsupervised
regression, and KNN regression. Section 3 presents the concept of UNN regression,
and two iterative strategies that are based on fixed latent space topologies. In Section 4
we extend UNN to robust loss functions, i.e., the ε-insensitive loss. In Section 5 we
will show how the constructive variants can be extended to handle incomplete data.
Conclusions are drawn in Section 6.
2 Related Work
Dimensionality reduction is the problem of learning a mapping from a high-dimensional data space to a space of lower dimensionality while losing as little information as possible. Many dimensionality reduction methods have been proposed in the past; a very famous one is principal component analysis (PCA), which assumes linearity of the manifold [14,22]. An extension for learning non-linear manifolds is kernel PCA [25] that
projects the data into a Hilbert space. Further famous approaches for dimensionality
reduction are ISOMAP by Tenenbaum et al. [28], locally linear embedding (LLE) by
Roweis and Saul [23], and principal curves by Hastie and Stuetzle [12]. An introduction
to other dimensionality reduction methods can be found in machine learning textbooks
like [4], and [11].
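Since PCA assumes a linear manifold, its projection can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation; the function and variable names are our own:

```python
import numpy as np

def pca(Y, q):
    """Project d-dimensional patterns onto the top-q principal components.

    Y: (N, d) data matrix; q: target dimensionality (q < d).
    Returns the (N, q) low-dimensional representation.
    """
    Yc = Y - Y.mean(axis=0)                      # center the data
    # principal directions via SVD of the centered data matrix
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    return Yc @ Vt[:q].T                         # project onto top-q directions

# toy example: 3-D points lying near a 1-D linear manifold
rng = np.random.default_rng(0)
t = rng.standard_normal(100)
Y = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.standard_normal((100, 3))
X = pca(Y, 1)
print(X.shape)  # (100, 1)
```

For data that actually lies near a linear subspace, as in the toy example, one principal component captures nearly all of the variance; the non-linear methods above address the case where this assumption fails.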
2.1 Unsupervised Regression
The work on unsupervised regression for dimensionality reduction started with Meinicke
[20], who introduced the corresponding algorithmic framework for the first time. In this
line of research early work concentrated on non-parametric kernel density regression,
i.e., the counterpart of the Nadaraya-Watson estimator denoted as unsupervised kernel
regression (UKR) [21].
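As a point of reference, the Nadaraya-Watson estimator that UKR builds on can be sketched as follows. This is a minimal sketch rather than the UKR method of [21]; the Gaussian kernel, the bandwidth h, and all names are illustrative choices:

```python
import numpy as np

def nadaraya_watson(x, X, Y, h=1.0):
    """Kernel regression: f(x) = sum_i K_h(x - x_i) y_i / sum_j K_h(x - x_j).

    x: query point (q,); X: (N, q) inputs; Y: (N, d) outputs; h: bandwidth.
    """
    d2 = np.sum((X - x) ** 2, axis=1)        # squared distances to all inputs
    w = np.exp(-d2 / (2 * h ** 2))           # Gaussian kernel weights
    return w @ Y / w.sum()                   # kernel-weighted average of outputs

# toy usage: estimate y = sin(x) at x = pi/2 from 50 clean samples
X = np.linspace(0, np.pi, 50)[:, None]
Y = np.sin(X)
print(nadaraya_watson(np.array([np.pi / 2]), X, Y, h=0.2))
```

In UKR the inputs X of this estimator are the latent points themselves, which are treated as free parameters and optimized rather than observed.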
Unsupervised regression works as follows. Let $\mathbf{Y} = (\mathbf{y}_1, \ldots, \mathbf{y}_N)$ with $\mathbf{y}_i \in \mathbb{R}^d$ be
the matrix of high-dimensional patterns in data space. We seek a low-dimensional representation, i.e., a matrix of latent points $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_N)$, so that a regression function $f$ applied to $\mathbf{X}$ point-wise optimally reconstructs the patterns $\mathbf{Y}$, i.e., we search for an $\mathbf{X}$ that minimizes the reconstruction error in data space. The optimization problem can
be formalized as follows:
minimize $E(\mathbf{X}) = \frac{1}{2N} \, \| \mathbf{Y} - f(\mathbf{X}; \mathbf{X}) \|_F^2$ \hfill (1)
$E(\mathbf{X})$ is called the data space reconstruction error (DSRE). The latent points $\mathbf{X}$ define the low-dimensional representation. The regression function $f$ applied to the latent points should optimally reconstruct the high-dimensional patterns; the chosen regression model $f$ thereby imposes its characteristics on the mapping.
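Instantiating $f$ with a K-nearest neighbor model, the DSRE can be evaluated directly. The following is a minimal sketch, not the authors' implementation; the 1/(2N) normalization follows Eq. (1), while the neighbor handling (each point counts as its own nearest neighbor) and all names are illustrative assumptions:

```python
import numpy as np

def knn_regression(X, Y, K):
    """Reconstruct each pattern y_i as the mean of the patterns whose latent
    points are the K nearest neighbors of x_i (x_i itself included)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise latent distances
    nn = np.argsort(d2, axis=1)[:, :K]                          # K nearest latent neighbors
    return Y[nn].mean(axis=1)                                   # (N, d) reconstructions

def dsre(X, Y, K=2):
    """Data space reconstruction error E(X) = 1/(2N) * ||Y - f(X)||_F^2."""
    N = Y.shape[0]
    R = knn_regression(X, Y, K)
    return np.sum((Y - R) ** 2) / (2 * N)

# latent points ordered consistently with the patterns yield a low DSRE
X = np.arange(5, dtype=float)[:, None]
Y = np.column_stack([np.arange(5.0), np.arange(5.0)])
print(dsre(X, Y, K=2))
```

Permuting the rows of X while keeping Y fixed destroys the neighborhood structure and increases the DSRE, which is exactly the signal the latent-point optimization exploits.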
Most unsupervised regression approaches are based on the iterative improvement of
a spectral embedding solution, and work as follows: