Graphics Reference
In-Depth Information
In the case of classification, the class information can be included in the distance as
d rs = (
1
α)
d rs + α
c rs
where c rs is the “distance” between the classes x r and x s belong to. This interclass
distance should be supplied subjectively and
α
could be optimized using CV.
6.2.4 Locally Linear Embedding
Locally Linear Embedding (LLE) recovers global nonlinear structure from locally
linear fits [ 25 ]. Its main idea is that each local patch of the manifold can be approxi-
mated linearly and given enough data, each point can be written as a linear, weighted
sum of its neighbors.
The LLE algorithm is based on simple geometric intuitions. Suppose the data
consists of N real-valued vectors X i , each of dimensionality D , sampled from some
smooth underlying manifold. It is expected that each data point and its neighbors
to lie on or close to a locally linear patch of the manifold. The local geometry of
these patches can be characterized by linear coefficients that reconstruct each data
point from its neighbors. In the simplest formulation of LLE, the KNN are estimated
per data point, as measured by Euclidean distance. Reconstruction errors are then
measured by the cost function:
2
ε(
W
) =
X i
W ij X j
i
j
which adds up the squared distances between all the data points and their recon-
structions. The weights W ij summarize the contribution of the j th data point to the
i sth reconstruction. To compute the weights W ij , it is necessary to minimize the cost
function subject to two constraints: first, that each data point X i is reconstructed only
from its neighbors, enforcing W ij =
0if X j does not belong to this set; second, that
the rows of the weight matrix sum to one: j W ij =
1 s. The optimal weights W ij
subject to these constraints are found by solving a least squares problem.
The constrained weights that minimize these reconstruction errors are invariant
to rotations, scaling, and translations of that data point and its neighbors. Suppose
the data lie on or near a smooth nonlinear manifold of dimensionality d
D .
To achieve a good approximation, then, there exists a linear mapping that maps the
high dimensional coordinates of each neighborhood to global internal coordinates on
the manifold. By design, the reconstruction weights W ij reflect intrinsic geometric
properties of the data that are invariant to exactly such transformations. We therefore
expect their characterization of local geometry in the original data space to be equally
valid for local patches on the manifold. In particular, the same weights W ij that
 
Search WWH ::




Custom Search