Information Technology Reference
In-Depth Information
Fig. 3.35 The Swiss-roll data set, illustrating how Isomap exploits geodesic paths for nonlinear
dimensionality reduction. Straight lines in the embedding (the blue line in part a ) now represent
simpler and cleaner approximations to the true geodesic paths than do the corresponding graph
paths (the red line in part b ) (Reproduced from Tenenbaum et al. ( 2000 )Fig.3. http://www.
sciencemag.org/cgi/content/full/290/5500/2319/F3 )
approximation is to transform a non-linear data into many smaller linear data and
then re-construct a global solution from local solutions of linear structures. Both
algorithms explained below are tested on a Swiss-roll-like non-linear data structure
of 20,000 data points.
The Isomap algorithm extracts meaningful dimensions by measuring the distance
between data points along the surface (Tenenbaum et al. 2000 ). Isomap works best
for shapes that can be flattened out, like cylinders or Swiss rolls. Isomap measures
the distance between any two points on the shape, then uses these geodesic distances
in combination with the classic MDS algorithm in order to make a low dimensional
representation of that data. Figure 3.35 demonstrates how Isomap unfolds data
shaped like a Swiss roll.
In the Isomap algorithm, the local quantities computed are the distances between
neighboring data points. For each pair of non-neighboring data points, Isomap finds
the shortest path through the data set connecting them, subject to the constraint
that the path must hop from neighbor to neighbor. The length of this path is
an approximation to the distance between its end points, as measured within the
underlying manifold. Finally, the classical method of MDS is used to find a set
of low-dimensional points with similar pairwise distances. The Isomap algorithm
worked well on several test data, notably face images with three degrees of freedom,
up-down pose, left-right pose, and lighting direction (Fig. 3.36 ) and hand images
with wrist rotation and finger extension as two degrees of freedom (Fig. 3.37 ). In
other words, the true dimension of the face image data is 3 and that of the hand data
is 2. The residual variance of Isomap drops faster than PCA and MDS, which means
that PCA and MDS tend to overestimate the dimensionality, in contrast to Isomap
(Tenenbaum et al. 2000 ).
3.3.5
Locally Linear Embedding
The Locally Linear Embedding (LLE) algorithm uses linear approximation to model
a non-linear manifold (Roweis and Saul 2000 ). It is like using a lot of small pieces
Search WWH ::




Custom Search