Mapping Associations - Mapping Scientific Frontiers: The Quest for Knowledge Visualization

Information Technology Reference

In-Depth Information

Fig. 3.35 The Swiss-roll data set, illustrating how Isomap exploits geodesic paths for nonlinear

dimensionality reduction. Straight lines in the embedding (the blue line in part a ) now represent

simpler and cleaner approximations to the true geodesic paths than do the corresponding graph

paths (the red line in part b ) (Reproduced from Tenenbaum et al. ( 2000 )Fig.3. http://www.

approximation is to transform a non-linear data into many smaller linear data and

then re-construct a global solution from local solutions of linear structures. Both

algorithms explained below are tested on a Swiss-roll-like non-linear data structure

of 20,000 data points.

The Isomap algorithm extracts meaningful dimensions by measuring the distance

between data points along the surface (Tenenbaum et al. 2000 ). Isomap works best

for shapes that can be flattened out, like cylinders or Swiss rolls. Isomap measures

the distance between any two points on the shape, then uses these geodesic distances

in combination with the classic MDS algorithm in order to make a low dimensional

representation of that data. Figure 3.35 demonstrates how Isomap unfolds data

shaped like a Swiss roll.

In the Isomap algorithm, the local quantities computed are the distances between

neighboring data points. For each pair of non-neighboring data points, Isomap finds

the shortest path through the data set connecting them, subject to the constraint

that the path must hop from neighbor to neighbor. The length of this path is

an approximation to the distance between its end points, as measured within the

underlying manifold. Finally, the classical method of MDS is used to find a set

of low-dimensional points with similar pairwise distances. The Isomap algorithm

worked well on several test data, notably face images with three degrees of freedom,

up-down pose, left-right pose, and lighting direction (Fig. 3.36 ) and hand images

with wrist rotation and finger extension as two degrees of freedom (Fig. 3.37 ). In

other words, the true dimension of the face image data is 3 and that of the hand data

is 2. The residual variance of Isomap drops faster than PCA and MDS, which means

that PCA and MDS tend to overestimate the dimensionality, in contrast to Isomap

(Tenenbaum et al. 2000 ).

3.3.5

Locally Linear Embedding

The Locally Linear Embedding (LLE) algorithm uses linear approximation to model

a non-linear manifold (Roweis and Saul 2000 ). It is like using a lot of small pieces

Search WWH ::

Custom Search

Home