Grand Tours, Projection Pursuit Guided Tours, and Manual Controls - Data Visualization

procedure is an important part of a PP guided tour. The purpose of PP optimization is to find all of the interesting projections, so an optimization procedure needs to be flexible enough to find both global and local maxima. Rather than doggedly searching for a global maximum, it should spend some time visiting local maxima.

Posse ( ) compared several optimization procedures and suggested a random search for finding the global maximum of a PP index. Cook et al. ( ) used a derivative-based optimization, always climbing the nearest hill, which, when merged with a grand tour, behaved much like interactive simulated annealing. Klein and Dubes ( ) showed that simulated annealing can produce good results for PP.

Lee et al. ( ) use a modified simulated annealing method. It uses two different temperatures, one for the neighborhood definition and the other (the cooling parameter) for the probability that guards against getting trapped in a local maximum. This allows the algorithm to visit a local maximum and then jump out and look for other maxima. The temperature of the neighborhood is rescaled by the cooling parameter, enabling escape from the local maximum. The optimization algorithm used in GGobi follows these steps:

1. From the current projection, $A_a$, calculate the initial PP index value, $I_0 = I(XA_a)$.

2. Generate new projections, $A_i^* = A_a + cA_i$, from a neighborhood of the current projection, where the size of the neighborhood is specified by the cooling parameter, $c$, and $A_i$ is a random projection.

3. Calculate the index value for each new projection, $I_i$.

4. Set the projection with the highest index value to be the new target, $A_z = A^*_{\max_i I_i}$, and interpolate from $A_a$ to $A_z$.
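The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not GGobi's implementation: the function names (`orthonormalize`, `random_projection`, `search_step`) are hypothetical, `variance_index` is only a stand-in for a real PP index, the perturbed matrix is re-orthonormalized with a QR decomposition (the text does not say how this is done), and the current projection is kept as the target when no candidate improves on it. Interpolation from the current projection to the target is not shown.

```python
import numpy as np


def orthonormalize(A):
    """Return a matrix with orthonormal columns spanning the columns of A."""
    Q, _ = np.linalg.qr(A)
    return Q


def random_projection(p, d, rng):
    """Draw a random p x d projection matrix with orthonormal columns."""
    return orthonormalize(rng.standard_normal((p, d)))


def variance_index(Y):
    """Stand-in PP index (hypothetical): total variance of the projected data."""
    return float(np.var(Y, axis=0).sum())


def search_step(X, A_a, index, c, n_candidates, rng):
    """Run steps 1-4 once; return the target projection and its index value."""
    p, d = A_a.shape
    # Step 1: index value at the current projection.
    best_A, best_I = A_a, index(X @ A_a)
    for _ in range(n_candidates):
        # Step 2: perturb A_a by a random projection scaled by c.
        # Re-orthonormalizing the sum is an assumption of this sketch.
        A_star = orthonormalize(A_a + c * random_projection(p, d, rng))
        # Step 3: index value of the candidate.
        I_i = index(X @ A_star)
        # Step 4: remember the best projection seen so far.
        if I_i > best_I:
            best_A, best_I = A_star, I_i
    return best_A, best_I


# Usage: a 1-D projection of 3-D data whose first variable has the most spread.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3)) * np.array([3.0, 1.0, 1.0])
A_a = random_projection(3, 1, rng)
A_z, I_z = search_step(X, A_a, variance_index, c=0.5, n_candidates=20, rng=rng)
```

A smaller `c` keeps candidates close to the current projection, while a larger `c` lets the search jump further away, which is the role the text assigns to the cooling parameter.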

Figure . (top two plots) shows a PP guided tour path ( -D in three dimensions). It looks very similar to a grand tour path, but there is a big difference: the path repeatedly returns to the same projection and its negative counterpart (both highlighted by large solid black circles). The middle plot traces the PP index value over time. The path alternates between optimizing the PP function and random target basis selection. The peaks (highlighted by large solid black circles) are the maxima of the PP index, and for the most part, these are at the same projection. The corresponding data projections (approximately the positive and negative of the same vector) are shown in the bottom row. The index is responding to a bimodal pattern in the data.

There are numerous PP indices. Here are a few that are used in GGobi. For simplicity in the formulas for the holes, central mass, and PCA indices, it is assumed that X is sphered using PCA, that is, the mean is zero and the variance-covariance matrix is equal to the identity matrix. This assumption is not necessary for the LDA index.
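Sphering with PCA can be illustrated with a short NumPy sketch. The function name `sphere` is hypothetical, and the sketch assumes that X has at least two columns and that its covariance matrix has full rank, so that every eigenvalue is strictly positive.

```python
import numpy as np


def sphere(X):
    """Center X, rotate onto its principal axes, and rescale to unit variance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)           # sample variance-covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition (symmetric case)
    # Project onto the eigenvectors and divide each column by its standard deviation.
    return Xc @ eigvecs / np.sqrt(eigvals)


# Usage: correlated data with a non-zero mean.
rng = np.random.default_rng(1)
mix = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.5]])
X = rng.standard_normal((500, 3)) @ mix + 5.0
Z = sphere(X)
```

After this transformation the columns of `Z` have zero mean and an identity sample covariance matrix, which is the condition the index formulas assume.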

Holes:

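$$I_{\mathrm{Holes}}(A) = \frac{1 - \frac{1}{n}\sum_{i=1}^{n} \exp\left(-\frac{1}{2}\, y_i' y_i\right)}{1 - \exp\left(-\frac{p}{2}\right)}$$

where $XA = Y = [y_1, y_2, \ldots, y_n]^T$ is an $n \times d$ matrix of the projected data.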
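As a rough illustration, the holes index can be computed directly from the projected data. The sketch below assumes the index has the form (1 - mean over i of exp(-y_i'y_i / 2)) / (1 - exp(-p / 2)). The function name `holes_index` and the choice to pass the constant in the denominator as the argument `p` are made here for illustration and are not taken from GGobi.

```python
import numpy as np


def holes_index(Y, p):
    """Holes index of projected data Y (n x d), under the form assumed above."""
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]                       # treat a vector as an n x 1 projection
    sq_norms = np.sum(Y ** 2, axis=1)        # y_i' y_i for each observation
    mean_kernel = np.mean(np.exp(-0.5 * sq_norms))
    return (1.0 - mean_kernel) / (1.0 - np.exp(-p / 2.0))


# Usage: points piled at the origin versus points far from it.
at_center = holes_index(np.zeros((10, 2)), p=4)
far_away = holes_index(np.full((5, 2), 100.0), p=4)
```

The numerator is small when many projected points sit near the origin and grows as the center of the projection empties out, so larger values point to a hole in the middle of the projected data.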
