Huge Multidimensional Data Visualization: Back to the Virtue of Principal Coordinates and Dendrograms in the New Computer Age - Data Visualization

4.5 Principal Axis Methods and Classification: a Unified View

The current state of knowledge in numerical analysis, together with powerful modern PCs, allows us to make successful use of the computational aspects of multidimensional data analysis (MDA). However, there are many analysis strategies that offer good solutions without loss of efficiency.

In the previous sections we have shown the centrality of the distance in factorial and clustering methods. This common element has been largely used to perform two-step analysis, namely, using both factorial and cluster analysis. Automatic classification techniques are used to group objects described by a set of variables; they do not make any claim to optimality. Nevertheless, they give relatively fast, economical, and easily interpretable results. PCA and other factorial methods rarely provide an exhaustive analysis of a set of data. Therefore, it is useful to perform a clustering of the observations because this helps to reduce the FA complexity. Additionally, it is of value to use classification analysis to summarize the configuration of points obtained from a principal axis analysis. In other words, a further reduction in the dimensionality of the data is valuable and leads to results that are easier to analyze. So-called “tandem analysis” represents a unified approach in which FA and clustering criteria, both based on the same notion of distance, are simultaneously satisfied in an iterative model (Vichi and Kiers, ).
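
The two-step idea described above (principal axes first, clustering of the resulting coordinates second) can be sketched in a few lines. This is only an illustration under stated assumptions: the data matrix is made up, the principal axes are obtained by power iteration with deflation on the covariance matrix, and the clustering step is a plain k-means with a deterministic farthest-first start. All function names are hypothetical. It performs the two steps in sequence and is not the simultaneous iterative model of Vichi and Kiers.

```python
import math
import random

def center(rows):
    """Subtract the column means so that each variable has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance(X):
    """Covariance matrix of already-centred rows (divisor n)."""
    n, p = len(X), len(X[0])
    return [[sum(r[a] * r[b] for r in X) / n for b in range(p)] for a in range(p)]

def principal_axes(C, k, iters=200, seed=0):
    """First k eigenpairs of a symmetric matrix: power iteration plus deflation."""
    rng = random.Random(seed)
    p = len(C)
    C = [row[:] for row in C]  # work on a copy, it is deflated in place
    axes, eigvals = [], []
    for _axis in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(p)]
        for _step in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[a] * C[a][b] * v[b] for a in range(p) for b in range(p))
        axes.append(v)
        eigvals.append(lam)
        for a in range(p):  # remove the component just found
            for b in range(p):
                C[a][b] -= lam * v[a] * v[b]
    return axes, eigvals

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; the start is farthest-first from the first point."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda pt: min(sqdist(pt, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda c: sqdist(pt, centers[c])) for pt in points]
        if new == labels:
            break
        labels = new
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Made-up data: 8 observations on 3 variables, forming two visible groups.
data = [
    [1.0, 2.0, 1.5], [1.2, 1.9, 1.4], [0.9, 2.1, 1.6], [1.1, 2.2, 1.5],
    [5.0, 6.0, 5.5], [5.2, 5.9, 5.4], [4.9, 6.1, 5.6], [5.1, 6.2, 5.5],
]

X = center(data)
axes, eigvals = principal_axes(covariance(X), k=2)
# Step 1: coordinates of each observation on the first two principal axes.
scores = [[sum(x * a for x, a in zip(row, axis)) for axis in axes] for row in X]
# Step 2: cluster the observations in the reduced space.
labels, centroids = kmeans(scores, k=2)
print(labels)
```

Because both steps rely on Euclidean distance, the groups found in the reduced space can be read directly on the factorial plane, and the centroids give their relative positions.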

All methods of multivariate descriptive statistical analysis are used in the same situation, where the user is faced with a rectangular matrix. This matrix may be a contingency table, a binary matrix (with values of 0 or 1 according to whether an object has a certain attribute), or a matrix of numerical values. The use of automatic classification techniques implies some basic underlying concepts with respect to the purpose of the analysis. Either it is assumed that certain groups must exist among the observations or, on the contrary, the analysis requires a grouping of the observations. In other words, a 2-D continuous visualization of the statistical relationships is not enough. There is also an interest in uncovering groups of individuals or of characteristics.
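
Since distance is the element shared by factorial and clustering methods, it may help to see how a distance between two rows could be computed for each of the three kinds of matrix mentioned above. The choices below are common ones and are assumptions made for illustration, not necessarily the ones intended in the text: Euclidean distance for numerical values, Jaccard distance for 0/1 attributes, and the chi-square distance between row profiles for a contingency table. The function names and the small tables are made up, and the chi-square version assumes no empty rows or columns.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two rows of numerical values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def jaccard_distance(x, y):
    """1 minus (attributes shared by both) / (attributes held by either), for 0/1 rows."""
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    return 0.0 if either == 0 else 1.0 - both / either

def chi2_distance(table, i, k):
    """Chi-square distance between the profiles of rows i and k of a contingency table."""
    total = sum(sum(row) for row in table)
    n_cols = len(table[0])
    col_mass = [sum(row[j] for row in table) / total for j in range(n_cols)]
    row_i, row_k = sum(table[i]), sum(table[k])
    return math.sqrt(sum(
        (table[i][j] / row_i - table[k][j] / row_k) ** 2 / col_mass[j]
        for j in range(n_cols)
    ))

# Made-up contingency table: rows 0 and 1 have the same profile (1/3, 2/3).
counts = [[10, 20], [20, 40], [30, 10]]

print(euclidean([0, 0], [3, 4]))
print(jaccard_distance([1, 1, 0, 0], [1, 0, 1, 0]))
print(chi2_distance(counts, 0, 1), chi2_distance(counts, 0, 2))
```

Rows with proportional counts have identical profiles, so their chi-square distance is zero even though the raw counts differ.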

A given set of results might be reached through different steps and might lead to different interpretations. For example, the problem may be to discover a partition that really exists and that was hypothesized before carrying out the statistical analysis. Conversely, it may be useful to employ partitions as tools or as surrogates in the computations that make it easier to explore the data. In any case, using principal axis methods in conjunction with classification makes it possible to identify groups and to determine their relative positions.
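
As one illustration of using classification on top of a principal axis analysis, the sketch below builds a hierarchy of partitions from points that are assumed to be coordinates on the first two principal axes. The coordinates are made up, the function names are hypothetical, and single linkage is used only because it is short to write; the text does not prescribe a particular aggregation criterion.

```python
import math

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def single_linkage(points):
    """Merge the two closest clusters repeatedly.

    Returns the list of merges (cluster_a, cluster_b, distance) and the
    partition obtained after each merge, starting from singletons.
    """
    clusters = [[i] for i in range(len(points))]
    hierarchy = [[list(c) for c in clusters]]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
        hierarchy.append([list(c) for c in clusters])
    return merges, hierarchy

def partition_with(hierarchy, k):
    """Partition of the hierarchy that has exactly k clusters."""
    return hierarchy[len(hierarchy) - k]

# Made-up principal coordinates of six observations (axis 1, axis 2).
coords = [(-2.0, 0.1), (-1.8, -0.1), (-2.1, 0.0),
          (1.9, 0.2), (2.1, -0.2), (2.0, 0.0)]

merges, hierarchy = single_linkage(coords)
two_groups = partition_with(hierarchy, 2)
print(two_groups)
for left, right, level in merges:
    print(left, right, round(level, 3))
```

The merge levels are the heights at which a dendrogram would join its branches; a large jump between consecutive levels suggests where the tree could be cut to obtain a partition.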

Often partitions or tree structures are used to amplify the results of preliminary principal axis analysis during the exploratory phases of data analysis. There are several families of classification algorithms: agglomerative algorithms, in which the clusters are built by successive pairwise agglomeration of objects and which provide a hierarchy of partitions of the objects; divisive algorithms, which proceed by successive dichotomizations of entire sets of objects and which also provide a hierarchy of parti-
