data set can easily be explored by visual inspection of a 2D histogram or an
xy plot, the graphical display of a three variable data set requires a projection
of the 3D distribution of data points into 2D. A higher number of variables
cannot be displayed directly, or even imagined. One solution to the problem of
visualizing high-dimensional data sets is to reduce their dimensionality. A
number of methods group highly correlated variables contained in the data
set and then explore the resulting small number of groups.
The classic methods to reduce dimensionality are principal component
analysis (PCA) and factor analysis (FA). These methods seek the
directions of maximum variance in the data set and use these as new
coordinate axes. The advantage of replacing the variables by new groups of
variables is that the groups are uncorrelated. Moreover, these groups often
help to interpret the multivariate data set since they often contain valuable
information on the process that generated the distribution of data points.
In a geochemical analysis of magmatic rocks, the groups defined by the method
usually contain chemical elements with similar ion size that are observed in
similar locations in the lattice of certain minerals. Examples of such
behavior are Si⁴⁺ and Al³⁺, and Fe²⁺ and Mg²⁺, in silicates.
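The two properties named above, new axes along the directions of maximum variance and uncorrelated new variables, can be checked on a small two-variable case. The following is a minimal sketch in plain Python, not the MATLAB functions used later in the chapter. The function names `covariance` and `pca_2d` and the data values are hypothetical and chosen only for illustration. The sketch relies on the closed-form rotation angle for a 2x2 covariance matrix instead of a general eigenvalue solver.

```python
import math

def covariance(x, y):
    """Sample covariance of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def pca_2d(x, y):
    """PCA of two variables via the rotation that diagonalizes the 2x2 covariance matrix."""
    sxx, syy, sxy = covariance(x, x), covariance(y, y), covariance(x, y)
    # Angle of the direction of maximum variance (first principal axis).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    c, s = math.cos(theta), math.sin(theta)
    axes = [(c, s), (-s, c)]  # orthogonal unit vectors: the new coordinate axes
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    # Project the centered data onto each new axis.
    scores = [[(a - mean_x) * u + (b - mean_y) * v for a, b in zip(x, y)]
              for u, v in axes]
    variances = [covariance(sc, sc) for sc in scores]
    return axes, scores, variances

# Hypothetical, strongly correlated measurements (illustrative values only).
x = [2.0, 3.1, 4.2, 5.0, 6.1, 7.3]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]

axes, scores, variances = pca_2d(x, y)
print("variance along each new axis:", variances)
print("covariance between the new variables:", covariance(scores[0], scores[1]))
```

With these values almost all of the variance falls on the first new axis, and the covariance between the two new variables vanishes up to rounding error. The total variance is unchanged by the rotation.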
The second important suite of multivariate methods aims to group objects
by their similarity. As an example, cluster analysis (CA) is often
applied to correlate volcanic ashes as described in the above example.
Tephrochronology tries to correlate tephra by means of their geochemical
fingerprint. In combination with a few radiometric age determinations of
the key ashes, this method allows the correlation of sedimentary sequences
that contain these ashes (e.g., Westgate 1998, Hermanns et al. 2000). More
examples of the application of cluster analysis come from the field of
micropaleontology. In this context, multivariate methods are employed to
compare microfossil assemblages such as pollen, foraminifera or diatoms
(e.g., Birks and Gordon 1985).
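Grouping objects by similarity can be illustrated with a very small agglomerative clustering example. The sketch below is plain Python and assumes Euclidean distance and single linkage, which is only one of several possible choices and is not taken from the text. The sample names and the two-component compositions are invented for illustration and are not real tephra data.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equally long tuples of measurements."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def single_linkage(samples, n_clusters):
    """Merge the two closest clusters until only n_clusters remain."""
    clusters = [[name] for name in samples]

    def cluster_distance(pair):
        # Single linkage: distance between the two closest members.
        i, j = pair
        return min(euclidean(samples[a], samples[b])
                   for a in clusters[i] for b in clusters[j])

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2), key=cluster_distance)
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return sorted(sorted(c) for c in clusters)

# Hypothetical ash samples: (SiO2, K2O) in weight percent, invented values.
ashes = {
    "ash_A": (75.1, 3.2),
    "ash_B": (74.8, 3.3),
    "ash_C": (68.0, 4.9),
    "ash_D": (68.4, 5.1),
    "ash_E": (75.3, 3.1),
}

groups = single_linkage(ashes, n_clusters=2)
print(groups)
```

In this sketch the raw values are compared directly. With variables of very different magnitude, a standardization step before computing distances would usually be worth considering.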
The following text introduces the most important techniques of multivariate
statistics, principal component analysis and cluster analysis (Chapters 9.2
and 9.3). A nonlinear extension of PCA is independent component
analysis (ICA) (Chapter 9.4). First, each chapter provides an introduction to
the theory behind the technique. Subsequently, the use of these methods in
analyzing earth sciences data is illustrated with MATLAB functions.
9.2 Principal Component Analysis
The principal component analysis (PCA) detects linear dependencies be-