variables. CVA requires that the individuals be grouped, because the objective of the
method is to analyze the relative positions of those groups. Consequently, the sample
must be divided into groups before the analysis begins. The description of differences
between groups is optimized relative to the variation within those groups. That optimization
requires a few more computational steps than PCA, but none of the steps in CVA
introduce new mathematical concepts. CVA will be just a new application of ideas you
have already encountered in the discussion of PCA.
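
The idea of describing between-group differences relative to within-group variation can be sketched in a few lines of NumPy. This is only one possible formulation, not necessarily the procedure developed later in the text: it rescales the data by the pooled within-group covariance matrix and then extracts the principal axes of the group means in that rescaled space. The names cva, X and groups are illustrative, and the sketch assumes that the pooled within-group covariance matrix is invertible (more individuals than variables, no redundant variables).

```python
import numpy as np

def cva(X, groups):
    """Sketch of CVA: rescale by within-group variation, then PCA of group means."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, p = X.shape
    grand_mean = X.mean(axis=0)

    W = np.zeros((p, p))  # pooled within-group sums of squares and cross-products
    B = np.zeros((p, p))  # between-group sums of squares and cross-products
    for g in labels:
        Xg = X[groups == g]
        mean_g = Xg.mean(axis=0)
        dev = Xg - mean_g
        W += dev.T @ dev
        d = (mean_g - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    W /= n - len(labels)      # pooled within-group covariance matrix
    B /= len(labels) - 1      # between-group covariance matrix

    # Step 1: eigen-decompose W (as in PCA) and build the rescaling W^(-1/2).
    # Assumption: W is positive definite, so every eigenvalue is > 0.
    w_vals, w_vecs = np.linalg.eigh(W)
    whiten = w_vecs @ np.diag(w_vals ** -0.5) @ w_vecs.T

    # Step 2: eigen-decompose the between-group matrix in the rescaled space.
    b_vals, b_vecs = np.linalg.eigh(whiten @ B @ whiten)
    order = np.argsort(b_vals)[::-1][: len(labels) - 1]

    axes = whiten @ b_vecs[:, order]        # canonical axes in original variables
    scores = (X - grand_mean) @ axes        # canonical variate scores
    return scores, axes, b_vals[order]
```

Both steps are eigen-decompositions of a covariance matrix, which is the same operation used in PCA. In this sketch the pooled within-group covariance matrix of the returned scores is the identity matrix, which is what "optimized relative to the variation within groups" amounts to here.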
As we will see, optimizing between-group differences with respect to within-group
variation has implications for the relative positions of group means and the distances
between them, which are discussed in detail by Mitteroecker and Bookstein (2011).
In the third section, we discuss an alternative method suggested by them in which PCA is
used to analyze differences between group means without altering their positions or the
distances between them. This Between Groups PCA (BGPCA) may be more appropriate than CVA
for some particular exploratory applications.
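
A minimal sketch of the BGPCA idea, under stated assumptions: the principal axes are computed from the group means only, and every individual is then projected onto those axes. Because the axes are an orthonormal rotation, the distances between group means are left unchanged. The function name between_group_pca is illustrative, and the use of unweighted group means is an assumption of this sketch, not something specified above.

```python
import numpy as np

def between_group_pca(X, groups):
    """Sketch of BGPCA: PCA of the group means, then projection of all individuals."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)

    # One row per group mean.
    means = np.vstack([X[groups == g].mean(axis=0) for g in labels])

    # Assumption: groups are weighted equally, so the origin is the mean of means.
    center = means.mean(axis=0)

    # SVD of the centered means gives the same axes as an eigen-decomposition
    # of their covariance matrix, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(means - center, full_matrices=False)
    n_axes = min(len(labels) - 1, X.shape[1])
    axes = vt[:n_axes].T                      # orthonormal columns

    mean_scores = (means - center) @ axes     # group means on the BGPCA axes
    scores = (X - center) @ axes              # individuals on the same axes
    return scores, mean_scores, axes
```

With g groups, the centered means span at most g - 1 dimensions, so keeping g - 1 axes loses no information about the means themselves.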
PRINCIPAL COMPONENTS ANALYSIS
Geometric shape variables are neither biologically nor statistically independent.
For example, the shape variables produced by the thin-plate spline describe variation in
overlapping regions of an organism or structure. Because the regions overlap, they are
under the influence of the same processes that produce variation; and therefore we expect
them to be correlated. Even when they do not describe overlapping regions, morphometric
variables (both geometric and traditional) are expected to be correlated because they
describe features of the organism that are functionally, developmentally or genetically
linked. Their patterns of variation and covariation are often complex and difficult to
interpret. The purpose of PCA is to simplify those patterns and make them easier to interpret
by replacing the original variables with new ones (principal components, PCs) that are
linear combinations of the original variables and independent of each other.
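
As an illustration of that replacement, the following sketch computes PCs by eigen-decomposition of the sample covariance matrix. It is a generic textbook formulation, assuming the data are arranged with individuals in rows and variables in columns; the function name pca is illustrative.

```python
import numpy as np

def pca(X):
    """Sketch of PCA by eigen-decomposition of the covariance matrix."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)             # center each variable on its mean

    S = np.cov(centered, rowvar=False)        # covariances among original variables
    eigvals, eigvecs = np.linalg.eigh(S)      # eigh returns ascending eigenvalues

    order = np.argsort(eigvals)[::-1]         # reorder so PC1 has the largest variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Each column of eigvecs holds the coefficients of one linear combination;
    # each column of scores is the corresponding new variable (one PC).
    scores = centered @ eigvecs
    return scores, eigvecs, eigvals
```

The covariance matrix of the returned scores is diagonal, with the eigenvalues on the diagonal: the new variables have zero covariance with one another, and the variance of each PC equals its eigenvalue.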
One might wonder why it would be a worthwhile exercise to take simple variables that
covary with each other and replace them with complex variables that do not covary. Part
of the value of this exercise arises from the fact that each new complex variable is a function
of the covariances among the original variables. It thus provides some insight into those
covariances, which can direct future research into the identity of the causal
factors underlying those covariances. Another useful purpose served by PCA is that most
of the variation in the sample usually can be described with only a few PCs. Again, this is
useful, because it simplifies and clarifies what needs to be explained. Another important
benefit of PCA is that the presentation of results is simplified. It is much easier to produce
and explain plots of the three PCs that explain 90% of the variation than it is to plot and
explain separately the variation in each of 30 original variables.
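
The claim that a few PCs describe most of the variation can be checked directly from the eigenvalues of the covariance matrix, since each eigenvalue is the variance of one PC. The sketch below is illustrative only; the function names and the 0.90 default threshold are assumptions chosen to mirror the 90% figure used above.

```python
import numpy as np

def variance_explained(X):
    """Proportion of total variance described by each PC, largest first."""
    X = np.asarray(X, dtype=float)
    eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False))[::-1]
    eigvals = np.clip(eigvals, 0.0, None)     # remove tiny negative rounding errors
    return eigvals / eigvals.sum()

def n_components_for(X, threshold=0.90):
    """Smallest number of PCs whose cumulative proportion reaches the threshold."""
    cumulative = np.cumsum(variance_explained(X))
    count = int(np.searchsorted(cumulative, threshold)) + 1
    return min(count, len(cumulative))
```

For a data set of 30 variables in which three PCs account for 90% of the variance, n_components_for would return a value of 3 or less, and only those PCs would need to be plotted and interpreted.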
An indirect benefit of PCA that is useful (but often misused) is that it simplifies the
description of differences among individuals. Clusters of individuals are often more
apparent in plots of PCs than in plots of the original variables. Finding such clusters can
be quite valuable, but those clusters do not represent evidence of statistically distinct