Statistics - Geometric Morphometrics for Biologists

where Σ_P is the variance covariance matrix of the predicted values of Y at a value of X in the data set, det is the determinant of the matrix, and Σ is the variance covariance matrix for the original set of variables (e.g. partial warp scores, principal component scores, or any other complete set of scores for our data). Wilks' lambda can also be computed as a function of the eigenvalues of the inverse of the variance covariance matrix of the residuals, (Σ_R)⁻¹, times the variance covariance matrix of the predicted values, Σ_P. Approximations are available to convert the Λ value into an F-statistic or a χ² value. Several other conventional multivariate test criteria can also be used, including Roy's maximum root test, Pillai's trace, and the Hotelling-Lawley trace (Rencher, 1995), all of which give the same results when there is only one independent variable and the sample size is large.
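The eigenvalue route to Wilks' lambda can be illustrated with a minimal NumPy sketch. It is not the authors' software: the function name `multivariate_criteria` is hypothetical, the regression is assumed to be ordinary least squares with an intercept, and sums-of-squares-and-cross-products matrices (E for the residuals, H for the predicted values) stand in for the variance covariance matrices. The two differ only by a common scale factor, so the eigenvalues are unchanged as long as both matrices are divided by the same constant. The determinant form used here, det(E)/det(E + H), is the standard textbook definition and is assumed to match the expression the text refers to.

```python
import numpy as np

def multivariate_criteria(X, Y):
    """Regress Y (n x p) on X (n x q, or length n) and return test criteria.

    Hypothetical helper for illustration; assumes n > p + q so that the
    residual matrix E is invertible.
    """
    Y = np.asarray(Y, dtype=float)
    n, p = Y.shape
    X = np.asarray(X, dtype=float).reshape(n, -1)
    q = X.shape[1]

    # Ordinary least squares with an intercept column.
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    predicted = design @ coef
    residual = Y - predicted

    # SSCP matrices of the predicted values (H) and the residuals (E).
    centered_pred = predicted - Y.mean(axis=0)
    H = centered_pred.T @ centered_pred
    E = residual.T @ residual

    # Wilks' lambda as a ratio of determinants.
    wilks_det = np.linalg.det(E) / np.linalg.det(E + H)

    # Eigenvalues of E^{-1} H; tiny negative or imaginary parts are rounding noise.
    eig = np.linalg.eigvals(np.linalg.solve(E, H)).real
    eig = np.clip(eig, 0.0, None)

    result = {
        "wilks_det": wilks_det,
        "wilks_eig": float(np.prod(1.0 / (1.0 + eig))),
        "pillai": float(np.sum(eig / (1.0 + eig))),
        "hotelling_lawley": float(np.sum(eig)),
        "roy": float(np.max(eig)),
    }

    # With a single independent variable, Wilks' lambda converts to an F-statistic
    # with p and n - p - 1 degrees of freedom (equivalent to Hotelling's T^2).
    if q == 1:
        result["F"] = (1.0 - wilks_det) / wilks_det * (n - p - 1) / p
        result["df"] = (p, n - p - 1)
    return result

# Usage with made-up data: 40 specimens, 3 shape variables, 1 predictor.
rng = np.random.default_rng(0)
x = rng.normal(size=40)
Y = np.outer(x, [0.5, -0.2, 0.1]) + rng.normal(size=(40, 3))
print(multivariate_criteria(x, Y))
```

With one independent variable there is only one non-zero eigenvalue, so every criterion in the dictionary is a function of that single value, which is why the tests agree in that case.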

When the sample size is small (relative to the number of landmarks), the authors' experience is that Wilks' Λ can substantially overestimate the variance explained by the regression model, perhaps due to difficulties associated with estimating variance covariance matrices at small sample sizes. The other analytic tests could be expected to share this behavior. It is not clear what sample size must be used to obtain consistent results, but the general rule of thumb is that there should be five times as many observations as estimated parameters. The number of parameters rises rapidly for landmark data, especially for three-dimensional landmark data, and even more so for semilandmark data, making sample size a substantial concern. For that reason, results of these analytic tests should be viewed with caution unless sample sizes are large relative to the number of parameters.
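To give a rough sense of how quickly the rule of thumb becomes demanding, the sketch below counts shape dimensions for landmark-only data (2k - 4 in two dimensions, 3k - 7 in three, after removing translation, scale and rotation) and multiplies by five. Treating the number of shape dimensions as the number of estimated parameters is an assumption made here for illustration; the text does not specify the count, and a full variance covariance matrix has far more distinct elements. The function names are hypothetical.

```python
def shape_dimensions(n_landmarks, n_dims=2):
    """Dimensions of shape space for landmark-only data (no semilandmarks)."""
    if n_dims == 2:
        lost = 4   # 2 for translation, 1 for scale, 1 for rotation
    elif n_dims == 3:
        lost = 7   # 3 for translation, 1 for scale, 3 for rotation
    else:
        raise ValueError("n_dims must be 2 or 3")
    return n_dims * n_landmarks - lost

def covariance_elements(m):
    """Distinct elements of an m x m variance covariance matrix."""
    return m * (m + 1) // 2

def min_sample_size(n_parameters, ratio=5):
    """Rule of thumb: `ratio` observations per estimated parameter."""
    return ratio * n_parameters

for dims in (2, 3):
    m = shape_dimensions(16, dims)
    print(dims, m, min_sample_size(m), covariance_elements(m))
# 2 28 140 406
# 3 41 205 861
```

Under this assumed count, sixteen landmarks already call for 140 specimens in two dimensions and 205 in three.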

DISTANCE-BASED METHODS OF HYPOTHESIS TESTING

There is an alternative approach to testing statistical hypotheses, which uses variances expressed in units of Procrustes distance rather than variance covariance matrices of the partial warp scores or other variables (Goodall, 1991). By this approach, we use the Procrustes distance between each individual's observed shape and its expected value given that individual's value on the independent variable. The summed squared Procrustes distances give a measure of the variance in shape that is not explained by X. It is thus a measure of the residual variance: the distances being squared and summed are the deviations from the regression, so they are not explained by the model. This distance-based approach has the advantage of expressing deviations in terms of the familiar (and meaningful) units of Procrustes distance, and it also has the large advantage of expressing variance as a simple squared Procrustes distance, a scalar rather than a matrix. Using distance metrics in statistics has proven quite useful in other contexts as well (e.g. Anderson, 2001a,b).
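The summed squared distances described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the configurations are assumed to be Procrustes superimposed already, the distance between an observed shape and its expected shape is approximated by the Euclidean distance between their superimposed coordinates (reasonable when shape variation is small), and the expected shapes come from an ordinary least squares regression of each coordinate on X. The function name `procrustes_ss_decomposition` is hypothetical.

```python
import numpy as np

def procrustes_ss_decomposition(shapes, x):
    """Partition summed squared distances into explained and residual parts.

    shapes : array (n, k, d) of superimposed landmark coordinates
    x      : array (n,) holding the independent variable
    """
    shapes = np.asarray(shapes, dtype=float)
    n = shapes.shape[0]
    Y = shapes.reshape(n, -1)                 # one row of coordinates per specimen

    # Expected shape for each specimen, given its value of x.
    design = np.column_stack([np.ones(n), np.asarray(x, dtype=float)])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    expected = design @ coef
    mean_shape = Y.mean(axis=0)

    # Squared distance of each specimen from its expected shape (one scalar each).
    resid_sq = np.sum((Y - expected) ** 2, axis=1)
    # Squared distance of each expected shape, and each specimen, from the mean.
    expl_sq = np.sum((expected - mean_shape) ** 2, axis=1)
    total_sq = np.sum((Y - mean_shape) ** 2, axis=1)

    ss_total = float(total_sq.sum())
    return {
        "ss_total": ss_total,
        "ss_explained": float(expl_sq.sum()),
        "ss_residual": float(resid_sq.sum()),
        "fraction_explained": float(expl_sq.sum()) / ss_total,
    }

# Usage with made-up data: 30 specimens, 6 two-dimensional landmarks.
rng = np.random.default_rng(1)
x = rng.normal(size=30)
base = rng.normal(size=(6, 2))
direction = 0.05 * rng.normal(size=(6, 2))
shapes = base + x[:, None, None] * direction + rng.normal(scale=0.01, size=(30, 6, 2))
print(procrustes_ss_decomposition(shapes, x))
```

Each quantity returned is a single number rather than a matrix, which is the practical convenience the text points to.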

The generalized form of this test is an F-ratio of the variance explained by the regression model relative to that not explained by the regression model, in which the variance is expressed as a summed squared Procrustes distance. Goodall's (1991) original derivation (to be discussed later) was an F-test for the difference in the means of two groups relative to the variance within each one. The test does make restrictive assumptions about the variance at each landmark, specifically, that it is normally, independently and identically distributed at each landmark. Rather than use a test that depends on this restrictive model, we can instead use permutation tests based on the concept of exchangeability of the
