data and for data consisting of landmarks plus semilandmarks, but for the remainder of
this discussion we will mention only the two-dimensional landmark case. The adjustments
are straightforward except that, in the case of landmark-only data, the dimensionality of the
data equals the number of partial warps (including the two uniform components), which
will not be the case for data that include semilandmarks as well as landmarks. That distinction, however, is not important when using permutation or bootstrap methods based on Procrustes distances, as we will see later. Regression models can be framed in terms of partial warp scores, principal component scores, or coordinates of landmarks: because all the mathematics involved is linear, a rotation of the data will not alter the answers, so long as the mathematics is done correctly.
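This rotational invariance can be checked numerically: a least-squares fit leaves the same total squared residual whether the dependent variables are expressed in one coordinate system or in an orthogonally rotated one. The following sketch uses NumPy and synthetic data; the variable names, sample sizes, and noise levels are all hypothetical, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 6                      # specimens x shape variables (hypothetical sizes)
x = rng.normal(size=n)            # the scalar independent variable
Y = np.outer(x, rng.normal(size=p)) + rng.normal(scale=0.1, size=(n, p))
X = np.column_stack([x, np.ones(n)])   # design matrix: slope column + intercept column

def residuals(Y):
    """Least-squares residuals of each column of Y regressed on x."""
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)  # row 0: slopes, row 1: intercepts
    return Y - X @ coef

# A random orthogonal matrix (a rotation, possibly with a reflection)
Q, _ = np.linalg.qr(rng.normal(size=(p, p)))

# The total squared residual (a squared distance) is unchanged by rotating the data:
print(np.allclose((residuals(Y) ** 2).sum(), (residuals(Y @ Q) ** 2).sum()))  # True
```

The invariance holds because the projection onto the design matrix acts on specimens (rows), while the rotation acts on variables (columns), and orthogonal transformations preserve the Frobenius norm of the residual matrix.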
To regress shape on an independent (scalar) variable, we regress each of the shape variables on that independent variable. For example, suppose we have $P$ partial warp and uniform components, which we can write as a row vector $\{Y_1, Y_2, Y_3, \ldots, Y_P\}$. Then the (linear) model for the regression of that vector on a scalar ($X$) is:
\[
\{Y_1, Y_2, Y_3, \ldots, Y_P\} = \{m_1, m_2, m_3, \ldots, m_P\}\,X + \{b_1, b_2, b_3, \ldots, b_P\} + \{\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_P\}
\tag{8.13}
\]
where $\{m_1, m_2, m_3, \ldots, m_P\}$, $\{b_1, b_2, b_3, \ldots, b_P\}$ and $\{\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_P\}$ are vectors of slope and
intercept coefficients and residuals, respectively. Although this expression looks far more
complicated than the one for a bivariate regression, it actually is not. In fact, we can determine the $i$th component of the slope and intercept terms using the same $m_i$ and $b_i$ values that minimize the residuals in the corresponding bivariate model. Each observation $Y$ is
now a vector, as are the slope, intercept and each of the errors.
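The component-wise equivalence described above can be verified directly: solving for all $P$ components in one multivariate least-squares fit gives exactly the $m_i$ and $b_i$ values returned by $P$ separate bivariate regressions. A short sketch with synthetic data (the sample sizes and variable names are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 25, 4                      # hypothetical numbers of specimens and components
x = rng.uniform(1, 10, size=n)    # e.g. centroid size
Y = np.outer(x, rng.normal(size=p)) + rng.normal(size=(n, p))

# Multivariate fit: one least-squares solve for all P components at once.
X = np.column_stack([x, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
m_multi, b_multi = coef[0], coef[1]

# Bivariate fits: regress each component Y_i on x separately.
m_biv = np.array([np.polyfit(x, Y[:, i], 1)[0] for i in range(p)])
b_biv = np.array([np.polyfit(x, Y[:, i], 1)[1] for i in range(p)])

print(np.allclose(m_multi, m_biv) and np.allclose(b_multi, b_biv))  # True
```

The agreement is exact (to floating-point precision) because the least-squares criterion decomposes into a sum over components, each minimized independently.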
Estimating slope and intercept coefficients is no more complex in the multivariate case than it was in the bivariate case. But in one important respect, the analysis actually is more complex: checking the assumption of linearity. There are at least two ways to check this assumption for multivariate data, although neither is ideal. One is to look at the relationship between each individual component of shape and the independent variable, such as by regressing each partial warp on size. If one or more exhibits a strong and highly non-linear relationship, such as that shown in Figure 8.1A, then it is unlikely that shape and size are linearly related. This method for checking linearity is not ideal because it falls back on inspecting multiple bivariate regressions when it is multivariate linearity that
really matters. Another approach is to estimate the Procrustes distance between each specimen and the shape at the lowest value on the independent variable. Regressing that distance on the independent variable may show if that relationship is non-linear (as in Figure 8.1B). If it is, it is unlikely that shape and size are linearly related. This method is again not ideal, because the Procrustes distance measures only the magnitude of the difference between each specimen and the reference, not its direction. Two specimens that differ a great deal from each other in shape may be equally distant from the reference.
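The second check can be sketched numerically. Here the distance from the reference is approximated as the Euclidean distance between flattened, already-aligned coordinate vectors (which approximates the partial Procrustes distance for small shape differences), and the fit of that distance against size is compared with its fit against log size. The data are synthetic and every name and magnitude below is a hypothetical assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 40, 20                     # 40 specimens, 10 two-dimensional landmarks flattened
size = np.sort(rng.uniform(1, 50, size=n))           # centroid sizes
# Simulate shape that changes linearly with log(size), plus small noise:
shapes = 0.05 * np.log(size)[:, None] * rng.normal(size=(1, k)) \
         + rng.normal(scale=0.005, size=(n, k))

ref = shapes[0]                   # specimen at the lowest value of the independent variable
d = np.linalg.norm(shapes - ref, axis=1)   # approximate Procrustes distance to the reference

def r_squared(x, y):
    """Coefficient of determination for a bivariate linear fit of y on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return 1 - ((y - (slope * x + intercept)) ** 2).sum() / ((y - y.mean()) ** 2).sum()

# For these simulated data, log(size) predicts the distance better than size does:
print(r_squared(size, d), r_squared(np.log(size), d))
```

A plot of `d` against `size` would show the rapid early change and later flattening of Figure 8.1B's kind, while `d` against `np.log(size)` would look nearly straight.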
Despite the deficiencies of these two less than ideal methods, we can use them to check
whether it is unlikely that shape is linearly related to size. The results shown in Figure 8.1
both indicate a non-linear relationship of shape and centroid size, and both suggest that
shape might be linearly related to the log of centroid size. That linear relationship to the
log of centroid size is suggested by the shape of the curves because they depict a very
rapid change in shape relative to size over the smaller values of size. So we can try a log