General Linear Models - Geometric Morphometrics for Biologists

Y contains the deviations of each individual from the overall mean. B is the matrix of coefficients of the model, which will be fitted to the data, X is the centered design matrix, and ε is the matrix of residuals or error terms. If Y is a matrix with N rows (one per specimen) and Q columns, the matrix of residuals, ε, will also have N rows and Q columns. The size of the design matrix, X, C × N, depends on the design, i.e. on the number of factors, the number of levels of each factor, and the number of distinct combinations of factors in any interaction terms, as well as the number of interactions with covariates. It can also depend on how the model is coded because the number of columns, ignoring interaction terms for the moment, could equal either G − 1 (where G is the number of groups) or G. It takes G − 1 columns to specify the design, so using G columns makes the coding scheme redundant (the columns of X are then linearly dependent, so XᵀX cannot be inverted). We will therefore focus on design matrices that have G − 1 columns.
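The redundancy of using G columns can be checked numerically. The sketch below is only an illustration under stated assumptions: the six specimens, the three group labels, and the variable names are all hypothetical, and specimens are placed in rows. It builds one indicator column per group, centers the columns, and compares the rank with that of the G − 1 column version.

```python
import numpy as np

# Hypothetical example: 6 specimens (rows) in G = 3 groups labelled 0, 1, 2.
groups = np.array([0, 0, 1, 1, 2, 2])
G = 3

# One indicator column per group (G columns), then center each column.
X_full = np.eye(G)[groups]
X_full_centered = X_full - X_full.mean(axis=0)

# Drop the last group's column (G - 1 columns), then center.
X_reduced = X_full[:, : G - 1]
X_reduced_centered = X_reduced - X_reduced.mean(axis=0)

rank_full = np.linalg.matrix_rank(X_full_centered)
rank_reduced = np.linalg.matrix_rank(X_reduced_centered)

# G columns carry only G - 1 independent directions;
# the G - 1 column version has full column rank.
print(X_full_centered.shape, rank_full)
print(X_reduced_centered.shape, rank_reduced)
```

In this toy case both matrices have rank G − 1, so the extra column in the G column version adds no information and only makes the cross-product matrix singular.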

To understand the codes, it is important to remember that we are using regression to analyze categorical factors. We therefore need values for the categorical factors that make the results of the regression interpretable. One coding method is called “dummy coding”. According to this method, all individuals are coded as either zero or one to indicate each individual's level on each categorical factor, including all interaction terms. Which group is coded as zero or one is arbitrary, but the interpretation depends on the codes because the intercept is the mean of the group coded as zero. Usually, the control group is the one coded zero and the null hypothesis is that the means of the other groups do not differ from the mean of the control group. The coefficients for the other groups give the deviations from the control group mean. If there are three groups, it takes two columns to encode a single factor: all individuals belonging to the first group will have ones in the first column and zeros in the second, all individuals belonging to the second group will have zeros in the first column and ones in the second, and all individuals belonging to the third group will have zeros in both columns. To obtain the codes for the interaction terms, the columns of codes for the factors are multiplied by each other. Coding can become complex when factors are nested, so the X matrices for these more complex designs are discussed later in the context of the more complex models.
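The three-group dummy coding described above can be sketched as follows. This is a minimal illustration, not the book's own procedure: the measurements, the group labels, the second factor `sex`, and the explicit intercept column are all assumptions introduced so that the intercept and coefficients are visible in an ordinary least-squares fit.

```python
import numpy as np

# Hypothetical data: one measured variable, three groups of two specimens.
# Group 3 plays the role of the control (zeros in both dummy columns).
y = np.array([5.0, 7.0, 10.0, 12.0, 1.0, 3.0])
groups = np.array([1, 1, 2, 2, 3, 3])

# Dummy codes: column 1 flags group 1, column 2 flags group 2.
d1 = (groups == 1).astype(float)
d2 = (groups == 2).astype(float)

# Explicit intercept column (an assumption made for this sketch).
X = np.column_stack([np.ones_like(y), d1, d2])

# Ordinary least-squares fit of y on the dummy-coded design.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = coef

control_mean = y[groups == 3].mean()
print(intercept, control_mean)                    # intercept vs. control mean
print(b1, y[groups == 1].mean() - control_mean)   # group 1 minus control
print(b2, y[groups == 2].mean() - control_mean)   # group 2 minus control

# Interaction codes: multiply the columns of the interacting factors.
# 'sex' is a hypothetical second factor, dummy coded 0/1.
sex = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
interaction = np.column_stack([d1 * sex, d2 * sex])
```

Each printed pair should agree up to floating-point error: the intercept matches the mean of the group coded zero, and each coefficient matches the difference between a group's mean and the control mean.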

An alternative coding method is called “effect coding”; according to this method, all individuals are coded as negative one, zero, or positive one. If there are only two groups, the first one is coded as −1, the other as 1, and if the design is balanced, the mean for the column is zero. If there are three groups, in the first column the first group is coded as −1, the last as 1, and the second as 0; in the second column, the first group is coded as 0, the second as −1, and the third as 1. Using this method, the intercept is the grand mean and X₁ is the deviation of the first group from that mean, X₂ is the deviation of the second group from that same mean, etc. So, when testing the statistical significance of the coefficients for X, we are testing the null hypothesis that one group does not differ from the grand mean by more than expected by chance. As mentioned above, the codes for interaction terms are obtained by multiplying the columns for the interacting factors.
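A small numerical check of effect coding for a balanced three-group design is sketched below. The data, labels, and explicit intercept column are hypothetical choices made for illustration. One detail worth noting: the sign of each fitted coefficient depends on which group carries the positive code. With −1 marking the group and 1 marking the last group, as in the codes used here, the coefficient equals the grand mean minus the group mean; reversing the signs of the codes would give the group's deviation directly.

```python
import numpy as np

# Hypothetical balanced data: three groups of two specimens each.
y = np.array([5.0, 7.0, 10.0, 12.0, 1.0, 3.0])
groups = np.array([1, 1, 2, 2, 3, 3])

# Effect codes per group: (column 1, column 2).
codes = {1: (-1.0, 0.0), 2: (0.0, -1.0), 3: (1.0, 1.0)}
E = np.array([codes[g] for g in groups])

# In a balanced design each column of codes averages to zero.
column_means = E.mean(axis=0)

# Explicit intercept column (an assumption made for this sketch).
X = np.column_stack([np.ones(len(y)), E])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = coef

grand_mean = y.mean()
dev1 = y[groups == 1].mean() - grand_mean
dev2 = y[groups == 2].mean() - grand_mean

print(column_means)            # both column means are zero
print(intercept, grand_mean)   # intercept vs. grand mean
print(b1, dev1)                # same magnitude, opposite sign with these codes
print(b2, dev2)
```

Because the design is balanced, the intercept coincides with the grand mean, and each coefficient has the same magnitude as the corresponding group's deviation from it.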

Coding is more difficult when the design is unbalanced, for a reason that may become obvious if you consider that the grand mean will not be zero when there are different numbers of positive and negative ones. The codes will therefore have to be modified to ensure that the grand mean is still zero and that the columns of X are mutually orthogonal. One approach is to code the first group as (N − nᵢ)/N where nᵢ is the number of
