observations in one or more cells, or because there are observations in all cells but the
sample sizes are unequal. We do not consider the first case here; this section concerns the
case in which the sample sizes are not equal for all cells. The difficulty posed by unbalanced
designs arises from the fact that the factors are not orthogonal even if there are no
actual interactions among factors. That causes two problems. First, it is not possible to
partition the variance cleanly into the main effects of the factors. The estimates of the main
effects are ambiguous because we get different estimates depending on whether the means
are weighted by the sample size. Second, the hypotheses tested using F-ratios become
complex functions of the distribution of the sample sizes within the cells rather than being
simple statements about the impact of the factors. As a result, there is some controversy
over the meaning of the hypotheses underlying the F-tests as well as about the appropriate
sums of squares to use in the tests. Minor departures from a balanced design have less
severe consequences than more drastic departures, so modest variation in sample sizes
is probably not a concern, particularly when using permutation methods for testing the
statistical significance of F.
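The ambiguity in the main-effect estimates can be made concrete with a minimal Python sketch. The 2 x 2 cell means and sample sizes below are hypothetical, invented only for illustration. The cell means are strictly additive (no interaction), yet the estimated effect of A changes depending on whether the cell means are weighted by sample size.

```python
import numpy as np

# Hypothetical cell means for a 2 x 2 design
# (rows: levels of A, columns: levels of B).
# The means are additive, so there is no A x B interaction.
cell_means = np.array([[10.0, 14.0],
                       [12.0, 16.0]])

# Hypothetical unequal sample sizes per cell (an unbalanced design).
cell_n = np.array([[8, 2],
                   [2, 8]])

# Unweighted marginal means of A: simple average of the cell means in each row.
unweighted_A = cell_means.mean(axis=1)

# Weighted marginal means of A: each cell mean weighted by its sample size.
weighted_A = (cell_means * cell_n).sum(axis=1) / cell_n.sum(axis=1)

# Estimated main effect of A (level 2 minus level 1) under each scheme.
effect_unweighted = unweighted_A[1] - unweighted_A[0]
effect_weighted = weighted_A[1] - weighted_A[0]

print("Unweighted effect of A:", effect_unweighted)
print("Weighted effect of A:  ", effect_weighted)
```

With these invented numbers the unweighted difference is 2, while the weighted difference is about 4.4. The weighted estimate absorbs part of the effect of B because the levels of B are unevenly represented within the levels of A.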
There are at least six distinct approaches to calculating the sums of squares in unbalanced
designs (for an overview see http://www.statsoft.com/textbook/general-linear-models).
We will discuss only three of them because they are the ones applicable to cases
in which every cell has at least one observation. These three types of sums of squares are
routinely called Type I, Type II, and Type III Sums of Squares (following SAS usage). We
discuss these three using a two-factor model of the form:
$$Y = A + B + A \times B + \varepsilon \tag{9.25}$$
Type I Sums of Squares
Type I sums of squares are also called sequential or hierarchical sums of squares. The
estimates for the sums of squares are obtained for each term by computing the sums of
squares for two models, the model that lacks that term and the model that includes it. The
sums of squares for the model lacking the term are subtracted from the sums of squares
for the model including it. So, for the two-factor model, we would compute the sums of
squares for the model containing only A:
$$Y = A + \varepsilon \tag{9.26}$$
which are calculated as they would be for a balanced design. We would then compute the
sum of squares for the model containing both A and B:
$$Y = A + B + \varepsilon \tag{9.27}$$
which we denote as $SS_{A+B}$. The Type I sum of squares for B is then
$SS_{B|A} = SS_{A+B} - SS_A$, where the subscript $B|A$ denotes the sum of squares due to B
given that A is already in the model. Next
we would calculate the sum of squares due to all terms: $SS_{A+B+AB}$,
$$Y = A + B + A \times B + \varepsilon \tag{9.28}$$
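The sequential subtraction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the data set is invented, the helper names (`residual_ss`, `dummy`, `model_ss`) are hypothetical, and each model is fitted by ordinary least squares on dummy-coded factors using NumPy. The interaction column is formed by a simple product, which is adequate only for this 2 x 2 case.

```python
import numpy as np

def residual_ss(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def dummy(levels):
    """Indicator columns for a factor, dropping the first level."""
    cats = np.unique(levels)
    return np.column_stack([(levels == c).astype(float) for c in cats[1:]])

# Hypothetical unbalanced 2 x 2 data (cell sizes 4, 1, 2, 5); values invented.
A = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1])
B = np.array([0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1])
y = np.array([9.8, 10.4, 10.1, 9.7, 14.2, 12.3,
              11.8, 16.1, 15.7, 16.4, 15.9, 16.2])

ones = np.ones((len(y), 1))
XA, XB = dummy(A), dummy(B)
XAB = XA * XB  # interaction column (valid for two levels per factor)

def model_ss(*cols):
    """Sum of squares explained by a model beyond the grand mean."""
    X = np.column_stack([ones, *cols])
    return residual_ss(ones, y) - residual_ss(X, y)

# Model sums of squares for the nested sequence A, A + B, A + B + AB.
ss_A = model_ss(XA)
ss_A_B = model_ss(XA, XB)
ss_full = model_ss(XA, XB, XAB)

# Type I (sequential) sums of squares: each term's increment over the
# model that lacks it.
ss_B_given_A = ss_A_B - ss_A
ss_AB_given_A_B = ss_full - ss_A_B

# Entering B first gives a different split between the two main effects.
ss_B = model_ss(XB)
ss_A_given_B = ss_A_B - ss_B

print("A first:  SS_A   =", round(ss_A, 3), " SS_B|A =", round(ss_B_given_A, 3))
print("B first:  SS_B   =", round(ss_B, 3), " SS_A|B =", round(ss_A_given_B, 3))
print("Interaction increment:", round(ss_AB_given_A_B, 3))
```

In both orders the two main-effect terms add up to the same $SS_{A+B}$, but the share attributed to each factor depends on which one enters the model first, which is why these sums of squares are called sequential.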