calculations because most modern approaches to calculating sums of squares are based on matrix algebra. That is, they use the difference between the sums of squares explained by different models, each expressed in terms of a design matrix. The simple summation methods are easier to understand at an introductory level, but harder to scale up to large problems and probably more prone to rounding errors. Researchers interested in programming their own GLM methods will need to consult more advanced texts to develop a complete understanding of these approaches (Rencher, 1995; Searle, 1997, 2006; Anderson, 2001a,b; Rencher and Schaalje, 2008).
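To make the model-comparison idea concrete, here is a minimal sketch, not code from any of the cited texts, of how the SS explained by a factor can be obtained as the drop in residual SS when the factor's dummy-coded columns are added to the design matrix; the data and variable names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2], [4, 5, 6])          # a factor with J = 3 levels (invented)
y = rng.normal(size=groups.size) + 0.5 * groups   # toy response values
y = y - y.mean()                                  # center Y, as in the text

def rss(X, y):
    """Residual sum of squares after a least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

X_null = np.ones((y.size, 1))                     # intercept-only design matrix
dummies = np.eye(3)[groups][:, 1:]                # dummy-code the factor (drop one column)
X_full = np.column_stack([X_null, dummies])       # design matrix including the factor

ss_factor = rss(X_null, y) - rss(X_full, y)       # SS explained by the factor
print(ss_factor)
```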
Let us suppose that A has J distinct levels and $n_j$ specimens per cell. We won't require a balanced design at this point, but we will require that Y be centered (i.e. the mean value is zero), removing one degree of freedom. For the ith specimen in cell j we have the model
$$Y_{ij} = \alpha_j + \varepsilon_{ij} \qquad (9.12)$$
where $\alpha_j$ is the contribution of the jth level of the factor to the value, and $\varepsilon_{ij}$ is the error. Notice that we require that the mean value of the residual terms $\varepsilon_{ij}$ be zero, with variance $\sigma_e^2$, and that the mean value of Y is zero (because Y is centered). Consequently, the $n_j \alpha_j$ terms summed over all the cells must also equal zero:
$$\sum_{j=1}^{J} n_j \alpha_j = 0 \qquad (9.13)$$
This means that there are J values of $\alpha_j$ but only (J − 1) are independent because, as mentioned above, constraining the sum to be zero removes one degree of freedom.
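As a quick numeric check of the constraint in Equation 9.13, the sketch below (with invented data; the variable names are not from the text) fits the $\alpha_j$ as group means of the centered Y and confirms that their $n_j$-weighted sum is zero.

```python
import numpy as np

y = np.array([2.0, 3.0, 4.0, 7.0, 9.0, 1.0, 1.0, 2.0])
groups = np.array([0, 0, 0, 1, 1, 2, 2, 2])     # J = 3 cells with n_j = (3, 2, 3)
y = y - y.mean()                                # center Y

alphas = np.array([y[groups == j].mean() for j in range(3)])  # fitted alpha_j
n_j = np.bincount(groups)                                     # cell sizes
print(np.dot(n_j, alphas))                                    # ~0, up to rounding error
```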
We can now look at variance partitioning to understand how to form F-ratios. We start by looking at the summed square values around the mean value, which is $\bar{Y}$, then we split these sums of squares into two terms, one due to the scatter about the mean of each group (level of the factor), $\bar{Y}_j$, and the other due to the scatter of the group means about the grand mean. The total sum of squares is:
$$SS_{total} = \sum_{j=1}^{J}\sum_{i=1}^{n_j}\left(Y_{ij} - \bar{Y}\right)^2 = \sum_{j=1}^{J}\sum_{i=1}^{n_j}\left(Y_{ij} - \bar{Y}_j\right)^2 + \sum_{j=1}^{J} n_j\left(\bar{Y}_j - \bar{Y}\right)^2 \qquad (9.14)$$
Note that $\bar{Y}$ must be zero; we have included $\bar{Y}$ here so that our expression for the SS will be consistent with the other standard presentations of these ideas.
The first term represents the error and the second is the SS due to A (the between-groups sum of squares for the factor).
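Here is a minimal numeric check (again with invented data) that the partition in Equation 9.14 holds: the total SS about the grand mean equals the within-group (error) SS plus the between-group SS.

```python
import numpy as np

y = np.array([2.0, 3.0, 4.0, 7.0, 9.0, 1.0, 1.0, 2.0])
groups = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y = y - y.mean()                                            # grand mean is now zero

group_means = np.array([y[groups == j].mean() for j in range(3)])
n_j = np.bincount(groups)

ss_total = np.sum(y ** 2)                                   # sum of (Y_ij - Ybar)^2, with Ybar = 0
ss_error = np.sum((y - group_means[groups]) ** 2)           # scatter within each group
ss_factor = np.sum(n_j * group_means ** 2)                  # n_j * (Ybar_j - Ybar)^2
print(np.isclose(ss_total, ss_error + ss_factor))           # True
```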
$$SS_{error} = \sum_{j=1}^{J}\sum_{i=1}^{n_j}\left(Y_{ij} - \bar{Y}_j\right)^2 \qquad (9.15)$$
The $SS_{error}$ term, also called the residual sum of squares, has an expected value equal to its degrees of freedom times $\sigma_e^2$. The degrees of freedom in the error term are given by:
$$df_{error} = \sum_{j=1}^{J} n_j - J \qquad (9.16)$$
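Continuing the same invented-data sketch, Equations 9.15 and 9.16 give the error SS and its degrees of freedom; their ratio, the error mean square, estimates $\sigma_e^2$.

```python
import numpy as np

y = np.array([2.0, 3.0, 4.0, 7.0, 9.0, 1.0, 1.0, 2.0])
groups = np.array([0, 0, 0, 1, 1, 2, 2, 2])
y = y - y.mean()

J = 3
group_means = np.array([y[groups == j].mean() for j in range(J)])
n_j = np.bincount(groups)

ss_error = np.sum((y - group_means[groups]) ** 2)   # Equation 9.15
df_error = n_j.sum() - J                            # Equation 9.16
ms_error = ss_error / df_error                      # estimates sigma_e^2
print(df_error, ms_error)
```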