Before we derive the measures of uncertainty, it will be useful to introduce some shorthand
notation. The sums of squares of the deviations $x_i$ and $y_i$ will be:
$$ s_{xx} = \sum_{i=1}^{N} x_i^2 \tag{8.8} $$
and
$$ s_{yy} = \sum_{i=1}^{N} y_i^2 \tag{8.9} $$
Similarly, the sum of the products of the deviations will be:
$$ s_{xy} = \sum_{i=1}^{N} x_i y_i \tag{8.10} $$
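The three shorthand sums can be sketched in a few lines of Python. This is only an illustration: the data values are made up, the function name `deviation_sums` is hypothetical, and the deviations are assumed to be measured from the sample means of X and Y.

```python
# Illustrative data only (not taken from the text).
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.0, 5.0, 4.0, 5.0]


def deviation_sums(X, Y):
    """Return (s_xx, s_yy, s_xy) as in equations 8.8 to 8.10.

    Assumption: x_i and y_i are deviations from the sample means.
    """
    n = len(X)
    mean_x = sum(X) / n
    mean_y = sum(Y) / n
    x = [xi - mean_x for xi in X]  # deviations x_i
    y = [yi - mean_y for yi in Y]  # deviations y_i
    s_xx = sum(xi * xi for xi in x)               # equation 8.8
    s_yy = sum(yi * yi for yi in y)               # equation 8.9
    s_xy = sum(xi * yi for xi, yi in zip(x, y))   # equation 8.10
    return s_xx, s_yy, s_xy


s_xx, s_yy, s_xy = deviation_sums(X, Y)
print(s_xx, s_yy, s_xy)  # 10.0 6.0 6.0
```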
In testing whether the regression is significant, it is important to keep in mind that we are
asking whether the relationship between X and Y explains a significant proportion of the
variance in Y. If we knew the values of the error terms, $\varepsilon_i$, we could compute their
variance and use those estimates to determine the proportion of variance in Y that is explained
by the regression of Y on X. More often than not, the $\varepsilon_i$ are unknown, so we need a
different approach.
What we can do is to compute an F-ratio from the information that we have. F is a ratio
of variances (or mean squared deviations), each of which is a sum of squared deviations
divided by the appropriate degrees of freedom for the terms in the model. The degrees of
freedom of the model are simply equal to the number of estimated or fitted parameters in the
model. The sum of squared deviations explained by the regression is $s_{xy}^2 / s_{xx}$.
This has one degree of freedom, so the explained variance is also $s_{xy}^2 / s_{xx}$.
Recall that the slope is $m = s_{xy} / s_{xx}$, so the explained variance can also be written as
$m\,s_{xy}$. The unexplained or residual sum of squared deviations is $s_{yy} - m\,s_{xy}$, which
has $N - 2$ degrees of freedom, so the unexplained variance is $(s_{yy} - m\,s_{xy})/(N - 2)$.
F is the explained variance divided by the unexplained, so F is
$(N - 2)\,m\,s_{xy} / (s_{yy} - m\,s_{xy})$ with 1 and $N - 2$ degrees of freedom. The
corresponding p-value is the probability of obtaining an F at least this large by chance
alone, that is, if Y had no linear relationship with X and the proportion of the variance in Y
explained by the regression of Y on X were due only to chance.
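As a rough sketch of the F-ratio calculation described above, the function below takes the three shorthand sums and the sample size N. The function name `f_ratio` and the input values are assumptions chosen for illustration; they are not from the text.

```python
def f_ratio(s_xx, s_yy, s_xy, n):
    """Return (F, df1, df2) for the regression of Y on X.

    Hypothetical helper: follows the steps in the text, with m as the slope.
    """
    if n <= 2:
        raise ValueError("need more than two observations for N - 2 degrees of freedom")
    m = s_xy / s_xx                          # slope
    explained_var = m * s_xy                 # explained sum of squares, 1 degree of freedom
    residual_ss = s_yy - m * s_xy            # unexplained sum of squares
    unexplained_var = residual_ss / (n - 2)  # N - 2 degrees of freedom
    return explained_var / unexplained_var, 1, n - 2


# Illustrative values only: s_xx = 10, s_yy = 6, s_xy = 6 with N = 5.
F, df1, df2 = f_ratio(10.0, 6.0, 6.0, 5)
print(round(F, 6), df1, df2)  # 4.5 1 3
```

The p-value would then be the upper-tail probability of the F distribution with `df1` and `df2` degrees of freedom. If SciPy is available, `scipy.stats.f.sf(F, df1, df2)` is one way to obtain it.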
THE CORRELATION COEFFICIENT
The correlation coefficient (r), which ranges from minus one to one, expresses the
strength of the linear relationship between X and Y. Its squared value ($r^2$), which ranges
from zero to one, indicates the fraction of the variance in Y that is explained by X. The
expression for $r^2$ is:
$$ r^2 = \frac{s_{xy}^2}{s_{xx}\, s_{yy}} \tag{8.11} $$
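Equation 8.11 can be sketched in Python as follows. The function names and the input values are illustrative assumptions. The second function is a consistency check rather than something stated in the text: dividing the numerator and denominator of the F expression by $s_{yy}$ gives $F = (N - 2)\,r^2 / (1 - r^2)$.

```python
def r_squared(s_xx, s_yy, s_xy):
    """Fraction of the variance in Y explained by X (equation 8.11)."""
    return s_xy ** 2 / (s_xx * s_yy)


def f_from_r_squared(r2, n):
    """F-ratio written in terms of r^2; algebraically equal to
    (N - 2) m s_xy / (s_yy - m s_xy). Undefined when r2 == 1."""
    return (n - 2) * r2 / (1.0 - r2)


# Illustrative values only: s_xx = 10, s_yy = 6, s_xy = 6 with N = 5.
r2 = r_squared(10.0, 6.0, 6.0)
print(round(r2, 6))                       # 0.6
print(round(f_from_r_squared(r2, 5), 6))  # 4.5
```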