Biology Reference
In-Depth Information
where $x_i = X_i - \langle X \rangle$ (the difference between an observed value of $X_i$ and its
expected value, $\langle X \rangle$, which is the sample mean) and $y_i = Y_i - \langle Y \rangle$
(the difference between an observed value of $Y_i$ and its expected value, $\langle Y \rangle$);
$x_i$ and $y_i$ are called centered versions of
the original variables. Thus, we are summing residuals, or deviations from expected
values, over all N individuals in a population. By minimizing this function, we will obtain
the best estimates for m and b.
To find the values of m and b that minimize the sum of squared residuals, we set the
partial derivatives of that sum with respect to m and b to zero. As you recall from calculus,
the derivative of a function is zero at its maxima and minima (here the stationary point is a
minimum, because a sum of squares has no upper bound). We then solve the resulting equations
for m and b. Using this optimization method, the equation for the slope, m, can be written as:
$$m = \frac{\sum xy}{\sum x^2} \qquad (8.3)$$
which is the sum of the products of the deviations divided by the sum of the squared
deviations of the X values (each sum is taken over all individuals). In other words, the
slope is the ratio of the deviations of Y to the corresponding deviations of X. When the
corresponding deviations are identical, the slope is one; when the deviations of Y are a
consistent multiple of the deviations of X, the slope will be that multiple.
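The behavior of Eq. 8.3 described above can be checked numerically. The following is a minimal sketch, not the authors' code: the function name `slope_from_deviations` and the small data sets are made up for illustration. It centers both variables, forms the two sums, and confirms that identical deviations give a slope of one and tripled deviations give a slope of three.

```python
def slope_from_deviations(xs, ys):
    """Slope m = sum(x*y) / sum(x**2), computed from centered values (Eq. 8.3)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered versions of the original variables (deviations from the means)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))  # sum of products of deviations
    sum_xx = sum(a * a for a in dx)              # sum of squared deviations of X
    return sum_xy / sum_xx


# Illustrative data (hypothetical values, chosen only to show the behavior)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y_same = [x + 10.0 for x in X]         # deviations of Y identical to those of X
Y_triple = [3.0 * x - 2.0 for x in X]  # deviations of Y are 3 times those of X

m_same = slope_from_deviations(X, Y_same)
m_triple = slope_from_deviations(X, Y_triple)
print(m_same, m_triple)
```

Adding a constant to Y shifts its mean but leaves the deviations unchanged, which is why the offsets of 10 and -2 have no effect on the slope.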
Substituting $X_i - \langle X \rangle$ for $x_i$ and $Y_i - \langle Y \rangle$ for $y_i$ allows us to
compute m directly from the observed values. The sum of the products can be written as:

$$\sum xy = \sum (X_i - \langle X \rangle)(Y_i - \langle Y \rangle) \qquad (8.4)$$

which, after multiplying both sides by N, can be simplified to:

$$N \sum xy = N \sum X_i Y_i - \sum X_i \sum Y_i \qquad (8.5)$$

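The simplification can be spelled out step by step. This is a sketch of the intermediate algebra, relying only on the definition of the sample mean, $\langle X \rangle = \frac{1}{N}\sum X_i$ (and likewise for $\langle Y \rangle$); all sums run over the N individuals.

```latex
\begin{align*}
\sum (X_i - \langle X \rangle)(Y_i - \langle Y \rangle)
  &= \sum X_i Y_i - \langle Y \rangle \sum X_i - \langle X \rangle \sum Y_i
     + N \langle X \rangle \langle Y \rangle \\
  &= \sum X_i Y_i - \frac{1}{N} \sum X_i \sum Y_i - \frac{1}{N} \sum X_i \sum Y_i
     + \frac{1}{N} \sum X_i \sum Y_i \\
  &= \sum X_i Y_i - \frac{1}{N} \sum X_i \sum Y_i
\end{align*}
```

Multiplying the last line by N gives $N \sum X_i Y_i - \sum X_i \sum Y_i$, which is N times the sum of the products of the deviations.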
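After applying a similar substitution and simplification to the sum of the squared deviations
(the common factor of N cancels between numerator and denominator), we can write:

$$m = \frac{N \sum_{i=1}^{N} X_i Y_i - \sum_{i=1}^{N} X_i \sum_{i=1}^{N} Y_i}{N \sum_{i=1}^{N} X_i^2 - \left( \sum_{i=1}^{N} X_i \right)^2} \qquad (8.6)$$
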
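As a check on Eq. 8.6, the slope computed from raw sums should agree with the slope computed from centered deviations (Eq. 8.3). The sketch below assumes nothing beyond the two formulas; the function names and the four data points are hypothetical.

```python
def slope_from_raw_sums(xs, ys):
    """Slope from uncentered sums of the observed values (Eq. 8.6)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)


def slope_from_deviations(xs, ys):
    """Slope from centered deviations (Eq. 8.3), used here for comparison."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Illustrative data (hypothetical values)
X = [1.0, 2.0, 4.0, 7.0]
Y = [2.0, 3.0, 7.0, 12.0]

m_raw = slope_from_raw_sums(X, Y)
m_centered = slope_from_deviations(X, Y)
print(m_raw, m_centered)  # both equal 12/7 up to floating-point rounding
```

The two forms are algebraically identical, but the raw-sum form subtracts two potentially large numbers, so the centered form may be the safer choice in floating-point arithmetic when the values are large relative to their spread.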
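Now that we have an expression for the slope, we can solve for the intercept, b, and
complete the equation for the regression. When b = 0, $\langle Y \rangle = m \langle X \rangle$; more
generally, the fitted line passes through the point of means, so
$\langle Y \rangle = m \langle X \rangle + b$, and we can calculate b from the observed values,
$X_i$ and $Y_i$, and the sample size, N:

$$b = \langle Y \rangle - m \langle X \rangle = \frac{\sum_{i=1}^{N} Y_i - m \sum_{i=1}^{N} X_i}{N} \qquad (8.7)$$
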
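Putting Eqs. 8.6 and 8.7 together gives the complete fitted line. The following is a minimal sketch under the same assumptions as the text (one predictor, ordinary least squares); `fit_line` and the data are hypothetical names and values chosen for illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line Y = m*X + b (Eqs. 8.6 and 8.7)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)  # Eq. 8.6
    b = (sum_y - m * sum_x) / n                                    # Eq. 8.7
    return m, b


# Illustrative data (hypothetical values)
X = [1.0, 2.0, 4.0, 7.0]
Y = [3.0, 4.0, 8.0, 13.0]

m, b = fit_line(X, Y)
mean_x = sum(X) / len(X)
mean_y = sum(Y) / len(Y)
print(m, b)                  # slope 12/7 and intercept 1, up to rounding
print(mean_y, m * mean_x + b)  # the line passes through the point of means
```

The final line illustrates the property used to derive Eq. 8.7: evaluating the fitted line at the mean of X returns the mean of Y.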
In addition to an estimate of the value of m, we will also need measures of the uncertainty
of that estimate. These measures will be used to test whether m is significantly different
from zero (because if we cannot say that, we cannot claim that Y depends on X),
and to test whether the value of m differs between samples (that is, whether the relationship
between X and Y differs from one sample to another).