Table 4.1 Analysis of variance table for linear regression

Source              Sum of squares   Degrees of freedom   Mean square
Linear regression   SSR              1                    SSR
Residuals           RSS              n − 2                RSS/(n − 2)
Total               TSS              n − 1

Source: Agterberg (1974, p. 255)
sum residual sum of squares (RSS). The multiple squared correlation coefficient ($r^2$, but often written as $R^2$) satisfies $r^2 = \mathrm{SSR}/\mathrm{TSS}$.
Box 4.3: Degree of Fit and Analysis of Variance
$\mathrm{TSS} = \mathrm{SSR} + \mathrm{RSS}$ signifies that the sum of squares of $n$ deviations of a dependent variable $Y$ from their mean, $\mathrm{TSS} = \sum (Y_i - \bar{Y})^2$, can be decomposed as the so-called sum of squares due to regression, $\mathrm{SSR} = \sum (\hat{Y}_i - \bar{Y})^2$, plus the residual sum of squares, $\mathrm{RSS} = \sum (Y_i - \hat{Y}_i)^2$, or

$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2.$$

If $Y$ has a normal distribution, then

$$\mathrm{SSR} = \sum (\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 \sum (X_i - \bar{X})^2 = \sigma^2 \chi^2(1).$$

The so-called residual variance is $s^2_{\mathrm{res}} = \mathrm{RSS}/(n - 2)$. Analysis of variance as developed primarily by Fisher (1960) uses the following $F$-test. Because chi-square statistics can be added, $\sigma^2 \chi^2(n - 1) = \sigma^2 \chi^2(1) + \sigma^2 \chi^2(n - 2)$. Consequently,

$$\frac{\mathrm{SSR}}{\mathrm{RSS}/(n - 2)} = \frac{\chi^2(1)}{\chi^2(n - 2)/(n - 2)} = F(1, n - 2).$$
Linear regression results often are summarized in an analysis-of-variance table (Table 4.1). The ratio of the two mean squares in the last column of Table 4.1 provides an estimate of $F(1, n - 2)$. If this $F$-ratio is significantly greater than 1, it may be assumed that there is a significant linear association between $X$ and $Y$ and that the slope of the regression line differs significantly from zero.
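As an illustration of how the entries of such an analysis-of-variance table can be computed, the following is a minimal sketch in Python using only the standard library. The function name `anova_linear_regression`, the dictionary keys, and the sample data are hypothetical and not taken from the source. The sketch returns the $F$-ratio only; judging its significance still requires comparing it with a tabulated critical value of $F(1, n - 2)$.

```python
from statistics import fmean


def anova_linear_regression(x, y):
    """Return the analysis-of-variance quantities for the least-squares line y = b0 + b1*x.

    Hypothetical helper for illustration; x and y are equal-length sequences of numbers.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired observations with n >= 3")

    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least-squares estimates of slope and intercept
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]

    # Sums of squares: TSS = SSR + RSS
    tss = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

    # Mean squares (SSR has 1 degree of freedom, RSS has n - 2)
    s2_res = rss / (n - 2)
    f_ratio = ssr / s2_res if s2_res > 0 else float("inf")

    return {
        "slope": b1,
        "intercept": b0,
        "SSR": ssr,
        "RSS": rss,
        "TSS": tss,
        "df_regression": 1,
        "df_residual": n - 2,
        "df_total": n - 1,
        "mean_square_regression": ssr,
        "mean_square_residual": s2_res,
        "F": f_ratio,
        "r2": ssr / tss if tss > 0 else float("nan"),
    }


if __name__ == "__main__":
    # Made-up example data, for demonstration only
    table = anova_linear_regression([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
    for key, value in table.items():
        print(f"{key}: {value}")
```

The $F$-ratio returned here is the ratio of the two mean squares in the last column of the table; $r^2$ is reported alongside because it follows from the same sums of squares.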
It can be shown that the variance of the calculated values $\hat{Y}_i$ satisfies:

$$s^2(\hat{Y}_i) = s^2_{\mathrm{res}} R_i, \quad \text{with} \quad R_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}.$$

From this result, the following four types of confidence belt can be derived:
1. $\hat{Y}_i \pm t(n - 2)\, s_{\mathrm{res}} \sqrt{R_i}$; this belt consists of two hyperbolas that enclose an area about the best-fitting straight line; $t(n - 2)$ is Student's $t$ for $(n - 2)$ degrees of freedom. A 95 % confidence belt has $t(n - 2) = t_{0.975}(n - 2)$. The purpose of this belt is to set confidence intervals on all single values of $\hat{Y}_i$ that could be estimated for given values of $X_i$.
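The first belt can be sketched numerically as follows. This is a minimal illustration in standard-library Python, not the author's implementation: the function name `confidence_belt`, its return layout, and the sample data are assumptions. It takes $R_i = 1/n + (X_i - \bar{X})^2 / \sum (X_i - \bar{X})^2$ and expects the caller to supply the Student's $t$ quantile (for example from a printed table), since the standard library has no $t$ distribution.

```python
from math import sqrt
from statistics import fmean


def confidence_belt(x, y, t_value):
    """Return (x_i, lower, fitted, upper) for the belt fitted +/- t * s_res * sqrt(R_i).

    Hypothetical helper for illustration. t_value is the Student's t quantile
    for n - 2 degrees of freedom, e.g. t_0.975(n - 2) for a 95 % belt.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired observations with n >= 3")

    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least-squares line and residual standard deviation s_res
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_res = sqrt(rss / (n - 2))

    belt = []
    for xi in x:
        r_i = 1.0 / n + (xi - x_bar) ** 2 / sxx  # R_i as defined in the lead-in
        fitted = b0 + b1 * xi
        half_width = t_value * s_res * sqrt(r_i)
        belt.append((xi, fitted - half_width, fitted, fitted + half_width))
    return belt


if __name__ == "__main__":
    # Made-up data with n = 5, so n - 2 = 3 and t_0.975(3) = 3.182 from tables
    for row in confidence_belt([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1], 3.182):
        print(row)
```

Because $R_i$ is smallest at $X_i = \bar{X}$ and grows with the squared distance from the mean, the belt is narrowest at the mean of $X$ and widens toward both ends, which gives the two hyperbolic branches.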