Table 4.1 Analysis of variance table for linear regression

Source              Sum of squares   Degrees of freedom   Mean square
Linear regression   SSR              1                    SSR
Residuals           RSS              n - 2                RSS/(n - 2)
Total               TSS              n - 1

Source: Agterberg (1974, p. 255)

sum residual sum of squares (RSS). The multiple squared correlation coefficient
($r^2$, but often written as $R^2$) satisfies $r^2 = \mathrm{SSR}/\mathrm{TSS}$.

Box 4.3: Degree of Fit and Analysis of Variance
TSS = SSR + RSS signifies that the sum of squares of n deviations of a
dependent variable Y from their mean, $\mathrm{TSS} = \sum (Y_i - \bar{Y})^2$, can be
decomposed as the so-called sum of squares due to regression,
$\mathrm{SSR} = \sum (\hat{Y}_i - \bar{Y})^2$, plus the residual sum of squares,
$\mathrm{RSS} = \sum (Y_i - \hat{Y}_i)^2$, or

$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2 .$$

If Y has normal distribution, then:

$$\mathrm{SSR} = \sum (\hat{Y}_i - \bar{Y})^2 = \hat{\beta}^2 \sum (X_i - \bar{X})^2 = \sigma^2 \chi^2(1) .$$

The so-called residual variance is: $s_{res}^2 = \mathrm{RSS}/(n - 2)$. Analysis of
variance as developed primarily by Fisher (1960) uses the following F-test.
Because chi-square statistics can be added:

$$\sigma^2 \chi^2(n - 1) = \sigma^2 \chi^2(1) + \sigma^2 \chi^2(n - 2) .$$

Consequently,

$$\frac{\mathrm{SSR}}{\mathrm{RSS}/(n - 2)} = \frac{\chi^2(1)}{\chi^2(n - 2)/(n - 2)} = F(1, n - 2) .$$

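The decomposition in Box 4.3 can be checked numerically. The following is a minimal sketch, assuming an ordinary least-squares fit of a straight line; the function name `anova_linear_regression` and the five data points are hypothetical and chosen only for illustration.

```python
def anova_linear_regression(x, y):
    """Fit Y = intercept + slope * X by least squares and return the
    sums of squares that appear in an analysis-of-variance table."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Least-squares slope and intercept.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    y_hat = [intercept + slope * xi for xi in x]

    # Total, regression, and residual sums of squares.
    tss = sum((yi - y_mean) ** 2 for yi in y)
    ssr = sum((yh - y_mean) ** 2 for yh in y_hat)
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

    return {
        "slope": slope,
        "intercept": intercept,
        "TSS": tss,
        "SSR": ssr,
        "RSS": rss,
        "r2": ssr / tss,               # squared correlation coefficient
        "s2_res": rss / (n - 2),       # residual variance
    }


# Hypothetical data, not taken from the source.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
table = anova_linear_regression(x, y)
print(table)
```

For any data set fitted this way, `table["TSS"]` should agree with `table["SSR"] + table["RSS"]` up to rounding error, which is the identity TSS = SSR + RSS.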
Linear regression results often are summarized in an analysis-of-variance table
(Table 4.1). The ratio of the two mean squares in the last column of Table 4.1
provides an estimate of $F(1, n - 2)$. If this F-ratio is significantly greater than 1, it
may be assumed that there is a significant linear association between X and Y and
that the slope of the regression line differs significantly from zero.
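As an illustration of the F-ratio from Table 4.1, the sketch below computes the ratio of the two mean squares and compares it with a critical value. The function name `regression_f_ratio` and the data are hypothetical. The critical value 10.13 is the tabulated upper 5 % point of F(1, 3) and is supplied by hand here; for other sample sizes it would have to be looked up in an F-table.

```python
def regression_f_ratio(x, y):
    """Return (F, df1, df2), where F = SSR / (RSS / (n - 2))."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Least-squares straight line.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    y_hat = [intercept + slope * xi for xi in x]

    # Mean squares: SSR has 1 degree of freedom, RSS has n - 2.
    ssr = sum((yh - y_mean) ** 2 for yh in y_hat)
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    mean_square_regression = ssr / 1
    mean_square_residual = rss / (n - 2)

    return mean_square_regression / mean_square_residual, 1, n - 2


# Hypothetical data, not taken from the source.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

F_CRIT_95 = 10.13  # assumption: upper 5 % point of F(1, 3) from a standard table

f_ratio, df1, df2 = regression_f_ratio(x, y)
slope_is_significant = f_ratio > F_CRIT_95
print(f_ratio, df1, df2, slope_is_significant)
```

Passing the critical value in by hand keeps the sketch free of external dependencies; a statistics library could supply it instead.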
It can be shown that the variance of the calculated values $\hat{Y}_i$ satisfies:

$$s^2(\hat{Y}_i) = s_{res}^2 \, R_i ; \quad \text{with} \quad R_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2} .$$

From this result, the following four types of confidence belt can be derived:
1. $\hat{Y}_i \pm t(n - 2) \, s_{res} \sqrt{R_i}$; This belt consists of two hyperbolas that enclose an area
about the best-fitting straight line; $t(n - 2)$ is Student's t for $(n - 2)$ degrees of
freedom. A 95-% confidence belt has $t(n - 2) = t_{0.975}(n - 2)$. The purpose of
this belt is to set confidence intervals on all single values of $\hat{Y}_i$ that could be
estimated for given values of $X_i$.
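The first confidence belt can be sketched as follows, using the variance of the calculated values with R = 1/n + (X - mean of X)^2 / sum of squared deviations of X. The function name `confidence_belt` and the data are hypothetical. The value 3.182 is the tabulated t at the 0.975 level for 3 degrees of freedom, supplied by hand as an assumption; it must be replaced by the value for n - 2 degrees of freedom when the sample size changes.

```python
import math


def confidence_belt(x, y, x_new, t_crit):
    """Return a list of (lower, fitted, upper) for each value in x_new,
    using fitted +/- t_crit * s_res * sqrt(R)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Least-squares straight line.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Residual standard deviation, s_res = sqrt(RSS / (n - 2)).
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_res = math.sqrt(rss / (n - 2))

    belt = []
    for x0 in x_new:
        fitted = intercept + slope * x0
        r = 1.0 / n + (x0 - x_mean) ** 2 / sxx
        half_width = t_crit * s_res * math.sqrt(r)
        belt.append((fitted - half_width, fitted, fitted + half_width))
    return belt


# Hypothetical data, not taken from the source.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

T_CRIT_95 = 3.182  # assumption: t at the 0.975 level for n - 2 = 3 degrees of freedom

for lower, fitted, upper in confidence_belt(x, y, [1.0, 3.0, 5.0], T_CRIT_95):
    print(lower, fitted, upper)
```

The belt is narrowest at the mean of X, where R reduces to 1/n, and widens as X moves away from the mean, which produces the two hyperbolas described above.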