Geoscience Reference
In-Depth Information
Table 4.2 Analysis of variance table to test p explanatory variables for statistical significance
Source of variation   Sum of squares   Degrees of freedom   Mean square
Regression            SSR              p                    SSR/p
Residuals             RSS              n − p − 1            RSS/(n − p − 1)
Total                 TSS              n − 1
Table 4.3 Analysis of variance table to test q additional explanatory variables for statistical significance
Source                          Sum of squares        Degrees of freedom   Mean square
First regression (p var.)       SSR1                  p
Difference between 1 and 2      ΔSSR = SSR2 − SSR1    q                    ΔSSR/q
Second regression (p + q var.)  SSR2                  p + q
Residuals                       RSS                   n − p − q − 1        RSS/(n − p − q − 1)
Total                           TSS                   n − 1
Box 4.4 (continued)
$\hat{Y}_k \pm \sqrt{(p+1)\, s^2\, F_{0.95}(p+1,\ n-p-1)\; X_k' (X'X)^{-1} X_k}$
should be used.
The squared multiple correlation coefficient $R^2 = \mathrm{SSR}/\mathrm{TSS}$ provides a
measure of the degree of fit. The residual variance becomes
$s_{res}^2 = \mathrm{RSS}/(n-p-1)$. The analysis of variance table (Table 4.1) then takes
the form shown in Table 4.2, with application of the following F-test:
$\hat{F}(p,\ n-p-1) = \dfrac{\mathrm{SSR}/p}{\mathrm{RSS}/(n-p-1)}$.
However, it is more common to apply analysis of variance by adding
q new explanatory variables to the p explanatory variables already considered.
Then the analysis of variance becomes as shown in Table 4.3, with:
$\hat{F}(q,\ n-p-q-1) = \dfrac{\Delta\mathrm{SSR}/q}{\mathrm{RSS}/(n-p-q-1)}$.
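The two F-ratios of this box can be sketched in a few lines of code. The following is a minimal illustration in plain Python, not an implementation taken from the text: the function names (`solve_linear`, `fit_sums_of_squares`, `overall_f`, `partial_f`) and the small data sets are assumptions made for the example. It fits the regression by solving the normal equations, forms SSR, RSS and TSS, and then evaluates (SSR/p)/(RSS/(n − p − 1)) for Table 4.2 and (ΔSSR/q)/(RSS/(n − p − q − 1)) for Table 4.3.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_sums_of_squares(columns, y):
    """Least-squares fit of y on a constant term plus the given columns.

    Returns the tuple (SSR, RSS, TSS).
    """
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(rows[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(r[a] * beta[a] for a in range(k)) for r in rows]
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    rss = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return tss - rss, rss, tss


def overall_f(columns, y):
    """F-ratio of Table 4.2: (SSR / p) / (RSS / (n - p - 1))."""
    n, p = len(y), len(columns)
    ssr, rss, _ = fit_sums_of_squares(columns, y)
    return (ssr / p) / (rss / (n - p - 1))


def partial_f(first_columns, extra_columns, y):
    """F-ratio of Table 4.3: (dSSR / q) / (RSS / (n - p - q - 1))."""
    n, p, q = len(y), len(first_columns), len(extra_columns)
    ssr1, _, _ = fit_sums_of_squares(first_columns, y)
    ssr2, rss, _ = fit_sums_of_squares(first_columns + extra_columns, y)
    return ((ssr2 - ssr1) / q) / (rss / (n - p - q - 1))


# Hypothetical data: one explanatory variable, four observations.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0]
print(overall_f([x], y))  # about 3.56, since SSR = 3.2 and RSS = 1.8

# Hypothetical data: does x2 add significantly to x1?
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 1.0, 0.0, 1.0, 1.0]
y2 = [1.0, 3.1, 2.9, 5.2, 4.8, 7.1]
print(partial_f([x1], [x2], y2))
```

The computed ratio would then be compared with a tabulated F value for the corresponding degrees of freedom; that lookup is left out here to keep the sketch free of external libraries.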
Various techniques of sequential regression analysis are useful: forward
selection, stepwise regression and backward elimination (Draper and Smith 1966).
When there are p explanatory variables, forward selection begins by finding the
variable that has the largest squared correlation coefficient with the dependent
variable. It is selected first. At the next and later steps the variable that most increases
$R^2$ is included. Forward selection stops when none of the remaining
explanatory variables significantly increases the degree of fit. Stepwise regression
does the same as forward selection except for one refinement: after each step, the
variables already included in the equation are checked again for statistical
significance. Finally, backward elimination
consists of first including all variables in the regression equation and then eliminating
them one by one until the smallest F-ratio among the variables still in the equation
exceeds a predetermined level.
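Forward selection as described above can be sketched as a short greedy loop. This is an illustrative outline under stated assumptions rather than the procedure of Draper and Smith in full: the function name `forward_selection`, the callable `rss_of` (assumed to return the residual sum of squares of a regression with constant term on a given subset of variables) and the threshold `f_to_enter` are all hypothetical, and the RSS values in the usage example are made-up numbers chosen only to exercise the loop. Each candidate is tested with the one-variable (q = 1) form of the F-ratio for additional variables.

```python
def forward_selection(n_obs, candidates, rss_of, f_to_enter):
    """Greedy forward selection of explanatory variables.

    n_obs       number of observations n
    candidates  labels of the p candidate variables
    rss_of      callable mapping a tuple of labels to the residual sum of
                squares of the regression (with constant) on those variables
    f_to_enter  predetermined F level a new variable must exceed
    """
    selected = []
    remaining = list(candidates)
    rss_current = rss_of(())  # constant-only model, so this equals TSS
    while remaining:
        # TSS is fixed, so the smallest RSS gives the largest increase in R^2.
        best = min(remaining, key=lambda v: rss_of(tuple(selected + [v])))
        rss_new = rss_of(tuple(selected + [best]))
        df_resid = n_obs - (len(selected) + 1) - 1
        if df_resid <= 0:
            break  # no degrees of freedom left for the residuals
        if rss_new == 0:
            f_ratio = float("inf")  # perfect fit
        else:
            # q = 1 new variable: dSSR equals the drop in RSS.
            f_ratio = (rss_current - rss_new) / (rss_new / df_resid)
        if f_ratio <= f_to_enter:
            break  # no remaining variable improves the fit significantly
        selected.append(best)
        remaining.remove(best)
        rss_current = rss_new
    return selected


# Made-up RSS values for n = 20 observations and three candidates.
toy_rss = {
    frozenset(): 100.0,
    frozenset("a"): 40.0,
    frozenset("b"): 70.0,
    frozenset("c"): 90.0,
    frozenset("ab"): 20.0,
    frozenset("ac"): 39.0,
    frozenset("abc"): 19.5,
}


def toy_rss_of(subset):
    return toy_rss[frozenset(subset)]


print(forward_selection(20, ["a", "b", "c"], toy_rss_of, 4.0))  # ['a', 'b']
```

In practice `rss_of` would run a least-squares fit for each subset. Stepwise regression would add a re-check of the variables already in `selected` after every accepted step, and backward elimination would start from the full set and remove variables instead.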
 