Fig. 7.10 The plot of the F-distribution F[r, n − (k + 1)]. The right-tailed area is the one which represents the rejection region for the null hypothesis H0 defined in Table 7.8.
The value which is considered in the partial F-test is given by:

F[r, n − (k + 1)] = ((SSE_R − SSE_F) / r) / MSE_F    (7.27)
where SSE_R is the sum of squared errors of the reduced model, SSE_F is the sum of squared errors of the full model, MSE_F is the mean square error of the full model, k is the number of independent variables in the full regression model, and r is the number of variables removed from the full model when it is reduced. The difference SSE_R − SSE_F is called the extra sum of squares associated with the reduced model.
Since this sum of squares refers to r variables, it has r degrees of freedom. The partial F-test is also based on an F-distribution (like the test defined in Table 7.7 in the previous section). After having fixed a significance level α for the test, we can reject the null hypothesis H0 (concluding that the full model is statistically better than the reduced one) when the value of the partial F-statistic is greater than the threshold value F[α; r, n − (k + 1)] (see Fig. 7.10).
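The test of Eq. (7.27) can be sketched as follows. This is a minimal illustration, not a full model fit: the values of SSE_R, SSE_F, n, k, and r are hypothetical numbers chosen only to show the computation.

```python
# Sketch of the partial F-test of Eq. (7.27); all numeric
# inputs below are hypothetical, for illustration only.
from scipy.stats import f

n, k, r = 30, 4, 2              # observations, full-model variables, removed variables
sse_r = 120.0                   # SSE of the reduced model (hypothetical)
sse_f = 80.0                    # SSE of the full model (hypothetical)
mse_f = sse_f / (n - (k + 1))   # MSE of the full model

# Partial F-statistic: extra sum of squares over r, divided by MSE_F
f_stat = ((sse_r - sse_f) / r) / mse_f

# Threshold value F[alpha; r, n-(k+1)] for alpha = 0.05
alpha = 0.05
f_crit = f.ppf(1 - alpha, r, n - (k + 1))

# Reject H0 (i.e., keep the full model) when f_stat > f_crit
print(f_stat, f_crit, f_stat > f_crit)
```

With these numbers the statistic is ((120 − 80)/2) / (80/25) = 6.25, which exceeds the 5% threshold of the F[2, 25] distribution, so H0 would be rejected.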
The stepwise regression algorithm carries out partial F-tests by using the notion of p-value. The p-value is the probability, under the null hypothesis H0 of Table 7.8, of obtaining a value of the statistic at least as large as the one observed; it is given by the size of the right-tailed area, in Fig. 7.10, of the considered F-distribution beyond the value of the partial F-statistic. Using p-values to carry out the test is perfectly equivalent to using the critical values already considered for the F-statistic.
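The equivalence between the two routes can be sketched as follows: rejecting when the p-value falls below α gives the same decision as rejecting when the statistic exceeds the critical value. The observed statistic and the degrees of freedom below are hypothetical.

```python
# Sketch of the p-value route for the partial F-test.
# The observed statistic and degrees of freedom are hypothetical.
from scipy.stats import f

r, dfe = 2, 25       # r and n-(k+1) degrees of freedom (hypothetical)
f_stat = 6.25        # observed partial F-statistic (hypothetical)

# p-value = right-tailed area of F[r, n-(k+1)] beyond f_stat
p_value = f.sf(f_stat, r, dfe)

alpha = 0.05
f_crit = f.ppf(1 - alpha, r, dfe)

# The two decision rules agree
print((p_value < alpha) == (f_stat > f_crit))
```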
The regression algorithm is made of two different parts. The first one, called forward selection (steps 2 and 3, below), is devoted to increasing the set of independent variables used by the regression model, by adding the most significant ones among those which do not occur in the model. The second phase is called backward elimination (steps 4 and 5, below) and is devoted to the elimination of variables that have been previously inserted in the regression model, but which have become statistically insignificant.