“scale factor” in the specific estimation of the confidence interval of c_i). In Fig. 7.9 the t-distribution t(14), with 14 degrees of freedom, is depicted with significance level α = 0.05.
7.7.2 The Classical Stepwise Regression
In the previous section we defined the main concepts of multiple regression and indicated some statistical tests used to assess the correctness of a given regression model. In this section, we address the problem of variable selection, that is, the problem of deciding which independent variables should enter a multiple regression model, among a given set of candidate variables. The simplest method we can define consists of running all possible regressions, for all possible choices of independent variables, and then choosing the best model as the one with the highest R² or the lowest MSE. This brute-force method has the problem that the number of models it considers increases exponentially with the number of candidate variables. In fact, the number of different models that we can define by means of k independent variables is 2^k.
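The brute-force approach can be sketched as follows. This is a minimal illustration, not a reference implementation: the data, helper names, and the use of degrees-of-freedom-adjusted MSE as the selection criterion are assumptions (plain R² would always favor the full model, since it never decreases when a variable is added).

```python
# Brute-force variable selection (hypothetical illustration): fit an
# ordinary least-squares model for every non-empty subset of the k
# candidate variables and keep the one with the lowest
# MSE = SSE / (n - p - 1), where p is the number of regressors.
from itertools import combinations
import numpy as np

def ols_sse(X, y):
    """Sum of squared errors of the least-squares fit of y on [1, X]."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def best_subset(X, y):
    """Try all 2^k - 1 non-empty subsets; return (best columns, its MSE)."""
    n, k = X.shape
    best_cols, best_mse = None, np.inf
    for p in range(1, k + 1):
        for cols in combinations(range(k), p):
            mse = ols_sse(X[:, cols], y) / (n - p - 1)
            if mse < best_mse:
                best_cols, best_mse = cols, mse
    return best_cols, best_mse

# Synthetic data: only columns 0 and 2 actually influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))          # k = 4 -> 2^4 - 1 = 15 models
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.1, size=50)
cols, mse = best_subset(X, y)
print(cols, mse)
```

With a strong signal-to-noise ratio as above, the selected subset should contain the two truly relevant columns; already at k = 20 this method would require fitting over a million models.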
The stepwise regression algorithm provides a method for variable selection which allows us to obtain good regression models with lower time complexity. This algorithm does not necessarily find the best model among all 2^k possible models, but it allows us to find a good model in a feasible time even when the number of independent variables is high. The method uses a statistical test, again based on the F-distribution, called the partial F-test, as it evaluates the relative significance of a subset of all possible variables.
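The forward variant of this idea can be sketched as follows. This is a hypothetical illustration, not the textbook's algorithm verbatim: the data, function names, and the entry threshold f_enter are assumptions, and the full classical procedure also re-tests already-included variables for removal at each step.

```python
# Forward stepwise selection (sketch): starting from the intercept-only
# model, repeatedly add the candidate variable with the largest partial
# F statistic, stopping when no candidate exceeds the threshold f_enter.
import numpy as np

def sse(cols, X, y):
    """SSE of the least-squares fit of y on an intercept plus X[:, cols]."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_stepwise(X, y, f_enter=4.0):
    n, k = X.shape
    selected = []
    while len(selected) < k:
        sse_red = sse(selected, X, y)
        scores = []
        for c in range(k):
            if c in selected:
                continue
            sse_full = sse(selected + [c], X, y)
            df = n - (len(selected) + 1) - 1   # residual df of larger model
            scores.append(((sse_red - sse_full) / (sse_full / df), c))
        f_best, c_best = max(scores)
        if f_best < f_enter:
            break                              # no candidate is significant
        selected.append(c_best)
    return selected

# Synthetic data: only columns 1 and 3 actually influence y.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 5))
y = 1.5 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=60)
sel = forward_stepwise(X, y)
print(sel)   # the two truly relevant columns should be picked first
```

Each pass fits at most k models, so the work grows roughly quadratically with k rather than exponentially, at the price of possibly missing the globally best subset.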
Suppose that a regression model of Y with k independent variables is postulated:

Y = β_0 + β_1 X_1 + β_2 X_2 + ... + β_k X_k + ε.    (7.25)
We will call this model the full model in the sense that it includes the maximal set
of independent variables. Now, suppose that we want to test the relative significance
of a subset of r of the k independent variables in the full model. The partial F -test
provides a statistical criterion for evaluating if the full model given in Eq. (7.25) is
better than the reduced model with only k − r variables:

Y = β_0 + β_1 X_1 + β_2 X_2 + ... + β_{k−r} X_{k−r} + ε.    (7.26)
This corresponds to comparing the two hypotheses given in Table 7.8.
Table 7.8 Hypotheses of the partial F-test

H_0: β_{k−r+1} = β_{k−r+2} = ... = β_k = 0
H_1: β_{k−r+1}, β_{k−r+2}, ..., β_k are not all zero.
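The partial F statistic behind this comparison can be computed directly from the residual sums of squares of the two nested models, F = ((SSE_red − SSE_full)/r) / (SSE_full/(n − k − 1)). The sketch below is a hypothetical illustration (the data and helper names are assumptions); the resulting statistic would be compared against the critical value of the F(r, n − k − 1) distribution.

```python
# Partial F statistic for comparing a full model with k variables
# against a nested reduced model with k - r of them.
import numpy as np

def sse(X, y):
    """SSE of the least-squares fit of y on an intercept plus X."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def partial_f(X_full, X_reduced, y):
    n, k = X_full.shape
    r = k - X_reduced.shape[1]       # number of variables being tested
    sse_full, sse_red = sse(X_full, y), sse(X_reduced, y)
    return ((sse_red - sse_full) / r) / (sse_full / (n - k - 1))

# Synthetic data: only X_1 influences y.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = 3.0 * X[:, 0] + rng.normal(size=40)
F_null = partial_f(X, X[:, :1], y)   # H0: beta_2 = beta_3 = 0 (true here)
F_sig = partial_f(X, X[:, 1:], y)    # H0: beta_1 = 0 (false here)
print(F_null, F_sig)                 # compare with F(r, n - k - 1) quantile
```

Since the reduced model is nested in the full one, SSE_red ≥ SSE_full always holds and the statistic is non-negative; dropping a truly relevant variable inflates it sharply, while dropping irrelevant ones keeps it near its null expectation.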
 