in comparing the performance of the complete model with the performances
of models whose inputs are subsets of the inputs of the complete model, and in
choosing the best model with respect to an appropriate selection criterion. If
q candidate variables are available, 2^q different combinations of inputs can be
generated, hence at least 2^q models, whose performances should be compared:
such an approach, whose complexity increases exponentially with the number
of variables, is optimal but generally too demanding.
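
To make the exponential growth concrete, the following minimal sketch enumerates every combination of inputs for a small set of candidates. The variable names are hypothetical placeholders, and fitting a model for each subset is left out.

```python
from itertools import combinations


def all_input_subsets(candidates):
    """Yield every subset of the candidate inputs, from the empty set to the full set."""
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            yield subset


# Hypothetical candidate variables (q = 4).
candidates = ["x1", "x2", "x3", "x4"]
subsets = list(all_input_subsets(candidates))

# One model at least would have to be trained per subset: 2**q of them.
print(len(subsets), 2 ** len(candidates))
```

With q = 4 the enumeration is trivial, but each additional candidate variable doubles the number of subsets.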
Two simpler, suboptimal strategies are used in practice:
an elimination strategy (stepwise backward regression), whereby the least
significant input is eliminated from the complete model: all submodels
with q − 1 inputs are compared, and the best of them (according to an
appropriate criterion) is compared to the complete model. If the submodel
is better than the complete model, that submodel is kept and the procedure
is iterated; otherwise, the complete model is kept;
a constructive strategy (stepwise forward regression), which starts with
the simplest model, whose output is just the mean of the measured output
values in the data set, hence is independent of the inputs: it is thus a model
with zero variables; it is compared to the q models with 1 input; the best
model is chosen, and the procedure is iterated until the addition of a new
input no longer improves the quality of the model.
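
The two strategies above can be sketched as greedy searches over input subsets. This is an illustration under stated assumptions, not the book's implementation: `score` is a hypothetical function that trains a model on the given inputs and returns a selection criterion where lower is better, and `toy_score` is a made-up stand-in used only so that the example runs.

```python
def forward_selection(candidates, score):
    """Constructive strategy: start with zero inputs and add one input at a time."""
    selected = []
    remaining = list(candidates)
    best = score(tuple(selected))  # zero-variable model (mean of the outputs)
    while remaining:
        # Try adding each remaining input; ties are broken by variable name.
        trial_best, added = min((score(tuple(selected + [v])), v) for v in remaining)
        if trial_best >= best:
            break  # adding an input no longer improves the model
        selected.append(added)
        remaining.remove(added)
        best = trial_best
    return selected, best


def backward_elimination(candidates, score):
    """Elimination strategy: start with all inputs and remove one input at a time."""
    selected = list(candidates)
    best = score(tuple(selected))  # complete model
    while selected:
        # Try dropping each input in turn and keep the best submodel.
        trial_best, dropped = min(
            (score(tuple(v for v in selected if v != d)), d) for d in selected
        )
        if trial_best >= best:
            break  # no submodel is better than the current model
        selected.remove(dropped)
        best = trial_best
    return selected, best


# Made-up criterion: residual error shrinks when a useful input is present,
# and every input costs a fixed penalty of 0.5.
USEFUL = {"x1": 5.0, "x3": 2.0}


def toy_score(subset):
    residual = 10.0 - sum(USEFUL.get(v, 0.0) for v in subset)
    return residual + 0.5 * len(subset)


candidates = ["x1", "x2", "x3", "x4"]
forward_result = forward_selection(candidates, toy_score)
backward_result = backward_elimination(candidates, toy_score)
print(forward_result, backward_result)
```

On this toy criterion both searches end with the same subset, but in general the two strategies are suboptimal and need not agree with each other or with the exhaustive search.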
For both strategies, the maximum number of models is 1 + q(q + 1)/2: it
grows as the square of the number of candidate variables, which is generally
acceptable for practical purposes.
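
The count can be checked by summing the models examined at each step: one starting model, then q candidates at the first step, q − 1 at the second, and so on. The short sketch below does this and compares the result with the exhaustive search; the function names are illustrative.

```python
def exhaustive_count(q):
    """Number of input subsets examined by the exhaustive search."""
    return 2 ** q


def stepwise_max_count(q):
    """Worst-case number of models examined by either stepwise strategy."""
    # 1 starting model, then q models at step 1, q - 1 at step 2, ..., 1 at step q.
    return 1 + sum(q - step for step in range(q))


for q in (5, 10, 20):
    # The explicit sum agrees with the closed form 1 + q(q + 1)/2.
    assert stepwise_max_count(q) == 1 + q * (q + 1) // 2
    print(q, stepwise_max_count(q), exhaustive_count(q))
# prints:
# 5 16 32
# 10 56 1024
# 20 211 1048576
```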
2.4.2.2 Comparison Criteria
The strategies described in the previous section rely on comparisons between
models that have different numbers of inputs. Several comparison techniques
may be used. We discuss two of them: hypothesis testing and Akaike's
information criterion.
Hypothesis Testing. Fisher's Test
The principle of hypothesis testing was discussed in a previous section. When
comparing a submodel to the complete model in an elimination strategy, a
model with q parameters is compared to a model with q' < q parameters,
which can be described as testing the null hypothesis “q − q' parameters are
equal to zero” against the alternative hypothesis. This can be done with Fisher's
test, which is described in the additional material at the end of the chapter.
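
As an illustration, the sketch below computes the usual F statistic for comparing a complete model with one of its submodels from their residual sums of squares. It assumes the standard form of the test for models that are linear in their parameters, which may differ in presentation from the chapter's additional material. The function name, the argument names, and the numerical values are all made up for the example.

```python
def fisher_statistic(rss_full, rss_sub, q_full, q_sub, n_samples):
    """F statistic for a submodel (q_sub parameters) nested in a complete model (q_full).

    rss_full and rss_sub are the residual sums of squares of the two fitted models.
    Under the null hypothesis (the q_full - q_sub extra parameters are zero) and the
    usual Gaussian-noise assumptions, the statistic follows an F distribution with
    (q_full - q_sub, n_samples - q_full) degrees of freedom.
    """
    if not 0 <= q_sub < q_full < n_samples:
        raise ValueError("expected 0 <= q_sub < q_full < n_samples")
    numerator = (rss_sub - rss_full) / (q_full - q_sub)
    denominator = rss_full / (n_samples - q_full)
    return numerator / denominator


# Made-up numbers: 50 examples, complete model with 5 parameters, submodel with 3.
f_value = fisher_statistic(rss_full=12.0, rss_sub=15.0, q_full=5, q_sub=3, n_samples=50)

# Critical value taken from a standard F table: roughly 3.2 at the 5% level
# for (2, 45) degrees of freedom (approximate, supplied by the user).
critical_value = 3.2
reject_null = f_value > critical_value
print(f_value, reject_null)
```

If the statistic exceeds the critical value, the null hypothesis is rejected, which means the eliminated parameters are significant and the complete model is kept; otherwise the submodel is preferred.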
If the comparison to be performed is not between a complete model and
a submodel, i.e., if the set of parameters of a model is not included in the set
of parameters of the other, other tests may be used, such as the likelihood
ratio test [Goodwin et al. 1977] and the LDRT test (logarithm determinant
ratio test) [Leontaritis et al. 1987]. Those tests are asymptotically equivalent
to Fisher's test [Soderstrom 1977].