3
Modeling Methodology: Dimension Reduction
and Resampling Methods
J.-M. Martinez
3.1 Introduction
This chapter provides additional methodological elements for neural network
design, answering questions that arise in neural-network modeling. As
explained in the previous chapter, there is more to the design of a neural
model than choosing the number of hidden neurons and implementing a
training algorithm:
• Before using a neural network or any other statistical model, it may be
necessary to construct new input variables in order to decrease their
number while losing as little information as possible about their
distribution.
• After estimating the parameters of the model (training, if the model is a
neural network), the user should assess the risk of using the model thus
designed; that risk is linked to the generalization error, which cannot be
computed and must therefore be estimated. In the previous chapter, we
discussed a method for estimating the generalization error by computing
the virtual leave-one-out score. In this chapter, we describe another
recent statistical technique, based on resampling, which is used to
estimate the statistical characteristics of the generalization error.
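To make the resampling idea concrete before it is developed later in the chapter, here is a minimal sketch of estimating a model's generalization error by bootstrap: the model is trained on samples drawn with replacement and scored on the examples left out of each replicate. The function names and the simple least-squares line used as the "model" are illustrative choices, not the chapter's own notation.

```python
import random

def fit_line(xs, ys):
    # Ordinary least squares for y = a*x + b (stand-in for any trainable model).
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def bootstrap_generalization_error(xs, ys, n_boot=200, seed=0):
    """Average out-of-bag mean squared error over bootstrap replicates."""
    rng = random.Random(seed)
    n = len(xs)
    oob_errors = []
    for _ in range(n_boot):
        # Draw a bootstrap sample of the same size, with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        oob = [i for i in range(n) if i not in set(idx)]
        bx = [xs[i] for i in idx]
        if not oob or min(bx) == max(bx):
            continue  # skip degenerate replicates
        a, b = fit_line(bx, [ys[i] for i in idx])
        # Score only on examples the replicate never saw.
        mse = sum((ys[i] - (a * xs[i] + b)) ** 2 for i in oob) / len(oob)
        oob_errors.append(mse)
    return sum(oob_errors) / len(oob_errors)
```

The out-of-bag points play the role of a test set that changes from replicate to replicate, so the average error estimates performance on unseen data without setting aside a fixed validation set.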
Therefore, the aspects of the methodology described in this chapter are
related to
• the preprocessing to be performed on the data,
• the techniques for reducing the number of inputs, based on principal
component analysis and curvilinear component analysis,
• the estimation of the generalization error by statistical resampling
techniques, with emphasis on the bootstrap.
Dimension reduction is not intended only to decrease the number of vari-
ables describing each example: it also aims at designing more compact data
representations, which make their analysis easier. In the context of linear