Fig. 2.2. Data dimensionality reduction
2.4.2 Choice of Relevant Variables
In the present chapter, the second task of input selection, i.e., the rejection of
inputs whose influence on the output can be neglected, is described in more
detail.
When modeling a physical or chemical process, the variables that have
an influence on the quantity to be modeled are generally analyzed in detail,
from first principles, by the experts; therefore, a systematic variable selection
procedure is not necessary. By contrast, when modeling an economic, social, or
financial process, or when modeling a very complex physical system, experts
may give opinions about the relevant variables, but those opinions are often
more or less subjective and need rigorous testing. In that case, the selection
process starts with a large number of candidate variables, among which the
factors that are really relevant must be identified. The results of the selection
may disagree with current beliefs.
A large number of selection techniques have been suggested (see for instance
[McQuarrie et al. 1998], and, for a recent review, [Guyon et al. 2005]).
The principles of the most popular technique are described first; the probe
feature method, a technique that is intuitive and based on first principles,
is then explained.
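As a rough illustration of the idea behind probe features, the sketch below appends random variables (probes), which are irrelevant by construction, to the candidate inputs, scores every variable, and keeps only the candidates that score higher than all the probes. The function name probe_select, the scoring rule (absolute correlation with the output), and the threshold (the strongest probe) are assumptions chosen for brevity; they are not taken from the text and should not be read as its definition of the method.

```python
import numpy as np

def probe_select(X, y, n_probes=100, seed=0):
    """Return indices of candidate inputs that outscore every random probe.

    Hypothetical helper: the score (absolute Pearson correlation) and the
    threshold (best probe score) are simplifying assumptions.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape

    # Probes are random variables, hence irrelevant to y by construction.
    probes = rng.standard_normal((n_samples, n_probes))
    Z = np.hstack([X, probes])

    # Score every column by its absolute correlation with the output.
    Zc = Z - Z.mean(axis=0)
    yc = y - y.mean()
    scores = np.abs(Zc.T @ yc) / (np.linalg.norm(Zc, axis=0) * np.linalg.norm(yc))

    # A candidate is kept only if it beats the strongest probe.
    threshold = scores[n_features:].max()
    return np.flatnonzero(scores[:n_features] > threshold)

# Synthetic example: only columns 0 and 3 influence the output.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = 3 * X[:, 0] - 2 * X[:, 3] + 0.1 * rng.standard_normal(200)
selected = probe_select(X, y)
```

Using the strongest probe as the threshold is a deliberately strict choice; a less strict variant would accept a candidate that beats a chosen fraction of the probes, which amounts to accepting a chosen risk of keeping an irrelevant variable.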
2.4.2.1 Input Selection Strategies
The most natural strategy for choosing a set of inputs consists of starting
with an oversized set of candidate inputs (the model is said to be “complete”),