Modeling Methodology: Dimension Reduction and Resampling Methods - Neural Networks: Methodology and Applications

Fig. 3.1. Data centering and reduction

That simple preprocessing, applied to all components, is often used to detect anomalies in the database. A very small standard deviation may mean that the corresponding variable varies too little to have any real influence on the model. Variables with zero standard deviation should of course be ignored, since they provide no information for the design of the model. For a more extensive diagnosis of such “anomalies”, the advice of the process expert must be sought.
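
As an illustration of the centering and reduction step and of the standard-deviation check, here is a minimal sketch in Python. It is not taken from the text. The function name `center_and_reduce`, the threshold `low_std_threshold`, and the sample variables are hypothetical, and the use of the population standard deviation is an assumption. A fixed threshold depends on the units of each variable, so in practice it would be chosen with the process expert.

```python
import statistics


def center_and_reduce(columns, low_std_threshold=1e-3):
    """Center and reduce each variable; report constant and near-constant ones.

    columns: dict mapping a variable name to a list of floats.
    low_std_threshold: hypothetical, scale-dependent cutoff (an assumption).
    """
    standardized, dropped, suspicious = {}, [], []
    for name, values in columns.items():
        mean = statistics.fmean(values)
        std = statistics.pstdev(values)  # population standard deviation (assumption)
        if std == 0.0:
            # A constant variable carries no information: ignore it.
            dropped.append(name)
            continue
        if std < low_std_threshold:
            # Very low variability: flag it for expert review, but keep it.
            suspicious.append(name)
        standardized[name] = [(v - mean) / std for v in values]
    return standardized, dropped, suspicious


# Hypothetical data, for illustration only.
data = {
    "temperature": [20.0, 22.0, 24.0],
    "valve_open": [1.0, 1.0, 1.0],
    "pressure": [1.0000, 1.0001, 1.0002],
}
standardized, dropped, suspicious = center_and_reduce(data)
print("dropped:", dropped)
print("suspicious:", suspicious)
```

After this transformation, each retained variable has zero mean and unit standard deviation. The flagged variables are only candidates for removal, not proven anomalies.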

3.2.2 Preprocessing Outputs for Supervised Classification

Preprocessing of outputs is linked to output encoding. For supervised classification (described in detail in Chap. 6), the encoding of outputs is associated with posterior probabilities, so that the problem of preprocessing is irrelevant: encoding posterior probabilities leads to representing each class by an output neuron with a logistic activation function. The associated cost is cross-entropy rather than the least-squares cost. For two-class discrimination, where y and y* are the network output and the desired class code respectively, cross-entropy is defined by

$$J = -\left[\, y^{*} \ln y + (1 - y^{*}) \ln(1 - y) \,\right].$$
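
To make the two-class cost concrete, the following sketch evaluates cross-entropy for a logistic output and compares it with the squared error. It assumes the class code y* takes the values 0 or 1, and it writes the cost with a leading minus sign so that it is a quantity to minimize. The function names and the clipping constant `eps` are illustrative choices, not part of the text.

```python
import math


def logistic(v):
    """Logistic activation: maps a potential v to an output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def cross_entropy(y, y_star, eps=1e-12):
    """Two-class cross-entropy for output y and class code y_star (0 or 1).

    y is clipped to [eps, 1 - eps] to avoid log(0); eps is a hypothetical
    numerical safeguard, not part of the definition.
    """
    y = min(max(y, eps), 1.0 - eps)
    return -(y_star * math.log(y) + (1.0 - y_star) * math.log(1.0 - y))


def squared_error(y, y_star):
    """Least-squares cost for a single example, for comparison."""
    return (y - y_star) ** 2


# An undecided output (y = 0.5) for an example of class 1 costs ln 2.
y_undecided = logistic(0.0)
print(cross_entropy(y_undecided, 1.0))

# A confident but wrong output is penalized far more by cross-entropy
# than by the squared error, which can never exceed 1.
y_wrong = 0.01
print(cross_entropy(y_wrong, 1.0), squared_error(y_wrong, 1.0))
```

The cost approaches zero when the output agrees with the class code and grows without bound as the output approaches the wrong extreme, whereas the squared error stays bounded.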
