Fig. 1.27. A multilayer perceptron with C outputs for classification. The
activation functions of the output neurons are sigmoids.
There are several important differences between a multilayer perceptron
for classification and a multilayer perceptron for regression.
• The activation functions of the output neurons of neural networks for
modeling are usually linear; by contrast, the output neurons of neural
networks for classification have nonlinear activation functions such as
sigmoids: since the outputs of the neural network are probabilities, they
must lie between 0 and 1 (readily rescaled to [−1, +1]); in Chap. 6, a
theoretical justification for the use of the tanh function as an activation
function of output neurons will be given,
• For classification, minimizing the cross-entropy cost function is more
natural than minimizing the least-squares cost function [Hopfield 1987; Baum
1988; Hampshire 1990]; the training algorithms that will be described in
Chap. 2 can readily be applied to this cost function,
$$
J = -\sum_{k} \sum_{i=1}^{C} \left[ \gamma_i \log \frac{g_i(x_k)}{\gamma_i} + (1 - \gamma_i) \log \frac{1 - g_i(x_k)}{1 - \gamma_i} \right],
$$
where $\gamma_i$ is the desired value (0 or 1) for output $i$ when the
classifier's input is example $k$, described by feature vector $x_k$, and
$g_i(x_k)$ is the value of output $i$ of the classifier. That function is
minimum when all examples are correctly classified.
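As a minimal sketch of the cross-entropy cost, assume the desired values are exactly 0 or 1. With the convention 0 · log 0 = 0, the cost then reduces to the familiar form −Σ [γ log g + (1 − γ) log(1 − g)]. The function names `sigmoid` and `cross_entropy`, the clipping constant `eps`, and the numerical potentials are illustrative assumptions, not taken from the text.

```python
import math

def sigmoid(v):
    """Logistic sigmoid: maps a real-valued potential v into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def cross_entropy(targets, outputs, eps=1e-12):
    """Cross-entropy cost summed over examples k and outputs i.

    targets[k][i] is the desired value (0 or 1) for output i on example k;
    outputs[k][i] is the classifier output g_i(x_k), assumed to lie in [0, 1].
    eps is a hypothetical guard that keeps log() away from 0.
    """
    cost = 0.0
    for gamma_k, g_k in zip(targets, outputs):
        for gamma, g in zip(gamma_k, g_k):
            g = min(max(g, eps), 1.0 - eps)  # clip to avoid log(0)
            cost -= gamma * math.log(g) + (1.0 - gamma) * math.log(1.0 - g)
    return cost

# Two examples, C = 3 classes; the potentials are made-up numbers.
targets = [[1, 0, 0], [0, 0, 1]]
good = [[sigmoid(v) for v in row]
        for row in [[4.0, -4.0, -4.0], [-4.0, -4.0, 4.0]]]
bad = [[sigmoid(v) for v in row]
       for row in [[-4.0, 4.0, -4.0], [4.0, -4.0, -4.0]]]

# Outputs that agree with the targets give the lower cost.
print(cross_entropy(targets, good) < cross_entropy(targets, bad))
```

The comparison at the end illustrates the remark above: the cost shrinks as the outputs approach the desired 0/1 values.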
After training, it is prudent to check that the outputs sum to 1 for every
example. The Softmax technique [Bridle 1990] guarantees that this condition
is fulfilled automatically. Of course, this is not an issue for pairwise
classifiers, which have a single output.
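The text does not spell out the Softmax formula. The sketch below assumes its usual definition, g_i = exp(v_i) / Σ_j exp(v_j), where v_i denotes the potential of output neuron i. The name `softmax` and the sample potentials are illustrative only.

```python
import math

def softmax(potentials):
    """Map C real-valued potentials to C positive outputs that sum to 1."""
    shift = max(potentials)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in potentials]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up potentials for a C = 3 classifier.
outputs = softmax([2.0, -1.0, 0.5])

# The normalization makes the outputs sum to 1, up to rounding error.
print(abs(sum(outputs) - 1.0) < 1e-12)
```

Because each output is divided by the same total, the sum-to-one condition holds by construction and need not be verified after training.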
The question of overfitting, which we have encountered in nonlinear
regression, also arises in discrimination. If the classifier is
overparameterized, it separates the patterns of the training set very
accurately but generalizes poorly. Model selection techniques, such as those
described in Chap. 2, must be used in order to select the best model.
Essentially, one must