Let U = (X_ds, T_ds) denote the training set extracted from a given dataset
and U_j the subsets obtained by randomly dividing U into several groups with
an equal number of samples, such that

|U| = r + \sum_{j=1}^{L} |U_j| ,                                   (6.10)
where L is the number of subsets and r the remainder. This division is per-
formed in each epoch of the learning phase. The partition of the training
set into subsets reduces the probability of the algorithm getting trapped in
local minima since it is performed in a random way. The subsets are sequen-
tially presented to the learning algorithm, which applies to each one, in batch
mode, the respective back-propagation and subsequent weight update.
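A minimal sketch of one such training epoch is given below, assuming a generic batch-mode routine backprop_update that performs the back-propagation and weight update; the function and variable names are illustrative, not those of [202]:

import numpy as np

def batch_sequential_epoch(X, T, L, backprop_update, rng):
    # One epoch of the MEE batch-sequential (MEE-BS) scheme: randomly
    # divide the training set into L equally sized subsets (leaving a
    # remainder of r samples aside) and present each subset, in batch
    # mode, to the back-propagation and weight-update step in turn.
    n = len(X)
    perm = rng.permutation(n)          # random division, redone every epoch
    subset_size = n // L               # r = n - L * subset_size samples left over
    for j in range(L):
        idx = perm[j * subset_size:(j + 1) * subset_size]
        backprop_update(X[idx], T[idx])   # batch update on subset U_j

Calling batch_sequential_epoch once per epoch, with a fresh random permutation each time, reproduces the re-partitioning described above.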
One of the advantages of using the batch-sequential algorithm is the de-
crease of algorithm complexity. The complexity of the original MEE algo-
rithm is O(|U|^2); with the MEE-BS algorithm the complexity is proportional
to L(|U|/L)^2, which means that, in terms of computational time, one achieves
a reduction proportional to L . The number of subsets, L , is influenced by the
size of the training set. One should avoid using subsets with fewer than 40
samples [202].
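The source of this reduction is easy to check: since the MEE risk estimate involves all pairs of samples in a batch (hence the O(|U|^2) cost), processing L subsets of |U|/L samples costs on the order of

L (|U|/L)^2 = |U|^2 / L

operations per epoch instead of |U|^2. As an illustrative figure (not one reported in [202]), |U| = 1000 with L = 5 yields subsets of 200 samples, well above the lower bound of 40, and roughly 5 · 200^2 = 2 × 10^5 pairwise evaluations per epoch instead of 10^6.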
Classifier performance using the MEE-BS algorithm seems to be quite
insensitive to "reasonable" choices of the number of subsets, L . Table 6.3
shows the best results in experiments reported in [202] on four real-world
datasets. No statistically significant variation of the error rate statistics (mean
and standard deviation over 20 repetitions of the hold-out method) was found
for two different values of L . The same conclusion was found to hold when
the number of epochs and of hidden neurons used in these experiments were
varied. Regarding the processing time per epoch, as also shown in
Table 6.3, the MEE-BS algorithm was found to be up to six times faster than
the MEE-VLR algorithm.
The batch-sequential algorithm can also be implemented with variable
learning rate. However, the simple "global" updating rule described in the
previous section cannot be applied. The reason is simple: MEE-VLR compares
the error entropy of a given epoch with its value in the previous epoch for the
same samples, whereas the batch-sequential algorithm uses a different set of
subsets in each epoch.
Instead of the simple procedure described in the preceding section, one may
vary the learning rate by comparing the gradients obtained in consecutive
iterations. Two learning
rate updating rules can then be incorporated into the MEE-BS algorithm:
either Silva and Almeida's rule [210] (MEE-BS(SA)) or the resilient back-
propagation rule [186] (MEE-BS(RBP)). Both variants of the MEE-BS algo-
rithm are described in detail in [202]. An example of the training phase using
the three methods (MEE-BS, MEE-BS(SA), and MEE-BS(RBP)) is shown
in Fig. 6.9.
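Both rules adapt each weight's learning rate from the sign of its gradient in consecutive iterations. As a rough illustration of this idea (the factors u and d, the bounds, and the names below are illustrative assumptions, not the exact settings of MEE-BS(SA) or MEE-BS(RBP) given in [202]):

import numpy as np

def silva_almeida_step(w, grad, prev_grad, lr, u=1.2, d=0.8,
                       lr_min=1e-6, lr_max=1.0):
    # Per-weight learning-rate adaptation in the style of Silva and
    # Almeida's rule: increase the rate where the gradient keeps its
    # sign between consecutive iterations, decrease it where the sign
    # flips (u > 1, d < 1).
    same_sign = grad * prev_grad > 0
    lr = np.where(same_sign, lr * u, lr * d)
    lr = np.clip(lr, lr_min, lr_max)
    w = w - lr * grad                  # gradient-descent step with adapted rates
    return w, lr

The resilient back-propagation variant differs mainly in using only the sign of the gradient (not its magnitude) in the weight update, with similarly adapted per-weight step sizes.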