that accrues to the new EF due to outliers would be smaller than that with the quadratic
EF, thereby resulting in a better estimate. Similarly, the other EFs offer advantages
that are contingent upon the nature of the problem concerned.
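As a quick illustration of this robustness, the following sketch compares the contribution of a single outlier to the quadratic EF and to an absolute EF; the residual values here are assumed purely for illustration and are not from the text.

```python
import numpy as np

# Assumed residuals from some fit; the last point is an outlier.
errors = np.array([0.2, -0.1, 0.3, -0.2, 8.0])

quadratic_ef = 0.5 * np.sum(errors ** 2)  # outlier alone contributes 0.5 * 64 = 32.0
absolute_ef = np.sum(np.abs(errors))      # outlier contributes only 8.0

print(f"quadratic EF: {quadratic_ef:.2f}")  # 32.09, dominated by the outlier
print(f"absolute  EF: {absolute_ef:.2f}")   # 8.80, outlier on the same scale as the data
```

The outlier dominates the quadratic EF, whereas its contribution to the absolute EF stays on the same scale as the genuine residuals.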
The presented EFs, which are generally employed in statistical analysis, will be
useful for developing the real backpropagation (RBP) algorithm and the complex
backpropagation (CBP) algorithm for training the ANN and the CVNN respectively.
The update rule of the backpropagation algorithm demands that the EF be at least
once differentiable. Finitely or countably many points of non-differentiability can
always be bypassed by defining the update rule accordingly: the real line is broken
into finitely or countably many intervals, and a form of the update rule is developed
on each interval separately. Of the many parameters that must be set for running
the algorithm, the initial weights, the biases, and the architecture must be kept fixed
in order to study the influence of the EF while the EF-based training algorithms run.
Each of the EFs has unique properties that the statistical analysis exploits when
implementing them. In each case, the EF may be generalized to complex variables
by retaining its form while extending it to accommodate complex numbers in the
backpropagation algorithm.
3.3.3.1 Absolute Error Function
The absolute EF is continuous throughout the real line and is differentiable at all
points on the line except at the origin. As the real line is partitioned into two dis-
connected sets by the origin (the only point where the function is not differentiable),
the update rule has a two-part definition: one when the error is positive and one
when the error is negative. The absence of an index (the power, unlike in the
quadratic EF) is a distinguishing feature of this EF, as it smooths out the ill effects
of the outlier points that would otherwise have skewed the best fit of the optimisation
scheme. The contribution to the EF from the outlier points is on the same scale
as that from the actual data points of the problem, and hence the ill effects due to
spurious points are nullified to a great extent. On the other hand, if the data were
normalised to a specific region so that all the entries in the data set are small real
numbers lying in [−1, 1], the contribution from the outliers is once again on the
same scale as that from the actual data points. The gradient for both parts of the
definition in the update rule is directed toward the origin.
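The two-part update rule can be made concrete for a single real-valued linear neuron; this is a minimal sketch under that assumption, not the book's exact formulation. For E = |d − y| with y = w · x, the gradient with respect to w is −sign(d − y) · x, defined separately on the two intervals of the error, with the convention of a zero step at the breakpoint.

```python
import numpy as np

def abs_ef_update(w, x, d, eta=0.01):
    """One gradient-descent step under the absolute EF, E = |d - w.x|."""
    e = d - np.dot(w, x)        # error for this pattern
    grad = -np.sign(e) * x      # piecewise gradient: -x for e > 0, +x for e < 0
    return w - eta * grad       # np.sign(0) = 0 handles the origin

w = np.zeros(3)
x = np.array([0.5, -0.2, 0.1])
w = abs_ef_update(w, x, d=1.0)  # step is the same size whether |e| is large or small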
The complex EF as a whole is not differentiable at the zeros of the error, as the
function inside the radical is non-negative and the square root is not differentiable
at the origin. It has all the complex weights of the CVNN in its definition. The
update rule for the CVNN steers the real part and the imaginary part of the weights
to the minima separately. The problem of local minima that existed in the ANN
recurs while studying the CVNN in general, and this EF-based algorithm in
particular. Clearly, the initial weights and the learning parameter decide how the
training progresses. The dynamics of the real part depend not only on the real
parts of the weights but also on the imaginary parts, as the updates of the real and
imaginary parts are coupled (dependent on each other).
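The coupling can be seen in a sketch of the update for a single complex linear neuron y = w · x under E = |d − y| = √((d − y)(d − y)*); the neuron model and step size here are assumptions for illustration, not the book's network. The gradients with respect to the real and imaginary parts of w each involve the full complex error, so the two updates depend on each other.

```python
import numpy as np

def complex_abs_ef_update(w, x, d, eta=0.01, eps=1e-12):
    """One descent step for the complex absolute EF on y = w.x."""
    e = d - np.dot(w, x)                      # complex error
    mag = max(abs(e), eps)                    # guard the non-differentiable zero
    # Real-coordinate gradients of |e|; both mix Re(e) and Im(e),
    # which is the coupling between the two parts of the update.
    grad_re = -np.real(np.conj(e) * x) / mag  # dE / d Re(w)
    grad_im = np.imag(np.conj(e) * x) / mag   # dE / d Im(w)
    return w - eta * (grad_re + 1j * grad_im)

rng = np.random.default_rng(0)
w = np.zeros(2, dtype=complex)
x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
w = complex_abs_ef_update(w, x, d=1.0 + 0.5j)
```

Combining the two partial derivatives gives the compact step w ← w + η e x̄ / |e|: the real and imaginary parts are updated separately, yet each step is determined by the whole complex error.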