problem, some algorithms for minimization used the ideas of random search techniques instead of derivatives (Brent 1973). In addition, methods to prune synaptic links that are least sensitive to the training process have been proposed (Karnin 1990). This pruning improves network generalization by decreasing the number of weights, reduces network complexity, and lowers the required computation.
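A minimal sketch of this kind of sensitivity-based pruning is given below, assuming a single weight array and a precomputed per-weight sensitivity estimate; the keep fraction and threshold rule are illustrative assumptions, not Karnin's exact measure.

    import numpy as np

    def prune_least_sensitive(weights, sensitivity, keep_fraction=0.8):
        """Zero out the links whose estimated sensitivity is smallest.

        weights       : array of trained weights
        sensitivity   : array of the same shape, e.g. accumulated |dE/dw * dw|
        keep_fraction : fraction of links to keep (illustrative choice)
        """
        flat = sensitivity.ravel()
        k = int(len(flat) * (1.0 - keep_fraction))       # number of links to remove
        if k == 0:
            return weights
        threshold = np.partition(flat, k - 1)[k - 1]     # k-th smallest sensitivity
        mask = sensitivity > threshold                   # keep only the more sensitive links
        return weights * mask

After pruning, the smaller network is usually retrained briefly so the surviving weights can compensate for the removed links.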
As applied to differential equations, neural networks have been used to study chaos in dynamical systems (Aihara et al. 1990). Complex chaotic neural networks were studied by Hirose (1992). Differential equations were modeled using neural networks, and solutions of the equations were studied by feeding chaotic initial conditions to the network. Along the same lines, but for small-architecture neural networks with delay, Francois and Chauvet (1992) reported the dynamics of neural networks, classifying the regimes as stable, unstable, or oscillatory. Ishi et al. (1996) applied chaotic neural networks to information processing.
Fletcher and Reeves (1964) and Beale (1972) developed the idea of the conjugate gradient algorithm. Battiti (1992) later reported an efficient method to compute the conjugate gradient, and Charalambous (1992) took the next step by developing a conjugate gradient-based BPA incorporating these efficient methods. The scaled conjugate gradient algorithm was put forth by Moller (1993), in which a scaling based on the position in weight space is used in conjunction with the conjugate direction-based update of weights.
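A minimal sketch of a conjugate-direction weight update in the Fletcher-Reeves form follows; the gradient function, fixed step size, and restart policy are illustrative assumptions, and Moller's scaled variant additionally adapts a scaling term so that no line search is needed.

    import numpy as np

    def conjugate_gradient_step(w, grad_fn, d_prev=None, g_prev=None, lr=0.01):
        """One conjugate-direction update of the weight vector w.

        grad_fn : function returning dE/dw at w
        d_prev  : previous search direction (None on the first step)
        g_prev  : previous gradient (None on the first step)
        lr      : fixed step size standing in for a proper line search
        """
        g = grad_fn(w)
        if d_prev is None:
            d = -g                                  # first step: plain steepest descent
        else:
            beta = (g @ g) / (g_prev @ g_prev)      # Fletcher-Reeves coefficient
            d = -g + beta * d_prev                  # direction conjugate to the previous one
        w_new = w + lr * d                          # step along the conjugate direction
        return w_new, d, g

In practice the step length along the direction d is chosen by a line search (or, in the scaled conjugate gradient, by a model-trust-region estimate), and the direction is periodically reset to the negative gradient.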
The Levenberg-Marquardt algorithm, put forth by Hagan and Menhaj (1994), provides a highly efficient way of training with gradient descent while embedding the good features of second-order algorithms (which involve computing the Hessian): it uses an approximation to the Hessian and bypasses its actual computation.
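The core of the update can be sketched as follows for a sum-of-squares error with residual vector r and Jacobian J; the fixed damping value shown is an illustrative assumption, not necessarily the schedule used by Hagan and Menhaj.

    import numpy as np

    def levenberg_marquardt_step(w, residual_fn, jacobian_fn, mu=1e-3):
        """One Levenberg-Marquardt update of the weight vector w.

        residual_fn : function returning the error vector r(w)
        jacobian_fn : function returning J = dr/dw (n_residuals x n_weights)
        mu          : damping term blending Gauss-Newton and gradient descent
        """
        r = residual_fn(w)
        J = jacobian_fn(w)
        A = J.T @ J + mu * np.eye(w.size)    # J^T J approximates the Hessian of 0.5*||r||^2
        g = J.T @ r                          # gradient of the sum-of-squares error
        delta = np.linalg.solve(A, g)        # damped Gauss-Newton step
        return w - delta

A small mu makes the step close to a Gauss-Newton step, while a large mu shrinks it toward a scaled gradient-descent step; mu is typically decreased after a successful step and increased otherwise.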
The neural network is initialized with a set of weights, and the performance of the final network depends on these initial weights. The effect of adding noise during BP training on the final network's performance was studied by An (1995), while generalization performance as a function of weight initialization was studied by Amir et al. (1997). In the same year, Dai and MacBeth reported their observations on learning parameters and how they influence BPA-based training. To accumulate knowledge (training) in a neuron or neural network, a variety of learning processes have been devised to organize the correction of weights, starting from Hebb (1949), Minsky (1961), Ooyen (1992), and Haykin (1994), and continuing with Riedmiller (1993), Nitta (1997), and Tripathi (2011, 2012).
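In its simplest Hebbian form, for example, the weight correction is proportional to the correlation between pre- and post-synaptic activity; the sketch below is illustrative, and the learning rate is an assumed parameter.

    import numpy as np

    def hebbian_update(w, x, learning_rate=0.01):
        """One Hebbian correction of the weight vector w for input pattern x.

        The output y = w.x and the correction dw = eta * y * x strengthen
        weights on inputs that co-occur with high output activity.
        """
        y = w @ x                         # post-synaptic activity
        return w + learning_rate * y * x  # correlation-driven weight correction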
The size or structure of the neuron architecture that optimally suits a problem remains an open problem in neural networks to this day [4]. Furthering the effort in this direction, new types of neurons and neural network architectures with multivariable functions and functionals have been proposed: Cascade Correlation Networks by Fahlman and Lebiere (1990), the Pi-sigma network by Shin and Ghosh (1991), polynomial neural networks by Chen and Manry (1993), the higher-order neuron by Schmidt and Davis (1993), the Sum-of-Product neural networks by Chun (2000), quadratic and cubic neural units by Gupta (2003), the multivalued neuron by Aizenberg (2007), other non-conventional neural units by Homma et al. (2009), and the Triron by Tripathi (2010), to name a few. The compensatory neuron structure was employed by Kalra in 2000 for control problems in the aerospace industry and to determine satellite orbit motion. The idea of compensatory structure was better established by B.K. Tripathi