describing how to select an architecture to ensure convergence (Mirchandani and Cao 1989) were also reported. It should, however, be pointed out that most results in the area of neural networks are existential in nature; for instance, to establish the convergence of an algorithm, the theoretical results assure the existence of weights (Rudin 1976) that can approximate the data with functions of a specific form, but how those weights can be obtained is not explicitly stated.
Artificial neurons offer a learning capability that demonstrates artificial intelligence analogous to human intelligence. ANN-based techniques are robust, especially for ill-defined problems, and can handle uncertainty with ease. Moreover, these techniques are well suited to, and provide effective solutions for, various real-world problems where conventional methods are hard to apply. The earliest history of ANN is confined to the contributions of McCulloch and Pitts [1], Hebb [2], and Rosenblatt [3] to the design of the neuron. The historical notes reveal that the first successful attempt to develop an ANN architecture was made by McCulloch and Pitts (1943); from these incipient stages in 1943, and later with the discovery of the perceptron, the theory of neural networks went through ups and downs as the decades rolled by. The first successful model of an artificial neuron did not appear until the Adaptive LINear combiner (ADALINE) entered the scene in the 1960s, together with the Widrow-Hoff learning rule that trained ADALINE-based networks. The ADALINE, or Adaptive Linear Network, is a basic neural-network element that forms a weighted sum of the inputs incident on it. Linearly combining many ADALINEs gave rise to MADALINEs. The Widrow-Hoff rule trained MADALINE-based networks by minimizing a sum-squared error in pattern classification problems.
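To make the Widrow-Hoff rule concrete, here is a minimal Python sketch of least-mean-squares training of a single ADALINE; the function name train_adaline and all parameter choices are illustrative assumptions, not taken from the text. Each pattern's error between the desired output and the linear combiner's output drives a small correction of the weights, and repeating this over the data reduces the sum-squared error.

    import numpy as np

    def train_adaline(X, y, lr=0.01, epochs=50):
        """Widrow-Hoff (LMS) training of a single ADALINE.

        X: array of shape (n_samples, n_features), input patterns.
        y: array of shape (n_samples,), desired outputs (e.g. +1/-1 labels).
        """
        rng = np.random.default_rng(0)
        w = rng.normal(scale=0.01, size=X.shape[1])  # initial weights
        b = 0.0                                      # bias term
        for _ in range(epochs):
            for x_i, t in zip(X, y):
                out = np.dot(w, x_i) + b   # linear combiner output
                err = t - out              # error taken before any thresholding
                w += lr * err * x_i        # Widrow-Hoff (delta) update
                b += lr * err
        return w, b

Classification is then read off the sign of the trained linear output; the error used for learning is the pre-threshold one, which is what lets the rule minimize a smooth sum-squared error.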
Research in this direction did not gain momentum because the available computational power was insufficient to support the load imposed by the training algorithms; as a whole, research in the area slowed down drastically because of this impediment. The idea of multilayered networks was put forth by Nilsson (1965), but his concept did not receive much attention for lack of interest among researchers. Minsky and Papert (1969) published a work that further put neural networks in jeopardy, as it questioned the potential of ANNs as computational tools by exposing the limitations of the perceptron. Research almost stopped after the publication of Minsky and Papert's work, and little happened for about 20 years between 1965 and 1984 as the number of researchers in the area dwindled. But the few who pursued neural networks during that period made lasting contributions: a mathematical theory of neural networks was developed by Shun-Ichi Amari (1972, 1977); Fukushima (1980) developed the Neocognitron; associative memory was developed by Teuvo Kohonen (1977, 1980); and the cellular neural network was introduced by Yang (1988).
Employing the idea of gradient descent, Werbos (1974) worked out a consistent method (one that worked universally) to obtain the weights, long after Kolmogorov published the first theoretical result. Werbos' discovery of BP was an important milestone in the history of ANN, though it was not fully free from bottlenecks.
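One such bottleneck is the local-minimum problem taken up in the next paragraph. As an illustration (and not as BP itself), the short Python sketch below runs plain gradient descent, the weight update at the heart of BP, on an invented one-dimensional error surface; every function name and constant in it is an assumption made only for this example.

    def loss(w):
        # Contrived non-convex "error surface" with two minima of unequal depth.
        return w**4 - 2.0 * w**2 + 0.3 * w

    def grad(w):
        return 4.0 * w**3 - 4.0 * w + 0.3

    def gradient_descent(w0, lr=0.01, epochs=500):
        w = w0
        for _ in range(epochs):
            w -= lr * grad(w)  # the same update BP applies to every weight
        return w

    for start in (-2.0, 2.0):
        w_end = gradient_descent(start)
        print(f"start {start:+.1f} -> w = {w_end:+.3f}, loss = {loss(w_end):.3f}")

Started from w = -2.0 the iterates settle in the deeper of the two minima, while from w = +2.0 they freeze in the shallower one, mirroring the dependence on the initial condition discussed next.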
The problem of local minima while training with BP needs special mention. Depending on the initial condition to which the network is set, BP might steer the neural network into a local minimum at which the training process gets stuck (the weights are no longer updated but stay frozen even as the epochs run). To circumvent the