5.5 Artificial neural networks
Artificial neural networks (ANNs) are a machine learning approach that dates back to the 1940s (McCulloch and Pitts, 1943), although they did not become widely useful until the 1980s (Hecht-Nielsen, 1989). Neural networks are inspired by the human brain and are composed of artificial "neurons" connected by weighted edges. There are far too many different neural network algorithms to address in detail here, and there are many excellent textbooks on neural networks (Bishop, 2007; Haykin, 2008). Here we restrict ourselves to what is undoubtedly the most widely used ANN algorithm, the multi-layer perceptron (MLP). In an MLP, the neurons are arranged in layers: input nodes, which take the input data; output neurons, which output the results of the computation; and so-called hidden neurons, which are neither input nor output (Figure 2.12). An MLP may have one or more layers of hidden neurons.
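To make the layered structure concrete, here is a minimal sketch in Python (with NumPy) of the weight matrices for the 6-4-1 network of Figure 2.12; the random initialisation and variable names are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes for the network of Figure 2.12.
n_input, n_hidden, n_output = 6, 4, 1

# One weight matrix per layer of edges: entry [i, j] holds the weight
# on the edge from neuron j in one layer to neuron i in the next.
W_hidden = rng.normal(size=(n_hidden, n_input))   # input -> hidden
W_output = rng.normal(size=(n_output, n_hidden))  # hidden -> output
```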
Each neuron calculates the weighted sum of all of its inputs, runs this sum through a squashing function, and outputs the result. Learning in the ANN consists of iteratively modifying the weights on the edges in response to errors in classifying labelled training data, a process usually carried out using the backpropagation algorithm (Hecht-Nielsen, 1989).
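As an illustration of this weighted-sum-and-squash computation, and of backpropagation-style weight updates, the following sketch trains a small 6-4-1 network on a single labelled example. The squared-error loss, learning rate, and synthetic data are assumptions chosen for brevity; a practical implementation would use an established library and many training examples.

```python
import numpy as np

def sigmoid(z):
    """Squashing function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 6)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(size=(1, 4)); b2 = np.zeros(1)   # hidden -> output

x = rng.normal(size=6)   # one hypothetical training example
t = np.array([1.0])      # its class label

lr = 0.5  # learning rate (an arbitrary illustrative choice)
for step in range(100):
    # Forward pass: each neuron sums its weighted inputs, then squashes.
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)

    # Backward pass: propagate the squared-error gradient through the
    # sigmoid derivative s * (1 - s) and down the weighted edges.
    delta_out = (y - t) * y * (1.0 - y)
    delta_hid = (W2.T @ delta_out) * h * (1.0 - h)

    # Gradient-descent weight updates.
    W2 -= lr * np.outer(delta_out, h); b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x); b1 -= lr * delta_hid

print(y)  # after repeated updates, the output approaches the label 1.0
```

Each update moves the weights a small step down the error gradient, so repeated passes over the training data gradually reduce the classification error.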
A very common squashing function for the neurons is the sigmoid function, which is described by the equation y = 1/(1 + e^(-x)) (Figure 2.13).
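A quick numerical check (a sketch with arbitrary input values) shows how the sigmoid squashes inputs of any magnitude into the interval (0, 1):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for x in [-100.0, -5.0, 0.0, 5.0, 100.0]:
    print(f"sigmoid({x:6.1f}) = {sigmoid(x):.6f}")
# Large negative inputs approach 0, large positive inputs approach 1,
# and sigmoid(0) is exactly 0.5 -- the usual classification threshold.
```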
A function such as the sigmoid can take input values of any magnitude and output a value between 0 and 1, thereby keeping the results of the computation performed by the ANN within prescribed limits. In the case of the example network in Figure 2.12, for identifying protein location, the output of the single output node would be between 0 and 1, and would usually be thresholded at 0.5. If the training
FIGURE 2.12
A fully connected feed-forward multi-layer perceptron. This example has six input nodes, four hidden nodes, and one output node, but the architecture is flexible based upon the needs of the problem.