In order to compute its gradient, one can compute the gradient of the
partial cost function J k ( w ) related to observation k , and subsequently sum
over all examples.
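In symbols, with $J(w)$ denoting the total cost and $J_k(w)$ the partial cost of example $k$ as in the text:

$$ \frac{\partial J(w)}{\partial w_{ij}} = \sum_k \frac{\partial J_k(w)}{\partial w_{ij}} . $$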
Backpropagation consists essentially of the repeated application of the chain rule. First, one notices that the partial cost function depends on $w_{ij}$ only through the value of the output of neuron $i$, which itself is a function of the potential of neuron $i$ only; therefore, one has
$$ \frac{\partial J_k}{\partial w_{ij}} = \left(\frac{\partial J_k}{\partial v_i}\right)_k \left(\frac{\partial v_i}{\partial w_{ij}}\right)_k = \delta_i^k \, x_j^k , $$
where
• $(\partial J_k / \partial v_i)_k = \delta_i^k$ is the value of the gradient of the partial cost function with respect to the potential of neuron $i$ when the inputs of the network are the variables of example $k$;
• $(\partial v_i / \partial w_{ij})_k$ is the value of the partial derivative of the potential of neuron $i$ with respect to parameter $w_{ij}$ when the inputs of the network are the variables of example $k$;
• $x_j^k$ is the value of input $j$ of neuron $i$ when the inputs of the network are the variables of example $k$.
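The factor $(\partial v_i / \partial w_{ij})_k = x_j^k$ can be illustrated directly: the potential is linear in the weights. A minimal numerical check (a single three-input neuron; the tanh nonlinearity plays no role here, and all names and values below are illustrative assumptions, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: one neuron i with three inputs x_j (shapes are assumptions).
x = rng.normal(size=3)        # inputs x_j of neuron i for example k
w_i = rng.normal(size=3)      # weights w_ij of neuron i

def potential(w, x):
    """Potential v_i of neuron i: weighted sum of its inputs."""
    return w @ x

# Analytically, dv_i/dw_ij = x_j, since the potential is linear in the weights.
grad_analytic = x

# Finite-difference check of dv_i/dw_ij for each j.
eps = 1e-6
grad_fd = np.array([
    (potential(w_i + eps * e, x) - potential(w_i - eps * e, x)) / (2 * eps)
    for e in np.eye(3)
])

print(np.allclose(grad_analytic, grad_fd))  # True
```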
As the sketch above suggests, the computation of the last two quantities is straightforward. The only problem is the computation of $\delta_i^k$ on the right-hand side of the equation. These quantities can advantageously be computed recursively, from the outputs to the inputs, as follows.
For output neuron $i$,

$$ \delta_i^k = \left(\frac{\partial J_k}{\partial v_i}\right)_k = \left(\frac{\partial}{\partial v_i}\bigl(y_p - g(x, w)\bigr)^2\right)_k = -2\bigl(y_p^k - g(x^k, w)\bigr)\left(\frac{\partial g(x, w)}{\partial v_i}\right)_k . $$
The output $g(x, w)$ of the model is the output $y_i$ of the output neuron; therefore, the above relation can be written as

$$ \delta_i^k = -2\bigl(y_p^k - g(x^k, w)\bigr) f'(v_i^k), $$

where $f'(v_i^k)$ is the derivative of the activation function of the output neuron when the network inputs are those of example $k$. Usually, for a feedforward neural network designed for modeling, the activation function of the output neuron is linear, so that the above relation reduces to

$$ \delta_i^k = -2\bigl(y_p^k - g(x^k, w)\bigr). $$
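This can be checked numerically. In the minimal sketch below (the scalar values and names are illustrative assumptions), the analytic $\delta_i^k$ for a linear output neuron is compared with a finite-difference derivative of the partial cost:

```python
import numpy as np

# Toy least-squares partial cost for one example k (values are assumptions):
# the output neuron is linear, so its output g equals its potential v_i.
y_p = 1.3          # desired output y_p^k
v_i = 0.4          # potential of the output neuron for example k
g = v_i            # linear activation: g(x^k, w) = v_i, hence f'(v_i) = 1

def J_k(v):
    """Partial cost of example k as a function of the output potential."""
    return (y_p - v) ** 2

# delta_i^k = dJ_k/dv_i = -2 (y_p^k - g) for a linear output neuron.
delta_analytic = -2.0 * (y_p - g)

eps = 1e-6
delta_fd = (J_k(v_i + eps) - J_k(v_i - eps)) / (2 * eps)
print(np.allclose(delta_analytic, delta_fd))  # True
```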
For hidden neuron $i$, the cost function depends on the potential of neuron $i$ only through the potentials of the neurons $m$ that receive the output of neuron $i$, i.e., all neurons that are adjacent to neuron $i$ in the graph of connections of the network and are located between that neuron and the output:
$$ \delta_i^k = \left(\frac{\partial J_k}{\partial v_i}\right)_k = \sum_m \left(\frac{\partial J_k}{\partial v_m}\right)_k \left(\frac{\partial v_m}{\partial v_i}\right)_k = \sum_m \delta_m^k \left(\frac{\partial v_m}{\partial v_i}\right)_k . $$
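Putting the two rules together gives the full backward recursion. The following sketch assembles it for a small network; the one-hidden-layer architecture, the tanh hidden activation, and all shapes and names are illustrative assumptions. For the single linear output neuron, $\partial v_m / \partial v_i = w_{mi} f'(v_i)$, and the resulting gradients $\delta_i^k x_j^k$ are verified against finite differences:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy feedforward network (shapes and names are illustrative assumptions):
# tanh hidden layer, single linear output neuron, least-squares partial cost.
x   = rng.normal(size=4)        # inputs of example k
W1  = rng.normal(size=(3, 4))   # hidden weights w_ij
w2  = rng.normal(size=3)        # output weights
y_p = 0.7                       # desired output y_p^k

def forward(W1, w2, x):
    v_h = W1 @ x                # hidden potentials v_i
    y_h = np.tanh(v_h)          # hidden outputs (inputs of the output neuron)
    g   = w2 @ y_h              # linear output neuron: g = v_out
    return v_h, y_h, g

v_h, y_h, g = forward(W1, w2, x)

# Backward recursion, from the output toward the inputs.
delta_out = -2.0 * (y_p - g)          # output neuron (linear, so f' = 1)
# Hidden neuron i: the only downstream neuron m is the output neuron, and
# dv_m/dv_i = w2[i] * f'(v_i) with f'(v) = 1 - tanh(v)^2.
delta_h = delta_out * w2 * (1.0 - y_h ** 2)

grad_W1 = np.outer(delta_h, x)        # dJ_k/dw_ij = delta_i^k * x_j^k
grad_w2 = delta_out * y_h             # same rule applied to the output weights

# Finite-difference check of dJ_k/dw_ij on the hidden weights.
def J_k(W1):
    return (y_p - forward(W1, w2, x)[2]) ** 2

eps, grad_fd = 1e-6, np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        E = np.zeros_like(W1); E[i, j] = eps
        grad_fd[i, j] = (J_k(W1 + E) - J_k(W1 - E)) / (2 * eps)

print(np.allclose(grad_W1, grad_fd))  # True
```

Because the recursion runs from the outputs toward the inputs, each $\delta_m^k$ is already available when $\delta_i^k$ is needed, so one forward pass and one backward sweep per example suffice.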