$$
\left(\frac{\partial y_m}{\partial w_{mj}}\right)^k
= \left(\frac{\partial y_m}{\partial v_m}\right)^k \left(\frac{\partial v_m}{\partial w_{mj}}\right)^k
= f'(v_m^k)\, x_j^k ,
$$
where $x_j^k$ is the value of input j of the network for example k.
For a neuron m that receives the quantity $x_j$ from input j of the network, or from neuron j, through other neurons of the network located between input (or neuron) j and neuron m,
$$
\left(\frac{\partial y_m}{\partial w_{ij}}\right)^k
= \left(\frac{\partial y_m}{\partial v_m}\right)^k \left(\frac{\partial v_m}{\partial w_{ij}}\right)^k
= f'(v_m^k) \sum_l \left(\frac{\partial v_m}{\partial y_l}\right)^k \left(\frac{\partial y_l}{\partial w_{ij}}\right)^k
= f'(v_m^k) \sum_l w_{ml} \left(\frac{\partial y_l}{\partial w_{ij}}\right)^k ,
$$
where subscript l denotes all neurons that are adjacent to neuron m in the graph of connections between neuron j (or input j) and neuron m.
By using those relations recursively, the derivatives of the output of each
neuron with respect to the parameters can be computed, from the inputs to
the outputs of the network.
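To make the forward recursion concrete, here is a minimal sketch in Python, under stated assumptions: the network is stored as a dictionary mapping connections (m, l) to weights w_ml, the activation function is tanh, and the node names and numerical values are invented for illustration only (none of them come from the text). forward_pass computes the potentials v_m and outputs y_m from the inputs to the outputs; forward_gradient then applies the recursion above to obtain the derivative of every neuron output with respect to a given weight w_ij.

```python
import math

def f(v):                                # activation function (illustrative choice)
    return math.tanh(v)

def f_prime(v):                          # its derivative
    return 1.0 - math.tanh(v) ** 2

def forward_pass(weights, neuron_order, inputs):
    """Compute the potential v_m and output y_m of every neuron."""
    y = dict(inputs)                     # node values: inputs, then neuron outputs
    v = {}
    for m in neuron_order:               # topological order, from inputs to outputs
        v[m] = sum(w * y[l] for (n, l), w in weights.items() if n == m)
        y[m] = f(v[m])
    return v, y

def forward_gradient(weights, neuron_order, v, y, i, j):
    """Forward recursion: d y_m / d w_ij for every neuron m of the network."""
    dy = {node: 0.0 for node in y}
    dy[i] = f_prime(v[i]) * y[j]         # direct term: f'(v_i) * x_j
    for m in neuron_order:
        if m == i:
            continue
        s = sum(w * dy[l] for (n, l), w in weights.items() if n == m)
        dy[m] = f_prime(v[m]) * s        # f'(v_m) * sum_l w_ml * d y_l / d w_ij
    return dy

# Illustrative two-input, two-hidden-neuron, one-output network (invented values).
weights = {('h1', 'x1'): 0.5, ('h1', 'x2'): -0.3,
           ('h2', 'x1'): 0.8, ('h2', 'x2'): 0.1,
           ('o', 'h1'): 1.2,  ('o', 'h2'): -0.7}
order = ['h1', 'h2', 'o']
v, y = forward_pass(weights, order, {'x1': 0.4, 'x2': -1.0})
print(forward_gradient(weights, order, v, y, 'h1', 'x1')['o'])   # d y_o / d w_{h1,x1}
```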
Once those derivatives are computed, the gradient of the partial cost function can be derived as
$$
\frac{\partial J^k}{\partial w_{ij}}
= \frac{\partial}{\partial w_{ij}} \left( y_p^k - g(x^k, w) \right)^2
= -2 \left( y_p^k - g(x^k, w) \right) \frac{\partial g(x^k, w)}{\partial w_{ij}} .
$$
Furthermore, $g(x^k, w)$ is the output of a neuron of the network; therefore, the last derivative can be computed recursively by the same procedure. Once the gradient of the partial cost has been computed for each example, the gradient of the total cost function is obtained by summation over all examples.
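Continuing the illustrative sketch above (it reuses the weights, order, forward_pass and forward_gradient defined there), the following lines compute the gradient of the partial cost for one example and sum it over a small, invented training set; the data and node names are again assumptions made only for the example.

```python
# Continues the previous sketch: reuses weights, order, forward_pass and
# forward_gradient. The training set below is purely illustrative.

def partial_cost_gradient(output_node, example, i, j):
    """dJ^k/dw_ij = -2 (y_p^k - g(x^k, w)) * dg(x^k, w)/dw_ij."""
    inputs, y_p = example
    v, y = forward_pass(weights, order, inputs)
    dg = forward_gradient(weights, order, v, y, i, j)[output_node]
    return -2.0 * (y_p - y[output_node]) * dg

def total_cost_gradient(output_node, training_set, i, j):
    """Gradient of the total cost: sum of the partial gradients over all examples."""
    return sum(partial_cost_gradient(output_node, ex, i, j) for ex in training_set)

training_set = [({'x1': 0.4, 'x2': -1.0}, 0.2),    # pairs (inputs x^k, measured output y_p^k)
                ({'x1': -0.6, 'x2': 0.3}, -0.5)]
print(total_cost_gradient('o', training_set, 'h1', 'x1'))
```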
Comparison Between Forward Computation of the Gradient of the Cost
Function and Backpropagation
The above discussion shows that backpropagation requires the evaluation of one gradient per neuron, whereas forward computation requires the evaluation of one gradient per connection. Since the number of connections is roughly the square of the number of neurons, the number of gradient evaluations is larger for forward computation of the gradient than for backpropagation.
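As a rough, purely illustrative order of magnitude: a fully connected network with about 100 neurons has on the order of 10,000 connections, so forward computation of the gradient performs roughly 10,000 derivative recursions per example, against roughly 100 for backpropagation.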
Therefore, backpropagation will be used for the evaluation of the gradient
of the cost function in the training of feedforward neural networks. For recur-
rent neural networks, however, forward computation is sometimes mandatory,
as shown in the section devoted to the training of recurrent neural networks.
Evaluation of the Gradient of the Cost Function under Constraint: The
Shared Weight Technique
When training recurrent neural networks, as discussed in the section devoted to black-box dynamic modeling and in Chap. 4, and when training some