Because the cost function (6.1) sums the error contributions of individual examples, its partial derivative with respect to a weight $w_{jk}$ from unit $j$ to unit $k$ is a sum of components $\partial E^{(i)}/\partial w_{jk}$ that can be computed for each example $i$ separately. The key idea of backpropagation is that $\partial E^{(i)}/\partial w_{jk}$ can be expressed in terms of a backpropagated error $\delta_k^{(i)}$ and the source activity $o_j^{(i)}$ present at the weight:

\[
\frac{\partial E^{(i)}}{\partial w_{jk}} = o_j^{(i)} \cdot \delta_k^{(i)}. \tag{6.4}
\]
The backpropagated error $\delta_k^{(i)}$ of a hidden unit is a weighted sum of the errors $\delta_l^{(i)}$ of all units $l$ receiving input from unit $k$, multiplied with the derivative of the transfer function $f_k$, which produces the output $o_k^{(i)} = f_k(\xi_k^{(i)})$ of unit $k$:

\[
\delta_k^{(i)} = \frac{df_k}{d\xi_k^{(i)}} \sum_l w_{kl}\, \delta_l^{(i)} \qquad \text{[hidden unit]}, \tag{6.5}
\]

with $\xi_k^{(i)} = \sum_j w_{jk}\, o_j^{(i)}$ describing the weighted sum of the inputs to $k$.
If unit $k$ is an output unit, its error component can be computed directly:

\[
\delta_k^{(i)} = \frac{df_k}{d\xi_k^{(i)}}\, \bigl(o_k^{(i)} - y_k^{(i)}\bigr) \qquad \text{[output unit]}, \tag{6.6}
\]

where $y_k^{(i)}$ is the component of the target vector $y^{(i)}$ that corresponds to unit $k$.
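As a concrete illustration, the following NumPy sketch computes the backpropagated errors and weight gradients of Equations (6.4)-(6.6) for a plain layered network. The layer layout, the variable names, and the derivative helper `f_prime` are assumptions made for this example, not code from the text.

```python
import numpy as np

def backprop_deltas(o, xi, W, y, f_prime):
    """Backpropagated errors and gradients for one example i,
    following Eqs. (6.4)-(6.6). Layers are indexed 0..L; W[l] holds
    the weights w_jk from layer l to layer l+1 (shape n_l x n_{l+1}),
    o[l] the outputs, and xi[l] the weighted input sums of layer l."""
    L = len(W)
    delta = [None] * (L + 1)
    # Output units, Eq. (6.6): delta_k = df_k/dxi_k * (o_k - y_k)
    delta[L] = f_prime(xi[L]) * (o[L] - y)
    # Hidden units, Eq. (6.5): delta_k = df_k/dxi_k * sum_l w_kl * delta_l
    for l in range(L - 1, 0, -1):
        delta[l] = f_prime(xi[l]) * (W[l] @ delta[l + 1])
    # Weight gradients, Eq. (6.4): dE/dw_jk = o_j * delta_k
    grads = [np.outer(o[l], delta[l + 1]) for l in range(L)]
    return delta, grads

# For f = tanh, the derivative is df/dxi = 1 - tanh(xi)^2:
f_prime = lambda xi: 1.0 - np.tanh(xi) ** 2
```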
The backpropagation technique can be applied to the Neural Abstraction Pyramid architecture. Since the basic processing element, described in Section 4.2.1, is a two-layered feed-forward neural network, directed acyclic graphs of such processing elements form a large feed-forward neural network with shared weights.
A simple modification is needed for the update of shared weights: the sum of all weight updates that have been computed for the individual instances of a weight is added to it. By replacing the weight instances with multiplicative units that receive an additional input from a single unit outputting the value of the shared weight, one can show that this indeed modifies the weight in the direction of the negative gradient [193].
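A minimal sketch of this shared-weight update, assuming the gradient of each weight instance has already been obtained via Equation (6.4); the function and argument names are illustrative, not from the text.

```python
def update_shared_weight(w, instance_grads, learning_rate):
    """Gradient-descent step for a shared weight: the updates computed
    for the individual instances of the weight are summed and applied
    once, moving w in the direction of the negative gradient."""
    total_grad = sum(instance_grads)   # accumulate over all instances
    return w - learning_rate * total_grad
```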
When implementing error backpropagation in the Neural Abstraction Pyramid,
one must also take care to handle the border effects correctly. The simplest case
is when the border cells of a feature array are set to a constant value. Since the
derivative of a constant is zero, the error component arriving at these border cells
does not need to be propagated any further. In contrast, if the activity of a border
cell is copied from a feature cell, the error component arriving at it must be added
to the error component of that feature cell.
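The two border cases might be handled as in the following sketch for a feature array with a one-cell border; the array layout, the `border_mode` flag, and the assumption that border activities are copied from the nearest interior cells are illustrative choices, not the book's implementation.

```python
import numpy as np

def fold_border_error(err, border_mode):
    """Fold the error arriving at the one-cell border of a feature
    array back into its interior. err has shape (h+2, w+2); the
    outermost ring is the border. Returns the (h, w) interior error."""
    interior = err[1:-1, 1:-1].copy()
    if border_mode == 'copy':
        # Border activities were copied from adjacent feature cells,
        # so their error components are added to those cells.
        interior[0, :]  += err[0, 1:-1]   # top edge
        interior[-1, :] += err[-1, 1:-1]  # bottom edge
        interior[:, 0]  += err[1:-1, 0]   # left edge
        interior[:, -1] += err[1:-1, -1]  # right edge
        interior[0, 0]   += err[0, 0]     # corners
        interior[0, -1]  += err[0, -1]
        interior[-1, 0]  += err[-1, 0]
        interior[-1, -1] += err[-1, -1]
    # 'constant' border: the derivative of a constant is zero,
    # so the border error is simply discarded.
    return interior
```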
Because the weights of a projection unit are stored as an adjacency list in the template of the unit, it is easiest to implement the sum in Equation 6.5 by accumulating contributions from the units receiving inputs from it. As the network is traversed