indexed 1 through k and that k < i (where i is the total number of inputs). The corresponding connection weights associated with edge ji between nodes j and i are w ij ( j = 1, …, k ). It is important to understand the manner in which the subscript of the connection weight w ij is written. The first subscript refers to the PE in question and the second subscript refers to the unit from which the incoming connection originated. The reverse of this notation is also used in the neural network literature. We refer to the weights w i = { w i 1 , …, w ik } as the incoming weights for unit u i . To simplify the notation, W is used for the vector w i .
Positive weights indicate reinforcement, negative weights represent inhibition, and convention
dictates that for each PE, there is an extra input unit u 0  whose output is always +1. The correspond-
ing weight for this input w i 0  is referred to as the bias θ i  for each unit i . The bias is otherwise treated
in the same manner as any other weight and its existence accounts for the difference between k and i
that was mentioned earlier. Thus, we can define the ( k + 1) input vector
u = [1, u 1 , u 2 , …, u k ] T
(13.1)
where T means 'the transpose of…' (in this case, signifying a column, not a row vector) and,
correspondingly, we can define the ( k + 1)-by-1 weight (also called connection weight or input
parameter ) vector
W = [θ i , w i 1 , …, w ik ] T
(13.2)
where T again means 'the transpose of…' (and again signifying a column, not a row vector).
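To make the notation concrete, the short sketch below (Python, assuming NumPy is available; all numerical values are purely illustrative) builds the augmented vectors of Equations 13.1 and 13.2 for a hypothetical PE with k = 3 incoming connections.

```python
import numpy as np

# Hypothetical PE with k = 3 incoming connections; the values are illustrative only.
inputs = np.array([0.2, -0.5, 0.9])    # u_1, ..., u_k
weights = np.array([0.4, 0.1, -0.3])   # w_i1, ..., w_ik
bias = 0.6                             # theta_i, the weight w_i0 on the constant unit u_0 = +1

# Equation 13.1: (k + 1)-element input vector with the constant +1 prepended.
u = np.concatenate(([1.0], inputs))

# Equation 13.2: (k + 1)-element weight vector with the bias in the first position.
W = np.concatenate(([bias], weights))
```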
The basic operation that is performed within each PE is the computation of that unit's activation or output signal u i . This involves the implementation of a transfer function φ i , which is itself composed of two mathematical functions, an integrator function ƒ i and an activation (or output) function ψ i :
u i = φ i ( u ) = ψ i ( ƒ i ( u ))
(13.3)
Typically, the same transfer function is used for all processing units within each individual layer of the CNN, although this is not a fixed requirement.
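A minimal sketch of Equation 13.3 follows, treating the transfer function as the composition of an integrator and an activation function. The logistic (sigmoid) activation used here is an assumption made only for illustration; it is just one of several possible choices.

```python
import numpy as np

def integrator(u, W):
    # Integrator function f_i: reduce the (k + 1) weighted inputs to a single net input v_i.
    return float(np.dot(W, u))

def activation(v):
    # Activation (output) function psi_i: a logistic sigmoid, chosen purely for illustration.
    return 1.0 / (1.0 + np.exp(-v))

def transfer(u, W):
    # Transfer function phi_i(u) = psi_i(f_i(u)), as in Equation 13.3.
    return activation(integrator(u, W))
```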
The job of the integrator function ƒ i is to integrate the incoming activations from all other units that are connected to the PE in question, together with the corresponding weights that have been assigned to the various incoming connections, and so transform (reduce) the k incoming arguments into a single value v i (called the net input or activation potential of the PE). In most but not all cases, ƒ i is specified as the inner product of the vectors u and W , as follows:
v i = ƒ i ( u ) = ⟨ u , W ⟩ = Σ w ij u j ( j = 0, 1, …, k )
(13.4)
where W has to be predefined or learned during the training phase. In the basic case, the net
input to a PE is just the weighted sum of the separate inputs from each of the k connected units
plus the bias term w i 0 . Because of the multiple individual weightings that are used to compute v i , a degree of network tolerance for noise and missing data is automatic (Gallant 1993). The
bias term represents the offset from the origin of the k -dimensional Euclidean space ℜ k   to the
hyperplane normal to W defined by ƒ i . In other words, bias quantifies the amount of positive or
negative shift that is applied to the integrator function with respect to its zero marker in each PE.
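The sketch below computes the net input of Equation 13.4 in two equivalent ways, as the inner product of the augmented vectors and as the weighted sum of the k inputs plus the bias term; the vectors reuse the same illustrative values assumed above.

```python
import numpy as np

# Illustrative augmented vectors for a PE with k = 3 inputs (Equations 13.1 and 13.2).
u = np.array([1.0, 0.2, -0.5, 0.9])   # constant +1 followed by u_1, ..., u_k
W = np.array([0.6, 0.4, 0.1, -0.3])   # bias theta_i followed by w_i1, ..., w_ik

# Equation 13.4: net input as the inner product of u and W over j = 0, 1, ..., k.
v = float(np.dot(u, W))

# Equivalently: the weighted sum of the k inputs plus the bias term w_i0.
v_check = W[0] + float(np.sum(W[1:] * u[1:]))
assert np.isclose(v, v_check)
```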
This arrangement is called a first-order PE when ƒ i  is an affine (linear if w i 0 = 0) function of its input
vector u = [ u 1 , …, u k ] T . Higher-order PEs will arise when more complicated functions are used for
specifying ƒ i . For example, a second-order PE would be realised if ƒ i were specified as a quadratic form in u , say u T W u . This might then be viewed as an alternative generalisation to that which was considered in Equation 13.4.
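As a sketch of how such a higher-order PE might be specified, the quadratic-form integrator below uses a hypothetical k-by-k weight matrix in place of the weight vector; both the matrix and the input values are assumptions made only for this example.

```python
import numpy as np

def quadratic_integrator(u, W_matrix):
    # Second-order integrator: f_i(u) = u^T W u, a quadratic form in the input vector u.
    return float(u @ W_matrix @ u)

# Hypothetical k = 3 input vector and symmetric 3-by-3 weight matrix.
u = np.array([0.2, -0.5, 0.9])
W_matrix = np.array([[0.4, 0.1, 0.0],
                     [0.1, -0.3, 0.2],
                     [0.0, 0.2, 0.5]])

v = quadratic_integrator(u, W_matrix)
```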
The activation or output function, denoted by ψ i (·), defines the output of a processing unit in terms of its total input v i . There are various possibilities with regard to the exact specification of ψ i .