where $\alpha$ is the steepness parameter, the hyperbolic tangent function is a special case of the sigmoid function with an additive offset, and the unit step function is

$$
T(u) =
\begin{cases}
0 & \text{if } u < 0 \\
1 & \text{if } u \geq 0 .
\end{cases}
\qquad (7.40)
$$
The sigmoid function is particularly popular because it approximates an ideal threshold decision (cf. Fig. 7.6) while being differentiable; the latter property is needed during the training of the network.
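To make these relations concrete, a minimal Python sketch is given below; the function names, the use of NumPy, and the exact logistic form of the sigmoid are illustrative assumptions rather than the text's own definitions. It shows a sigmoid with steepness parameter alpha, the hyperbolic tangent expressed as a scaled and offset sigmoid, and the unit step of Eq. (7.40).

```python
import numpy as np

def sigmoid(u, alpha=1.0):
    """Logistic sigmoid with steepness parameter alpha (illustrative form)."""
    return 1.0 / (1.0 + np.exp(-alpha * u))

def tanh_via_sigmoid(u):
    """tanh(u) as a scaled and offset sigmoid: 2 * sigmoid(u, alpha=2) - 1."""
    return 2.0 * sigmoid(u, alpha=2.0) - 1.0

def unit_step(u):
    """Unit step T(u) of Eq. (7.40): 0 for u < 0, 1 for u >= 0."""
    return np.where(u < 0, 0.0, 1.0)

u = np.linspace(-3.0, 3.0, 7)
print(sigmoid(u, alpha=8.0))                          # a large alpha approaches the step T(u)
print(np.allclose(tanh_via_sigmoid(u), np.tanh(u)))   # True
```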
A multiplicity of different network topologies exists, of which the most important will be introduced next.
7.2.3.1 Feed Forward Neural Networks
The most commonly used form of feed forward neural networks (FNN) is the multilayer perceptron (MLP) [18]: it consists of a minimum of three layers, namely one input layer (typically without processing), one or more hidden layers, and an output layer. All connections feed forward from one layer to the next without any backward connections. MLPs classify each input feature vector over time independently of the others.
In general, an encoding of the outputs $\hat{y}_j$ with $j = 1, \ldots, M$ of the last layer, which can be written as the vector $\hat{\mathbf{y}}$, is required. A popular way is to provide one output neuron for regression and one per class in the case of classification. As an advantage, this provides a measure of confidence of the network: the 'softmax' function used as transfer function normalises the sum of all outputs to one in order to allow for an interpretation as the posterior probability $\hat{P}(j \mid \mathbf{x})$ of the final output:

$$
\hat{y}_j = \frac{e^{u_j}}{\sum_{j'=1}^{M} e^{u_{j'}}} = \hat{P}(j \mid \mathbf{x}) .
\qquad (7.41)
$$
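As an illustration of Eq. (7.41), a minimal softmax could be sketched in Python as follows; the subtraction of the maximum is only a numerical-stability convention and not part of the formula, and all names are chosen freely.

```python
import numpy as np

def softmax(u):
    """Normalise output activations u_1, ..., u_M to sum to one (Eq. 7.41),
    so that y_j can be read as an estimate of the posterior P(j | x)."""
    u = np.asarray(u, dtype=float)
    e = np.exp(u - u.max())   # subtracting the maximum only avoids overflow
    return e / e.sum()

y = softmax([2.0, 1.0, 0.1])
print(y, y.sum())             # approx. [0.659 0.242 0.099], sums to 1.0
```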
In the recognition phase, the computation proceeds step-wise from the input layer to the output layer. In each layer, the weighted sum of the inputs from the previous layer is computed for each neuron and transformed by the non-linearity. Using the softmax function at the outputs and the encoding described above, the recognised class is assigned by maximum search. As an alternative, one can choose, e.g., a binary encoding of the classes with the network's outputs.
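A compact sketch of this recognition pass for an MLP with a single hidden layer is given below; the layer sizes, the random weights, and the choice of a sigmoid hidden non-linearity are assumptions made purely for illustration.

```python
import numpy as np

def mlp_forward(x, W_hidden, b_hidden, W_out, b_out):
    """One recognition step of a minimal MLP: a weighted sum per layer,
    a sigmoid non-linearity in the hidden layer, and softmax at the output."""
    def sigmoid(u):
        return 1.0 / (1.0 + np.exp(-u))
    h = sigmoid(W_hidden @ x + b_hidden)   # hidden layer activations
    u = W_out @ h + b_out                  # output layer activations
    e = np.exp(u - u.max())
    return e / e.sum()                     # softmax: one posterior estimate per class

rng = np.random.default_rng(0)
x = rng.normal(size=4)                               # example input feature vector
W_h, b_h = rng.normal(size=(5, 4)), np.zeros(5)      # hidden layer: 5 neurons
W_o, b_o = rng.normal(size=(3, 5)), np.zeros(3)      # output layer: 3 classes
y = mlp_forward(x, W_h, b_h, W_o, b_o)
print(y, "-> recognised class:", int(np.argmax(y)))  # class assignment by maximum search
```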
7.2.3.2 Back Propagation
Among the multiplicity of learning algorithms for ANNs, the gradient descent-based back propagation algorithm [19] is one of the most popular and enabled the breakthrough of ANNs. Let $W = \{\mathbf{w}_j\}$ summarise the weight vectors $\mathbf{w}_j$ of a layer with $j = 1, \ldots, J$ and $J$ being the number of neurons in this layer. As target