be trained using static backpropagation 3 with MSE as the learning criterion. In the TDNN static
backpropagation algorithm, the output of the first hidden layer is given by (3.24).
\[ y_i^l(n) = f\big(\mathrm{net}_i^l(n)\big) \qquad (3.24) \]
where
\[ \mathrm{net}_i^l(n) = \sum_j w_{ij}^l \, y_j^{l-1}(n) \qquad (3.25) \]
Here, \(x_j = y_j^{l=0}\), or the neuronal inputs.
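To make the forward pass concrete, a minimal sketch in Python/NumPy might look as follows; the array sizes, the tanh choice for f, and all variable names are illustrative assumptions made here, not part of the original formulation:

import numpy as np

np.random.seed(0)
n_inputs = 100 * 10     # e.g., 100 neurons x 10 delays in the input tap delay line (assumed sizes)
n_hidden = 5            # number of hidden PEs (assumed)
f = np.tanh             # assumed hidden-layer nonlinearity f(.)

W1 = 0.01 * np.random.randn(n_hidden, n_inputs)   # first-layer weights w_ij

x = np.random.randn(n_inputs)   # x_j: the delayed neuronal inputs (y^(l=0)) at time n
net1 = W1 @ x                   # net_i(n) = sum_j w_ij * y_j^(l-1)(n),  as in (3.25)
y1 = f(net1)                    # y_i(n) = f(net_i(n)),                  as in (3.24)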
The output layer delta rule is given in (3.26). Note that the derivative of the output is equal to one for the linear output used here.
\[ \delta_i^L(n) = e_i(n)\, f'\big(\mathrm{net}_i^L(n)\big) \qquad (3.26) \]
\[ \delta_i^l(n) = f'\big(\mathrm{net}_i^l(n)\big) \sum_k \delta_k^{l+1}(n)\, w_{ki}^{l+1}(n) \qquad \text{(hidden layer delta rule)} \qquad (3.27) \]
The overall weight update for the TDNN is given in (3.28).
\[ w_{ij}(n+1) = w_{ij}(n) + \eta\, f'\big(\mathrm{net}_i(n)\big) \Big( \sum_k e_k(n)\, f'\big(\mathrm{net}_k(n)\big)\, w_{ki}(n) \Big)\, y_j(n) \qquad (3.28) \]
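For illustration only, the delta rules (3.26)-(3.27) and the update (3.28) for a single hidden layer with a linear output could be sketched as below; the layer sizes, the learning rate value, the tanh hidden nonlinearity (so f'(net) = 1 - tanh(net)^2), and the variable names are assumptions made here, not the chapter's implementation:

import numpy as np

np.random.seed(0)
n_inputs, n_hidden, n_outputs = 1000, 5, 3
eta = 1e-4                                          # learning rate (assumed value)

W1 = 0.01 * np.random.randn(n_hidden, n_inputs)     # input -> hidden weights
W2 = 0.01 * np.random.randn(n_outputs, n_hidden)    # hidden -> output weights

x = np.random.randn(n_inputs)    # delayed neuronal inputs at time n
d = np.random.randn(n_outputs)   # desired kinematic output at time n

# forward pass, (3.24)-(3.25)
net1 = W1 @ x
y1 = np.tanh(net1)               # hidden-layer outputs
y2 = W2 @ y1                     # linear output layer

# backward pass
e = d - y2                                     # output error e(n)
delta_L = e                                    # (3.26), with f'(.) = 1 for the linear output
delta_1 = (1.0 - y1 ** 2) * (W2.T @ delta_L)   # (3.27): f'(net_i) * sum_k delta_k * w_ki

# weight updates following (3.28): w(n+1) = w(n) + eta * delta * (input to that layer)
W2 += eta * np.outer(delta_L, y1)
W1 += eta * np.outer(delta_1, x)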
Although the nonlinear nature of the TDNN may seem an attractive choice for BMIs, putting memory at the input of this topology presents a difficulty in training and model generalization because the high-dimensional input implies a huge number of extra parameters. For example, if a neural ensemble contains 100 neurons with 10 delays of memory, and the TDNN topology contains 5 hidden PEs, 5000 free parameters are introduced in the input layer alone. Large data sets and slow learning rates are required to avoid overfitting [24]. Untrained weights can also add variance to the testing performance, thus decreasing accuracy.
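The free-parameter count quoted above is easy to verify (a trivial Python sketch; bias weights are ignored, as in the count in the text):

n_neurons, n_delays, n_hidden = 100, 10, 5
input_layer_weights = n_neurons * n_delays * n_hidden
print(input_layer_weights)   # 5000 free parameters in the input layer alone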
As in any neural network topology, the issues of optimal network size, learning rates, and stopping criteria need to be addressed for good performance. There are well-established procedures to control each one of these issues [24]. The size of the input layer delay line can be established from knowledge of the signal structure. In BMIs, it is well established that one second of neural data preceding the motor event should be used [8]; however, the optimal embedding and the optimal delay should still be established using the tools from dynamic modeling. The rule of thumb to build the architecture is to always start simple (i.e., the smallest number of hidden PEs, which as we saw defines the size of the projection space). As far as training is concerned, one has to realize that both the bases and
3 Backpropagation is a simple application of the chain rule that propagates the gradients through the topology.
 