of sensitivity, we will first examine sensitivities through the FIR filter that will serve as the "control model" throughout this topic. The procedure for deriving the sensitivity for a feedforward topology is an application of the chain rule [26]. For the case of the FIR filter, differentiating the output with respect to the input [see (3.8)] directly yields a sensitivity with respect to each neuronal input i in (4.29).
$$\frac{\partial y_j}{\partial x_i} = \mathbf{w}_{10(i-1)+1\,:\,10(i-1)+10,\; j} \tag{4.29}$$
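As a concrete check, a minimal NumPy sketch (with illustrative sizes and random weights standing in for a trained model) confirms that for a linear FIR model the sensitivity of output j to neuron i's tap-delayed inputs is exactly the corresponding weight block:

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_taps, n_out = 5, 10, 3                 # illustrative sizes
W = rng.normal(size=(n_neurons * n_taps, n_out))    # stand-in for trained weights
x = rng.normal(size=n_neurons * n_taps)             # stacked tap-delayed inputs
y = W.T @ x                                         # linear FIR output

# Analytic sensitivity of output j to neuron i's inputs: the weight
# block of (4.29). Indices here are zero-based.
i, j = 2, 1
analytic = W[n_taps * i : n_taps * (i + 1), j]

# Finite-difference check (exact up to rounding, since the model is linear).
eps = 1e-6
numeric = np.empty(n_taps)
for k in range(n_taps):
    xp = x.copy()
    xp[n_taps * i + k] += eps
    numeric[k] = ((W.T @ xp)[j] - y[j]) / eps
```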
Hence, a neuron's importance can be determined by simply reading the corresponding weight value³ in the trained model, if the input data for every channel is power-normalized. Because this is not the case for neural data, the neuron importance is estimated in the vector Wiener filter by multiplying the absolute value of a neuron's sensitivity with the standard deviation of its firing computed over the data set⁴ as in (4.30). To obtain a scalar sensitivity value for each neuron, the weight values are also averaged over the 10 tap delays and three output dimensions.
$$\mathrm{Sensitivity}_i = \sigma_i\,\frac{1}{3}\,\frac{1}{10}\sum_{j=1}^{3}\sum_{k=1}^{10}\left|w_{10(i-1)+k,\; j}\right| \tag{4.30}$$
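The averaging described above can be sketched in NumPy as follows; the weight tensor and the Poisson spike counts are hypothetical stand-ins for a trained model and recorded firing data:

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_taps, n_out = 5, 10, 3
# Stand-in weights, arranged (neuron, tap delay, output dimension).
W = rng.normal(size=(n_neurons, n_taps, n_out))
# Stand-in binned spike counts: 1000 samples per neuron.
spikes = rng.poisson(3.0, size=(1000, n_neurons))

sigma = spikes.std(axis=0)                  # firing std per neuron over the data set
# Scale the std by |w| averaged over the 10 taps and 3 output dimensions.
sensitivity = sigma * np.abs(W).mean(axis=(1, 2))
ranking = np.argsort(sensitivity)[::-1]     # most important neuron first
```

Averaging over taps and outputs collapses the weight block of each neuron to the single scalar per neuron used for ranking.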
The procedure for deriving the sensitivity for a feedforward multilayer perceptron (MLP), also discussed in [26], is again a simple application of the chain rule through the layers of the network topology as in (4.31):
$$\frac{\partial y_2(t)}{\partial x(t)} = \frac{\partial y_2(t)}{\partial y_1(t)}\,\frac{\partial y_1(t)}{\partial x(t)} \tag{4.31}$$
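A sketch of this layer-wise chain rule for a hypothetical MLP with a tanh hidden layer and a linear output layer (weights are random stand-ins), verified against finite differences:

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_hid, n_out = 4, 6, 3
W1 = rng.normal(size=(n_hid, n_in))   # stand-in first-layer weights
W2 = rng.normal(size=(n_out, n_hid))  # stand-in output-layer weights
x = rng.normal(size=n_in)

y1 = np.tanh(W1 @ x)                  # hidden layer output y1
y2 = W2 @ y1                          # linear output layer y2

# Chain rule (4.31): dy2/dx = (dy2/dy1)(dy1/dx) = W2 diag(1 - y1^2) W1,
# since d tanh(u)/du = 1 - tanh(u)^2.
J = W2 @ ((1.0 - y1**2)[:, None] * W1)

# Finite-difference check of the full Jacobian.
eps = 1e-6
J_num = np.empty((n_out, n_in))
for k in range(n_in):
    xp = x.copy()
    xp[k] += eps
    J_num[:, k] = (W2 @ np.tanh(W1 @ xp) - y2) / eps
```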
In the case of a nonlinear, dynamical system like the RMLP, the formulation must be modified to include time. Because the RMLP model displays dependencies over time that result from feedback in the hidden layer, we must modify this procedure [26]. Starting at each time t, we compute the sensitivities in (4.31) as well as the product of sensitivities clocked back in time. For example, using the RMLP feedforward equations [see (3.29) and (3.30)], we can compute at t = 0 the chain rule shown in (4.32). D_t is the derivative of the hidden layer nonlinearity evaluated at the operating point shown in (4.33). Notice that at t = 0 there are no dependencies on y_1. If we clock back one cycle, we must now include the dependencies introduced by the feedback, which is shown in (4.34). At each
³ In this analysis, we consider the absolute values of the weights averaged over the output dimensions and the 10-tap delays per neuron.
⁴ By multiplying the model weights by the firing standard deviation, we have modified the standard definition of sensitivity; however, for the remainder of this analysis we will refer to this quantity as the model sensitivity.
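The clocked-back product of sensitivities described above can be sketched for a hypothetical recurrent network with a tanh hidden layer, state feedback, and a zero initial state; the weights are random stand-ins, and D_t is the diagonal matrix of tanh derivatives at time t:

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_hid, n_out = 4, 6, 3
W1 = rng.normal(size=(n_hid, n_in))    # stand-in input weights
Wf = rng.normal(size=(n_hid, n_hid))   # stand-in hidden feedback weights
W2 = rng.normal(size=(n_out, n_hid))   # stand-in output weights

def forward(x_prev, x_now):
    """Two time steps of the tanh hidden layer with feedback (zero initial state)."""
    h_prev = np.tanh(W1 @ x_prev)                 # y1(t-1)
    h_now = np.tanh(W1 @ x_now + Wf @ h_prev)     # y1(t)
    return h_prev, h_now, W2 @ h_now              # y2(t)

x_prev, x_now = rng.normal(size=n_in), rng.normal(size=n_in)
h_prev, h_now, y = forward(x_prev, x_now)

D_now = np.diag(1.0 - h_now**2)    # D_t: tanh derivative at y1(t)
D_prev = np.diag(1.0 - h_prev**2)  # D_{t-1}: tanh derivative at y1(t-1)

# At time t there is no feedback dependency yet: dy2(t)/dx(t) = W2 D_t W1.
J_now = W2 @ D_now @ W1
# Clocked back one cycle, the feedback path inserts Wf D_{t-1} into the chain:
# dy2(t)/dx(t-1) = W2 D_t Wf D_{t-1} W1.
J_prev = W2 @ D_now @ Wf @ D_prev @ W1

# Finite-difference check of the clocked-back sensitivity.
eps = 1e-6
J_num = np.empty((n_out, n_in))
for k in range(n_in):
    xp = x_prev.copy()
    xp[k] += eps
    J_num[:, k] = (forward(xp, x_now)[2] - y) / eps
```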