$x(n+1) = (1 - \mu C a)\,x(n) + \mu C\, f\big(W_{in}^{T} u(n+1) + W x(n) + W_b d(n)\big)$, where $0 < \mu < 1$ is the step-size used for converting a continuous-time leaky integrator into a discrete-time difference equation, $C$ is the time constant, and $a$ is the decay rate [42]. The point-wise nonlinear function $f(\cdot)$ is chosen to be the standard tanh sigmoid (i.e., $f(\cdot) = \tanh(\cdot)$). Note that, if $C$, $\mu$, and $a$ are all equal to unity, the RNNs default to the conventional nonlinear PE [33] without memory.
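A minimal NumPy sketch of this state update, assuming vector-valued states and the weight shapes implied by the equation (the function name and arguments are ours):

```python
import numpy as np

def esn_state_update(x, u, d, W_in, W, W_b, mu=1.0, C=1.0, a=1.0):
    """One leaky-integrator step:
    x(n+1) = (1 - mu*C*a) x(n) + mu*C f(W_in^T u(n+1) + W x(n) + W_b d(n)),
    with f = tanh. Setting mu = C = a = 1 reduces each unit to a
    conventional memoryless nonlinear PE, as noted in the text."""
    return (1.0 - mu * C * a) * x + mu * C * np.tanh(W_in.T @ u + W @ x + W_b @ d)
```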
From a signal processing point of view, the reservoir creates a set of basis functions to represent the input, whereas the static mapper finds the optimal projection in this space. There are obvious similarities between this architecture and kernel machines, except that the kernels here are time functions (Hilbert spaces).
We will now give the conditions under which an ESN can be “useful,” which Jaeger aptly calls the “echo state property.” Loosely stated, the echo state property says that the current state is uniquely determined by the past values of the inputs and, if there is feedback, also of the desired outputs.
A weaker condition for the existence of echo states is to have the spectral radius of the matrix $\hat{W} = \mu C W + (1 - \mu C a)I$ less than unity [42]. Another aspect critical for the success of the ESN is to construct a sparse matrix $W$. This will ensure that the individual state outputs have different representations of the inputs and desired outputs; in other words, the span of the representation space is sufficiently rich to construct the mapping to the desired response (hand trajectory).
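As a numerical check, the sketch below builds a sparse reservoir and verifies the weaker echo state condition; the rescaling step is a common recipe we assume here, since the text does not say how the radius is set:

```python
import numpy as np

def sparse_reservoir(N, density, value, rng):
    """Random sparse recurrent matrix: a fraction `density` of the entries
    are nonzero, all fixed to `value`."""
    mask = rng.random((N, N)) < density
    return np.where(mask, value, 0.0)

def effective_spectral_radius(W, mu=1.0, C=1.0, a=1.0):
    """Spectral radius of W_hat = mu*C*W + (1 - mu*C*a)*I, which the weaker
    echo state condition requires to be below unity [42]."""
    W_hat = mu * C * W + (1.0 - mu * C * a) * np.eye(W.shape[0])
    return np.max(np.abs(np.linalg.eigvals(W_hat)))

rng = np.random.default_rng(0)
W = sparse_reservoir(N=800, density=0.01, value=0.5, rng=rng)
W *= 0.79 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale W's own radius (assumed recipe)
print(effective_spectral_radius(W, mu=1.0, C=0.7, a=1.0))  # < 1: echo states exist
```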
3.2.2.2 Design of the ESN. One of the important parameters in the design of the ESN is the number of RNN units in the reservoir. In our experience with BMIs, Monte Carlo simulations varying the number of units and monitoring performance provided the best technique for selecting the dimensionality of the reservoir. Here, we chose N = 800, where increasing N further did not result in any significant improvement in performance. The input weight matrix $W_{in}$ was fully connected, with all the weights fixed to unity. The recurrent connection matrix $W$ was sparse, with only 1% of the weights (randomly chosen) being nonzero. Moreover, we fixed all the nonzero weights to a value of 0.5. Further, each RNN unit is a gamma delay operator
with parameters [ a , C , μ ] = [1, 0.7, 1]. The next aspect is to set the spectral radius, which is crucial
for this problem because it controls the dynamics and memory of the echo states. Higher values
are required for slow output dynamics and vice versa [ 42 , 45 ]. For the experiments in this sec-
tion, we utilized a single ESN whose spectral radius was tuned to 0.79. Marginal changes (<1%)
in performance (both X and Y ) were observed when this parameter was altered by
±10%. We
also turned off the connections from the output to the RNN reservoir and the direct connections
between the inputs and the linear mapper. The network state was set to zero initially. The training
inputs were forced through the network, and the states were updated. The first 400 echo state
outputs were discarded as transients. The remaining 5000 state outputs were used to train the
linear mapper.
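A sketch of this training pass under the stated settings (zero initial state, no output feedback, 400-sample washout); the least-squares solve with a small ridge term is our assumption, as the text does not name the solver:

```python
import numpy as np

def run_reservoir(U, W_in, W, mu=1.0, C=0.7, a=1.0, washout=400):
    """Force the inputs U (one row per time step) through the network from a
    zero initial state; discard the first `washout` states as transients.
    Defaults match the gamma delay parameters [a, C, mu] = [1, 0.7, 1]."""
    x = np.zeros(W.shape[0])
    states = []
    for u in U:
        # output feedback (W_b d) is turned off in this experiment
        x = (1.0 - mu * C * a) * x + mu * C * np.tanh(W_in.T @ u + W @ x)
        states.append(x.copy())
    return np.asarray(states)[washout:]

def train_linear_mapper(X, D, ridge=1e-6):
    """Least-squares weights mapping states X to desired outputs D; the
    ridge term (our assumption) keeps the normal equations well posed."""
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ D)

# X = run_reservoir(U, W_in, W)            # e.g., 5400 inputs -> 5000 states
# W_out = train_linear_mapper(X, D[400:])  # align targets with kept states
```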