6.2.2.1 Introduction
The LSTM network was proposed in [103]. It uses gates to prevent the neuron's non-linearities (the transfer functions) from making the error information pertaining to long time lapses vanish. Note that the use of gates is only one possibility to avoid this problem that affects traditional RNNs; other approaches can be found in [95]. In the following brief discussion we consider the original LSTM approach [103]; other LSTM versions can be found in [82, 81, 171].
An overview of the LSTM network is given in Fig. 6.14: it is an RNN with the hidden layer feeding back into itself as well as into the output layer. Apart from the usual neurons, the hidden layer contains LSTM memory blocks; in fact, in the current discussion, the hidden layer is constituted only of these memory blocks. The main element of the LSTM is the memory block: a memory block is a set of memory cells and two gates (see Fig. 6.15). The gates are called input and output gates. Each memory cell (see Fig. 6.16) is composed of a central linear element called the CEC (Constant Error Carousel), two multiplicative units that are controlled by the block's input and output gates, and two non-linear functions, g(·) and h(·). The CEC is responsible for keeping the error unchanged for an arbitrarily long time lapse. The multiplicative units controlled by the gates decide when the error should be updated. An LSTM network can have an arbitrary number of memory blocks, and each block may have an arbitrary number of cells. The input layer is connected to all the gates and to all the cells. The gates and the cells have input connections from all cells and all gates.
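To make the data flow through a memory cell concrete, the following is a minimal sketch (in Python with NumPy; the function names and the choice of tanh for g and h are illustrative assumptions, not taken from the source) of one forward step of a single cell of the original LSTM: the gates are sigmoids acting as multiplicative units, and the CEC is a linear unit with a fixed self-connection of weight 1.0.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def memory_cell_step(x, s_prev, w_in, w_cell, w_out):
    """One forward step of a single original-LSTM memory cell (sketch).

    x      : input vector at the current time step
    s_prev : the cell's internal CEC state from the previous step
    w_in   : weight vector of the input gate
    w_cell : weight vector of the cell input (squashed by g)
    w_out  : weight vector of the output gate
    """
    # Input and output gates: sigmoids in (0, 1) acting as multiplicative units.
    y_in = sigmoid(np.dot(w_in, x))
    y_out = sigmoid(np.dot(w_out, x))

    # g squashes the cell input; tanh is used here as an example choice.
    g = np.tanh(np.dot(w_cell, x))

    # CEC: linear self-connection with fixed weight 1.0, so the stored state
    # (and the error flowing through it) is carried unchanged unless the
    # input gate lets new information in.
    s = s_prev + y_in * g

    # h squashes the cell state before the output gate scales it.
    y_cell = y_out * np.tanh(s)
    return y_cell, s

In a full memory block, several such cells share the same pair of gates, and the gates themselves also receive connections from all cells and gates, as described above.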
The LSTM topology is represented as (d : a : b(c) : e), where d is the number of input features, a is the number of neurons in the hidden layer, b is the number of memory blocks, c is a comma-separated list of the number of cells in each block, and e is the number of output-layer neurons.
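As an illustration of this notation, the short sketch below (a hypothetical helper, not from the source) parses a topology string such as (4:0:2(3,1):1), i.e. 4 input features, no plain hidden neurons, two memory blocks with 3 and 1 cells respectively, and one output neuron.

import re

def parse_lstm_topology(spec):
    """Parse a topology string of the form (d:a:b(c):e) into its components.

    The field names are illustrative; the format follows the (d : a : b(c) : e)
    convention described in the text.
    """
    pattern = r"\(\s*(\d+)\s*:\s*(\d+)\s*:\s*(\d+)\s*\(([\d,\s]*)\)\s*:\s*(\d+)\s*\)"
    m = re.fullmatch(pattern, spec)
    if m is None:
        raise ValueError(f"unrecognised topology: {spec!r}")
    d, a, b, cells, e = m.groups()
    cells_per_block = [int(c) for c in cells.split(",") if c.strip()]
    assert len(cells_per_block) == int(b), "one cell count per memory block"
    return {
        "inputs": int(d),
        "hidden_neurons": int(a),
        "memory_blocks": int(b),
        "cells_per_block": cells_per_block,
        "outputs": int(e),
    }

# Example: 4 inputs, no plain hidden neurons, two blocks with 3 and 1 cells,
# and a single output neuron.
print(parse_lstm_topology("(4:0:2(3,1):1)"))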
[Fig. 6.14 LSTM network overview: inputs x_1, ..., x_d feed a hidden layer of memory blocks, which feeds back into itself and into the output layer. The hidden layer can have LSTM memory blocks and simple neurons, although in this section only LSTM memory blocks are used.]