6.2.2.1 Introduction
The LSTM network was proposed in [103]. It uses gates to prevent the neuron's non-linearities (the transfer functions) from making the error information pertaining to long time lapses vanish. Note that the use of gates is only one possibility to avoid this problem that affects traditional RNNs; other approaches can be found in [95]. In the following brief discussion we consider the original LSTM approach [103]; other LSTM versions can be found in [82, 81, 171].
An overview of the LSTM network is given in Fig. 6.14: it is an RNN with the hidden layer feeding back into itself as well as into the output layer. Apart from the usual neurons, the hidden layer contains LSTM memory blocks; in fact, in the current discussion, the hidden layer is constituted only of these memory blocks. The main element of the LSTM is the memory block: a memory block is a set of memory cells and two gates (see Fig. 6.15). The gates are called input and output gates. Each memory cell (see Fig. 6.16) is composed of a central linear element called the CEC (Constant Error Carousel), two multiplicative units that are controlled by the block's input and output gates, and two non-linear functions, g(·) and h(·). The CEC is responsible for keeping the error unchanged for an arbitrarily long time lapse. The multiplicative units controlled by the gates decide when the error should be updated. An LSTM network can have an arbitrary number of memory blocks, and each block may have an arbitrary number of cells. The input layer is connected to all the gates and to all the cells. The gates and the cells have input connections from all cells and all gates.
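To make the data flow through a memory cell concrete, the following is a minimal sketch (in Python with NumPy; the function names and the choice of tanh for g and h are illustrative assumptions, not taken from the source) of one forward step of a single cell of the original LSTM: the gates are sigmoids acting as multiplicative units, and the CEC is a linear unit with a fixed self-connection of weight 1.0.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def memory_cell_step(x, s_prev, w_in, w_cell, w_out):
    """One forward step of a single original-LSTM memory cell (sketch).

    x      : input vector at the current time step
    s_prev : the cell's internal CEC state from the previous step
    w_in   : weight vector of the input gate
    w_cell : weight vector of the cell input (squashed by g)
    w_out  : weight vector of the output gate
    """
    # Input and output gates: sigmoids in (0, 1) acting as multiplicative units.
    y_in = sigmoid(np.dot(w_in, x))
    y_out = sigmoid(np.dot(w_out, x))

    # g squashes the cell input; tanh is used here as an example choice.
    g = np.tanh(np.dot(w_cell, x))

    # CEC: linear self-connection with fixed weight 1.0, so the stored state
    # (and the error flowing through it) is carried unchanged unless the
    # input gate lets new information in.
    s = s_prev + y_in * g

    # h squashes the cell state before the output gate scales it.
    y_cell = y_out * np.tanh(s)
    return y_cell, s

In a full memory block, several such cells share the same pair of gates, and the gates themselves also receive connections from all cells and gates, as described above.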
The LSTM topology is represented as (d : a : b(c) : e), where d is the number of input features, a is the number of neurons in the hidden layer, b is the number of memory blocks, c is a comma-separated list of the number of cells in each block, and e is the number of output-layer neurons.
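As an illustration of this notation, the short sketch below (a hypothetical helper, not from the source) parses a topology string such as (4:0:2(3,1):1), i.e. 4 input features, no plain hidden neurons, two memory blocks with 3 and 1 cells respectively, and one output neuron.

import re

def parse_lstm_topology(spec):
    """Parse a topology string of the form (d:a:b(c):e) into its components.

    The field names are illustrative; the format follows the (d : a : b(c) : e)
    convention described in the text.
    """
    pattern = r"\(\s*(\d+)\s*:\s*(\d+)\s*:\s*(\d+)\s*\(([\d,\s]*)\)\s*:\s*(\d+)\s*\)"
    m = re.fullmatch(pattern, spec)
    if m is None:
        raise ValueError(f"unrecognised topology: {spec!r}")
    d, a, b, cells, e = m.groups()
    cells_per_block = [int(c) for c in cells.split(",") if c.strip()]
    assert len(cells_per_block) == int(b), "one cell count per memory block"
    return {
        "inputs": int(d),
        "hidden_neurons": int(a),
        "memory_blocks": int(b),
        "cells_per_block": cells_per_block,
        "outputs": int(e),
    }

# Example: 4 inputs, no plain hidden neurons, two blocks with 3 and 1 cells,
# and a single output neuron.
print(parse_lstm_topology("(4:0:2(3,1):1)"))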
[Fig. 6.14 LSTM network overview: inputs x_1, ..., x_d feed a hidden layer of memory blocks, which feeds back into itself and into the output layer. The hidden layer can have LSTM memory blocks and simple neurons, although in this section only LSTM memory blocks are used.]