the layers are connected to a single output layer, which therefore has access to
context information in both directions along all dimensions.
Multidimensional LSTM (MDLSTM) is the generalization of bidirectional
LSTM to multidimensional data.
3.6 Hierarchical Subsampling Recurrent Neural Networks
Hierarchical subsampling is a common technique in computer vision [45] and
other domains with large input spaces. The basic principle is to iteratively
re-represent the data at progressively lower resolutions, using a hierarchy of
feature extractors. The features extracted at each level are subsampled and used
as input to the next level. The number and complexity of the features typically
increase as one climbs the hierarchy. This is much more efficient for
high-resolution data than a single 'flat' feature extractor, since most of the
computations are carried out on low-resolution feature maps rather than on, for
example, raw pixels.
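The principle can be illustrated with a minimal Python sketch on a 1D sequence. It is not the architecture used in this chapter: the function names, the fixed tanh projection standing in for a trained feature extractor, and the use of average pooling for subsampling are all assumptions made for illustration. Each level extracts features at the current resolution and then subsamples them, so the number of features grows while the resolution falls.

```python
import math

def extract_features(frames, n_features):
    """Toy stand-in for a trained feature extractor.

    Maps every frame to `n_features` values with a fixed, untrained tanh
    projection (hypothetical weights, chosen only to be deterministic).
    """
    features = []
    for frame in frames:
        row = []
        for j in range(n_features):
            total = sum(math.sin((i + 1) * (j + 1)) * x for i, x in enumerate(frame))
            row.append(math.tanh(total))
        features.append(row)
    return features

def subsample(frames, window):
    """Average each run of `window` consecutive frames into one frame."""
    pooled = []
    for start in range(0, len(frames), window):
        block = frames[start:start + window]
        pooled.append([sum(column) / len(block) for column in zip(*block)])
    return pooled

def hierarchical_subsample(frames, levels):
    """Apply (n_features, window) levels bottom-up and record each shape."""
    shapes = [(len(frames), len(frames[0]))]
    for n_features, window in levels:
        frames = subsample(extract_features(frames, n_features), window)
        shapes.append((len(frames), len(frames[0])))
    return frames, shapes

# 64 raw one-dimensional frames, three levels of (n_features, window).
raw = [[math.sin(0.1 * t)] for t in range(64)]
top, shapes = hierarchical_subsample(raw, [(2, 4), (4, 4), (8, 2)])
print(shapes)  # [(64, 1), (16, 2), (4, 4), (2, 8)]
```

In this sketch the sequence shrinks from 64 frames to 2 while the feature vector grows from 1 value to 8, so the widest feature extractor only ever runs on a handful of frames.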
A well-known connectionist hierarchical subsampling architecture is the
Convolutional Neural Network [46]. Hierarchical subsampling is also possible
with RNNs, and hierarchies of MDLSTM layers have been applied to offline
handwriting recognition [47]. Hierarchical subsampling with LSTM is equally
useful for long 1D sequences, such as raw speech data or online handwriting
trajectories with a high sampling rate.
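To give a feel for the effect on long 1D sequences, the following Python sketch computes the sequence length seen at each level of a hierarchy. The function name, the 16,000-step input and the subsampling factors are hypothetical values chosen for illustration, not settings taken from the experiments.

```python
def level_lengths(n_steps, factors):
    """Return the sequence length at the input and after each subsampling level."""
    lengths = [n_steps]
    for factor in factors:
        # Ceiling division: a trailing partial window still yields one step.
        n_steps = -(-n_steps // factor)
        lengths.append(n_steps)
    return lengths

# Hypothetical example: 16,000 input steps, three levels each subsampling by 4.
print(level_lengths(16000, [4, 4, 4]))  # [16000, 4000, 1000, 250]
```

Under these assumed factors the topmost recurrent layer would process 250 steps rather than 16,000.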
From the point of view of handwriting recognition, the most interesting aspect
of hierarchical subsampling RNNs is that they can be applied directly to the raw
input data (offline images or online point-sequences) without any normalization or
feature extraction.
4 Experiments
The experiments have been performed with the freely available RNNLIB tool by
Alex Graves (footnote 2). This tool implements the network architecture and
furthermore provides examples for the recognition of several scripts.
4.1 Comparison with HMMs on the IAM Databases
The aim of the first experiments was to evaluate the performance of the complete
RNN handwriting recognition system, illustrated in Figure 6, for both online and
offline handwriting. In particular, we wanted to see how it compared to an
HMM-based system. The online and offline databases used were the IAM-OnDB and the
IAM-DB respectively (see above). Note that these do not correspond to the same
handwriting samples: the IAM-OnDB was acquired from a whiteboard, while the
IAM-DB consists of scanned images of handwritten forms.
2 http://sourceforge.net/projects/rnnl/