have an x-coordinate in the vicinity of the current point are considered. Additionally,
the points must lie at least a minimal distance from the lines to be considered part
of an ascender or descender. The corresponding distances are set to a predefined
fraction of the corpus height.
context map: the two-dimensional vicinity of the current point is transformed to
a 3×3 map. The number of black points in each region is taken as a feature value,
which gives nine features of this type in total.
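
As a rough illustration of the context map, the following Python sketch counts the black pixels in a 3×3 grid of square regions centred on a point. The function name context_map, the region size cell, the binary-image representation, and the treatment of pixels outside the image are all assumptions made for this example; the text does not specify them.

```python
def context_map(image, row, col, cell=3):
    """Count black pixels in a 3x3 grid of cell x cell regions around (row, col).

    image: list of rows, 1 = black, 0 = white (hypothetical representation).
    cell:  side length of one region in pixels; an odd value keeps the
           vicinity exactly centred on the point (assumption of this sketch).
    Pixels that fall outside the image are treated as white.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    side = 3 * cell          # side length of the whole vicinity
    half = side // 2         # offset from the centre to the top-left corner
    counts = [0] * 9         # regions in row-major order
    for dr in range(side):
        for dc in range(side):
            r = row - half + dr
            c = col - half + dc
            if 0 <= r < height and 0 <= c < width and image[r][c]:
                counts[(dr // cell) * 3 + (dc // cell)] += 1
    return counts
```

Called on a rasterised neighbourhood of the current point, the function returns nine counts, one per region, which correspond to the nine feature values described above.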
2.3 Our Offline System
To extract the feature vectors from the offline images, a sliding window approach
is used. The width of the window is one pixel, and nine geometrical features are
computed at each window position. Each text line image is therefore converted to
a sequence of 9-dimensional vectors. The nine features are as follows:
The mean gray value of the pixels
The center of gravity of the pixels
The second-order vertical moment of the center of gravity
The positions of the uppermost and lowermost black pixels
The rate of change of these positions (with respect to the neighboring windows)
The number of black-white transitions between the uppermost and lowermost pixels
The proportion of black pixels between the uppermost and lowermost pixels.
For a more detailed description of the offline features, see [17].
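
The sliding-window extraction can be sketched in Python as follows. This is a simplified illustration, not the implementation from [17]: the function name column_features, the binarisation threshold, the exact definitions of the center of gravity and second-order moment (taken here over the row indices of black pixels), the use of the previous column for the rate of change, and the fallback values for empty columns are all assumptions of this sketch.

```python
def column_features(image, threshold=128):
    """Return one 9-dimensional feature vector per one-pixel-wide window.

    image:     list of rows of gray values (0 = black, 255 = white),
               all rows of equal length.
    threshold: gray values below this count as black (assumption).
    """
    height = len(image)
    width = len(image[0])
    vectors = []
    prev_upper = prev_lower = None
    for x in range(width):
        column = [image[y][x] for y in range(height)]
        black = [y for y, v in enumerate(column) if v < threshold]
        mean_gray = sum(column) / height
        if black:
            upper, lower = black[0], black[-1]
            centre = sum(black) / len(black)                  # center of gravity
            moment2 = sum(y * y for y in black) / len(black)  # second-order moment
            span = column[upper:lower + 1]
            # Count every change between black and white inside the span.
            transitions = sum(
                1 for a, b in zip(span, span[1:])
                if (a < threshold) != (b < threshold)
            )
            proportion = len(black) / len(span)
        else:
            # Empty column: neutral fallback values (assumption).
            upper = lower = 0
            centre = moment2 = 0.0
            transitions = 0
            proportion = 0.0
        # Rate of change relative to the previous window; zero for the first.
        d_upper = 0 if prev_upper is None else upper - prev_upper
        d_lower = 0 if prev_lower is None else lower - prev_lower
        vectors.append([mean_gray, centre, moment2, upper, lower,
                        d_upper, d_lower, transitions, proportion])
        prev_upper, prev_lower = upper, lower
    return vectors
```

Each text line image thus yields a sequence with one vector per pixel column, which is the form of input the recognizer consumes.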
In the next phase indicated in Fig. 1, a classification system is applied which
generates a list of candidates or even a recognition lattice. This step and the last
step, the postprocessing, are described in the next section.
3 Neural Network Based Recognition
The main focus of this chapter is the recently introduced neural network classifier
based on CTC combined with bidirectional or multidimensional LSTM. This section
describes the different aspects of the architecture and gives brief insights into
the underlying algorithms.
3.1 Recurrent Neural Networks (RNNs)
Recurrent neural networks (RNNs) are a connectionist model containing a
self-connected hidden layer. One benefit of the recurrent connection is that a 'memory'
of previous inputs remains in the network's internal state, allowing it to make use