Neural Networks for Handwriting Recognition - Computational Intelligence Paradigms in Advanced Pattern Classification

the movement of the pen-tip, is captured, while in the off-line case only an image of the text is available. Because of the greater ease of extracting relevant features, online recognition generally yields better results [1]. Another crucial division is that between the recognition of isolated characters or words, and the recognition of whole lines of text. Unsurprisingly, the latter is substantially harder, and the excellent results that have been obtained for, e.g., digit and character recognition [2], [3] have never been matched for complete lines. Lastly, handwriting recognition can be split into cases where the writing style is constrained in some way (for example, only hand-printed characters are allowed) and the more challenging scenario where it is unconstrained. Despite more than 40 years of handwriting recognition research [2], [3], [4], [5], developing a reliable, general-purpose system for unconstrained text line recognition remains an open problem.

1.1 State-of-the-Art

A well-known testbed for isolated handwritten character recognition is the UNIPEN database [6]. Systems that have been found to perform well on UNIPEN include: a writer-independent approach based on hidden Markov models (HMMs) [7]; a hybrid technique called cluster generative statistical dynamic time warping (CSDTW) [8], which combines dynamic time warping with HMMs and embeds clustering and statistical sequence modeling in a single feature space; and a support vector machine with a novel Gaussian dynamic time warping kernel [9]. Typical error rates on UNIPEN range from 3% for digit recognition to about 10% for lowercase character recognition.
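
Two of the systems above build on dynamic time warping, which aligns two pen trajectories of different lengths before comparing them. The following is a minimal sketch of the plain, textbook form of that alignment only; it is not CSDTW [8] or the Gaussian kernel of [9], and the function name `dtw_distance` and the representation of a trajectory as a list of (x, y) tuples are assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Plain dynamic time warping distance between two point sequences.

    `a` and `b` are non-empty lists of (x, y) tuples (an assumed
    representation of a pen trajectory). Local cost is Euclidean distance.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the best of: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]
```

Because a point may be matched to several points of the other sequence, a trajectory and a copy of it with a repeated sample are at distance zero, which is the property that makes the measure tolerant of differences in writing speed.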

Similar techniques can be used to classify isolated words, and this has given good results for small vocabularies (e.g., a writer-dependent word error rate of about 4.5% for 32 words [10]). However, an obvious drawback of whole-word classification is that it does not scale to large vocabularies.

For large vocabulary recognition tasks, such as those considered in this chapter, the usual approach is to recognize individual characters and map them onto complete words using a dictionary. Naively, we could do this by presegmenting words into characters and classifying each segment. However, segmentation is difficult for cursive or unconstrained text, unless the words have already been recognized. This creates a circular dependency between segmentation and recognition that is often referred to as Sayre's paradox [11]. Nonetheless, approaches have been proposed where segmentation is carried out before recognition. Some techniques for character segmentation, based on unsupervised learning and data-driven methods, are given in [3]. Other strategies first segment the text into basic strokes, rather than characters. The stroke boundaries may be defined in various ways, such as the minima of the velocity, the minima of the y-coordinates, or the points of maximum curvature. For example, one online approach first segments the data at the minima of the y-coordinates and then applies self-organizing maps [12]. Another approach [13], this one off-line, uses the minima of the vertical histogram for an initial estimation of the character boundaries and then applies various heuristics to improve the segmentation.
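
Of the stroke boundary definitions mentioned above, the minima of the y-coordinates are the simplest to illustrate. The sketch below splits an online trajectory at strict local minima of y. It is not the method of [12]: the function name `split_at_y_minima`, the (x, y) tuple representation, and the choice to ignore flat runs of equal y values are all assumptions made for illustration, and whether a "minimum" means the lowest point on the page depends on the direction of the y axis in the recording device.

```python
def split_at_y_minima(points):
    """Split a pen trajectory into strokes at strict local minima of y.

    `points` is a list of (x, y) tuples in writing order (an assumed
    representation). A boundary point is shared by the two strokes it
    separates. Flat runs of equal y values are not treated as minima.
    """
    points = list(points)
    # Interior indices whose y is strictly below both neighbours.
    cuts = [
        i for i in range(1, len(points) - 1)
        if points[i][1] < points[i - 1][1] and points[i][1] < points[i + 1][1]
    ]
    strokes, start = [], 0
    for c in cuts:
        strokes.append(points[start:c + 1])  # boundary closes this stroke
        start = c                            # and opens the next one
    strokes.append(points[start:])
    return strokes
```

A trajectory with no interior minimum comes back as a single stroke, so the function can be applied to every pen-down segment without special cases.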
