the movement of the pen-tip, is captured, while in the off-line case only an image
of the text is available. Because of the greater ease of extracting relevant features,
online recognition generally yields better results [1]. Another crucial division is
that between the recognition of isolated characters or words, and the recognition
of whole lines of text. Unsurprisingly, the latter is substantially harder, and the
excellent results that have been obtained for e.g. digit and character recognition [2],
[3] have never been matched for complete lines. Lastly, handwriting recognition
can be split into cases where the writing style is constrained in some way (for
example, only hand-printed characters are allowed) and the more challenging
scenario where it is unconstrained. Despite more than 40 years of handwriting
recognition research [2], [3], [4], [5], developing a reliable, general-purpose system
for unconstrained text line recognition remains an open problem.
1.1 State-of-the-Art
A well-known testbed for isolated handwritten character recognition is the
UNIPEN database [6]. Systems that have been found to perform well on UNIPEN
include: a writer-independent approach based on hidden Markov models (HMMs) [7]; a hybrid
technique called cluster generative statistical dynamic time warping
(CSDTW) [8], which combines dynamic time warping with HMMs and embeds
clustering and statistical sequence modeling in a single feature space; and a support
vector machine with a novel Gaussian dynamic time warping kernel [9]. Typical
error rates on UNIPEN range from 3% for digit recognition to about 10% for
lowercase character recognition.
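
Dynamic time warping, which two of the systems above build on, can be illustrated with a minimal sketch. This is the plain textbook DTW distance between two pen trajectories, not the CSDTW model of [8] or the Gaussian DTW kernel of [9]; the function name, the representation of a trajectory as a list of (x, y) tuples, and the Euclidean local cost are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b):
    """Plain dynamic time warping distance between two point sequences.

    `a` and `b` are non-empty lists of (x, y) tuples (an assumed
    representation of a pen trajectory). The local cost is the
    Euclidean distance between aligned points.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.hypot(a[i - 1][0] - b[j - 1][0],
                           a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Because the alignment may repeat points, a trajectory and a slower copy of it (with some points duplicated) are at distance zero, which is the property that makes DTW attractive for comparing pen strokes written at different speeds.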
Similar techniques can be used to classify isolated words, and this has given
good results for small vocabularies (e.g., a writer-dependent word error rate of
about 4.5% for 32 words [10]). However, an obvious drawback of whole-word
classification is that it does not scale to large vocabularies.
For large vocabulary recognition tasks, such as those considered in this chapter,
the usual approach is to recognize individual characters and map them onto complete
words using a dictionary. Naively, we could do this by presegmenting words
into characters and classifying each segment. However, segmentation is difficult
for cursive or unconstrained text, unless the words have already been recognized.
This creates a circular dependency between segmentation and recognition that is
often referred to as Sayre's paradox [11]. Nonetheless, approaches have been proposed
where segmentation is carried out before recognition. Some techniques for
character segmentation, based on unsupervised learning and data-driven methods,
are given in [3]. Other strategies first segment the text into basic strokes, rather
than characters. The stroke boundaries may be defined in various ways, such as
the minima of the velocity, the minima of the y-coordinates, or the points of maximum
curvature. For example, one online approach first segments the data at the
minima of the y-coordinates, then applies self-organizing maps [12]. Another, off-line,
approach [13] uses the minima of the vertical histogram for an initial estimation
of the character boundaries and then applies various heuristics to improve the
segmentation.
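
As a rough illustration of stroke segmentation at the minima of the y-coordinates, the sketch below cuts an online pen trajectory at each local minimum of y. It is not the method of [12]; the function names, the (x, y) tuple representation, the treatment of plateaus, and the convention that a smaller y value means a lower point are all assumptions.

```python
def y_minima_indices(ys):
    """Return indices of interior local minima of a sequence of y values.

    A point counts as a minimum if it is strictly below its left
    neighbour and not above its right neighbour, so a flat plateau
    contributes only its first index.
    """
    return [i for i in range(1, len(ys) - 1)
            if ys[i] < ys[i - 1] and ys[i] <= ys[i + 1]]


def split_strokes(points):
    """Split a trajectory into segments at local minima of y.

    `points` is a list of (x, y) tuples. Each cut point is shared by
    the two segments it separates, so consecutive segments connect.
    """
    cuts = y_minima_indices([p[1] for p in points])
    segments = []
    start = 0
    for c in cuts:
        segments.append(points[start:c + 1])
        start = c
    segments.append(points[start:])
    return segments
```

For a zigzag trajectory such as `[(0, 2), (1, 0), (2, 3), (3, 1), (4, 4)]`, the cuts fall at the second and fourth points, giving three segments. Real pen data is noisy, so a practical system would presumably smooth the trajectory or discard shallow minima before cutting.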