Subsequently, features are extracted from the normalized data. This particular
step is needed because the recognizers need numerical data as their input.
However, no standard method for computing the features exists in the literature. One
common method in offline recognition of handwritten text lines is the use of a
sliding window moving in the writing direction over the text. Features are
extracted at every window position, resulting in a sequence of feature vectors. In
the case of online recognition the points are already available in a time-ordered
sequence, which makes it easier to get a sequence of feature vectors in writing
order. If there is a fixed size of the input pattern, such as in character or word
recognition, one feature vector of a constant size can be extracted for each pattern.
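The sliding-window idea described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the image is taken to be a binarized text line stored as rows of 0/1 pixels, the hypothetical function `sliding_window_features` moves a window from left to right, and the two features computed per position (ink-pixel count and vertical center of gravity) are merely plausible examples, since no standard feature set exists.

```python
from typing import List

Image = List[List[int]]  # rows of pixels, 1 = ink, 0 = background (assumed encoding)


def sliding_window_features(image: Image, width: int = 1, step: int = 1) -> List[List[float]]:
    """Slide a window of `width` columns left to right over the image.

    Returns one feature vector per window position:
    [number of ink pixels, mean row index of the ink pixels].
    """
    if not image or not image[0]:
        return []
    n_rows, n_cols = len(image), len(image[0])
    sequence = []
    for left in range(0, n_cols - width + 1, step):
        ink = 0
        row_sum = 0
        for r in range(n_rows):
            for c in range(left, left + width):
                if image[r][c]:
                    ink += 1
                    row_sum += r
        # An empty window has no center of gravity; 0.0 is an arbitrary placeholder.
        center = row_sum / ink if ink else 0.0
        sequence.append([float(ink), center])
    return sequence


if __name__ == "__main__":
    text_line = [
        [0, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 1, 0, 1],
    ]
    for vector in sliding_window_features(text_line):
        print(vector)
```

Each window position yields one vector, so a text line of arbitrary length becomes a sequence of fixed-size feature vectors in writing order, which is the form a sequence recognizer expects.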
Fig. 2 Features of the vicinity
2.2 Our Online System
In the system described in this chapter, state-of-the-art feature extraction methods
are applied to extract the features from the preprocessed data. The feature set input
to the online recognizer consists of 25 features which utilize information from
both the real online data stored in XML format and pseudo offline information
automatically generated from the online data. For each (x, y)-coordinate recorded
by the acquisition device, a set of 25 features is extracted, resulting in a sequence
of 25-dimensional vectors for each given text line. These features can be divided
into two classes. The first class consists of features extracted for each point by
considering the neighbors with respect to time. The second class takes the offline
matrix representation into account, i.e., it is based on spatial information. The
features of the first class are the following:
pen-up/pen-down: a boolean variable indicating whether the pen-tip touches the
board or not. Consecutive strokes are connected with straight lines for which
this feature has the value false.
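As a rough sketch of the pen-up/pen-down feature, the following code turns a list of strokes into one time-ordered point sequence and marks the points on the connecting straight lines as pen-up. The function name `with_pen_state`, the stroke representation as lists of (x, y) tuples, and the `gap_points` parameter (how many points are interpolated along each connecting line) are assumptions for illustration; the text does not say how the connecting lines are sampled.

```python
from typing import List, Tuple

Point = Tuple[float, float]
LabeledPoint = Tuple[float, float, bool]  # (x, y, pen_down)


def with_pen_state(strokes: List[List[Point]], gap_points: int = 1) -> List[LabeledPoint]:
    """Concatenate strokes in time order and attach the pen-down flag.

    Points recorded on a stroke get pen_down=True. Consecutive strokes are
    joined by a straight line; the `gap_points` points interpolated on that
    line get pen_down=False.
    """
    sequence: List[LabeledPoint] = []
    previous_end = None
    for stroke in strokes:
        if not stroke:
            continue  # skip empty strokes
        if previous_end is not None:
            x0, y0 = previous_end
            x1, y1 = stroke[0]
            for k in range(1, gap_points + 1):
                t = k / (gap_points + 1)  # fraction of the way along the line
                sequence.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0), False))
        sequence.extend((x, y, True) for x, y in stroke)
        previous_end = stroke[-1]
    return sequence


if __name__ == "__main__":
    strokes = [[(0.0, 0.0), (1.0, 0.0)], [(4.0, 0.0), (5.0, 0.0)]]
    for point in with_pen_state(strokes):
        print(point)
```

In this sketch the flag would be one component of each point's feature vector; the remaining features of the set would be computed alongside it for the same points.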