Database Reference
In-Depth Information
sequence of 2D points recorded from a person's writing on a special tablet.
Each digit is originally given as a sequence
d
x 1 ,y 1 ,t 1 )
,...,
x k ,y k ,t k )of
=(
(
points in the
th point during
writing a digit. In order to transform such a sequence of points into a string,
we first resample the given data points such that the distance between any
consecutive pair of points has a constant value ∆. That is,
xy
-plane, where
t k is the time of recording the
k
d
is transformed
d =(¯
into sequence
x 1 ,
¯
y 1 )
,...,
x l ,
y l )where
¯
|
x i +1 ,
¯
y i +1 )
x i ,
y i )
¯
|
=∆for
i
=1
,...,l−
1. Then, a string
s
=
a 1 a 2 ···a l− 1 is generated from sequence
d where
y i +1 ). In our exper-
iments we fixed ∆ = 7 so that the digit strings have between 40 and 100
points.
The costs of the edit operations are defined as follows:
a i is the vector pointing from (¯
x i ,
¯
y i )to(¯
x i +1 ,
¯
c
(
a → ε
)=
c
. Notice that the minimum cost of a
substitution is equal to zero (if and only if
(
ε → a
)=
|a|
=∆,
c
(
a → b
)=
|a − b|
a
=
b
), while the maximum cost is
2∆. The latter case occurs if
are parallel and have opposite direction.
We conducted experiments using 10 samples of digit 1, 2 and 3 each,
and 98 samples of digit 6, see Figure 2 for an illustration (for space
reasons only ten samples of digit 6 are shown there). The results of
four algorithms are shown in Figure 3: the genetic algorithm [17] (GA,
Section 5.2.2), the dynamic algorithm [18] (Section 5.3), the greedy algo-
rithm [4] (Section 5.2.1), and the set median. For comparison purpose the
consensus error
a
and
b
E S and the computation time
t
in seconds on a SUN Ultra60
workstation are also listed in Figure 3. Note that the consensus errors of
digit 6 are substantially larger than those of the other digits because of
the definition of consensus error as the sum, but not the average, of the
distances to all input samples.
The best results are achieved by GA, followed by the dynamic approach.
Except for digit 1, the greedy algorithm reveals some weakness. Looking
at the median for digits 2, 3 and 6 it seems that the iterative process ter-
minates too early, resulting in a string (digit) much shorter than it should
be. The reason lies in the simple termination criterion defined in [4]. It
works well for the (short) words used there, but obviously encounters di-
culties in dealing with longer strings occurring in our study. At first glance,
the dynamic approach needs more computation time than the greedy algo-
rithm. But one has to take into account that the recorded time is the total
time of the dynamic process of adding one sample to the existing set each
time, starting from a set consisting of the first sample. Therefore, totally
9 (97) generalized medians have effectively been computed for digits 1/2/3
(6). Taking the actual computation time for each update step of the whole
Search WWH ::




Custom Search