1. Construct a CA with N×N cells and initialize all cells with 0. Reset the time (iteration) counter (t = 0).
2. Force the output of nine cells in the middle of the CA (denoted by index p, as depicted in Fig. 8.8) as follows: $y_p(t) = s(t + 10 - p)$.
3. Update the cells of the CA.
4. t = t + 1.
5. IF $t > T - 9$ STOP. ELSE GOTO 2.
6. X = Y(t) (the cellular automata terminal pattern).
Fig. 8.8. Distribution of signal samples into the CA array (here the numbers represent the
cell index p, and N = 11)
Among other possible schemes for distributing the signal samples of the sliding
window on the CA cells, the above was found to be the most convenient.
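The iteration loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions that do not come from the text: the sliding window is read as y_p(t) = s(t + 10 - p) for p = 1..9, the loop stops once t > T - 9, the index p is laid out row by row over the 3×3 middle block (the actual layout is the one in Fig. 8.8), the borders wrap around, and `placeholder_rule` is a hypothetical parity rule standing in for the real cell (it is not the ID = 49 cell). The names `emtm_pattern` and `placeholder_rule` are likewise made up for this example.

```python
def placeholder_rule(y):
    """Hypothetical local rule (NOT the ID = 49 cell of the text):
    parity of the 5-cell von Neumann neighborhood, wrap-around borders."""
    n = len(y)
    return [[(y[i][j] + y[(i - 1) % n][j] + y[(i + 1) % n][j]
              + y[i][(j - 1) % n] + y[i][(j + 1) % n]) % 2
             for j in range(n)] for i in range(n)]


def emtm_pattern(s, n=11, rule=placeholder_rule):
    """Follow steps 1-6 for a binary sequence s; return the terminal pattern."""
    T = len(s)
    if T < 9:
        raise ValueError("need at least 9 samples to fill the 3x3 window")
    y = [[0] * n for _ in range(n)]      # step 1: all cells set to 0
    t = 0                                # step 1: reset the iteration counter
    c = n // 2                           # index of the middle row/column
    while True:
        # Step 2: force the nine middle cells. With 0-based Python indexing,
        # the assumed y_p(t) = s(t + 10 - p) becomes s[t + 9 - p], p = 1..9.
        for p in range(1, 10):
            row, col = divmod(p - 1, 3)  # assumed row-major layout of p
            y[c - 1 + row][c - 1 + col] = s[t + 9 - p]
        y = rule(y)                      # step 3: update the cells
        t += 1                           # step 4
        if t > T - 9:                    # step 5: the window has run out
            break
    return y                             # step 6: X = Y(t)


s = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
X = emtm_pattern(s)
print(len(X), len(X[0]))  # 11 11
```

Passing the local rule as a parameter keeps the loop independent of the particular cell, so a different rule can be swapped in without touching steps 1 to 6.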
Let us consider a CA with ID = 49 (later we will see on what basis it was selected) and the utterance “one”. Because our CA cells are binary, the continuously valued original samples $x(t)$ are binarized such that: $s(t) = \frac{1}{2}\left(\operatorname{sign}(x(t)) + 1\right)$. The evolution of the CA configured as EMTM is shown in Fig. 8.9.
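As a small illustration of the binarization step, the sketch below maps real-valued samples to {0, 1}, reading the rule as s(t) = (sign(x(t)) + 1)/2. The function name `binarize` is hypothetical, and the treatment of samples exactly equal to zero (mapped to 1 here) is an assumption, since the sign-based formula would give 1/2 for them.

```python
def binarize(x):
    """Map real-valued samples to {0, 1}: negative -> 0, positive -> 1.
    Assumption: a sample exactly equal to 0 is mapped to 1."""
    return [1 if v >= 0 else 0 for v in x]


print(binarize([-0.3, 0.7, -0.1, 0.2]))  # [0, 1, 0, 1]
```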
Using a binarized signal instead of the original speech signal may increase the classification
error, although it is known that such signals can still be correctly recognized
by humans. Since this is the most adverse condition to test our method, we expect
even better results using improved schemes.
As seen in Fig. 8.9, in the first 500 iterations the binary pattern within the CA
evolves quite rapidly as a growing structure with a slow explosion (limited by
choosing the exponent of growth U < 1.4). Then the pattern changes slightly
revealing accumulation of the specific variations in the signal sequence. It follows
that for a given size N of the CA, there is an optimum number T of signal samples
in the sequence. A smaller number of samples will produce a simple CA pattern in
the middle with insufficient information in it, while a larger value will add no
essential information to the CA pattern (as can easily be seen in Fig. 8.9 for t = 991).
For the task of isolated word recognition (with lengths varying around 2,000
samples), the above scheme works quite well, particularly when the sound
sequences are rather different (as “one” and “zero” in our examples). For better
accuracy, the signal temporal sequence S may be split into an equal number m (e.g. m = 4) of sub-sequences $S_K$, $K = 1, \dots, m$, such that each sub-sequence has an optimal length. The resulting feature vector is now formed of all resulting binary patterns, one for each of $S_1, \dots, S_m$. This method, of splitting the signal sequence, is inspired