recognizer offers a recognition rate near human performance. It is
suitable for difficult recognition scenarios (e.g. fluent speech with a large
vocabulary, or spontaneous speech). Usually, however, HMMs demand high
modeling effort and floating-point arithmetic for the necessary
computational precision. The component costs of running HMMs are high
and often oversized for simple control applications with a small number of
commands.
A further speech recognition technique is based on artificial neural
networks (ANN). ANNs are well suited to static patterns and self-adapting
processes, and low-cost solutions sometimes employ ANN techniques.
Unless the more complex Time Delay Neural Network (TDNN) approach is
used, however, these solutions usually do not achieve satisfactory
recognition accuracy.
Recognizers based on the principle of Dynamic Time Warping (DTW)
require less computational precision and modeling effort than HMMs. A
major drawback is their increased memory demand when they are trained
speaker-independently. In general, DTW recognizers can achieve
recognition accuracy similar to that of HMM recognizers.
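To make the warping idea concrete, the following is a minimal C sketch of a DTW match between an input and a reference sequence, using integer (fixed-point) feature vectors and a city-block local distance. All names, dimensions and values here are illustrative assumptions, not taken from the text:

    #include <stdio.h>
    #include <stdlib.h>

    #define DIM 2   /* feature dimension (illustrative) */

    /* City-block local distance between two feature vectors;
     * integer arithmetic matches a fixed-point implementation. */
    static int local_dist(const int *a, const int *b)
    {
        int d = 0;
        for (int k = 0; k < DIM; k++)
            d += abs(a[k] - b[k]);
        return d;
    }

    /* Accumulate the minimal warping cost between an input sequence
     * x[0..nx-1] and a reference sequence r[0..nr-1] (classic DTW
     * recursion with vertical, diagonal and horizontal predecessors). */
    static int dtw(int x[][DIM], int nx, int r[][DIM], int nr)
    {
        int *prev = malloc(nr * sizeof *prev);
        int *cur  = malloc(nr * sizeof *cur);

        prev[0] = local_dist(x[0], r[0]);
        for (int j = 1; j < nr; j++)
            prev[j] = prev[j - 1] + local_dist(x[0], r[j]);

        for (int i = 1; i < nx; i++) {
            cur[0] = prev[0] + local_dist(x[i], r[0]);
            for (int j = 1; j < nr; j++) {
                int best = prev[j];                          /* vertical   */
                if (prev[j - 1] < best) best = prev[j - 1];  /* diagonal   */
                if (cur[j - 1]  < best) best = cur[j - 1];   /* horizontal */
                cur[j] = best + local_dist(x[i], r[j]);
            }
            int *tmp = prev; prev = cur; cur = tmp;
        }
        int total = prev[nr - 1];
        free(prev);
        free(cur);
        return total;
    }

    int main(void)
    {
        /* Two short toy sequences of 2-dimensional feature vectors. */
        int input[4][DIM] = {{1, 2}, {3, 4}, {3, 4}, {5, 6}};
        int ref[3][DIM]   = {{1, 2}, {3, 4}, {5, 6}};
        printf("DTW distance: %d\n", dtw(input, 4, ref, 3));
        return 0;
    }

Each additional reference sequence stored for speaker-independent operation adds another array like ref, which is the source of the memory growth mentioned above.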
2.2 ASD algorithm
The patented Associative-Dynamic (ASD) recognizer was developed at
the Dresden University of Technology as a very cost-efficient and simple
recognizer alternative [1]. It requires ultra-low resources, is suitable for
most command-and-control tasks in mobile applications, and can be
implemented on low-cost processor platforms. Several measures support the
memory reduction and the low processing load:
- Reduced feature dimensions through a discriminative network, without
loss of classification accuracy. An associative network at the front end of
the classifier transforms the primary feature vectors x (which describe the
object to be classified and arrive from the analyzer at equidistant time
intervals) into secondary feature vectors y of reduced dimension and
improved discrimination properties. This transformation adapts the input
pattern to the statistical characteristics of the classifier's reference
knowledge. The transformation weights are optimized for a given
recognition task in a training step by an evolutionary procedure [1]
(see the transform sketch after this list).
- Task-dependent distance operators. A choice of distance operators
allows optimal classifier performance for a given recognition task and
under varying accuracy conditions (fixed- vs. floating-point arithmetic).
Local distances are calculated by applying the chosen operator to each
pair of input and reference vectors (see the distance sketch after this list).
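As an illustration of the front-end transform described in the first item, the sketch below maps a primary vector x onto a shorter secondary vector y by a trained weight matrix. The matrix values, the Q4 fixed-point scaling, and all dimensions are placeholder assumptions; the actual patented transform and its trained weights are not given in the text:

    #include <stdio.h>

    #define NX 8   /* primary feature dimension (illustrative)   */
    #define NY 3   /* reduced secondary dimension (illustrative) */

    /* Map a primary feature vector x onto a secondary vector y by a
     * weighted sum per output component: y[i] = sum_j W[i][j] * x[j].
     * The weights would come from the evolutionary training step. */
    static void associative_transform(int W[NY][NX], const int *x, int *y)
    {
        for (int i = 0; i < NY; i++) {
            int acc = 0;
            for (int j = 0; j < NX; j++)
                acc += W[i][j] * x[j];
            y[i] = acc >> 4;  /* rescale in fixed-point (Q4 weights) */
        }
    }

    int main(void)
    {
        /* Placeholder weights in Q4 fixed-point (16 == 1.0). */
        int W[NY][NX] = {
            {16, 0, 0, 16, 0, 0, 0, 0},
            {0, 16, 0, 0, 16, 0, 0, 0},
            {0, 0, 16, 0, 0, 16, 0, 0},
        };
        int x[NX] = {3, 1, 4, 1, 5, 9, 2, 6};
        int y[NY];

        associative_transform(W, x, y);
        for (int i = 0; i < NY; i++)
            printf("y[%d] = %d\n", i, y[i]);
        return 0;
    }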
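For the second item, one plausible way to realize selectable distance operators is a function pointer chosen per recognition task. The two operators shown (city-block for cheap fixed-point arithmetic, squared Euclidean where more dynamic range is available) are common examples assumed here, not necessarily the operator set of the ASD recognizer:

    #include <stdio.h>
    #include <stdlib.h>

    #define DIM 4  /* feature dimension (illustrative) */

    typedef long (*dist_op)(const int *a, const int *b);

    /* City-block distance: cheap on fixed-point hardware. */
    static long city_block(const int *a, const int *b)
    {
        long d = 0;
        for (int k = 0; k < DIM; k++)
            d += labs((long)a[k] - b[k]);
        return d;
    }

    /* Squared Euclidean distance: needs a higher dynamic range. */
    static long sq_euclid(const int *a, const int *b)
    {
        long d = 0;
        for (int k = 0; k < DIM; k++) {
            long diff = (long)a[k] - b[k];
            d += diff * diff;
        }
        return d;
    }

    int main(void)
    {
        int in[DIM]  = {1, 2, 3, 4};
        int ref[DIM] = {2, 2, 1, 4};

        /* The operator is selected once per recognition task. */
        dist_op op = city_block;
        printf("city-block: %ld\n", op(in, ref));
        op = sq_euclid;
        printf("sq-euclid:  %ld\n", op(in, ref));
        return 0;
    }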