Digital Signal Processing Reference
The SNR values are derived by the following equation:
where X is the mean power of the disturbed signal (speech present,
averaged over the entire utterance), N is the power of the noise (no speech
present), and SNR is the resulting subtraction coefficient.
Experiments with speech material recorded in a running car at different
speeds (50 km/h, 90 km/h, 120 km/h) on streets with a plain surface showed
recognition results similar to those of the fan-sound experiments.
For streets with a cobbled surface, the WRR decreased by about 5 % per
condition because the background noise was no longer as stationary. As a
result, voice activity boundaries were detected less accurately, which,
together with a poorer noise estimate, led to a lower word recognition
rate.
A second improvement was gained by using a stationarity detector in the
VAD and by switching the word detectors' parameters according to the
estimated stationarity of the background noise, which makes the detection
of voice events more or less sensitive.
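The text does not say how stationarity is estimated or which parameters
are switched, so the following C fragment is only an illustrative sketch.
It assumes a crude stationarity measure (the spread between the largest
and smallest recent noise-frame energy in dB) and two hypothetical
parameter sets, with a less sensitive detector chosen for non-stationary
noise. All names, fields, and values are assumptions.

```c
#include <stddef.h>

/* Hypothetical word-detector parameter set; fields and values are
   illustrative assumptions, not taken from the source. */
typedef struct {
    double energy_threshold_db; /* level above noise floor that flags speech */
    int    hangover_frames;     /* frames still kept as speech after a drop */
} WordDetectorParams;

static const WordDetectorParams PARAMS_STATIONARY    = {  6.0,  8 };
static const WordDetectorParams PARAMS_NONSTATIONARY = { 12.0, 15 };

/* Assumed stationarity measure: max minus min of the recent noise-frame
   energies in dB. A small spread is treated as stationary noise. */
double noise_spread_db(const double *frame_energy_db, size_t n)
{
    double lo, hi;
    size_t i;

    if (n == 0)
        return 0.0;
    lo = hi = frame_energy_db[0];
    for (i = 1; i < n; ++i) {
        if (frame_energy_db[i] < lo) lo = frame_energy_db[i];
        if (frame_energy_db[i] > hi) hi = frame_energy_db[i];
    }
    return hi - lo;
}

/* Pick the parameter set that matches the estimated stationarity. */
const WordDetectorParams *select_params(const double *frame_energy_db,
                                        size_t n, double max_spread_db)
{
    return (noise_spread_db(frame_energy_db, n) <= max_spread_db)
               ? &PARAMS_STATIONARY
               : &PARAMS_NONSTATIONARY;
}
```

A two-way switch keeps the memory and processing cost negligible. A real
detector might instead interpolate the parameters or add hysteresis so the
setting does not toggle on every frame.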
5. ISSUES OF HARDWARE DEPENDENT IMPLEMENTATION
To port the recognizer to a low-cost hardware platform and have it work
optimally there, several requirements have to be met. The optimizations aim
at reducing memory, processing load, and the need for computational
accuracy. Because the ASD recognizer was intended to run on several
platforms, it was implemented in ANSI C, which makes the code highly
portable. It was first implemented on a 32-bit floating-point DSP running
at 60 MHz; later, 16-bit fixed-point DSPs were used. Processor-specific
changes to the code base were applied only rarely, either to speed up core
routines (assembly subroutines) or to interface with hardware components
such as the codec, displays, UART, or CAN.
Vendors of car equipment can use the speech recognizer as a single OEM
board (see Figure 12-7) or as a sub-application on their own processor.
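Moving from the floating-point DSP to the 16-bit fixed-point DSPs mentioned
above usually means replacing floating-point arithmetic with scaled
integers. The text does not describe the number format that was used, so
the following C fragment is only a sketch of one common choice, the Q15
format (values in [-1, 1) stored in 16 bits), with rounding and saturation.
The type and function names are hypothetical, and the fixed-width types
from stdint.h are a C99 convenience rather than strict ANSI C.

```c
#include <stdint.h>

typedef int16_t q15_t; /* Q15: value = integer / 32768, range [-1, 1) */

/* Saturating Q15 addition: clamps instead of wrapping on overflow. */
q15_t q15_add(q15_t a, q15_t b)
{
    int32_t s = (int32_t)a + (int32_t)b;

    if (s > 32767)  s = 32767;
    if (s < -32768) s = -32768;
    return (q15_t)s;
}

/* Q15 multiplication with rounding and saturation.
   The 32-bit product is in Q30; adding 2^14 and shifting right by 15
   rounds it back to Q15. The right shift of a negative value is assumed
   to be arithmetic, which holds on common compilers and DSP toolchains
   but is implementation-defined in C. */
q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;

    p = (p + (1 << 14)) >> 15;
    if (p > 32767) /* only reachable for (-1) * (-1) */
        p = 32767;
    return (q15_t)p;
}
```

For example, 0.5 is stored as 16384, and q15_mul(16384, 16384) gives 8192,
which is 0.25. On many fixed-point DSPs such routines map to single
saturating multiply-accumulate instructions, which is one reason core
routines are sometimes rewritten in assembly.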