NOISE ROBUST SPEECH RECOGNITION USING PROSODIC INFORMATION - DSP for In-Vehicle and Mobile Systems


Figure 9-4. Comparison of the digit error rates by S-HMM and SP-HMM-DV for each speaker. In this experiment, 10 dB exhibition-hall noise was added to the test set.

were optimized for each noise condition. Digit accuracy improved under every combination of noise type and prosodic feature. SP-HMM-DV performed best, which means that the effects of the and the maximum accumulated voting value are additive. The largest improvement, an absolute gain of 4.5 points (from 45.3% to 49.8%), was observed when exhibition-hall noise was added at 10 dB SNR and the prosodic feature P-DV was used.
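The 4.5 figure quoted above is an absolute difference in digit accuracy. As a quick check of the arithmetic, the following sketch recomputes it and also derives the corresponding relative error reduction. That second quantity is not reported in the text, and it assumes that the error rate is simply 100% minus the digit accuracy. The function names are illustrative only.

```python
def absolute_gain(baseline_acc, improved_acc):
    """Difference in accuracy, in percentage points."""
    return improved_acc - baseline_acc


def relative_error_reduction(baseline_acc, improved_acc):
    """Fraction of the baseline error removed (accuracies given in percent).

    Assumption: error rate = 100 - accuracy.
    """
    baseline_err = 100.0 - baseline_acc
    improved_err = 100.0 - improved_acc
    return (baseline_err - improved_err) / baseline_err


# Accuracies quoted in the text for exhibition-hall noise at 10 dB SNR.
gain = absolute_gain(45.3, 49.8)
rer = relative_error_reduction(45.3, 49.8)
print(f"absolute gain: {gain:.1f} points")
print(f"relative error reduction: {100 * rer:.1f}%")
```

Under that assumption, the 4.5-point gain corresponds to removing roughly 8% of the baseline errors, which puts the absolute figure in context for a condition in which the baseline accuracy is below 50%.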

Figure 9-4 shows the digit recognition accuracies of S-HMM and SP-HMM-DV for each speaker. In this experiment, 10 dB exhibition-hall noise was added to the test set. An improvement was observed for every speaker, which means that the proposed method is useful for speaker-independent recognition.
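The text does not describe how the noise was mixed into the test set. One common way to add noise at a prescribed SNR is to scale the noise so that the ratio of speech power to noise power matches the target, as in the minimal sketch below. The function names and the synthetic signals (a sine tone standing in for speech, Gaussian noise standing in for the exhibition-hall recording) are assumptions made for illustration, not the authors' procedure.

```python
import math
import random


def mean_power(samples):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_noise_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_noise = mean_power(noise)
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    # Target noise power follows from SNR_dB = 10 * log10(P_speech / P_noise).
    target_noise_power = mean_power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins: a sine tone for speech, Gaussian noise for the hall recording.
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 200.0 * t / 8000.0) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]
noisy = add_noise_at_snr(speech, noise, snr_db=10.0)

# Recover the added noise and confirm the achieved SNR.
added = [y - s for y, s in zip(noisy, speech)]
achieved_snr_db = 10.0 * math.log10(mean_power(speech) / mean_power(added))
print(f"achieved SNR: {achieved_snr_db:.2f} dB")
```

This sketch measures power over the whole signal. A real evaluation might instead measure speech power over active speech segments only, which would change the scaling factor.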

Figure 9-5 shows the improvement of digit recognition accuracy as a function of the prosodic stream weight at each SNR. Results for four kinds of noises
