NOISE ROBUST SPEECH RECOGNITION USING PROSODIC INFORMATION - DSP for In-Vehicle and Mobile Systems


Figure 9-2. An example of the prosodic features in Japanese connected digit speech for a male speaker's utterance, “9053308” “3797298”, with 20 dB SNR white noise.

In this paper, two kinds of prosodic features, P-D and P-V, and their combination, P-DV, are investigated:

P-D:

P-V: maximum accumulated voting value

P-DV: the P-D feature + maximum accumulated voting value

Each of these three prosodic features is combined with the segmental features of every frame, so three kinds of segmental-prosodic feature vectors are built and evaluated.
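The frame-wise combination described above can be sketched as a simple concatenation. This is only an illustration under stated assumptions: the text does not say how the streams are joined, so appending the prosodic values to the end of each segmental vector is assumed here, and the helper names (`append_prosodic`, `build_feature_sets`), the 2-dimensional segmental vectors, and all numeric values are hypothetical placeholders rather than anything taken from the paper.

```python
from typing import Dict, List, Sequence

Frame = List[float]


def append_prosodic(segmental: Sequence[Sequence[float]],
                    prosodic: Sequence[Sequence[float]]) -> List[Frame]:
    """Concatenate segmental and prosodic features frame by frame."""
    if len(segmental) != len(prosodic):
        raise ValueError("both streams must have the same number of frames")
    return [list(s) + list(p) for s, p in zip(segmental, prosodic)]


def build_feature_sets(segmental: Sequence[Sequence[float]],
                       p_d: Sequence[float],
                       p_v: Sequence[float]) -> Dict[str, List[Frame]]:
    """Build the three segmental-prosodic variants named in the text.

    p_d and p_v hold one scalar per frame; how they are computed is
    outside the scope of this sketch.
    """
    if not (len(segmental) == len(p_d) == len(p_v)):
        raise ValueError("all inputs must have the same number of frames")
    return {
        "P-D": append_prosodic(segmental, [[d] for d in p_d]),
        "P-V": append_prosodic(segmental, [[v] for v in p_v]),
        "P-DV": append_prosodic(segmental,
                                [[d, v] for d, v in zip(p_d, p_v)]),
    }


# Toy input: 3 frames of a hypothetical 2-dimensional segmental vector.
segmental = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
p_d = [0.01, -0.02, 0.0]   # placeholder P-D values, one per frame
p_v = [12.0, 15.0, 9.0]    # placeholder maximum accumulated voting values

feature_sets = build_feature_sets(segmental, p_d, p_v)
for name, frames in feature_sets.items():
    print(name, "dimension per frame:", len(frames[0]))
```

With this layout the P-D and P-V vectors are one dimension longer than the segmental vector and the P-DV vectors two dimensions longer, so each variant can be passed to the same recognizer with only the feature dimension changed.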
