NOISE ROBUST SPEECH RECOGNITION USING PROSODIC INFORMATION - DSP for In-Vehicle and Mobile Systems - page 141

2. F0 EXTRACTION USING THE HOUGH TRANSFORM

2.1 Hough Transform

The Hough transform is a technique to robustly extract parametric patterns, such as lines, circles, and ellipses, from a noisy image [3].

The Hough transform method to extract a significant line from an image on the $x$-$y$ plane can be formulated as follows. Suppose the image consists of pixels at $(x_i, y_i)$, $i = 1, \dots, N$. Every pixel on the $x$-$y$ plane is transformed to a line on the $m$-$c$ plane as

$$c = -x_i m + y_i.$$

A brightness value of the pixel on the $x$-$y$ plane is accumulated at every point on the line. This process is called “voting” to the $m$-$c$ plane. After voting for all the pixels, the maximum accumulated voting value on the $m$-$c$ plane is detected, and the peak point $(\hat{m}, \hat{c})$ is transformed to a line on the $x$-$y$ plane by the following equation:

$$y = \hat{m} x + \hat{c}.$$
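The voting procedure can be illustrated with a minimal Python sketch. It assumes the slope-intercept parameterization, in which each pixel maps to a line in parameter space; the function name `hough_line_mc`, the symbols `m` and `c`, and the discretization of the parameter plane into caller-supplied grids are illustrative choices, not details given in the text.

```python
import numpy as np

def hough_line_mc(image, slopes, intercepts):
    """Brightness-weighted Hough voting in a slope-intercept (m, c) plane.

    image[y, x] is the brightness of the pixel at (x, y).
    slopes, intercepts: evenly spaced parameter grids (an assumption of this sketch).
    Returns the slope and intercept of the strongest cell and the accumulator.
    """
    slopes = np.asarray(slopes, dtype=float)
    intercepts = np.asarray(intercepts, dtype=float)
    acc = np.zeros((len(slopes), len(intercepts)))
    step = intercepts[1] - intercepts[0]
    rows = np.arange(len(slopes))

    ys, xs = np.nonzero(image)  # pixels with zero brightness cast no vote
    for x, y in zip(xs, ys):
        # Each pixel (x, y) becomes the line c = -x*m + y in the (m, c) plane.
        c = -x * slopes + y
        cols = np.rint((c - intercepts[0]) / step).astype(int)
        ok = (cols >= 0) & (cols < len(intercepts))
        # "Voting": accumulate the pixel brightness along that line.
        acc[rows[ok], cols[ok]] += image[y, x]

    # The peak of the accumulator gives the parameters of the detected line.
    i, j = np.unravel_index(np.argmax(acc), acc.shape)
    return slopes[i], intercepts[j], acc
```

For example, an image whose only bright pixels lie on y = x + 2 yields a peak at slope 1 and intercept 2 when those values are on the grids. The resolution of the two grids controls both accuracy and cost, since every pixel casts one vote per slope value.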

2.2 F0 Extraction Using the Hough Transform

Cepstral peaks extracted independently for each short period of speech have been widely used to extract F0 values. For noisy speech, this method often causes errors, including half pitch, double pitch, and dropouts. Since F0 contours have temporal continuity in voiced periods, the Hough transform, which takes advantage of this continuity, is expected to extract pitch robustly in noisy environments when applied to time-cepstrum images.
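As a point of comparison, the conventional frame-by-frame approach can be sketched as follows. This is a minimal illustration of cepstral peak picking, not the authors' implementation: the function name `cepstral_peak_f0`, the search range, and the small constant added before the logarithm are assumptions. It relies on the standard relation that a cepstral peak at quefrency index q (in samples) corresponds to F0 = fs / q.

```python
import numpy as np

def cepstral_peak_f0(frame, fs=16000, f0_min=62.5, f0_max=400.0):
    """Estimate F0 of a single frame from its largest cepstral peak.

    f0_min and f0_max bound the search and are illustrative assumptions.
    """
    windowed = frame * np.hamming(len(frame))
    # Real cepstrum: inverse transform of the log magnitude spectrum.
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-10)
    cepstrum = np.fft.irfft(log_mag)

    # Convert the F0 search range to a quefrency range in samples.
    q_lo = int(fs / f0_max)
    q_hi = int(fs / f0_min)
    q_peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return fs / q_peak
```

Because each frame is decided on its own, a noise-induced peak at half or twice the true quefrency is accepted without any check against neighboring frames, which is the source of the double-pitch and half-pitch errors mentioned above.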

Speech waveforms are sampled at 16 kHz and transformed to 256-dimensional cepstra. A 32 ms Hamming window is used to extract frames every 10 ms. To reduce noise effects in the high-frequency domain, we extract and use time-cepstrum images that are limited to dimensions 60 to 256 and liftered according to the following formula:

where is the original cepstrum and is the liftered cepstrum.
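The framing and cepstrum computation described here can be sketched in Python as below. The sketch uses the stated sampling rate, window length, and frame shift; the function name `time_cepstrum_image`, the zero-based half-open index range 60 to 256, and the logarithm floor are assumptions. The liftering step is deliberately left out, because its formula is not reproduced in this text.

```python
import numpy as np

def time_cepstrum_image(signal, fs=16000, win_ms=32, shift_ms=10,
                        q_lo=60, q_hi=256):
    """Build a time-cepstrum image of shape (n_frames, q_hi - q_lo).

    Assumes len(signal) is at least one window long.
    Liftering is NOT applied here; only the index range is restricted.
    """
    win_len = int(fs * win_ms / 1000)   # 512 samples at 16 kHz
    shift = int(fs * shift_ms / 1000)   # 160 samples at 16 kHz
    window = np.hamming(win_len)
    n_frames = 1 + (len(signal) - win_len) // shift

    image = np.empty((n_frames, q_hi - q_lo))
    for t in range(n_frames):
        frame = signal[t * shift : t * shift + win_len] * window
        log_mag = np.log(np.abs(np.fft.rfft(frame)) + 1e-10)
        cepstrum = np.fft.irfft(log_mag)   # real cepstrum of this frame
        image[t] = cepstrum[q_lo:q_hi]     # keep the restricted index range
    return image
```

Each row of the result is one frame and each column one cepstrum index, so a steady pitch appears as a bright, nearly straight ridge along the time axis. At 16 kHz, indices 60 to 256 correspond to F0 values between roughly 267 Hz and 62.5 Hz.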

A nine-frame moving window is applied to the liftered time-cepstrum image at every frame interval to extract an image for line information detection. The time-cepstrum image is used as the pixel brightness image for the Hough transform. An F0 value is obtained from a cepstrum index of the center point for the
