Pitch Estimation and Voiced–Unvoiced Classification of Speech - Digital Speech: Coding for Low Bit Rate Communication Systems


the normalized TA for a pitch candidate τ is given by

$$R_T(\tau) = \frac{\sum_{n=0}^{N-\tau-1} s(n)\,s(n+\tau)}{\sqrt{\sum_{n=0}^{N-\tau-1} s^2(n)\,\sum_{n=0}^{N-\tau-1} s^2(n+\tau)}} \tag{6.18}$$

which differs from the autocorrelation method discussed earlier in the limits of the summations (the earlier method was more like a cross-correlation). The TA has been widely used for PDAs due to its relatively good performance, especially over noisy speech signals [2].

Autocorrelation can also be used in the frequency domain to bring out spectral similarities, which are mainly due to the pitch frequency spacing of the harmonics. If the spectrum of windowed speech is given by $S(m) = A(m)e^{j\theta(m)}$ for $0 \le m \le M-1$, where $A(m)$ and $\theta(m)$ are the magnitude and phase of the spectrum, the normalized spectral autocorrelation (SA), $R_S(\tau)$, can be defined as

$$R_S(\tau) = \frac{\sum_{m=0}^{M/2-\omega_\tau} A_z(m)\,A_z(m+\omega_\tau)}{\sqrt{\sum_{m=0}^{M/2-\omega_\tau} A_z^2(m)\,\sum_{m=0}^{M/2-\omega_\tau} A_z^2(m+\omega_\tau)}}, \quad \text{for } T_0^{(l)} \le \tau \le T_0^{(u)} \tag{6.19}$$

where $\omega_\tau = \lfloor M/\tau + 0.5 \rfloor$, and $T_0^{(l)}$ and $T_0^{(u)}$ are the lower and upper limits for the pitch search. In equation (6.19), the zero-crossing spectrum $A_z(m)$ is given by

$$A_z(m) = A(m) - g\,\bar{A}(m) \tag{6.20}$$

where $\bar{A}(m)$ is the spectral envelope of $A(m)$. The envelope may be estimated using the peak-picking method [17, 18]. The magnitude spectrum, $A(m)$, is converted into the zero-crossing spectrum $A_z(m)$ to make it suitable for the autocorrelation defined in equation (6.19). The gain, $g$, is calculated as:

$$g = \frac{\sum_{m=0}^{M/2} A(m)\,\bar{A}(m)}{\sum_{m=0}^{M/2} \bar{A}(m)\,\bar{A}(m)} \tag{6.21}$$
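Equations (6.18) to (6.21) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the book's implementation: the function names are hypothetical, the frame length $N$ is taken to be the length of the input array, the normalization is assumed to be the square root of the product of the two energy sums, and $\omega_\tau$ is assumed to be $M/\tau$ rounded to the nearest bin. The spectral envelope is passed in as an argument; the peak-picking estimator of [17, 18] is not reproduced here.

```python
import numpy as np


def normalized_ta(s, tau):
    """Normalized time-domain autocorrelation, equation (6.18), for one lag tau."""
    s = np.asarray(s, dtype=float)
    n_terms = len(s) - tau                  # n runs over 0 .. N - tau - 1
    a, b = s[:n_terms], s[tau:]             # s(n) and s(n + tau)
    denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0


def zero_crossing_spectrum(A, A_env):
    """Zero-crossing spectrum, equations (6.20) and (6.21).

    A     : magnitude spectrum for bins 0 .. M/2
    A_env : spectral envelope for the same bins (supplied by the caller;
            how it is estimated is outside this sketch)
    """
    A = np.asarray(A, dtype=float)
    A_env = np.asarray(A_env, dtype=float)
    g = np.dot(A, A_env) / np.dot(A_env, A_env)   # gain, equation (6.21)
    return A - g * A_env                          # equation (6.20)


def normalized_sa(A_z, M, tau):
    """Normalized spectral autocorrelation, equation (6.19), for one lag tau.

    A_z must hold at least M/2 + 1 bins (0 .. M/2) of an M-point spectrum.
    """
    A_z = np.asarray(A_z, dtype=float)
    w = int(np.floor(M / tau + 0.5))        # harmonic spacing in bins (rounding assumed)
    n_terms = M // 2 - w + 1                # m runs over 0 .. M/2 - w
    if n_terms < 1:
        raise ValueError("tau is too small for this spectrum length")
    a, b = A_z[:n_terms], A_z[w:w + n_terms]    # A_z(m) and A_z(m + w)
    denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0
```

A pitch search would evaluate either function for every candidate $\tau$ between $T_0^{(l)}$ and $T_0^{(u)}$ and look for the largest value. Because both measures are normalized, a perfectly periodic input scores close to 1 at more than one lag (multiples of the period for the TA, submultiples for the SA), so the maximum alone does not rule out pitch doubling or halving.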

In equation (6.20), the logarithmic spectrum can also be considered to obtain a zero-crossing spectrum. However, the SA with the logarithmic spectrum produces a high correlation ratio for large lags, $\tau$, close to $T_0^{(u)}$ (small
