Digital Signal Processing Reference
fully unvoiced and the last being fully voiced. One method of deciding the
frequency marker is to place it at the end of the last voiced harmonic of
the spectrum, i.e. all the voiced harmonics are included in the voiced band
of the spectrum. A better solution for determining the frequency marker,
based on a soft-decision process, is described in [9]. The harmonic amplitudes
are estimated using equations (8.11) and (8.13) for voiced and unvoiced
harmonics respectively; however, the LPC residual is used instead of the
speech signal. The LPC parameters are quantized and interpolated in the LSF
domain. The shape of the harmonic-amplitude vector is vector-quantized and
the gain is scalar-quantized separately.
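The shape/gain split in the last step can be sketched as follows. This is only an illustrative decomposition: the helper name and the RMS normalization convention are assumptions, not the codec's actual codebook design.

```python
import numpy as np

def split_shape_gain(amplitudes):
    """Split a harmonic-amplitude vector into a unit-RMS shape
    (to be vector-quantized) and a scalar gain (quantized separately).
    Sketch only: the real quantizer's norm convention is not specified here."""
    amps = np.asarray(amplitudes, dtype=float)
    gain = np.sqrt(np.mean(amps ** 2))       # RMS gain (assumed convention)
    shape = amps / gain if gain > 0 else amps
    return shape, gain

shape, gain = split_shape_gain([1.0, 2.0, 2.0])
# gain * shape reconstructs the original amplitude vector
```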
At the receiving end, speech is synthesized with parameter interpolation
based on the pitch cycle waveform (PCW). First, intermediate PCWs for the
current subframe are generated by interpolating the quantized model parameters
of the last and current subframes.
of the last and current subframes. The excitation signal e i (n) ,0
n < T 0 ,i ,for
the i th PCW is produced as
V c
e i (n
+
n i )
=
A e,i (l) cos
{
0 ,i (n
n i )
}
l
=
1
H
+
A e,i (l) cos
{
0 ,i (n
n i ) +
U [
π , π ]
}
(8.25)
l
=
V c +
1
where $H$ is the total number of harmonics, $V_c$ is the index of the last
voiced harmonic, $\omega_{0,i} = 2\pi / T_{0,i}$, and $U[-\pi, \pi]$
denotes a random number with uniform distribution between $-\pi$ and $\pi$. The
start position $n_i$ for the $i$th PCW is given by

$$n_i = n_0 + \sum_{j=0}^{i-1} T_{0,j} \qquad (8.26)$$
where $n_0$ is the start position corresponding to the last position of the previous
subframe. The interpolated pitch $T_{0,i}$ for the $i$th PCW is calculated as

$$T_{0,i} = \alpha_i T_0^{(t-1)} + (1 - \alpha_i) T_0^{(t)} \qquad (8.27)$$
where $T_0^{(t)}$ is the received pitch of the $t$th subframe. The interpolation factor $\alpha_i$
is defined as

$$\alpha_i = \frac{G^{(t)} N_i}{G^{(t-1)} (N - N_i) + G^{(t)} N_i} \qquad (8.28)$$

where $N$ is the subframe size, $G^{(\cdot)}$ is the received gain, and $N_i$ is the PCW
position defined by
$$N_i = n_i + 0.25 \left( T_0^{(t-1)} + T_0^{(t)} \right) \qquad (8.29)$$
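Taken together, the synthesis of the excitation PCWs for one subframe can be sketched in code. The function below is an illustrative reading of this procedure, not the codec's implementation: the parameter names are invented, harmonic amplitudes are held fixed across PCWs for simplicity, the cosine phase is referenced to each cycle's start, and the interpolated pitch is rounded to an integer sample count.

```python
import numpy as np

def pcw_excitation(A, Vc, T_prev, T_cur, G_prev, G_cur, N, n0=0, rng=None):
    """Generate the excitation PCWs for one subframe (sketch).

    A              : harmonic amplitudes A_e(l), l = 1..H (fixed here)
    Vc             : index of the last voiced harmonic
    T_prev, T_cur  : received pitch of the last / current subframe
    G_prev, G_cur  : received gain of the last / current subframe
    N              : subframe size in samples
    Returns a list of (start_position, cycle_samples) pairs.
    """
    rng = rng or np.random.default_rng(0)
    H = len(A)
    pcws, n_i = [], n0
    while n_i < N:
        N_i = n_i + 0.25 * (T_prev + T_cur)                       # PCW position, Eq. (8.29)
        alpha = G_cur * N_i / (G_prev * (N - N_i) + G_cur * N_i)  # interpolation factor, Eq. (8.28)
        T0 = alpha * T_prev + (1.0 - alpha) * T_cur               # interpolated pitch, Eq. (8.27)
        w0 = 2.0 * np.pi / T0
        n = np.arange(int(round(T0)))
        cycle = np.zeros(len(n))
        for l in range(1, H + 1):
            # Voiced harmonics (l <= Vc) keep zero phase; unvoiced ones get a
            # random phase drawn from U[-pi, pi], as in Eq. (8.25).
            phase = 0.0 if l <= Vc else rng.uniform(-np.pi, np.pi)
            cycle += A[l - 1] * np.cos(l * w0 * n + phase)
        pcws.append((n_i, cycle))
        n_i += len(n)                                             # next start, per Eq. (8.26)
    return pcws
```

Note that when the two gains are equal, $\alpha_i$ reduces to $N_i / N$, so the pitch of successive PCWs moves smoothly between the two subframes' received pitch values.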