Analysis by Synthesis LPC Coding - Digital Speech: Coding for Low Bit Rate Communication Systems



by the pitch prediction, resulting in a better system. However, at low bit-rates (increased vector sizes), during voiced onsets and transitions where the pitch cannot build up fast enough to track the changes, the speech quality deteriorates significantly. The advantage of algebraic codebooks also diminishes at low bit-rates (i.e. at around 4.8 kb/s), as the number of pulse combinations needs to be severely restricted in order to allocate fewer bits to the secondary excitation, which results in distorted speech. Another important issue at low bit-rates is the amount of noise added to the speech by the secondary excitation during steady-state voiced regions. A constrained gain approach [27] helps to produce cleaner voiced speech by limiting the power of the secondary excitation during steady-state voiced regions. This section describes an adaptive codebook excitation in which the excitation pulse positioning is made adaptive to the pitch lag computed for the same subframe. This can be seen as a subset of the algebraic codebook approach in which the pulse positions are severely restricted but made adaptive with respect to the pitch, so as to increase the chances of placing them at the locations where they are needed most.

In pitch adaptive mixed excitation (PAME), the static codebook is split into two parts. The first part is made adaptive with respect to the pitch lag D as follows. The excitation buffer is filled with a unit-amplitude sample every D samples, starting from the first location. The rest of the vector elements are set to zero. During the search of the codebook, this vector is synthesized and its phase position is determined by shifting its synthetic response one sample at a time, D − 1 times. Each phase position is then treated as a new excitation vector. In order to guard against pitch-doubling errors in the LTP search, if the lag D is greater than 2D_min, the same process is applied again with the excitation pulses placed every D/2 samples. The total number of excitation vectors searched is then found by adding up the phase positions considered. This is similar to regular pulse excitation with decimation factors of D and D/2.
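The construction of the pitch-adaptive candidates can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the book's implementation: the names `pulse_train` and `pitch_adaptive_vectors`, the subframe length `n`, and the parameter `lag_min` (standing for D_min) are hypothetical. For brevity the sketch shifts the pulse train itself rather than its synthesized response, uses floor division for D/2, and caps the number of phases at the subframe length so that no all-zero vector is generated.

```python
def pulse_train(n, spacing, phase=0):
    """Length-n vector with unit pulses at phase, phase + spacing, ...

    All other elements are zero.
    """
    if spacing < 1:
        raise ValueError("spacing must be a positive integer")
    v = [0.0] * n
    for i in range(phase, n, spacing):
        v[i] = 1.0
    return v


def pitch_adaptive_vectors(n, lag, lag_min):
    """Return (spacing, phase, vector) candidates for one subframe.

    Pulses are placed every `lag` samples, and additionally every
    lag // 2 samples when lag > 2 * lag_min (the pitch-doubling guard).
    Each spacing contributes one candidate per phase position: the
    unshifted vector plus spacing - 1 shifted copies.
    """
    spacings = [lag]
    if lag > 2 * lag_min:
        spacings.append(lag // 2)  # assumption: integer half-lag

    candidates = []
    for spacing in spacings:
        # Assumption: phases beyond the subframe would give all-zero
        # vectors, so they are skipped.
        for phase in range(min(spacing, n)):
            candidates.append((spacing, phase, pulse_train(n, spacing, phase)))
    return candidates
```

With these assumptions, a 40-sample subframe with `lag=20` and `lag_min=16` yields 20 candidates (spacing D only), whereas `lag=40` passes the pitch-doubling test and yields 40 + 20 = 60 candidates. The length of the returned list plays the role of the number of phase positions searched in the pitch-adaptive part.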

After selecting the best excitation vector from the pitch-adaptive section of the codebook using C_a phase positions, the search continues in the second part of the codebook, which is fixed and contains centre-clipped overlapping excitation. Here, a further C_f = C − C_a vectors are searched, and the index of the best-performing vector from the overall search is transmitted to the receiver. At the receiver, after decoding the pitch lag, the corresponding excitation vector is decoded.
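A two-part search over a single index range might look like the sketch below. It is illustrative only and rests on several assumptions not stated in the text: the selection criterion is the usual analysis-by-synthesis measure (maximize the squared correlation between target and synthesized vector, normalized by the synthesized energy), synthesis is a zero-state convolution with a truncated impulse response `h`, and indices 0 to C_a − 1 address the pitch-adaptive vectors while the remaining indices address the fixed part. The function names and the `total_size` parameter are hypothetical, and the fixed vectors are supplied by the caller rather than generated as centre-clipped overlapping excitation.

```python
def synthesize(excitation, h):
    """Zero-state synthesis: convolve with impulse response h,
    truncated to the subframe length."""
    n = len(excitation)
    y = [0.0] * n
    for i, e in enumerate(excitation):
        if e == 0.0:
            continue  # sparse excitation: skip zero samples
        for j in range(min(len(h), n - i)):
            y[i + j] += e * h[j]
    return y


def search_codebook(target, h, adaptive, fixed, total_size):
    """Search the adaptive vectors, then total_size - len(adaptive)
    fixed vectors, and return (best_index, gain).

    Assumed criterion: maximize corr**2 / energy, with
    gain = corr / energy for the winning vector.
    """
    c_a = len(adaptive)
    c_f = total_size - c_a  # number of fixed vectors searched
    if c_f < 0 or c_f > len(fixed):
        raise ValueError("total_size is inconsistent with the codebooks")

    best_index, best_score, best_gain = -1, -1.0, 0.0
    for index, vec in enumerate(adaptive + fixed[:c_f]):
        y = synthesize(vec, h)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero response cannot match the target
        corr = sum(t * v for t, v in zip(target, y))
        score = corr * corr / energy
        if score > best_score:
            best_index, best_score, best_gain = index, score, corr / energy
    return best_index, best_gain
```

Because the number of pitch-adaptive vectors changes with the lag, a decoder following this layout would first decode the lag, rebuild the same adaptive candidates, and only then interpret the received index. This matches the decoding order described above, although the exact index layout here is an assumption.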

By forcing the secondary excitation to have pitch structure, it is possible to match voiced onsets more accurately. This is because the pitch predictor memory builds up more quickly to track the incoming periodicity, and the secondary excitation provides the required periodicity where the pitch predictor fails. This, of course, depends on the accurate computation of the periodicity by the pitch predictor in the first place. Many other adaptation schemes may be used to accurately place the secondary excitation pulses every pitch period. The pitch predictor lag adaptation is useful because it

