Analysis by Synthesis LPC Coding - Digital Speech: Coding for Low Bit Rate Communication Systems - page 242


Table 7.5 Typical 5-pulse algebraic codebook tracks for a 40-sample subframe

Track   Pulse number   Possible locations
1       i_0            0, 5, 10, 15, 20, 25, 30, 35
2       i_1            1, 6, 11, 16, 21, 26, 31, 36
3       i_2            2, 7, 12, 17, 22, 27, 32, 37
4       i_3            3, 8, 13, 18, 23, 28, 33, 38
5       i_4            4, 9, 14, 19, 24, 29, 34, 39

together they are able to form most of the combinations necessary for adequate excitation. Since the selected pulse positions will usually correspond to the remaining major pulses, which are expected to have broadly similar magnitudes once the pitch predictor contribution has been removed, the pulses are also restricted to a common amplitude, usually set to ±1. However, in order to code the formations (indices) of the excitation vectors efficiently and to enable a fast search, the nonzero samples are usually restricted to four or five interleaved tracks. Only one or two nonzero pulses, each with either a positive or a negative sign, are placed in each track. Table 7.5 shows typical five-pulse interleaved track positions in a 40-sample excitation subframe. Using the possibilities shown in Table 7.5, the codebook vector x(n) is formed by placing five unit-magnitude pulses, one per track, in a 40-sample vector, with all other locations set to zero.

\[
x(n) = s_0\,\delta(n - m_0) + s_1\,\delta(n - m_1) + s_2\,\delta(n - m_2) + s_3\,\delta(n - m_3) + s_4\,\delta(n - m_4), \qquad n = 0, \ldots, 39 \tag{7.82}
\]

where s_i and m_i are the sign and position of the i-th pulse, and δ(0) represents unity pulse amplitude.
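The construction in (7.82) can be sketched in a few lines of Python. This is a minimal illustration that assumes exactly one signed pulse per track, laid out as in Table 7.5. The function names, the 0-based track and slot numbering, and the 4-bits-per-pulse index packing are hypothetical choices made for this sketch, not a format taken from the text.

```python
SUBFRAME = 40                                   # samples per excitation subframe
NUM_TRACKS = 5                                  # interleaved tracks (Table 7.5)
POSITIONS_PER_TRACK = SUBFRAME // NUM_TRACKS    # 8 candidate positions per track


def track_positions(track):
    """Allowed pulse positions for a 0-based track index (track 0 is Track 1 in Table 7.5)."""
    return [track + NUM_TRACKS * j for j in range(POSITIONS_PER_TRACK)]


def build_codevector(slots, signs):
    """Form x(n) as in (7.82).

    slots[i] (0..7) selects one of the eight positions in track i;
    signs[i] is +1 or -1.
    """
    if len(slots) != NUM_TRACKS or len(signs) != NUM_TRACKS:
        raise ValueError("expected one slot and one sign per track")
    x = [0] * SUBFRAME
    for i, (slot, sign) in enumerate(zip(slots, signs)):
        if sign not in (1, -1):
            raise ValueError("pulse sign must be +1 or -1")
        m_i = track_positions(i)[slot]          # pulse position m_i
        x[m_i] = sign                           # s_i * delta(n - m_i)
    return x


def pack_index(slots, signs):
    """Hypothetical packing: 1 sign bit + 3 position bits per pulse, 20 bits in total."""
    index = 0
    for slot, sign in zip(slots, signs):
        sign_bit = 0 if sign > 0 else 1
        index = (index << 4) | (sign_bit << 3) | slot
    return index


# One signed pulse per track gives (8 positions * 2 signs) ** 5 candidate vectors.
CODEBOOK_SIZE = (POSITIONS_PER_TRACK * 2) ** NUM_TRACKS

x = build_codevector(slots=[0, 2, 7, 1, 3], signs=[1, -1, 1, 1, -1])
nonzero = [(n, v) for n, v in enumerate(x) if v != 0]
print(nonzero)          # pulse positions and signs
print(CODEBOOK_SIZE)    # number of candidate vectors
```

Because every position is a simple function of the track and slot numbers, no codebook table needs to be stored: the vector is regenerated from its index whenever it is needed.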

The total number of excitation vectors that an algebraic codebook can produce is quite large: with one signed pulse in each of the five tracks of Table 7.5, for example, there are (8 × 2)^5 = 2^20 = 1,048,576 combinations. A full search of all possible excitations therefore becomes prohibitive for real-time implementations. However, algebraic codebooks are designed to reduce this complexity significantly. Once the synthetic output for each excitation vector has been obtained, two quantities need to be computed: the cross-correlation of the synthesized signal with the target signal (the weighted input speech with the memory response of the LPC and perceptual-weighting filters and the pitch predictor contribution removed), and the energy of the synthesized signal. The best excitation sequence is then selected by maximizing:

\[
A_k = \frac{\left(d^{t} x_k\right)^{2}}{x_k^{t}\, x_k} \tag{7.83}
\]
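The selection rule can be illustrated with a brute-force sketch that follows the prose description directly: each candidate vector is passed through a synthesis filter, and the candidate with the largest ratio of squared cross-correlation (with the target) to synthesized energy is kept. Everything concrete here is an assumption made for illustration: the filter is represented by a short made-up impulse response `h`, the codebook is a toy one with two tracks in an 8-sample frame so that a full search stays cheap, and the function names are hypothetical. It is the exhaustive search that the text calls prohibitive at full size, not a fast algebraic search.

```python
from itertools import product


def synthesize(x, h):
    """Zero-state filter output: x convolved with impulse response h, truncated to len(x)."""
    n = len(x)
    y = [0.0] * n
    for m, pulse in enumerate(x):
        if pulse:
            for j, hj in enumerate(h[: n - m]):
                y[m + j] += pulse * hj
    return y


def criterion(target, y):
    """Squared cross-correlation with the target divided by the synthesized energy."""
    corr = sum(t * v for t, v in zip(target, y))
    energy = sum(v * v for v in y)
    return corr * corr / energy if energy > 0.0 else 0.0


def full_search(target, h, tracks):
    """Try every combination of one signed pulse per track; return (best score, best vector)."""
    n = len(target)
    best_score, best_x = -1.0, None
    for positions in product(*tracks):
        for signs in product((1, -1), repeat=len(tracks)):
            x = [0] * n
            for m, s in zip(positions, signs):
                x[m] = s
            score = criterion(target, synthesize(x, h))
            if score > best_score:
                best_score, best_x = score, x
    return best_score, best_x


# Toy setup (assumed values): 8-sample frame, two interleaved tracks, short impulse response.
toy_tracks = [[0, 2, 4, 6], [1, 3, 5, 7]]
h = [1.0, 0.5, 0.25]
true_x = [0, 0, 1, 0, 0, -1, 0, 0]
target = synthesize(true_x, h)          # target generated from a known codevector

best_score, best_x = full_search(target, h, toy_tracks)
print(best_x, best_score)
```

Because the correlation is squared, a vector and its negation score equally; the sketch keeps whichever it meets first. With the 40-sample layout of Table 7.5 the same loops would visit 2^20 candidates per subframe, which is why a practical coder avoids this kind of full search.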
