Digital Signal Processing Reference
In-Depth Information
Table 7.5 Typical 5-pulse algebraic codebook tracks for a 40-sample subframe

Track   Pulse number   Possible locations
1       i_0            0, 5, 10, 15, 20, 25, 30, 35
2       i_1            1, 6, 11, 16, 21, 26, 31, 36
3       i_2            2, 7, 12, 17, 22, 27, 32, 37
4       i_3            3, 8, 13, 18, 23, 28, 33, 38
5       i_4            4, 9, 14, 19, 24, 29, 34, 39

together they are able to form most of the combinations necessary for adequate
excitation. Since the selected pulse positions will usually correspond to the
remaining major pulses, which are expected to have somewhat similar magnitudes
once the pitch predictor contribution has been removed, the pulses are also
restricted to a common amplitude, usually ±1. To code the excitation vector
indices efficiently and to enable a fast search, the nonzero samples are
usually confined to four or five interleaved tracks. Only one or two nonzero
pulses, each with either a positive or a negative sign, are placed in each
track. Table 7.5 shows typical five-pulse interleaved track positions in a
40-sample excitation subframe. Using the possibilities shown in Table 7.5, the
codebook vector x(n) is formed by placing five unit-magnitude pulses, one per
track, in a 40-sample vector and setting all other locations to zero:
$$
x(n) = s_0\,\delta(n - m_0) + s_1\,\delta(n - m_1) + s_2\,\delta(n - m_2) + s_3\,\delta(n - m_3) + s_4\,\delta(n - m_4), \qquad n = 0, \ldots, 39 \tag{7.82}
$$
where $s_i$ and $m_i$ are the sign and position of the $i$th pulse and $\delta(0)$ represents
unity pulse amplitude.
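As a minimal sketch of how Eq. (7.82) and Table 7.5 fit together, the following Python code builds a codebook vector from one signed pulse per track and packs the choice into an index. The function names and the 20-bit layout (one sign bit plus three position bits per pulse) are assumptions made for illustration; they are not taken from the text or from any particular coding standard. Tracks are numbered from 0 in the code, whereas Table 7.5 numbers them from 1.

```python
import numpy as np

SUBFRAME = 40   # samples per subframe (Table 7.5)
N_TRACKS = 5    # one pulse per track in this sketch
STEP = 5        # spacing between allowed positions within a track

# Track k (0-based) holds positions k, k+5, ..., k+35, as in Table 7.5.
TRACKS = [list(range(k, SUBFRAME, STEP)) for k in range(N_TRACKS)]


def build_codevector(slots, signs):
    """Form x(n) of Eq. (7.82).

    slots[i] in 0..7 selects the position m_i within track i;
    signs[i] is +1 or -1 and plays the role of s_i.
    """
    x = np.zeros(SUBFRAME)
    for track, slot, sign in zip(TRACKS, slots, signs):
        x[track[slot]] = sign
    return x


def pack_index(slots, signs):
    """Pack the pulse choices into an integer (hypothetical layout:
    4 bits per pulse, sign bit above 3 position bits)."""
    index = 0
    for slot, sign in zip(slots, signs):
        sign_bit = 0 if sign > 0 else 1
        index = (index << 4) | (sign_bit << 3) | slot
    return index


def unpack_index(index):
    """Inverse of pack_index: recover (slots, signs) from the integer."""
    slots, signs = [], []
    for _ in range(N_TRACKS):
        nibble = index & 0xF
        slots.append(nibble & 0x7)
        signs.append(-1 if nibble >> 3 else 1)
        index >>= 4
    return slots[::-1], signs[::-1]
```

For example, `build_codevector([2, 0, 7, 3, 1], [1, -1, 1, -1, -1])` places pulses at samples 10, 1, 37, 18 and 9. Under this assumed layout every codebook entry fits in 20 bits, since each of the five pulses needs 3 bits for one of 8 positions and 1 bit for its sign.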
The total number of excitation vectors that an algebraic codebook can produce
is large: with one signed pulse in each of the five tracks of Table 7.5 there
are (8 × 2)^5 = 2^20 (about one million) candidates. A full search over all
possible excitations is therefore prohibitive for real-time implementations.
However, algebraic codebooks are structured so that this complexity can be
reduced significantly. Once the synthetic output for an excitation vector has
been obtained, the cross-correlation of the synthesized signal with the target
signal (the weighted input speech after removing the memory response of the
LPC and perceptual-weighting filters and the pitch predictor contribution) and
the energy of the synthesized signal need to be computed. The best excitation
sequence is then selected by maximizing:
$$
A_k = \frac{(d^t x_k)^2}{x_k^t \Phi x_k} \tag{7.83}
$$
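The selection criterion can be sketched in Python as follows. The passage above does not define the terms of Eq. (7.83), so this sketch assumes the usual convention for algebraic codebook searches: $H$ is the lower-triangular convolution matrix of the weighted synthesis filter's impulse response $h(n)$, $d = H^t\,\text{target}$ is the backward-filtered target, and $\Phi = H^t H$ holds the impulse-response correlations. All function names are hypothetical.

```python
import numpy as np


def impulse_response_matrix(h, n):
    """Lower-triangular Toeplitz matrix H such that H @ x is the
    zero-state response of the filter with impulse response h to x."""
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            if i - j < len(h):
                H[i, j] = h[i - j]
    return H


def search_terms(h, target):
    """Precompute d and Phi once per subframe (assumed definitions:
    d = H^t target, Phi = H^t H)."""
    H = impulse_response_matrix(h, len(target))
    d = H.T @ target
    phi = H.T @ H
    return d, phi


def criterion(x, d, phi):
    """A_k of Eq. (7.83): squared cross-correlation divided by the
    energy of the synthesized signal."""
    return (d @ x) ** 2 / (x @ phi @ x)


def best_candidate(candidates, d, phi):
    """Return the index of the candidate vector that maximizes A_k."""
    scores = [criterion(x, d, phi) for x in candidates]
    return int(np.argmax(scores))
```

Precomputing `d` and `phi` means that no candidate has to be passed through the synthesis filter during the search. Because each codebook vector has only a few nonzero samples of magnitude one, `d @ x` and `x @ phi @ x` reduce to sums over a handful of entries of `d` and `phi`, which is one reason the algebraic structure lends itself to a fast search.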