Digital Signal Processing Reference
The starting position $n_0^{(t+1)}$ for the next subframe is updated as

$$n_0^{(t+1)} = \left(n_I^{(t)} + T_{0,I}^{(t)}\right) \,\%\, N \tag{8.30}$$
where % is the modulo operator and $I$ is the total number of PCWs. The voicing cut-off index, $V_c$, is given by

$$V_c = \max\left\{V_c^{(t-1)},\, V_c^{(t)}\right\} \tag{8.31}$$
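The two updates above can be sketched in a few lines. This is only an illustration under stated assumptions: positions and pitch periods are taken to be integer sample counts, $N$ is assumed to be the subframe length in samples (it is not defined in this passage), and the function and parameter names are hypothetical.

```python
def next_start_position(n_last, t0_last, n_sub):
    """Equation (8.30): start of the next subframe's first PCW.

    n_last  -- start position of the last PCW in the current subframe
    t0_last -- pitch period of that last PCW, in samples
    n_sub   -- assumed subframe length N, in samples
    """
    return (n_last + t0_last) % n_sub


def voicing_cutoff(vc_prev, vc_curr):
    """Equation (8.31): use the larger of the two cut-off indices."""
    return max(vc_prev, vc_curr)


if __name__ == "__main__":
    # A PCW starting at sample 150 with a 50-sample pitch period overruns
    # a 160-sample subframe by 40 samples.
    print(next_start_position(150, 50, 160))
    print(voicing_cutoff(3, 5))
```

Taking the maximum in (8.31) means a harmonic is treated as lying below the cut-off if either the previous or the current frame places it there.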
The interpolated amplitude, $A_{e,i}(l)$, for the $l$th harmonic is computed as

$$A_{e,i}(l) = \begin{cases}
\alpha_i A_e^{(t-1)}(l) + (1 - \alpha_i) A_e^{(t)}(l), & \text{if } V^{(t-1)}(l) = V^{(t)}(l) \\
A_e^{(t-1)}(l), & \text{if } V^{(t-1)}(l) = 1 \;\&\; V^{(t)}(l) = 0 \\
A_e^{(t)}(l), & \text{if } V^{(t-1)}(l) = 0 \;\&\; V^{(t)}(l) = 1
\end{cases} \tag{8.32}$$
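A minimal sketch of the three cases in equation (8.32) follows. The list-based interface and all names are assumptions made for illustration; `alpha` stands for the interpolation weight $\alpha_i$ of the $i$th PCW, and voicing flags are 1 (voiced) or 0 (unvoiced) as in the text.

```python
def interpolate_amplitudes(a_prev, a_curr, v_prev, v_curr, alpha):
    """Per-harmonic amplitude interpolation following equation (8.32).

    a_prev, a_curr -- harmonic amplitudes of frames t-1 and t
    v_prev, v_curr -- per-harmonic voicing flags (1 voiced, 0 unvoiced)
    alpha          -- weight given to the previous frame
    """
    out = []
    for ap, ac, vp, vc in zip(a_prev, a_curr, v_prev, v_curr):
        if vp == vc:
            # Same voicing state in both frames: linear interpolation.
            out.append(alpha * ap + (1.0 - alpha) * ac)
        elif vp == 1:
            # Voiced -> unvoiced: hold the previous amplitude.
            out.append(ap)
        else:
            # Unvoiced -> voiced: use the current amplitude.
            out.append(ac)
    return out


if __name__ == "__main__":
    print(interpolate_amplitudes([1.0, 2.0, 3.0], [3.0, 4.0, 5.0],
                                 [1, 1, 0], [1, 0, 1], alpha=0.25))
```

Interpolating only when the voicing state is unchanged avoids blending a voiced amplitude with an unvoiced one across a transition.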
where $V^{(\cdot)}(l)$ is the voicing information for the $l$th harmonic, and 1 and 0 in the voicing comparison denote voiced and unvoiced, respectively. The LPC coefficients for the $i$th PCW are interpolated in the same way as the interpolated pitch is obtained. Finally, the normalized speech signal $\tilde{s}_i(n)$ is reconstructed by exciting the LPC synthesis filter $h_i(n)$ with the signal $e_i(n)$ in equation (8.25), as
$$\tilde{s}_i(n) = e_i(n) * h_i(n) \tag{8.33}$$

where $*$ is the convolution operator. In the calculation of $\tilde{s}_i(n)$, the required memory for $e_i(n)$, $n < 0$, can be obtained from $e_{i-1}(n)$ or the excitation signal of the last subframe. The synthesized speech signal $s_i(n)$ for the $i$th PCW is produced by compensating for the gain as

$$s_i(n) = G_i \sqrt{\frac{T_{0,i}}{\sum_{n=0}^{T_{0,i}-1} \tilde{s}_i^{2}(n)}}\; \tilde{s}_i(n) \tag{8.34}$$
where $G_i$ is the interpolated gain based on the relative position of the PCW in the subframe. Concatenating the PCWs from equation (8.34) forms the final speech signal.
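The synthesis and gain-compensation steps for one PCW can be sketched as below. Several points are assumptions rather than the book's method: the filter $h_i(n)$ is represented by a truncated impulse response, the convolution is evaluated directly, the gain step is taken to rescale the waveform so that its RMS value equals $G_i$, and all names are hypothetical.

```python
import math


def synthesize_pcw(e, e_mem, h, gain):
    """Convolve the excitation with h (8.33), then rescale to the gain (8.34).

    e     -- excitation samples of this PCW (length T0)
    e_mem -- excitation samples for n < 0, oldest first (may be empty)
    h     -- truncated impulse response of the LPC synthesis filter (assumed)
    gain  -- interpolated gain G_i, treated here as the target RMS value
    """
    t0 = len(e)
    ext = list(e_mem) + list(e)      # memory followed by current excitation
    off = len(e_mem)                 # index of n = 0 inside ext
    s_norm = []
    for n in range(t0):
        acc = 0.0
        for k, hk in enumerate(h):
            idx = off + n - k
            if idx >= 0:             # samples older than the memory count as 0
                acc += hk * ext[idx]
        s_norm.append(acc)

    energy = sum(x * x for x in s_norm)
    if energy == 0.0:
        return [0.0] * t0            # silent cycle: nothing to rescale
    scale = gain * math.sqrt(t0 / energy)
    return [scale * x for x in s_norm]


if __name__ == "__main__":
    out = synthesize_pcw([1.0, 0.0, 0.0, 0.0], [], [1.0, 0.5], gain=2.0)
    rms = math.sqrt(sum(x * x for x in out) / len(out))
    print(out, rms)
```

With this scaling the RMS value of each output cycle equals `gain` regardless of the energy of the filtered excitation, which is one reading of "compensating for the gain".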
The above description of excitation generation is based on the sinusoidal
synthesis of voiced and random noise generation of unvoiced parts of the
excitation. However, in practice, a DFT-based method (with the DFT size
equal to the pitch period), where the unvoiced frequencies would have
random phases, can be used to generate both voiced and unvoiced parts
jointly [10, 11].
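As a rough illustration of the DFT-based idea, the sketch below builds one pitch cycle of excitation from harmonic amplitudes, giving unvoiced harmonics a random phase. It is not the method of [10, 11]: the inverse DFT of size $T_0$ is written out as an explicit sum of cosines at the bin frequencies instead of calling an FFT routine, the uniform phase distribution is an assumption, and all names are hypothetical.

```python
import math
import random


def dft_excitation_cycle(amps, phases, voiced, t0, rng=None):
    """One pitch cycle (t0 samples) with harmonic l placed on DFT bin l.

    amps, phases -- amplitude and phase of harmonics l = 1, 2, ...
    voiced       -- per-harmonic voicing flags (1 voiced, 0 unvoiced)
    rng          -- optional random.Random instance for repeatable phases
    """
    rng = rng or random.Random()
    cycle = [0.0] * t0
    for l, (a, ph, v) in enumerate(zip(amps, phases, voiced), start=1):
        if 2 * l > t0:
            break                    # harmonic would lie above half the DFT size
        if not v:
            # Unvoiced harmonic: replace the phase with a random one.
            ph = rng.uniform(-math.pi, math.pi)
        for n in range(t0):
            cycle[n] += a * math.cos(2.0 * math.pi * l * n / t0 + ph)
    return cycle


if __name__ == "__main__":
    print(dft_excitation_cycle([1.0, 0.5], [0.0, 0.0], [1, 0], 8,
                               rng=random.Random(0)))
```

Because the DFT size equals the pitch period, every harmonic falls exactly on a bin, so voiced and unvoiced components are produced by the same summation and differ only in how their phases are chosen.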