[Figure 18.9: Block diagram of MELP decoder. Block labels: Pitch, Aperiodic flag, Pulse generation, Noise generator, Shaping filter (two instances), Fourier magnitudes, Gain, Adaptive spectral enhancement, LPC synthesis filter, Pulse dispersion filter, Synthesized speech.]
1. The input is first filtered using a low-pass filter with a cutoff of 1 kHz.
2. The normalized autocorrelation is then computed for lags between 40 and 160 samples.
The normalized autocorrelation $r(\tau)$ is defined as

$$r(\tau) = \frac{c_\tau(0,\tau)}{\sqrt{c_\tau(0,0)\, c_\tau(\tau,\tau)}}$$

where

$$c_\tau(m,n) = \sum_{k=-\lfloor \tau/2 \rfloor - 80}^{-\lfloor \tau/2 \rfloor + 79} y_{k+m}\, y_{k+n}$$
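The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the buffer layout (an `origin` index standing in for sample 0), the function names, and the use of NumPy are not from the text, and the summation limits are taken as running from -floor(τ/2) - 80 to -floor(τ/2) + 79, which gives a 160-sample window.

```python
import numpy as np

def c(y, origin, tau, m, n):
    """Correlation term c_tau(m, n): sum of y[k+m] * y[k+n] over 160 values of k.

    `origin` is the array index treated as sample 0 (an assumption of this
    sketch; the text does not describe how the signal is buffered).
    """
    k = origin + np.arange(-(tau // 2) - 80, -(tau // 2) + 80)
    # Guard against silent wrap-around from negative NumPy indices.
    if k[0] + min(m, n) < 0 or k[-1] + max(m, n) >= len(y):
        raise IndexError("correlation window falls outside the buffer")
    return float(np.dot(y[k + m], y[k + n]))

def normalized_autocorrelation(y, origin, tau):
    """r(tau) = c(0, tau) / sqrt(c(0, 0) * c(tau, tau))."""
    num = c(y, origin, tau, 0, tau)
    den = np.sqrt(c(y, origin, tau, 0, 0) * c(y, origin, tau, tau, tau))
    return num / den if den > 0.0 else 0.0

def autocorrelation_over_lags(y, origin, lags=range(40, 161)):
    """Evaluate r(tau) for every lag from 40 to 160 samples, as in step 2."""
    return {tau: normalized_autocorrelation(y, origin, tau) for tau in lags}
```

With a 400-sample buffer and `origin = 200`, every window needed for lags 40 through 160 stays inside the array. The input `y` is assumed to be the already low-pass-filtered signal from step 1.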
The first estimate of the pitch $P_1$ is obtained as the value of $\tau$ that maximizes the normalized autocorrelation function. This value is refined by looking at the signal filtered using a filter with a passband in the 0-500 Hz range. This stage uses two values of $P_1$, one from the current frame and one from the previous frame, as candidates. The normalized autocorrelation values are obtained for lags from five samples less to five samples more than the candidate $P_1$ values. The lags that provide the maximum normalized autocorrelation value for each candidate are used for fractional pitch refinement. The idea behind fractional pitch refinement is that if the maximum value of $r(\tau)$ is found for some $\tau = T$, then the maximum could be in the interval $(T-1, T]$ or $[T, T+1)$. The fractional offset is computed using
$$\Delta = \frac{c_T(0,T+1)\, c_T(T,T) - c_T(0,T)\, c_T(T,T+1)}{c_T(0,T+1)\left[c_T(T,T) - c_T(T,T+1)\right] + c_T(0,T)\left[c_T(T+1,T+1) - c_T(T,T+1)\right]} \tag{26}$$
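As a rough illustration, the fractional offset expression can be evaluated directly from the five correlation terms it uses. The sketch below is self-contained and makes several assumptions that are not in the text: the name `delta` for the offset, the `origin` index standing in for sample 0, the 160-sample window from -floor(T/2) - 80 to -floor(T/2) + 79, and the choice to return 0 when the denominator vanishes.

```python
import numpy as np

def c(y, origin, tau, m, n):
    """Correlation term c_tau(m, n) over a 160-sample window (layout assumed)."""
    k = origin + np.arange(-(tau // 2) - 80, -(tau // 2) + 80)
    if k[0] + min(m, n) < 0 or k[-1] + max(m, n) >= len(y):
        raise IndexError("correlation window falls outside the buffer")
    return float(np.dot(y[k + m], y[k + n]))

def fractional_offset(y, origin, T):
    """Evaluate the fractional offset for integer lag T from five c_T terms."""
    c0_T = c(y, origin, T, 0, T)            # c_T(0, T)
    c0_T1 = c(y, origin, T, 0, T + 1)       # c_T(0, T+1)
    cT_T = c(y, origin, T, T, T)            # c_T(T, T)
    cT_T1 = c(y, origin, T, T, T + 1)       # c_T(T, T+1)
    cT1_T1 = c(y, origin, T, T + 1, T + 1)  # c_T(T+1, T+1)

    num = c0_T1 * cT_T - c0_T * cT_T1
    den = c0_T1 * (cT_T - cT_T1) + c0_T * (cT1_T1 - cT_T1)
    # Returning 0 for a degenerate denominator is a choice made for this
    # sketch only; the text does not say how that case is handled.
    return num / den if den != 0.0 else 0.0

# Hypothetical usage: a sinusoid that repeats every 50 samples.
samples = np.arange(400)
signal = np.sin(2 * np.pi * samples / 50.0)
delta = fractional_offset(signal, origin=200, T=50)
```

One way to check the expression: if the signal repeats exactly every $T$ samples, then $c_T(0,T+1) = c_T(T,T+1)$ and $c_T(0,T) = c_T(T,T)$, so the numerator vanishes and the offset is zero. If it repeats exactly every $T+1$ samples, numerator and denominator coincide and the offset is one.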
 