Digital Signal Processing Reference
10.2.1 ITU-T G.729B/G.723.1A VAD
As an extension to the G.729 speech coder, ITU-T SG16 released G.729 Annex
B to support DTX by means of VAD, CNI, and CNG. G.729B makes
a VAD decision for every 10 ms frame, using four different parameters:
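a full-band energy difference, $\Delta E_f = \bar{E}_f - E_f$,
a low-band energy difference, $\Delta E_l = \bar{E}_l - E_l$,
a spectral distortion, $\Delta LSF = \sum_{i=0} (LSF_i - \overline{LSF}_i)^2$, summed over the line spectral frequencies,
a zero-crossing rate difference, $\Delta ZC = \overline{ZC} - ZC$,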
where $E_f$, $E_l$, $LSF_i$, and $ZC$ are the full-band energy, low-band energy, $i$-th
line spectral frequency, and zero-crossing rate of the input signal. $\bar{E}_f$, $\bar{E}_l$,
$\overline{LSF}_i$, and $\overline{ZC}$ are the noise-characterizing parameters, updated using the
background noise.
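As a rough illustration of the four difference parameters, the sketch below computes them from one frame's measurements and the running noise estimates. The names `VadParams` and `difference_parameters` are hypothetical, and the sign convention (noise estimate minus current-frame value for the energy and zero-crossing differences) is an assumption, not something taken from the G.729B reference code.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class VadParams:
    """One set of VAD parameters: either a frame's values or the noise estimates."""
    e_full: float          # full-band energy (dB)
    e_low: float           # low-band energy (dB)
    lsf: Sequence[float]   # line spectral frequencies
    zc: float              # zero-crossing rate


def difference_parameters(frame: VadParams,
                          noise: VadParams) -> Tuple[float, float, float, float]:
    """Return (dE_f, dE_l, dLSF, dZC) for one frame.

    Assumed convention: energy and zero-crossing differences are
    noise estimate minus frame value; the spectral distortion is the
    squared Euclidean distance between the two LSF vectors.
    """
    if len(frame.lsf) != len(noise.lsf):
        raise ValueError("LSF vectors must have the same length")
    d_ef = noise.e_full - frame.e_full
    d_el = noise.e_low - frame.e_low
    d_lsf = sum((f - n) ** 2 for f, n in zip(frame.lsf, noise.lsf))
    d_zc = noise.zc - frame.zc
    return d_ef, d_el, d_lsf, d_zc
```

A frame that is much louder than the noise floor yields strongly negative energy differences under this convention, which is the kind of point the classifier would place in the active region.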
The block diagram of G.729B VAD is shown in Figure 10.3. The input
parameters for the VAD can be obtained from the input signal or from
the intermediate values of the speech encoder. Subsequently, the difference
parameters, $\Delta E_f$, $\Delta E_l$, $\Delta LSF$, and $\Delta ZC$, are computed from the input and
noise parameters. The voice activity decision is made in a four-dimensional
hyperspace using a region classification technique, followed
by a hangover scheme. The noise parameters are updated using a first-order
autoregressive (AR) scheme if the full-band energy difference is less
than a certain fixed threshold. ITU-T G.723.1A VAD has a structure similar
to G.729B VAD.
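The two bookkeeping steps described here, the gated first-order AR noise update and the hangover, can be sketched as follows. This is a minimal illustration under stated assumptions: `ar_update`, `update_noise`, and `Hangover` are hypothetical names, the smoothing factor `beta`, the threshold, and the hold length are free parameters rather than values from the standard, and the counter-based hangover is a generic stand-in for the more elaborate decision smoothing in G.729B.

```python
from typing import Dict


def ar_update(noise_value: float, frame_value: float, beta: float) -> float:
    """First-order AR smoothing: new = beta * old + (1 - beta) * current."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * noise_value + (1.0 - beta) * frame_value


def update_noise(noise: Dict[str, float], frame: Dict[str, float],
                 d_ef: float, threshold_db: float, beta: float) -> Dict[str, float]:
    """Update every noise parameter only when d_ef is below the threshold."""
    if d_ef >= threshold_db:
        return dict(noise)  # gate closed: keep the old estimates
    return {key: ar_update(noise[key], frame[key], beta) for key in noise}


class Hangover:
    """Keep the decision 'active' for hold_frames frames after activity stops."""

    def __init__(self, hold_frames: int) -> None:
        if hold_frames < 0:
            raise ValueError("hold_frames must be non-negative")
        self.hold_frames = hold_frames
        self._remaining = 0

    def smooth(self, raw_active: bool) -> bool:
        if raw_active:
            self._remaining = self.hold_frames  # re-arm the counter
            return True
        if self._remaining > 0:
            self._remaining -= 1                # still inside the hangover
            return True
        return False
```

For example, with `Hangover(2)` a raw decision sequence of active, inactive, inactive, inactive is smoothed so that the first two inactive frames are still reported as active, which avoids clipping low-energy speech tails.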
10.2.2 ETSI GSM-FR/HR/EFR VAD
The VAD algorithms of ETSI GSM-FR, -HR, and -EFR have a common structure,
in which the predictive residual energy is compared with an adaptive
Figure 10.3 Block diagram of ITU-T G.729B VAD (blocks: Differential Parameters Computation, Multi-Boundary VAD Decision, Hangover, and Noise Parameters Update, the last gated by a yes/no test labeled "E f < 15dB"; output: Active/Inactive)
 