5.8 MSVQ Performance Analysis
In order to compare the relative performance of various MSVQs, quantizers
have been trained using the same training database, which has the following
characteristics:
- MIRS and FLAT filtered speech in various languages is included
- Only speech-active regions are included
- LSFs are extracted at an update rate of 20 ms, over a 200-sample Hamming window
- A bandwidth expansion factor of 0.994 is applied to the LP coefficients prior to LSF conversion
- 50 000 sets of LSF coefficients are included
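The bandwidth expansion step in the list above can be sketched as follows. This is a minimal illustration (the function name and the form of the input are our own assumptions): each LP coefficient a_k is scaled by γ^k, with γ = 0.994, before conversion to LSFs, which slightly widens the formant bandwidths and improves the numerical conditioning of the conversion.

```python
def bandwidth_expand(lp_coeffs, gamma=0.994):
    """Apply bandwidth expansion to LP coefficients.

    lp_coeffs: [a_1, ..., a_p] from the LP analysis (a_0 = 1 is omitted).
    Each a_k is scaled by gamma**k, k = 1..p.
    """
    return [a * gamma ** (k + 1) for k, a in enumerate(lp_coeffs)]
```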
The speech database used here is rather too small to produce quantizers that perform well in real-life applications; typically, a database of over 1 000 000 LSF vectors is used to train codebooks for deployed systems. For the purpose of comparing the performance of various quantization schemes, however, the smaller database is adequate and gives indicative results. It also greatly reduces the time required to train the quantizers; with the larger database, typical codebook training would require several weeks of computation.
5.8.1 Codebook Structures
For a given bit rate, MSVQ and SVQ codebooks can differ in the number
of stages and in the vector splits. The actual structure of the quantizer
affects complexity and memory storage, as discussed earlier, but also affects
performance. Typically, the more structure imposed on the codebooks, the
lower the complexity and storage, but also the poorer the performance.
All of the SVQ and MSVQ quantizers have been trained using 24 bits, for
various numbers of stages, from 2 to 5. The configurations used are shown in
Table 5.7. The results are plotted in Figure 5.11. As expected, the performance
is directly linked to the amount of structure present in the codebook.
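To make the storage side of this trade-off concrete, the memory required by a split or multi-stage codebook is the sum, over stages, of the number of codewords (2 to the power of the stage's bit allocation) times the vector dimension. The helper below is an illustrative sketch, not from the text; the dimension of 10 LSFs per vector is an assumption.

```python
def msvq_storage(stage_bits, dim=10):
    """Number of stored floats for an MSVQ with the given bits per stage,
    quantizing vectors of dimension `dim` (10 LSFs assumed here)."""
    return sum(2 ** b for b in stage_bits) * dim

# A single unstructured 24-bit codebook versus three 8-bit stages:
msvq_storage([24])       # 167,772,160 floats
msvq_storage([8, 8, 8])  # 7,680 floats
```

The example shows why imposing structure is attractive: splitting 24 bits into three stages cuts storage by more than four orders of magnitude, at the cost of some performance.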
5.8.2 Search Techniques
In order to compare the performance of various types of searches available
for a given codebook, an MSVQ codebook of 21 bits, using three stages of
7 bits each, has been trained. It uses no prediction and the search algorithm
used during training was a sequential search (SS). The performance of the
codebook was then measured using SS, FS, and TS with values of M from
2 to 32. The WMSE, average SD, and number of outliers at 2 dB are plotted.
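As a rough sketch of the searches being compared, an M-best tree search (TS) over the MSVQ stages reduces to the sequential search (SS) when M = 1. The code below is illustrative only: it uses a plain squared error rather than the WMSE, and the ranking criterion at each stage is the norm of the remaining residual.

```python
import numpy as np

def msvq_tree_search(x, codebooks, M=8):
    """M-best tree search over MSVQ stages (M = 1 gives sequential search).

    codebooks: list of (2**b_s, dim) arrays, one per stage.
    Returns (chosen indices, final squared quantization error).
    """
    # Each path: (squared norm of current residual, residual, chosen indices)
    paths = [(float(np.dot(x, x)), x, [])]
    for cb in codebooks:
        candidates = []
        for _, res, idx in paths:
            d = res[None, :] - cb        # residual after each candidate codeword
            e = np.sum(d * d, axis=1)    # distortion so far on each branch
            for j in np.argsort(e)[:M]:
                candidates.append((float(e[j]), d[j], idx + [j]))
        paths = sorted(candidates, key=lambda p: p[0])[:M]  # keep M best paths
    best = paths[0]
    return best[2], best[0]
```

With toy one-dimensional codebooks one can see why TS outperforms SS: a greedy stage-1 choice can strand the search away from the jointly best codeword combination, while keeping M > 1 partial paths recovers it.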