BANDWIDTH EXTENSION OF TELEPHONY SPEECH - Adaptive Signal Processing: Next Generation Solutions


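has been used. An initial estimated broadband spectral envelope is generated by multiplying the matrix $\mathbf{W}_{i_{\mathrm{opt}}(n)}$ with the cepstral representation of the mean value compensated input narrowband spectral envelope, $\mathbf{c}_{\mathrm{nb}}(n) - \mathbf{m}_{i_{\mathrm{opt}}(n),\mathrm{nb}}$. Afterwards the broadband mean value $\mathbf{m}_{i_{\mathrm{opt}}(n),\mathrm{bb}}$ is added, resulting in

\[
\mathbf{c}_{\mathrm{bb}}(n) = \mathbf{W}_{i_{\mathrm{opt}}(n)} \left( \mathbf{c}_{\mathrm{nb}}(n) - \mathbf{m}_{i_{\mathrm{opt}}(n),\mathrm{nb}} \right) + \mathbf{m}_{i_{\mathrm{opt}}(n),\mathrm{bb}} . \tag{7.56}
\]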
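The mapping of Eq. (7.56) can be sketched in a few lines of NumPy. This is only an illustration: the function name `map_envelope`, the array layout (one matrix and two mean vectors per codebook entry, stacked along the first axis) and all numerical values are assumptions made for this example and are not taken from the text.

```python
import numpy as np

def map_envelope(c_nb, i_opt, W, m_nb, m_bb):
    """Apply Eq. (7.56) for one frame using the parameters of codebook entry i_opt."""
    return W[i_opt] @ (c_nb - m_nb[i_opt]) + m_bb[i_opt]

# Hypothetical toy parameters: 2 codebook entries, 2 narrowband and
# 3 broadband cepstral coefficients.
W = np.array([[[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]],
              [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]])
m_nb = np.array([[1.0, 2.0], [0.0, 0.0]])
m_bb = np.array([[1.0, 2.0, 0.0], [0.0, 0.0, 1.0]])

c_nb = np.array([2.0, 4.0])  # narrowband cepstral vector of the current frame
c_bb = map_envelope(c_nb, i_opt=0, W=W, m_nb=m_nb, m_bb=m_bb)
print(c_bb)
```

The index `i_opt` is assumed to come from the codebook classification of the current frame, which is not shown here.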

After the vector containing the cepstral coefficients has been transformed into a corresponding vector of all-pole filter coefficients, denoted here by $\tilde{\mathbf{a}}_{\mathrm{bb}}(n)$, its stability needs to be checked. Depending on the outcome, either the result of the linear mapping operation or the corresponding entry $\mathbf{a}_{i_{\mathrm{opt}}(n),\mathrm{bb}}$ of the broadband predictor codebook is output as $\mathbf{a}_{\mathrm{bb}}(n)$:

\[
\mathbf{a}_{\mathrm{bb}}(n) =
\begin{cases}
\mathbf{a}_{i_{\mathrm{opt}}(n),\mathrm{bb}}, & \text{if instability was detected}, \\
\tilde{\mathbf{a}}_{\mathrm{bb}}(n), & \text{else}.
\end{cases}
\tag{7.57}
\]

Note that stability of the broadband predictor codebook entries can be ensured during the training stage of the codebook.
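The selection rule of Eq. (7.57) might look as follows. The text does not say how stability is tested; this sketch assumes a root-based test and a coefficient convention $A(z) = 1 + a_1 z^{-1} + \dots + a_p z^{-p}$ in which the leading 1 is not stored. The names `is_stable` and `select_predictor` are hypothetical.

```python
import numpy as np

def is_stable(a):
    """Return True if the all-pole filter 1/A(z) has all poles inside the unit circle.

    Assumed convention: A(z) = 1 + a[0] z^-1 + ... + a[p-1] z^-p.
    """
    poles = np.roots(np.concatenate(([1.0], a)))
    return bool(np.all(np.abs(poles) < 1.0))

def select_predictor(a_mapped, a_codebook_entry):
    """Eq. (7.57): keep the mapped coefficients unless they are unstable."""
    return a_mapped if is_stable(a_mapped) else a_codebook_entry

# Hypothetical second-order examples.
a_codebook_entry = np.array([-1.0, 0.5])   # poles at (1 +/- j)/2, stable
a_unstable = np.array([0.0, 1.21])         # poles at +/- 1.1j, unstable

print(is_stable(a_codebook_entry), is_stable(a_unstable))
a_bb = select_predictor(a_unstable, a_codebook_entry)
```

Checking the reflection coefficients obtained from a step-down recursion would be a cheaper alternative to computing the roots; the root test is used here only because it is short.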

The training of this combined approach can be split into two separate stages. The training of the codebook is independent of the subsequent linear mapping vectors and matrices and can be conducted as described in the previous section. Then the entire training data is grouped into $N_{i,\mathrm{cb}}$ sets, each containing all feature vectors classified to one specific codebook entry $\mathbf{c}_{i,\mathrm{nb}}$. For each of these subsets of the training material a single mapping matrix and two mean vectors (narrowband and wideband) are trained according to the method described in Section 7.5.2.

Figure 7.18 illustrates, as an example, the principle of the combined approach, in which a preclassification by a codebook is followed by an individual linear mapping for each codebook entry. The small dots at the bottom of each diagram represent data points in the input feature space (for this example we limit ourselves to a two-dimensional space). The big dot represents the centroid, which is the Euclidean mean over all input vectors. The surface in Figure 7.18(a) represents the mapping of the input vectors to one feature of the output vectors; this would be the desired function for the extrapolation task. In pure codebook approaches we map the output feature of all vectors that fall into one cell to a single value [see part (b) of Figure 7.18]. If we now combine a codebook classification with linear mapping, as illustrated in part (c), a plane is placed according to the data points within each cell such that its overall distance to the original surface is minimal. This results in a smaller error when processing an input vector that is close to a cell border. When comparing the approximations of parts (b) and (c) with the true mapping depicted in part (a), the improvement due to the linear postprocessing stage is clearly visible.
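The two-stage training and the comparison between parts (b) and (c) of the figure can be imitated with synthetic data. The sketch below rests on several assumptions that are not taken from the text: the codebook is given as fixed centroids instead of being trained, the mapping matrices are fitted by ordinary least squares (which may differ from the method of Section 7.5.2), the pure codebook output is approximated by the broadband mean of each cell, and the target surface is an arbitrary smooth function. All function names are hypothetical.

```python
import numpy as np

def classify(x, codebook_nb):
    """Index of the nearest codebook entry (Euclidean distance) for each row of x."""
    dist = np.linalg.norm(x[:, None, :] - codebook_nb[None, :, :], axis=2)
    return np.argmin(dist, axis=1)

def train_mappings(x_nb, y_bb, codebook_nb):
    """Fit one matrix and two mean vectors per cell (every cell must contain data)."""
    idx = classify(x_nb, codebook_nb)
    models = []
    for i in range(len(codebook_nb)):
        xi, yi = x_nb[idx == i], y_bb[idx == i]
        m_nb, m_bb = xi.mean(axis=0), yi.mean(axis=0)
        # Least-squares solution of (xi - m_nb) @ W.T ~= (yi - m_bb).
        w_t, *_ = np.linalg.lstsq(xi - m_nb, yi - m_bb, rcond=None)
        models.append((w_t.T, m_nb, m_bb))
    return models

def predict(x_nb, codebook_nb, models, use_mapping=True):
    """Piecewise-linear output (use_mapping=True) or piecewise-constant output."""
    idx = classify(x_nb, codebook_nb)
    out = []
    for x, i in zip(x_nb, idx):
        W, m_nb, m_bb = models[i]
        out.append(W @ (x - m_nb) + m_bb if use_mapping else m_bb)
    return np.array(out)

# Synthetic two-dimensional input space with one output feature.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(2000, 2))
y = (np.sin(2.0 * x[:, 0]) + x[:, 1] ** 2)[:, None]
codebook_nb = np.array([[-0.5, -0.5], [-0.5, 0.5], [0.5, -0.5], [0.5, 0.5]])

models = train_mappings(x, y, codebook_nb)
mse_cb = np.mean((predict(x, codebook_nb, models, use_mapping=False) - y) ** 2)
mse_map = np.mean((predict(x, codebook_nb, models, use_mapping=True) - y) ** 2)
print(f"codebook only: {mse_cb:.4f}, codebook + linear mapping: {mse_map:.4f}")
```

On the training data the piecewise-linear error cannot exceed the piecewise-constant one, because a zero matrix would reproduce the constant output of each cell. How large the gain is on unseen data depends on the signal and is not shown by this toy setup.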

Besides postprocessing with linear mapping, individually trained neural networks can also be utilized, which again improves performance. Although the computational complexity increases only moderately, the memory requirements of combined approaches are significantly larger than those of single approaches.
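A rough parameter count shows where the additional memory goes. The sizes below (64 codebook entries, 12 narrowband and 20 broadband coefficients) are purely hypothetical, and the sketch assumes that the mean vectors of the linear mapping are stored in addition to the codebook entries.

```python
def parameter_counts(n_entries, dim_nb, dim_bb):
    """Number of stored values for a pure codebook and for the combined approach."""
    # Pure codebook: one narrowband and one broadband vector per entry.
    codebook = n_entries * (dim_nb + dim_bb)
    # Linear mapping stage: one matrix and two mean vectors per entry.
    mapping = n_entries * (dim_bb * dim_nb + dim_nb + dim_bb)
    return codebook, codebook + mapping

codebook_only, combined = parameter_counts(n_entries=64, dim_nb=12, dim_bb=20)
print(codebook_only, combined, combined / codebook_only)
```

The matrix term grows with the product of the two vector dimensions, which is why it dominates the memory of the combined approach under these assumptions.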
