Digital Signal Processing Reference
broadband envelope are computed for all frames of a large database and finally averaged:
$$D_{\mathrm{Eu,log}} = \frac{1}{N} \sum_{n=0}^{N-1} d_{\mathrm{Eu,log}}(n). \tag{7.59}$$
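The averaging step in (7.59) can be sketched as follows. This is only a minimal illustration, assuming the per-frame distances have already been computed; the function name `average_distance` and the sample values are hypothetical and not taken from the text.

```python
import numpy as np

def average_distance(frame_distances):
    """Average the per-frame distances d(n), n = 0..N-1, as in (7.59)."""
    d = np.asarray(frame_distances, dtype=float)
    if d.size == 0:
        raise ValueError("at least one frame is required")
    # Sum over all N frames and normalize by N.
    return d.sum() / d.size

# Hypothetical per-frame distances (in dB) for a tiny "database" of four frames.
example = [2.0, 4.0, 3.0, 7.0]
print(average_distance(example))
```

In practice the list would hold one entry per frame of the whole evaluation database rather than four values.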
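The distance measure presented above is usually applied only to the spectral envelopes of the estimated and the original signal. This means that we set the extended spectrum to the estimated broadband envelope $\hat{A}_{\mathrm{bb}}$ and the reference spectrum to the original broadband envelope $A_{\mathrm{bb}}$:
$$S_{\mathrm{ext}}(e^{j\Omega}, n) = \hat{A}_{\mathrm{bb}}(e^{j\Omega}, n), \tag{7.60}$$
$$S_{\mathrm{bb}}(e^{j\Omega}, n) = A_{\mathrm{bb}}(e^{j\Omega}, n). \tag{7.61}$$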
For this reason, this distance is an adequate measure for evaluating the estimation of the vocal tract transfer function. If the entire bandwidth extension is to be evaluated, distance measures that take the spectral fine structure as well as the characteristics of the human auditory system into account need to be applied.⁹ To derive such a distance measure, we first define the difference between the logarithmic squared magnitudes of the estimated and the original broadband spectra of the current frame as
$$\Delta(e^{j\Omega}, n) = 20 \log_{10} \left| S_{\mathrm{ext}}(e^{j\Omega}, n) \right| - 20 \log_{10} \left| S_{\mathrm{bb}}(e^{j\Omega}, n) \right|. \tag{7.62}$$
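The difference spectrum of (7.62) could be computed per frame as in the sketch below. The function name `log_spectral_difference` and the magnitude floor `floor` are assumptions added for illustration; the floor merely prevents taking the logarithm of zero and is not part of the definition in the text.

```python
import numpy as np

def log_spectral_difference(s_ext, s_bb, floor=1e-10):
    """Difference of the logarithmic magnitude spectra in dB, as in (7.62).

    s_ext, s_bb: spectra of one frame, sampled on the same frequency grid.
    floor: hypothetical lower bound on the magnitudes to avoid log10(0).
    """
    mag_ext = np.maximum(np.abs(s_ext), floor)
    mag_bb = np.maximum(np.abs(s_bb), floor)
    return 20.0 * np.log10(mag_ext) - 20.0 * np.log10(mag_bb)

# Hypothetical three-bin spectra of the original and the extended signal.
s_bb = np.array([1.0, 0.5, 0.1])
s_ext = np.array([10.0, 0.5, 0.01])
print(log_spectral_difference(s_ext, s_bb))
```

Positive values mark frequencies where the estimate lies above the original, negative values mark frequencies where it lies below.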
One basic characteristic of human perception is that the perceptual resolution decreases with increasing frequency. We can take this into account by letting a weighting factor decay exponentially with increasing frequency. Another basic characteristic is that bothersome artifacts occur if the magnitude of the estimated spectrum lies above that of the original one. In the opposite case, where the estimated spectrum has a smaller magnitude than the original one, the resulting artifacts are less bothersome. This asymmetry calls for a non-symmetric distortion measure, which we simply call the spectral distortion measure (SDM):
$$d_{\mathrm{SDM}}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \xi(e^{j\Omega}, n)\, d\Omega. \tag{7.63}$$
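The two perceptual properties described above can be illustrated with the following sketch. The integrand of (7.63) is not defined in this passage, so the weighting used here (a quadratic cost on the dB difference, a larger factor for overestimation, and an exponential decay over frequency) is purely a hypothetical stand-in, as are the function name `sdm_sketch` and the constants `over_weight`, `under_weight`, and `decay`.

```python
import numpy as np

def sdm_sketch(delta_db, omega, over_weight=2.0, under_weight=1.0, decay=1.0):
    """Illustrative asymmetric, frequency-weighted distortion for one frame.

    delta_db: difference spectrum of (7.62) in dB, sampled at `omega`.
    omega: increasing frequency grid in rad, covering [-pi, pi].
    over_weight, under_weight, decay: hypothetical tuning constants.
    """
    delta_db = np.asarray(delta_db, dtype=float)
    omega = np.asarray(omega, dtype=float)
    # Overestimation (delta > 0) is penalized more than underestimation.
    asym = np.where(delta_db > 0.0, over_weight, under_weight)
    # The weight decays exponentially with increasing absolute frequency.
    freq_weight = np.exp(-decay * np.abs(omega))
    integrand = freq_weight * asym * delta_db**2
    # Trapezoidal approximation of (1 / (2 pi)) times the integral over omega.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(omega))
    return integral / (2.0 * np.pi)

omega = np.linspace(-np.pi, np.pi, 513)
over = sdm_sketch(np.full_like(omega, 3.0), omega)    # estimate 3 dB too high
under = sdm_sketch(np.full_like(omega, -3.0), omega)  # estimate 3 dB too low
print(over > under)
```

With these assumed constants, a uniform overestimation of 3 dB yields twice the distortion of a uniform underestimation of 3 dB, which mirrors the asymmetry described in the text without claiming to reproduce the actual measure.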
9 If the logarithmic Euclidean distance is applied directly to the short-term spectra, a reliable estimate is no longer possible. For example, a zero at one frequency supporting point in one of the short-term spectra and a small but nonzero value in the other would result in a large distance even if little or no difference were audible.
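The effect described in this footnote can be reproduced numerically. The spectra below are hypothetical; a value of 1e-9 stands in for the spectral zero, since an exact zero would give an infinite logarithmic difference.

```python
import numpy as np

# Two hypothetical magnitude spectra that agree everywhere except at one bin.
original = np.array([1.0, 0.8, 1e-3, 0.5])
estimate = np.array([1.0, 0.8, 1e-9, 0.5])  # near-zero stands in for a spectral zero

# Bin-wise logarithmic difference in dB between the two spectra.
diff_db = 20.0 * np.log10(estimate) - 20.0 * np.log10(original)
print(np.abs(diff_db).max())
```

The single deviating bin contributes a difference of roughly 120 dB, although both values at that bin are far below the remaining bins, which are identical.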