Digital Signal Processing Reference
and the LPC memory. At the onsets, even though the pitch pulses may be
irregular due to the unsettled pitch of the vocal cords, they are quite strong
and the residual energy is concentrated around them. Resonating segments
and dispersed pulses do not occur at the onsets. Therefore, the only difficulty
at the onsets is in identifying the correct pulses and, as long as the pulse
identification process is successful, SWPM can maintain the continuity of the
harmonic phases at the onsets. The pitch pulse detection algorithm described
in Section 9.4.1 is capable of accurately detecting the pitch pulses at the
onsets. Furthermore, at the onsets, waveform coding preserves the
waveform similarity, which also ensures the correct LPC memory, since LPC
memory contains the past synthesized speech samples. Therefore, the mode
transition at the onsets is relatively easy, and SWPM guarantees a smooth
mode transition there. However, at the offsets, the presence of weak
pitch pulses is a common feature, and the highly resonant impulse response
of the LPC filter carries forward the phase changes caused by the past
excitation signal, especially when the LPC filter gain is high. Therefore,
audible switching artifacts remain at some of the offset mode transitions.
These need to be treated as special cases.
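The role of the LPC memory in a mode transition can be illustrated with a minimal sketch. It assumes the common all-pole convention s[n] = e[n] - sum_k a[k] s[n-1-k]; the function name `lpc_synthesize`, the coefficient values and the frame length are hypothetical and chosen only for illustration. The sketch shows that a frame synthesized with the correct memory continues the waveform exactly, whereas a frame synthesized with a wrong memory does not.

```python
import numpy as np

def lpc_synthesize(excitation, a, memory):
    """All-pole synthesis: s[n] = e[n] - sum_k a[k] * s[n-1-k].

    `memory` holds the last len(a) synthesized samples, most recent last
    (assumes len(a) >= 1). Returns the frame and the updated memory.
    """
    p = len(a)
    history = list(memory)  # past synthesized speech samples
    out = np.empty(len(excitation))
    for n, e in enumerate(excitation):
        # history[-1] is s[n-1], history[-2] is s[n-2], and so on
        s = e - sum(a[k] * history[-1 - k] for k in range(p))
        out[n] = s
        history.append(s)
    return out, np.array(history[-p:])

# Illustrative (assumed) stable second-order filter and random excitation.
a = np.array([-0.9, 0.2])
rng = np.random.default_rng(0)
exc = rng.standard_normal(80)

# Reference: synthesize both frames in one pass.
whole, _ = lpc_synthesize(exc, a, np.zeros(2))

# Frame by frame, carrying the memory across the frame boundary.
first, mem = lpc_synthesize(exc[:40], a, np.zeros(2))
second, _ = lpc_synthesize(exc[40:], a, mem)

# Same second frame, but with a wrong (zeroed) memory.
second_bad, _ = lpc_synthesize(exc[40:], a, np.zeros(2))
```

With the carried memory, `first` followed by `second` reproduces `whole`; with the zeroed memory, the start of `second_bad` deviates from the reference, which is the kind of discontinuity a mode switch has to avoid.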
At the resonant tails, the LPC residual looks like random noise, and the
pitch pulses are not clearly identifiable. In those cases, AbS techniques can be
applied directly on the speech signal to synchronize the synthesized speech.
This process is applied only to frames that follow a harmonic frame
and have been classified as transitions.
Synthesized speech is generated by shifting the pitch pulse location (PPL)
at the end of the synthesis frame within ±τ/2 around the synthesis frame
boundary with a resolution of one sample, where τ is the pitch period. The
location which gives the best cross-correlation between the synthesized speech
and the original speech is selected as the refined PPL. The pitch pulse shape
is set equal to the pitch pulse shape of the previous frame. The excitation
and the synthesized speech corresponding to the refined PPL are input to
the closed-loop transition detection algorithm, and form the harmonic signal
if the transition detection algorithm classifies the corresponding frame as
harmonic; otherwise, waveform coding is used.
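The PPL search described above can be sketched as an exhaustive loop over candidate locations. This is a minimal illustration under stated assumptions, not the coder's actual implementation: `refine_ppl`, `normalized_xcorr` and `toy_synthesize` are hypothetical names, the cross-correlation is normalized (the text only says "best cross-correlation"), and the toy synthesizer simply places a fixed pulse shape every pitch period instead of running harmonic synthesis through the LPC filter.

```python
import numpy as np

def normalized_xcorr(x, y):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    denom = np.sqrt(np.dot(x, x) * np.dot(y, y))
    return np.dot(x, y) / denom if denom > 0 else 0.0

def refine_ppl(original, synthesize, boundary, pitch_period):
    """Try every PPL within +/- tau/2 of the frame boundary, one sample apart.

    `synthesize(ppl)` is an assumed callable that returns the synthesized
    speech for a candidate PPL, aligned with `original`.
    """
    half = pitch_period // 2
    best_ppl, best_score = None, -np.inf
    for ppl in range(boundary - half, boundary + half + 1):
        score = normalized_xcorr(original, synthesize(ppl))
        if score > best_score:
            best_ppl, best_score = ppl, score
    return best_ppl, best_score

def toy_synthesize(ppl, n=200, tau=40, shape=(1.0, -0.5, 0.25)):
    """Toy stand-in: a pulse train ending at `ppl`, one pulse per period."""
    pulse = np.array(shape)
    out = np.zeros(n)
    pos = ppl
    while pos >= 0:
        end = min(pos + len(pulse), n)
        out[pos:end] += pulse[:end - pos]
        pos -= tau
    return out

# The "original" speech has its last pulse at sample 167 (assumed value).
original = toy_synthesize(167)
best_ppl, best_score = refine_ppl(original, toy_synthesize,
                                  boundary=160, pitch_period=40)
```

In this toy setting the search recovers the true location because only the matching candidate aligns every pulse. In the coder, the excitation and synthesized speech for the refined PPL would then be passed to the closed-loop transition detection.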
9.5.3 Offset Target Modification
The SWPM minimizes the phase discontinuities at the mode transitions, as
described in Section 9.5.2. However, at some mode transitions, such as the
offsets after female vowels, which have dispersed pulses, audible phase
discontinuities still remain. These discontinuities may be eliminated by
transmitting more phase information. This section describes a more economical
solution to remove those remaining phase discontinuities at the offsets,
which does not need the transmission of additional information. The
proposed method modifies some of the harmonic phases of the first frame of