and Shafer, 1978]. First, we compute LPC coefficients, then find the roots of
the linear predictor polynomial and choose formants from the roots.
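The three steps just described (LPC coefficients, roots of the predictor polynomial, formant selection) can be sketched as follows. This is a minimal illustration and not necessarily the implementation used in the system: it assumes the autocorrelation method with a Hamming window, and the function names, the default order of 10, and the selection thresholds (frequency above 90 Hz, bandwidth below 400 Hz) are illustrative assumptions rather than values taken from the text.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC (an assumed variant).

    Returns the coefficients [1, a1, ..., ap] of the predictor
    polynomial A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    """
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order (zero lag sits at index n - 1).
    full = np.correlate(windowed, windowed, mode="full")
    r = full[n - 1 : n + order]
    # Solve the Toeplitz normal equations R a = -r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def estimate_formants(frame, sample_rate, order=10,
                      min_freq=90.0, max_bandwidth=400.0):
    """Pick formant candidates from the roots of the LPC polynomial.

    The thresholds are illustrative heuristics, not values from the text.
    """
    a = lpc_coefficients(frame, order)
    roots = np.roots(a)
    # Keep one root from each complex-conjugate pair.
    roots = roots[np.imag(roots) > 0]
    # Root angle gives the resonance frequency, root radius its bandwidth.
    freqs = np.angle(roots) * sample_rate / (2.0 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * sample_rate / np.pi
    keep = (freqs > min_freq) & (bandwidths < max_bandwidth)
    return np.sort(freqs[keep])
```

For a short frame of speech sampled at 8 kHz, `estimate_formants(frame, 8000)` returns the surviving candidate frequencies in ascending order; the first two entries would then serve as estimates of the first two formants. Roots with wide bandwidths are discarded because they usually model the overall spectral tilt rather than a vocal tract resonance.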
5.1.2 An efficient real-time speech-driven animation system based on
formant analysis
We have shown the effectiveness of formant features in speech-driven animation
in a real-time system. The architecture of the system is shown in Figure 5.5.
Figure 5.5. The architecture of a real-time speech-driven animation system based
on formant analysis.
The close relationship of formant features and vowel-like sounds has enabled
a reasonably good linear audio-to-visual mapping. Ideally, if we assume a
one-to-one mapping from formants to vocal tract shape, a direct one-to-one
mapping from formants to mouth shape can be derived, since the mouth is part of
the vocal tract. In practice that assumption does not hold, because many factors
in speech production introduce variability and uncertainty. For example,
formants estimated from unvoiced speech generally cannot give a correct
description of the vocal tract shape. Different speakers also have different
formants for the same mouth shape because of differences in their internal
vocal tracts. Fortunately,
research has shown that the direct mapping can be approximated for vowel-like
sounds. Measurements of the formant frequencies of vowel-like sounds have shown
that the first two formant frequencies