Digital Signal Processing Reference
In-Depth Information
Table 3.5 Linguistic details of text prompts of IITKGP-SESC: Scheme for grouping of words and
syllables while extracting prosodic parameters
Sen.  No. of  Syllables in the       No. of  No. of     No. of     No. of     Syl. in        Syl. in       Syl. in
      words   word sequence          syl.    ini. wds.  mid. wds.  fin. wds.  initial words  middle words  final words
S1    3       6 + 4 + 3              13      1          1          1          2 + 2 + 2      1 + 2 + 1     1 + 1 + 1
S2    5       1 + 2 + 2 + 4 + 4      13      2          2          1          2 + 0 + 1      2 + 2 + 2     1 + 2 + 1
S3    5       4 + 2 + 3 + 4 + 3      16      1          2          2          1 + 2 + 1      2 + 1 + 2     2 + 3 + 2
S4    4       4 + 4 + 3 + 3          14      1          1          2          1 + 2 + 1      1 + 2 + 1     2 + 2 + 2
S5    6       1 + 2 + 2 + 3 + 2 + 3  13      2          2          2          2 + 0 + 1      2 + 1 + 2     2 + 1 + 2
S6    5       4 + 2 + 5 + 3 + 3      17      2          1          2          2 + 2 + 2      1 + 3 + 1     2 + 2 + 2
S7    5       2 + 5 + 2 + 3 + 2      14      2          2          1          2 + 3 + 2      2 + 1 + 2     1 + 0 + 1
S8    3       3 + 4 + 4              11      1          1          1          1 + 1 + 1      1 + 2 + 1     1 + 2 + 1
S9    3       5 + 3 + 3              11      1          1          1          1 + 3 + 1      1 + 1 + 1     1 + 1 + 1
S10   5       1 + 2 + 6 + 3 + 2      14      2          1          2          2 + 0 + 1      1 + 4 + 1     2 + 1 + 2
S11   6       2 + 5 + 4 + 1 + 3 + 3  18      2          2          2          2 + 3 + 2      2 + 2 + 1     2 + 2 + 2
S12   4       2 + 2 + 4 + 4          12      2          1          1          2 + 0 + 2      1 + 2 + 1     1 + 2 + 1
S13   5       2 + 3 + 4 + 3 + 5      17      2          2          1          2 + 1 + 2      2 + 3 + 2     1 + 3 + 1
S14   4       3 + 2 + 3 + 3          11      1          2          1          1 + 1 + 1      2 + 1 + 2     1 + 1 + 1
S15   4       2 + 3 + 3 + 3          11      1          2          1          1 + 0 + 1      2 + 2 + 2     1 + 1 + 1
Fin. Final, Ini. Initial, Mid. Middle, No. Number, Sen. Sentences, Syl. Syllables, Wds. Words
and co-articulation constraints, words in each group are divided into initial, middle,
and final syllables. The last 3 columns of Table 3.5 indicate the number of initial,
middle, and final syllables present in initial, middle, and final words. Here the syllable
division is carried out using the following principle. (a) If the word contains more
than 2 syllables, then the first syllable of the word is considered as the initial syllable,
the last syllable of the word is considered as the final syllable, and the remaining
syllables are treated as the middle syllables. (b) If the word contains 2 syllables, then
they are treated as the initial and final syllables. (c) If the word consists of a single
syllable, then that syllable is treated as the initial syllable. The English transcriptions
of the text prompts of the Telugu database (IITKGP-SESC) are given in Table 3.6.
The Unicode set for the Telugu alphabet is available at [5].
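
The syllable division rule (a)-(c) can be illustrated with a minimal Python sketch. The function names syllable_positions and group_counts are hypothetical and chosen only for this illustration; the input is the number of syllables in each word of a group, as listed in Table 3.5.

```python
def syllable_positions(n_syllables):
    """Return (initial, middle, final) syllable counts for a single word."""
    if n_syllables < 1:
        raise ValueError("a word must contain at least one syllable")
    if n_syllables == 1:
        # Rule (c): a lone syllable is treated as the initial syllable.
        return (1, 0, 0)
    if n_syllables == 2:
        # Rule (b): two syllables are the initial and the final syllable.
        return (1, 0, 1)
    # Rule (a): first is initial, last is final, the rest are middle.
    return (1, n_syllables - 2, 1)


def group_counts(word_syllables):
    """Sum the (initial, middle, final) counts over all words in a group."""
    totals = [0, 0, 0]
    for n in word_syllables:
        for k, count in enumerate(syllable_positions(n)):
            totals[k] += count
    return tuple(totals)


# Sentence S12 of Table 3.5: words of 2, 2, 4 and 4 syllables,
# grouped as two initial words, one middle word and one final word.
s12_initial = group_counts([2, 2])
s12_middle = group_counts([4])
s12_final = group_counts([4])
print(s12_initial, s12_middle, s12_final)  # (2, 0, 2) (1, 2, 1) (1, 2, 1)
```

The three tuples correspond to the entries 2 + 0 + 2, 1 + 2 + 1, and 1 + 2 + 1 in the last three columns of the S12 row.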
The process of extracting word level global and local prosodic features is similar
to the method of extracting utterance level global and local prosodic features. The
length of the feature vectors for word level global prosodic features is kept as 13
(1 duration, 6 pitch, and 6 energy features). Here, the normalized pause duration
is not included as a feature, since only one or two words are used for feature
extraction. Slopes of the pitch and energy contours are computed by considering
the first and last syllables of the specific words. The length of the feature vectors
for word level local prosodic features is fixed to be 15 for pitch and energy. This is
derived by re-sampling the original prosody contours obtained over the words. The
length of local duration vector is fixed at 6, which is equal to the maximum number
of syllables in a word of IITKGP-SESC. The length of the local duration vector, at
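
The fixed-length local pitch and energy vectors described above can be sketched as follows. This is only an illustration: the text does not state which re-sampling method is used, so linear interpolation is an assumption here, and the function name resample_contour and the sample pitch values are hypothetical.

```python
def resample_contour(contour, target_len=15):
    """Re-sample a prosody contour to target_len points.

    Assumption: linear interpolation between neighbouring samples;
    the source only says the contour is re-sampled to a fixed length.
    """
    values = [float(v) for v in contour]
    if not values:
        raise ValueError("contour must contain at least one sample")
    if target_len < 1:
        raise ValueError("target_len must be positive")
    if len(values) == 1 or target_len == 1:
        return [values[0]] * target_len
    # Spacing of the new sample positions on the original index axis.
    step = (len(values) - 1) / (target_len - 1)
    resampled = []
    for i in range(target_len):
        pos = i * step
        lo = min(int(pos), len(values) - 2)  # left neighbour index
        frac = pos - lo                      # distance past the left neighbour
        resampled.append(values[lo] + frac * (values[lo + 1] - values[lo]))
    return resampled


# Hypothetical pitch contour (Hz) measured over one word.
pitch = [210.0, 215.0, 222.0, 230.0, 228.0, 219.0, 205.0, 190.0]
local_pitch = resample_contour(pitch)
print(len(local_pitch))  # 15
```

The same routine would apply to an energy contour, which gives both local vectors the fixed length of 15 regardless of the word's duration.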
 