Digital Signal Processing Reference
Figure 3-2. Collection timeline of spoken dialog corpus.
3.1 Speaker Selection and Collection Timeline for the Spoken Dialog Corpus
The nature of the dialog task and the lexicon and grammar which describe
a speaker's utterances vary significantly according to the gender and age of
the speaker. Therefore, it is desirable to balance the gender and age range of
the speakers in the set of experimental subjects. We collected the spoken
dialog corpus from 250 speakers, consisting of 137 males and 113 females.
The age distribution of the speakers was also balanced across five age brackets:
20s, 30s, 40s, 50s, and 60s. All were residents of the Tokyo
Metropolitan Area, and 235 of them held driver's licenses; 50 of the subjects
had prior experience with car navigation systems.
We divided the 250 speakers into five groups, G1 to G5, of
approximately 50 speakers each. The spoken dialog corpus was collected
from G1 to G5, in order, according to the timeline depicted in Figure 3-2. The
numbers of speakers in G1 to G5 are also shown under the arrows in the
figure. The number of speakers in each group differed because it was
necessary to arrange the data collection according to the individual schedules
of the subjects. After collecting data from each group, we improved the
experimental setup and the task configuration before proceeding with the next
group. The most significant improvements were introduced after collecting
data from G1. Therefore, in this article we refer to the data collection from G1
as Phase I, and the data collection from G2 through G5 as Phase II. The next
section discusses the differences in the experimental setup and task
configuration between Phase I and Phase II.