A SPOKEN DIALOG CORPUS FOR CAR TELEMATICS SERVICES - DSP for In-Vehicle and Mobile Systems


Figure 3-2. Collection timeline of spoken dialog corpus.

3.1 Speaker Selection and Collection Timeline for the Spoken Dialog Corpus

The nature of the dialog task and the lexicon and grammar which describe

a speaker's utterances vary significantly according to the gender and age of

the speaker. Therefore, it is desirable to balance the gender and age range of

the speakers in the set of experimental subjects. We collected the spoken

dialog corpus from 250 speakers, consisting of 137 males and 113 females.

The age distribution of speakers was also balanced across five age brackets:

the 20s, 30s, 40s, 50s, and 60s. All were residents of the Tokyo

Metropolitan Area, and 235 of them held driver's licenses; 50 of the subjects

had prior experience with car navigation systems.

We divided the 250 speakers into five groups, G1 to G5, consisting of

approximately 50 speakers per group. The spoken dialog corpus was collected

from G1 to G5, in order, according to the timeline depicted in Figure 3-2. The

numbers of speakers in G1 to G5 are also shown under the arrows in the

figure. The number of speakers in each group differed because it was

necessary to arrange the data collection according to the individual schedules

of the subjects. After collecting data from each group, we improved the

experimental setup and the task configuration before proceeding with the next

group. The most significant improvements were introduced after collecting

data from G1. Therefore, in this article we refer to the data collection from G1

as Phase I, and the data collection from G2 through G5 as Phase II. The next

section discusses the differences in experimental setup and task

configuration between Phase I and Phase II.
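
When working with the corpus metadata, the group-to-phase relationship described above could be encoded roughly as follows. This is only a minimal sketch: the names GROUPS and phase_of are hypothetical and not part of the corpus distribution. Only the group labels G1 to G5, the Phase I/Phase II split, and the speaker totals are taken from the text; the per-group speaker counts appear only in Figure 3-2 and are deliberately left out.

```python
# Hypothetical helper names; only the G1..G5 labels, the Phase I / Phase II
# split, and the speaker totals below come from the text.
GROUPS = ("G1", "G2", "G3", "G4", "G5")


def phase_of(group: str) -> str:
    """Return the collection phase for a speaker group.

    G1 was collected before the most significant setup changes (Phase I);
    G2 through G5 were collected afterwards (Phase II).
    """
    if group not in GROUPS:
        raise ValueError(f"unknown group: {group!r}")
    return "Phase I" if group == "G1" else "Phase II"


# Speaker totals reported in the text.
N_MALE, N_FEMALE = 137, 113
N_SPEAKERS = N_MALE + N_FEMALE

# Map every group to its phase, e.g. for tagging recordings.
PHASES = {group: phase_of(group) for group in GROUPS}
```

Tagging each recording with its phase in this way makes it straightforward to analyze Phase I and Phase II data separately, since the setup and task configuration differ between them.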
