CONSTRUCTION AND ANALYSIS OF A MULTI-LAYERED IN-CAR SPOKEN DIALOGUE CORPUS - DSP for In-Vehicle and Mobile Systems


All of our recorded dialogues are transcribed into text in compliance with a set of criteria established for the Corpus of Spontaneous Japanese (CSJ) [13]. Table 1-4 summarizes the statistics of our dialogue corpus. As can be observed from the first row, we have collected more than 187 hours of speech data, corresponding to approximately one million morpheme dialogue units.

2.2 Task Domains

We have categorized the sessions into several task domains. Figure 1-3 shows the breakdown of the major task domains. Approximately forty percent of the tasks are related to restaurant information retrieval, which is consistent with earlier studies. In the sections to follow, we will use only the data from the restaurant task. Our findings for other tasks and driver behavioral data will be discussed later in Chapters 17 and 19.
