Here and in the following, the various language models are named according to the domain (denoted by the letter D) covered by the data used in the training phase. If a LM contains geographical classes, its name also includes information about the cluster (denoted by the letter C) that contributes the lists of names (cities, streets, hotels, etc.) used to expand the classes. For example, Dgl-Cgl denotes the LM trained on the global (gl) domain with classes expanded with the global (gl) lists of names.
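A minimal sketch of this naming convention is given below; the helper function and the identifiers other than gl are purely illustrative, not part of the actual VICO toolchain.

```python
from typing import Optional

def lm_name(domain: str, cluster: Optional[str] = None) -> str:
    """Compose an LM name such as 'Dgl-Cgl' from a domain and an optional cluster."""
    name = f"D{domain}"
    if cluster is not None:
        name += f"-C{cluster}"
    return name

print(lm_name("gl", "gl"))  # Dgl-Cgl: global domain, global lists of names
print(lm_name("gl", "1"))   # Dgl-C1: global domain, lists restricted to cluster C1
print(lm_name("cmd"))       # Dcmd: command LM without geographical classes
```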
There are different options for building smaller LMs that together provide complete coverage of the application domains foreseen in the VICO system. A simple solution is to reduce the contents of the classes associated with the large lists (cities, streets, hotels, etc.) by introducing geographic clusters and building several LMs, each one covering only a reduced area: in our setup Trentino has been divided into 7 geographic clusters (C1, C2, C3, C4, C5, C6, C7).
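The following sketch illustrates this per-cluster approach under simple assumptions: class tokens in the training sentences are expanded only with the name lists of one geographic cluster before the corresponding Dgl-Ci model is estimated. The class tags, example names, and helper functions are hypothetical, not the actual VICO data or tools.

```python
import random

# Hypothetical class tags for the geographic classes mentioned in the text.
GEO_CLASSES = ("CITY", "STREET", "HOTEL", "POI")

def expand_sentence(sentence, cluster_lists):
    """Replace each geographic class token with a name from the cluster's lists."""
    out = []
    for token in sentence.split():
        if token in GEO_CLASSES:
            out.append(random.choice(cluster_lists[token]))
        else:
            out.append(token)
    return " ".join(out)

def build_cluster_corpus(corpus, cluster_lists):
    """Produce the expanded training text for one Dgl-Ci model."""
    return [expand_sentence(s, cluster_lists) for s in corpus]

# cluster_lists would hold only the names belonging to one cluster (e.g. C1).
c1_lists = {"CITY": ["Trento"], "STREET": ["Via Verdi"],
            "HOTEL": ["Hotel Adige"], "POI": ["Castello del Buonconsiglio"]}
corpus = ["I would like a room in CITY near POI"]
print(build_cluster_corpus(corpus, c1_lists))
```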
Another possible strategy, aimed at exploiting different recognition units, is to build LMs that do not contain the classes associated with the large lists. This idea derives from the observation that a generic dialogue contains relatively few sentences in which a name belonging to a large list is actually pronounced: this leads to the introduction of two further small LMs, namely Dge and Dcmd, which have been trained after removing from the corpus the sentences with geographic class contents (e.g. cities, streets, hotels, POIs). In particular, Dcmd is a very restricted LM (its vocabulary size is 130) and should handle only confirmation/refusal expressions and short commands to the system.
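A possible way to derive the Dge and Dcmd training sets, sketched below under the same assumptions as above, is simply to drop every corpus sentence containing a geographic class token; the tag names and example sentences are illustrative only.

```python
# Hypothetical geographic class tags (cities, streets, hotels, POIs).
GEO_CLASSES = {"CITY", "STREET", "HOTEL", "POI"}

def has_geo_class(sentence: str) -> bool:
    """True if the sentence contains at least one geographic class token."""
    return any(token in GEO_CLASSES for token in sentence.split())

def strip_geo_sentences(corpus):
    """Keep only the sentences without geographic class contents."""
    return [s for s in corpus if not has_geo_class(s)]

corpus = [
    "yes please",              # kept: confirmation expression (Dcmd-style)
    "show me hotels in CITY",  # removed: contains a geographic class
    "repeat the last message", # kept: short command to the system
]
print(strip_geo_sentences(corpus))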
Table 6-2 shows the results for these new LMs: Dgl denotes the original global LM, while the suffix Ci specifies the geographic cluster covered. The higher WRRs obtained with Dgl-C1 are explained by the fact that the WOZ material mainly concerns geographic items associated with C1, i.e. the Trento city area, where the acquisition took place. It is worth mentioning that, although the WRR of Dge and Dcmd is rather low, the corresponding string recognition rate shows that these LMs adequately cover a relevant part of the corpus.