utterances in the language. As course developers author scenario dialogs and other
learning materials, the example utterances in these materials are entered into the
database and then used to create the language models. Speech recognition is performed using the Julius decoder, an open-source speech recognition engine developed under the leadership of Kyoto University [6].
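As a rough illustration of this pipeline, the sketch below (not the authors' implementation) shows how authored example utterances might be aggregated into simple bigram statistics that could feed a statistical language model for a decoder such as Julius; the utterance list and function names are illustrative assumptions.

```python
from collections import Counter

# Hypothetical example utterances entered by course developers while
# authoring scenario dialogs (illustrative only).
example_utterances = [
    "salaam alaikum",
    "how are you today",
    "my name is john",
]

def build_bigram_counts(utterances):
    """Count word bigrams over the authored example utterances."""
    counts = Counter()
    for sentence in utterances:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts

if __name__ == "__main__":
    for (w1, w2), n in build_bigram_counts(example_utterances).most_common():
        print(f"{w1} {w2}\t{n}")
```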
Next, the agent evaluates and interprets the communicative intent of the learner's
utterance and gesture. Communicative intents are represented using a library of
communicative acts, derived from speech act theory originating in the work of Austin
[1], and further developed by Traum and Hinkelman [13]. Each communicative act
has a core function, i.e., the illocutionary function of the utterance (to greet, inform,
request, etc.), and a grounding function, i.e., the role of the utterance in coordinating
the conversation (e.g., to initiate, continue, acknowledge, etc.). The grounding
functions help to determine the current dialog context, which in turn can influence
how subsequent utterances are interpreted. At each point in the dialog, the agent is
expecting to hear and respond to one of a set of possible communicative acts, which
changes over the course of the conversation. If the learner says something that is not
appropriate at that stage of the conversation, e.g., greeting a character at the end of the
conversation instead of the beginning, the agent will act as if the learner said
something odd that does not make sense.
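To make this concrete, the following sketch assumes a simplified representation of communicative acts with a core (illocutionary) function and a grounding function, plus a dialog state that tracks which acts the agent currently expects; class and field names are illustrative, not taken from the system described.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CommunicativeAct:
    core: str        # illocutionary function: "greet", "inform", "request", ...
    grounding: str   # grounding function: "initiate", "continue", "acknowledge", ...

class DialogState:
    def __init__(self, expected_cores):
        # Core functions the agent expects at this point in the conversation;
        # this set changes as the dialog progresses.
        self.expected_cores = set(expected_cores)

    def interpret(self, act: CommunicativeAct) -> str:
        if act.core in self.expected_cores:
            return f"respond to {act.core}/{act.grounding}"
        # Out-of-context act, e.g. a greeting at the end of the conversation.
        return "react as if the learner said something that does not make sense"

# Example: at the end of a conversation the agent expects a farewell or thanks,
# so a greeting is treated as out of place.
state = DialogState(expected_cores={"take-leave", "thank"})
print(state.interpret(CommunicativeAct(core="greet", grounding="initiate")))
```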
Note that the interpretation of the utterance and gesture depends upon the particular
culture being modeled. The mapping from utterances to communicative acts is
specified for each language. Some gestures have meaning only in certain cultures,
e.g., placing the palm of the right hand over the heart in greeting only has meaning in
Islamic countries. Some gestures are appropriate only in some social contexts; for
example, American and Arab cultures differ as to when it is acceptable to shake hands
with someone of the opposite sex, or to kiss the cheek of someone of the same sex.
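One simple way to picture this culture-dependence is a lookup keyed by gesture and culture, as in the sketch below; the gesture names and culture labels are illustrative assumptions rather than the system's actual data.

```python
# Hypothetical mapping from (gesture, culture) to a conventional meaning.
# Absence of an entry means the gesture carries no conventional meaning
# in that culture.
GESTURE_MEANINGS = {
    ("hand-over-heart", "iraqi-arabic"): "respectful greeting",
    ("handshake", "american-english"): "greeting",
}

def interpret_gesture(gesture: str, culture: str):
    return GESTURE_MEANINGS.get((gesture, culture))

print(interpret_gesture("hand-over-heart", "iraqi-arabic"))      # respectful greeting
print(interpret_gesture("hand-over-heart", "american-english"))  # None: no meaning
```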
Depending upon the type of dialog exercise, the agent may not just evaluate the
appropriateness of the learner's communication, but also identify and classify the
learner's mistakes. Operational Pashto and goEnglish both include so-called mini-
dialog exercises, in which learners practice individual conversational turns with a
non-player character and receive feedback regarding any mistakes they may have
made. Detected errors include grammatical errors, semantic errors (e.g., confusing
words with similar meanings), and pragmatic errors (e.g., inappropriate use of
expressions of politeness, honorifics, etc.).
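The error categories named above could be represented along the lines of the following sketch, which pairs each detected error with a category used to generate feedback; the representation itself is an illustrative assumption.

```python
from enum import Enum
from dataclasses import dataclass

class ErrorType(Enum):
    GRAMMATICAL = "grammatical"   # e.g., wrong agreement or word order
    SEMANTIC = "semantic"         # e.g., confusing words with similar meanings
    PRAGMATIC = "pragmatic"       # e.g., inappropriate politeness or honorifics

@dataclass
class LearnerError:
    error_type: ErrorType
    description: str

# Hypothetical feedback produced after one mini-dialog turn.
feedback = [
    LearnerError(ErrorType.PRAGMATIC, "Too informal when addressing an elder."),
]
for err in feedback:
    print(f"[{err.error_type.value}] {err.description}")
```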
Once the learner's input is interpreted, the intent planning stage occurs, in which
each agent in the conversation decides how to respond. Intent planning is challenging
because it must address multiple conflicting needs: accuracy, versatility, authorability,
and run-time performance. The agents should choose communicative acts that are
culturally appropriate, e.g., that match the dialog examples created in the cultural data
development process described in section 4. However, the agent models cannot simply
follow the example dialogs as scripts, but need to be versatile enough to respond in a
culturally appropriate way regardless of what the learners might say. The agent
modeling language needs to be powerful enough to achieve such versatility, yet be
authorable by instructional designers who lack the computer science background
required for sophisticated agent programming languages. It is also important for the
intent-planning module to have good runtime performance, so that the intent planning