Talking Topically to Artificial Dialog Partners: Emulating Humanlike Topic Awareness in a Virtual Agent - Agents and Artificial Intelligence

Table 1. List of predefined main categories adequate for our dialog scenario

Main Category
Science                   Economics
Family                    Education
Studies                   Literature
Mass media                Music
Arts                      Health
Ecology                   Digital media
Sports                    Occupations
Fashion                   Food and drink
Leisure                   Transport
Intimate relationships    Regions

The next step is to preprocess the corpus: incomplete sentences and expressions are completed so that the recorded utterances match the conditions of our scenario, in which the human-sided utterances are based on keyboard input. We will then carry out the evaluation by automatically identifying the dialog topics and topic shifts within the CUBE-G interactions by means of our proposed method and subsequently comparing the results with the manual annotations included in the corpus. If the method shows promising performance, a user study evaluating the application of emulated human topic awareness in the conversational behavior of the agent Max will be scheduled next.
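The planned comparison between automatically identified topics and the manual annotations could be set up roughly as follows. This is only a minimal sketch: the text does not specify an evaluation measure, so per-utterance label agreement and precision/recall over topic shifts are assumptions, as are the function names and the one-label-per-utterance representation.

```python
from typing import Dict, List, Set


def topic_shifts(labels: List[str]) -> Set[int]:
    """Return the utterance indices at which the topic label changes."""
    return {i for i in range(1, len(labels)) if labels[i] != labels[i - 1]}


def compare_with_annotation(predicted: List[str], annotated: List[str]) -> Dict[str, float]:
    """Compare predicted topic labels with manual annotations.

    Assumed measures (not taken from the text): the fraction of utterances
    with matching labels, and precision/recall of the detected topic shifts.
    """
    if len(predicted) != len(annotated):
        raise ValueError("label sequences must have the same length")
    if not annotated:
        raise ValueError("label sequences must not be empty")

    agreement = sum(p == a for p, a in zip(predicted, annotated)) / len(annotated)

    predicted_shifts = topic_shifts(predicted)
    annotated_shifts = topic_shifts(annotated)
    hits = len(predicted_shifts & annotated_shifts)

    precision = hits / len(predicted_shifts) if predicted_shifts else 0.0
    recall = hits / len(annotated_shifts) if annotated_shifts else 0.0
    return {
        "label_agreement": agreement,
        "shift_precision": precision,
        "shift_recall": recall,
    }


# Hypothetical example: one label per utterance, drawn from Table 1.
annotated = ["Music", "Music", "Sports", "Sports", "Food and drink"]
predicted = ["Music", "Music", "Music", "Sports", "Food and drink"]
print(compare_with_annotation(predicted, annotated))
```

In this hypothetical example the method detects the shift to "Sports" one utterance late, so only one of the two annotated shifts counts as a hit. A tolerance window around each annotated shift would be a natural refinement.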

6 Related Work

A lot of work has been carried out on offline topic identification. A prevalent model was developed in the context of the Topic Detection and Tracking (TDT) research program [20]. Within the TDT research, Allan determined five tasks (i.e., Story Segmentation, First Story Detection, Cluster Detection, Tracking, and Story Link Detection) for detecting the individual topics covered in a text-based newscast. Further offline approaches compute the coherence between documents via similarity measures (e.g., [21,22]). Others rank Wikipedia articles according to their relevance to a given text fragment, for example via text classification algorithms [13] or by simply exploiting the Wikipedia article titles and categories [23]. One recent approach uses the Wikipedia category network as a conceptual taxonomy and derives a directed acyclic graph for each document by mapping terms to concepts in the category network [24].

Approaches for the online identification of topics in natural language dialogs are rare. One work realizing a “Dynamic Topic Tracking” of natural language conversations between a human and a robot roughly adapted the five tasks from the TDT project (see above) to make the robot more situation-aware in human-robot interaction [25]. In that work, the number of topics and the corresponding topic names are created dynamically by deriving the topic names from the content words that occur most frequently in the dialog utterances. In contrast, existing taxonomies can serve as a source for topic labels, for example taxonomies derived from the online encyclopedia Wikipedia [8,16]. Furthermore, conversation clusters visually highlight topics discussed in conversations using Explicit Semantic Analysis based on Wikipedia articles [26].
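To illustrate the idea of naming topics dynamically from the most frequent content words, consider the following minimal sketch. It is not the implementation of [25]. The tiny stopword list, the tokenization, and the function names are assumptions chosen only for illustration.

```python
import re
from collections import Counter
from typing import List

# Hypothetical, deliberately small stopword list; a real system would use a
# fuller list or part-of-speech tagging to select content words.
STOPWORDS = {
    "a", "an", "the", "do", "you", "i", "my", "is", "are", "to", "of",
    "and", "in", "it", "every", "like",
}


def content_words(utterance: str) -> List[str]:
    """Lowercase the utterance, split it into word tokens, drop stopwords."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    return [t for t in tokens if t not in STOPWORDS]


def dynamic_topic_names(utterances: List[str], top_n: int = 1) -> List[str]:
    """Return the top_n most frequent content words as topic names."""
    counts = Counter()
    for utterance in utterances:
        counts.update(content_words(utterance))
    return [word for word, _ in counts.most_common(top_n)]


# Hypothetical dialog fragment.
dialog = [
    "Do you like football?",
    "I watch football every weekend.",
    "My favorite football team won yesterday.",
]
print(dynamic_topic_names(dialog))
```

Labels obtained this way depend entirely on the wording of the dialog, whereas a taxonomy-based approach maps utterances onto a fixed set of category names.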
