on the fly. Fourth, we have argued elsewhere (Thórisson, 2008) that
conversational skills, and by extension cognitive skills, allow for
a high interconnectivity between their many functions; that they are a
large, heterogeneous, densely coupled system (HeLD). The design of
such HeLDs requires new architectural principles: standard software
development methods will simply not suffice, as they result in rigid
systems and require more manpower over longer periods
than any typical university or research lab is capable of securing. As a
result, both the underlying software and conceptual architecture[5] must
be highly modular, expandable and malleable. This approach puts a
greater emphasis on methodology than is typical, but we believe it
to be one of the few ways of actually achieving the integration of the
many mechanisms necessary for creating a system approaching the
flexibility and generality of real-world real-time human dialogue. It
may also be considered of a “practical” nature, as it makes continuous
expansion of the architecture more tractable for a small team. We have
found that architectural structure and makeup greatly influence not only
what kinds of operations a system supports but also its speed of development
and manageability. We see architectural design as a necessary part of
any effort to develop dialogue systems intended to (incrementally)
approach human dialogue skills.
The architecture described below thus rests on three main
theoretical pillars. The first is a distributed-systems perspective,[6] the
second relates to architectural software methodology, and the third is
an underlying theory of turn-taking in multimodal real-time dialogue,
outlined in Thórisson (2002b), emphasizing real-time negotiation as a
key principle in turn-taking. In our approach, turn-taking negotiation
is managed by time-dependent “cognitive contexts” (also called
“fluid states” and “schema”) that, for each participant, specify which
perceptions and decisions are relevant or appropriate at each particular
point in time, and represent the disposition of the system at any point
in the dialogue, e.g. whether we might expect the other to produce a
certain turn-taking cue, or whether it is relevant to generate a particular
behavior (such as a volume increase in the voice upon interruption by the
other).
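To make the idea of a cognitive context more concrete, the following is a minimal sketch in Python, not the implementation described in this chapter. It assumes a context can be modeled as a named pair of sets, one of relevant perceptions and one of relevant decisions, with each participant holding exactly one active context at a time. The class names, context names, and cue names are all hypothetical and were chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CognitiveContext:
    """One time-dependent context: what is relevant while it is active."""
    name: str
    relevant_perceptions: frozenset
    relevant_decisions: frozenset


# Hypothetical contexts and cue names, invented for this illustration only.
CONTEXTS = {
    "i-have-turn": CognitiveContext(
        name="i-have-turn",
        relevant_perceptions=frozenset({"other-starts-speaking"}),
        relevant_decisions=frozenset({"raise-voice-volume", "yield-turn"}),
    ),
    "other-has-turn": CognitiveContext(
        name="other-has-turn",
        relevant_perceptions=frozenset({"other-pauses", "other-gazes-at-me"}),
        relevant_decisions=frozenset({"take-turn"}),
    ),
}


@dataclass
class Participant:
    """Holds the currently active context for one dialogue participant."""
    context: CognitiveContext
    history: list = field(default_factory=list)

    def switch_to(self, context_name: str, t: float) -> None:
        # Record when the previous context ended, then activate the new one.
        self.history.append((t, self.context.name))
        self.context = CONTEXTS[context_name]

    def is_relevant_perception(self, cue: str) -> bool:
        return cue in self.context.relevant_perceptions

    def is_relevant_decision(self, behavior: str) -> bool:
        return behavior in self.context.relevant_decisions


# Usage: raising the voice is only a relevant decision while holding the turn.
agent = Participant(context=CONTEXTS["i-have-turn"])
holding = agent.is_relevant_decision("raise-voice-volume")
agent.switch_to("other-has-turn", t=1.5)
listening = agent.is_relevant_decision("raise-voice-volume")
```

In this sketch the active context acts as a filter: the same cue or behavior can be relevant at one point in the dialogue and irrelevant at another, depending only on which context is active at that time.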
[5] By “architecture” we mean the structure and operation of the system as a whole,
containing many identifiable interacting parts whose organization essentially dictates
how the system acts as a whole. The difference between software architecture and
conceptual architecture is often subtle, but essentially is a separation between the
operation of the particular software on the particular hardware and the behavior of
the dialogue system it implements.
[6] By “distributed” we mean a system with multiple semi-independent processes that
can be run on multiple CPUs, computers, and/or clusters.
 